Steve, thanks for sharing your experience with AWS!

So far I have evaluated several NoSQL storage solutions, including SimpleDB, Riak, MongoDB and Cassandra. Lessons learned:

1) The storage model that SimpleDB provides is too low-level and is not convenient for storing dictionaries and other B-tree data structures that my app works with.
2) The "simpledb/dev" simulator is out of date and does not support the complete feature set of SimpleDB today. Without a major rewrite it cannot be used for development.
3) SimpleDB storage is 100% specific to Amazon. It follows that developing directly against the SimpleDB interface would make the app non-portable across cloud platforms.
4) Cassandra's row/column abstraction is awkward for the Data.Map structures that my app needs.
5) Riak provides a convenient bucket/key/value abstraction and runs on a cluster that is robust to node failures. Its REST/JSON protocol is simple to use, yet it is inefficient for the data exchanges my app performs. Riak also supports a binary protocol, but I couldn't find simple libraries for it.
6) MongoDB meets my requirements best: it is powerful on the server side (JavaScript filters, etc.) and uses an efficient communication protocol based on BSON.

I also plan to use RabbitMQ for communication between the several Haskell processes and the Java Web front-end that my app incorporates. It would be great to know what tools people use in the cloud (AWS, etc.) to connect a Web front-end to the rest of the (Haskell) system. What Haskell tools are there for building a Web front-end?

Thanks!
Dmitri

On Wed, Nov 16, 2011 at 9:01 PM, Steve Severance <[email protected]> wrote:

> We use AWS extensively. We use the aws package and have contributed to it,
> specifically SQS functionality. I will give you the rundown of what we do.
>
> We moved off of SimpleDB and now use MongoDB. The reason is that SimpleDB
> seemed to have problems with write pressure and there are not good tools
> for profiling your queries. My main application is extremely write heavy
> with a single instance needing to do 100s or 1000s of writes a second.
> MongoDB has worked well for us. I am scared of things like Cassandra having
> looked at the code, however some people have made it work.
>
> We store data such as crawled web pages in S3. The files are lzma
> compressed and the data format is built on protocol buffers. We picked lzma
> for both storage costs of cold data and the fact that the pipe between S3
> and EC2 is somewhat limited and we want to make the most effective use of
> it as possible.
>
> In my experience AWS simulators are more trouble than they are worth since
> they don't accurately model the way AWS will respond to you under load. The
> free tier at AWS should allow you to experiment with building an app. The
> first couple of months of development cost us less than $1.
>
> Steve
>
> On Tue, Nov 1, 2011 at 1:27 AM, dokondr <[email protected]> wrote:
>
>> On Tue, Nov 1, 2011 at 10:53 AM, Neil Davies <[email protected]> wrote:
>>
>>> Word of caution
>>>
>>> Understand the semantics (and cost profile) of the AWS services first -
>>> you can't just open a HTTP connection and dribble data out over several
>>> days and hope for things to work.
>>> It is not a system that has that sort of laziness at its heart.
>>>
>>> AWS doesn't supply traditional remote file store semantics - its
>>> queuing, simple database and object store have all been designed for large
>>> scale systems being offered as a service to a (potentially hostile) large
>>> set of users - you can see that in the way that things are designed. There
>>> are all sorts of (sensible from their point of view) performance related
>>> limits and retries.
>>>
>>> The challenge in designing nice clean layers on top of AWS is how/when
>>> to hide the transient/load related failures.
>>>
>>
>> As a straw-man approach I would go first to NData.Map backed by Data.Map
>> with the addition of a "flush" function to write the Data.Map to an external
>> key-value store / NoSQL DB.
>> Another requirement for NData.Map is concurrent consistency, so different
>> clients could modify its state while preserving the "happens-before"
>> relationship. For this I would add to NData.Map a "refresh" function that
>> updates the local copy from the external key-value store.
>>
>> As for the hSimpleDB package, it looks like it doesn't build on ghc7:
>> http://hackage.haskell.org/package/hSimpleDB
>>
>>> The hSimpleDB package
>>>
>>> Interface to Amazon's SimpleDB service.
>>>
>>> Properties
>>> Versions: 0.1, 0.2, 0.3
>>> Dependencies: base (≥3 & ≤4), bytestring, Crypto, dataenc, HTTP, hxt,
>>> network, old-locale, old-time, utf8-string
>>> License: BSD3
>>> Author: David Himmelstrup 2009, Greg Heartsfield 2007
>>> Maintainer: David Himmelstrup <[email protected]>
>>> Category: Database, Web, Network
>>> Upload date: Thu Sep 17 17:09:26 UTC 2009
>>> Uploaded by: DavidHimmelstrup
>>> Built on: ghc-6.10, ghc-6.12
>>> Build failure: ghc-7.0 (log
>>> <http://hackage.haskell.org/packages/archive/hSimpleDB/0.3/logs/failure/ghc-7.0>)
>>>
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> [email protected]
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
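
The straw-man NData.Map idea quoted above (a local Data.Map with a "flush" and a "refresh" function) could look roughly like the following. This is only a minimal sketch under stated assumptions: the names Store, NMap, newMemoryStore, openNMap, insertN and lookupN are hypothetical and do not come from any existing package, the external key-value store is modelled as two IO actions over the whole map, and an in-memory store stands in for MongoDB, Riak or SimpleDB. There is no conflict detection, so the last flush wins and the "happens-before" requirement is not addressed here.

```haskell
import           Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import qualified Data.Map   as Map

-- | Hypothetical interface to an external key-value store.
-- Only whole-map load and save are modelled in this sketch.
data Store k v = Store
  { loadAll :: IO (Map.Map k v)
  , saveAll :: Map.Map k v -> IO ()
  }

-- | A local Data.Map copy together with the store that backs it.
data NMap k v = NMap
  { cache   :: IORef (Map.Map k v)
  , backing :: Store k v
  }

-- | In-memory stand-in for a real store (assumption: for experiments only).
newMemoryStore :: IO (Store k v)
newMemoryStore = do
  ref <- newIORef Map.empty
  return Store { loadAll = readIORef ref, saveAll = writeIORef ref }

-- | Open a map, initialising the local copy from the store.
openNMap :: Store k v -> IO (NMap k v)
openNMap store = do
  m   <- loadAll store
  ref <- newIORef m
  return NMap { cache = ref, backing = store }

-- | Insert into the local copy only; nothing is written to the store.
insertN :: Ord k => k -> v -> NMap k v -> IO ()
insertN k v nm = modifyIORef' (cache nm) (Map.insert k v)

-- | Look up a key in the local copy only.
lookupN :: Ord k => k -> NMap k v -> IO (Maybe v)
lookupN k nm = Map.lookup k <$> readIORef (cache nm)

-- | "flush": overwrite the store contents with the local copy.
flush :: NMap k v -> IO ()
flush nm = readIORef (cache nm) >>= saveAll (backing nm)

-- | "refresh": replace the local copy with the store contents.
refresh :: NMap k v -> IO ()
refresh nm = loadAll (backing nm) >>= writeIORef (cache nm)
```

With two maps a and b opened on the same store, a key inserted into a is invisible to b until a is flushed and b is refreshed. A real backend would replace newMemoryStore with actions that talk to the chosen database, and would need some versioning scheme before concurrent writers could be handled safely.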
