On 20.12.2010, at 23:34, Jan Lehnardt wrote:

>
> On 20 Dec 2010, at 23:20, Sebastian Cohnen wrote:
>
>> question inside :)
>>
>> On 20.12.2010, at 23:02, Jan Lehnardt wrote:
>>
>>> Hi,
>>>
>>> On 20 Dec 2010, at 22:32, Chenini, Mohamed wrote:
>>>
>>>> Hi,
>>>>
>>>> I found this info on the net at
>>>> http://www.slideshare.net/danglbl/schemaless-databases
>>>> [...]
>>>> Does anyone knows if this was verified?
>>>
>>> I think the author's comment on slide 35 sums it up pretty nicely:
>>>
>>> "Of course this is just one (lame) test."
>>>
>>> Coming up good numbers is hard which means that people with easy ways to
>>> make them come up with bad ones.
>>>
>>> I've written about the difficulties on benchmarks databases on my blog:
>>>
>>> http://jan.prima.de/~jan/plok/archives/175-Benchmarks-You-are-Doing-it-Wrong.html
>>> http://jan.prima.de/~jan/plok/archives/176-Caveats-of-Evaluating-Databases.html
>>>
>>> They should give you a few pointers on why this is hard.
>>>
>>> --
>>>
>>> To the point: CouchDB generally performs best with concurrent load. In the
>>> case of loading data into CouchDB, bulk requests* will speed up things
>>> again. To push CouchDB to a write limit, you want to use concurrent bulk
>>> requests (specific numbers will depend on your data and hardware).
>>
>> Does this really speed up things? I've tried this approach (concurrent bulk
>> inserts) with small/big docs and small/big bulk chunk sizes: the difference
>> was not significant. I thought this was reasonable, since writes are
>> serialized anyways. The setup was one box generating documents, creating
>> bulks and keep them in memory and bulk insert batches of complete docs
>> (incl. simple monotonic increasing ints as doc ids) to another node. delayed
>> commit was off.
>
> Have you tested these against single doc inserts?

Sorry, I should have mentioned that I tested bulk vs. concurrent bulk inserts.

> Cheers
> Jan
> --
>
>>
>>>
>>> * http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API
>>>
>>> Unfortunately this means that these one-off benchmarks don't show any good
>>> numbers for CouchDB, yet fortunately this shows easily that these one-off
>>> benchmarks don't really reflect common real-world usage and should be
>>> discouraged.
>>>
>>> Hope that helps, let us know if you have any more questions :)
>>>
>>> Cheers
>>> Jan
>>> --
>>>
>>
>
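
For anyone who wants to repeat the comparison discussed in this thread (sequential bulk inserts vs. concurrent bulk inserts), here is a minimal sketch. It is not the code used for the test described above. The server URL, the database name `bench`, the document shape, the zero-padded ids, the batch size and the worker count are all assumptions made for illustration; the only CouchDB-specific part is a POST to `_bulk_docs` with a `{"docs": [...]}` body, as described on the bulk document API page linked above.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumptions, not taken from the thread: adjust to your own setup.
COUCH_URL = "http://localhost:5984"
DB_NAME = "bench"


def make_docs(count):
    """Build complete docs in memory with monotonically increasing ids.

    Ids are zero-padded strings so that they also sort in increasing order.
    """
    return [{"_id": f"{i:010d}", "value": i} for i in range(count)]


def chunk(docs, size):
    """Split the doc list into batches of at most `size` docs."""
    return [docs[i:i + size] for i in range(0, len(docs), size)]


def post_bulk(batch):
    """POST one batch to /{db}/_bulk_docs and return the HTTP status code."""
    body = json.dumps({"docs": batch}).encode("utf-8")
    req = urllib.request.Request(
        f"{COUCH_URL}/{DB_NAME}/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def bulk_insert(batches, post=post_bulk):
    """Send the batches one after another over a single client."""
    return [post(batch) for batch in batches]


def concurrent_bulk_insert(batches, workers=4, post=post_bulk):
    """Send the batches from several client threads at once.

    Results come back in the same order as `batches`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(post, batches))
```

To compare the two, wrap each call in `time.perf_counter()` and run them against separate, freshly created databases so the second run does not hit update conflicts on the same ids. As noted earlier in the thread, the numbers will depend on your data and hardware.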
