On 20.12.2010, at 23:34, Jan Lehnardt wrote:

> 
> On 20 Dec 2010, at 23:20, Sebastian Cohnen wrote:
> 
>> question inside :)
>> 
>> On 20.12.2010, at 23:02, Jan Lehnardt wrote:
>> 
>>> Hi,
>>> 
>>> On 20 Dec 2010, at 22:32, Chenini, Mohamed wrote:
>>> 
>>>> Hi,
>>>> 
>>>> I found this info on the net at 
>>>> http://www.slideshare.net/danglbl/schemaless-databases
>>>> [...]
>>>> Does anyone know if this was verified?
>>> 
>>> I think the author's comment on slide 35 sums it up pretty nicely:
>>> 
>>> "Of course this is just one (lame) test."
>>> 
>>> Coming up with good numbers is hard, which means that people with easy ways 
>>> to make them come up with bad ones.
>>> 
>>> I've written about the difficulties of benchmarking databases on my blog:
>>> 
>>> http://jan.prima.de/~jan/plok/archives/175-Benchmarks-You-are-Doing-it-Wrong.html
>>> http://jan.prima.de/~jan/plok/archives/176-Caveats-of-Evaluating-Databases.html
>>> 
>>> They should give you a few pointers on why this is hard.
>>> 
>>> --
>>> 
>>> To the point: CouchDB generally performs best under concurrent load. When 
>>> loading data into CouchDB, bulk requests* speed things up further. To push 
>>> CouchDB to its write limit, you want to use concurrent bulk requests 
>>> (specific numbers will depend on your data and hardware).
>> 
>> Does this really speed things up? I've tried this approach (concurrent bulk 
>> inserts) with small/big docs and small/big bulk chunk sizes: the difference 
>> was not significant. I thought this was reasonable, since writes are 
>> serialized anyway. The setup was one box generating documents, building the 
>> bulk batches and keeping them in memory, and then bulk inserting batches of 
>> complete docs (incl. simple monotonically increasing ints as doc ids) into 
>> another node. Delayed commit was off.
> 
> Have you tested these against single doc inserts?

Sorry, I should have mentioned that I've tested bulk vs. concurrent bulk inserts.
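
For anyone who wants to repeat the comparison, here is a minimal sketch of 
bulk vs. concurrent bulk inserts in Python. It is not the script used in the 
test described above. The helper names, the zero-padded string ids, and the 
`post` parameter (there only so the HTTP call can be swapped out) are 
assumptions for illustration; only the POST of {"docs": [...]} to _bulk_docs 
is CouchDB's documented bulk API.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def make_docs(count):
    """Generate small docs with monotonically increasing ids.

    CouchDB ids are strings, so the ints are zero-padded (an assumption;
    the original test only says "ints as doc ids").
    """
    return [{"_id": "%010d" % i, "value": i} for i in range(count)]


def chunked(docs, size):
    """Split the list of docs into batches of at most `size` docs."""
    return [docs[i:i + size] for i in range(0, len(docs), size)]


def post_bulk(db_url, batch):
    """POST one batch to the database's _bulk_docs endpoint; return the HTTP status."""
    body = json.dumps({"docs": batch}).encode("utf-8")
    req = urllib.request.Request(
        db_url.rstrip("/") + "/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def load(db_url, batches, workers=1, post=post_bulk):
    """Send all batches; workers=1 is plain bulk, workers>1 is concurrent bulk."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda batch: post(db_url, batch), batches))
```

To compare, build the batches once (e.g. chunked(make_docs(100000), 1000)), 
then time load(url, batches, workers=1) against load(url, batches, workers=4), 
each into a fresh database so the second run does not hit id conflicts. The 
URL, batch size and worker count are placeholders; as noted earlier in the 
thread, useful values depend on your data and hardware.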

> Cheers
> Jan
> -- 
> 
>> 
>>> 
>>> * http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API
>>> 
>>> Unfortunately, this means that these one-off benchmarks don't show any good 
>>> numbers for CouchDB. Fortunately, it also shows that these one-off 
>>> benchmarks don't really reflect common real-world usage and should be 
>>> discouraged.
>>> 
>>> Hope that helps, let us know if you have any more questions :)
>>> 
>>> Cheers
>>> Jan
>>> -- 
>>> 
>> 
> 
