Re: Please report your indexing speed

Jan Lehnardt Sun, 04 Mar 2012 09:47:26 -0800

On Mar 4, 2012, at 18:40 , Jan Lehnardt wrote:

> 
> On Mar 4, 2012, at 18:24 , Jan Lehnardt wrote:
> 
>> I updated the google doc with results from an EC2 cc1.4xlarge instance 
>> (details are in the spreadsheet)
>> 
>> This is on EBS and Ubuntu 11.04/64.
>> 
>> The results are a bit different from the previous machine, but that isn't at 
>> all unexpected.
>> 
>> tl;dr: for small docs (10bytes, 100bytes) 1.2.x-filipe beats 1.2.x and 1.1.1; 
>> for large docs (1000bytes), 1.2.x beats 1.2.x-filipe (6% difference).
> 
> Hah, I re-read the results to make sure this is correct and I found a 
> mistake. A copy-and-paste formula error overstated the improvements of 
> 1.2.x-filipe. This affects all my previous results.
> 
> The good thing is that 1.2.x-filipe is still faster than 1.1.1 and 1.2.x 
> across the board. Still significantly so, though in all but one case not by 
> *as* much as the roughly 30% reported before.
> 
> The tl;dr for the EC2 run now becomes: 1.2.x-filipe beats 1.1.1 and 1.2.x for 
> all doc sizes. For large docs (1000bytes), 1.2.x is also faster than 1.1.1, 
> but 1.2.x-filipe is faster still.
> 
> 
>> So far, across the board, 1.2.x-filipe is ~16% faster (stdev 9%) than 1.1.1 
>> for view builds.


Sorry for misquoting this line. It is new and the most significant one in this 
email, so I'll just repeat it :)


So far, across the board, 1.2.x-filipe is ~16% faster (stdev 9%) than 1.1.1 for 
view builds.
--------------------------------------------------------------------------------------------

The bigger the docs, the better the results, on both SSD and spinning disk.
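
For anyone who wants to double-check figures like the one above, here is a
minimal Python sketch of how such a summary could be computed. It assumes
that "X% faster" means X% less wall-clock time relative to 1.1.1, which may
not be exactly how the spreadsheet defines it. The function names, labels
and timings are made up for illustration and are not taken from the
spreadsheet.

```python
from statistics import mean, stdev


def percent_faster(baseline_secs, candidate_secs):
    """Time saved by the candidate, as a percentage of the baseline time."""
    return (baseline_secs - candidate_secs) / baseline_secs * 100.0


def summarize(runs):
    """Average each side's runs per configuration, then compare the averages.

    `runs` maps a configuration label to a pair of lists:
    (baseline run times, candidate run times), all in seconds.
    Returns the per-configuration gains, their mean and their sample stdev.
    """
    gains = {}
    for label, (baseline, candidate) in runs.items():
        gains[label] = percent_faster(mean(baseline), mean(candidate))
    values = list(gains.values())
    return gains, mean(values), stdev(values)


# Hypothetical timings in seconds, three runs each. NOT real measurements.
example_runs = {
    "ssd/10b": ([100.0, 102.0, 98.0], [90.0, 91.0, 89.0]),
    "ssd/1000b": ([200.0, 198.0, 202.0], [160.0, 162.0, 158.0]),
    "hdd/10b": ([150.0, 150.0, 150.0], [135.0, 134.0, 136.0]),
}

gains, avg_gain, spread = summarize(example_runs)
for label, gain in gains.items():
    print(f"{label}: {gain:.1f}% faster")
print(f"overall: ~{avg_gain:.0f}% faster (stdev {spread:.0f}%)")
```

Averaging the three runs per configuration before comparing follows the
test description quoted further down in this thread. Comparing throughput
instead of elapsed time would give different percentages.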

Cheers
Jan
-- 

> 
> 
> If you have any more hardware I could run this on, I'm happy to help with the 
> setup, it isn't hard :)
> 
> Cheers
> Jan
> --
> 
> 
>> 
>> This still makes me want to include Filipe's patch into 1.2.x.
>> 
>> Cheers
>> Jan
>> -- 
>> 
>> On Mar 4, 2012, at 10:24 , Jan Lehnardt wrote:
>> 
>>> Hey all,
>>> 
>>> I made another run with a bit of a different scenario.
>>> 
>>> 
>>> # The Scenario
>>> 
>>> I used a modified benchbulk.sh for inserting data (because it is an order 
>>> of magnitude faster than the other methods we had). I added a command line 
>>> parameter to specify the size of a single document in bytes (this was 
>>> previously hardcoded in the script). Note that this script creates docs 
>>> with btree-friendly incrementing IDs.
>>> 
>>> I added a new script benchview.sh which is basically the lower part of 
>>> Robert Newson's script. It creates a single view and queries it, measuring 
>>> execution time of curl.
>>> 
>>> And a third script, matrix.sh (yay), that runs the different 
>>> configurations on my system.
>>> 
>>> See https://gist.github.com/1971611 for the scripts.
>>> 
>>> I ran ./benchbulk.sh $size && ./benchview.sh for the following combinations, 
>>> all on Mac OS X 10.7.3, Erlang R15B, Spidermonkey 1.8.5:
>>> 
>>> - Doc sizes 10, 100, 1000 bytes
>>> - CouchDB 1.1.1, 1.2.x (as of last night), 1.2.x-filipe (as of last night + 
>>> Filipe's patch from earlier in the thread)
>>> - On an SSD and on a 5400rpm internal drive.
>>> 
>>> I ran each individual test three times and took the average to compare 
>>> numbers. The full report (see below) includes each individual run's numbers.
>>> 
>>> (The gist includes the raw output data from matrix.sh for the 5400rpm run. 
>>> For the SSD run, I don't have the original numbers anymore. I'm happy to 
>>> re-run this if you want that data as well.)
>>> 
>>> # The Numbers
>>> 
>>> See 
>>> https://docs.google.com/spreadsheet/ccc?key=0AhESVUYnc_sQdDJ1Ry1KMTQ5enBDY0s1dHk2UVEzMHc
>>>  for the full data set. It'd be great to get a second pair of eyes to make 
>>> sure I didn't make any mistakes.
>>> 
>>> See the "Grouped Data" sheet for comparisons.
>>> 
>>> tl;dr: 1.2.x is about 30% slower and 1.2.x-filipe is about 30% faster than 
>>> 1.1.1 in the scenario above.
>>> 
>>> 
>>> # Conclusion
>>> 
>>> +1 to include Filipe's patch into 1.2.x.
>>> 
>>> 
>>> 
>>> I'd love any feedback on methods, calculations and whatnot :)
>>> 
>>> Also, I can run more variations if you like, e.g. other Erlang or 
>>> SpiderMonkey versions, just let me know.
>>> 
>>> 
>>> Cheers
>>> Jan
>>> -- 
>>> 
>>> On Feb 28, 2012, at 14:17 , Jason Smith wrote:
>>> 
>>>> Forgive the clean new thread. Hopefully it will not remain so.
>>>> 
>>>> If you can, would you please clone https://github.com/jhs/slow_couchdb
>>>> 
>>>> And build whatever Erlangs and CouchDB checkouts you see fit, and run
>>>> the test. For example:
>>>> 
>>>> docs=500000 ./bench.sh small_doc.tpl
>>>> 
>>>> That should run the test and, God willing, upload the results to a
>>>> couch in the cloud. We should be able to use that information to
>>>> identify who you are, whether you are on SSD, what Erlang and Couch
>>>> build, and how fast it ran. Modulo bugs.
>>> 
>> 
>
