After a couple more days of benchmarking and trying all the suggestions, here is what I found out:
On a dual-core Pentium at 3.0GHz with Erlang 5.6 and CouchDB 0.8.0, using bulk writes, I get a throughput of 95 writes/second. I didn't get the 2000 per second that Michael did, but that is likely because his documents are considerably smaller than mine (each of my docs has a 4K-10K attachment). Upgrading to the latest CouchDB from svn improved this from 95/second to about 150/second.

On a quad-core Xeon at 3.0GHz with Erlang 5.6 and CouchDB 0.8.0, using bulk writes, I get a throughput of 120 writes/second. Upgrading to svn head pushed this up to about 200/second. (Woohoo, we are down from 2 weeks to 4 days to import my data.)

On both boxes I tried using a RAM disk to verify that writes were not bounded by I/O, and got exactly the same performance. I also tried parallelizing the writes among multiple databases in the same couch instance on both boxes, and again got exactly the same throughput. However, on the quad core, if I fired up two copies of couch running on two separate ports and parallelized across the two ports, throughput rose to just under 400/second and all 4 cores were utilized. (I've put a simplified sketch of that setup at the end of this mail.)

I understand why CouchDB serializes writes through a single updater thread for a single database file; clearly, letting two threads write to the same file can break consistency. However, it seems to me (and I make this comment knowing very little about Erlang) that each database should be able to get its own updater thread, or at least that there should be as many updater threads as there are CPUs on the box. Is there a reason this wasn't the design?

Also, are there any major gotchas I should be concerned about in terms of file formats between versions? If we start importing data using the 0.8 branch, how hard will an upgrade to 0.9 be? The reason I ask is that I am dealing with about 300 GB of data. If the upgrade will require running some conversion process over the old tables, I would like to start putting together an estimate of how much time that will take.

Thanks again for the response.

Josh

Chris Anderson wrote:
> On Wed, Jan 7, 2009 at 5:47 PM, Josh Bryan <[email protected]> wrote:
>
>> if I partition the data into two DBs and fire up two copies of couch, I
>> should be able to make use of another processor on the same machine? I'll
>> test this tomorrow along with the newer versions.
>
> Please do share your results. I am aware of some multi-core testing
> that's been done on Solaris and exotic Sun boxes, but knowing how this
> works for you (and making it work better) is important to us.
>
> Community: this is the perfect time where a standard benchmarking
> suite would be sweet. If anyone steps up to the plate on this, they
> win log(1000) internets.
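
P.S. In case it helps anyone reproduce the two-port numbers: here is a simplified sketch of the kind of harness I'm describing, not my actual import code. The hostnames, ports, batch size, and attachment size are placeholders, and it assumes two CouchDB instances already running with no authentication configured. It spawns one writer thread per instance, POSTs batches of docs (each with an inline base64 attachment) to _bulk_docs, and reports overall writes/second.

#!/usr/bin/env python3
# Sketch of a parallel bulk-write benchmark against two CouchDB instances
# on separate ports. Servers, batch size, and document shape are made-up
# placeholders for illustration.
import base64
import json
import os
import threading
import time
import urllib.error
import urllib.request

SERVERS = ["http://localhost:5984", "http://localhost:5985"]  # two couch instances
BATCH_SIZE = 100              # docs per _bulk_docs request
BATCHES = 50                  # batches per writer thread
ATTACHMENT_BYTES = 8 * 1024   # roughly matches my 4K-10K attachments

def request(method, url, body=None):
    # Minimal JSON-over-HTTP helper using only the standard library.
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method=method,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def make_doc(i):
    # Each doc carries an inline base64 attachment, like my real documents.
    blob = base64.b64encode(os.urandom(ATTACHMENT_BYTES)).decode()
    return {"_id": "doc-%06d" % i,
            "type": "benchmark",
            "_attachments": {"blob.bin": {"content_type": "application/octet-stream",
                                          "data": blob}}}

def writer(server, db, counts, idx):
    try:
        request("PUT", "%s/%s" % (server, db))   # create the db
    except urllib.error.HTTPError:
        pass                                     # ignore "already exists"
    written = 0
    for b in range(BATCHES):
        docs = [make_doc(b * BATCH_SIZE + i) for i in range(BATCH_SIZE)]
        request("POST", "%s/%s/_bulk_docs" % (server, db), {"docs": docs})
        written += len(docs)
    counts[idx] = written

counts = [0] * len(SERVERS)
threads = [threading.Thread(target=writer, args=(s, "bench_%d" % i, counts, i))
           for i, s in enumerate(SERVERS)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
print("%d docs in %.1fs -> %.0f writes/sec" % (sum(counts), elapsed, sum(counts) / elapsed))

Varying the number of entries in SERVERS should show how throughput scales with the number of couch processes; a single entry reproduces the single-instance case.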
