Re: Tail Append Headers

Damien Katz Tue, 19 May 2009 09:51:13 -0700

As I think about it, I'm not surprised you aren't getting better numbers with delayed updates, which amortize the cost of fsync of all the docs being updated per second. But to get half the performance seems wrong. I'm hoping it's something easy to fix; we'll need to run a profiler to be sure.
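To make the amortization point concrete, here is a minimal Python sketch, not CouchDB code, that appends the same documents to a file twice: once with an fsync after every document (the analogue of a full commit per update) and once with a single fsync per batch (the analogue of delayed updates). The function name, file names, and batch sizes are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def append_docs(path, docs, batch_size):
    """Append docs to path, calling fsync once per batch.

    Returns the number of fsync calls made. This is an illustrative
    stand-in for a commit policy, not how CouchDB writes its files.
    """
    fsyncs = 0
    with open(path, "ab") as f:
        for start in range(0, len(docs), batch_size):
            for doc in docs[start:start + batch_size]:
                f.write(doc)
            f.flush()              # push Python's buffer to the OS
            os.fsync(f.fileno())   # ask the OS to push it to disk
            fsyncs += 1
    return fsyncs


# 1000 small hypothetical documents.
docs = [b'{"n": %d}\n' % i for i in range(1000)]

with tempfile.TemporaryDirectory() as tmp:
    # One fsync per document: the "full commit on every update" case.
    per_doc = append_docs(os.path.join(tmp, "full_commit.db"), docs, batch_size=1)
    # One fsync per 100 documents: the "delayed update" case.
    batched = append_docs(os.path.join(tmp, "delayed.db"), docs, batch_size=100)

print("fsync calls, per-doc commit:", per_doc)
print("fsync calls, batched commit:", batched)
```

The bytes written are identical in both cases; only the number of fsync calls differs, and that per-call cost is what batching spreads across many documents.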

I'd like to see benchmarks across a variety of loads, and also view build behavior. For one thing, using full commits on individual doc updates, the new code should be much faster. I also think view refreshes could be slower or faster. Slower because the docs they are mapping are more sparse on disk, but faster because it requires no fsync (if you are using a filesystem that guarantees ordered sequential writes).

Also, if performance generally turns out to be all around slower, we'll have to discuss if the pure tail append change is actually worth it. Maybe we can tail append headers with the old design too, but they are only ever used when the front header is bad. The only problem is, without implementing the current design, I don't know of a workable way to distinguish a valid header from something that happens to look like a couchdb file header, such as a couchdb file attached inside a document in a live db, or an intentional attack.


-Damien

On May 18, 2009, at 7:43 PM, Chris Anderson wrote:

On Mon, May 18, 2009 at 10:59 AM, Damien Katz <[email protected]> wrote:

Feedback on all this welcome. Please try out the branch to shake out any
bugs or performance problems that might be lurking.


The code looks simpler, which is a nice surprise considering the
storage is actually more robust.

Here are comparative benchmarks on my MacBook. Two runs of
hovercraft:lightning(), which factors out all http / json overhead and
inserts small documents in batches of 1000. I've also done a round of
running my curl/bash benchmark script to insert 100k docs (with
sequential ids).

append only:
2> hovercraft:lightning().
Inserted 100000 docs in 27.614173 seconds with batch size of 1000.
(3621.328800974775 docs/sec)
3> hovercraft:lightning().
Inserted 100000 docs in 27.508795 seconds with batch size of 1000.
(3635.201032978726 docs/sec)

curl/bash: 2285.7 docs/sec

trunk:
2> hovercraft:lightning().
Inserted 100000 docs in 13.237762 seconds with batch size of 1000.
(7554.146992520337 docs/sec)
3> hovercraft:lightning().
Inserted 100000 docs in 13.032335 seconds with batch size of 1000.
(7673.222028132334 docs/sec)

curl/bash: 3417.6 docs/sec

So the preliminary results are that the append-only branch (on my particular
hardware with a contrived micro-benchmark) is about twice as slow.
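As a quick check on the "about twice as slow" figure, the following Python sketch recomputes the slowdown from the throughput numbers quoted above. The variable names are illustrative; the values are copied from the benchmark output in this message.

```python
# Throughput in docs/sec, copied from the two hovercraft:lightning() runs
# reported for each branch.
append_only_runs = [3621.328800974775, 3635.201032978726]
trunk_runs = [7554.146992520337, 7673.222028132334]

# Single curl/bash figure reported for each branch.
append_only_curl = 2285.7
trunk_curl = 3417.6


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# How many times faster trunk is than the append-only branch.
hovercraft_ratio = mean(trunk_runs) / mean(append_only_runs)
curl_ratio = trunk_curl / append_only_curl

print("hovercraft: trunk is %.2fx faster" % hovercraft_ratio)
print("curl/bash:  trunk is %.2fx faster" % curl_ratio)
```

With these inputs trunk comes out roughly 2.1 times faster in the hovercraft runs and roughly 1.5 times faster in the curl/bash run, so the gap narrows once http / json overhead is included.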

It's a matter of priorities. Do we want absolute robustness, or do we
want more performance? Also, the append-only stuff is brand-new and
could conceivably be optimized. I would not be surprised at all to see
it get faster than trunk, with enough tuning.

Chris

--
Chris Anderson
http://jchrisa.net
http://couch.io
