On Wed, May 20, 2009 at 9:48 AM, Adam Kocoloski <[email protected]> wrote:
> On May 20, 2009, at 11:34 AM, Damien Katz wrote:
>
>> On May 20, 2009, at 11:26 AM, Paul Davis wrote:
>>
>>> On Wed, May 20, 2009 at 11:22 AM, Damien Katz <[email protected]> wrote:
>>>>
>>>> On May 20, 2009, at 11:09 AM, Damien Katz wrote:
>>>>>
>>>>> Previously, only btree nodes were saved compressed and docs were not.
>>>>> I didn't realize the compression was so expensive, but now that I've
>>>>> switched it off on both the branch and on trunk, I see big performance
>>>>> boosts for both. And now the tail append stuff is slightly faster on
>>>>> my machine.
>>>>
>>>> To clarify, disabling the compression completely on both trunk and the
>>>> branch results in big performance increases for both, with the
>>>> tail_header branch now being slightly faster than trunk running the
>>>> lightning test on my machine.
>>>>
>>>> -Damien
>>>
>>> Awesome. Is there a noticeable size difference on the database files?
>>
>> It looks to take about 2x as much disk space as without compression.
>
> Nice find. I also see the tail_header branch slightly faster than trunk
> with compression turned off on both, and the DB size increased by ~2x.
> For kicks I tried turning the compression level down to 1 (the default is
> 6 on a 1-9 scale). Running hovercraft:lightning() gives me
>
>   compression level    insert rate (docs/sec)    db size
>   0                    11725                     16.7MB
>   1                    4186                      8.2MB
>   6 (default)          3938                      7.8MB
This is a really cool chart. It'd be fun to track this metric over time.
I'm getting around 9k docs/sec from hovercraft:lightning() on the append
branch, a substantial step up from trunk, which runs closer to 4.5k
docs/sec. Trunk with compression off gives me 5.5-6k docs/sec, so
tail_append is clearly faster. I wonder what a compressing filesystem
would cost us in performance?

Good work on this branch, Damien. I'm pretty impressed by how quickly you
put it together.

> So it's still a huge cost. The nice thing is that binary_to_term seems
> perfectly happy reading a mix of compressed and uncompressed binaries,
> which means the compression level can be a configuration parameter if we
> want it to be. gzip decompresses pretty quickly, so I'm guessing that
> reading a compressed DB will be faster than an uncompressed one. We'll
> have to measure it, though.
>
> Adam
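Re: making the level configurable, here's a quick sketch of the Erlang
behavior Adam is describing. This is just a standalone illustration, not
code from the branch; the module name and the sample doc term are made up:

%% compress_sketch.erl -- standalone illustration, not CouchDB code.
%% Shows that binary_to_term/1 transparently decodes terms serialized
%% at any compression level, so the level can be a plain config knob.
-module(compress_sketch).
-export([demo/0]).

demo() ->
    %% A hypothetical doc-like term with repetitive content.
    Doc = {doc, lists:duplicate(100, <<"some repetitive body text">>)},

    %% Serialize at three levels; {compressed, 0} means no compression.
    Plain = term_to_binary(Doc, [{compressed, 0}]),
    Fast  = term_to_binary(Doc, [{compressed, 1}]),
    Best  = term_to_binary(Doc, [{compressed, 6}]),
    io:format("sizes: level0=~p level1=~p level6=~p~n",
              [byte_size(Plain), byte_size(Fast), byte_size(Best)]),

    %% Rough encode-cost comparison, in microseconds.
    {T0, _} = timer:tc(fun() -> term_to_binary(Doc, [{compressed, 0}]) end),
    {T6, _} = timer:tc(fun() -> term_to_binary(Doc, [{compressed, 6}]) end),
    io:format("encode us: level0=~p level6=~p~n", [T0, T6]),

    %% binary_to_term/1 decodes all of them identically, so a single DB
    %% file can hold a mix of compressed and uncompressed terms.
    Doc = binary_to_term(Plain),
    Doc = binary_to_term(Fast),
    Doc = binary_to_term(Best),
    ok.

If that holds, the level could live in the ini file, and databases written
at a different level would still read back fine.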
--
Chris Anderson
http://jchrisa.net
http://couch.io