A few more hints, after investigating with the team:

1. We can't really have rotating DBs, as we sometimes want to keep older
transaction records in the DB for a longer time.
2. We never replicate or update these records, so the _revs_limit won't
really change much (or will it still matter for the compaction?).
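For reference, _revs_limit is a per-database setting over plain HTTP; a
quick sketch of reading and lowering it in Python (the local URL and the
"txlogs" database name are placeholders, not our actual setup):

    import requests  # third-party HTTP client

    db = "http://localhost:5984/txlogs"  # placeholder node and database

    # Read the current revision-tree depth limit (CouchDB's default is 1000).
    print(requests.get(db + "/_revs_limit").json())

    # Lower it. This trims how many revision stubs are kept per document,
    # which mostly matters for documents that are updated or replicated;
    # hence the open question of whether it buys us anything here.
    requests.put(db + "/_revs_limit", data="10")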
On Thu, Jun 14, 2012 at 3:14 PM, Nicolas Peeters <[email protected]> wrote:

> Actually we never modify those records. We just query them in certain
> cases.
>
> Regarding Robert's suggestion, I was indeed confused because he was
> suggesting to delete them one by one.
>
> I need to read about lowering the _revs_limit. We never replicate this
> data.
>
>
> On Thu, Jun 14, 2012 at 3:08 PM, Tim Tisdall <[email protected]> wrote:
>
>> I think he's suggesting avoiding compaction completely. Just delete
>> the old DB when you've finished deleting all the records.
>>
>> On Thu, Jun 14, 2012 at 9:05 AM, Nicolas Peeters <[email protected]>
>> wrote:
>> > Interesting suggestion. However, this would perhaps have the same
>> > effect (deleting/compacting the old DB is what makes the system
>> > slower)...?
>> >
>> > On Thu, Jun 14, 2012 at 2:54 PM, Robert Newson <[email protected]>
>> > wrote:
>> >
>> >> Do you eventually delete every document you add?
>> >>
>> >> If so, consider using a rolling database scheme instead. At some
>> >> point, perhaps daily, start a new database and write new transaction
>> >> logs there. Continue deleting old logs from the previous database(s)
>> >> until they're empty (doc_count:0) and then delete the database.
>> >>
>> >> B.
>> >>
>> >> On 14 June 2012 13:44, Nicolas Peeters <[email protected]> wrote:
>> >> > I'd like some advice from the community regarding compaction.
>> >> >
>> >> > *Scenario:*
>> >> >
>> >> > We have a large-ish CouchDB database that is being used for
>> >> > transactional logs (very write-heavy). Once in a while we delete
>> >> > some of the records in large batches, and we have scheduled
>> >> > compaction (not automatic (yet)) every 12 hours.
>> >> >
>> >> > From what I can see, the DB is being hammered significantly every
>> >> > 12 hours, and the compaction takes 4 hours (with 50-100GB of log
>> >> > data).
>> >> >
>> >> > *The problem:*
>> >> >
>> >> > Compaction takes a very long time and reduces the performance of
>> >> > the stack. It seems hard for the compaction process to "keep up"
>> >> > with the insertions, which is why it takes so long. Also, I'm not
>> >> > sure how "incremental" the compaction is...
>> >> >
>> >> > 1. In this case, would it make sense to run the compaction more
>> >> > often (every 10 minutes), since we're write-heavy?
>> >> >    1. Should we just run it more often, so it hopefully doesn't do
>> >> >    unnecessary work in big chunks? Actually, in our case, we
>> >> >    should probably never compact automatically if there has been
>> >> >    no "termination".
>> >> >    2. Or only once in a while (a bigger batch, but less "useless"
>> >> >    overhead)?
>> >> >    3. Or should we just wait until a given size (which is the real
>> >> >    problem) is hit, and use the auto compaction (in CouchDB 1.2.0)
>> >> >    for this?
>> >> > 2. In CouchDB 1.2.0 there's a new feature: auto compaction
>> >> > <http://wiki.apache.org/couchdb/Compaction#Automatic_Compaction>,
>> >> > which may be useful for us. There's the "strict_window" option to
>> >> > give a maximum amount of time to compact and cancel the compaction
>> >> > after that (in order not to have it running for 4h+…). I'm
>> >> > wondering what the impact of that is in the long run. What if the
>> >> > compaction cannot be completed in that window?
>> >> >
>> >> > Thanks a lot!
>> >> >
>> >> > Nicolas
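For anyone reading this thread later, Robert's rolling-database scheme
comes down to something like the following Python sketch (the couch URL,
the "txlog_" naming, and the daily granularity are assumptions, not part
of his mail):

    import datetime
    import requests  # third-party HTTP client

    COUCH = "http://localhost:5984"  # placeholder node

    def current_db():
        """Write new transaction logs into a per-day database."""
        name = "txlog_" + datetime.date.today().strftime("%Y_%m_%d")
        requests.put("%s/%s" % (COUCH, name))  # 201 on create, 412 if it exists
        return name

    def drop_when_empty(name):
        """Keep deleting old logs from a previous database; once its
        doc_count hits 0, delete the whole database instead of compacting."""
        info = requests.get("%s/%s" % (COUCH, name)).json()
        if info["doc_count"] == 0:
            requests.delete("%s/%s" % (COUCH, name))
            return True
        return False

The appeal is that deleting a database just unlinks the .couch file, so
the space comes back immediately and no compactor ever has to crawl the
write-heavy data.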
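And for the auto-compaction question: in 1.2.0 it is driven from local.ini,
roughly as below (the thresholds and the time window are made-up values;
see the wiki page linked above for the full set of options):

    [compaction_daemon]
    ; how often, in seconds, the daemon checks databases against the rules
    check_interval = 300
    ; skip files smaller than this (bytes)
    min_file_size = 131072

    [compactions]
    ; compact once 70% of the file is reclaimable, but only between 01:00
    ; and 05:00; with strict_window the run is canceled when the window ends
    _default = [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {from, "01:00"}, {to, "05:00"}, {strict_window, true}]

As I understand it, a canceled run isn't entirely lost, since the partially
written .compact file is picked up again in the next window (worth
verifying), but if writes outpace the compactor it may simply never
finish, which is the long-run risk being asked about.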
