nickva commented on issue #1579: Autocompaction never triggered in Couch v2.2 URL: https://github.com/apache/couchdb/issues/1579#issuecomment-442655883

@regner Wondering what your `min_file_size` is set to? If it is 131072, that would mean db or view files smaller than that won't be compacted.

If you see a lot fewer `Fragmentation for database` messages, it probably means the db iterator didn't get to them yet. It was processing them one by one and did 1867, but never made it to the others. Notice what it does here:

https://github.com/apache/couchdb/blob/369bec2b7db54d6781f4994543f92aa9bf24d28d/src/couch/src/couch_compaction_daemon.erl#L140

It checks `check_period`, then if the time window is good, it tries to compact. If it tries to compact, it will compute the fragmentation and emit that log line.

Notice another interesting setting, and that's snooze:

```
SnoozePeriod = config:get_integer("compaction_daemon", "snooze_period", 3),
```

That's the amount of time it sleeps between processing each db. That's done so it doesn't stampede the couch_server component and add too much load as soon as the time window hits. But, at the same time, it defaults to 3 seconds. If it takes a few more seconds to process a db in addition to the snooze, say 7 seconds, that would be 10 * 1867 = 18670 seconds, or about 5 hours, which is close to the size of your time window.

Also, it seems db shard iteration happens deterministically, in the same order. So you might notice a number of your dbs are compacted but others are not. For example, if you have access to a remsh on the cluster, you can try running:

```
couch_server:all_databases(fun(DbName, {Cnt, Max}) ->
    io:format("... ~s~n", [DbName]),
    case Cnt < Max of
        true -> {ok, {Cnt+1, Max}};
        false -> {stop, ok}
    end
end, {0, 10}).
```

That will print the first 10. I think you'll see that you get the same list every time; that's the order in which they would be compacted.

```
... shards/c0000000-dfffffff/_global_changes.1543445885
... shards/c0000000-dfffffff/_users.1543445884
... shards/c0000000-dfffffff/_replicator.1543445884
... shards/e0000000-ffffffff/_global_changes.1543445885
... shards/e0000000-ffffffff/_users.1543445884
... shards/e0000000-ffffffff/_replicator.1543445884
... shards/80000000-ffffffff/db1.1543445935
... shards/80000000-ffffffff/db1.1543446372
... shards/80000000-ffffffff/db1.1543446088
... shards/80000000-ffffffff/db1.1543446212
... shards/80000000-ffffffff/db1.1543446314
```

One thing to try is to extend the window and reduce the snooze to 1 second or even 0. Increase the min file size so it skips some of the smaller files. Increase the fragmentation thresholds so it only defragments highly fragmented things.

Another option to try is `{parallel_view_compaction, true}` in the `_default` line. That will start compacting the views at the same time as the database, so it might go faster.

```
_default = [{db_fragmentation, "40%"}, {view_fragmentation, "30%"}, {from, "03:00"}, {to, "09:00"}, {parallel_view_compaction, true}]
```
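Put together, the tweaks suggested above might look something like this in your ini config. The exact thresholds, window, and file-size cutoff here are placeholders to adapt to your own load, not recommendations:

```
[compaction_daemon]
; sleep less between dbs (default is 3 seconds)
snooze_period = 1
; skip files smaller than this many bytes entirely
min_file_size = 1048576

[compactions]
; wider window, higher fragmentation thresholds, views in parallel
_default = [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {from, "01:00"}, {to, "11:00"}, {parallel_view_compaction, true}]
```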
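To see where the 5-hour figure comes from, here's the back-of-envelope math as a quick script. The 7-second per-db processing time is an assumption for illustration; only the 3-second default snooze and the 1867 db count come from the discussion above:

```python
# Rough estimate of how long one full compaction-daemon pass takes:
# each db costs its processing time plus the snooze between dbs.
def pass_duration_hours(num_dbs, snooze_period_s, per_db_s):
    return num_dbs * (snooze_period_s + per_db_s) / 3600.0

# 1867 dbs, 3 s snooze (the default), assumed 7 s of processing per db:
print(round(pass_duration_hours(1867, 3, 7), 1))  # ~5.2 hours
```

If that pass length exceeds the compaction window, the dbs at the tail of the (deterministic) iteration order never get reached, which matches what you're seeing.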
