We have 100+ tablet servers with approximately 860 tablets per server, and we ingest 300K+ docs/day. Recently, queries that run during a minor or major compaction started taking 100+ seconds, as opposed to about 2 seconds when no compaction is running. Everyone on the cluster is affected, including MapReduce jobs and batch scanners.
One table has as many as 65K tablets. In the hope of reducing compactions, yesterday we changed three settings on the 2 tables that appeared to cause most of the compactions: compaction.ratio from 3 to 5, table.file.max from 15 to 45, and split.threshold from 725M to 2G. The tservers are set to 3G; top shows 6G RES and 7G VIRT for the one I checked.

The odd thing is that we expected the number of tablets to change, and it did not. The only thing that happened was that the number of compactions went up, but the duration of the compactions went down by about half. Queries during off times did not seem to change.

One more thing: we only store docs < 64M in Accumulo; anything larger is written directly to HDFS.

The question is: is there a way to reduce the compaction frequency and/or duration?

Thanks in advance.
Paul
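
For context on the compaction.ratio change mentioned above, here is a minimal sketch of the ratio-based file selection rule as it is generally described in the Accumulo user manual: a set of files is compacted when the largest file size times the ratio is at most the sum of all file sizes in the set; otherwise the largest file is dropped and the rest are reconsidered. The function name `pick_files_to_compact` and the file sizes are hypothetical, chosen only for illustration; this is not Accumulo code, and it ignores table.file.max and everything else the tablet server takes into account.

```python
def pick_files_to_compact(sizes, ratio):
    """Return the file sizes a ratio-based major compaction would pick, or [].

    Illustrative only: an assumed simplification of the documented rule,
    not the actual tablet server implementation.
    """
    candidates = sorted(sizes, reverse=True)  # largest file first
    total = sum(candidates)
    while len(candidates) > 1:
        # Compact this set if largest * ratio <= sum of the set.
        if candidates[0] * ratio <= total:
            return candidates
        # Otherwise drop the largest file and reconsider the remainder.
        total -= candidates[0]
        candidates = candidates[1:]
    return []


# Hypothetical tablet: one large file plus four small ones (sizes in MB).
files_mb = [500, 100, 100, 100, 100]
print(pick_files_to_compact(files_mb, 3))  # -> [100, 100, 100, 100]
print(pick_files_to_compact(files_mb, 5))  # -> []
```

Under this simplified rule, the same set of files that qualifies for a compaction at ratio 3 does not qualify at ratio 5, which is why a higher ratio would normally be expected to mean fewer major compactions.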
