[jira] [Updated] (HBASE-4579) CST.requestCompaction semantics changed, logs are now spammed when too many store files
[ https://issues.apache.org/jira/browse/HBASE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-4579: -- Attachment: HBASE-4579-v2.patch Ted pointed out offline that setting the priority higher is actually putting it farther in the queue, and upon inspection of the code it seems like the blocking stuff is already taken care of. I must say, the compaction priority code isn't easy to grok. CST.requestCompaction semantics changed, logs are now spammed when too many store files --- Key: HBASE-4579 URL: https://issues.apache.org/jira/browse/HBASE-4579 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.0 Attachments: HBASE-4579-v2.patch, HBASE-4579.patch Another bug I'm not so sure what's going on. I see this in my log: {quote} 2011-10-12 00:23:43,435 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:44,335 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:45,236 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:46,136 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:47,036 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:47,936 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact {quote} It spams for a while, and a little later instead I get: {quote} 2011-10-12 00:26:52,139 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:53,040 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:53,940 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:54,840 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:55,741 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:56,641 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. {quote} I believe I also saw something like that for flushes, but the region was closing so at least I know why it was spamming (would be nice if it just unrequested the flush): {quote} 2011-10-12 00:26:40,693 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,921 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,922 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region
[jira] [Updated] (HBASE-4579) CST.requestCompaction semantics changed, logs are now spammed when too many store files
[ https://issues.apache.org/jira/browse/HBASE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-4579: -- Summary: CST.requestCompaction semantics changed, logs are now spammed when too many store files (was: Compactions that can't run are requested forever) I spent some time debugging this and the debug I added really helped: {quote} 2011-10-13 22:04:16,633 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Large Compaction requested: regionName=TestTable,0028803615,1318381132063.1724203bae0dda87cddfaac9700b9dc6., storeName=info, fileCount=4, fileSize=219.6m (54.8m, 55.0m, 55.0m, 54.8m), priority=-1, time=6062002855079899; Because: regionserver62023.cacheFlusher; compaction_queue=(1:0), split_queue=0 2011-10-13 22:04:21,999 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0028803615,1318381132063.1724203bae0dda87cddfaac9700b9dc6. 2011-10-13 22:04:21,999 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region TestTable,0028803615,1318381132063.1724203bae0dda87cddfaac9700b9dc6. has too many store files; delaying flush up to 9ms 2011-10-13 22:04:21,999 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 1724203bae0dda87cddfaac9700b9dc6 2011-10-13 22:04:22,000 ERROR org.apache.hadoop.hbase.regionserver.Store: java.lang.Exception at org.apache.hadoop.hbase.regionserver.Store.compactSelection(Store.java:1007) at org.apache.hadoop.hbase.regionserver.Store.requestCompaction(Store.java:908) at org.apache.hadoop.hbase.regionserver.CompactSplitThread.requestCompaction(CompactSplitThread.java:204) at org.apache.hadoop.hbase.regionserver.CompactSplitThread.requestCompaction(CompactSplitThread.java:177) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:360) at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:239) ... 2011-10-13 22:05:52,584 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Waited 90585ms on a compaction to clean up 'too many store files'; waited long enough... proceeding with flush of TestTable,0028803615,1318381132063.1724203bae0dda87cddfaac9700b9dc6. {quote} The exception is there just to show me where the call was coming from. So what we see is that compaction is grabbing all the store files and during that time the MemStoreFlusher is trying to flush but hits the too many store files issue. It used to be that requesting a compaction was just putting something on a queue silently, but now it's actually going all the way down into Store and that's why it's now spammy. Fixes I think are required: - For debugging purposes, show the encoded name when printing the info: no store files to compact message because it's pretty much useless if you don't know what it belongs to. - Require the compaction only once, the code already knows when it's the first time you spin so no need to ask it again and again. - Set a higher priority for the compaction CST.requestCompaction semantics changed, logs are now spammed when too many store files --- Key: HBASE-4579 URL: https://issues.apache.org/jira/browse/HBASE-4579 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.0 Another bug I'm not so sure what's going on. I see this in my log: {quote} 2011-10-12 00:23:43,435 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:44,335 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:45,236 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:46,136 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:47,036 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:47,936 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact {quote} It spams for a while, and a little later instead I get: {quote} 2011-10-12 00:26:52,139 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:53,040 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:53,940 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:54,840 DEBUG
[jira] [Updated] (HBASE-4579) CST.requestCompaction semantics changed, logs are now spammed when too many store files
[ https://issues.apache.org/jira/browse/HBASE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-4579: -- Attachment: HBASE-4579.patch Does the 3 things I listed above, tested on a cluster and verified it stopped spamming. CST.requestCompaction semantics changed, logs are now spammed when too many store files --- Key: HBASE-4579 URL: https://issues.apache.org/jira/browse/HBASE-4579 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.0 Attachments: HBASE-4579.patch Another bug I'm not so sure what's going on. I see this in my log: {quote} 2011-10-12 00:23:43,435 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:44,335 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:45,236 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:46,136 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:47,036 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact 2011-10-12 00:23:47,936 DEBUG org.apache.hadoop.hbase.regionserver.Store: info: no store files to compact {quote} It spams for a while, and a little later instead I get: {quote} 2011-10-12 00:26:52,139 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:53,040 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:53,940 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:54,840 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:55,741 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. 2011-10-12 00:26:56,641 DEBUG org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of info. Only 2 file(s) of size 176.4m have met compaction criteria. {quote} I believe I also saw something like that for flushes, but the region was closing so at least I know why it was spamming (would be nice if it just unrequested the flush): {quote} 2011-10-12 00:26:40,693 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,694 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,733 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,873 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,921 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5. 2011-10-12 00:26:40,922 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region TestTable,0038168581,1318378894213.2beb8a1e29382a8d3e90a88b9662e5f5., flushing=false, writesEnabled=false 2011-10-12 00:26:40,923 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on