[ https://issues.apache.org/jira/browse/HBASE-2462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924143#action_12924143 ]
Jonathan Gray commented on HBASE-2462:
--------------------------------------

Simulations of the old algorithm:

(Really not much different from the new one except harder to tune for reads)

FINAL COMPARISON
2010-10-23 00:22:57,841 DEBUG [main] regionserver.TestCompact(272): MOST EFFICIENT THROUGHPUT
2010-10-23 00:22:57,841 DEBUG [main] regionserver.TestCompact(274): --- Config ---
numPuts=1000000 putSizeRange=1.0KB to 10.0KB numPutsPerGet=10 flushSizeRange=64.0MB to 256.0MB max=10, threshold=3, force=6, factor=0.5
--- Result ---
files=null memstoreSize=0B totalSize=0B totalThroughput=22.7GB averageFilesPerGet=3.56323
2010-10-23 00:22:57,841 DEBUG [main] regionserver.TestCompact(274): --- Config ---
numPuts=1000000 putSizeRange=1.0KB to 10.0KB numPutsPerGet=10 flushSizeRange=64.0MB to 256.0MB max=10, threshold=3, force=3, factor=0.5
--- Result ---
files=null memstoreSize=0B totalSize=0B totalThroughput=22.9GB averageFilesPerGet=3.66561
2010-10-23 00:22:57,842 DEBUG [main] regionserver.TestCompact(274): --- Config ---
numPuts=1000000 putSizeRange=1.0KB to 10.0KB numPutsPerGet=10 flushSizeRange=64.0MB to 256.0MB max=10, threshold=3, force=3, factor=0.5
--- Result ---
files=null memstoreSize=0B totalSize=0B totalThroughput=25.0GB averageFilesPerGet=3.61562
2010-10-23 00:22:57,842 DEBUG [main] regionserver.TestCompact(274): --- Config ---
numPuts=1000000 putSizeRange=1.0KB to 10.0KB numPutsPerGet=10 flushSizeRange=64.0MB to 256.0MB max=10, threshold=3, force=3, factor=0.5
--- Result ---
files=null memstoreSize=0B totalSize=0B totalThroughput=27.1GB averageFilesPerGet=3.64184
2010-10-23 00:22:57,843 DEBUG [main] regionserver.TestCompact(274): --- Config ---
numPuts=1000000 putSizeRange=1.0KB to 10.0KB numPutsPerGet=10 flushSizeRange=64.0MB to 256.0MB max=10, threshold=3, force=3, factor=0.5
--- Result ---
files=null memstoreSize=0B totalSize=0B totalThroughput=32.5GB averageFilesPerGet=3.8605
2010-10-23 00:22:57,843 DEBUG [main] regionserver.TestCompact(278): MOST EFFICIENT GETS
2010-10-23 00:22:57,844 DEBUG [main] regionserver.TestCompact(280): --- Config ---
numPuts=1000000 putSizeRange=1.0KB to 10.0KB numPutsPerGet=10 flushSizeRange=64.0MB to 256.0MB max=10, threshold=3, force=6, factor=0.5
--- Result ---
files=null memstoreSize=0B totalSize=0B totalThroughput=22.7GB averageFilesPerGet=3.56323
2010-10-23 00:22:57,844 DEBUG [main] regionserver.TestCompact(280): --- Config ---
numPuts=1000000 putSizeRange=1.0KB to 10.0KB numPutsPerGet=10 flushSizeRange=64.0MB to 256.0MB max=10, threshold=3, force=3, factor=0.5
--- Result ---
files=null memstoreSize=0B totalSize=0B totalThroughput=25.0GB averageFilesPerGet=3.61562
2010-10-23 00:22:57,844 DEBUG [main] regionserver.TestCompact(280): --- Config ---
numPuts=1000000 putSizeRange=1.0KB to 10.0KB numPutsPerGet=10 flushSizeRange=64.0MB to 256.0MB max=10, threshold=3, force=3, factor=0.5
--- Result ---
files=null memstoreSize=0B totalSize=0B totalThroughput=27.1GB averageFilesPerGet=3.64184
2010-10-23 00:22:57,844 DEBUG [main] regionserver.TestCompact(280): --- Config ---
numPuts=1000000 putSizeRange=1.0KB to 10.0KB numPutsPerGet=10 flushSizeRange=64.0MB to 256.0MB max=10, threshold=3, force=3, factor=0.5
--- Result ---
files=null memstoreSize=0B totalSize=0B totalThroughput=22.9GB averageFilesPerGet=3.66561
2010-10-23 00:22:57,844 DEBUG [main] regionserver.TestCompact(280): --- Config ---
numPuts=1000000 putSizeRange=1.0KB to 10.0KB numPutsPerGet=10 flushSizeRange=64.0MB to 256.0MB max=10, threshold=3, force=3, factor=0.5
--- Result ---
files=null memstoreSize=0B totalSize=0B totalThroughput=32.5GB averageFilesPerGet=3.8605
2010-10-23 00:22:57,845 DEBUG [main] regionserver.TestCompact(284): Throughput Range: 22.7GB to 32.5GB
2010-10-23 00:22:57,845 DEBUG [main] regionserver.TestCompact(287): Num Files Per Get Range: 3.56323 to 3.8605

> Review compaction heuristic and move compaction code out so standalone and
> independently testable
> -------------------------------------------------------------------------------------------------
>
> Key: HBASE-2462
> URL: https://issues.apache.org/jira/browse/HBASE-2462
> Project: HBase
> Issue Type: Improvement
> Reporter: stack
> Assignee: Jonathan Gray
> Priority: Critical
>
> Anything that improves our i/o profile makes hbase run smoother. Over in
> HBASE-2457, good work has been done already describing the tension between
> minimizing compactions versus minimizing count of store files. This issue is
> about following on from what has been done in 2457 but also breaking the
> hard-to-read compaction code out of Store.java into a standalone class that
> can be more easily tested (and easily analyzed for its performance
> characteristics).
> If possible, in the refactor, we'd allow specification of alternate merge
> sort implementations.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.