[ https://issues.apache.org/jira/browse/OMID-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16819794#comment-16819794 ]
Yonatan Gottesman commented on OMID-145:
----------------------------------------

[~lhofhansl] Yesterday Ohad raised this problem in https://jira.apache.org/jira/browse/OMID-146

It is a real problem: if the low watermark is updated before the commit timestamp reaches the commit table, a compaction may treat some of a transaction's cells as committed and keep them, while treating others as uncommitted and discarding them.

Maybe a trivial solution would be to fail transactions that cause entries to be removed from the commit cache if the removed entry is higher than the transaction's start timestamp. I am still not sure that will work, because the TSO is multithreaded and another transaction might cause the compaction before the commit reaches the table.

Another solution would be to persist the new low watermark only after the commit reaches the commit table, but I'm not sure how this would work in low-latency mode (where the clients write to the commit table).

> High CPU usage during compactions.
> ----------------------------------
>
>                 Key: OMID-145
>                 URL: https://issues.apache.org/jira/browse/OMID-145
>             Project: Apache Omid
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Priority: Major
>             Fix For: 1.0.1
>
>         Attachments: compactionTiming.png, omid-145.patch
>
>
> See attached image, almost all (96%!!) of the compaction time is spent in
> org.apache.hadoop.hbase.regionserver.CompactorScanner.queryCommitTimestamp()
> I guess that's when the shadowCell is not yet present.
> We already have problems with long compactions in HBase, prolonging these
> potentially by 25x (all the rest of the compaction logic took only 4% of the
> time), would not be a pleasant idea.
> Perhaps we can do that same caching we do with the commit cache during
> regular scanning...?
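
To make the caching idea at the end of the issue description concrete, here is a minimal sketch of memoizing commit-timestamp lookups by start timestamp, so repeated cells from the same transaction do not each trigger a commit table query. This is not the attached omid-145.patch and not Omid's actual code: the class name CommitTimestampCache, the lookup function, and the capacity parameter are all hypothetical. Only committed results are cached, on the assumption that a transaction seen as uncommitted may still commit later. The sketch does not address the low watermark race discussed in the comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.LongFunction;

/** Hypothetical helper (not part of Omid): bounded LRU memoization of commit timestamps. */
final class CommitTimestampCache {
    // Stand-in for the expensive lookup (e.g. a commit table query); an assumption.
    private final LongFunction<Optional<Long>> lookup;
    // Maps start timestamp -> commit timestamp, in access order for LRU eviction.
    private final Map<Long, Long> cache;
    private int misses = 0;

    CommitTimestampCache(int capacity, LongFunction<Optional<Long>> lookup) {
        this.lookup = lookup;
        this.cache = new LinkedHashMap<Long, Long>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, Long> eldest) {
                // Evict the least recently used entry once the bound is exceeded.
                return size() > capacity;
            }
        };
    }

    /** Returns the commit timestamp for a start timestamp, or empty if not (yet) committed. */
    Optional<Long> commitTimestamp(long startTimestamp) {
        Long cached = cache.get(startTimestamp);
        if (cached != null) {
            return Optional.of(cached);
        }
        misses++;
        Optional<Long> result = lookup.apply(startTimestamp);
        // Cache only committed results; an uncommitted transaction may commit later.
        result.ifPresent(commitTs -> cache.put(startTimestamp, commitTs));
        return result;
    }

    /** Number of calls that had to fall through to the underlying lookup. */
    int misses() {
        return misses;
    }
}
```

With such a helper, a compaction scanner would consult the cache first and fall through to the slow path only on a miss. Whether this is safe depends on how the low watermark is handled, which is the open question in the comment above.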