[ https://issues.apache.org/jira/browse/HBASE-26384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chenglei updated HBASE-26384: ----------------------------- Affects Version/s: (was: 3.0.0-alpha-1) 2.4.8 > Segment which already flushed to hfile may still be remained in > CompactingMemStore > ------------------------------------------------------------------------------------ > > Key: HBASE-26384 > URL: https://issues.apache.org/jira/browse/HBASE-26384 > Project: HBase > Issue Type: Bug > Components: in-memory-compaction > Affects Versions: 2.4.8 > Reporter: chenglei > Priority: Major > > When {{CompactingMemStore}} prepares flushing, > {{CompactingMemStore.snapshot}} invokes following > {{CompactingMemStore.pushPipelineToSnapshot}} method to get {{Snapshot}}, > following line 570 and line 575 uses {{CompactionPipeline#version}} to track > whether the Segments in {{CompactionPipeline#pipeline}} has changed since it > gets {{VersionedSegmentsList}} in line 570 before emptying > {{CompactionPipeline#pipeline}} in line 575. > {code:java} > 565 private void pushPipelineToSnapshot() { > 566 int iterationsCnt = 0; > 567 boolean done = false; > 568 while (!done) { > 569 iterationsCnt++; > 570 VersionedSegmentsList segments = > pipeline.getVersionedList(); > 571 pushToSnapshot(segments.getStoreSegments()); > 572 // swap can return false in case the pipeline was updated > by ongoing compaction > 573 // and the version increase, the chance of it happenning is > very low > 574 // In Swap: don't close segments (they are in snapshot now) > and don't update the region size > 575 done = pipeline.swap(segments, null, false, false); > ....... > } > {code} > However, when {{CompactingMemStore#inMemoryCompaction}} executes > {{CompactionPipeline#flattenOneSegment}}, it does not change > {{CompactionPipeline#version}} , if there is an {{in memeory compaction}} > which executes {{CompactingMemStore#flattenOneSegment}} between above line > 570 and line 575, the {{CompactionPipeline#version}} not change, but the > {{Segment}} in {{CompactionPipeline}} has changed. Because > {{CompactionPipeline#version}} not change, {{pipeline.swap}} in above line > 575 could think it is safe to invoke following > {{CompactionPipeline#swapSuffix}} method to remove {{Segment}} in > {{CompactionPipeline}} , but the {{Segment}} in {{CompactionPipeline}} has > changed because of {{CompactingMemStore#flattenOneSegment}} , so the > {{Segment}} not removed in following line 295 and still remaining in > {{CompactionPipeline}}. > {code:java} > 293 private void swapSuffix(List<? extends Segment> suffix, > ImmutableSegment segment, > 294 boolean closeSegmentsInSuffix) { > 295 pipeline.removeAll(suffix); > 296 if(segment != null) pipeline.addLast(segment); > .... > {code} > However {{CompactingMemStore.snapshot}} think it is successful and continues > to flush the {{Segment}} got by {{CompactingMemStore.snapshot}} as normal, > but the {{Segment}} with the same cells still be left in > {{CompactingMemStore}}. Leaving {{Segment}} which already flushed in > {{MemStore}} is dangerous: if a Major Compaction before the left {{Segment}} > flushing, there may be data erroneous. > My Fix in the PR is as following: > # Increasing the {{CompactionPipeline#version}} in > {{CompactingMemStore#flattenOneSegment}} . > branch-2 has this problem but master is right, it is because the branch-2 > patch for HBASE-18375 omitting this. > # For {{CompactionPipeline#swapSuffix}} , explicitly checking that the > {{Segment}} in {{suffix}} input parameter is same as the {{Segment}} in > {{pipeline}} one by one from > the last element to the first element of {{suffix}} . > -- This message was sent by Atlassian Jira (v8.3.4#803005)