saintstack commented on a change in pull request #1825:
URL: https://github.com/apache/hbase/pull/1825#discussion_r434763752

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
##########
@@ -285,23 +286,35 @@ boolean splitLogFile(FileStatus logfile, CancelableProgressable reporter) throws
         String encodedRegionNameAsStr = Bytes.toString(region);
         lastFlushedSequenceId = lastFlushedSequenceIds.get(encodedRegionNameAsStr);
         if (lastFlushedSequenceId == null) {
-          if (sequenceIdChecker != null) {
-            RegionStoreSequenceIds ids = sequenceIdChecker.getLastSequenceId(region);
-            Map<byte[], Long> maxSeqIdInStores = new TreeMap<>(Bytes.BYTES_COMPARATOR);
-            for (StoreSequenceId storeSeqId : ids.getStoreSequenceIdList()) {
-              maxSeqIdInStores.put(storeSeqId.getFamilyName().toByteArray(),
-                storeSeqId.getSequenceId());
+          if (!(isRegionDirPresentUnderRoot(entry.getKey().getTableName(), encodedRegionNameAsStr))) {
+            // The region directory itself is not present in the FS. This indicates that
+            // the region/table is already removed. We can just skip all the edits for this
+            // region. Setting lastFlushedSequenceId as Long.MAX_VALUE so that all edits
+            // will get skipped by the seqId check below.
+            // See more details at https://issues.apache.org/jira/browse/HBASE-24189
+            LOG.debug(
+              "Region {} seems not available in the FS. Just skipping all edits for this region",

Review comment:
   Don't say 'seems'. Log this at INFO level, I think. You don't need the word 'region' in the log; it is plain from context. Something like: "{} is not on the FS; skipping all edits."

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
##########
@@ -285,23 +286,35 @@ boolean splitLogFile(FileStatus logfile, CancelableProgressable reporter) throws
         String encodedRegionNameAsStr = Bytes.toString(region);
         lastFlushedSequenceId = lastFlushedSequenceIds.get(encodedRegionNameAsStr);
         if (lastFlushedSequenceId == null) {
-          if (sequenceIdChecker != null) {
-            RegionStoreSequenceIds ids = sequenceIdChecker.getLastSequenceId(region);
-            Map<byte[], Long> maxSeqIdInStores = new TreeMap<>(Bytes.BYTES_COMPARATOR);
-            for (StoreSequenceId storeSeqId : ids.getStoreSequenceIdList()) {
-              maxSeqIdInStores.put(storeSeqId.getFamilyName().toByteArray(),
-                storeSeqId.getSequenceId());
+          if (!(isRegionDirPresentUnderRoot(entry.getKey().getTableName(), encodedRegionNameAsStr))) {
+            // The region directory itself is not present in the FS. This indicates that
+            // the region/table is already removed. We can just skip all the edits for this
+            // region. Setting lastFlushedSequenceId as Long.MAX_VALUE so that all edits
+            // will get skipped by the seqId check below.
+            // See more details at https://issues.apache.org/jira/browse/HBASE-24189
+            LOG.debug(
+              "Region {} seems not available in the FS. Just skipping all edits for this region",
+              encodedRegionNameAsStr);
+            lastFlushedSequenceId = Long.MAX_VALUE;

Review comment:
   If the Region no longer exists, should we even be adding state for it in here, other than perhaps local state to save having to do expensive lookups again? The comment here exposes some of the soft logic this change depends upon: we look at the FS to see whether a dir is present or not, and from that we suppose the Region is present or not. I think we should be asking the Master. If it is racy around split/merge, then it's a bug, given that Master transitions are meant to be locked down, not racy. What do you think, Anoop?
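
For readers following the thread, here is a minimal, self-contained sketch of the mechanism the diff comment describes: a per-region cache of the last flushed sequence id, where a missing region directory is recorded as Long.MAX_VALUE so that every later edit for that region is skipped without another lookup. This is not the WALSplitter code. The class name SkipEditsSketch, the in-memory set standing in for the filesystem check, the lookup counter, and the assumption that the skip check has the form "lastFlushed >= editSeqId" are all hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for the logic under review; not the actual WALSplitter code.
public class SkipEditsSketch {
  // Local cache: encoded region name -> last flushed sequence id.
  private final Map<String, Long> lastFlushedSequenceIds = new HashMap<>();
  // Assumption: this set stands in for "does the region directory exist on the FS?".
  private final Set<String> regionDirsOnFs;
  // Counts how often the (expensive, in real life) existence check runs.
  private int fsLookups = 0;

  public SkipEditsSketch(Set<String> regionDirsOnFs) {
    this.regionDirsOnFs = regionDirsOnFs;
  }

  // Returns true when the edit with the given sequence id should be skipped.
  public boolean shouldSkip(String encodedRegionName, long editSeqId) {
    Long lastFlushed = lastFlushedSequenceIds.get(encodedRegionName);
    if (lastFlushed == null) {
      fsLookups++;
      // Missing directory: store a sentinel so every edit compares as already flushed.
      // Present directory: -1 means "nothing known to be flushed" in this sketch
      // (the real code consults a sequence id checker instead).
      lastFlushed = regionDirsOnFs.contains(encodedRegionName) ? -1L : Long.MAX_VALUE;
      lastFlushedSequenceIds.put(encodedRegionName, lastFlushed);
    }
    // Assumed form of the sequence id check that runs after the lookup.
    return lastFlushed >= editSeqId;
  }

  public int getFsLookups() {
    return fsLookups;
  }

  public static void main(String[] args) {
    SkipEditsSketch sketch = new SkipEditsSketch(Set.of("abc123"));
    System.out.println(sketch.shouldSkip("abc123", 5L));  // region dir exists
    System.out.println(sketch.shouldSkip("gone456", 5L)); // region dir missing
    System.out.println(sketch.shouldSkip("gone456", 6L)); // served from the cache
    System.out.println(sketch.getFsLookups());
  }
}
```

The sketch also shows the trade-off raised in the review: the sentinel lives in the same map as real sequence ids, so "region is gone" and "region flushed everything" become indistinguishable once cached.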

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
##########
@@ -285,23 +286,35 @@ boolean splitLogFile(FileStatus logfile, CancelableProgressable reporter) throws
         String encodedRegionNameAsStr = Bytes.toString(region);
         lastFlushedSequenceId = lastFlushedSequenceIds.get(encodedRegionNameAsStr);
         if (lastFlushedSequenceId == null) {
-          if (sequenceIdChecker != null) {
-            RegionStoreSequenceIds ids = sequenceIdChecker.getLastSequenceId(region);
-            Map<byte[], Long> maxSeqIdInStores = new TreeMap<>(Bytes.BYTES_COMPARATOR);
-            for (StoreSequenceId storeSeqId : ids.getStoreSequenceIdList()) {
-              maxSeqIdInStores.put(storeSeqId.getFamilyName().toByteArray(),
-                storeSeqId.getSequenceId());
+          if (!(isRegionDirPresentUnderRoot(entry.getKey().getTableName(), encodedRegionNameAsStr))) {
+            // The region directory itself is not present in the FS. This indicates that
+            // the region/table is already removed. We can just skip all the edits for this
+            // region. Setting lastFlushedSequenceId as Long.MAX_VALUE so that all edits
+            // will get skipped by the seqId check below.
+            // See more details at https://issues.apache.org/jira/browse/HBASE-24189
+            LOG.debug(
+              "Region {} seems not available in the FS. Just skipping all edits for this region",
+              encodedRegionNameAsStr);
+            lastFlushedSequenceId = Long.MAX_VALUE;

Review comment:
   But... ouch, ouch, ouch. This stuff runs on the RegionServer? You can't ask the Master. Dang. So we are splitting arbitrary WALs.