[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443947#comment-16443947 ] Hudson commented on HBASE-18059: Results for branch master [build #304 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/304/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/304//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/304//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/304//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > The scanner order for memstore scanners are wrong > - > > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners >Affects Versions: 2.0.0 >Reporter: Duo Zhang >Assignee: Jingyun Tian >Priority: Critical > Fix For: 3.0.0, 2.1.0 > > Attachments: HBASE-18059.master.001.patch > > > This is comments for KeyValueScanner.getScannerOrder > {code:title=KeyValueScanner.java} > /** >* Get the order of this KeyValueScanner. This is only relevant for > StoreFileScanners and >* MemStoreScanners (other scanners simply return 0). This is required for > comparing multiple >* files to find out which one has the latest data. StoreFileScanners are > ordered from 0 >* (oldest) to newest in increasing order. MemStoreScanner gets LONG.max > since it always >* contains freshest data. 
>*/ > long getScannerOrder(); > {code} > As now we may have multiple memstore scanners, I think the right way to > select scanner order for memstore scanner is to ordered from Long.MAX_VALUE > in decreasing order. > But in CompactingMemStore and DefaultMemStore, the scanner order for memstore > scanner is also start from 0, which will be messed up with StoreFileScanners. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
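The ordering proposed in the description can be sketched as follows. This is only an illustration, not HBase code: the class `ScannerOrderAssignment` and both method names are hypothetical, and the sketch assumes that store file scanners count up from 0 while memstore scanners count down from Long.MAX_VALUE, so the two ranges cannot meet for any realistic number of scanners.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; this is not HBase code.
final class ScannerOrderAssignment {

  // Store file scanners: 0 for the oldest file, increasing towards the newest.
  static List<Long> storeFileOrders(int storeFileCount) {
    List<Long> orders = new ArrayList<>();
    for (long i = 0; i < storeFileCount; i++) {
      orders.add(i);
    }
    return orders;
  }

  // Memstore scanners: the newest segment gets Long.MAX_VALUE, older ones count down.
  static List<Long> memstoreOrders(int segmentCount) {
    List<Long> orders = new ArrayList<>();
    for (long i = 0; i < segmentCount; i++) {
      orders.add(Long.MAX_VALUE - i);
    }
    return orders;
  }

  public static void main(String[] args) {
    System.out.println("store files: " + storeFileOrders(3));
    System.out.println("memstore:    " + memstoreOrders(2));
  }
}
```

With this scheme a memstore scanner always carries a higher order than any store file scanner, which matches the intent stated in the getScannerOrder javadoc.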
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443798#comment-16443798 ] Hudson commented on HBASE-18059: Results for branch branch-2 [build #632 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/632/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/632//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/632//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/632//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details.
> The scanner order for memstore scanners are wrong > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners > Affects Versions: 2.0.0 > Reporter: Duo Zhang > Assignee: Jingyun Tian > Priority: Critical > Fix For: 3.0.0, 2.1.0 > Attachments: HBASE-18059.master.001.patch
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443493#comment-16443493 ] Jingyun Tian commented on HBASE-18059: -- [~stack] sure. Thx for your review.
> The scanner order for memstore scanners are wrong > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners > Affects Versions: 2.0.0 > Reporter: Duo Zhang > Assignee: Jingyun Tian > Priority: Critical > Fix For: 3.0.0, 2.1.0 > Attachments: HBASE-18059.master.001.patch
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443481#comment-16443481 ] stack commented on HBASE-18059: --- Thank you [~tianjingyun]. On review of what is here, and following your and @appy's commentary, this makes sense. Thanks for digging in and for the nice patch cleaning out order where it is not needed. +1 for branch-2+. Not for branch-2.0. I'm being conservative now that 2.0.0 is in RC. Let me commit this now.
> The scanner order for memstore scanners are wrong > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners > Affects Versions: 2.0.0 > Reporter: Duo Zhang > Assignee: Jingyun Tian > Priority: Critical > Fix For: 2.0.0 > Attachments: HBASE-18059.master.001.patch
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443467#comment-16443467 ] Jingyun Tian commented on HBASE-18059: -- [~stack] The scanner order is only used when the kvComparator cannot determine which of 2 cells is bigger, which means the 2 cells have the same key and the same seqId. But in the memstore every cell has its own seqId, so the comparison never reaches the scanner order. [~appy] explained this very clearly:
{quote}
- memstore scanners vs storefile(SF) scanner:
-- if cell in SF has seqId: Its seqId should be less than that of memstore's cell. Memstore scanner will win.
-- if cell in SF does NOT have seqId: SF's cell [defaults to seqId=0|https://github.com/apache/hbase/blob/e65d8653e566bbbae03578a1f9ad858cabcb48bc/hbase-common/src/main/java/org/apache/hadoop/hbase/Cell.java#L142], memstore scanner will win. (Note that seqId of cells in hfiles are removed on major compaction if older than certain time, default is 5 days)
- memstore vs bulk loaded file(BLF)
-- memstore's cells will have higher seqId, so memstore scanner will win.
{quote}
> The scanner order for memstore scanners are wrong > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners > Affects Versions: 2.0.0 > Reporter: Duo Zhang > Assignee: Jingyun Tian > Priority: Critical > Fix For: 2.0.0 > Attachments: HBASE-18059.master.001.patch
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443254#comment-16443254 ] stack commented on HBASE-18059: --- [~appy] Any opinion sir? [~tianjingyun] Why useless sir? Thanks. (And all tests passed here too!)
> The scanner order for memstore scanners are wrong > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners > Affects Versions: 2.0.0 > Reporter: Duo Zhang > Assignee: Jingyun Tian > Priority: Critical > Fix For: 2.0.0 > Attachments: HBASE-18059.master.001.patch
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441950#comment-16441950 ] Jingyun Tian commented on HBASE-18059: -- [~Apache9] pls check this out.
> The scanner order for memstore scanners are wrong > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners > Affects Versions: 2.0.0 > Reporter: Duo Zhang > Assignee: Jingyun Tian > Priority: Critical > Fix For: 2.0.0 > Attachments: HBASE-18059.master.001.patch
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441939#comment-16441939 ] Hadoop QA commented on HBASE-18059: ---
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 53s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 13s{color} | {color:green} hbase-server: The patch generated 0 new + 61 unchanged - 1 fixed = 61 total (was 62) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 52s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 14m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}106m 3s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}151m 7s{color} | {color:black} {color} |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-18059 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12919526/HBASE-18059.master.001.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux fcbb919c4db0 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / f4f2b68238 |
| maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12516/testReport/ |
| Max. process+thread count | 4158 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output |
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441839#comment-16441839 ] Jingyun Tian commented on HBASE-18059: -- IMO the scanner order for memstore is actually useless. I removed all the scanner order code from the memstore-related classes. This patch passes the tests on my local PC.
> The scanner order for memstore scanners are wrong > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners > Affects Versions: 2.0.0 > Reporter: Duo Zhang > Assignee: Jingyun Tian > Priority: Critical > Fix For: 2.0.0 > Attachments: HBASE-18059.master.001.patch
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310674#comment-16310674 ] Jingyun Tian commented on HBASE-18059: -- As [~Appy] said before, the only situation in which the order is determined by getScannerOrder() is SF vs SF when both cells have no seqId. For SegmentScanner vs SegmentScanner, all cells in the memstore have a seqId, so it is impossible to reach getScannerOrder(). I think it's better to modify the comment of getScannerOrder() to:
{code}
/**
 * Get the order of this KeyValueScanner. This is only relevant for StoreFileScanners.
 * This is required for comparing multiple files to find out which one has the latest
 * data. StoreFileScanners are ordered from 0 (oldest) to newest in increasing order.
 */
{code}
And I think for CompactingMemStore and DefaultMemStore:
{code}
long order = 1 + pipelineList.size() + snapshotList.size();
{code}
Although this part has no effect in practice, counting the order down from Long.MAX_VALUE makes more sense. [~Apache9]
> The scanner order for memstore scanners are wrong > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners > Affects Versions: 2.0.0 > Reporter: Duo Zhang > Assignee: Jingyun Tian > Priority: Critical > Fix For: 2.0.0
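To see why a small starting value is confusing next to store file orders, here is a rough sketch. It is not the HBase implementation: `OrderCollisionSketch` and its methods are hypothetical, and it assumes (without checking the memstore classes) that memstore scanners receive orders counting down from the starting value quoted in the comment above, while store file scanners use 0 to n-1.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; the counting-down behaviour is an assumption.
final class OrderCollisionSketch {

  // Orders handed to memstore scanners, starting from the quoted formula.
  static Set<Long> memstoreOrders(int pipelineSize, int snapshotSize) {
    long order = 1 + pipelineSize + snapshotSize;
    Set<Long> orders = new TreeSet<>();
    for (; order > 0; order--) {
      orders.add(order);
    }
    return orders;
  }

  // Orders of store file scanners: 0 (oldest) up to count - 1 (newest).
  static Set<Long> storeFileOrders(int count) {
    Set<Long> orders = new TreeSet<>();
    for (long i = 0; i < count; i++) {
      orders.add(i);
    }
    return orders;
  }

  // Values used by both kinds of scanner at the same time.
  static Set<Long> overlap(Set<Long> a, Set<Long> b) {
    Set<Long> common = new TreeSet<>(a);
    common.retainAll(b);
    return common;
  }

  public static void main(String[] args) {
    System.out.println(overlap(memstoreOrders(2, 1), storeFileOrders(3)));
  }
}
```

Under these assumptions both scanner kinds share some order values. As the thread argues, this is harmless as long as memstore cells always win on seqId, but it contradicts the documented contract.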
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16037151#comment-16037151 ] Appy commented on HBASE-18059: -- Looking at things more closely after [~tianjingyun]'s comment. HBASE-15236 is a year old, so I had to dig deep again. If the following invariant always holds, then you're right. [~stack] can you please confirm, since I don't know much about seqId. "Seq id of cells in memstore will always be greater than that of any store file or bulk loaded file."

Detailed analysis: KVHeap uses [KVScannerComparator|https://github.com/apache/hbase/blob/e65d8653e566bbbae03578a1f9ad858cabcb48bc/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java#L191]. It first compares the cells from two scanners using a CellComparator. Only when the comparison is equal does it use scanner.getScannerOrder(). Thinking about the case when two scanners in a KeyValueHeap might return the same key, KVScannerComparator will resolve ties as follows:
- memstore scanners vs storefile(SF) scanner:
-- if cell in SF has seqId: Its seqId should be less than that of memstore's cell. Memstore scanner will win.
-- if cell in SF does NOT have seqId: SF's cell [defaults to seqId=0|https://github.com/apache/hbase/blob/e65d8653e566bbbae03578a1f9ad858cabcb48bc/hbase-common/src/main/java/org/apache/hadoop/hbase/Cell.java#L142], memstore scanner will win. (Note that seqId of cells in hfiles are removed on major compaction if older than certain time, default is 5 days)
- memstore vs bulk loaded file(BLF)
-- memstore's cells will have higher seqId, so memstore scanner will win.
- SF vs SF -> Either cells' seqId will be used to tie break, or getScannerOrder() (determined by store files' seq ids).
- SF vs BLF
-- if both cells have seqId -> one with higher seqId wins.
-- BLF's cell has seqId and SF's cell doesn't -> BLF scanner wins since SF's cell's seqId defaults to 0. (This will happen when the bulk load file is new but the SF is the result of a major compaction where cells' seqIds were erased.)
-- SF's cell has seqId but BLF's cell doesn't -> can't happen. BLF always has seqId.
- BLF vs BLF
-- Cell's seqId will be used to tie break.
> The scanner order for memstore scanners are wrong > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners > Affects Versions: 2.0.0 > Reporter: Duo Zhang > Assignee: Jingyun Tian > Priority: Critical > Fix For: 2.0.0
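The tie-breaking rules in the analysis above can be modelled roughly as below. This is a simplified sketch under stated assumptions, not the real KeyValueHeap code: `TieBreakSketch`, `Cell`, `Scanner` and `winner` are hypothetical stand-ins, a cell is reduced to a key plus a seqId (0 when absent), and the sketch assumes that a higher seqId and a higher scanner order both sort first.

```java
import java.util.Comparator;

// Hypothetical model for illustration only; not the HBase classes.
final class TieBreakSketch {

  // Minimal stand-in for a cell: a key and a sequence id (0 when stripped or absent).
  static final class Cell {
    final String key;
    final long seqId;

    Cell(String key, long seqId) {
      this.key = key;
      this.seqId = seqId;
    }
  }

  // Minimal stand-in for a scanner positioned on one cell.
  static final class Scanner {
    final String name;
    final Cell current;
    final long order;

    Scanner(String name, Cell current, long order) {
      this.name = name;
      this.current = current;
      this.order = order;
    }
  }

  // Key ascending; for equal keys the cell with the higher seqId sorts first.
  static final Comparator<Cell> CELL_COMPARATOR = (a, b) -> {
    int c = a.key.compareTo(b.key);
    return c != 0 ? c : Long.compare(b.seqId, a.seqId);
  };

  // Cells decide first; the scanner order is consulted only on a complete tie.
  static final Comparator<Scanner> SCANNER_COMPARATOR = (l, r) -> {
    int c = CELL_COMPARATOR.compare(l.current, r.current);
    return c != 0 ? c : Long.compare(r.order, l.order);
  };

  // Name of the scanner whose cell would be returned first.
  static String winner(Scanner a, Scanner b) {
    return SCANNER_COMPARATOR.compare(a, b) <= 0 ? a.name : b.name;
  }

  public static void main(String[] args) {
    Scanner memstore = new Scanner("memstore", new Cell("row1", 42L), 0L);
    Scanner compacted = new Scanner("compacted-sf", new Cell("row1", 0L), 7L);
    System.out.println(winner(memstore, compacted));
  }
}
```

In this model a memstore scanner wins on seqId even when its scanner order is lower than a store file's, and the order only matters for two store files whose cells both have seqId 0.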
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027162#comment-16027162 ] stack commented on HBASE-18059: --- [~tianjingyun] [~appy] said he'd be by here. I'd value his opinion above mine. It might be a while because he is in exotic locations currently (He is back Weds..).
> The scanner order for memstore scanners are wrong > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners > Affects Versions: 2.0.0 > Reporter: Duo Zhang > Assignee: Jingyun Tian > Priority: Critical > Fix For: 2.0.0
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027141#comment-16027141 ] Jingyun Tian commented on HBASE-18059: -- [~stack] sir, do you think the modification is necessary? I think the only situation that would cause a problem is when an hfile from bulk import has the same sequence ID as cells in the memstore.
> The scanner order for memstore scanners are wrong > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners > Affects Versions: 2.0.0 > Reporter: Duo Zhang > Assignee: Jingyun Tian > Priority: Critical > Fix For: 2.0.0
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019453#comment-16019453 ] Jingyun Tian commented on HBASE-18059: -- [~appy] since every cell in the memstore has a sequence ID, I don't think a memstore scanner sharing a scanner order with a storefile scanner can actually produce a wrong ordering. The CellComparator considers the sequence ID before the scanner order, so I think the scanner order comparison is never reached. In addition, is there a situation where data from bulk import could have the same sequence ID as cells in the memstore?
> The scanner order for memstore scanners are wrong > Key: HBASE-18059 > URL: https://issues.apache.org/jira/browse/HBASE-18059 > Project: HBase > Issue Type: Bug > Components: regionserver, scan, Scanners > Affects Versions: 2.0.0 > Reporter: Duo Zhang > Assignee: Jingyun Tian > Fix For: 2.0.0
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013950#comment-16013950 ]

Jingyun Tian commented on HBASE-18059:
--------------------------------------

I will try to fix this and add a UT for it.
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013379#comment-16013379 ]

Duo Zhang commented on HBASE-18059:
-----------------------------------

Any volunteers? I'm busy with other issues right now. The fix is trivial, but I think we'd better add a UT to confirm that we do not make this mistake again.
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013372#comment-16013372 ]

Appy commented on HBASE-18059:
------------------------------

Yeah, it's wrong then. I think your suggestion of counting down from Long.MAX_VALUE makes sense.
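As an illustration of the count-down suggestion, the following sketch assigns orders to memstore scanners starting at Long.MAX_VALUE. It is only a sketch under assumptions: `MemstoreOrderSketch` and `countDownOrders` are hypothetical names, and treating index 0 as the newest segment is an assumption, not something stated in the thread.

```java
// Toy model only: hypothetical helper, not code from the HBase patch.
final class MemstoreOrderSketch {

  /**
   * Returns one order per memstore scanner, counting down from
   * Long.MAX_VALUE. Index 0 is assumed to be the newest segment.
   */
  static long[] countDownOrders(int scannerCount) {
    long[] orders = new long[scannerCount];
    for (int i = 0; i < scannerCount; i++) {
      orders[i] = Long.MAX_VALUE - i;
    }
    return orders;
  }

  public static void main(String[] args) {
    long[] orders = countDownOrders(3);
    // Every memstore order stays far above any store file order that
    // starts at 0 and increases by one per file.
    for (long order : orders) {
      System.out.println(Long.MAX_VALUE - order); // distance from Long.MAX_VALUE
    }
  }
}
```

With this scheme the memstore orders cannot collide with store file orders counted up from 0 for any realistic number of files.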
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013349#comment-16013349 ]

Duo Zhang commented on HBASE-18059:
-----------------------------------

They just assign scanner order from {{1 + pipelineList.size() + snapshotList.size()}} down to 0...
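The overlap described above can be checked with a small toy model. Everything here is an assumption for illustration: `OrderOverlapSketch` and its methods are hypothetical, memstore orders are modelled as starting at `1 + pipelineSize + snapshotSize` and decreasing by one per scanner (one active segment plus the pipeline and snapshot segments), and store file orders are modelled as 0 upward as in the javadoc.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Toy model only: hypothetical helper, not code from CompactingMemStore.
final class OrderOverlapSketch {

  /** Orders handed to memstore scanners under the assumed count-down scheme. */
  static Set<Long> memstoreOrders(int pipelineSize, int snapshotSize) {
    Set<Long> orders = new HashSet<>();
    long order = 1 + pipelineSize + snapshotSize;
    int scanners = 1 + pipelineSize + snapshotSize; // active + pipeline + snapshot (assumed)
    for (int i = 0; i < scanners; i++) {
      orders.add(order--);
    }
    return orders;
  }

  /** Orders handed to store file scanners: 0 (oldest) upward. */
  static Set<Long> storeFileOrders(int storeFileCount) {
    Set<Long> orders = new HashSet<>();
    for (long i = 0; i < storeFileCount; i++) {
      orders.add(i);
    }
    return orders;
  }

  /** True when at least one memstore order equals a store file order. */
  static boolean overlaps(int pipelineSize, int snapshotSize, int storeFileCount) {
    return !Collections.disjoint(
        memstoreOrders(pipelineSize, snapshotSize), storeFileOrders(storeFileCount));
  }

  public static void main(String[] args) {
    System.out.println(overlaps(2, 1, 3)); // memstore 4,3,2,1 vs files 0,1,2
    System.out.println(overlaps(2, 1, 1)); // memstore 4,3,2,1 vs file 0
  }
}
```

In this model the two ranges share values as soon as there is more than one store file, which is why small memstore orders cannot reliably mark the memstore as holding the freshest data.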
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013103#comment-16013103 ]

Appy commented on HBASE-18059:
------------------------------

Yes sir. It was a part of HBASE-15236.

bq. But in CompactingMemStore and DefaultMemStore, the scanner order for memstore scanner is also start from 0, which will be messed up with StoreFileScanners.

Yes, if memstore scanners start from 0, it'll be a bug. But I see the following in CompactingMemStore:
{noformat}
long order = 1 + pipelineList.size() + snapshotList.size();
{noformat}
I didn't dig around enough to check whether this ensures that the lowest number assigned to a memstore scanner is greater than the number of store files.
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012648#comment-16012648 ]

stack commented on HBASE-18059:
-------------------------------

[~appy] Is this your comment, sir?
[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong
[ https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012430#comment-16012430 ]

Duo Zhang commented on HBASE-18059:
-----------------------------------

[~anastas] [~stack] FYI.