[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2018-04-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443947#comment-16443947
 ] 

Hudson commented on HBASE-18059:


Results for branch master
[build #304 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/304/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/304//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/304//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/304//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> The scanner order for memstore scanners are wrong
> -
>
> Key: HBASE-18059
> URL: https://issues.apache.org/jira/browse/HBASE-18059
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, scan, Scanners
>Affects Versions: 2.0.0
>Reporter: Duo Zhang
>Assignee: Jingyun Tian
>Priority: Critical
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-18059.master.001.patch
>
>
> This is the comment for KeyValueScanner.getScannerOrder:
> {code:title=KeyValueScanner.java}
>   /**
>    * Get the order of this KeyValueScanner. This is only relevant for StoreFileScanners and
>    * MemStoreScanners (other scanners simply return 0). This is required for comparing multiple
>    * files to find out which one has the latest data. StoreFileScanners are ordered from 0
>    * (oldest) to newest in increasing order. MemStoreScanner gets LONG.max since it always
>    * contains freshest data.
>    */
>   long getScannerOrder();
> {code}
> Now that we may have multiple memstore scanners, I think the right way to 
> select the scanner order for memstore scanners is to order them from 
> Long.MAX_VALUE in decreasing order.
> But in CompactingMemStore and DefaultMemStore, the scanner order for memstore 
> scanners also starts from 0, which will get mixed up with StoreFileScanners.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2018-04-19 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443798#comment-16443798
 ] 

Hudson commented on HBASE-18059:


Results for branch branch-2
[build #632 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/632/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/632//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/632//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/632//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.




[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2018-04-18 Thread Jingyun Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443493#comment-16443493
 ] 

Jingyun Tian commented on HBASE-18059:
--

[~stack] sure. Thx for your review.



[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2018-04-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443481#comment-16443481
 ] 

stack commented on HBASE-18059:
---

Thank you [~tianjingyun]. On review of what is here, and following your and 
@appy's commentary, this makes sense. Thanks for digging in and for the nice 
patch cleaning out the order where it is not needed. +1 for branch-2+. Not for 
branch-2.0; I'm being conservative now that 2.0.0 is in RC. Let me commit this now.

> The scanner order for memstore scanners are wrong
> -
>
> Key: HBASE-18059
> URL: https://issues.apache.org/jira/browse/HBASE-18059
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, scan, Scanners
>Affects Versions: 2.0.0
>Reporter: Duo Zhang
>Assignee: Jingyun Tian
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-18059.master.001.patch


[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2018-04-18 Thread Jingyun Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443467#comment-16443467
 ] 

Jingyun Tian commented on HBASE-18059:
--

[~stack] Because the scanner order is used only when the kvComparator cannot 
determine which of two cells is bigger, which means the two cells have the same 
key and the same seqId. But in the memstore every cell has its own seqId, so 
the comparison never reaches the scanner order. [~appy] explained this very clearly:
{quote} - memstore scanners vs storefile (SF) scanner:
 -- if the cell in the SF has a seqId: its seqId should be less than that of the 
memstore's cell. Memstore scanner will win.
 -- if the cell in the SF does NOT have a seqId: the SF's cell [defaults to 
seqId=0|https://github.com/apache/hbase/blob/e65d8653e566bbbae03578a1f9ad858cabcb48bc/hbase-common/src/main/java/org/apache/hadoop/hbase/Cell.java#L142],
 so the memstore scanner will win. (Note that the seqIds of cells in hfiles are 
removed on major compaction if older than a certain time; the default is 5 days.)
 - memstore vs bulk loaded file (BLF)
 -- memstore cells will have a higher seqId, so the memstore scanner will win.{quote}



[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2018-04-18 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443254#comment-16443254
 ] 

stack commented on HBASE-18059:
---

[~appy] Any opinion sir?

[~tianjingyun] Why useless sir? Thanks. (And all tests passed here too!)



[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2018-04-18 Thread Jingyun Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441950#comment-16441950
 ] 

Jingyun Tian commented on HBASE-18059:
--

[~Apache9] pls check this out.



[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2018-04-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441939#comment-16441939
 ] 

Hadoop QA commented on HBASE-18059:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 53s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 28s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 13s{color} | {color:green} hbase-server: The patch generated 0 new + 61 unchanged - 1 fixed = 61 total (was 62) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 52s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 14m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}106m  3s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}151m  7s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-18059 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12919526/HBASE-18059.master.001.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux fcbb919c4db0 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / f4f2b68238 |
| maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC3 |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12516/testReport/ |
| Max. process+thread count | 4158 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2018-04-17 Thread Jingyun Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441839#comment-16441839
 ] 

Jingyun Tian commented on HBASE-18059:
--

IMO the scanner order for the memstore is actually useless. I removed all 
scanner-order code from the memstore-related classes. This patch passed the 
tests on my local PC.



[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2018-01-03 Thread Jingyun Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310674#comment-16310674
 ] 

Jingyun Tian commented on HBASE-18059:
--

As [~Appy] said before, the only situation in which the order is determined by 
getScannerOrder() is SF vs SF when neither cell has a seqId.
For SegmentScanner vs SegmentScanner, since all cells in the memstore have a 
seqId, it is impossible to reach getScannerOrder(). 

I think it's better to change the comment of getScannerOrder() to:
{code}
  /**
   * Get the order of this KeyValueScanner. This is only relevant for StoreFileScanners.
   * This is required for comparing multiple files to find out which one has the latest
   * data. StoreFileScanners are ordered from 0 (oldest) to newest in increasing order.
   */
{code}

And I think for CompactingMemStore and DefaultMemStore:
{code}
long order = 1 + pipelineList.size() + snapshotList.size();
{code}
Although this part has no effect, counting the order down from Long.MAX_VALUE 
makes more sense. [~Apache9]
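
To illustrate the numbering idea in the comment above, here is a minimal sketch rather than the actual HBase code. The class and method names (MemstoreOrderSketch, storeFileOrders, memstoreOrders) are hypothetical, and the only assumptions are that store files are numbered upward from 0 and memstore segments downward from Long.MAX_VALUE, newest first, so the two ranges cannot overlap in practice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of the two numbering ranges; not HBase code. */
public class MemstoreOrderSketch {

  /** Store files get 0 (oldest) upward, as the getScannerOrder() javadoc describes. */
  static List<Long> storeFileOrders(int fileCount) {
    List<Long> orders = new ArrayList<>();
    for (long i = 0; i < fileCount; i++) {
      orders.add(i);
    }
    return orders;
  }

  /** Memstore segments, listed newest first, get Long.MAX_VALUE downward. */
  static List<Long> memstoreOrders(int segmentCount) {
    List<Long> orders = new ArrayList<>();
    for (long i = 0; i < segmentCount; i++) {
      orders.add(Long.MAX_VALUE - i);
    }
    return orders;
  }

  public static void main(String[] args) {
    List<Long> files = storeFileOrders(5);
    List<Long> segments = memstoreOrders(3);
    // The two ranges share no value, so a memstore order can never equal a file order.
    System.out.println("disjoint: " + Collections.disjoint(files, segments));
    // Every memstore order is larger than every store file order.
    System.out.println("memstore newer: "
        + (Collections.min(segments) > Collections.max(files)));
  }
}
```

With orders starting from 0 on both sides, as the issue describes, a memstore segment and a store file could end up with the same order value; the descending range avoids that even though, per the discussion here, the comparison is not expected to reach the order for memstore cells.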




> The scanner order for memstore scanners are wrong
> -
>
> Key: HBASE-18059
> URL: https://issues.apache.org/jira/browse/HBASE-18059
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, scan, Scanners
>Affects Versions: 2.0.0
>Reporter: Duo Zhang
>Assignee: Jingyun Tian
>Priority: Critical
> Fix For: 2.0.0


[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2017-06-05 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16037151#comment-16037151
 ] 

Appy commented on HBASE-18059:
--

Looking at things more closely after [~tianjingyun]'s comment. HBASE-15236 is 
a year old, so I had to dig deep again.
If the following invariant always holds, then you're right. [~stack] can you 
please confirm, since I don't know much about seqId.
"The seqId of cells in the memstore will always be greater than that of any 
store file or bulk loaded file."
-
Detailed analysis:

KVHeap uses 
[KVScannerComparator|https://github.com/apache/hbase/blob/e65d8653e566bbbae03578a1f9ad858cabcb48bc/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java#L191].
 It first compares the cells from two scanners using a CellComparator. Only 
when the comparison is equal, it uses scanner.getScannerOrder().

Thinking about the case when two scanners in a KeyValueHeap might return the 
same key, KVScannerComparator will resolve ties as follows:
- memstore scanners vs storefile (SF) scanner:
 -- if the cell in the SF has a seqId: its seqId should be less than that of the 
memstore's cell. Memstore scanner will win. 
 -- if the cell in the SF does NOT have a seqId: the SF's cell [defaults to 
seqId=0|https://github.com/apache/hbase/blob/e65d8653e566bbbae03578a1f9ad858cabcb48bc/hbase-common/src/main/java/org/apache/hadoop/hbase/Cell.java#L142],
 so the memstore scanner will win. (Note that the seqIds of cells in hfiles are 
removed on major compaction if older than a certain time; the default is 5 days.)
- memstore vs bulk loaded file (BLF)
 -- memstore cells will have a higher seqId, so the memstore scanner will win.
- SF vs SF -> Either the cells' seqIds will be used to break the tie, or 
getScannerOrder() (determined by the store files' seq ids).
- SF vs BLF
 -- if both cells have a seqId -> the one with the higher seqId wins.
 -- the BLF's cell has a seqId and the SF's cell doesn't -> the BLF scanner wins, 
since the SF cell's seqId defaults to 0. (This will happen when the bulk load 
file is new but the SF is the result of a major compaction in which the cells' 
seqIds were erased.)
 -- the SF's cell has a seqId but the BLF's cell doesn't -> can't happen. BLF 
cells always have a seqId.
- BLF vs BLF
 -- The cells' seqIds will be used to break the tie.
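
The tie-breaking rules above can be sketched as a two-stage comparison. This is a simplified model, not the real KVScannerComparator: the SketchScanner type, its fields, and the winner helper are hypothetical. It assumes that cells with the same key sort newest seqId first and that, on a full tie, the scanner with the higher order sorts first.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Simplified, hypothetical model of the tie-breaking above; not HBase code. */
public class ScannerTieBreakSketch {

  /** Stand-in for a scanner positioned on one cell; seqId 0 models a stripped seqId. */
  static final class SketchScanner {
    final String name;
    final String key;
    final long seqId;
    final long scannerOrder;

    SketchScanner(String name, String key, long seqId, long scannerOrder) {
      this.name = name;
      this.key = key;
      this.seqId = seqId;
      this.scannerOrder = scannerOrder;
    }
  }

  /** Compare the cells first; consult the scanner order only when the cells tie. */
  static final Comparator<SketchScanner> HEAP_ORDER = (left, right) -> {
    int byKey = left.key.compareTo(right.key);
    if (byKey != 0) {
      return byKey;
    }
    // Same key: the cell with the higher seqId sorts first.
    int bySeqId = Long.compare(right.seqId, left.seqId);
    if (bySeqId != 0) {
      return bySeqId;
    }
    // Full tie (assumed rule): the scanner with the higher order sorts first.
    return Long.compare(right.scannerOrder, left.scannerOrder);
  };

  /** Returns the name of the scanner that a heap using HEAP_ORDER would serve first. */
  static String winner(SketchScanner... scanners) {
    PriorityQueue<SketchScanner> heap = new PriorityQueue<>(HEAP_ORDER);
    for (SketchScanner scanner : scanners) {
      heap.add(scanner);
    }
    return heap.peek().name;
  }

  public static void main(String[] args) {
    SketchScanner memstore = new SketchScanner("memstore", "row1", 42, 0);
    SketchScanner newerFile = new SketchScanner("newer-SF", "row1", 0, 5);
    SketchScanner olderFile = new SketchScanner("older-SF", "row1", 0, 3);
    // Memstore cell has a seqId, the file cell defaults to 0: decided by seqId.
    System.out.println(winner(memstore, newerFile));
    // Both file cells have seqId 0: only now does the scanner order decide.
    System.out.println(winner(olderFile, newerFile));
  }
}
```

In this model the memstore scanner wins even though its scanner order is 0, which is the point of the analysis: as long as memstore cells carry a higher seqId, the order value assigned to memstore scanners is never consulted.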


[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2017-05-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027162#comment-16027162
 ] 

stack commented on HBASE-18059:
---

[~tianjingyun] [~appy] said he'd be by here. I'd value his opinion above mine.  
It might be a while because he is in exotic locations currently (He is back 
Weds..).


[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2017-05-26 Thread Jingyun Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16027141#comment-16027141
 ] 

Jingyun Tian commented on HBASE-18059:
--

[~stack] sir, do you think the modification is necessary? I think the only 
situation that would cause a problem is an hfile from a bulk import having the 
same sequence ID as cells in the memstore.


[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2017-05-22 Thread Jingyun Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16019453#comment-16019453
 ] 

Jingyun Tian commented on HBASE-18059:
--

[~appy] Since every cell in the memstore has a sequence ID, I don't think the 
memstore sharing a scanner order with a store file can actually produce a wrong 
ordering. The CellComparator considers the sequence ID before the scanner 
order, so the scanner-order comparison will never be reached. 
In addition, is there any situation in which data from a bulk import could have 
the same sequence ID as cells in the memstore?  



[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2017-05-17 Thread Jingyun Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013950#comment-16013950
 ] 

Jingyun Tian commented on HBASE-18059:
--

I will try to fix this and add a UT for it.



[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2017-05-16 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013379#comment-16013379
 ] 

Duo Zhang commented on HBASE-18059:
---

Any volunteers? I'm busy with other issues right now.

The fix is trivial but I think we'd better add a UT to confirm that we do not 
make this mistake again.



[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2017-05-16 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013372#comment-16013372
 ] 

Appy commented on HBASE-18059:
--

Yeah, it's wrong then. I think your suggestion of counting down from 
Long.MAX_VALUE makes sense.



[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2017-05-16 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013349#comment-16013349
 ] 

Duo Zhang commented on HBASE-18059:
---

They just assign scanner order from {{1 + pipelineList.size() + 
snapshotList.size()}} down to 0...



[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2017-05-16 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013103#comment-16013103
 ] 

Appy commented on HBASE-18059:
--

Yes, sir. It was part of HBASE-15236.

bq. But in CompactingMemStore and DefaultMemStore, the scanner order for 
memstore scanner is also start from 0, which will be messed up with 
StoreFileScanners.
Yes, if the memstore scanners start from 0, it's a bug.
But I see the following in CompactingMemStore:
{noformat}
long order = 1 + pipelineList.size() + snapshotList.size();
{noformat}
I didn't dig deep enough to confirm whether this guarantees that the lowest 
order assigned to a memstore scanner is greater than the number of store files.
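
The open question here can be checked with a rough sketch. It is not the CompactingMemStore code: `OrderOverlapSketch` and its methods are hypothetical, and the sketch assumes one scanner each for the active segment, every pipeline segment, and every snapshot segment, with the order decremented once per scanner from the starting value in the snippet above. Whether the real countdown ends at 0 or 1 is not shown in this thread.

```java
/** Hypothetical check for overlapping scanner orders; not HBase code. */
final class OrderOverlapSketch {

  /** Orders handed out to memstore scanners, counting down from the starting value. */
  static long[] memstoreOrders(int pipelineSize, int snapshotSize) {
    int scannerCount = 1 + pipelineSize + snapshotSize; // assumed: active + pipeline + snapshot
    long order = 1 + pipelineSize + snapshotSize;
    long[] orders = new long[scannerCount];
    for (int i = 0; i < scannerCount; i++) {
      orders[i] = order--;
    }
    return orders;
  }

  /** Store file scanners use orders 0..storeFileCount-1, so anything below the count collides. */
  static boolean overlapsStoreFiles(long[] memstoreOrders, int storeFileCount) {
    for (long order : memstoreOrders) {
      if (order < storeFileCount) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    long[] orders = memstoreOrders(2, 1);
    System.out.println(overlapsStoreFiles(orders, 3));
    System.out.println(overlapsStoreFiles(orders, 1));
  }
}
```

Because the starting value depends only on the memstore segment counts, nothing in this model keeps it above the store file count, so the ranges overlap as soon as there are enough store files.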



[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2017-05-16 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012648#comment-16012648
 ] 

stack commented on HBASE-18059:
---

[~appy] Is this your comment, sir?




[jira] [Commented] (HBASE-18059) The scanner order for memstore scanners are wrong

2017-05-16 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012430#comment-16012430
 ] 

Duo Zhang commented on HBASE-18059:
---

[~anastas] [~stack] FYI.
