[
https://issues.apache.org/jira/browse/HBASE-15236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216887#comment-15216887
]
Hadoop QA commented on HBASE-15236:
-----------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m
0s {color} | {color:green} The patch appears to include 4 new or modified test
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m
32s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 12s
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 6m
32s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m
26s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m
33s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 16s
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 6m
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m
27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}
34m 56s {color} | {color:green} Patch does not cause any errors with Hadoop
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 145m 10s
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m
25s {color} | {color:green} Patch does not generate ASF License warnings.
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 215m 26s {color}
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests |
org.apache.hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures |
| | org.apache.hadoop.hbase.wal.TestDefaultWALProvider |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL |
https://issues.apache.org/jira/secure/attachment/12795883/HBASE-15236-master-v7.patch
|
| JIRA Issue | HBASE-15236 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck
hbaseanti checkstyle compile |
| uname | Linux asf910.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh
|
| git revision | master / 7f39baf |
| Default Java | 1.7.0_79 |
| Multi-JDK versions | /home/jenkins/tools/java/jdk1.8.0:1.8.0
/usr/local/jenkins/java/jdk1.7.0_79:1.7.0_79 |
| findbugs | v3.0.0 |
| unit |
https://builds.apache.org/job/PreCommit-HBASE-Build/1210/artifact/patchprocess/patch-unit-hbase-server.txt
|
| unit test logs |
https://builds.apache.org/job/PreCommit-HBASE-Build/1210/artifact/patchprocess/patch-unit-hbase-server.txt
|
| Test Results |
https://builds.apache.org/job/PreCommit-HBASE-Build/1210/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://builds.apache.org/job/PreCommit-HBASE-Build/1210/console |
| Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |
This message was automatically generated.
> Inconsistent cell reads over multiple bulk-loaded HFiles
> --------------------------------------------------------
>
> Key: HBASE-15236
> URL: https://issues.apache.org/jira/browse/HBASE-15236
> Project: HBase
> Issue Type: Bug
> Reporter: Appy
> Assignee: Appy
> Attachments: HBASE-15236-master-v2.patch,
> HBASE-15236-master-v3.patch, HBASE-15236-master-v4.patch,
> HBASE-15236-master-v5.patch, HBASE-15236-master-v6.patch,
> HBASE-15236-master-v7.patch, HBASE-15236.patch, TestWithSingleHRegion.java,
> tables_data.zip
>
>
> If there are two bulkloaded hfiles in a region with same seqID, same
> timestamps and duplicate keys*, get and scan may return different values for
> a key. Not sure how this would happen, but one of our customer uploaded a
> dataset with 2 files in a single region and both having same bulk load
> timestamp. These files are small ~50M (I couldn't find any setting for max
> file size that could lead to 2 files). The range of keys in two hfiles are
> overlapping to some extent, but not fully (so the two files are because of
> region merge).
> In such a case, depending on file sizes (used in
> StoreFile.Comparators.SEQ_ID), we may get different values for the same cell
> (say "r", "cf:50") depending on what we call: get "r" "cf:50" or get "r"
> "cf:".
> To replicate this, i ran ImportTsv twice with 2 different csv files but same
> timestamp (-Dimporttsv.timestamp=1). Collected two resulting hfiles in a
> single directory and loaded with LoadIncrementalHFiles. Here are the commands.
> {noformat}
> //CSV files should be in hdfs:///user/hbase/tmp
> sudo -u hbase hbase org.apache.hadoop.hbase.mapreduce.ImportTsv
> -Dimporttsv.columns=HBASE_ROW_KEY,c:1,c:2,c:3,c:4,c:5,c:6
> -Dimporttsv.separator='|' -Dimporttsv.timestamp=1
> -Dimporttsv.bulk.output=hdfs:///user/hbase/1 t tmp
> {noformat}
> tables_data.zip contains three tables: t, t2, and t3.
> To load these tables and test with them, use the attached
> TestWithSingleHRegion.java. (Change ROOT_DIR and TABLE_NAME variables
> appropriately)
> Hfiles for table 't':
> 1) 07922ac90208445883b2ddc89c424c89 : length = 1094: KVs = (c:5, 55) (c:6, 66)
> 2) 143137fa4fa84625b4e8d1d84a494f08 : length = 1192: KVs = (c:1, 1) (c:2, 2)
> (c:3, 3) (c:4, 4) (c:5, 5) (c:6, 6)
> Get returns 5 where as Scan returns 55.
> On the other hand if I make the first hfile, the one with values 55 and 66
> bigger in size than second hfile, then:
> Get returns 55 where as Scan returns 55.
> This can be seen in table 't2'.
> Weird things:
> 1. Get consistently returns values from larger hfile.
> 2. Scan consistently returns 55.
> Reason:
> Explanation 1: We add StoreFileScanners to heap by iterating over list of
> store files which is kept sorted in DefaultStoreFileManger using
> StoreFile.Comparators.SEQ_ID. It sorts the file first by increasing seq id,
> then by decreasing filesize. So our larger files sizes are always in the
> start.
> Explanation 2: In KeyValueHeap.next(), if both current and this.heap.peek()
> scanners (type is StoreFileScanner in this case) point to KVs which compare
> as equal, we put back the current scanner into heap, and call pollRealKV() to
> get new one. While inserting, KVScannerComparator is used which compares the
> two (one already in heap and the one being inserted) as equal and inserts it
> after the existing one. \[1\]
> Say s1 is scanner over first hfile and s2 over second.
> 1. We scan with s2 to get values 1, 2, 3, 4
> 2. see that both scanners have c:5 as next key, put current scanner s2 back
> in heap, take s1 out
> 4. return 55, advance s1 to key c:6
> 5. compare c:6 to c:5, put back s2 in heap since it has larger key, take out
> s1
> 6. ignore it's value (there is code for that too), advance s2 to c:6
> Go back to step 2 with both scanners having c:6 as next key
> This is wrong because if we add a key between c:4 and c:5 to first hfile, say
> (c:44, foo) which makes s1 as the 'current' scanner when we hit step 2, then
> we'll get the values - 1, 2, 3, 4, foo, 5, 6.
> (try table t3)
> Fix:
> Assign priority ids to StoreFileScanners when inserting them into heap,
> latest hfiles get higher priority, and use them in KVComparator instead of
> just seq id.
> \[1\]
> [PriorityQueue.siftUp()|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/PriorityQueue.java#586]
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)