[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly

2013-04-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13631832#comment-13631832
 ] 

Hudson commented on HIVE-3179:
--

Integrated in Hive-trunk-hadoop2 #160 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/160/])
HIVE-3179 HBase Handler doesn't handle NULLs properly (Lars Francke via 
Navis) (Revision 1467874)

 Result = FAILURE
navis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1467874
Files : 
* 
/hive/trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java
* 
/hive/trunk/hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestLazyHBaseObject.java


 HBase Handler doesn't handle NULLs properly
 ---

 Key: HIVE-3179
 URL: https://issues.apache.org/jira/browse/HIVE-3179
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.9.0, 0.10.0
Reporter: Lars Francke
Priority: Critical
 Fix For: 0.12.0

 Attachments: HIVE-3179.1.patch


 We found a severe issue in the HBase Handler: Hive can return incorrect data 
 if a column has NULL values in HBase (which means the cell doesn't even exist 
 for that row).
 In HBase Shell:
 {noformat}
 create 'hive_hbase_test', 'test'
 put 'hive_hbase_test', '1', 'test:c1', 'c1-1'
 put 'hive_hbase_test', '1', 'test:c2', 'c2-1'
 put 'hive_hbase_test', '1', 'test:c3', 'c3-1'
 put 'hive_hbase_test', '2', 'test:c1', 'c1-2'
 {noformat}
 In Hive:
 {noformat}
 DROP TABLE IF EXISTS hive_hbase_test;
 CREATE EXTERNAL TABLE hive_hbase_test (
   id int,
   c1 string,
   c2 string,
   c3 string
 )
 STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
 WITH SERDEPROPERTIES ("hbase.columns.mapping" =
 ":key#s,test:c1#s,test:c2#s,test:c3#s")
 TBLPROPERTIES("hbase.table.name" = "hive_hbase_test");
 hive> select * from hive_hbase_test;
 OK
 1  c1-1  c2-1  c3-1
 2  c1-2  NULL  NULL
 hive> select c1 from hive_hbase_test;
 c1-1
 c1-2
 hive> select c1, c2 from hive_hbase_test;
 c1-1  c2-1
 c1-2  NULL
 {noformat}
 So far everything is correct but now:
 {noformat}
 hive> select c1, c2, c2 from hive_hbase_test;
 c1-1  c2-1  c2-1
 c1-2  NULL  c2-1
 {noformat}
 Selecting c2 twice works the first time but the second time we
 actually get the value from the previous row.
 {noformat}
 hive> select c1, c3, c2, c2, c3, c3, c1 from hive_hbase_test;
 c1-1  c3-1  c2-1  c2-1  c3-1  c3-1  c1-1
 c1-2  NULL  NULL  c2-1  c3-1  c3-1  c1-2
 {noformat}
 We've narrowed this down to an early initialization of 
 {{fieldsInited\[fieldID] = true}} in {{LazyHBaseRow#uncheckedGetField}} and 
 we'll try to provide a patch which surely needs review.
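
 The mechanism described above can be illustrated with a minimal sketch. It 
 assumes a row object that is reused across rows and keeps its cached field 
 values when it moves to the next row. The class LazyRowSketch, its methods and 
 the markInitedEarly switch are hypothetical and are not Hive's LazyHBaseRow; 
 only the fieldsInited name comes from the report. Setting the flag late is 
 just one possible way to avoid the stale read and is not necessarily what 
 HIVE-3179.1.patch does.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a reusable lazy row. It is NOT Hive's
// LazyHBaseRow; only the fieldsInited name is taken from the report.
class LazyRowSketch {
    private final String[] columns;
    private final String[] cached;          // per-field values, reused across rows
    private final boolean[] fieldsInited;
    private final boolean markInitedEarly;  // true mimics the reported behaviour
    private Map<String, String> cells = new HashMap<>();

    LazyRowSketch(String[] columns, boolean markInitedEarly) {
        this.columns = columns;
        this.cached = new String[columns.length];
        this.fieldsInited = new boolean[columns.length];
        this.markInitedEarly = markInitedEarly;
    }

    // Point the reusable object at the next row. Only the flags are reset;
    // the cached values are left in place.
    void init(Map<String, String> rowCells) {
        this.cells = rowCells;
        Arrays.fill(fieldsInited, false);
    }

    String getField(int fieldID) {
        if (!fieldsInited[fieldID]) {
            if (markInitedEarly) {
                fieldsInited[fieldID] = true; // set before we know the cell exists
            }
            String value = cells.get(columns[fieldID]);
            if (value == null) {
                return null; // missing cell; cached[fieldID] keeps the old row's value
            }
            cached[fieldID] = value;
            fieldsInited[fieldID] = true;
        }
        return cached[fieldID];
    }

    // Reads test:c2 twice from a row without that cell, after a row that has it.
    static String[] readMissingCellTwice(boolean markInitedEarly) {
        LazyRowSketch row =
            new LazyRowSketch(new String[] {"test:c1", "test:c2"}, markInitedEarly);
        Map<String, String> row1 = new HashMap<>();
        row1.put("test:c1", "c1-1");
        row1.put("test:c2", "c2-1");
        Map<String, String> row2 = new HashMap<>();
        row2.put("test:c1", "c1-2");

        row.init(row1);
        row.getField(1); // caches "c2-1"
        row.init(row2);
        return new String[] {row.getField(1), row.getField(1)};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(readMissingCellTwice(true)));  // [null, c2-1]
        System.out.println(Arrays.toString(readMissingCellTwice(false))); // [null, null]
    }
}
```

 With the flag set early, the first read of the missing cell returns NULL but 
 the second read skips the lookup and returns the value cached from the 
 previous row, which matches the "NULL  c2-1" output above.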

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly

2013-04-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632134#comment-13632134
 ] 

Hudson commented on HIVE-3179:
--

Integrated in Hive-trunk-h0.21 #2065 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2065/])
HIVE-3179 HBase Handler doesn't handle NULLs properly (Lars Francke via 
Navis) (Revision 1467874)

 Result = FAILURE
navis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1467874
Files : 
* 
/hive/trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java
* 
/hive/trunk/hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestLazyHBaseObject.java




[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly

2013-02-09 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575232#comment-13575232
 ] 

Brock Noland commented on HIVE-3179:


Mark,

How did the tests turn out?



[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly

2013-02-09 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575246#comment-13575246
 ] 

Mark Grover commented on HIVE-3179:
---

They timed out on my pseudo-distributed laptop, but that is most likely an 
environment issue local to me. It looks like Lars mentioned that he had run 
the tests, so that should be OK.

I will try to fix the environment, but don't wait on me.



[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly

2013-02-08 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574553#comment-13574553
 ] 

Mark Grover commented on HIVE-3179:
---

Running TestHBaseCliDriver tests...



[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly

2013-02-07 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573987#comment-13573987
 ] 

Brock Noland commented on HIVE-3179:


I have verified this is an issue with trunk, the patch applies, and the patch 
addresses the issue.

{noformat}
hive> select c1, c3, c2, c2, c3, c3, c1 from hive_hbase_test;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201302071609_0002, Tracking URL = 
http://localhost:50030/jobdetails.jsp?jobid=job_201302071609_0002
Kill Command = /opt/local/hadoop-1.1.1/libexec/../bin/hadoop job  -kill 
job_201302071609_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2013-02-07 16:10:31,826 Stage-1 map = 0%,  reduce = 0%
2013-02-07 16:10:34,846 Stage-1 map = 100%,  reduce = 0%
2013-02-07 16:10:36,861 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201302071609_0002
MapReduce Jobs Launched: 
Job 0: Map: 1   HDFS Read: 260 HDFS Write: 60 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
c1-1  c3-1  c2-1  c2-1  c3-1  c3-1  c1-1
c1-2  NULL  NULL  NULL  NULL  NULL  c1-2
Time taken: 10.702 seconds, Fetched: 2 row(s)
hive> 
{noformat}



[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly

2013-02-07 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573991#comment-13573991
 ] 

Shreepadma Venugopalan commented on HIVE-3179:
--

+1.



[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly

2013-02-06 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572282#comment-13572282
 ] 

Lars Francke commented on HIVE-3179:


As far as I can tell this is still an issue. Would anyone mind doing a review 
on this one?



[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly

2012-06-24 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400269#comment-13400269
 ] 

Carl Steinbach commented on HIVE-3179:
--

@Lars: Please post a review request on reviews.apache.org. Thanks.





[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly

2012-06-24 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400289#comment-13400289
 ] 

Lars Francke commented on HIVE-3179:


@Carl: Sure: https://reviews.apache.org/r/5542/ thanks for the reminder.





[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly

2012-06-22 Thread Lars Francke (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399322#comment-13399322
 ] 

Lars Francke commented on HIVE-3179:


We could add a second boolean array to go with {{fieldsInited}} that's called 
{{fieldsNull}} that caches those fields. Not sure if that's needed though.
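
A rough sketch of that idea, assuming a row object that is reused across rows: 
a second boolean array records that a field was looked up and found missing, 
so repeated reads return NULL without another lookup. The class 
NullCachingRowSketch, its methods and the lookup counter are hypothetical 
illustrations and not code from the patch or from Hive.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the fieldsNull idea; not taken from Hive or the patch.
class NullCachingRowSketch {
    private final String[] columns;
    private final String[] cached;        // per-field values, reused across rows
    private final boolean[] fieldsInited; // field was looked up for the current row
    private final boolean[] fieldsNull;   // lookup found no cell for the current row
    private Map<String, String> cells = new HashMap<>();
    private int lookups;                  // counts backing-store lookups, for illustration

    NullCachingRowSketch(String[] columns) {
        this.columns = columns;
        this.cached = new String[columns.length];
        this.fieldsInited = new boolean[columns.length];
        this.fieldsNull = new boolean[columns.length];
    }

    // Move to the next row and reset both flag arrays.
    void init(Map<String, String> rowCells) {
        this.cells = rowCells;
        Arrays.fill(fieldsInited, false);
        Arrays.fill(fieldsNull, false);
    }

    String getField(int fieldID) {
        if (!fieldsInited[fieldID]) {
            fieldsInited[fieldID] = true;
            lookups++;
            String value = cells.get(columns[fieldID]);
            fieldsNull[fieldID] = (value == null);
            if (value != null) {
                cached[fieldID] = value;
            }
        }
        // A stale cached value is never returned for a field known to be null.
        return fieldsNull[fieldID] ? null : cached[fieldID];
    }

    int getLookups() {
        return lookups;
    }

    public static void main(String[] args) {
        NullCachingRowSketch row =
            new NullCachingRowSketch(new String[] {"test:c1", "test:c2"});
        Map<String, String> row1 = new HashMap<>();
        row1.put("test:c1", "c1-1");
        row1.put("test:c2", "c2-1");
        Map<String, String> row2 = new HashMap<>();
        row2.put("test:c1", "c1-2");

        row.init(row1);
        System.out.println(row.getField(1)); // c2-1
        row.init(row2);
        System.out.println(row.getField(1)); // null
        System.out.println(row.getField(1)); // null
        System.out.println(row.getLookups()); // 2
    }
}
```

The trade-off is one extra array per row object in exchange for not repeating 
the lookup for missing cells.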

Thanks to my colleague Oliver Meyn who actually looked at the code and found 
the fix, I only packaged it up and added the unit test.


