[jira] [Created] (HIVE-17335) Join query with STREAMTABLE fails by java.lang.IndexOutOfBoundsException

2017-08-16 Thread Aleksey Vovchenko (JIRA)
Aleksey Vovchenko created HIVE-17335:


 Summary: Join query with STREAMTABLE fails by 
java.lang.IndexOutOfBoundsException
 Key: HIVE-17335
 URL: https://issues.apache.org/jira/browse/HIVE-17335
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 2.1.1, 1.2.1
Reporter: Aleksey Vovchenko


Steps to reproduce this issue: 

h2. STEP 1. Create test tables and insert some data 

{noformat}
hive> create table test1(x int, y int, z int);
hive> create table test2(x int, y int, z int);
{noformat}

{noformat}
hive> insert into table test1 values(1,1,1), (2,2,2);
hive> insert into table test2 values(1,5,5), (2,6,6);
{noformat}

h2. STEP 2. Disable MapJoin

{noformat}
hive> set hive.auto.convert.join = false;
{noformat}

h2. STEP 3. Run query

{noformat}
select /*+ STREAMTABLE(test1) */ test1.*, test2.x from test1 left join test2 on 
test1.x = test2.x and test1.x > 1;
{noformat}

EXPECTED RESULT: 
{noformat}
OK
1   1   1   NULL
2   2   2   2
{noformat}
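The expected output follows standard LEFT OUTER JOIN semantics: `test1.x > 1` is part of the ON clause, so left rows that fail it are still emitted with a NULL right side. A minimal plain-Python sketch of those semantics (an illustration of the expected result, not Hive code):

```python
# Plain-Python sketch of the LEFT JOIN with an ON-clause filter:
#   select test1.*, test2.x from test1 left join test2
#   on test1.x = test2.x and test1.x > 1
test1 = [(1, 1, 1), (2, 2, 2)]
test2 = [(1, 5, 5), (2, 6, 6)]

result = []
for left in test1:
    matches = [right for right in test2
               if left[0] == right[0] and left[0] > 1]
    if matches:
        result.extend(left + (right[0],) for right in matches)
    else:
        result.append(left + (None,))  # unmatched left row keeps a NULL right side

print(result)  # [(1, 1, 1, None), (2, 2, 2, 2)]
```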

ACTUAL RESULT:
{noformat} 
2017-08-17 00:36:46,305 Stage-1 map = 0%,  reduce = 0%
2017-08-17 00:36:51,708 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 1.25 sec
2017-08-17 00:36:52,761 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.35 sec
2017-08-17 00:37:17,137 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 2.35 sec
MapReduce Total cumulative CPU time: 2 seconds 350 msec
Ended Job = job_1502889241527_0005 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1502889241527_0005_m_00 (and more) from job job_1502889241527_0005

Task with the most failures(4):
-
Task ID:
  task_1502889241527_0005_r_00

-
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":1},"value":null}
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:257)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":1},"value":null}
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:245)
    ... 7 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at org.apache.hadoop.hive.ql.exec.JoinOperator.process(JoinOperator.java:138)
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:236)
    ... 7 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:635)
    at java.util.ArrayList.get(ArrayList.java:411)
    at org.apache.hadoop.hive.ql.exec.JoinUtil.isFiltered(JoinUtil.java:248)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getFilteredValue(CommonJoinOperator.java:420)
    at org.apache.hadoop.hive.ql.exec.JoinOperator.process(JoinOperator.java:91)
    ... 8 more

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
{noformat} 




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-16741) Counting number of records in hive and hbase are different for NULL fields in hive

2017-05-23 Thread Aleksey Vovchenko (JIRA)
Aleksey Vovchenko created HIVE-16741:


 Summary:  Counting number of records in hive and hbase are 
different for NULL fields in hive
 Key: HIVE-16741
 URL: https://issues.apache.org/jira/browse/HIVE-16741
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.1.0, 1.2.0
Reporter: Aleksey Vovchenko
Assignee: Aleksey Vovchenko


Steps to reproduce:

STEP 1.  

{noformat}
hbase> create 'testTable',{NAME=>'cf'}
{noformat}

STEP 2.
{noformat}
put 'testTable','10','cf:Address','My Address 411002'
put 'testTable','10','cf:contactId','653638'
put 'testTable','10','cf:currentStatus','Awaiting'
put 'testTable','10','cf:createdAt','1452815193'
put 'testTable','10','cf:Id','10'

put 'testTable','15','cf:contactId','653638'
put 'testTable','15','cf:currentStatus','Awaiting'
put 'testTable','15','cf:createdAt','1452815193'
put 'testTable','15','cf:Id','15'
{noformat}

(Note: the Address column is not provided for row 15, which means it is NULL in Hive.)

{noformat}
put 'testTable','20','cf:Address','My Address 411003'
put 'testTable','20','cf:contactId','653638'
put 'testTable','20','cf:currentStatus','Awaiting'
put 'testTable','20','cf:createdAt','1452815193'
put 'testTable','20','cf:Id','20'

put 'testTable','17','cf:Address','My Address 411003'
put 'testTable','17','cf:currentStatus','Awaiting'
put 'testTable','17','cf:createdAt','1452815193'
put 'testTable','17','cf:Id','17'
{noformat}

STEP 3.

{noformat}
hive> CREATE EXTERNAL TABLE hh_testTable(Id string, Address string, contactId string, currentStatus string, createdAt string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping"=":key,cf:Address,cf:contactId,cf:currentStatus,cf:createdAt")
TBLPROPERTIES ("hbase.table.name"="testTable");
{noformat}

STEP 4.

{noformat}
hive> select count(*),contactid from hh_testTable group by contactid;
{noformat}

Actual result:
OK
3   653638

Expected result:
OK
1   NULL
3   653638
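The expected behavior can be sketched in plain Python; the values mirror the HBase puts above, with None standing in for the missing contactId on row 17 (an illustration of the expected grouping, not Hive internals):

```python
from collections import Counter

# contactId values Hive should see for row keys 10, 15, 20, 17:
# rows 10, 15 and 20 carry cf:contactId=653638; row 17 has no
# contactId cell, so Hive should surface it as NULL (None here).
contact_ids = ['653638', '653638', '653638', None]

counts = Counter(contact_ids)
for value, n in counts.items():
    print(n, value)
# the NULL group should appear with count 1 alongside the 653638 group
```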




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HIVE-16118) Some queries with multiple inserts fail in Hive-1.2 and work in Hive-0.13

2017-03-06 Thread Aleksey Vovchenko (JIRA)
Aleksey Vovchenko created HIVE-16118:


 Summary: Some queries with multiple inserts fail in Hive-1.2 and 
work in Hive-0.13
 Key: HIVE-16118
 URL: https://issues.apache.org/jira/browse/HIVE-16118
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.1.1, 2.1.0, 1.2.0
Reporter: Aleksey Vovchenko


STEPS TO REPRODUCE:

1. Create tables and insert data into them:

{noformat}
CREATE TABLE `table1` (`id` INT, `date` TIMESTAMP);
CREATE TABLE `table2` (`id` INT, `date` TIMESTAMP);
CREATE TABLE `table3` (`id` INT, `date` TIMESTAMP);
CREATE TABLE `table4` (`id` INT, `date` TIMESTAMP);

INSERT OVERWRITE TABLE `table1` VALUES (1,'2006-03-30 19:42:06'),(2,'2014-05-11 09:39:11'),(3,'2010-09-01 04:42:17'),(4,'2012-01-04 19:56:20'),(5,'2011-02-12 03:03:42');
INSERT OVERWRITE TABLE `table2` VALUES (1,'2006-03-30 19:42:06'),(2,'2014-05-11 09:39:11'),(3,'2010-09-01 04:42:17'),(4,'2012-01-04 19:56:20'),(5,'2011-02-12 03:03:42');
{noformat}

2. Run the query with multiple inserts:

{noformat}
FROM `table1` AS `t1`
LEFT OUTER JOIN `table2` AS `t2`
ON `t1`.`id` = `t2`.`id`
INSERT OVERWRITE TABLE `table3`
SELECT `t1`.`id`, `t1`.`date`
WHERE STRING(`t1`.`date`) <= STRING(from_unixtime(unix_timestamp(),'yyyy-MM-dd'))
INSERT OVERWRITE TABLE `table4`
SELECT `t1`.`id`, `t1`.`date`
WHERE `t1`.`id` <= INT(IF(`t2`.`id` IS NULL,0,`t2`.`id`));
{noformat}
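For reference, a plain-Python sketch of what the two inserts should compute on the sample data. Two assumptions: the mangled date format in the query is read as 'yyyy-MM-dd', and the current date is later than every sample timestamp.

```python
from datetime import date

# Sample rows from table1/table2 above (id, timestamp string).
t1 = [(1, '2006-03-30 19:42:06'), (2, '2014-05-11 09:39:11'),
      (3, '2010-09-01 04:42:17'), (4, '2012-01-04 19:56:20'),
      (5, '2011-02-12 03:03:42')]
t2_ids = {row[0] for row in t1}  # table2 holds the same ids

table3, table4 = [], []
today = date.today().isoformat()  # stand-in for from_unixtime(unix_timestamp(),'yyyy-MM-dd')
for row_id, ts in t1:             # a single scan feeds both inserts
    joined_id = row_id if row_id in t2_ids else None
    if ts <= today:               # first INSERT's WHERE (lexicographic string comparison)
        table3.append((row_id, ts))
    if row_id <= (joined_id if joined_id is not None else 0):  # second INSERT's WHERE
        table4.append((row_id, ts))

print(len(table3), len(table4))  # both inserts should receive all 5 rows
```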



ACTUAL RESULT:

The query fails with the following exception:

{noformat}
2017-02-22 11:00:03,904 ERROR [Thread-14]: exec.Task (SessionState.java:printError(963)) -
Task with the most failures(4):
-
Task ID:
  task_1487688226052_0091_m_00

-
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"id":1,"date":"2006-03-30 19:42:06"}
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:172)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:458)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"id":1,"date":"2006-03-30 19:42:06"}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163)
    ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: org.apache.hadoop.hive.serde2.SerDeException: Error: expecting 2 but asking for field 2
data=[Ljava.lang.Object;@39836856
tableType=struct
dataType=struct<_col0:int,_col1:timestamp,_col5:int>
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:426)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:165)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:536)
    ... 9 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.serde2.SerDeException: Error: expecting 2 but asking for field 2
data=[Ljava.lang.Object;@39836856
tableType=struct
dataType=struct<_col0:int,_col1:timestamp,_col5:int>
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:789)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
    at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:122)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:644)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:676)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:754)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:414)
    ... 13 more
Caused by: org.apache.hadoop.hive.serde2.SerDeException: Error: expecting 2 but asking for field 2
data=[Ljava.lang.Object;@39836856
tableType=struct
dataType=struct<_col0:int,_col1:timestamp,_col5:int>
    at
{noformat}

[jira] [Created] (HIVE-15440) Multiple inserts query from the union table failed by java.lang.IllegalArgumentException

2016-12-15 Thread Aleksey Vovchenko (JIRA)
Aleksey Vovchenko created HIVE-15440:


 Summary: Multiple inserts query from the union table failed by 
java.lang.IllegalArgumentException
 Key: HIVE-15440
 URL: https://issues.apache.org/jira/browse/HIVE-15440
 Project: Hive
  Issue Type: Bug
Reporter: Aleksey Vovchenko


STEP 1. Configure Hive on Tez

STEP 2. Create test tables

{noformat}
hive> create table t1(x int);
hive> create table t2(x int);
hive> create table t1_tmp(x int);
hive> create table t2_tmp(x int);
{noformat}

STEP 3. Run query

{noformat}
hive> from (select x from t1 union all select x from t2) tbl
insert overwrite table t1_tmp select x
insert overwrite table t2_tmp select x;
{noformat}
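Semantically the statement is simple; a plain-Python sketch of what it should compute (the sample values are hypothetical, since the repro tables are empty):

```python
# Hypothetical contents of t1 and t2.
t1 = [1, 2]
t2 = [3, 4]

tbl = t1 + t2          # select x from t1 union all select x from t2
t1_tmp = list(tbl)     # insert overwrite table t1_tmp select x
t2_tmp = list(tbl)     # insert overwrite table t2_tmp select x

print(t1_tmp, t2_tmp)  # [1, 2, 3, 4] [1, 2, 3, 4]
```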

RESULT

{noformat}
2016-11-30 13:50:51,610 ERROR [main]: exec.Task (TezTask.java:execute(197)) - Failed to execute tez graph.
java.lang.IllegalArgumentException: VertexGroup must have at least 2 members
    at org.apache.tez.dag.api.VertexGroup.<init>(VertexGroup.java:77)
    at org.apache.tez.dag.api.DAG.createVertexGroup(DAG.java:177)
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:333)
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:169)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1635)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1395)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1208)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1035)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1025)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:201)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:153)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:364)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:712)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:631)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:570)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{noformat}



This issue doesn't happen in MR mode.





[jira] [Created] (HIVE-13460) ANALYZE TABLE COMPUTE STATISTICS FAILED max key length is 1000 bytes

2016-04-08 Thread Aleksey Vovchenko (JIRA)
Aleksey Vovchenko created HIVE-13460:


 Summary: ANALYZE TABLE COMPUTE STATISTICS FAILED max key length is 
1000 bytes
 Key: HIVE-13460
 URL: https://issues.apache.org/jira/browse/HIVE-13460
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.0.1
Reporter: Aleksey Vovchenko
Assignee: Aleksey Vovchenko


When Hive is configured to store statistics in MySQL, the following error occurs:
{noformat} 
2016-04-08 15:53:28,047 ERROR [main]: jdbc.JDBCStatsPublisher 
(JDBCStatsPublisher.java:init(316)) - Error during JDBC initialization.
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was 
too long; max key length is 767 bytes
{noformat} 

After setting the following MySQL properties:

{noformat} 
set global innodb_large_prefix = ON;
set global innodb_file_format = BARRACUDA;
{noformat} 

the error becomes:

{noformat} 
2016-04-08 15:56:05,552 ERROR [main]: jdbc.JDBCStatsPublisher 
(JDBCStatsPublisher.java:init(316)) - Error during JDBC initialization.
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was 
too long; max key length is 3072 bytes
{noformat} 


My investigation showed that MySQL does not allow creating a primary key larger than 3072 bytes.
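The two limits match InnoDB's documented index-key caps: 767 bytes per key part under the default Antelope row formats, and 3072 bytes with innodb_large_prefix and a Barracuda row format. A quick sketch of the arithmetic, assuming the utf8 charset where MySQL reserves 3 bytes per character:

```python
# InnoDB index key limits vs. VARCHAR column sizes under utf8 (3 bytes/char).
BYTES_PER_CHAR = 3           # MySQL's utf8 reserves 3 bytes per character
DEFAULT_LIMIT = 767          # Antelope row formats (COMPACT/REDUNDANT)
LARGE_PREFIX_LIMIT = 3072    # Barracuda + innodb_large_prefix=ON

def key_bytes(varchar_len):
    """Worst-case index key size for VARCHAR(varchar_len)."""
    return varchar_len * BYTES_PER_CHAR

for n in (255, 256, 1024, 1025):
    b = key_bytes(n)
    print(f"varchar({n}) -> {b} bytes; fits 767: {b <= DEFAULT_LIMIT}, "
          f"fits 3072: {b <= LARGE_PREFIX_LIMIT}")
# varchar(255) is the largest that fits under 767 bytes; varchar(1024) is
# the largest under 3072, so longer keys fail even with large prefixes on.
```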





[jira] [Created] (HIVE-13279) SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's file system

2016-03-14 Thread Aleksey Vovchenko (JIRA)
Aleksey Vovchenko created HIVE-13279:


 Summary: SHOW TABLE EXTENDED doesn't show the correct 
lastUpdateTime of partition's file system
 Key: HIVE-13279
 URL: https://issues.apache.org/jira/browse/HIVE-13279
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Aleksey Vovchenko
Assignee: Aleksey Vovchenko


h2. STEP 1. Create test Tables

Execute in command line:

{noformat} 
nano test.data
{noformat} 

Add to file:

{noformat}
1,aa
2,aa
3,ff
4,sad
5,adsf
6,adsf
7,affss
{noformat}

{noformat}
hadoop fs -put test.data /
{noformat} 

{noformat}
hive> create table test (x int, y string, z string) ROW FORMAT DELIMITED FIELDS 
TERMINATED BY ',';
hive> create table ptest(x int, y string) partitioned by(z string); 
hive> LOAD DATA  INPATH '/test.data' OVERWRITE INTO TABLE test;
hive> insert overwrite table ptest partition(z=65) select * from test;
hive> insert overwrite table ptest partition(z=67) select * from test;
{noformat}

h2. STEP 2. Compare lastUpdateTime

Execute in Hive shell:

{noformat}
hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='65');
hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='67');
{noformat}

The lastUpdateTime values should be different.

h2. STEP 3. Put data into hdfs and compare lastUpdateTime

Execute in command line:

{noformat}
hadoop fs -put test.data /user/hive/warehouse/ptest
{noformat}

Execute in Hive shell:

{noformat}
hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='65');
hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='67');
{noformat}

The lastUpdateTime values should be different, but they are the same.
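The expectation rests on a filesystem invariant: adding a file to a directory bumps the directory's modification time, which is what `SHOW TABLE EXTENDED` should surface as lastUpdateTime. A local-filesystem sketch of that invariant (a stand-in for HDFS, not Hive code):

```python
import os
import tempfile
import time

# Adding a file to a directory updates the directory's mtime; this is the
# signal lastUpdateTime should reflect after the `hadoop fs -put` above.
with tempfile.TemporaryDirectory() as part_dir:
    before = os.stat(part_dir).st_mtime
    time.sleep(1)  # make the mtime change observable at coarse granularity
    with open(os.path.join(part_dir, 'test.data'), 'w') as f:
        f.write('1,aa\n')
    after = os.stat(part_dir).st_mtime

print(after > before)  # True: the directory's timestamp moved forward
```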



