[jira] [Commented] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.

2015-08-06 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660277#comment-14660277
 ] 

Chao Sun commented on HIVE-11466:
-

Hmm, seems like with Thrift 0.9.0 it still has this message in the log.

 HIVE-10166 generates more data on hive.log causing Jenkins to fill all the 
 disk.
 

 Key: HIVE-11466
 URL: https://issues.apache.org/jira/browse/HIVE-11466
 Project: Hive
  Issue Type: Bug
Reporter: Sergio Peña
Assignee: Xuefu Zhang
 Attachments: HIVE-11466.patch


 An issue with the HIVE-10166 patch is that it increases the size of hive.log,  
 causing Jenkins to fail because it runs out of disk space.
 Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with 
 the patch, and after other commits.
 {noformat}
 BEFORE HIVE-10166
 13M Aug  5 11:57 ./hive-unit/target/tmp/log/hive.log
 WITH HIVE-10166
 2.4G Aug  5 12:07 ./hive-unit/target/tmp/log/hive.log
 CURRENT HEAD
 3.2G Aug  5 12:36 ./hive-unit/target/tmp/log/hive.log
 {noformat}
 This is just a single test, but on Jenkins, hive.log is more than 13G in size.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7566) HIVE can't count hbase NULL column value properly

2015-08-06 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660284#comment-14660284
 ] 

Aihua Xu commented on HIVE-7566:


There is HIVE-5277 for the same issue.  I will resolve this as dup then.

 HIVE can't count hbase NULL column value properly
 -

 Key: HIVE-7566
 URL: https://issues.apache.org/jira/browse/HIVE-7566
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.13.0
 Environment: HIVE version 0.13.0
 HBase version 0.98.0
Reporter: Kent Kong

 HBase table structure is like this:
 table name : 'testtable'
 column family : 'data'
 column 1 : 'name'
 column 2 : 'color'
 The HIVE mapping table structure is like this:
 table name : 'hb_testtable'
 column 1 : 'name'
 column 2 : 'color'
 In HBase, put two rows:
 James, blue
 May
 Then do a select in Hive:
 select * from hb_testtable where color is null
 The result is:
 May, NULL
 Then try a count:
 select count(*) from hb_testtable where color is null
 The result is 0, but it should be 1.





[jira] [Resolved] (HIVE-7566) HIVE can't count hbase NULL column value properly

2015-08-06 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu resolved HIVE-7566.

Resolution: Duplicate

 HIVE can't count hbase NULL column value properly
 -

 Key: HIVE-7566
 URL: https://issues.apache.org/jira/browse/HIVE-7566
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.13.0
 Environment: HIVE version 0.13.0
 HBase version 0.98.0
Reporter: Kent Kong

 HBase table structure is like this:
 table name : 'testtable'
 column family : 'data'
 column 1 : 'name'
 column 2 : 'color'
 The HIVE mapping table structure is like this:
 table name : 'hb_testtable'
 column 1 : 'name'
 column 2 : 'color'
 In HBase, put two rows:
 James, blue
 May
 Then do a select in Hive:
 select * from hb_testtable where color is null
 The result is:
 May, NULL
 Then try a count:
 select count(*) from hb_testtable where color is null
 The result is 0, but it should be 1.





[jira] [Commented] (HIVE-11475) Bad rename of directory during commit, when using HCat dynamic-partitioning.

2015-08-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660846#comment-14660846
 ] 

Hive QA commented on HIVE-11475:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748944/HIVE-11475.1.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4849/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4849/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4849/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: 
org.apache.hive.ptest.execution.ssh.SSHExecutionException: RSyncResult 
[localFile=/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4849/succeeded/TestJdbcWithMiniHS2,
 remoteFile=/home/hiveptest/54.146.159.23-hiveptest-1/logs/, getExitCode()=12, 
getException()=null, getUser()=hiveptest, getHost()=54.146.159.23, 
getInstance()=1]: 'Address 54.146.159.23 maps to 
ec2-54-146-159-23.compute-1.amazonaws.com, but this does not map back to the 
address - POSSIBLE BREAK-IN ATTEMPT!
receiving incremental file list
./
TEST-TestJdbcWithMiniHS2-TEST-org.apache.hive.jdbc.TestJdbcWithMiniHS2.xml
   [... rsync progress output trimmed; hive.log had reached 2746679296 bytes
   (18%, ~34 MB/s) when the console output was cut off ...]
{noformat}

[jira] [Commented] (HIVE-11397) Parse Hive OR clauses as they are written into the AST

2015-08-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660858#comment-14660858
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11397:
--

+1 for the change in patch #2, pending test runs. [~jcamachorodriguez], can you 
resubmit the patch? The earlier run was incomplete.

Thanks
Hari

 Parse Hive OR clauses as they are written into the AST
 --

 Key: HIVE-11397
 URL: https://issues.apache.org/jira/browse/HIVE-11397
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11397.1.patch, HIVE-11397.2.patch, HIVE-11397.patch


 When parsing A OR B OR C, hive converts it into 
 (C OR B) OR A
 instead of turning it into
 A OR (B OR C)
 {code}
 GenericUDFOPOr or = new GenericUDFOPOr();
 ListExprNodeDesc expressions = new ArrayListExprNodeDesc(2);
 expressions.add(previous);
 expressions.add(current);
 {code}





[jira] [Updated] (HIVE-11494) Some positive constant double predicates gets rounded off while negative constants are not

2015-08-06 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11494:
-
Description: 
Check the predicates in the filter expression for the following queries. It looks 
closely related to HIVE-11477 and HIVE-11493.
{code:title=explain select * from orc_ppd where f = -0.0799821186066;}
OK
Stage-0
   Fetch Operator
  limit:-1
  Select Operator [SEL_2]
 
outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
 Filter Operator [FIL_4]
predicate:(f = -0.0799821186066) (type: boolean)
TableScan [TS_0]
   alias:orc_ppd
{code}

{code:title=explain select * from orc_ppd where f = 0.0799821186066;}
OK
Stage-0
   Fetch Operator
  limit:-1
  Select Operator [SEL_2]
 
outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
 Filter Operator [FIL_4]
predicate:(f = 0.08) (type: boolean)
TableScan [TS_0]
   alias:orc_ppd
{code}

Negative string constants get rounded off.
{code:title=explain select * from orc_ppd where f = -0.0799821186066;}
OK
Stage-0
   Fetch Operator
  limit:-1
  Select Operator [SEL_2]
 
outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
 Filter Operator [FIL_4]
predicate:(f = -0.08) (type: boolean)
TableScan [TS_0]
   alias:orc_ppd
{code}

  was:
Check the predicates in filter expression for following queries. It looks 
closely related to HIVE-11477 and HIVE-11493
{code:title=explain select * from orc_ppd where f = -0.0799821186066;}
OK
Stage-0
   Fetch Operator
  limit:-1
  Select Operator [SEL_2]
 
outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
 Filter Operator [FIL_4]
predicate:(f = -0.0799821186066) (type: boolean)
TableScan [TS_0]
   alias:orc_ppd
{code}

{code:title=explain select * from orc_ppd where f = 0.0799821186066;}
OK
Stage-0
   Fetch Operator
  limit:-1
  Select Operator [SEL_2]
 
outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
 Filter Operator [FIL_4]
predicate:(f = 0.08) (type: boolean)
TableScan [TS_0]
   alias:orc_ppd
{code}


 Some positive constant double predicates gets rounded off while negative 
 constants are not
 --

 Key: HIVE-11494
 URL: https://issues.apache.org/jira/browse/HIVE-11494
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Pengcheng Xiong
Priority: Critical

 Check the predicates in the filter expression for the following queries. It looks 
 closely related to HIVE-11477 and HIVE-11493.
 {code:title=explain select * from orc_ppd where f = -0.0799821186066;}
 OK
 Stage-0
Fetch Operator
   limit:-1
   Select Operator [SEL_2]
  
 outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
  Filter Operator [FIL_4]
 predicate:(f = -0.0799821186066) (type: boolean)
 TableScan [TS_0]
alias:orc_ppd
 {code}
 {code:title=explain select * from orc_ppd where f = 0.0799821186066;}
 OK
 Stage-0
Fetch Operator
   limit:-1
   Select Operator [SEL_2]
  
 outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
  Filter Operator [FIL_4]
 predicate:(f = 0.08) (type: boolean)
 TableScan [TS_0]
alias:orc_ppd
 {code}
 Negative string constants get rounded off.
 {code:title=explain select * from orc_ppd where f = -0.0799821186066;}
 OK
 Stage-0
Fetch Operator
   limit:-1
   Select Operator [SEL_2]
  
 outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
  Filter Operator [FIL_4]
 predicate:(f = -0.08) (type: boolean)
 TableScan [TS_0]
alias:orc_ppd
 {code}
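One possible ingredient here, sketched purely as an illustration rather than a confirmed root cause (and assuming f is a float column, which the ticket doesn't state): a 12-digit double literal generally cannot survive a round trip through float, so some constant conversion has to happen before the comparison, and that is exactly where rounding behaviour like the above can creep in.

```java
// Illustration only: a long double literal is generally not representable as
// a float, so (float column) == (double constant) needs constant conversion.
public class FloatDoubleSketch {
    // True iff casting the double down to float and back loses nothing.
    static boolean roundTripsThroughFloat(double d) {
        return (double) (float) d == d;
    }

    public static void main(String[] args) {
        // The literal from the ticket does not survive the round trip...
        System.out.println(roundTripsThroughFloat(0.0799821186066)); // false
        // ...while a float-representable value does.
        System.out.println(roundTripsThroughFloat(0.5)); // true
    }
}
```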





[jira] [Commented] (HIVE-9811) Hive on Tez leaks WorkMap objects

2015-08-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660680#comment-14660680
 ] 

Sergey Shelukhin commented on HIVE-9811:


Is this fixed by HIVE-10778? That probably just needs to be back-ported.

 Hive on Tez leaks WorkMap objects
 -

 Key: HIVE-9811
 URL: https://issues.apache.org/jira/browse/HIVE-9811
 Project: Hive
  Issue Type: Bug
  Components: Tez
Reporter: Oleg Danilov
 Attachments: HIVE-9811.patch


 TezTask doesn't fully clean gWorkMap, so as a result Hive leaks WorkMap objects.





[jira] [Updated] (HIVE-11437) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into

2015-08-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11437:
---
Attachment: HIVE-11437.04.patch

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
 insert into
 ---

 Key: HIVE-11437
 URL: https://issues.apache.org/jira/browse/HIVE-11437
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11437.01.patch, HIVE-11437.02.patch, 
 HIVE-11437.03.patch, HIVE-11437.04.patch








[jira] [Commented] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.

2015-08-06 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660822#comment-14660822
 ] 

Thejas M Nair commented on HIVE-11466:
--

[~csun] [~xuefuz] So you think it's some other change in the Spark merge patch 
that is causing the problem?

 HIVE-10166 generates more data on hive.log causing Jenkins to fill all the 
 disk.
 

 Key: HIVE-11466
 URL: https://issues.apache.org/jira/browse/HIVE-11466
 Project: Hive
  Issue Type: Bug
Reporter: Sergio Peña
Assignee: Xuefu Zhang
 Attachments: HIVE-11466.patch


 An issue with the HIVE-10166 patch is that it increases the size of hive.log,  
 causing Jenkins to fail because it runs out of disk space.
 Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with 
 the patch, and after other commits.
 {noformat}
 BEFORE HIVE-10166
 13M Aug  5 11:57 ./hive-unit/target/tmp/log/hive.log
 WITH HIVE-10166
 2.4G Aug  5 12:07 ./hive-unit/target/tmp/log/hive.log
 CURRENT HEAD
 3.2G Aug  5 12:36 ./hive-unit/target/tmp/log/hive.log
 {noformat}
 This is just a single test, but on Jenkins, hive.log is more than 13G in size.





[jira] [Updated] (HIVE-11477) CBO inserts a UDF cast for integer type promotion (only for negative numbers)

2015-08-06 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11477:
-
Priority: Critical  (was: Major)

 CBO inserts a UDF cast for integer type promotion (only for negative numbers)
 -

 Key: HIVE-11477
 URL: https://issues.apache.org/jira/browse/HIVE-11477
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Pengcheng Xiong
Priority: Critical

 When CBO is enabled, filters that compare tinyint or smallint columns with 
 constant integer types will insert a UDFToInteger cast for the columns. When 
 CBO is disabled, there is no such UDF. This behaviour breaks the ORC predicate 
 pushdown feature, as ORC ignores UDFs in filters.
 In the following examples column t is tinyint
 {code:title=Explain for select count(*) from orc_ppd where t  -127; (CBO 
 OFF)}
 Filter Operator [FIL_9]
predicate:(t = 125) (type: boolean)
Statistics:Num rows: 1050 Data size: 611757 Basic 
 stats: COMPLETE Column stats: NONE
TableScan [TS_0]
   alias:orc_ppd
   Statistics:Num rows: 2100 Data size: 1223514 
 Basic stats: COMPLETE Column stats: NONE
 {code}
 {code:title=Explain for select count(*) from orc_ppd where t  -127; (CBO ON)}
 Filter Operator [FIL_10]
predicate:(UDFToInteger(t)  -127) (type: boolean)
Statistics:Num rows: 700 Data size: 407838 Basic 
 stats: COMPLETE Column stats: NONE
TableScan [TS_0]
   alias:orc_ppd
   Statistics:Num rows: 2100 Data size: 1223514 
 Basic stats: COMPLETE Column stats: NONE
 {code}
 CBO does not insert such a cast for non-negative numbers:
 {code:title=Explain for select count(*) from orc_ppd where t  127; (CBO ON)}
 Filter Operator [FIL_10]
predicate:(t  127) (type: boolean)
Statistics:Num rows: 700 Data size: 407838 Basic 
 stats: COMPLETE Column stats: NONE
TableScan [TS_0]
   alias:orc_ppd
   Statistics:Num rows: 2100 Data size: 1223514 
 Basic stats: COMPLETE Column stats: NONE
 {code}
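For context on why the inserted cast is semantically redundant (an illustration, not the CBO code, and using `<` only as an example operator since the comparison operators were stripped from the quoted queries): widening a tinyint-like byte to int is lossless, so a comparison against an int constant selects the same rows with or without explicit promotion; the damage is purely that the extra UDF defeats ORC predicate pushdown.

```java
public class WideningSketch {
    // Widening a (tinyint-like) byte to int never changes its value, so a
    // comparison against an int constant gives identical results with or
    // without the explicit promotion.
    static boolean sameResult(byte t, int constant) {
        boolean withCast = ((int) t) < constant;   // models UDFToInteger(t) < c
        boolean without = t < constant;            // byte is promoted implicitly
        return withCast == without;
    }

    public static void main(String[] args) {
        boolean all = true;
        for (int v = Byte.MIN_VALUE; v <= Byte.MAX_VALUE; v++) {
            all &= sameResult((byte) v, -127) && sameResult((byte) v, 127);
        }
        System.out.println(all); // true
    }
}
```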





[jira] [Commented] (HIVE-11406) Vectorization: StringExpr::compare() == 0 is bad for performance

2015-08-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660667#comment-14660667
 ] 

Lefty Leverenz commented on HIVE-11406:
---

I meant, I don't see the branch-1 commit on the commits@hive email list -- just 
master.

 Vectorization: StringExpr::compare() == 0 is bad for performance
 

 Key: HIVE-11406
 URL: https://issues.apache.org/jira/browse/HIVE-11406
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Matt McCline
 Attachments: HIVE-11406.01.patch


 {{StringExpr::compare() == 0}} is forced to evaluate the whole memory 
 comparison loop for differing lengths of strings, though there is no 
 possibility they will ever be equal.
 Add a {{StringExpr::equals}} which can be a smaller and tighter loop.
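To illustrate the point, here is a minimal hypothetical sketch (the class and method shapes are assumptions, not Hive's actual StringExpr code) of why a dedicated equality check can bail out on a length mismatch before touching any bytes, while a compare()-style loop cannot:

```java
// Hypothetical sketch, not Hive's real StringExpr: shows why a dedicated
// equals() can be a smaller, tighter loop than compare(...) == 0.
public class StringExprSketch {
    // compare()-style loop: must walk the common prefix even when the lengths
    // already prove the byte ranges cannot be equal.
    static int compare(byte[] a, int aLen, byte[] b, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return aLen - bLen;
    }

    // equals()-style loop: differing lengths can never be equal, so return
    // immediately without comparing any memory.
    static boolean equals(byte[] a, int aLen, byte[] b, int bLen) {
        if (aLen != bLen) {
            return false;
        }
        for (int i = 0; i < aLen; i++) {
            if (a[i] != b[i]) {
                return false;
            }
        }
        return true;
    }
}
```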





[jira] [Commented] (HIVE-11340) Create ORC based table using like clause doesn't copy compression property

2015-08-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660669#comment-14660669
 ] 

Hive QA commented on HIVE-11340:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748942/HIVE-11340.1.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9324 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4848/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4848/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4848/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748942 - PreCommit-HIVE-TRUNK-Build

 Create ORC based table using like clause doesn't copy compression property
 --

 Key: HIVE-11340
 URL: https://issues.apache.org/jira/browse/HIVE-11340
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.14.0, 1.0.0, 1.2.0
Reporter: Gaurav Kohli
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-11340.1.patch


 I found an issue in the “create table like” clause: it does not copy the table 
 properties from an ORC file format based table.
 Steps to reproduce:
 Step1 :
 create table orc_table (
 time string)
 stored as ORC tblproperties ("orc.compress"="SNAPPY");
 Step 2: 
 create table orc_table_using_like like orc_table;
 Step 3:
 show create table orc_table_using_like;  
 Result:
 createtab_stmt
 CREATE TABLE `orc_table_using_like`(
   `time` string)
 ROW FORMAT SERDE 
   'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
 STORED AS INPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
 OUTPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
 LOCATION
   'hdfs://nameservice1/user/hive/warehouse/gkohli.db/orc_table_using_like'
 TBLPROPERTIES (
   'transient_lastDdlTime'='1437578939')
 Issue:  'orc.compress'='SNAPPY' property is missing





[jira] [Commented] (HIVE-11493) Predicate with integer column equals double evaluates to false

2015-08-06 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660754#comment-14660754
 ] 

Prasanth Jayachandran commented on HIVE-11493:
--

This happens even when constant propagation and CBO are disabled.

 Predicate with integer column equals double evaluates to false
 --

 Key: HIVE-11493
 URL: https://issues.apache.org/jira/browse/HIVE-11493
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Pengcheng Xiong
Priority: Blocker

 Filters with an integer column equal to a double constant evaluate to false 
 every time. Negative double constants work fine.
 {code:title=explain select * from orc_ppd where t = 10.0;}
 OK
 Stage-0
Fetch Operator
   limit:-1
   Select Operator [SEL_2]
  
 outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
  Filter Operator [FIL_1]
 predicate:false (type: boolean)
 TableScan [TS_0]
alias:orc_ppd
 {code}
 {code:title=explain select * from orc_ppd where t = -10.0;}
 OK
 Stage-0
Fetch Operator
   limit:-1
   Select Operator [SEL_2]
  
 outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
  Filter Operator [FIL_1]
 predicate:(t = (- 10.0)) (type: boolean)
 TableScan [TS_0]
alias:orc_ppd
 {code}
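Worth noting, as an illustration rather than the fix: the double literal here is exactly representable and equal to the corresponding integer value, so folding the predicate to constant false cannot be justified by floating-point representation issues. A minimal sketch:

```java
public class ExactDoubleSketch {
    // 10.0 is exactly representable as a double and equal to the int 10, so
    // an integer column can genuinely match the constant; a planner folding
    // "t = 10.0" to constant false loses rows.
    static boolean matches(int t, double constant) {
        return t == constant;   // int is widened to double, losslessly here
    }

    public static void main(String[] args) {
        System.out.println(matches(10, 10.0));   // true
        System.out.println(matches(-10, -10.0)); // true
    }
}
```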





[jira] [Assigned] (HIVE-11488) Add sessionId info to HS2 log

2015-08-06 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-11488:
---

Assignee: Aihua Xu

 Add sessionId info to HS2 log
 -

 Key: HIVE-11488
 URL: https://issues.apache.org/jira/browse/HIVE-11488
 Project: Hive
  Issue Type: New Feature
  Components: Logging
Affects Versions: 2.0.0
Reporter: Aihua Xu
Assignee: Aihua Xu

 Sessions are critical for a multi-user system like Hive. Currently Hive doesn't 
 log the sessionId to the log file, which sometimes makes debugging and analysis 
 difficult when multiple activities are going on at the same time and the logs 
 from different sessions are mixed together.
 Currently, Hive already has a sessionId saved in SessionState, and there is 
 another sessionId in SessionHandle (which seems unused; I'm still looking to 
 understand it). Generally we should have one sessionId from the beginning on 
 both the client side and the server side, so it seems we have some work to do 
 there first.
 The sessionId can then be added to log4j's mapped diagnostic context (MDC) and 
 configured to be output to the log file through a log4j property. MDC is per 
 thread, so we need to add the sessionId to the HS2 main thread, and it will 
 then be inherited by the child threads. 
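The per-thread-with-inheritance behaviour that the last paragraph relies on can be sketched with a plain InheritableThreadLocal, which is how log4j 1.x's MDC propagates values to child threads (a simplified stand-in, not HS2 code):

```java
// Simplified stand-in for log4j's MDC: a value set on a parent thread is
// visible to threads it spawns, which is what lets HS2 set the sessionId
// once on the main thread and have worker threads log it.
public class MdcSketch {
    static final InheritableThreadLocal<String> SESSION_ID = new InheritableThreadLocal<>();

    // Sets the id on the calling ("main") thread, then reads it from a child.
    static String sessionIdSeenByChild(String id) {
        SESSION_ID.set(id);
        final String[] seen = new String[1];
        Thread child = new Thread(() -> seen[0] = SESSION_ID.get());
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(sessionIdSeenByChild("session-1234")); // session-1234
    }
}
```

Note that log4j 2's ThreadContext does not inherit to child threads by default, so this mirrors the log4j 1.x behaviour only.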





[jira] [Updated] (HIVE-11492) get rid of gWorkMap

2015-08-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11492:

Description: gWorkMap is an annoying ugly global that causes leaks. It's 
not clear why it is needed when we already have 10 different *Context objects 
floating around during compilation. At worst we can add another one, would 
still be better than the global map. It should be removed.  (was: gWorkMap is 
an annoying ugly global that causes leaks. It's not clear why this is needed 
when we already have 10 different *Context objects floating around during 
compilation. At worst we can add another one, would still be better than the 
global map. It should be removed.)

 get rid of gWorkMap
 ---

 Key: HIVE-11492
 URL: https://issues.apache.org/jira/browse/HIVE-11492
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin

 gWorkMap is an annoying ugly global that causes leaks. It's not clear why it 
 is needed when we already have 10 different *Context objects floating around 
 during compilation. At worst we can add another one, would still be better 
 than the global map. It should be removed.





[jira] [Updated] (HIVE-11490) Lazily call ASTNode::toStringTree() after tree modification

2015-08-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11490:
-
Attachment: HIVE-11490.1.patch

 Lazily call ASTNode::toStringTree() after tree modification
 ---

 Key: HIVE-11490
 URL: https://issues.apache.org/jira/browse/HIVE-11490
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-11490.1.patch


 Currently, we call toStringTree() as part of HIVE-11316 every time the tree is 
 modified. This is a bad approach, as we can lazily delay the work until 
 toStringTree() is called again.
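The lazy scheme described here amounts to cache invalidation: mark the cached string dirty on modification and rebuild it only when it is next requested. A minimal sketch of that shape (hypothetical, not the actual ASTNode code):

```java
// Hypothetical sketch of the lazy approach: modifications only invalidate the
// cached string; the (expensive) rebuild happens on the next toStringTree().
public class AstNodeSketch {
    private String text;
    private String cachedTree;   // null means "dirty, rebuild on demand"
    private int rebuilds;        // counts how often the expensive walk ran

    AstNodeSketch(String text) { this.text = text; }

    void modify(String newText) {
        text = newText;
        cachedTree = null;       // invalidate instead of rebuilding eagerly
    }

    String toStringTree() {
        if (cachedTree == null) {
            rebuilds++;
            cachedTree = "(" + text + ")";  // stand-in for the real tree walk
        }
        return cachedTree;
    }

    int rebuilds() { return rebuilds; }
}
```

With this shape, N consecutive modifications cost one rebuild instead of N.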





[jira] [Commented] (HIVE-11387) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix reduce_deduplicate optimization

2015-08-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660677#comment-14660677
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11387:
--

+1 pending test run.

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix 
 reduce_deduplicate optimization
 --

 Key: HIVE-11387
 URL: https://issues.apache.org/jira/browse/HIVE-11387
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11387.01.patch, HIVE-11387.02.patch, 
 HIVE-11387.03.patch, HIVE-11387.04.patch, HIVE-11387.05.patch


 {noformat}
 The main problem is that, due to the return path, we may now have 
 (RS1-GBY2)-(RS3-GBY4) when map.aggr=false, i.e., no map aggregation. However, 
 in the non-return path, it is treated as (RS1)-(GBY2-RS3-GBY4). The main 
 problem is that it does not take the setting into account.
 {noformat}





[jira] [Updated] (HIVE-11490) Lazily call ASTNode::toStringTree() after tree modification

2015-08-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11490:
-
Attachment: (was: HIVE-11490.1.patch)

 Lazily call ASTNode::toStringTree() after tree modification
 ---

 Key: HIVE-11490
 URL: https://issues.apache.org/jira/browse/HIVE-11490
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan

 Currently, we call toStringTree() as part of HIVE-11316 every time the tree is 
 modified. This is a bad approach, as we can lazily delay the work until 
 toStringTree() is called again.





[jira] [Updated] (HIVE-11490) Lazily call ASTNode::toStringTree() after tree modification

2015-08-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11490:
-
Attachment: HIVE-11490.1.patch

 Lazily call ASTNode::toStringTree() after tree modification
 ---

 Key: HIVE-11490
 URL: https://issues.apache.org/jira/browse/HIVE-11490
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-11490.1.patch


 Currently, we call toStringTree() as part of HIVE-11316 every time the tree is 
 modified. This is a bad approach, as we can lazily delay the work until 
 toStringTree() is called again.





[jira] [Commented] (HIVE-11437) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into

2015-08-06 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660784#comment-14660784
 ] 

Pengcheng Xiong commented on HIVE-11437:


[~jcamachorodriguez], as per your suggestion, I added a test file in the CBO 
return path. Could you please take another look? Thanks.

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
 insert into
 ---

 Key: HIVE-11437
 URL: https://issues.apache.org/jira/browse/HIVE-11437
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11437.01.patch, HIVE-11437.02.patch, 
 HIVE-11437.03.patch, HIVE-11437.04.patch








[jira] [Updated] (HIVE-11398) Parse wide OR and wide AND trees to flat OR/AND trees

2015-08-06 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11398:
---
Attachment: HIVE-11398.2.patch

[~gopalv], new patch fixes the issues with the optimization; triggering a new 
QA run.
There are some changes in vectorization tests (vectorization mode gets 
disabled) that I guess will be solved once [~mmccline] patches go in?
Thanks

 Parse wide OR and wide AND trees to flat OR/AND trees
 -

 Key: HIVE-11398
 URL: https://issues.apache.org/jira/browse/HIVE-11398
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer, UDF
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11398.2.patch, HIVE-11398.patch


 Deep trees of AND/OR are hard to traverse, particularly when they are merely 
 the nested form of an operator that takes an arbitrary number of args.
 One potential way to convert the DFS searches into a simpler BFS search is to 
 introduce a new Operator pair named ALL and ANY.
 ALL(A, B, C, D, E) represents AND(AND(AND(AND(E, D), C), B), A)
 ANY(A, B, C, D, E) represents OR(OR(OR(OR(E, D), C),B),A)
 The SemanticAnalyser would be responsible for generating these operators and 
 this would mean that the depth and complexity of traversals for the simplest 
 case of wide AND/OR trees would be trivial.
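The flattening idea can be sketched as follows; `BoolNode` and `flatten` are hypothetical names standing in for the proposed ALL/ANY operators. Collapsing a deep same-operator tree into one flat child list turns a deep DFS into a single loop over children:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BoolNode {
    final String op;                 // "AND", "OR", or a leaf predicate name
    final List<BoolNode> children;

    BoolNode(String op, BoolNode... children) {
        this.op = op;
        this.children = Arrays.asList(children);
    }

    // Collect the operands of a deep same-operator tree into one flat list,
    // conceptually turning AND(AND(AND(E, D), C), B) into ALL(E, D, C, B).
    static List<BoolNode> flatten(BoolNode node, String op) {
        List<BoolNode> flat = new ArrayList<>();
        if (node.op.equals(op)) {
            for (BoolNode c : node.children) {
                flat.addAll(flatten(c, op));   // recurse only into same-op subtrees
            }
        } else {
            flat.add(node);                    // different op or leaf: keep as one operand
        }
        return flat;
    }
}
```

After one flattening pass, later optimizer traversals of a wide AND/OR see depth 1 regardless of how deeply the parser nested the binary operators.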





[jira] [Reopened] (HIVE-4570) More information to user on GetOperationStatus in Hive Server2 when query is still executing

2015-08-06 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu reopened HIVE-4570:
---
  Assignee: (was: Jaideep Dhok)

Reopening as the functionality to view task/stage progress is not available yet.

 More information to user on GetOperationStatus in Hive Server2 when query is 
 still executing
 

 Key: HIVE-4570
 URL: https://issues.apache.org/jira/browse/HIVE-4570
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Amareshwari Sriramadasu

 Currently in Hive Server2, while the query is still executing, the status is 
 only set as STILL_EXECUTING. 
 This issue is to give more information to the user such as progress and 
 running job handles, if possible.





[jira] [Commented] (HIVE-11437) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into

2015-08-06 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659750#comment-14659750
 ] 

Jesus Camacho Rodriguez commented on HIVE-11437:


[~pxiong], patch looks good to me, but can we add a test file to test insert 
into in CBO return path? That will avoid creating any regressions in the 
future. Thanks

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
 insert into
 ---

 Key: HIVE-11437
 URL: https://issues.apache.org/jira/browse/HIVE-11437
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11437.01.patch, HIVE-11437.02.patch, 
 HIVE-11437.03.patch








[jira] [Updated] (HIVE-11391) CBO (Calcite Return Path): Add CBO tests with return path on

2015-08-06 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11391:
---
Attachment: HIVE-11391.patch

 CBO (Calcite Return Path): Add CBO tests with return path on
 

 Key: HIVE-11391
 URL: https://issues.apache.org/jira/browse/HIVE-11391
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11391.patch, HIVE-11391.patch, HIVE-11391.patch, 
 HIVE-11391.patch








[jira] [Updated] (HIVE-7476) CTAS does not work properly for s3

2015-08-06 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-7476:

Attachment: HIVE-7476.3.patch

Thanks for the review, addressed the comments.

 CTAS does not work properly for s3
 --

 Key: HIVE-7476
 URL: https://issues.apache.org/jira/browse/HIVE-7476
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.1, 1.1.0
 Environment: Linux
Reporter: Jian Fang
Assignee: Szehon Ho
 Attachments: HIVE-7476.1.patch, HIVE-7476.2.patch, HIVE-7476.3.patch


 When we use CTAS to create a new table in s3, the table location is not set 
 correctly. As a result, the data from the existing table cannot be inserted 
 into the newly created table.
 We can use the following example to reproduce this issue.
 set hive.metastore.warehouse.dir=OUTPUT_PATH;
 drop table s3_dir_test;
 drop table s3_1;
 drop table s3_2;
 create external table s3_dir_test(strct struct<a:int, b:string, c:string>)
 row format delimited
 fields terminated by '\t'
 collection items terminated by ' '
 location 'INPUT_PATH';
 create table s3_1(strct struct<a:int, b:string, c:string>)
 row format delimited
 fields terminated by '\t'
 collection items terminated by ' ';
 insert overwrite table s3_1 select * from s3_dir_test;
 select * from s3_1;
 create table s3_2 as select * from s3_1;
 select * from s3_1;
 select * from s3_2;
 The data could be as follows.
 1 abc 10.5
 2 def 11.5
 3 ajss 90.23232
 4 djns 89.02002
 5 random 2.99
 6 data 3.002
 7 ne 71.9084
 The root cause is that the SemanticAnalyzer class did not handle s3 location 
 properly for CTAS.
 A patch will be provided shortly.





[jira] [Updated] (HIVE-11367) CBO: Calcite Operator To Hive Operator (Calcite Return Path): ExprNodeConverter should use HiveDecimal to create Decimal

2015-08-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11367:
-
Fix Version/s: 1.3.0

 CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
 ExprNodeConverter should use HiveDecimal to create Decimal
 

 Key: HIVE-11367
 URL: https://issues.apache.org/jira/browse/HIVE-11367
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11367.01.patch, HIVE-11367.01.patch-branch-1








[jira] [Updated] (HIVE-11498) HIVE Authorization v2 should not check permission for dummy entity

2015-08-06 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-11498:
--
Attachment: HIVE-11498.001.patch

 HIVE Authorization v2 should not check permission for dummy entity
 --

 Key: HIVE-11498
 URL: https://issues.apache.org/jira/browse/HIVE-11498
 Project: Hive
  Issue Type: Bug
Reporter: Dapeng Sun
Assignee: Dapeng Sun
 Attachments: HIVE-11498.001.patch








[jira] [Commented] (HIVE-11376) CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are found for one of the input files

2015-08-06 Thread Rajat Khandelwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661331#comment-14661331
 ] 

Rajat Khandelwal commented on HIVE-11376:
-

Taking patch from reviewboard and attaching

 CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are 
 found for one of the input files
 -

 Key: HIVE-11376
 URL: https://issues.apache.org/jira/browse/HIVE-11376
 Project: Hive
  Issue Type: Bug
Reporter: Rajat Khandelwal
Assignee: Rajat Khandelwal
 Attachments: HIVE-11376.02.patch


 https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java#L379
 This is the exact code snippet:
 {noformat}
 // Since there is no easy way of knowing whether MAPREDUCE-1597 is present in the tree or not,
 // we use a configuration variable for the same
 if (this.mrwork != null && !this.mrwork.getHadoopSupportsSplittable()) {
   // The following code should be removed, once
   // https://issues.apache.org/jira/browse/MAPREDUCE-1597 is fixed.
   // Hadoop does not handle non-splittable files correctly for CombineFileInputFormat,
   // so don't use CombineFileInputFormat for non-splittable files,
   // i.e., don't combine if inputformat is a TextInputFormat and has compression turned on
 {noformat}
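The fallback rule the quoted comment implements can be sketched as below. The extension-based codec check is a stand-in assumption for what the real code does with Hadoop's codec classes; the point is that a single non-splittable compressed file disables combining (i.e. CombineHiveInputFormat falls back to HiveInputFormat) for the whole input:

```java
import java.util.List;

// Hypothetical sketch: a compressed text file is only safe to combine when
// splitting compressed input is supported AND its codec is itself splittable.
class CombineDecision {
    static boolean canCombine(List<String> files, boolean hadoopSupportsSplittable) {
        for (String f : files) {
            boolean compressed = f.endsWith(".gz") || f.endsWith(".bz2");
            boolean splittableCodec = f.endsWith(".bz2");  // a gzip stream cannot be split
            if (compressed && !(hadoopSupportsSplittable && splittableCodec)) {
                return false;   // one bad file disables combining for all inputs
            }
        }
        return true;
    }
}
```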





[jira] [Updated] (HIVE-11376) CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are found for one of the input files

2015-08-06 Thread Rajat Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajat Khandelwal updated HIVE-11376:

Attachment: HIVE-11376.02.patch

 CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are 
 found for one of the input files
 -

 Key: HIVE-11376
 URL: https://issues.apache.org/jira/browse/HIVE-11376
 Project: Hive
  Issue Type: Bug
Reporter: Rajat Khandelwal
Assignee: Rajat Khandelwal
 Attachments: HIVE-11376.02.patch


 https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java#L379
 This is the exact code snippet:
 {noformat}
 // Since there is no easy way of knowing whether MAPREDUCE-1597 is present in the tree or not,
 // we use a configuration variable for the same
 if (this.mrwork != null && !this.mrwork.getHadoopSupportsSplittable()) {
   // The following code should be removed, once
   // https://issues.apache.org/jira/browse/MAPREDUCE-1597 is fixed.
   // Hadoop does not handle non-splittable files correctly for CombineFileInputFormat,
   // so don't use CombineFileInputFormat for non-splittable files,
   // i.e., don't combine if inputformat is a TextInputFormat and has compression turned on
 {noformat}





[jira] [Commented] (HIVE-7476) CTAS does not work properly for s3

2015-08-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661368#comment-14661368
 ] 

Hive QA commented on HIVE-7476:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12749117/HIVE-7476.3.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9326 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4854/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4854/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4854/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12749117 - PreCommit-HIVE-TRUNK-Build

 CTAS does not work properly for s3
 --

 Key: HIVE-7476
 URL: https://issues.apache.org/jira/browse/HIVE-7476
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.1, 1.1.0
 Environment: Linux
Reporter: Jian Fang
Assignee: Szehon Ho
 Attachments: HIVE-7476.1.patch, HIVE-7476.2.patch, HIVE-7476.3.patch


 When we use CTAS to create a new table in s3, the table location is not set 
 correctly. As a result, the data from the existing table cannot be inserted 
 into the newly created table.
 We can use the following example to reproduce this issue.
 set hive.metastore.warehouse.dir=OUTPUT_PATH;
 drop table s3_dir_test;
 drop table s3_1;
 drop table s3_2;
 create external table s3_dir_test(strct struct<a:int, b:string, c:string>)
 row format delimited
 fields terminated by '\t'
 collection items terminated by ' '
 location 'INPUT_PATH';
 create table s3_1(strct struct<a:int, b:string, c:string>)
 row format delimited
 fields terminated by '\t'
 collection items terminated by ' ';
 insert overwrite table s3_1 select * from s3_dir_test;
 select * from s3_1;
 create table s3_2 as select * from s3_1;
 select * from s3_1;
 select * from s3_2;
 The data could be as follows.
 1 abc 10.5
 2 def 11.5
 3 ajss 90.23232
 4 djns 89.02002
 5 random 2.99
 6 data 3.002
 7 ne 71.9084
 The root cause is that the SemanticAnalyzer class did not handle s3 location 
 properly for CTAS.
 A patch will be provided shortly.





[jira] [Updated] (HIVE-5277) HBase handler skips rows with null valued first cells when only row key is selected

2015-08-06 Thread Swarnim Kulkarni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swarnim Kulkarni updated HIVE-5277:
---
Attachment: HIVE-5277.3.patch.txt

Updated patch to fix counts for count(key) and count(*)

 HBase handler skips rows with null valued first cells when only row key is 
 selected
 ---

 Key: HIVE-5277
 URL: https://issues.apache.org/jira/browse/HIVE-5277
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.11.0, 0.11.1, 0.12.0, 0.13.0
Reporter: Teddy Choi
Assignee: Swarnim Kulkarni
Priority: Critical
 Attachments: HIVE-5277.1.patch.txt, HIVE-5277.2.patch.txt, 
 HIVE-5277.3.patch.txt


 HBaseStorageHandler skips rows with null valued first cells when only row key 
 is selected.
 {noformat}
 SELECT key, col1, col2 FROM hbase_table;
 key1  cell1   cell2 
 key2  NULLcell3
 SELECT COUNT(key) FROM hbase_table;
 1
 {noformat}
 HiveHBaseTableInputFormat.getRecordReader selects the first cell to avoid 
 skipping rows. But when the first cell is null, HBase skips that row.
 http://hbase.apache.org/book/perf.reading.html 12.9.6. Optimal Loading of Row 
 Keys describes how to deal with this problem.
 I tried to find an existing issue, but I couldn't. If you find the same issue, 
 please mark this one as a duplicate.





[jira] [Updated] (HIVE-11376) CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are found for one of the input files

2015-08-06 Thread Rajat Khandelwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajat Khandelwal updated HIVE-11376:

Attachment: (was: HIVE-11376_02.patch)

 CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are 
 found for one of the input files
 -

 Key: HIVE-11376
 URL: https://issues.apache.org/jira/browse/HIVE-11376
 Project: Hive
  Issue Type: Bug
Reporter: Rajat Khandelwal
Assignee: Rajat Khandelwal
 Attachments: HIVE-11376.02.patch


 https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java#L379
 This is the exact code snippet:
 {noformat}
 // Since there is no easy way of knowing whether MAPREDUCE-1597 is present in the tree or not,
 // we use a configuration variable for the same
 if (this.mrwork != null && !this.mrwork.getHadoopSupportsSplittable()) {
   // The following code should be removed, once
   // https://issues.apache.org/jira/browse/MAPREDUCE-1597 is fixed.
   // Hadoop does not handle non-splittable files correctly for CombineFileInputFormat,
   // so don't use CombineFileInputFormat for non-splittable files,
   // i.e., don't combine if inputformat is a TextInputFormat and has compression turned on
 {noformat}





[jira] [Updated] (HIVE-11449) HybridHashTableContainer should throw exception if not enough memory to create the hash tables

2015-08-06 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-11449:
--
Attachment: HIVE-11449.2.patch

Attaching patch v2 - this prevents us from passing too low of a memUsage value 
by making sure it is at least wbSize.

 HybridHashTableContainer should throw exception if not enough memory to 
 create the hash tables
 --

 Key: HIVE-11449
 URL: https://issues.apache.org/jira/browse/HIVE-11449
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-11449.1.patch, HIVE-11449.2.patch


 Currently it only logs a warning message:
 {code}
   public static int calcNumPartitions(long memoryThreshold, long dataSize, 
 int minNumParts,
   int minWbSize, HybridHashTableConf nwayConf) throws IOException {
 int numPartitions = minNumParts;
 if (memoryThreshold < minNumParts * minWbSize) {
   LOG.warn("Available memory is not enough to create a HybridHashTableContainer!");
 }
 {code}
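The change being discussed, failing fast instead of warning, might look roughly like the sketch below. `PartitionCalc` is a hypothetical stand-in, not the actual HIVE-11449 patch:

```java
import java.io.IOException;

// Hypothetical sketch: if the memory threshold cannot cover even the minimum
// number of minimum-sized write buffers, fail the query with a descriptive
// error instead of logging a warning and continuing into an undersized
// allocation that later surfaces as "Capacity must be a power of two".
class PartitionCalc {
    static int calcNumPartitions(long memoryThreshold, int minNumParts, int minWbSize)
            throws IOException {
        long required = (long) minNumParts * minWbSize;
        if (memoryThreshold < required) {
            throw new IOException("Available memory (" + memoryThreshold
                + " bytes) is not enough for " + minNumParts
                + " write buffers of " + minWbSize + " bytes (need "
                + required + " bytes)");
        }
        return minNumParts;
    }
}
```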
 Because we only log a warning, processing continues and hits a 
 hard-to-diagnose error (log below also includes extra logging I added to help 
 track this down). We should probably just fail the query with a useful error 
 message instead.
 {noformat}
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] WARN 
 org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer: 
 Available memory is not enough to create HybridHashTableContainers 
 consistently!
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 initialCapacity 1: 10
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 initialCapacity 2: 131072
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 maxCapacity: 0
 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR 
 org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** 
 initialCapacity 3: 0
 2015-07-30 18:49:29,699 
 [TezTaskRunner_attempt_1437197396589_0685_1_49_00_2(attempt_1437197396589_0685_1_49_00_2)]
  ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: 
 java.lang.RuntimeException: Map operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:258)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:168)
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:157)
   at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
   at 
 org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async 
 initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:419)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:389)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:514)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:467)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
   at 
 org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:243)
   ... 15 more
 Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError: 
 Capacity must be a power of two
   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
   at java.util.concurrent.FutureTask.get(FutureTask.java:188)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:409)
   ... 20 more
 Caused 

[jira] [Commented] (HIVE-11375) Broken processing of queries containing NOT (x IS NOT NULL and x <> 0)

2015-08-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661213#comment-14661213
 ] 

Hive QA commented on HIVE-11375:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748958/HIVE-11375.patch

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9326 tests executed
*Failed tests:*
{noformat}
TestMarkPartition - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_udf
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_udf
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_filter_join_breaktask
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4852/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4852/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4852/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748958 - PreCommit-HIVE-TRUNK-Build

 Broken processing of queries containing NOT (x IS NOT NULL and x <> 0)
 --

 Key: HIVE-11375
 URL: https://issues.apache.org/jira/browse/HIVE-11375
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 2.0.0
Reporter: Mariusz Sakowski
Assignee: Aihua Xu
 Fix For: 2.0.0

 Attachments: HIVE-11375.patch


 When running query like this:
 {code}explain select * from test where (val is not null and val <> 0);{code}
 hive will simplify the expression in parentheses and omit the is not null check:
 {code}
   Filter Operator
 predicate: (val <> 0) (type: boolean)
 {code}
 which is fine.
 but if we negate condition using NOT operator:
 {code}explain select * from test where not (val is not null and val <> 0);{code}
 hive will also simplify things, but now it will break stuff:
 {code}
   Filter Operator
 predicate: (not (val <> 0)) (type: boolean)
 {code}
 because valid predicate should be *val == 0 or val is null*, while above row 
 is equivalent to *val == 0* only, filtering away rows where val is null
 simple example:
 {code}
 CREATE TABLE example (
 val bigint
 );
 INSERT INTO example VALUES (1), (NULL), (0);
 -- returns 2 rows - NULL and 0
 select * from example where (val is null or val == 0);
 -- returns 1 row - 0
 select * from example where not (val is not null and val <> 0);
 {code}
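The difference above is SQL's three-valued logic: `val IS NOT NULL` never evaluates to UNKNOWN, but `val <> 0` does when `val` is NULL, so the original predicate and its broken simplification disagree exactly on NULL rows (which WHERE then filters out). A minimal sketch, using Java's boxed `Boolean` with `null` standing for UNKNOWN and illustrative names throughout:

```java
// Sketch of SQL three-valued logic; not Hive code.
class ThreeValued {
    static Boolean not(Boolean x) {
        return x == null ? null : !x;
    }

    static Boolean and(Boolean a, Boolean b) {
        // FALSE dominates AND even when the other side is UNKNOWN
        if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) return false;
        if (a == null || b == null) return null;
        return true;
    }

    // The original predicate: NOT (val IS NOT NULL AND val <> 0)
    static Boolean fullPredicate(Long val) {
        Boolean isNotNull = (val != null);                       // never UNKNOWN
        Boolean notEqZero = (val == null) ? null : (val != 0L);  // NULL <> 0 is UNKNOWN
        return not(and(isNotNull, notEqZero));
    }

    // The broken simplification: NOT (val <> 0)
    static Boolean simplified(Long val) {
        return not((val == null) ? null : (val != 0L));
    }
}
```

For a NULL row the full predicate yields TRUE (row kept) while the simplified one yields UNKNOWN (row dropped), which is exactly the 2-row vs 1-row difference in the example.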





[jira] [Updated] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0

2015-08-06 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-10975:

Attachment: HIVE-10975.2.patch

 Parquet: Bump the parquet version up to 1.8.0
 -

 Key: HIVE-10975
 URL: https://issues.apache.org/jira/browse/HIVE-10975
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Priority: Minor
 Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, 
 HIVE-10975.1.patch, HIVE-10975.2.patch, HIVE-10975.2.patch, 
 HIVE-10975.2.patch, HIVE-10975.patch


 There are lots of changes since parquet's graduation.





[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0

2015-08-06 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661323#comment-14661323
 ] 

Ferdinand Xu commented on HIVE-10975:
-

Now the jenkins result looks good, [~spena], could you take a look at it? 
Thanks!

 Parquet: Bump the parquet version up to 1.8.0
 -

 Key: HIVE-10975
 URL: https://issues.apache.org/jira/browse/HIVE-10975
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Priority: Minor
 Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, 
 HIVE-10975.1.patch, HIVE-10975.2.patch, HIVE-10975.2.patch, 
 HIVE-10975.2.patch, HIVE-10975.patch


 There are lots of changes since parquet's graduation.





[jira] [Updated] (HIVE-11356) SMB join on tez fails when one of the tables is empty

2015-08-06 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-11356:
--
Attachment: HIVE-11356.3.patch

[~hagleitn] can you review please.

 SMB join on tez fails when one of the tables is empty
 -

 Key: HIVE-11356
 URL: https://issues.apache.org/jira/browse/HIVE-11356
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-11356.1.patch, HIVE-11356.3.patch


 {code}
 :java.lang.IllegalStateException: Unexpected event. All physical sources 
 already initialized 
 at com.google.common.base.Preconditions.checkState(Preconditions.java:145) 
 at 
 org.apache.tez.mapreduce.input.MultiMRInput.handleEvents(MultiMRInput.java:142)
  
 at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.handleEvent(LogicalIOProcessorRuntimeTask.java:610)
  
 at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.access$1100(LogicalIOProcessorRuntimeTask.java:90)
  
 at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$1.run(LogicalIOProcessorRuntimeTask.java:673)
  
 at java.lang.Thread.run(Thread.java:745) 
 ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex 
 vertex_1437168420060_17787_1_01 [Map 4] killed/failed due to:null] 
 Vertex killed, vertexName=Reducer 5, 
 vertexId=vertex_1437168420060_17787_1_02, diagnostics=[Vertex received Kill 
 while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, 
 Vertex vertex_1437168420060_17787_1_02 [Reducer 5] killed/failed due to:null] 
 DAG failed due to vertex failure. failedVertices:1 killedVertices:1 
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.tez.TezTask 
 HQL-FAILED 
 {code}





[jira] [Commented] (HIVE-11356) SMB join on tez fails when one of the tables is empty

2015-08-06 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661241#comment-14661241
 ] 

Vikram Dixit K commented on HIVE-11356:
---

[~jdere] review please.

 SMB join on tez fails when one of the tables is empty
 -

 Key: HIVE-11356
 URL: https://issues.apache.org/jira/browse/HIVE-11356
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-11356.1.patch, HIVE-11356.3.patch


 {code}
 :java.lang.IllegalStateException: Unexpected event. All physical sources 
 already initialized 
 at com.google.common.base.Preconditions.checkState(Preconditions.java:145) 
 at 
 org.apache.tez.mapreduce.input.MultiMRInput.handleEvents(MultiMRInput.java:142)
  
 at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.handleEvent(LogicalIOProcessorRuntimeTask.java:610)
  
 at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.access$1100(LogicalIOProcessorRuntimeTask.java:90)
  
 at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$1.run(LogicalIOProcessorRuntimeTask.java:673)
  
 at java.lang.Thread.run(Thread.java:745) 
 ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex 
 vertex_1437168420060_17787_1_01 [Map 4] killed/failed due to:null] 
 Vertex killed, vertexName=Reducer 5, 
 vertexId=vertex_1437168420060_17787_1_02, diagnostics=[Vertex received Kill 
 while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, 
 Vertex vertex_1437168420060_17787_1_02 [Reducer 5] killed/failed due to:null] 
 DAG failed due to vertex failure. failedVertices:1 killedVertices:1 
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.tez.TezTask 
 HQL-FAILED 
 {code}





[jira] [Commented] (HIVE-11451) SemanticAnalyzer throws IndexOutOfBounds Exception

2015-08-06 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661254#comment-14661254
 ] 

Chao Sun commented on HIVE-11451:
-

+1 on the latest patch.

 SemanticAnalyzer throws IndexOutOfBounds Exception
 --

 Key: HIVE-11451
 URL: https://issues.apache.org/jira/browse/HIVE-11451
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Aihua Xu
Priority: Critical
 Attachments: HIVE-11451.patch


 Following queries throw IndexOutOfBoundsException in SemanticAnalyzer
 {code:title=Queries|borderStyle=solid}
 CREATE TABLE staging(t tinyint,
si smallint,
i int,
b bigint,
f float,
d double,
bo boolean,
s string,
ts timestamp,
dec decimal(4,2),
bin binary)
 ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
 STORED AS TEXTFILE;
 LOAD DATA LOCAL INPATH '../../data/files/over1k' OVERWRITE INTO TABLE staging;
 CREATE TABLE orc_ppd(t tinyint,
si smallint,
i int,
b bigint,
f float,
d double,
bo boolean,
s string,
c char(50),
v varchar(50),
da date,
ts timestamp,
dec decimal(4,2),
bin binary)
 STORED AS ORC tblproperties("orc.row.index.stride" = "1000");
 insert overwrite table orc_ppd select si, i, b, f, d, bo, s, cast(s as 
 char(50)), cast(s as varchar(50)), cast(ts as date), ts, dec, bin from 
 staging;
 {code}
 {code:title=StackTrace|borderStyle=solid}
 java.lang.IndexOutOfBoundsException: Index: 13, Size: 13
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genConversionSelectOperator(SemanticAnalyzer.java:6754)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6543)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8989)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8880)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9730)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9623)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10115)
 at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:330)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10126)
 at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:209)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:240)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:310)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1139)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1068)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1058)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
 at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-8506) UT: add test flag in hive-site.xml for spark tests

2015-08-06 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li resolved HIVE-8506.
--
Resolution: Duplicate

Close this one as it's done in HIVE-10903.

 UT: add test flag in hive-site.xml for spark tests
 --

 Key: HIVE-8506
 URL: https://issues.apache.org/jira/browse/HIVE-8506
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Thomas Friedrich
Assignee: Thomas Friedrich
Priority: Minor

 All tests dbtxnmgr_* fail because the metastore tables are not correctly 
 initialized. We need to set the hive.in.test flag in hive-site.xml under 
 data/conf/spark:
 <property>
   <name>hive.in.test</name>
   <value>true</value>
   <description>Internal marker for test. Used for masking env-dependent 
 values</description>
 </property>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0

2015-08-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661275#comment-14661275
 ] 

Hive QA commented on HIVE-10975:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748971/HIVE-10975.2.patch

{color:green}SUCCESS:{color} +1 9326 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4853/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4853/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4853/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748971 - PreCommit-HIVE-TRUNK-Build

 Parquet: Bump the parquet version up to 1.8.0
 -

 Key: HIVE-10975
 URL: https://issues.apache.org/jira/browse/HIVE-10975
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Priority: Minor
 Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, 
 HIVE-10975.1.patch, HIVE-10975.2.patch, HIVE-10975.2.patch, 
 HIVE-10975.2.patch, HIVE-10975.patch


 There are lots of changes since parquet's graduation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11406) Vectorization: StringExpr::compare() == 0 is bad for performance

2015-08-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661336#comment-14661336
 ] 

Lefty Leverenz commented on HIVE-11406:
---

Thanks [~gopalv].

 Vectorization: StringExpr::compare() == 0 is bad for performance
 

 Key: HIVE-11406
 URL: https://issues.apache.org/jira/browse/HIVE-11406
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Matt McCline
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11406.01.patch


 {{StringExpr::compare() == 0}} is forced to evaluate the whole memory 
 comparison loop for differing lengths of strings, though there is no 
 possibility they will ever be equal.
 Add a {{StringExpr::equals}} which can be a smaller and tighter loop.
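For illustration, a minimal sketch of the difference (hypothetical helper names, not Hive's actual StringExpr code): an equals() can reject mismatched lengths before touching any bytes, while compare() == 0 has to run the byte loop before the caller can even test the sign.

```java
public class StringExprSketch {
    // Lexicographic compare over byte ranges, as a vectorized string
    // expression would do it: scans up to min(aLen, bLen) bytes even
    // when the lengths already rule out equality.
    static int compare(byte[] a, int aStart, int aLen,
                       byte[] b, int bStart, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int d = (a[aStart + i] & 0xff) - (b[bStart + i] & 0xff);
            if (d != 0) return d;
        }
        return aLen - bLen;
    }

    // Equality check: differing lengths can never be equal, so the
    // byte loop is skipped entirely in that case.
    static boolean equal(byte[] a, int aStart, int aLen,
                         byte[] b, int bStart, int bLen) {
        if (aLen != bLen) return false;
        for (int i = 0; i < aLen; i++) {
            if (a[aStart + i] != b[bStart + i]) return false;
        }
        return true;
    }
}
```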



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11406) Vectorization: StringExpr::compare() == 0 is bad for performance

2015-08-06 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661016#comment-14661016
 ] 

Gopal V commented on HIVE-11406:


I see Matt's fix on both branches now. 

Git makes it too easy to commit a fix, but not push it.

 Vectorization: StringExpr::compare() == 0 is bad for performance
 

 Key: HIVE-11406
 URL: https://issues.apache.org/jira/browse/HIVE-11406
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Matt McCline
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11406.01.patch


 {{StringExpr::compare() == 0}} is forced to evaluate the whole memory 
 comparison loop for differing lengths of strings, though there is no 
 possibility they will ever be equal.
 Add a {{StringExpr::equals}} which can be a smaller and tighter loop.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11493) Predicate with integer column equals double evaluates to false

2015-08-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661152#comment-14661152
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11493:
--

The change looks fine. +1 pending tests.

 Predicate with integer column equals double evaluates to false
 --

 Key: HIVE-11493
 URL: https://issues.apache.org/jira/browse/HIVE-11493
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Pengcheng Xiong
Priority: Blocker
 Attachments: HIVE-11493.01.patch


 Filters with an integer column equal to a double constant evaluate to false 
 every time. A negative double constant works fine.
 {code:title=explain select * from orc_ppd where t = 10.0;}
 OK
 Stage-0
Fetch Operator
   limit:-1
   Select Operator [SEL_2]
  
 outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
  Filter Operator [FIL_1]
 predicate:false (type: boolean)
 TableScan [TS_0]
alias:orc_ppd
 {code}
 {code:title=explain select * from orc_ppd where t = -10.0;}
 OK
 Stage-0
Fetch Operator
   limit:-1
   Select Operator [SEL_2]
  
 outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
  Filter Operator [FIL_1]
 predicate:(t = (- 10.0)) (type: boolean)
 TableScan [TS_0]
alias:orc_ppd
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11476) TypeInfoParser cannot handle column names with spaces in them

2015-08-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660960#comment-14660960
 ] 

Hive QA commented on HIVE-11476:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748943/HIVE-11476.1.patch

{color:green}SUCCESS:{color} +1 9326 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4850/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4850/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4850/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748943 - PreCommit-HIVE-TRUNK-Build

 TypeInfoParser cannot handle column names with spaces in them
 -

 Key: HIVE-11476
 URL: https://issues.apache.org/jira/browse/HIVE-11476
 Project: Hive
  Issue Type: Bug
  Components: Types
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
 Attachments: HIVE-11476.1.patch


 When using column names which contain escaped spaces in them like `user id`, 
 the type info parser is unable to parse out the structures which have a 
 format similar to 
 struct<user id:int,user group: int>
 {code}
 java.lang.IllegalArgumentException: Error: : expected at the position 12 of 
 '' but 'struct<user id:int,user group: int>' is found.
 at 
 org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.expect(TypeInfoUtils.java:360)
 {code}
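A minimal sketch of why a naive type-string tokenizer trips on such names (illustrative code, not the actual TypeInfoParser): a space terminates a token, so `user id` splits into two tokens and the parser then sees an unexpected token where it wants ':'.

```java
import java.util.ArrayList;
import java.util.List;

public class TypeTokenSketch {
    // Naive tokenizer: an alphanumeric run is one token, everything else
    // is a single-character token. A space therefore ends a token, which
    // is how a field name like "user id" splits into "user" and "id".
    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            if (Character.isLetterOrDigit(s.charAt(i))) {
                int j = i;
                while (j < s.length() && Character.isLetterOrDigit(s.charAt(j))) {
                    j++;
                }
                tokens.add(s.substring(i, j));
                i = j;
            } else {
                tokens.add(String.valueOf(s.charAt(i)));
                i++;
            }
        }
        return tokens;
    }
}
```

Tokenizing `struct<user id:int>` with this yields the stray `" "` token that makes the grammar's expected-':' check fail.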



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11493) Predicate with integer column equals double evaluates to false

2015-08-06 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661125#comment-14661125
 ] 

Pengcheng Xiong commented on HIVE-11493:


[~hsubramaniyan], thanks a lot for your comments. Actually we do not need to 
cast the value, it is already there. We just need to change 
{code}
if (triedDouble ||
{code}
to
{code}
if (triedDouble &&
{code}
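For context, the intended comparison semantics can be sketched as widening the integer side to double rather than folding the predicate to constant false (hypothetical helper, not the actual optimizer code):

```java
public class PredicateWidenSketch {
    // Widen the tinyint column value to double before comparing with the
    // double literal, so that t = 10.0 matches rows where t is 10 instead
    // of being folded to false.
    static boolean intEqualsDouble(byte t, double constant) {
        return (double) t == constant;
    }
}
```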

 Predicate with integer column equals double evaluates to false
 --

 Key: HIVE-11493
 URL: https://issues.apache.org/jira/browse/HIVE-11493
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Pengcheng Xiong
Priority: Blocker




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-11381) QTest combine2_hadoop20.q fails when using -Phadoop-1 profile due to HIVE-11139

2015-08-06 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang resolved HIVE-11381.

Resolution: Won't Fix

I looked into it again. It is not an issue upstream actually. This test is 
ignored. Thanks.

 QTest combine2_hadoop20.q fails when using -Phadoop-1 profile due to 
 HIVE-11139
 ---

 Key: HIVE-11381
 URL: https://issues.apache.org/jira/browse/HIVE-11381
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Sergio Peña
Assignee: Sergio Peña

 The q-test {{combine2_hadoop20.q}} is failing when running the -Phadoop-1 profile
 tests. The test output is different due to the changes added in HIVE-11139
 for more lineage information.
 Based on other HIVE-11139 tests, only this test's output needs to be 
 regenerated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11451) SemanticAnalyzer throws IndexOutOfBounds Exception

2015-08-06 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11451:

Attachment: (was: HIVE-11451.patch)

 SemanticAnalyzer throws IndexOutOfBounds Exception
 --

 Key: HIVE-11451
 URL: https://issues.apache.org/jira/browse/HIVE-11451
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Aihua Xu
Priority: Critical
 Attachments: HIVE-11451.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11451) SemanticAnalyzer throws IndexOutOfBounds Exception

2015-08-06 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11451:

Attachment: HIVE-11451.patch

 SemanticAnalyzer throws IndexOutOfBounds Exception
 --

 Key: HIVE-11451
 URL: https://issues.apache.org/jira/browse/HIVE-11451
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Aihua Xu
Priority: Critical
 Attachments: HIVE-11451.patch, HIVE-11451.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11441) No DDL allowed on table if user accidentally set table location wrong

2015-08-06 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-11441:
--
Attachment: HIVE-11441.2.patch

Fix a compilation error under Hadoop 1.

 No DDL allowed on table if user accidentally set table location wrong
 -

 Key: HIVE-11441
 URL: https://issues.apache.org/jira/browse/HIVE-11441
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11441.1.patch, HIVE-11441.2.patch


 If the user makes a mistake, Hive should either correct it in the first place, or 
 allow the user a chance to correct it. 
 STEPS TO REPRODUCE:
 create table testwrongloc(id int);
 alter table testwrongloc set location 
 'hdfs://a-valid-hostname/tmp/testwrongloc';
 -- at this time, hive should throw an error, as hdfs://a-valid-hostname is not a 
 valid path; it either needs to be hdfs://namenode-hostname:8020/ or 
 hdfs://hdfs-nameservice for HA
 alter table testwrongloc set location 
 'hdfs://correct-host:8020/tmp/testwrongloc';
 or 
 drop table testwrongloc;
 Upon this, Hive throws an error that host 'a-valid-hostname' is not reachable:
 {code}
 2015-07-30 12:19:43,573 DEBUG [main]: transport.TSaslTransport 
 (TSaslTransport.java:readFrame(429)) - CLIENT: reading data length: 293
 2015-07-30 12:19:43,720 ERROR [main]: ql.Driver 
 (SessionState.java:printError(833)) - FAILED: SemanticException Unable to 
 fetch table testloc. java.net.ConnectException: Call From 
 hdpsecb02.secb.hwxsup.com/172.25.16.178 to hdpsecb02.secb.hwxsup.com:8020 
 failed on connection exception: java.net.ConnectException: Connection 
 refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused
 org.apache.hadoop.hive.ql.parse.SemanticException: Unable to fetch table 
 testloc. java.net.ConnectException: Call From 
 hdpsecb02.secb.hwxsup.com/172.25.16.178 to hdpsecb02.secb.hwxsup.com:8020 
 failed on connection exception: java.net.ConnectException: Connection 
 refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1323)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1309)
 at 
 org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.addInputsOutputsAlterTable(DDLSemanticAnalyzer.java:1387)
 at 
 org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableLocation(DDLSemanticAnalyzer.java:1452)
 at 
 org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:295)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
 at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch 
 table testloc. java.net.ConnectException: Call From 
 hdpsecb02.secb.hwxsup.com/172.25.16.178 to hdpsecb02.secb.hwxsup.com:8020 
 failed on connection exception: java.net.ConnectException: Connection 
 refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused
 at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1072)
 at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1019)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1316)
 ... 23 more
 {code}
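A validation along the lines the report asks for could be sketched as follows (hypothetical check, not Hive code; the nameservice lookup stands in for whatever HA configuration a real check would consult): reject an hdfs location whose authority has neither a port nor a known nameservice, at SET LOCATION time rather than on the next DDL.

```java
import java.net.URI;
import java.util.Set;

public class LocationCheckSketch {
    // Returns true only when the location's authority is plausibly
    // resolvable: either host:port is given explicitly, or the bare
    // host matches a configured HA nameservice.
    static boolean looksValid(String location, Set<String> nameservices) {
        URI uri = URI.create(location);
        if (!"hdfs".equals(uri.getScheme())) {
            return false;
        }
        String host = uri.getHost();
        if (host == null) {
            return false;
        }
        return uri.getPort() != -1 || nameservices.contains(host);
    }
}
```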



--
This message was 

[jira] [Updated] (HIVE-11493) Predicate with integer column equals double evaluates to false

2015-08-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11493:
---
Attachment: HIVE-11493.01.patch

 Predicate with integer column equals double evaluates to false
 --

 Key: HIVE-11493
 URL: https://issues.apache.org/jira/browse/HIVE-11493
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Pengcheng Xiong
Priority: Blocker
 Attachments: HIVE-11493.01.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11340) Create ORC based table using like clause doesn't copy compression property

2015-08-06 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661151#comment-14661151
 ] 

Yongzhi Chen commented on HIVE-11340:
-

The 3 failures are not related.
The 2 Minimr failures both have the error:
[Error 30017]: Skipping stats aggregation by error 
org.apache.hadoop.hive.ql.metadata.HiveException: [Error 30015]: Stats 
aggregator of type counter cannot be connected to
It may be related to a network issue.
I tested them on my local machine, and they all passed.
The testSSLConnectionWithProperty failure has an age of 5.
[~csun], could you review the patch? Thanks




 Create ORC based table using like clause doesn't copy compression property
 --

 Key: HIVE-11340
 URL: https://issues.apache.org/jira/browse/HIVE-11340
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.14.0, 1.0.0, 1.2.0
Reporter: Gaurav Kohli
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-11340.1.patch


 I found an issue in the “create table like” clause, as it is not copying the table 
 properties from an ORC File format based table.
 Steps to reproduce:
 Step 1:
 create table orc_table (
 time string)
 stored as ORC tblproperties ("orc.compress"="SNAPPY");
 Step 2: 
 create table orc_table_using_like like orc_table;
 Step 3:
 show create table orc_table_using_like;  
 Result:
 createtab_stmt
 CREATE TABLE `orc_table_using_like`(
   `time` string)
 ROW FORMAT SERDE 
   'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
 STORED AS INPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
 OUTPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
 LOCATION
   'hdfs://nameservice1/user/hive/warehouse/gkohli.db/orc_table_using_like'
 TBLPROPERTIES (
   'transient_lastDdlTime'='1437578939')
 Issue:  'orc.compress'='SNAPPY' property is missing
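The fix the report implies can be sketched as copying the source table's parameters, minus transient bookkeeping ones, when building the LIKE table (hypothetical helper, not the actual Hive DDL code):

```java
import java.util.HashMap;
import java.util.Map;

public class LikeTablePropsSketch {
    // Carry over source table parameters for CREATE TABLE ... LIKE
    // instead of starting from an empty map, dropping properties that
    // must be regenerated for the new table.
    static Map<String, String> propsForLikeTable(Map<String, String> srcProps) {
        Map<String, String> dst = new HashMap<>(srcProps);
        dst.remove("transient_lastDdlTime"); // regenerated on create
        return dst;
    }
}
```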



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11493) Predicate with integer column equals double evaluates to false

2015-08-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661158#comment-14661158
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11493:
--

nit: you could change the comment below to reflect what you are 
actually doing.

{code}
  // however, if we already tried this, or the column is NUMBER 
type and
  // the operator is EQUAL, return false due to the type mismatch
{code}


 Predicate with integer column equals double evaluates to false
 --

 Key: HIVE-11493
 URL: https://issues.apache.org/jira/browse/HIVE-11493
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Pengcheng Xiong
Priority: Blocker
 Attachments: HIVE-11493.01.patch





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11436) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char

2015-08-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660963#comment-14660963
 ] 

Hive QA commented on HIVE-11436:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748959/HIVE-11436.03.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4851/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4851/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4851/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4851/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 3fe7e44 HIVE-11432 : Hive macro give same result for different 
arguments (Pengcheng Xiong, reviewed by Hari Subramaniyan)
+ git clean -f -d
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 3fe7e44 HIVE-11432 : Hive macro give same result for different 
arguments (Pengcheng Xiong, reviewed by Hari Subramaniyan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748959 - PreCommit-HIVE-TRUNK-Build

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
 empty char
 --

 Key: HIVE-11436
 URL: https://issues.apache.org/jira/browse/HIVE-11436
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11436.01.patch, HIVE-11436.02.patch, 
 HIVE-11436.03.patch


 BaseCharUtils checks whether the length of a char is between [1,255]. This 
 causes the return path to throw an error when the length of a char is 0. 
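For illustration, a hedged sketch of the kind of range check described; the names are hypothetical and this is not the actual BaseCharUtils code:

```java
// Hypothetical sketch of a char-length range check like the one described
// above: lengths must fall in [1, 255], so a char(0) produced on the
// return path is rejected with an exception.
public class CharLengthCheckSketch {
    public static final int MAX_CHAR_LENGTH = 255;

    public static void validateCharParameter(int length) {
        if (length < 1 || length > MAX_CHAR_LENGTH) {
            throw new IllegalArgumentException(
                "Invalid char length " + length + ", supported range is [1, 255]");
        }
    }

    public static void main(String[] args) {
        validateCharParameter(255);       // within range: no exception
        try {
            validateCharParameter(0);     // what the return path produces
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```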





[jira] [Updated] (HIVE-11496) Better tests for evaluating ORC predicate pushdown

2015-08-06 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-11496:
-
Attachment: HIVE-11496.1.patch

 Better tests for evaluating ORC predicate pushdown
 --

 Key: HIVE-11496
 URL: https://issues.apache.org/jira/browse/HIVE-11496
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.3.0, 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-11496.1.patch


 There were many regressions recently wrt ORC predicate pushdown, and we don't 
 have system tests to capture them. Currently there are only junit tests for 
 the ORC predicate pushdown feature. Since hive counters are not available 
 during qfile test execution, there is no easy way to verify whether ORC PPD 
 worked. This jira is to add a post-execution hook that prints hive counters 
 (esp. number of input records) to the error stream so that they appear in 
 qfile test output. This way we can verify ORC SARG evaluation and avoid 
 future regressions.





[jira] [Updated] (HIVE-11436) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char

2015-08-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11436:
---
Attachment: HIVE-11436.04.patch

rebase and resubmit the patch

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
 empty char
 --

 Key: HIVE-11436
 URL: https://issues.apache.org/jira/browse/HIVE-11436
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11436.01.patch, HIVE-11436.02.patch, 
 HIVE-11436.03.patch, HIVE-11436.04.patch


 BaseCharUtils checks whether the length of a char is between [1,255]. This 
 causes the return path to throw an error when the length of a char is 0. 





[jira] [Commented] (HIVE-11436) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char

2015-08-06 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660982#comment-14660982
 ] 

Pengcheng Xiong commented on HIVE-11436:


[~jcamachorodriguez], could you take a look? Thanks.

 CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with 
 empty char
 --

 Key: HIVE-11436
 URL: https://issues.apache.org/jira/browse/HIVE-11436
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11436.01.patch, HIVE-11436.02.patch, 
 HIVE-11436.03.patch, HIVE-11436.04.patch


 BaseCharUtils checks whether the length of a char is between [1,255]. This 
 causes the return path to throw an error when the length of a char is 0. 





[jira] [Updated] (HIVE-11406) Vectorization: StringExpr::compare() == 0 is bad for performance

2015-08-06 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-11406:
---
Fix Version/s: 2.0.0
   1.3.0

 Vectorization: StringExpr::compare() == 0 is bad for performance
 

 Key: HIVE-11406
 URL: https://issues.apache.org/jira/browse/HIVE-11406
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Matt McCline
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11406.01.patch


 {{StringExpr::compare() == 0}} is forced to evaluate the whole memory 
 comparison loop for differing lengths of strings, though there is no 
 possibility they will ever be equal.
 Add a {{StringExpr::equals}} which can be a smaller and tighter loop.
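A hedged sketch of the idea (method signatures are illustrative, not the committed ones): equality can bail out immediately when the lengths differ, whereas a three-way compare has to scan the common prefix before the length difference can break the tie.

```java
// Illustrative sketch, not Hive's committed StringExpr code.
public class StringExprSketch {
    // Tight equality loop: differing lengths can never be equal,
    // so no bytes are scanned at all in that case.
    public static boolean equal(byte[] a, int aStart, int aLen,
                                byte[] b, int bStart, int bLen) {
        if (aLen != bLen) {
            return false;
        }
        for (int i = 0; i < aLen; i++) {
            if (a[aStart + i] != b[bStart + i]) {
                return false;
            }
        }
        return true;
    }

    // Three-way compare: must walk the common prefix even when the lengths
    // already rule out equality, which is why `compare(...) == 0` does more
    // work than a dedicated equals.
    public static int compare(byte[] a, int aStart, int aLen,
                              byte[] b, int bStart, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int d = (a[aStart + i] & 0xff) - (b[bStart + i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return aLen - bLen;
    }
}
```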





[jira] [Commented] (HIVE-11496) Better tests for evaluating ORC predicate pushdown

2015-08-06 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660973#comment-14660973
 ] 

Prasanth Jayachandran commented on HIVE-11496:
--

This patch adds only limited basic tests. I will add more tests after 
HIVE-11312, HIVE-11477, HIVE-11493 and HIVE-11494 are fixed.

 Better tests for evaluating ORC predicate pushdown
 --

 Key: HIVE-11496
 URL: https://issues.apache.org/jira/browse/HIVE-11496
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.3.0, 2.0.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-11496.1.patch


 There were many regressions recently wrt ORC predicate pushdown, and we don't 
 have system tests to capture them. Currently there are only junit tests for 
 the ORC predicate pushdown feature. Since hive counters are not available 
 during qfile test execution, there is no easy way to verify whether ORC PPD 
 worked. This jira is to add a post-execution hook that prints hive counters 
 (esp. number of input records) to the error stream so that they appear in 
 qfile test output. This way we can verify ORC SARG evaluation and avoid 
 future regressions.





[jira] [Updated] (HIVE-11442) Remove commons-configuration.jar from Hive distribution

2015-08-06 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-11442:
--
Attachment: HIVE-11442.2.patch

Rerun precommit test.

 Remove commons-configuration.jar from Hive distribution
 ---

 Key: HIVE-11442
 URL: https://issues.apache.org/jira/browse/HIVE-11442
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11442.1.patch, HIVE-11442.2.patch


 Some customers report version conflicts with the Hive-bundled 
 commons-configuration.jar. Actually commons-configuration.jar is not needed 
 by Hive; it is a transitive dependency of Hadoop/Accumulo. Users should be 
 able to pick up those jars from Hadoop at runtime. 





[jira] [Updated] (HIVE-11385) LLAP: clean up ORC dependencies - move encoded reader path into a cloned ReaderImpl

2015-08-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-11385:

Attachment: HIVE-11385.01.patch

Moved the classes to the storage-api module that we recently integrated from master.

 LLAP: clean up ORC dependencies - move encoded reader path into a cloned 
 ReaderImpl
 ---

 Key: HIVE-11385
 URL: https://issues.apache.org/jira/browse/HIVE-11385
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-11385.01.patch, HIVE-11385.patch


 Before there's a storage handler module, we can clean some things up
 NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-3725) Add support for pulling HBase columns with prefixes

2015-08-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659550#comment-14659550
 ] 

Lefty Leverenz commented on HIVE-3725:
--

Doc note: This still needs documentation. See the "Hive column mapping to 
hbase" thread on the d...@hive.apache.org mailing list.

* [Re: Hive column mapping to hbase | 
http://mail-archives.apache.org/mod_mbox/hive-dev/201508.mbox/%3ccahnpetsyyitxxw5iptufbmjvqecnawr04jx5+3a+vtf0mqp...@mail.gmail.com%3e]

 Add support for pulling HBase columns with prefixes
 ---

 Key: HIVE-3725
 URL: https://issues.apache.org/jira/browse/HIVE-3725
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Affects Versions: 0.9.0
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
  Labels: TODOC12
 Fix For: 0.12.0

 Attachments: HIVE-3725.1.patch.txt, HIVE-3725.2.patch.txt, 
 HIVE-3725.3.patch.txt, HIVE-3725.4.patch.txt, HIVE-3725.patch.3.txt


 Current HBase Hive integration supports reading many values from the same row 
 by specifying a column family. And specifying just the column family can pull 
 in all qualifiers within the family.
 We should add support for specifying a prefix for the qualifier, so that all 
 columns starting with the prefix are automatically pulled in. Wildcard 
 support would be ideal.





[jira] [Commented] (HIVE-4570) More information to user on GetOperationStatus in Hive Server2 when query is still executing

2015-08-06 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659856#comment-14659856
 ] 

Amareshwari Sriramadasu commented on HIVE-4570:
---

Here is the proposed modified thrift structure:

{noformat}
struct TGetOperationStatusResp {
  1: required TStatus status
  2: optional TOperationState operationState
  // List of statuses of sub tasks
  3: optional string taskStatus

  // If operationState is ERROR_STATE, then the following fields may be set
  // sqlState as defined in the ISO/IEF CLI specification
  4: optional string sqlState

  // Internal error code
  5: optional i32 errorCode

  // Error message
  6: optional string errorMessage
  // When was the operation started
  7: optional i64 operationStarted
  // When was the operation completed
  8: optional i64 operationCompleted
}
{noformat}

Here are a few commits in our organization's forked repo, which can be picked 
up as a patch here:
https://github.com/inmobi/hive/commit/8eb3fd4a799157b1634876490c19061e257e83fd
https://github.com/InMobi/hive/commit/99475a9ed0dc840dd5445dcf100cd7abf322afc1
https://github.com/InMobi/hive/commit/85bf27311baaa4f83d928a39b44a1a182671b66f


 More information to user on GetOperationStatus in Hive Server2 when query is 
 still executing
 

 Key: HIVE-4570
 URL: https://issues.apache.org/jira/browse/HIVE-4570
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Amareshwari Sriramadasu

 Currently in Hive Server2, when the query is still executing only the status 
 is set as STILL_EXECUTING. 
 This issue is to give more information to the user such as progress and 
 running job handles, if possible.





[jira] [Updated] (HIVE-11461) Transform flat AND/OR into IN struct clause

2015-08-06 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-11461:
---
Attachment: HIVE-11461.1.patch

 Transform flat AND/OR into IN struct clause
 ---

 Key: HIVE-11461
 URL: https://issues.apache.org/jira/browse/HIVE-11461
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11461.1.patch, HIVE-11461.patch








[jira] [Updated] (HIVE-11340) Create ORC based table using like clause doesn't copy compression property

2015-08-06 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11340:

Affects Version/s: 0.14.0
   1.0.0

 Create ORC based table using like clause doesn't copy compression property
 --

 Key: HIVE-11340
 URL: https://issues.apache.org/jira/browse/HIVE-11340
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.14.0, 1.0.0
Reporter: Gaurav Kohli
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-11340.1.patch


 I found an issue in the “create table like” clause, as it does not copy the 
 table properties from an ORC file format based table.
 Steps to reproduce:
 Step 1:
 create table orc_table (
 time string)
 stored as ORC tblproperties (orc.compress=SNAPPY);
 Step 2: 
 create table orc_table_using_like like orc_table;
 Step 3:
 show create table orc_table_using_like;  
 Result:
 createtab_stmt
 CREATE TABLE `orc_table_using_like`(
   `time` string)
 ROW FORMAT SERDE 
   'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
 STORED AS INPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
 OUTPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
 LOCATION
   'hdfs://nameservice1/user/hive/warehouse/gkohli.db/orc_table_using_like'
 TBLPROPERTIES (
   'transient_lastDdlTime'='1437578939')
 Issue:  'orc.compress'='SNAPPY' property is missing





[jira] [Commented] (HIVE-11250) Change in spark.executor.instances (and others) doesn't take effect after RSC is launched for HS2 [Spark Branch]

2015-08-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659920#comment-14659920
 ] 

Hive QA commented on HIVE-11250:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748916/HIVE-11250.1.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9324 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4844/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4844/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4844/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748916 - PreCommit-HIVE-TRUNK-Build

 Change in spark.executor.instances (and others) doesn't take effect after RSC 
 is launched for HS2 [Spark Branch]
 

 Key: HIVE-11250
 URL: https://issues.apache.org/jira/browse/HIVE-11250
 Project: Hive
  Issue Type: Bug
  Components: Spark
Affects Versions: 1.1.0
Reporter: Xuefu Zhang
Assignee: Jimmy Xiang
 Fix For: 1.3.0, 2.0.0

 Attachments: HIVE-11250.1.patch, HIVE-11250.1.patch, 
 HIVE-11250.1.patch


 Hive CLI works as expected.





[jira] [Commented] (HIVE-11461) Transform flat AND/OR into IN struct clause

2015-08-06 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659975#comment-14659975
 ] 

Jesus Camacho Rodriguez commented on HIVE-11461:


[~gopalv], new patch should make the transformation quicker.

 Transform flat AND/OR into IN struct clause
 ---

 Key: HIVE-11461
 URL: https://issues.apache.org/jira/browse/HIVE-11461
 Project: Hive
  Issue Type: Bug
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-11461.1.patch, HIVE-11461.patch








[jira] [Updated] (HIVE-11340) Create ORC based table using like clause doesn't copy compression property

2015-08-06 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-11340:

Affects Version/s: 1.2.0

 Create ORC based table using like clause doesn't copy compression property
 --

 Key: HIVE-11340
 URL: https://issues.apache.org/jira/browse/HIVE-11340
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.14.0, 1.0.0, 1.2.0
Reporter: Gaurav Kohli
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-11340.1.patch


 I found an issue in the “create table like” clause, as it does not copy the 
 table properties from an ORC file format based table.
 Steps to reproduce:
 Step 1:
 create table orc_table (
 time string)
 stored as ORC tblproperties (orc.compress=SNAPPY);
 Step 2: 
 create table orc_table_using_like like orc_table;
 Step 3:
 show create table orc_table_using_like;  
 Result:
 createtab_stmt
 CREATE TABLE `orc_table_using_like`(
   `time` string)
 ROW FORMAT SERDE 
   'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
 STORED AS INPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
 OUTPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
 LOCATION
   'hdfs://nameservice1/user/hive/warehouse/gkohli.db/orc_table_using_like'
 TBLPROPERTIES (
   'transient_lastDdlTime'='1437578939')
 Issue:  'orc.compress'='SNAPPY' property is missing





[jira] [Commented] (HIVE-11340) Create ORC based table using like clause doesn't copy compression property

2015-08-06 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659866#comment-14659866
 ] 

Yongzhi Chen commented on HIVE-11340:
-

Not sure why the build has not run yet; is the priority too low? 
Adding more comments to make the change easier to understand:
The fix follows the same pattern as the patch for HIVE-8450.

 Create ORC based table using like clause doesn't copy compression property
 --

 Key: HIVE-11340
 URL: https://issues.apache.org/jira/browse/HIVE-11340
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.14.0, 1.0.0, 1.2.0
Reporter: Gaurav Kohli
Assignee: Yongzhi Chen
Priority: Minor
 Attachments: HIVE-11340.1.patch


 I found an issue in the “create table like” clause, as it does not copy the 
 table properties from an ORC file format based table.
 Steps to reproduce:
 Step 1:
 create table orc_table (
 time string)
 stored as ORC tblproperties (orc.compress=SNAPPY);
 Step 2: 
 create table orc_table_using_like like orc_table;
 Step 3:
 show create table orc_table_using_like;  
 Result:
 createtab_stmt
 CREATE TABLE `orc_table_using_like`(
   `time` string)
 ROW FORMAT SERDE 
   'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
 STORED AS INPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
 OUTPUTFORMAT 
   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
 LOCATION
   'hdfs://nameservice1/user/hive/warehouse/gkohli.db/orc_table_using_like'
 TBLPROPERTIES (
   'transient_lastDdlTime'='1437578939')
 Issue:  'orc.compress'='SNAPPY' property is missing





[jira] [Commented] (HIVE-11484) Fix ObjectInspector for Char and VarChar

2015-08-06 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659891#comment-14659891
 ] 

Amareshwari Sriramadasu commented on HIVE-11484:


Here is another commit - 
https://github.com/InMobi/hive/commit/d7b1916da379b5a310639d479604786b05499cb2

 Fix ObjectInspector for Char and VarChar
 

 Key: HIVE-11484
 URL: https://issues.apache.org/jira/browse/HIVE-11484
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Amareshwari Sriramadasu

 The creation of HiveChar and VarChar is not happening through the ObjectInspector.
 Here is the fix we pushed internally: 
 https://github.com/InMobi/hive/commit/fe95c7850e7130448209141155f28b25d3504216





[jira] [Resolved] (HIVE-11491) Lazily call ASTNode::toStringTree() after tree modification

2015-08-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan resolved HIVE-11491.
--
Resolution: Duplicate

Accidentally created this one; it is the same as HIVE-11490.

 Lazily call ASTNode::toStringTree() after tree modification
 ---

 Key: HIVE-11491
 URL: https://issues.apache.org/jira/browse/HIVE-11491
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan

 Currently, we call toStringTree() as part of HIVE-11316 every time the tree 
 is modified. This is a bad approach, as we can lazily delay the recomputation 
 to the point when toStringTree() is called again.
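The lazy scheme can be sketched as follows with a toy node class (not Hive's ASTNode): mutations only set a dirty flag, and the string form is rebuilt on the next toStringTree() call. A real implementation would also have to propagate the dirty bit to ancestors when a child subtree changes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of lazy memoization, not Hive's ASTNode.
public class LazyNode {
    private final String token;
    private final List<LazyNode> children = new ArrayList<>();
    private String cachedStringTree;  // memoized toStringTree() result
    private boolean dirty = true;     // set on mutation, cleared on rebuild

    public LazyNode(String token) {
        this.token = token;
    }

    public void addChild(LazyNode child) {
        children.add(child);
        dirty = true;  // cheap: no string is rebuilt here
    }

    public String toStringTree() {
        if (dirty || cachedStringTree == null) {
            StringBuilder sb = new StringBuilder();
            build(sb);
            cachedStringTree = sb.toString();
            dirty = false;
        }
        return cachedStringTree;      // subsequent calls reuse the cache
    }

    private void build(StringBuilder sb) {
        if (children.isEmpty()) {
            sb.append(token);
            return;
        }
        sb.append('(').append(token);
        for (LazyNode c : children) {
            sb.append(' ');
            c.build(sb);
        }
        sb.append(')');
    }
}
```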





[jira] [Commented] (HIVE-11448) Support vectorization of Multi-OR and Multi-AND

2015-08-06 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660569#comment-14660569
 ] 

Gunther Hagleitner commented on HIVE-11448:
---

[~mmccline] looks like there are a couple of plan changes now. Are those real 
issues, or do the golden files just need an update? (cc [~gopalv])

 Support vectorization of Multi-OR and Multi-AND
 ---

 Key: HIVE-11448
 URL: https://issues.apache.org/jira/browse/HIVE-11448
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-11448.01.patch, HIVE-11448.02.patch, 
 HIVE-11448.03.patch


 Support more than 2 children for OR and AND when all children are expressions.





[jira] [Updated] (HIVE-11480) CBO: Calcite Operator To Hive Operator (Calcite Return Path): char/varchar as input to GenericUDAF

2015-08-06 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11480:
---
Attachment: HIVE-11480.01.patch

 CBO: Calcite Operator To Hive Operator (Calcite Return Path): char/varchar as 
 input to GenericUDAF 
 ---

 Key: HIVE-11480
 URL: https://issues.apache.org/jira/browse/HIVE-11480
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11480.01.patch


 Some of the UDAFs cannot deal with char/varchar correctly when the return 
 path is on, for example udaf_number_format.q.





[jira] [Commented] (HIVE-11480) CBO: Calcite Operator To Hive Operator (Calcite Return Path): char/varchar as input to GenericUDAF

2015-08-06 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660333#comment-14660333
 ] 

Pengcheng Xiong commented on HIVE-11480:


[~jcamachorodriguez], could you review the patch? Thanks.

 CBO: Calcite Operator To Hive Operator (Calcite Return Path): char/varchar as 
 input to GenericUDAF 
 ---

 Key: HIVE-11480
 URL: https://issues.apache.org/jira/browse/HIVE-11480
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11480.01.patch


 Some of the UDAFs cannot deal with char/varchar correctly when the return 
 path is on, for example udaf_number_format.q.





[jira] [Commented] (HIVE-11087) DbTxnManager exceptions should include txnid

2015-08-06 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660474#comment-14660474
 ] 

Alan Gates commented on HIVE-11087:
---

+1

 DbTxnManager exceptions should include txnid
 

 Key: HIVE-11087
 URL: https://issues.apache.org/jira/browse/HIVE-11087
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-11087.3.patch


 Must include the txnid in the exception so that the user-visible error can 
 be correlated with log file info.





[jira] [Commented] (HIVE-11428) Performance: Struct IN() clauses are extremely slow (~10x slower)

2015-08-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660473#comment-14660473
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-11428:
--

The above test failure looks unrelated to the changes.

Thanks
Hari

 Performance: Struct IN() clauses are extremely slow (~10x slower) 
 --

 Key: HIVE-11428
 URL: https://issues.apache.org/jira/browse/HIVE-11428
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.3.0, 2.0.0
Reporter: Gopal V
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-11428.1.patch, HIVE-11428.2.patch


 Hive does not support tuple IN() clauses today, but provides a way to 
 rewrite (a,b) IN (...) using complex types.
 select * from table where STRUCT(a,b) IN (STRUCT(1,2), STRUCT(2,3) ...);
 This would be fine, except it is massively slower due to ObjectConvertors and 
 Struct constructor not being constant folded.





[jira] [Assigned] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2015-08-06 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-4897:
--

Assignee: Aihua Xu

 Hive should handle AlreadyExists on retries when creating tables/partitions
 ---

 Key: HIVE-4897
 URL: https://issues.apache.org/jira/browse/HIVE-4897
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Aihua Xu
 Attachments: hive-snippet.log


 Creating new tables/partitions may fail with an AlreadyExistsException if 
 there is an error part way through the creation and the HMS tries again 
 without properly cleaning up or checking if this is a retry.
 While partitioning a new table via a script on distributed hive (MetaStore on 
 the same machine) there was a long timeout and then:
 {code}
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. 
 AlreadyExistsException(message:Partition already exists:Partition( ...
 {code}
 I am assuming this is due to retry. Perhaps already-exists on retry could be 
 handled better.
 A similar error occurred while creating a table through Impala, which issued 
 a single createTable call that failed with an AlreadyExistsException. See the 
 logs related to table tmp_proc_8_d2b7b0f133be455ca95615818b8a5879_7 in the 
 attached hive-snippet.log
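One way retries could be made idempotent is sketched below; the toy in-memory metastore here is invented for illustration and is not the HMS API. On AlreadyExistsException, check whether the existing object matches what this (possibly retried) request was creating, and treat a match as success.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of retry-safe creation; not the HiveMetaStore API.
public class RetryCreateSketch {
    public static class AlreadyExistsException extends RuntimeException {
        public AlreadyExistsException(String msg) { super(msg); }
    }

    private final Map<String, String> partitions = new HashMap<>();

    public void create(String name, String spec) {
        if (partitions.containsKey(name)) {
            throw new AlreadyExistsException("Partition already exists: " + name);
        }
        partitions.put(name, spec);
    }

    // Returns true when created fresh, false when an earlier (half-landed)
    // attempt already created an identical object; rethrows on a genuine
    // conflict with a different object.
    public boolean createIdempotent(String name, String spec) {
        try {
            create(name, spec);
            return true;
        } catch (AlreadyExistsException e) {
            if (spec.equals(partitions.get(name))) {
                return false;
            }
            throw e;
        }
    }
}
```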





[jira] [Updated] (HIVE-11341) Avoid expensive resizing of ASTNode tree

2015-08-06 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11341:
-
Attachment: HIVE-11341.6.patch

 Avoid expensive resizing of ASTNode tree 
 -

 Key: HIVE-11341
 URL: https://issues.apache.org/jira/browse/HIVE-11341
 Project: Hive
  Issue Type: Bug
  Components: Hive, Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-11341.1.patch, HIVE-11341.2.patch, 
 HIVE-11341.3.patch, HIVE-11341.4.patch, HIVE-11341.5.patch, HIVE-11341.6.patch


 {code}
 Stack Trace                                                                                Sample Count  Percentage (%)
 parse.BaseSemanticAnalyzer.analyze(ASTNode, Context)                                              1,605  90
   parse.CalcitePlanner.analyzeInternal(ASTNode)                                                   1,605  90
     parse.SemanticAnalyzer.analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContext)              1,605  90
       parse.CalcitePlanner.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext)                    1,604  90
         parse.SemanticAnalyzer.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext)                1,604  90
           parse.SemanticAnalyzer.genPlan(QB)                                                      1,604  90
             parse.SemanticAnalyzer.genPlan(QB, boolean)                                           1,604  90
               parse.SemanticAnalyzer.genBodyPlan(QB, Operator, Map)                               1,604  90
                 parse.SemanticAnalyzer.genFilterPlan(ASTNode, QB, Operator, Map, boolean)         1,603  90
                   parse.SemanticAnalyzer.genFilterPlan(QB, ASTNode, Operator, boolean)            1,603  90
                     parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, boolean)         1,603  90
                       parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx)  1,603  90
                         parse.SemanticAnalyzer.genAllExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx)  1,603  90
                           parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx)           1,603  90
                             parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx, TypeCheckProcFactory)  1,603  90
                               lib.DefaultGraphWalker.startWalking(Collection, HashMap)            1,579  89
                                 lib.DefaultGraphWalker.walk(Node)                                 1,571  89
                                   java.util.ArrayList.removeAll(Collection)                       1,433  81
                                     java.util.ArrayList.batchRemove(Collection, boolean)          1,433  81
                                       java.util.ArrayList.contains(Object)                        1,228  69
                                         java.util.ArrayList.indexOf(Object)                       1,228  69
 {code}
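 The hot frames in the profile are {{ArrayList.removeAll}} calling {{ArrayList.contains}}
 (a linear scan) once per element, which makes the walker's bookkeeping quadratic in the
 number of pending nodes. A minimal stdlib illustration of the cost pattern and the usual
 HashSet-backed alternative (this is an illustration, not Hive's actual DefaultGraphWalker
 code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RemoveAllDemo {
    // Quadratic: ArrayList.removeAll performs a linear contains() scan
    // for every element of the receiver, i.e. O(n * m) overall.
    static List<Integer> slowRemove(List<Integer> from, List<Integer> toRemove) {
        List<Integer> copy = new ArrayList<>(from);
        copy.removeAll(toRemove);
        return copy;
    }

    // Near-linear: back the membership test with a HashSet so each
    // contains() is O(1), giving O(n + m) overall.
    static List<Integer> fastRemove(List<Integer> from, List<Integer> toRemove) {
        Set<Integer> lookup = new HashSet<>(toRemove);
        List<Integer> copy = new ArrayList<>(from);
        copy.removeIf(lookup::contains);
        return copy;
    }
}
```

 Both methods return the same result; only the asymptotic cost differs, which is exactly
 the kind of difference the sample counts above expose.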





[jira] [Commented] (HIVE-11341) Avoid expensive resizing of ASTNode tree

2015-08-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660386#comment-14660386
 ] 

Hive QA commented on HIVE-11341:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748935/HIVE-11341.5.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4846/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4846/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4846/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: 
java.io.IOException: Could not create 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4846/succeeded/TestContribCliDriver
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748935 - PreCommit-HIVE-TRUNK-Build

 Avoid expensive resizing of ASTNode tree 
 -

 Key: HIVE-11341
 URL: https://issues.apache.org/jira/browse/HIVE-11341
 Project: Hive
  Issue Type: Bug
  Components: Hive, Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-11341.1.patch, HIVE-11341.2.patch, 
 HIVE-11341.3.patch, HIVE-11341.4.patch, HIVE-11341.5.patch


 {code}
 Stack Trace                                                                                Sample Count  Percentage (%)
 parse.BaseSemanticAnalyzer.analyze(ASTNode, Context)                                              1,605  90
   parse.CalcitePlanner.analyzeInternal(ASTNode)                                                   1,605  90
     parse.SemanticAnalyzer.analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContext)              1,605  90
       parse.CalcitePlanner.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext)                    1,604  90
         parse.SemanticAnalyzer.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext)                1,604  90
           parse.SemanticAnalyzer.genPlan(QB)                                                      1,604  90
             parse.SemanticAnalyzer.genPlan(QB, boolean)                                           1,604  90
               parse.SemanticAnalyzer.genBodyPlan(QB, Operator, Map)                               1,604  90
                 parse.SemanticAnalyzer.genFilterPlan(ASTNode, QB, Operator, Map, boolean)         1,603  90
                   parse.SemanticAnalyzer.genFilterPlan(QB, ASTNode, Operator, boolean)            1,603  90
                     parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, boolean)         1,603  90
                       parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx)  1,603  90
                         parse.SemanticAnalyzer.genAllExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx)  1,603  90
                           parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx)           1,603  90
                             parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx, TypeCheckProcFactory)  1,603  90
                               lib.DefaultGraphWalker.startWalking(Collection, HashMap)            1,579  89
                                 lib.DefaultGraphWalker.walk(Node)                                 1,571  89
                                   java.util.ArrayList.removeAll(Collection)                       1,433  81
                                     java.util.ArrayList.batchRemove(Collection, boolean)          1,433  81
                                       java.util.ArrayList.contains(Object)                        1,228  69
                                         java.util.ArrayList.indexOf(Object)                       1,228  69
 {code}





[jira] [Updated] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions

2015-08-06 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-4897:
---
Attachment: HIVE-4897.patch

 Hive should handle AlreadyExists on retries when creating tables/partitions
 ---

 Key: HIVE-4897
 URL: https://issues.apache.org/jira/browse/HIVE-4897
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Aihua Xu
 Attachments: HIVE-4897.patch, hive-snippet.log


 Creating new tables/partitions may fail with an AlreadyExistsException if 
 there is an error partway through the creation and the HMS tries again 
 without properly cleaning up or checking whether this is a retry.
 While partitioning a new table via a script on distributed Hive (MetaStore on 
 the same machine), there was a long timeout and then:
 {code}
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask. 
 AlreadyExistsException(message:Partition already exists:Partition( ...
 {code}
 I am assuming this is due to a retry. Perhaps already-exists on retry could be 
 handled better.
 A similar error occurred while creating a table through Impala, which issued 
 a single createTable call that failed with an AlreadyExistsException. See the 
 logs related to table tmp_proc_8_d2b7b0f133be455ca95615818b8a5879_7 in the 
 attached hive-snippet.log.





[jira] [Assigned] (HIVE-11359) Fix alter related Unit tests for HBase metastore

2015-08-06 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-11359:
---

Assignee: Vaibhav Gumashta

 Fix alter related Unit tests for HBase metastore
 

 Key: HIVE-11359
 URL: https://issues.apache.org/jira/browse/HIVE-11359
 Project: Hive
  Issue Type: Sub-task
  Components: HBase Metastore, Metastore
Affects Versions: hbase-metastore-branch
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta

 alter_partition_change_col and alter1.q fail (of the 45 sampled q files; 
 there could be other failures we haven't identified yet).





[jira] [Commented] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.

2015-08-06 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660089#comment-14660089
 ] 

Xuefu Zhang commented on HIVE-11466:


It seems that using Thrift 0.9.2 for the code generation caused the problem. I 
ran the test before and after the HIVE-9152 commit on the Spark branch and can 
see the difference in behavior. Basically, when you run the test above, you will 
see the following in hive.log after the commit but not before it.
{code}
2015-08-06 07:17:08,960 WARN  [Thread-17]: server.TThreadPoolServer 
(TThreadPoolServer.java:serve(206)) - Transport error occurred during 
acceptance of message.
org.apache.thrift.transport.TTransportException: No underlying server socket.
at 
org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:126)
at 
org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:35)
at 
org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:60)
at 
org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:161)
at 
org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:100)
at java.lang.Thread.run(Thread.java:722)
{code}
I don't see anything obviously wrong, though. [~csun], could you take a look? We 
can either find a fix (if that's easy) or simply regenerate the Thrift code using 
0.9.0.

 HIVE-10166 generates more data on hive.log causing Jenkins to fill all the 
 disk.
 

 Key: HIVE-11466
 URL: https://issues.apache.org/jira/browse/HIVE-11466
 Project: Hive
  Issue Type: Bug
Reporter: Sergio Peña
Assignee: Xuefu Zhang

 An issue with the HIVE-10166 patch is that it increases the size of hive.log, 
 causing Jenkins to fail because it runs out of disk space.
 Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with 
 the patch, and after other commits.
 {noformat}
 BEFORE HIVE-10166
 13M Aug  5 11:57 ./hive-unit/target/tmp/log/hive.log
 WITH HIVE-10166
 2.4G Aug  5 12:07 ./hive-unit/target/tmp/log/hive.log
 CURRENT HEAD
 3.2G Aug  5 12:36 ./hive-unit/target/tmp/log/hive.log
 {noformat}
 This is just a single test, but on Jenkins, hive.log is more than 13G in size.





[jira] [Commented] (HIVE-9152) Dynamic Partition Pruning [Spark Branch]

2015-08-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659675#comment-14659675
 ] 

Lefty Leverenz commented on HIVE-9152:
--

The doc is done (thanks [~csun]) so I'm removing the TODOC-SPARK and TODOC-1.3 
labels.

* [Configuration Properties -- hive.spark.dynamic.partition.pruning | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.spark.dynamic.partition.pruning]
* [Configuration Properties -- 
hive.spark.dynamic.partition.pruning.max.data.size | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.spark.dynamic.partition.pruning.max.data.size]

[~sladymon] added some information from the description in the patch, so you 
might want to review *hive.spark.dynamic.partition.pruning*.

 Dynamic Partition Pruning [Spark Branch]
 

 Key: HIVE-9152
 URL: https://issues.apache.org/jira/browse/HIVE-9152
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Brock Noland
Assignee: Chao Sun
  Labels: TODOC-SPARK, TODOC1.3
 Fix For: spark-branch, 1.3.0, 2.0.0

 Attachments: HIVE-9152.1-spark.patch, HIVE-9152.10-spark.patch, 
 HIVE-9152.11-spark.patch, HIVE-9152.12-spark.patch, HIVE-9152.2-spark.patch, 
 HIVE-9152.3-spark.patch, HIVE-9152.4-spark.patch, HIVE-9152.5-spark.patch, 
 HIVE-9152.6-spark.patch, HIVE-9152.8-spark.patch, HIVE-9152.9-spark.patch


 Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
 optimization and we should implement the same in HOS.





[jira] [Commented] (HIVE-11416) CBO: Calcite Operator To Hive Operator (Calcite Return Path): Groupby Optimizer assumes the schema can match after removing RS and GBY

2015-08-06 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659743#comment-14659743
 ] 

Jesus Camacho Rodriguez commented on HIVE-11416:


[~pxiong], I left a couple of comments in RB. Thanks

 CBO: Calcite Operator To Hive Operator (Calcite Return Path): Groupby 
 Optimizer assumes the schema can match after removing RS and GBY
 --

 Key: HIVE-11416
 URL: https://issues.apache.org/jira/browse/HIVE-11416
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-11416.01.patch, HIVE-11416.02.patch, 
 HIVE-11416.03.patch, HIVE-11416.04.patch








[jira] [Updated] (HIVE-11180) Enable native vectorized map join for spark [Spark Branch]

2015-08-06 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-11180:
--
Labels: TODOC-SPARK  (was: )

 Enable native vectorized map join for spark [Spark Branch]
 --

 Key: HIVE-11180
 URL: https://issues.apache.org/jira/browse/HIVE-11180
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
  Labels: TODOC-SPARK
 Fix For: spark-branch

 Attachments: HIVE-11180.1-spark.patch, HIVE-11180.2-spark.patch


 The improvement was introduced in HIVE-9824. Let's use this task to track how 
 we can enable that for spark.





[jira] [Commented] (HIVE-11180) Enable native vectorized map join for spark [Spark Branch]

2015-08-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659598#comment-14659598
 ] 

Lefty Leverenz commented on HIVE-11180:
---

Doc note:  This adds Spark to the description of 
*hive.mapjoin.optimized.hashtable* in HiveConf.java, so the parameter needs to 
be updated (with version information) in Configuration Properties after the 
patch gets merged to master.

* [ConfigurationProperties -- hive.mapjoin.optimized.hashtable | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.mapjoin.optimized.hashtable]

Adding a TODOC-SPARK label just as a reminder -- no doc needed until merged to 
master.

 Enable native vectorized map join for spark [Spark Branch]
 --

 Key: HIVE-11180
 URL: https://issues.apache.org/jira/browse/HIVE-11180
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
  Labels: TODOC-SPARK
 Fix For: spark-branch

 Attachments: HIVE-11180.1-spark.patch, HIVE-11180.2-spark.patch


 The improvement was introduced in HIVE-9824. Let's use this task to track how 
 we can enable that for spark.





[jira] [Commented] (HIVE-11180) Enable native vectorized map join for spark [Spark Branch]

2015-08-06 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659627#comment-14659627
 ] 

Rui Li commented on HIVE-11180:
---

Thanks [~leftylev].

 Enable native vectorized map join for spark [Spark Branch]
 --

 Key: HIVE-11180
 URL: https://issues.apache.org/jira/browse/HIVE-11180
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
  Labels: TODOC-SPARK
 Fix For: spark-branch

 Attachments: HIVE-11180.1-spark.patch, HIVE-11180.2-spark.patch


 The improvement was introduced in HIVE-9824. Let's use this task to track how 
 we can enable that for spark.





[jira] [Assigned] (HIVE-11485) Session close should not close async SQL operations

2015-08-06 Thread Deepak Barr (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Barr reassigned HIVE-11485:
--

Assignee: Deepak Barr

 Session close should not close async SQL operations
 ---

 Key: HIVE-11485
 URL: https://issues.apache.org/jira/browse/HIVE-11485
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Deepak Barr

 Right now, closing a session on HiveServer2 closes all of its operations. But 
 running queries are actually available across sessions and are not tied to a 
 session (except the launch, which requires configuration and resources), and 
 the status of a query can be fetched across sessions.
 However, closing the session on which an operation was launched closes all of 
 those operations as well.
 So, we should avoid closing all operations upon closing a session.
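 A minimal sketch of the separation being proposed, with hypothetical names
 (HiveServer2's real Session/OperationManager classes differ): operations live in a
 server-wide registry, and closing a session only drops its handles instead of closing
 the operations themselves.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class SessionOperationSketch {
    // Server-wide registry: operations outlive the session that launched them.
    static final Map<String, String> operationState = new HashMap<>(); // opId -> state

    // Launching registers the operation globally and records a handle in the session.
    static void launch(Set<String> sessionHandles, String opId) {
        operationState.put(opId, "RUNNING");
        sessionHandles.add(opId);
    }

    // Proposed behavior: closing a session releases its handles but leaves the
    // operations running and queryable from other sessions.
    static void closeSession(Set<String> sessionHandles) {
        sessionHandles.clear();
    }

    // Status lookup works regardless of which session launched the operation.
    static String status(String opId) {
        return operationState.get(opId);
    }
}
```

 With the current behavior, closeSession would also remove each handle's entry from the
 registry, which is exactly what the issue argues against.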





[jira] [Commented] (HIVE-7476) CTAS does not work properly for s3

2015-08-06 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660105#comment-14660105
 ] 

Sergio Peña commented on HIVE-7476:
---

+1
The patch looks good. Encrypted files will work fine.
One quick piece of feedback: could you add more comments to the needToCopy() 
method? I did not know what {{boolean diffFs = 
!srcFs.getClass().equals(destFs.getClass());}} was doing until I saw Lenni's 
comment.
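For context, here is a rough illustration of what a diffFs-style check is getting at.
This is an analogy using URI schemes, not Hive's actual needToCopy() implementation
(which compares FileSystem classes); the class and method names below are illustrative.

```java
import java.net.URI;

public class DiffFsSketch {
    // If source and destination live on different filesystem types
    // (e.g. hdfs:// vs s3n://), a cheap rename is impossible and the
    // data must genuinely be copied between filesystems.
    static boolean needToCopy(URI src, URI dest) {
        String srcScheme = src.getScheme() == null ? "file" : src.getScheme();
        String destScheme = dest.getScheme() == null ? "file" : dest.getScheme();
        return !srcScheme.equalsIgnoreCase(destScheme);
    }
}
```

The real check uses FileSystem classes rather than schemes so that two schemes backed
by the same implementation compare as equal.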

 CTAS does not work properly for s3
 --

 Key: HIVE-7476
 URL: https://issues.apache.org/jira/browse/HIVE-7476
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.1, 1.1.0
 Environment: Linux
Reporter: Jian Fang
Assignee: Szehon Ho
 Attachments: HIVE-7476.1.patch, HIVE-7476.2.patch


 When we use CTAS to create a new table in S3, the table location is not set 
 correctly. As a result, the data from the existing table cannot be inserted 
 into the newly created table.
 We can use the following example to reproduce this issue.
 {code}
 set hive.metastore.warehouse.dir=OUTPUT_PATH;
 drop table s3_dir_test;
 drop table s3_1;
 drop table s3_2;
 create external table s3_dir_test(strct struct<a:int, b:string, c:string>)
 row format delimited
 fields terminated by '\t'
 collection items terminated by ' '
 location 'INPUT_PATH';
 create table s3_1(strct struct<a:int, b:string, c:string>)
 row format delimited
 fields terminated by '\t'
 collection items terminated by ' ';
 insert overwrite table s3_1 select * from s3_dir_test;
 select * from s3_1;
 create table s3_2 as select * from s3_1;
 select * from s3_1;
 select * from s3_2;
 {code}
 The data could be as follows.
 {noformat}
 1 abc 10.5
 2 def 11.5
 3 ajss 90.23232
 4 djns 89.02002
 5 random 2.99
 6 data 3.002
 7 ne 71.9084
 {noformat}
 The root cause is that the SemanticAnalyzer class did not handle S3 locations 
 properly for CTAS.
 A patch will be provided shortly.





[jira] [Resolved] (HIVE-11326) Parquet table: where clause with partition column fails

2015-08-06 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña resolved HIVE-11326.

Resolution: Duplicate
  Assignee: Sergio Peña

Hi [~tfriedr],

This issue has been fixed in HIVE-11401.

 Parquet table: where clause with partition column fails
 ---

 Key: HIVE-11326
 URL: https://issues.apache.org/jira/browse/HIVE-11326
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 1.2.0, 1.2.1
Reporter: Thomas Friedrich
Assignee: Sergio Peña
  Labels: parquet

 Steps:
 {code}
 create table t1 (c1 int) partitioned by (part string) stored as parquet;
 insert into table t1 partition (part='p1') values (1);
 select * from t1 where part='p1';
 {code}
 Error message:
 {noformat}
 Caused by: java.lang.IllegalArgumentException: Column [part] was not found in schema!
 at parquet.Preconditions.checkArgument(Preconditions.java:55)
 at parquet.filter2.predicate.SchemaCompatibilityValidator.getColumnDescriptor(SchemaCompatibilityValidator.java:190)
 at parquet.filter2.predicate.SchemaCompatibilityValidator.validateColumn(SchemaCompatibilityValidator.java:178)
 at parquet.filter2.predicate.SchemaCompatibilityValidator.validateColumnFilterPredicate(SchemaCompatibilityValidator.java:160)
 at parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:94)
 at parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:59)
 at parquet.filter2.predicate.Operators$Eq.accept(Operators.java:180)
 at parquet.filter2.predicate.SchemaCompatibilityValidator.validate(SchemaCompatibilityValidator.java:64)
 at parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:59)
 at parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:40)
 at parquet.filter2.compat.FilterCompat$FilterPredicateCompat.accept(FilterCompat.java:126)
 at parquet.filter2.compat.RowGroupFilter.filterRowGroups(RowGroupFilter.java:46)
 at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:275)
 at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:99)
 at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:85)
 at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:72)
 at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
 {noformat}
 It seems the problem was introduced with HIVE-10252 ([~dongc]). The filter can't 
 contain any partition columns in the case of a Parquet table. 
 While searching for an existing JIRA, I found a similar problem reported for 
 Spark - SPARK-6554.
 I think the setFilter method should remove all predicates that reference 
 partition columns before building the FilterPredicate object.
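 The proposed fix can be sketched on a toy predicate model (the names below are
 illustrative; Parquet's real FilterPredicate and SchemaCompatibilityValidator types are
 richer): drop every leaf predicate that references a partition column before handing the
 filter to Parquet, since partition values never appear in a Parquet file's schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class PartitionPredicatePruner {
    // Toy leaf predicate: a column name compared against a literal value.
    static final class Leaf {
        final String column;
        final Object literal;
        Leaf(String column, Object literal) {
            this.column = column;
            this.literal = literal;
        }
    }

    // Keep only leaves over columns that actually exist in the file schema;
    // predicates on partition columns would trip the schema validator with
    // "Column [...] was not found in schema!".
    static List<Leaf> prune(List<Leaf> leaves, Set<String> partitionCols) {
        List<Leaf> kept = new ArrayList<>();
        for (Leaf leaf : leaves) {
            if (!partitionCols.contains(leaf.column)) {
                kept.add(leaf);
            }
        }
        return kept;
    }
}
```

 Partition pruning itself still happens earlier in planning; this only keeps the
 partition-column predicates out of the row-group filter.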





[jira] [Commented] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.

2015-08-06 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660231#comment-14660231
 ] 

Chao Sun commented on HIVE-11466:
-

OK, I'll take a look.

 HIVE-10166 generates more data on hive.log causing Jenkins to fill all the 
 disk.
 

 Key: HIVE-11466
 URL: https://issues.apache.org/jira/browse/HIVE-11466
 Project: Hive
  Issue Type: Bug
Reporter: Sergio Peña
Assignee: Xuefu Zhang

 An issue with the HIVE-10166 patch is that it increases the size of hive.log, 
 causing Jenkins to fail because it runs out of disk space.
 Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with 
 the patch, and after other commits.
 {noformat}
 BEFORE HIVE-10166
 13M Aug  5 11:57 ./hive-unit/target/tmp/log/hive.log
 WITH HIVE-10166
 2.4G Aug  5 12:07 ./hive-unit/target/tmp/log/hive.log
 CURRENT HEAD
 3.2G Aug  5 12:36 ./hive-unit/target/tmp/log/hive.log
 {noformat}
 This is just a single test, but on Jenkins, hive.log is more than 13G in size.





[jira] [Updated] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.

2015-08-06 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-11466:

Attachment: HIVE-11466.patch

Let's see if using Thrift 0.9.0 solves the problem.

 HIVE-10166 generates more data on hive.log causing Jenkins to fill all the 
 disk.
 

 Key: HIVE-11466
 URL: https://issues.apache.org/jira/browse/HIVE-11466
 Project: Hive
  Issue Type: Bug
Reporter: Sergio Peña
Assignee: Xuefu Zhang
 Attachments: HIVE-11466.patch


 An issue with the HIVE-10166 patch is that it increases the size of hive.log, 
 causing Jenkins to fail because it runs out of disk space.
 Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with 
 the patch, and after other commits.
 {noformat}
 BEFORE HIVE-10166
 13M Aug  5 11:57 ./hive-unit/target/tmp/log/hive.log
 WITH HIVE-10166
 2.4G Aug  5 12:07 ./hive-unit/target/tmp/log/hive.log
 CURRENT HEAD
 3.2G Aug  5 12:36 ./hive-unit/target/tmp/log/hive.log
 {noformat}
 This is just a single test, but on Jenkins, hive.log is more than 13G in size.





[jira] [Updated] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.

2015-08-06 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-11466:

Attachment: (was: HIVE-11466.patch)

 HIVE-10166 generates more data on hive.log causing Jenkins to fill all the 
 disk.
 

 Key: HIVE-11466
 URL: https://issues.apache.org/jira/browse/HIVE-11466
 Project: Hive
  Issue Type: Bug
Reporter: Sergio Peña
Assignee: Xuefu Zhang

 An issue with the HIVE-10166 patch is that it increases the size of hive.log, 
 causing Jenkins to fail because it runs out of disk space.
 Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with 
 the patch, and after other commits.
 {noformat}
 BEFORE HIVE-10166
 13M Aug  5 11:57 ./hive-unit/target/tmp/log/hive.log
 WITH HIVE-10166
 2.4G Aug  5 12:07 ./hive-unit/target/tmp/log/hive.log
 CURRENT HEAD
 3.2G Aug  5 12:36 ./hive-unit/target/tmp/log/hive.log
 {noformat}
 This is just a single test, but on Jenkins, hive.log is more than 13G in size.





[jira] [Updated] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.

2015-08-06 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-11466:

Attachment: HIVE-11466.patch

 HIVE-10166 generates more data on hive.log causing Jenkins to fill all the 
 disk.
 

 Key: HIVE-11466
 URL: https://issues.apache.org/jira/browse/HIVE-11466
 Project: Hive
  Issue Type: Bug
Reporter: Sergio Peña
Assignee: Xuefu Zhang
 Attachments: HIVE-11466.patch


 An issue with the HIVE-10166 patch is that it increases the size of hive.log, 
 causing Jenkins to fail because it runs out of disk space.
 Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with 
 the patch, and after other commits.
 {noformat}
 BEFORE HIVE-10166
 13M Aug  5 11:57 ./hive-unit/target/tmp/log/hive.log
 WITH HIVE-10166
 2.4G Aug  5 12:07 ./hive-unit/target/tmp/log/hive.log
 CURRENT HEAD
 3.2G Aug  5 12:36 ./hive-unit/target/tmp/log/hive.log
 {noformat}
 This is just a single test, but on Jenkins, hive.log is more than 13G in size.





[jira] [Updated] (HIVE-9152) Dynamic Partition Pruning [Spark Branch]

2015-08-06 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-9152:
-
Labels:   (was: TODOC-SPARK TODOC1.3)

 Dynamic Partition Pruning [Spark Branch]
 

 Key: HIVE-9152
 URL: https://issues.apache.org/jira/browse/HIVE-9152
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Brock Noland
Assignee: Chao Sun
 Fix For: spark-branch, 1.3.0, 2.0.0

 Attachments: HIVE-9152.1-spark.patch, HIVE-9152.10-spark.patch, 
 HIVE-9152.11-spark.patch, HIVE-9152.12-spark.patch, HIVE-9152.2-spark.patch, 
 HIVE-9152.3-spark.patch, HIVE-9152.4-spark.patch, HIVE-9152.5-spark.patch, 
 HIVE-9152.6-spark.patch, HIVE-9152.8-spark.patch, HIVE-9152.9-spark.patch


 Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
 optimization and we should implement the same in HOS.





[jira] [Commented] (HIVE-11381) QTest combine2_hadoop20.q fails when using -Phadoop-1 profile due to HIVE-11139

2015-08-06 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660161#comment-14660161
 ] 

Sergio Peña commented on HIVE-11381:


[~jxiang] Is this an issue upstream, or can I close the ticket?

 QTest combine2_hadoop20.q fails when using -Phadoop-1 profile due to 
 HIVE-11139
 ---

 Key: HIVE-11381
 URL: https://issues.apache.org/jira/browse/HIVE-11381
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Sergio Peña
Assignee: Sergio Peña

 The q-test {{combine2_hadoop20.q}} fails when running tests with the -Phadoop-1
 profile. The test output differs because of the changes added in HIVE-11139
 for more lineage information.
 Based on other HIVE-11139 tests, this test output only needs to be 
 regenerated.





[jira] [Commented] (HIVE-5277) HBase handler skips rows with null valued first cells when only row key is selected

2015-08-06 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660186#comment-14660186
 ] 

Swarnim Kulkarni commented on HIVE-5277:


Just to update: this is an issue with count(*)-type queries as well, not only 
with count(key).

 HBase handler skips rows with null valued first cells when only row key is 
 selected
 ---

 Key: HIVE-5277
 URL: https://issues.apache.org/jira/browse/HIVE-5277
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Affects Versions: 0.11.0, 0.11.1, 0.12.0, 0.13.0
Reporter: Teddy Choi
Assignee: Swarnim Kulkarni
Priority: Critical
 Attachments: HIVE-5277.1.patch.txt, HIVE-5277.2.patch.txt


 HBaseStorageHandler skips rows with null valued first cells when only row key 
 is selected.
 {noformat}
 SELECT key, col1, col2 FROM hbase_table;
 key1  cell1   cell2 
 key2  NULLcell3
 SELECT COUNT(key) FROM hbase_table;
 1
 {noformat}
 HiveHBaseTableInputFormat.getRecordReader selects the first cell to avoid 
 skipping rows. But when the first cell is null, HBase skips that row.
 http://hbase.apache.org/book/perf.reading.html, section 12.9.6 "Optimal Loading 
 of Row Keys", describes how to deal with this problem.
 I tried to find an existing issue, but I couldn't. If you find the same issue, 
 please mark this one as a duplicate.
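 The skip can be illustrated with a toy in-memory table (plain Java, not the HBase
 client API; names are illustrative): fetching only each row's first cell drops rows
 whose first cell is NULL, whereas counting row keys, which is what the
 FirstKeyOnlyFilter/KeyOnlyFilter approach from the HBase book achieves, sees every row.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FirstCellCountDemo {
    // Buggy strategy: materialize only the first cell per row. A row whose
    // first cell is null yields nothing and is skipped, as in the bug report.
    static long countViaFirstCell(Map<String, List<String>> table) {
        return table.values().stream()
                .filter(cells -> !cells.isEmpty() && cells.get(0) != null)
                .count();
    }

    // Fixed strategy: count the row keys themselves, independent of cell values.
    static long countViaRowKeys(Map<String, List<String>> table) {
        return table.size();
    }

    // Mirrors the two-row example from the issue: key2's first cell is NULL.
    static Map<String, List<String>> sampleTable() {
        Map<String, List<String>> t = new LinkedHashMap<>();
        t.put("key1", Arrays.asList("cell1", "cell2"));
        t.put("key2", Arrays.asList(null, "cell3"));
        return t;
    }
}
```

 On the sample table, the first-cell strategy reproduces the wrong COUNT(key) of 1
 while the row-key strategy returns 2.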


