[jira] [Commented] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.
[ https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660277#comment-14660277 ] Chao Sun commented on HIVE-11466: - Hmm, seems like with Thrift 0.9.0 it still has this message in the log. HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk. Key: HIVE-11466 URL: https://issues.apache.org/jira/browse/HIVE-11466 Project: Hive Issue Type: Bug Reporter: Sergio Peña Assignee: Xuefu Zhang Attachments: HIVE-11466.patch An issue with the HIVE-10166 patch is that it increases the size of hive.log, causing Jenkins to fail because it runs out of disk space. Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with the patch, and after other commits. {noformat} BEFORE HIVE-10166 13M Aug 5 11:57 ./hive-unit/target/tmp/log/hive.log WITH HIVE-10166 2.4G Aug 5 12:07 ./hive-unit/target/tmp/log/hive.log CURRENT HEAD 3.2G Aug 5 12:36 ./hive-unit/target/tmp/log/hive.log {noformat} This is just a single test, but on Jenkins, hive.log is more than 13G in size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7566) HIVE can't count hbase NULL column value properly
[ https://issues.apache.org/jira/browse/HIVE-7566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660284#comment-14660284 ] Aihua Xu commented on HIVE-7566: There is HIVE-5277 for the same issue. I will resolve this as a duplicate then. HIVE can't count hbase NULL column value properly - Key: HIVE-7566 URL: https://issues.apache.org/jira/browse/HIVE-7566 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.13.0 Environment: HIVE version 0.13.0 HBase version 0.98.0 Reporter: Kent Kong HBase table structure is like this: table name: 'testtable' column family: 'data' column 1: 'name' column 2: 'color' HIVE mapping table structure is like this: table name: 'hb_testtable' column 1: 'name' column 2: 'color' In HBase, put two rows: (James, blue) and (May, no color). Then do a select in Hive: select * from hb_testtable where color is null. The result is (May, NULL). Then try a count: select count(*) from hb_testtable where color is null. The result is 0, but it should be 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-7566) HIVE can't count hbase NULL column value properly
[ https://issues.apache.org/jira/browse/HIVE-7566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu resolved HIVE-7566. Resolution: Duplicate HIVE can't count hbase NULL column value properly - Key: HIVE-7566 URL: https://issues.apache.org/jira/browse/HIVE-7566 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.13.0 Environment: HIVE version 0.13.0 HBase version 0.98.0 Reporter: Kent Kong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11475) Bad rename of directory during commit, when using HCat dynamic-partitioning.
[ https://issues.apache.org/jira/browse/HIVE-11475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660846#comment-14660846 ] Hive QA commented on HIVE-11475: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748944/HIVE-11475.1.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4849/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4849/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4849/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: org.apache.hive.ptest.execution.ssh.SSHExecutionException: RSyncResult [localFile=/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4849/succeeded/TestJdbcWithMiniHS2, remoteFile=/home/hiveptest/54.146.159.23-hiveptest-1/logs/, getExitCode()=12, getException()=null, getUser()=hiveptest, getHost()=54.146.159.23, getInstance()=1]: 'Address 54.146.159.23 maps to ec2-54-146-159-23.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT! 
receiving incremental file list ./ TEST-TestJdbcWithMiniHS2-TEST-org.apache.hive.jdbc.TestJdbcWithMiniHS2.xml (5799 bytes, transferred in full) hive.log [multi-gigabyte rsync progress output omitted: the transfer of hive.log was still under 20% complete at roughly 34MB/s when the log was cut off]
[jira] [Commented] (HIVE-11397) Parse Hive OR clauses as they are written into the AST
[ https://issues.apache.org/jira/browse/HIVE-11397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660858#comment-14660858 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-11397: -- +1 for the change in patch#2, pending test runs. [~jcamachorodriguez] can you resubmit the patch? The earlier run was incomplete. Thanks Hari Parse Hive OR clauses as they are written into the AST -- Key: HIVE-11397 URL: https://issues.apache.org/jira/browse/HIVE-11397 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11397.1.patch, HIVE-11397.2.patch, HIVE-11397.patch When parsing A OR B OR C, Hive converts it into (C OR B) OR A instead of turning it into A OR (B OR C) {code} GenericUDFOPOr or = new GenericUDFOPOr(); List<ExprNodeDesc> expressions = new ArrayList<ExprNodeDesc>(2); expressions.add(previous); expressions.add(current); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
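The quoted snippet pairs each new term with the accumulated "previous" node, which is where the (C OR B) OR A shape comes from. A minimal standalone sketch (illustrative names, not Hive's actual planner code) contrasting that fold with the right-associative one the issue asks for:

```java
import java.util.Arrays;
import java.util.List;

public class OrFold {
    // Pair each term with the accumulated "previous" node, the way the quoted
    // snippet does, yielding (C OR B) OR A when terms are visited in reverse.
    static String foldLikeHive(List<String> terms) {
        String previous = terms.get(terms.size() - 1);
        for (int i = terms.size() - 2; i >= 0; i--) {
            previous = "(" + previous + " OR " + terms.get(i) + ")";
        }
        return previous;
    }

    // Right-associative fold, producing A OR (B OR C) as written in the query.
    static String foldAsWritten(List<String> terms) {
        String acc = terms.get(terms.size() - 1);
        for (int i = terms.size() - 2; i >= 0; i--) {
            acc = "(" + terms.get(i) + " OR " + acc + ")";
        }
        return acc;
    }

    public static void main(String[] args) {
        List<String> t = Arrays.asList("A", "B", "C");
        System.out.println(foldLikeHive(t));  // ((C OR B) OR A)
        System.out.println(foldAsWritten(t)); // (A OR (B OR C))
    }
}
```

Either shape is logically equivalent; the point of the issue is that preserving the written association keeps the AST aligned with the original query text.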
[jira] [Updated] (HIVE-11494) Some positive constant double predicates gets rounded off while negative constants are not
[ https://issues.apache.org/jira/browse/HIVE-11494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11494: - Description: Check the predicates in the filter expression for the following queries. It looks closely related to HIVE-11477 and HIVE-11493 {code:title=explain select * from orc_ppd where f = -0.0799821186066;} OK Stage-0 Fetch Operator limit:-1 Select Operator [SEL_2] outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13] Filter Operator [FIL_4] predicate:(f = -0.0799821186066) (type: boolean) TableScan [TS_0] alias:orc_ppd {code} {code:title=explain select * from orc_ppd where f = 0.0799821186066;} OK Stage-0 Fetch Operator limit:-1 Select Operator [SEL_2] outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13] Filter Operator [FIL_4] predicate:(f = 0.08) (type: boolean) TableScan [TS_0] alias:orc_ppd {code} Negative string constants get rounded off. {code:title=explain select * from orc_ppd where f = -0.0799821186066;} OK Stage-0 Fetch Operator limit:-1 Select Operator [SEL_2] outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13] Filter Operator [FIL_4] predicate:(f = -0.08) (type: boolean) TableScan [TS_0] alias:orc_ppd {code} was: Check the predicates in the filter expression for the following queries.
It looks closely related to HIVE-11477 and HIVE-11493 {code:title=explain select * from orc_ppd where f = -0.0799821186066;} OK Stage-0 Fetch Operator limit:-1 Select Operator [SEL_2] outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13] Filter Operator [FIL_4] predicate:(f = -0.0799821186066) (type: boolean) TableScan [TS_0] alias:orc_ppd {code} {code:title=explain select * from orc_ppd where f = 0.0799821186066;} OK Stage-0 Fetch Operator limit:-1 Select Operator [SEL_2] outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13] Filter Operator [FIL_4] predicate:(f = 0.08) (type: boolean) TableScan [TS_0] alias:orc_ppd {code} Some positive constant double predicates gets rounded off while negative constants are not -- Key: HIVE-11494 URL: https://issues.apache.org/jira/browse/HIVE-11494 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Pengcheng Xiong Priority: Critical Check the predicates in the filter expression for the following queries. It looks closely related to HIVE-11477 and HIVE-11493 {code:title=explain select * from orc_ppd where f = -0.0799821186066;} OK Stage-0 Fetch Operator limit:-1 Select Operator [SEL_2] outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13] Filter Operator [FIL_4] predicate:(f = -0.0799821186066) (type: boolean) TableScan [TS_0] alias:orc_ppd {code} {code:title=explain select * from orc_ppd where f = 0.0799821186066;} OK Stage-0 Fetch Operator limit:-1 Select Operator [SEL_2] outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13] Filter Operator [FIL_4] predicate:(f = 0.08) (type: boolean) TableScan [TS_0] alias:orc_ppd {code} Negative string constants get rounded off.
{code:title=explain select * from orc_ppd where f = -0.0799821186066;} OK Stage-0 Fetch Operator limit:-1 Select Operator [SEL_2] outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13] Filter Operator [FIL_4] predicate:(f = -0.08) (type: boolean) TableScan [TS_0] alias:orc_ppd {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9811) Hive on Tez leaks WorkMap objects
[ https://issues.apache.org/jira/browse/HIVE-9811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660680#comment-14660680 ] Sergey Shelukhin commented on HIVE-9811: Is this fixed by HIVE-10778? That probably just needs to be ported back. Hive on Tez leaks WorkMap objects - Key: HIVE-9811 URL: https://issues.apache.org/jira/browse/HIVE-9811 Project: Hive Issue Type: Bug Components: Tez Reporter: Oleg Danilov Attachments: HIVE-9811.patch TezTask doesn't fully clean gWorkMap, so as a result Hive leaks WorkMap objects. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
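The leak pattern here is a global registry whose entries outlive the task that added them. A hedged sketch of the fix shape (hypothetical names, not TezTask's actual code): register work for the task's duration and remove it in a finally block so entries cannot accumulate:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WorkRegistry {
    // A global path -> work map of the kind the issue describes; entries must
    // be removed when a task finishes, or they live for the JVM's lifetime.
    private static final Map<String, Object> G_WORK_MAP = new ConcurrentHashMap<>();

    static void runTask(String planPath, Object work, Runnable body) {
        G_WORK_MAP.put(planPath, work);
        try {
            body.run();
        } finally {
            G_WORK_MAP.remove(planPath); // cleanup even when the task throws
        }
    }

    static int size() {
        return G_WORK_MAP.size();
    }

    public static void main(String[] args) {
        runTask("hdfs://tmp/plan.xml", new Object(),
                () -> System.out.println("task body runs with work registered"));
        System.out.println(size()); // 0: no leaked entries
    }
}
```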
[jira] [Updated] (HIVE-11437) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into
[ https://issues.apache.org/jira/browse/HIVE-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11437: --- Attachment: HIVE-11437.04.patch CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into --- Key: HIVE-11437 URL: https://issues.apache.org/jira/browse/HIVE-11437 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11437.01.patch, HIVE-11437.02.patch, HIVE-11437.03.patch, HIVE-11437.04.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.
[ https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660822#comment-14660822 ] Thejas M Nair commented on HIVE-11466: -- [~csun] [~xuefuz] So you think it's some other change in the Spark merge patch that is causing the problem? HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk. Key: HIVE-11466 URL: https://issues.apache.org/jira/browse/HIVE-11466 Project: Hive Issue Type: Bug Reporter: Sergio Peña Assignee: Xuefu Zhang Attachments: HIVE-11466.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11477) CBO inserts a UDF cast for integer type promotion (only for negative numbers)
[ https://issues.apache.org/jira/browse/HIVE-11477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11477: - Priority: Critical (was: Major) CBO inserts a UDF cast for integer type promotion (only for negative numbers) - Key: HIVE-11477 URL: https://issues.apache.org/jira/browse/HIVE-11477 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Pengcheng Xiong Priority: Critical When CBO is enabled, filters that compare tinyint or smallint columns against integer constants insert a UDFToInteger cast on the column. When CBO is disabled, there is no such UDF. This behaviour breaks the ORC predicate pushdown feature, as ORC ignores UDFs in the filters. In the following examples, column t is a tinyint. {code:title=Explain for select count(*) from orc_ppd where t -127; (CBO OFF)} Filter Operator [FIL_9] predicate:(t = 125) (type: boolean) Statistics:Num rows: 1050 Data size: 611757 Basic stats: COMPLETE Column stats: NONE TableScan [TS_0] alias:orc_ppd Statistics:Num rows: 2100 Data size: 1223514 Basic stats: COMPLETE Column stats: NONE {code} {code:title=Explain for select count(*) from orc_ppd where t -127; (CBO ON)} Filter Operator [FIL_10] predicate:(UDFToInteger(t) -127) (type: boolean) Statistics:Num rows: 700 Data size: 407838 Basic stats: COMPLETE Column stats: NONE TableScan [TS_0] alias:orc_ppd Statistics:Num rows: 2100 Data size: 1223514 Basic stats: COMPLETE Column stats: NONE {code} CBO does not insert such a cast for non-negative numbers {code:title=Explain for select count(*) from orc_ppd where t 127; (CBO ON)} Filter Operator [FIL_10] predicate:(t 127) (type: boolean) Statistics:Num rows: 700 Data size: 407838 Basic stats: COMPLETE Column stats: NONE TableScan [TS_0] alias:orc_ppd Statistics:Num rows: 2100 Data size: 1223514 Basic stats: COMPLETE Column stats: NONE {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11406) Vectorization: StringExpr::compare() == 0 is bad for performance
[ https://issues.apache.org/jira/browse/HIVE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660667#comment-14660667 ] Lefty Leverenz commented on HIVE-11406: --- I meant, I don't see the branch-1 commit on the commits@hive email list -- just master. Vectorization: StringExpr::compare() == 0 is bad for performance Key: HIVE-11406 URL: https://issues.apache.org/jira/browse/HIVE-11406 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Matt McCline Attachments: HIVE-11406.01.patch {{StringExpr::compare() == 0}} is forced to evaluate the whole memory comparison loop for differing lengths of strings, though there is no possibility they will ever be equal. Add a {{StringExpr::equals}} which can be a smaller and tighter loop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
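The point of the issue is that a three-way compare() used only for equality cannot short-circuit on length, while a dedicated equals() can reject mismatched lengths before touching any memory. A sketch of that equals-style check (illustrative, not the actual StringExpr code):

```java
public class StringEq {
    // A compare()-style lexicographic comparison must walk bytes even when
    // lengths differ; an equals() can reject on length before the loop.
    static boolean equalsBytes(byte[] a, int aStart, int aLen,
                               byte[] b, int bStart, int bLen) {
        if (aLen != bLen) {
            return false; // strings of unequal length can never be equal
        }
        for (int i = 0; i < aLen; i++) {
            if (a[aStart + i] != b[bStart + i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] x = "hello".getBytes();
        byte[] y = "help".getBytes();
        System.out.println(equalsBytes(x, 0, x.length, x, 0, x.length)); // true
        System.out.println(equalsBytes(x, 0, x.length, y, 0, y.length)); // false
    }
}
```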
[jira] [Commented] (HIVE-11340) Create ORC based table using like clause doesn't copy compression property
[ https://issues.apache.org/jira/browse/HIVE-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660669#comment-14660669 ] Hive QA commented on HIVE-11340: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748942/HIVE-11340.1.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9324 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4848/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4848/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4848/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12748942 - PreCommit-HIVE-TRUNK-Build Create ORC based table using like clause doesn't copy compression property -- Key: HIVE-11340 URL: https://issues.apache.org/jira/browse/HIVE-11340 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.14.0, 1.0.0, 1.2.0 Reporter: Gaurav Kohli Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-11340.1.patch I found an issue in the "create table like" clause: it does not copy the table properties from an ORC file format based table.
Steps to reproduce: Step 1: create table orc_table (time string) stored as ORC tblproperties ('orc.compress'='SNAPPY'); Step 2: create table orc_table_using_like like orc_table; Step 3: show create table orc_table_using_like; Result: createtab_stmt CREATE TABLE `orc_table_using_like`( `time` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 'hdfs://nameservice1/user/hive/warehouse/gkohli.db/orc_table_using_like' TBLPROPERTIES ( 'transient_lastDdlTime'='1437578939') Issue: the 'orc.compress'='SNAPPY' property is missing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11493) Predicate with integer column equals double evaluates to false
[ https://issues.apache.org/jira/browse/HIVE-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660754#comment-14660754 ] Prasanth Jayachandran commented on HIVE-11493: -- This happens even when constant propagation and CBO are disabled. Predicate with integer column equals double evaluates to false -- Key: HIVE-11493 URL: https://issues.apache.org/jira/browse/HIVE-11493 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Pengcheng Xiong Priority: Blocker Filters with an integer column equal to a double constant evaluate to false every time. Negative double constants work fine. {code:title=explain select * from orc_ppd where t = 10.0;} OK Stage-0 Fetch Operator limit:-1 Select Operator [SEL_2] outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13] Filter Operator [FIL_1] predicate:false (type: boolean) TableScan [TS_0] alias:orc_ppd {code} {code:title=explain select * from orc_ppd where t = -10.0;} OK Stage-0 Fetch Operator limit:-1 Select Operator [SEL_2] outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13] Filter Operator [FIL_1] predicate:(t = (- 10.0)) (type: boolean) TableScan [TS_0] alias:orc_ppd {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
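For reference, under standard numeric promotion a tinyint value can equal a double constant, so folding the predicate to constant false is incorrect for in-range constants. A one-method Java illustration of the widening semantics (this is Java's promotion rule, not Hive code):

```java
public class WideningCompare {
    // Numeric promotion widens the byte (tinyint) to double before comparing,
    // so a correct constant-fold of (t = 10.0) must keep the predicate,
    // not replace it with false.
    static boolean tinyintEqualsDouble(byte t, double d) {
        return t == d; // byte is widened to double for the comparison
    }

    public static void main(String[] args) {
        System.out.println(tinyintEqualsDouble((byte) 10, 10.0));  // true
        System.out.println(tinyintEqualsDouble((byte) 10, 10.5));  // false
    }
}
```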
[jira] [Assigned] (HIVE-11488) Add sessionId info to HS2 log
[ https://issues.apache.org/jira/browse/HIVE-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu reassigned HIVE-11488: --- Assignee: Aihua Xu Add sessionId info to HS2 log - Key: HIVE-11488 URL: https://issues.apache.org/jira/browse/HIVE-11488 Project: Hive Issue Type: New Feature Components: Logging Affects Versions: 2.0.0 Reporter: Aihua Xu Assignee: Aihua Xu Session is critical for a multi-user system like Hive. Currently Hive doesn't log the sessionId to the log file, which sometimes makes debugging and analysis difficult when multiple activities are going on at the same time and the logs from different sessions are mixed together. Currently, Hive already has the sessionId saved in SessionState, and there is also another sessionId in SessionHandle (seems unused; I'm still looking to understand it). Generally we should have one sessionId from the beginning on both the client side and the server side. Seems we need some work on that side first. The sessionId can then be added to log4j's mapped diagnostic context (MDC) and can be configured to be output to the log file through the log4j properties. MDC is per thread, so we need to add the sessionId to the HS2 main thread and then it will be inherited by the child threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
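The description's last point is that MDC is per thread and must be set on the HS2 main thread so child threads inherit it. A stdlib-only sketch of that inheritance using InheritableThreadLocal (log4j's MDC provides analogous per-thread storage; the class and method names here are illustrative, not Hive's):

```java
public class SessionContext {
    // Per-thread sessionId; child threads started after put() inherit the
    // value, mirroring how an MDC entry set on the HS2 main thread would
    // propagate to its worker threads.
    private static final InheritableThreadLocal<String> SESSION_ID =
            new InheritableThreadLocal<>();

    static void put(String sessionId) {
        SESSION_ID.set(sessionId);
    }

    static String get() {
        return SESSION_ID.get();
    }

    public static void main(String[] args) throws InterruptedException {
        put("session-42");
        final String[] seenInChild = new String[1];
        Thread child = new Thread(() -> seenInChild[0] = get());
        child.start();
        child.join();
        System.out.println(seenInChild[0]); // session-42: inherited by the child
    }
}
```

With log4j, the analogous step would be putting the sessionId into the diagnostic context on session open and referencing it from the appender's layout pattern.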
[jira] [Updated] (HIVE-11492) get rid of gWorkMap
[ https://issues.apache.org/jira/browse/HIVE-11492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11492: Description: gWorkMap is an annoying ugly global that causes leaks. It's not clear why it is needed when we already have 10 different *Context objects floating around during compilation. At worst we can add another one, would still be better than the global map. It should be removed. (was: gWorkMap is an annoying ugly global that causes leaks. It's not clear why this is needed when we already have 10 different *Context objects floating around during compilation. At worst we can add another one, would still be better than the global map. It should be removed.) get rid of gWorkMap --- Key: HIVE-11492 URL: https://issues.apache.org/jira/browse/HIVE-11492 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin gWorkMap is an annoying ugly global that causes leaks. It's not clear why it is needed when we already have 10 different *Context objects floating around during compilation. At worst we can add another one, would still be better than the global map. It should be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11490) Lazily call ASTNode::toStringTree() after tree modification
[ https://issues.apache.org/jira/browse/HIVE-11490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11490: - Attachment: HIVE-11490.1.patch Lazily call ASTNode::toStringTree() after tree modification --- Key: HIVE-11490 URL: https://issues.apache.org/jira/browse/HIVE-11490 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-11490.1.patch Currently, we call toStringTree() as part of HIVE-11316 every time the tree is modified. This is a bad approach, as we can lazily delay this to the point when toStringTree() is called again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
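A minimal sketch of the proposed laziness (illustrative, not ASTNode's actual implementation, and ignoring the detail that modifying a child must also invalidate cached strings on its ancestors): memoize the rendered tree and clear the cache on modification, so the string is rebuilt only when toStringTree() is next called:

```java
import java.util.ArrayList;
import java.util.List;

public class CachedNode {
    private final String label;
    private final List<CachedNode> children = new ArrayList<>();
    private String cachedTree; // memoized toStringTree() result

    CachedNode(String label) {
        this.label = label;
    }

    void addChild(CachedNode c) {
        children.add(c);
        cachedTree = null; // invalidate; recompute only on demand
    }

    String toStringTree() {
        if (cachedTree == null) { // rebuilt at most once per modification
            StringBuilder sb = new StringBuilder("(").append(label);
            for (CachedNode c : children) {
                sb.append(' ').append(c.toStringTree());
            }
            cachedTree = sb.append(')').toString();
        }
        return cachedTree;
    }

    public static void main(String[] args) {
        CachedNode root = new CachedNode("TOK_QUERY");
        root.addChild(new CachedNode("TOK_FROM"));
        System.out.println(root.toStringTree()); // (TOK_QUERY (TOK_FROM))
    }
}
```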
[jira] [Commented] (HIVE-11387) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix reduce_deduplicate optimization
[ https://issues.apache.org/jira/browse/HIVE-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660677#comment-14660677 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-11387: -- +1 pending test run. CBO: Calcite Operator To Hive Operator (Calcite Return Path) : fix reduce_deduplicate optimization -- Key: HIVE-11387 URL: https://issues.apache.org/jira/browse/HIVE-11387 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11387.01.patch, HIVE-11387.02.patch, HIVE-11387.03.patch, HIVE-11387.04.patch, HIVE-11387.05.patch {noformat} The main problem is that, due to the return path, we may now have (RS1-GBY2)-(RS3-GBY4) when map.aggr=false, i.e., no map aggregation. However, in the non-return path, it will be treated as (RS1)-(GBY2-RS3-GBY4). The main problem is that it does not take the setting into account. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11490) Lazily call ASTNode::toStringTree() after tree modification
[ https://issues.apache.org/jira/browse/HIVE-11490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11490: - Attachment: (was: HIVE-11490.1.patch) Lazily call ASTNode::toStringTree() after tree modification --- Key: HIVE-11490 URL: https://issues.apache.org/jira/browse/HIVE-11490 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Currently, we call toStringTree() as part of HIVE-11316 every time the tree is modified. This is a bad approach, as we can lazily delay this to the point when toStringTree() is called again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11490) Lazily call ASTNode::toStringTree() after tree modification
[ https://issues.apache.org/jira/browse/HIVE-11490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11490: - Attachment: HIVE-11490.1.patch Lazily call ASTNode::toStringTree() after tree modification --- Key: HIVE-11490 URL: https://issues.apache.org/jira/browse/HIVE-11490 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-11490.1.patch Currently, we call toStringTree() as part of HIVE-11316 every time the tree is modified. This is a bad approach, as we can lazily delay this to the point when toStringTree() is called again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11437) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into
[ https://issues.apache.org/jira/browse/HIVE-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660784#comment-14660784 ] Pengcheng Xiong commented on HIVE-11437: [~jcamachorodriguez], as per your suggestion, I added a test file in CBO return path. Could you please take another look? Thanks. CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into --- Key: HIVE-11437 URL: https://issues.apache.org/jira/browse/HIVE-11437 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11437.01.patch, HIVE-11437.02.patch, HIVE-11437.03.patch, HIVE-11437.04.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11398) Parse wide OR and wide AND trees to flat OR/AND trees
[ https://issues.apache.org/jira/browse/HIVE-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11398: --- Attachment: HIVE-11398.2.patch [~gopalv], new patch fixes the issues with the optimization; triggering a new QA run. There are some changes in vectorization tests (vectorization mode gets disabled) that I guess will be solved once [~mmccline]'s patches go in? Thanks Parse wide OR and wide AND trees to flat OR/AND trees - Key: HIVE-11398 URL: https://issues.apache.org/jira/browse/HIVE-11398 Project: Hive Issue Type: New Feature Components: Logical Optimizer, UDF Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11398.2.patch, HIVE-11398.patch Deep trees of AND/OR are hard to traverse, particularly when they are merely the same structure in nested form as a version of the operator that takes an arbitrary number of args. One potential way to convert the DFS searches into a simpler BFS search is to introduce a new Operator pair named ALL and ANY. ALL(A, B, C, D, E) represents AND(AND(AND(AND(E, D), C), B), A) ANY(A, B, C, D, E) represents OR(OR(OR(OR(E, D), C),B),A) The SemanticAnalyser would be responsible for generating these operators and this would mean that the depth and complexity of traversals for the simplest case of wide AND/OR trees would be trivial. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
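The flattening the issue describes can be sketched as a walk that collects the leaves of a nested binary OR into one flat ANY argument list (hypothetical classes, not Hive's ExprNodeDesc hierarchy):

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenOr {
    interface Expr {}

    static class Leaf implements Expr {
        final String name;
        Leaf(String name) { this.name = name; }
    }

    static class Or implements Expr {
        final Expr left, right;
        Or(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Collect the leaves of a nested OR tree into one flat argument list,
    // so OR(OR(OR(OR(E, D), C), B), A) becomes ANY(E, D, C, B, A).
    static List<String> flattenToAny(Expr e) {
        List<String> args = new ArrayList<>();
        collect(e, args);
        return args;
    }

    private static void collect(Expr e, List<String> args) {
        if (e instanceof Or) {
            Or or = (Or) e;
            collect(or.left, args);  // nested ORs dissolve into the same list
            collect(or.right, args);
        } else {
            args.add(((Leaf) e).name);
        }
    }

    public static void main(String[] args) {
        // OR(OR(OR(OR(E, D), C), B), A) from the description
        Expr tree = new Or(new Or(new Or(new Or(new Leaf("E"), new Leaf("D")),
                new Leaf("C")), new Leaf("B")), new Leaf("A"));
        System.out.println(flattenToAny(tree)); // [E, D, C, B, A]
    }
}
```

A traversal over the flat list is then a single loop rather than a recursion whose depth grows with the number of disjuncts.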
[jira] [Reopened] (HIVE-4570) More information to user on GetOperationStatus in Hive Server2 when query is still executing
[ https://issues.apache.org/jira/browse/HIVE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu reopened HIVE-4570: --- Assignee: (was: Jaideep Dhok) Reopening as the functionality to view task/stage progress is not available yet. More information to user on GetOperationStatus in Hive Server2 when query is still executing Key: HIVE-4570 URL: https://issues.apache.org/jira/browse/HIVE-4570 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Amareshwari Sriramadasu Currently in Hive Server2, when the query is still executing only the status is set as STILL_EXECUTING. This issue is to give more information to the user such as progress and running job handles, if possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11437) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into
[ https://issues.apache.org/jira/browse/HIVE-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659750#comment-14659750 ] Jesus Camacho Rodriguez commented on HIVE-11437: [~pxiong], patch looks good to me, but can we add a test file to test insert into in CBO return path? That will avoid creating any regressions in the future. Thanks CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with insert into --- Key: HIVE-11437 URL: https://issues.apache.org/jira/browse/HIVE-11437 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11437.01.patch, HIVE-11437.02.patch, HIVE-11437.03.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11391) CBO (Calcite Return Path): Add CBO tests with return path on
[ https://issues.apache.org/jira/browse/HIVE-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11391: --- Attachment: HIVE-11391.patch CBO (Calcite Return Path): Add CBO tests with return path on Key: HIVE-11391 URL: https://issues.apache.org/jira/browse/HIVE-11391 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11391.patch, HIVE-11391.patch, HIVE-11391.patch, HIVE-11391.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7476) CTAS does not work properly for s3
[ https://issues.apache.org/jira/browse/HIVE-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-7476: Attachment: HIVE-7476.3.patch Thanks for the review, addressed the comments. CTAS does not work properly for s3 -- Key: HIVE-7476 URL: https://issues.apache.org/jira/browse/HIVE-7476 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.1, 1.1.0 Environment: Linux Reporter: Jian Fang Assignee: Szehon Ho Attachments: HIVE-7476.1.patch, HIVE-7476.2.patch, HIVE-7476.3.patch When we use CTAS to create a new table in s3, the table location is not set correctly. As a result, the data from the existing table cannot be inserted into the newly created table. We can use the following example to reproduce this issue. set hive.metastore.warehouse.dir=OUTPUT_PATH; drop table s3_dir_test; drop table s3_1; drop table s3_2; create external table s3_dir_test(strct struct<a:int, b:string, c:string>) row format delimited fields terminated by '\t' collection items terminated by ' ' location 'INPUT_PATH'; create table s3_1(strct struct<a:int, b:string, c:string>) row format delimited fields terminated by '\t' collection items terminated by ' '; insert overwrite table s3_1 select * from s3_dir_test; select * from s3_1; create table s3_2 as select * from s3_1; select * from s3_1; select * from s3_2; The data could be as follows. 1 abc 10.5 2 def 11.5 3 ajss 90.23232 4 djns 89.02002 5 random 2.99 6 data 3.002 7 ne 71.9084 The root cause is that the SemanticAnalyzer class did not handle the s3 location properly for CTAS. A patch will be provided shortly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11367) CBO: Calcite Operator To Hive Operator (Calcite Return Path): ExprNodeConverter should use HiveDecimal to create Decimal
[ https://issues.apache.org/jira/browse/HIVE-11367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11367: - Fix Version/s: 1.3.0 CBO: Calcite Operator To Hive Operator (Calcite Return Path): ExprNodeConverter should use HiveDecimal to create Decimal Key: HIVE-11367 URL: https://issues.apache.org/jira/browse/HIVE-11367 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11367.01.patch, HIVE-11367.01.patch-branch-1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11498) HIVE Authorization v2 should not check permission for dummy entity
[ https://issues.apache.org/jira/browse/HIVE-11498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dapeng Sun updated HIVE-11498: -- Attachment: HIVE-11498.001.patch HIVE Authorization v2 should not check permission for dummy entity -- Key: HIVE-11498 URL: https://issues.apache.org/jira/browse/HIVE-11498 Project: Hive Issue Type: Bug Reporter: Dapeng Sun Assignee: Dapeng Sun Attachments: HIVE-11498.001.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11376) CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are found for one of the input files
[ https://issues.apache.org/jira/browse/HIVE-11376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661331#comment-14661331 ] Rajat Khandelwal commented on HIVE-11376: - Taking patch from reviewboard and attaching CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are found for one of the input files - Key: HIVE-11376 URL: https://issues.apache.org/jira/browse/HIVE-11376 Project: Hive Issue Type: Bug Reporter: Rajat Khandelwal Assignee: Rajat Khandelwal Attachments: HIVE-11376.02.patch https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java#L379 This is the exact code snippet: {noformat} // Since there is no easy way of knowing whether MAPREDUCE-1597 is present in the tree or not, // we use a configuration variable for the same if (this.mrwork != null && !this.mrwork.getHadoopSupportsSplittable()) { // The following code should be removed, once // https://issues.apache.org/jira/browse/MAPREDUCE-1597 is fixed. // Hadoop does not handle non-splittable files correctly for CombineFileInputFormat, // so don't use CombineFileInputFormat for non-splittable files // i.e., don't combine if inputformat is a TextInputFormat and has compression turned on {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11376) CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are found for one of the input files
[ https://issues.apache.org/jira/browse/HIVE-11376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajat Khandelwal updated HIVE-11376: Attachment: HIVE-11376.02.patch CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are found for one of the input files - Key: HIVE-11376 URL: https://issues.apache.org/jira/browse/HIVE-11376 Project: Hive Issue Type: Bug Reporter: Rajat Khandelwal Assignee: Rajat Khandelwal Attachments: HIVE-11376.02.patch https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java#L379 This is the exact code snippet: {noformat} // Since there is no easy way of knowing whether MAPREDUCE-1597 is present in the tree or not, // we use a configuration variable for the same if (this.mrwork != null && !this.mrwork.getHadoopSupportsSplittable()) { // The following code should be removed, once // https://issues.apache.org/jira/browse/MAPREDUCE-1597 is fixed. // Hadoop does not handle non-splittable files correctly for CombineFileInputFormat, // so don't use CombineFileInputFormat for non-splittable files // i.e., don't combine if inputformat is a TextInputFormat and has compression turned on {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7476) CTAS does not work properly for s3
[ https://issues.apache.org/jira/browse/HIVE-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661368#comment-14661368 ] Hive QA commented on HIVE-7476: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12749117/HIVE-7476.3.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9326 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4854/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4854/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4854/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12749117 - PreCommit-HIVE-TRUNK-Build CTAS does not work properly for s3 -- Key: HIVE-7476 URL: https://issues.apache.org/jira/browse/HIVE-7476 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.1, 1.1.0 Environment: Linux Reporter: Jian Fang Assignee: Szehon Ho Attachments: HIVE-7476.1.patch, HIVE-7476.2.patch, HIVE-7476.3.patch When we use CTAS to create a new table in s3, the table location is not set correctly. As a result, the data from the existing table cannot be inserted into the newly created table. We can use the following example to reproduce this issue. 
set hive.metastore.warehouse.dir=OUTPUT_PATH; drop table s3_dir_test; drop table s3_1; drop table s3_2; create external table s3_dir_test(strct struct<a:int, b:string, c:string>) row format delimited fields terminated by '\t' collection items terminated by ' ' location 'INPUT_PATH'; create table s3_1(strct struct<a:int, b:string, c:string>) row format delimited fields terminated by '\t' collection items terminated by ' '; insert overwrite table s3_1 select * from s3_dir_test; select * from s3_1; create table s3_2 as select * from s3_1; select * from s3_1; select * from s3_2; The data could be as follows. 1 abc 10.5 2 def 11.5 3 ajss 90.23232 4 djns 89.02002 5 random 2.99 6 data 3.002 7 ne 71.9084 The root cause is that the SemanticAnalyzer class did not handle the s3 location properly for CTAS. A patch will be provided shortly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-5277) HBase handler skips rows with null valued first cells when only row key is selected
[ https://issues.apache.org/jira/browse/HIVE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Swarnim Kulkarni updated HIVE-5277: --- Attachment: HIVE-5277.3.patch.txt Updated patch to fix counts for count(key) and count(*) HBase handler skips rows with null valued first cells when only row key is selected --- Key: HIVE-5277 URL: https://issues.apache.org/jira/browse/HIVE-5277 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.11.0, 0.11.1, 0.12.0, 0.13.0 Reporter: Teddy Choi Assignee: Swarnim Kulkarni Priority: Critical Attachments: HIVE-5277.1.patch.txt, HIVE-5277.2.patch.txt, HIVE-5277.3.patch.txt HBaseStorageHandler skips rows with null valued first cells when only row key is selected. {noformat} SELECT key, col1, col2 FROM hbase_table; key1 cell1 cell2 key2 NULLcell3 SELECT COUNT(key) FROM hbase_table; 1 {noformat} HiveHBaseTableInputFormat.getRecordReader makes first cell selected to avoid skipping rows. But when the first cell is null, HBase skips that row. http://hbase.apache.org/book/perf.reading.html 12.9.6. Optimal Loading of Row Keys describes how to deal with this problem. I tried to find an existing issue, but I couldn't. If you find a same issue, please make this issue duplicated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11376) CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are found for one of the input files
[ https://issues.apache.org/jira/browse/HIVE-11376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajat Khandelwal updated HIVE-11376: Attachment: (was: HIVE-11376_02.patch) CombineHiveInputFormat is falling back to HiveInputFormat in case codecs are found for one of the input files - Key: HIVE-11376 URL: https://issues.apache.org/jira/browse/HIVE-11376 Project: Hive Issue Type: Bug Reporter: Rajat Khandelwal Assignee: Rajat Khandelwal Attachments: HIVE-11376.02.patch https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java#L379 This is the exact code snippet: {noformat} // Since there is no easy way of knowing whether MAPREDUCE-1597 is present in the tree or not, // we use a configuration variable for the same if (this.mrwork != null && !this.mrwork.getHadoopSupportsSplittable()) { // The following code should be removed, once // https://issues.apache.org/jira/browse/MAPREDUCE-1597 is fixed. // Hadoop does not handle non-splittable files correctly for CombineFileInputFormat, // so don't use CombineFileInputFormat for non-splittable files // i.e., don't combine if inputformat is a TextInputFormat and has compression turned on {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11449) HybridHashTableContainer should throw exception if not enough memory to create the hash tables
[ https://issues.apache.org/jira/browse/HIVE-11449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-11449: -- Attachment: HIVE-11449.2.patch Attaching patch v2 - this prevents us from passing too low of a memUsage value by making sure it is at least wbSize. HybridHashTableContainer should throw exception if not enough memory to create the hash tables -- Key: HIVE-11449 URL: https://issues.apache.org/jira/browse/HIVE-11449 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-11449.1.patch, HIVE-11449.2.patch Currently it only logs a warning message: {code} public static int calcNumPartitions(long memoryThreshold, long dataSize, int minNumParts, int minWbSize, HybridHashTableConf nwayConf) throws IOException { int numPartitions = minNumParts; if (memoryThreshold < minNumParts * minWbSize) { LOG.warn("Available memory is not enough to create a HybridHashTableContainer!"); } {code} Because we only log a warning, processing continues and hits a hard-to-diagnose error (log below also includes extra logging I added to help track this down). We should probably just fail the query with a useful error message instead. {noformat} 2015-07-30 18:49:29,696 [pool-1269-thread-8()] WARN org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer: Available memory is not enough to create HybridHashTableContainers consistently! 
2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** initialCapacity 1: 10 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** initialCapacity 2: 131072 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** maxCapacity: 0 2015-07-30 18:49:29,696 [pool-1269-thread-8()] ERROR org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap: *** initialCapacity 3: 0 2015-07-30 18:49:29,699 [TezTaskRunner_attempt_1437197396589_0685_1_49_00_2(attempt_1437197396589_0685_1_49_00_2)] ERROR org.apache.hadoop.hive.ql.exec.tez.TezProcessor: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:258) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:168) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:157) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Async initialization failed at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:419) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:389) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:514) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:467) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:243) ... 15 more Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError: Capacity must be a power of two at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:409) ... 20 more Caused
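A minimal sketch of the behavior this issue proposes: fail fast with a clear message instead of logging a warning and continuing into the "Capacity must be a power of two" error above. The class name and simplified signature below are illustrative (the real method also takes dataSize and an n-way config); this is not the actual patch.

```java
import java.io.IOException;

class HybridHashTableSizing {
    // Throw instead of warn when there is not enough memory for the minimum
    // number of partitions, each needing at least one write buffer of minWbSize.
    static int calcNumPartitions(long memoryThreshold, long dataSize,
                                 int minNumParts, int minWbSize) throws IOException {
        long required = (long) minNumParts * minWbSize; // widen before multiplying to avoid int overflow
        if (memoryThreshold < required) {
            throw new IOException("Not enough memory to create a HybridHashTableContainer: need at least "
                + required + " bytes but only " + memoryThreshold + " available");
        }
        return minNumParts; // real partition-count math elided in this sketch
    }
}
```

Failing here surfaces the root cause at planning time rather than deep inside hash-map initialization.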
[jira] [Commented] (HIVE-11375) Broken processing of queries containing NOT (x IS NOT NULL and x <> 0)
[ https://issues.apache.org/jira/browse/HIVE-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661213#comment-14661213 ] Hive QA commented on HIVE-11375: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748958/HIVE-11375.patch {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 9326 tests executed *Failed tests:* {noformat} TestMarkPartition - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_udf org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_udf org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_filter_join_breaktask org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4852/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4852/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4852/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12748958 - PreCommit-HIVE-TRUNK-Build Broken processing of queries containing NOT (x IS NOT NULL and x <> 0) -- Key: HIVE-11375 URL: https://issues.apache.org/jira/browse/HIVE-11375 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 2.0.0 Reporter: Mariusz Sakowski Assignee: Aihua Xu Fix For: 2.0.0 Attachments: HIVE-11375.patch When running a query like this: {code}explain select * from test where (val is not null and val <> 0);{code} hive will simplify the expression in parentheses and omit the is not null check: {code} Filter Operator predicate: (val <> 0) (type: boolean) {code} which is fine. but if we negate the condition using the NOT operator: {code}explain select * from test where not (val is not null and val <> 0);{code} hive will also simplify things, but now it will break stuff: {code} Filter Operator predicate: (not (val <> 0)) (type: boolean) {code} because the valid predicate should be *val == 0 or val is null*, while the above is equivalent to *val == 0* only, filtering away rows where val is null. simple example: {code} CREATE TABLE example ( val bigint ); INSERT INTO example VALUES (1), (NULL), (0); -- returns 2 rows - NULL and 0 select * from example where (val is null or val == 0); -- returns 1 row - 0 select * from example where not (val is not null and val <> 0); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
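The bug above is a three-valued-logic mistake, and it can be reproduced outside Hive. In this sketch, SQL's TRUE/FALSE/UNKNOWN is modeled with Java's `Boolean` (null = UNKNOWN); the class and method names are illustrative. With the inner comparison written as `val <> 0` (the operator appears to have been stripped from the issue text by formatting), a NULL row makes the conjunction FALSE, so its negation is TRUE and the row must qualify; the broken rewrite `NOT (val <> 0)` yields UNKNOWN and wrongly filters the row.

```java
// SQL three-valued logic with Boolean: null stands for UNKNOWN.
class Sql3VL {
    static Boolean and(Boolean a, Boolean b) {
        if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) return false; // FALSE dominates
        if (a == null || b == null) return null;                              // otherwise UNKNOWN taints
        return true;
    }
    static Boolean not(Boolean a) { return a == null ? null : !a; }
    static Boolean isNotNull(Long v) { return v != null; }
    static Boolean ne(Long v, long c) { return v == null ? null : v != c; }   // NULL <> c is UNKNOWN

    // Correct predicate: NOT (val IS NOT NULL AND val <> 0)
    static Boolean correct(Long val) { return not(and(isNotNull(val), ne(val, 0))); }
    // Broken rewrite: NOT (val <> 0) -- drops the IS NULL branch
    static Boolean broken(Long val) { return not(ne(val, 0)); }
}
```

A WHERE clause keeps a row only when the predicate is TRUE, which is why the UNKNOWN produced by the broken rewrite silently drops the NULL row.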
[jira] [Updated] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-10975: Attachment: HIVE-10975.2.patch Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.1.patch, HIVE-10975.2.patch, HIVE-10975.2.patch, HIVE-10975.2.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661323#comment-14661323 ] Ferdinand Xu commented on HIVE-10975: - Now the jenkins result looks good, [~spena], could you take a look at it? Thanks! Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.1.patch, HIVE-10975.2.patch, HIVE-10975.2.patch, HIVE-10975.2.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11356) SMB join on tez fails when one of the tables is empty
[ https://issues.apache.org/jira/browse/HIVE-11356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-11356: -- Attachment: HIVE-11356.3.patch [~hagleitn] can you review please. SMB join on tez fails when one of the tables is empty - Key: HIVE-11356 URL: https://issues.apache.org/jira/browse/HIVE-11356 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-11356.1.patch, HIVE-11356.3.patch {code} :java.lang.IllegalStateException: Unexpected event. All physical sources already initialized at com.google.common.base.Preconditions.checkState(Preconditions.java:145) at org.apache.tez.mapreduce.input.MultiMRInput.handleEvents(MultiMRInput.java:142) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.handleEvent(LogicalIOProcessorRuntimeTask.java:610) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.access$1100(LogicalIOProcessorRuntimeTask.java:90) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$1.run(LogicalIOProcessorRuntimeTask.java:673) at java.lang.Thread.run(Thread.java:745) ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1437168420060_17787_1_01 [Map 4] killed/failed due to:null] Vertex killed, vertexName=Reducer 5, vertexId=vertex_1437168420060_17787_1_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1437168420060_17787_1_02 [Reducer 5] killed/failed due to:null] DAG failed due to vertex failure. failedVertices:1 killedVertices:1 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask HQL-FAILED {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11356) SMB join on tez fails when one of the tables is empty
[ https://issues.apache.org/jira/browse/HIVE-11356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661241#comment-14661241 ] Vikram Dixit K commented on HIVE-11356: --- [~jdere] review please. SMB join on tez fails when one of the tables is empty - Key: HIVE-11356 URL: https://issues.apache.org/jira/browse/HIVE-11356 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-11356.1.patch, HIVE-11356.3.patch {code} :java.lang.IllegalStateException: Unexpected event. All physical sources already initialized at com.google.common.base.Preconditions.checkState(Preconditions.java:145) at org.apache.tez.mapreduce.input.MultiMRInput.handleEvents(MultiMRInput.java:142) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.handleEvent(LogicalIOProcessorRuntimeTask.java:610) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.access$1100(LogicalIOProcessorRuntimeTask.java:90) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$1.run(LogicalIOProcessorRuntimeTask.java:673) at java.lang.Thread.run(Thread.java:745) ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1437168420060_17787_1_01 [Map 4] killed/failed due to:null] Vertex killed, vertexName=Reducer 5, vertexId=vertex_1437168420060_17787_1_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1437168420060_17787_1_02 [Reducer 5] killed/failed due to:null] DAG failed due to vertex failure. failedVertices:1 killedVertices:1 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask HQL-FAILED {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11451) SemanticAnalyzer throws IndexOutOfBounds Exception
[ https://issues.apache.org/jira/browse/HIVE-11451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661254#comment-14661254 ] Chao Sun commented on HIVE-11451: - +1 on the latest patch. SemanticAnalyzer throws IndexOutOfBounds Exception -- Key: HIVE-11451 URL: https://issues.apache.org/jira/browse/HIVE-11451 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Aihua Xu Priority: Critical Attachments: HIVE-11451.patch Following queries throw IndexOutOfBoundsException in SemanticAnalyzer {code:title=Queries|borderStyle=solid} CREATE TABLE staging(t tinyint, si smallint, i int, b bigint, f float, d double, bo boolean, s string, ts timestamp, dec decimal(4,2), bin binary) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '../../data/files/over1k' OVERWRITE INTO TABLE staging; CREATE TABLE orc_ppd(t tinyint, si smallint, i int, b bigint, f float, d double, bo boolean, s string, c char(50), v varchar(50), da date, ts timestamp, dec decimal(4,2), bin binary) STORED AS ORC tblproperties("orc.row.index.stride" = "1000"); insert overwrite table orc_ppd select si, i, b, f, d, bo, s, cast(s as char(50)), cast(s as varchar(50)), cast(ts as date), ts, dec, bin from staging; {code} {code:title=StackTrace|borderStyle=solid} java.lang.IndexOutOfBoundsException: Index: 13, Size: 13 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genConversionSelectOperator(SemanticAnalyzer.java:6754) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6543) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8989) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8880) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9730) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9623) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10115) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:330) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10126) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:209) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:240) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:310) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1139) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1068) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1058) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
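The trace above ("Index: 13, Size: 13") points to a column-count mismatch: the destination table `orc_ppd` has 14 columns while the SELECT list produces only 13 expressions, so per-column conversion indexes past the shorter list. A hedged sketch of the kind of up-front guard that turns this into a clear error (class and method names are hypothetical, not the actual fix in HIVE-11451.patch):

```java
import java.util.List;

class ConversionCheck {
    // Illustrative guard: verify the select expression count matches the
    // destination column count before per-column conversion, so the user sees
    // a meaningful error instead of an IndexOutOfBoundsException.
    static void checkColumnCounts(List<String> destColumns, List<String> selectExprs) {
        if (destColumns.size() != selectExprs.size()) {
            throw new IllegalArgumentException("Cannot insert: target table has "
                + destColumns.size() + " columns but the query produces "
                + selectExprs.size() + " expressions");
        }
    }
}
```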
[jira] [Resolved] (HIVE-8506) UT: add test flag in hive-site.xml for spark tests
[ https://issues.apache.org/jira/browse/HIVE-8506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li resolved HIVE-8506. -- Resolution: Duplicate Close this one as it's done in HIVE-10903. UT: add test flag in hive-site.xml for spark tests -- Key: HIVE-8506 URL: https://issues.apache.org/jira/browse/HIVE-8506 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Thomas Friedrich Assignee: Thomas Friedrich Priority: Minor All tests dbtxnmgr_* fail because the metastore tables are not correctly initialized. We need to set the hive.in.test flag in hive-site.xml under data/conf/spark: <property> <name>hive.in.test</name> <value>true</value> <description>Internal marker for test. Used for masking env-dependent values</description> </property> -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10975) Parquet: Bump the parquet version up to 1.8.0
[ https://issues.apache.org/jira/browse/HIVE-10975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661275#comment-14661275 ] Hive QA commented on HIVE-10975: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748971/HIVE-10975.2.patch {color:green}SUCCESS:{color} +1 9326 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4853/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4853/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4853/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12748971 - PreCommit-HIVE-TRUNK-Build Parquet: Bump the parquet version up to 1.8.0 - Key: HIVE-10975 URL: https://issues.apache.org/jira/browse/HIVE-10975 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Priority: Minor Attachments: HIVE-10975-parquet.patch, HIVE-10975.1-parquet.patch, HIVE-10975.1.patch, HIVE-10975.2.patch, HIVE-10975.2.patch, HIVE-10975.2.patch, HIVE-10975.patch There are lots of changes since parquet's graduation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11406) Vectorization: StringExpr::compare() == 0 is bad for performance
[ https://issues.apache.org/jira/browse/HIVE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661336#comment-14661336 ] Lefty Leverenz commented on HIVE-11406: --- Thanks [~gopalv]. Vectorization: StringExpr::compare() == 0 is bad for performance Key: HIVE-11406 URL: https://issues.apache.org/jira/browse/HIVE-11406 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Matt McCline Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11406.01.patch {{StringExpr::compare() == 0}} is forced to evaluate the whole memory comparison loop for differing lengths of strings, though there is no possibility they will ever be equal. Add a {{StringExpr::equals}} which can be a smaller and tighter loop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
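The point of the description above can be sketched in plain Java. This is a hypothetical standalone sketch, not Hive's actual {{StringExpr}} code: {{compare()}} must run its byte loop before the length difference is reported, while an {{equals()}} can reject mismatched lengths up front.

```java
// Hypothetical sketch (not the real Hive StringExpr API) of why
// compare() == 0 wastes work and how equals() avoids it.
public class StringExprSketch {
    // Lexicographic compare over byte ranges, as a vectorized kernel might do.
    public static int compare(byte[] a, int aStart, int aLen,
                              byte[] b, int bStart, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int diff = (a[aStart + i] & 0xff) - (b[bStart + i] & 0xff);
            if (diff != 0) return diff;
        }
        // The full loop above runs even when the lengths already differ.
        return aLen - bLen;
    }

    // Tighter equality: differing lengths can never be equal, so bail out first.
    public static boolean equals(byte[] a, int aStart, int aLen,
                                 byte[] b, int bStart, int bLen) {
        if (aLen != bLen) return false;
        for (int i = 0; i < aLen; i++) {
            if (a[aStart + i] != b[bStart + i]) return false;
        }
        return true;
    }
}
```

For an equality predicate, the length check alone discards most candidate rows without touching their bytes.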
[jira] [Commented] (HIVE-11406) Vectorization: StringExpr::compare() == 0 is bad for performance
[ https://issues.apache.org/jira/browse/HIVE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661016#comment-14661016 ] Gopal V commented on HIVE-11406: I see Matt's fix on both branches now. Git makes it too easy to commit a fix, but not push it. Vectorization: StringExpr::compare() == 0 is bad for performance Key: HIVE-11406 URL: https://issues.apache.org/jira/browse/HIVE-11406 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Matt McCline Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11406.01.patch {{StringExpr::compare() == 0}} is forced to evaluate the whole memory comparison loop for differing lengths of strings, though there is no possibility they will ever be equal. Add a {{StringExpr::equals}} which can be a smaller and tighter loop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11493) Predicate with integer column equals double evaluates to false
[ https://issues.apache.org/jira/browse/HIVE-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661152#comment-14661152 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-11493: -- The change looks fine. +1 pending tests. Predicate with integer column equals double evaluates to false -- Key: HIVE-11493 URL: https://issues.apache.org/jira/browse/HIVE-11493 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Pengcheng Xiong Priority: Blocker Attachments: HIVE-11493.01.patch Filters with an integer column equal to a double constant evaluate to false every time. A negative double constant works fine.
{code:title=explain select * from orc_ppd where t = 10.0;}
OK
Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_2]
      outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
      Filter Operator [FIL_1]
        predicate:false (type: boolean)
        TableScan [TS_0]
          alias:orc_ppd
{code}
{code:title=explain select * from orc_ppd where t = -10.0;}
OK
Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_2]
      outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
      Filter Operator [FIL_1]
        predicate:(t = (- 10.0)) (type: boolean)
        TableScan [TS_0]
          alias:orc_ppd
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11476) TypeInfoParser cannot handle column names with spaces in them
[ https://issues.apache.org/jira/browse/HIVE-11476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660960#comment-14660960 ] Hive QA commented on HIVE-11476: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748943/HIVE-11476.1.patch {color:green}SUCCESS:{color} +1 9326 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4850/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4850/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4850/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12748943 - PreCommit-HIVE-TRUNK-Build TypeInfoParser cannot handle column names with spaces in them - Key: HIVE-11476 URL: https://issues.apache.org/jira/browse/HIVE-11476 Project: Hive Issue Type: Bug Components: Types Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Attachments: HIVE-11476.1.patch When using column names which contain escaped spaces in them like `user id`, the type info parser is unable to parse out the structures which have a format similar to struct<user id:int,user group: int> {code} java.lang.IllegalArgumentException: Error: : expected at the position 12 of '' but 'struct<user id:int,user group: int>' is found. at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.expect(TypeInfoUtils.java:360) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
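As an illustration of the parsing problem above, here is a minimal, hypothetical Java sketch of a struct type-string scanner that tolerates spaces in field names by splitting each field on its first ':'. It is not the real {{TypeInfoUtils}} parser and handles only flat structs with no nested types.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: parse "struct<name:type,...>" into name -> type,
// allowing field names such as "user id" to contain spaces.
public class TypeStringSketch {
    public static Map<String, String> parseStruct(String s) {
        if (!s.startsWith("struct<") || !s.endsWith(">")) {
            throw new IllegalArgumentException("not a struct type: " + s);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        String body = s.substring("struct<".length(), s.length() - 1);
        for (String field : body.split(",")) {
            // The name may hold spaces, so split on the FIRST ':' only.
            int colon = field.indexOf(':');
            fields.put(field.substring(0, colon).trim(),
                       field.substring(colon + 1).trim());
        }
        return fields;
    }
}
```

A tokenizer that consumes a name until the next ':' (rather than the next whitespace) is what lets `user id` survive; a nesting-aware parser would additionally need to track `<`/`>` depth.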
[jira] [Commented] (HIVE-11493) Predicate with integer column equals double evaluates to false
[ https://issues.apache.org/jira/browse/HIVE-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661125#comment-14661125 ] Pengcheng Xiong commented on HIVE-11493: [~hsubramaniyan], thanks a lot for your comments. Actually we do not need to cast the value, it is already there. We just need to change {code} if (triedDouble || {code} to {code} if (triedDouble && {code} Predicate with integer column equals double evaluates to false -- Key: HIVE-11493 URL: https://issues.apache.org/jira/browse/HIVE-11493 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Pengcheng Xiong Priority: Blocker Filters with an integer column equal to a double constant evaluate to false every time. A negative double constant works fine.
{code:title=explain select * from orc_ppd where t = 10.0;}
OK
Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_2]
      outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
      Filter Operator [FIL_1]
        predicate:false (type: boolean)
        TableScan [TS_0]
          alias:orc_ppd
{code}
{code:title=explain select * from orc_ppd where t = -10.0;}
OK
Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_2]
      outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
      Filter Operator [FIL_1]
        predicate:(t = (- 10.0)) (type: boolean)
        TableScan [TS_0]
          alias:orc_ppd
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
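The decision being discussed can be sketched with a hypothetical helper (not the actual Hive constant-folding code): an integral double literal such as 10.0 can be narrowed and compared against an integer column, and only a non-integral literal makes the equality statically false.

```java
// Hypothetical sketch of the int-column-vs-double-constant decision:
// narrow the constant when it is integral, otherwise report that
// "t = <value>" can never hold for an integer column.
public class IntDoublePredicateSketch {
    // Returns the narrowed long constant, or null when equality is impossible.
    public static Long narrowForEquality(double value) {
        long narrowed = (long) value;
        // 10.0 round-trips exactly; 10.5 does not, so equality is always false.
        return (double) narrowed == value ? narrowed : null;
    }
}
```

Under this sketch, `t = 10.0` and `t = -10.0` both narrow successfully, which is exactly the symmetry the original predicate lost.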
[jira] [Resolved] (HIVE-11381) QTest combine2_hadoop20.q fails when using -Phadoop-1 profile due to HIVE-11139
[ https://issues.apache.org/jira/browse/HIVE-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang resolved HIVE-11381. Resolution: Won't Fix I looked into it again. It is actually not an issue upstream. This test is ignored. Thanks. QTest combine2_hadoop20.q fails when using -Phadoop-1 profile due to HIVE-11139 --- Key: HIVE-11381 URL: https://issues.apache.org/jira/browse/HIVE-11381 Project: Hive Issue Type: Bug Affects Versions: 1.3.0 Reporter: Sergio Peña Assignee: Sergio Peña The q-test {{combine2_hadoop20.q}} is failing when running the -Phadoop-1 profile tests. The test output is different due to the changes added in HIVE-11139 for more lineage information. Based on other HIVE-11139 tests, this test output only needs to be regenerated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11451) SemanticAnalyzer throws IndexOutOfBounds Exception
[ https://issues.apache.org/jira/browse/HIVE-11451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11451: Attachment: (was: HIVE-11451.patch) SemanticAnalyzer throws IndexOutOfBounds Exception -- Key: HIVE-11451 URL: https://issues.apache.org/jira/browse/HIVE-11451 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Aihua Xu Priority: Critical Attachments: HIVE-11451.patch Following queries throw IndexOutOfBoundsException in SemanticAnalyzer {code:title=Queries|borderStyle=solid} CREATE TABLE staging(t tinyint, si smallint, i int, b bigint, f float, d double, bo boolean, s string, ts timestamp, dec decimal(4,2), bin binary) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '../../data/files/over1k' OVERWRITE INTO TABLE staging; CREATE TABLE orc_ppd(t tinyint, si smallint, i int, b bigint, f float, d double, bo boolean, s string, c char(50), v varchar(50), da date, ts timestamp, dec decimal(4,2), bin binary) STORED AS ORC tblproperties(orc.row.index.stride = 1000); insert overwrite table orc_ppd select si, i, b, f, d, bo, s, cast(s as char(50)), cast(s as varchar(50)), cast(ts as date), ts, dec, bin from staging; {code} {code:title=StackTrace|borderStyle=solid} java.lang.IndexOutOfBoundsException: Index: 13, Size: 13 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genConversionSelectOperator(SemanticAnalyzer.java:6754) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6543) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8989) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8880) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9730) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9623) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10115) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:330) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10126) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:209) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:240) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:310) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1139) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1068) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1058) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11451) SemanticAnalyzer throws IndexOutOfBounds Exception
[ https://issues.apache.org/jira/browse/HIVE-11451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11451: Attachment: HIVE-11451.patch SemanticAnalyzer throws IndexOutOfBounds Exception -- Key: HIVE-11451 URL: https://issues.apache.org/jira/browse/HIVE-11451 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Aihua Xu Priority: Critical Attachments: HIVE-11451.patch, HIVE-11451.patch Following queries throw IndexOutOfBoundsException in SemanticAnalyzer {code:title=Queries|borderStyle=solid} CREATE TABLE staging(t tinyint, si smallint, i int, b bigint, f float, d double, bo boolean, s string, ts timestamp, dec decimal(4,2), bin binary) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '../../data/files/over1k' OVERWRITE INTO TABLE staging; CREATE TABLE orc_ppd(t tinyint, si smallint, i int, b bigint, f float, d double, bo boolean, s string, c char(50), v varchar(50), da date, ts timestamp, dec decimal(4,2), bin binary) STORED AS ORC tblproperties(orc.row.index.stride = 1000); insert overwrite table orc_ppd select si, i, b, f, d, bo, s, cast(s as char(50)), cast(s as varchar(50)), cast(ts as date), ts, dec, bin from staging; {code} {code:title=StackTrace|borderStyle=solid} java.lang.IndexOutOfBoundsException: Index: 13, Size: 13 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genConversionSelectOperator(SemanticAnalyzer.java:6754) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6543) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8989) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8880) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9730) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9623) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10115) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:330) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10126) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:209) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:240) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:422) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:310) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1139) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1068) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1058) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11441) No DDL allowed on table if user accidentally set table location wrong
[ https://issues.apache.org/jira/browse/HIVE-11441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-11441: -- Attachment: HIVE-11441.2.patch Fix a compilation error under Hadoop 1. No DDL allowed on table if user accidentally set table location wrong - Key: HIVE-11441 URL: https://issues.apache.org/jira/browse/HIVE-11441 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11441.1.patch, HIVE-11441.2.patch If user makes a mistake, hive should either correct it in the first place, or allow user a chance to correct it. STEPS TO REPRODUCE: create table testwrongloc(id int); alter table testwrongloc set location hdfs://a-valid-hostname/tmp/testwrongloc; --at this time, hive should throw error, as hdfs://a-valid-hostname is not a valid path, it either needs to be hdfs://namenode-hostname:8020/ or hdfs://hdfs-nameservice for HA alter table testwrongloc set location hdfs://correct-host:8020/tmp/testwrongloc or drop table testwrongloc; upon this hive throws error, that host 'a-valid-hostname' is not reachable {code} 2015-07-30 12:19:43,573 DEBUG [main]: transport.TSaslTransport (TSaslTransport.java:readFrame(429)) - CLIENT: reading data length: 293 2015-07-30 12:19:43,720 ERROR [main]: ql.Driver (SessionState.java:printError(833)) - FAILED: SemanticException Unable to fetch table testloc. java.net.ConnectException: Call From hdpsecb02.secb.hwxsup.com/172.25.16.178 to hdpsecb02.secb.hwxsup.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused org.apache.hadoop.hive.ql.parse.SemanticException: Unable to fetch table testloc. 
java.net.ConnectException: Call From hdpsecb02.secb.hwxsup.com/172.25.16.178 to hdpsecb02.secb.hwxsup.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1323) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1309) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.addInputsOutputsAlterTable(DDLSemanticAnalyzer.java:1387) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableLocation(DDLSemanticAnalyzer.java:1452) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:295) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:417) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1069) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1131) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1006) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:996) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at 
org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table testloc. java.net.ConnectException: Call From hdpsecb02.secb.hwxsup.com/172.25.16.178 to hdpsecb02.secb.hwxsup.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1072) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1019) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getTable(BaseSemanticAnalyzer.java:1316) ... 23 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11493) Predicate with integer column equals double evaluates to false
[ https://issues.apache.org/jira/browse/HIVE-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11493: --- Attachment: HIVE-11493.01.patch Predicate with integer column equals double evaluates to false -- Key: HIVE-11493 URL: https://issues.apache.org/jira/browse/HIVE-11493 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Pengcheng Xiong Priority: Blocker Attachments: HIVE-11493.01.patch Filters with an integer column equal to a double constant evaluate to false every time. A negative double constant works fine.
{code:title=explain select * from orc_ppd where t = 10.0;}
OK
Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_2]
      outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
      Filter Operator [FIL_1]
        predicate:false (type: boolean)
        TableScan [TS_0]
          alias:orc_ppd
{code}
{code:title=explain select * from orc_ppd where t = -10.0;}
OK
Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_2]
      outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
      Filter Operator [FIL_1]
        predicate:(t = (- 10.0)) (type: boolean)
        TableScan [TS_0]
          alias:orc_ppd
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11340) Create ORC based table using like clause doesn't copy compression property
[ https://issues.apache.org/jira/browse/HIVE-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661151#comment-14661151 ] Yongzhi Chen commented on HIVE-11340: - The 3 failures are not related. The 2 Minimr failures both have the error: [Error 30017]: Skipping stats aggregation by error org.apache.hadoop.hive.ql.metadata.HiveException: [Error 30015]: Stats aggregator of type counter cannot be connected to. It may relate to a network issue. I tested them on my local machine; they all passed. The testSSLConnectionWithProperty failure ages 5. [~csun], could you review the patch? Thanks Create ORC based table using like clause doesn't copy compression property -- Key: HIVE-11340 URL: https://issues.apache.org/jira/browse/HIVE-11340 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.14.0, 1.0.0, 1.2.0 Reporter: Gaurav Kohli Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-11340.1.patch I found an issue in the “create table like” clause: it does not copy the table properties from an ORC file format based table. Steps to reproduce: Step 1: create table orc_table (time string) stored as ORC tblproperties ("orc.compress"="SNAPPY"); Step 2: create table orc_table_using_like like orc_table; Step 3: show create table orc_table_using_like; Result: createtab_stmt CREATE TABLE `orc_table_using_like`( `time` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 'hdfs://nameservice1/user/hive/warehouse/gkohli.db/orc_table_using_like' TBLPROPERTIES ( 'transient_lastDdlTime'='1437578939') Issue: the 'orc.compress'='SNAPPY' property is missing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
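The intended behavior can be sketched with a hypothetical helper (not Hive's actual CREATE TABLE ... LIKE implementation): when cloning a table definition, copy the source's table properties but drop instance-specific bookkeeping such as {{transient_lastDdlTime}}.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of property handling for CREATE TABLE ... LIKE:
// keep format-bearing properties (e.g. orc.compress), drop per-table
// bookkeeping. The "transient_" prefix rule is illustrative.
public class CreateLikeSketch {
    public static Map<String, String> propertiesForLike(Map<String, String> source) {
        Map<String, String> copied = new HashMap<>();
        for (Map.Entry<String, String> e : source.entrySet()) {
            if (!e.getKey().startsWith("transient_")) {
                copied.put(e.getKey(), e.getValue());
            }
        }
        return copied;
    }
}
```

Applied to the reproduction above, the new table would carry `'orc.compress'='SNAPPY'` but get a fresh `transient_lastDdlTime`.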
[jira] [Commented] (HIVE-11493) Predicate with integer column equals double evaluates to false
[ https://issues.apache.org/jira/browse/HIVE-11493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661158#comment-14661158 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-11493: -- nit: you could actually change the comment below to reflect what you are actually doing.
{code}
// however, if we already tried this, or the column is NUMBER type and
// the operator is EQUAL, return false due to the type mismatch
{code}
Predicate with integer column equals double evaluates to false -- Key: HIVE-11493 URL: https://issues.apache.org/jira/browse/HIVE-11493 Project: Hive Issue Type: Bug Affects Versions: 2.0.0 Reporter: Prasanth Jayachandran Assignee: Pengcheng Xiong Priority: Blocker Attachments: HIVE-11493.01.patch Filters with an integer column equal to a double constant evaluate to false every time. A negative double constant works fine.
{code:title=explain select * from orc_ppd where t = 10.0;}
OK
Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_2]
      outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
      Filter Operator [FIL_1]
        predicate:false (type: boolean)
        TableScan [TS_0]
          alias:orc_ppd
{code}
{code:title=explain select * from orc_ppd where t = -10.0;}
OK
Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_2]
      outputColumnNames:[_col0,_col1,_col2,_col3,_col4,_col5,_col6,_col7,_col8,_col9,_col10,_col11,_col12,_col13]
      Filter Operator [FIL_1]
        predicate:(t = (- 10.0)) (type: boolean)
        TableScan [TS_0]
          alias:orc_ppd
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11436) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char
[ https://issues.apache.org/jira/browse/HIVE-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660963#comment-14660963 ] Hive QA commented on HIVE-11436: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748959/HIVE-11436.03.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4851/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4851/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4851/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4851/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + 
[[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 3fe7e44 HIVE-11432 : Hive macro give same result for different arguments (Pengcheng Xiong, reviewed by Hari Subramaniyan) + git clean -f -d + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at 3fe7e44 HIVE-11432 : Hive macro give same result for different arguments (Pengcheng Xiong, reviewed by Hari Subramaniyan) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12748959 - PreCommit-HIVE-TRUNK-Build CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char -- Key: HIVE-11436 URL: https://issues.apache.org/jira/browse/HIVE-11436 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11436.01.patch, HIVE-11436.02.patch, HIVE-11436.03.patch BaseCharUtils checks whether the length of a char is in between [1,255]. This causes the return path to throw an error when the length of a char is 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11496) Better tests for evaluating ORC predicate pushdown
[ https://issues.apache.org/jira/browse/HIVE-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-11496: - Attachment: HIVE-11496.1.patch Better tests for evaluating ORC predicate pushdown -- Key: HIVE-11496 URL: https://issues.apache.org/jira/browse/HIVE-11496 Project: Hive Issue Type: Improvement Affects Versions: 1.3.0, 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-11496.1.patch There were many regressions recently wrt ORC predicate pushdown. We don't have system tests to capture these regressions. Currently there are only junit tests for the ORC predicate pushdown feature. Since hive counters are not available during qfile test execution, there is no easy way to verify whether the ORC PPD feature worked or not. This jira is to add a post-execution hook to print hive counters (esp. number of input records) to the error stream so that they will appear in qfile test output. This way we can verify ORC SARG evaluation and avoid future regressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
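The counter-printing idea above can be sketched as follows. The class and output format are hypothetical (Hive's real post-execution hooks implement a different interface): emit each counter on the error stream in a stable one-line format so it lands in the qfile test output and can be compared across runs.

```java
import java.util.Map;

// Hypothetical sketch of a post-execution hook body: dump counters to
// stderr so qfile tests capture them. Names and format are illustrative.
public class CounterLoggingHookSketch {
    // Stable one-line format so the value is easy to diff in .q.out files.
    public static String format(String name, long value) {
        return "HIVE COUNTER " + name + "=" + value;
    }

    // Print every counter (e.g. number of input records) to the error stream.
    public static void printCounters(Map<String, Long> counters) {
        for (Map.Entry<String, Long> e : counters.entrySet()) {
            System.err.println(format(e.getKey(), e.getValue()));
        }
    }
}
```

With the input-record counter printed this way, a qfile test can show whether a SARG actually pruned row groups, rather than only whether the query result was correct.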
[jira] [Updated] (HIVE-11436) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char
[ https://issues.apache.org/jira/browse/HIVE-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11436: --- Attachment: HIVE-11436.04.patch rebase and resubmit the patch CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char -- Key: HIVE-11436 URL: https://issues.apache.org/jira/browse/HIVE-11436 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11436.01.patch, HIVE-11436.02.patch, HIVE-11436.03.patch, HIVE-11436.04.patch BaseCharUtils checks whether the length of a char is in between [1,255]. This causes the return path to throw an error when the length of a char is 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11436) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char
[ https://issues.apache.org/jira/browse/HIVE-11436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660982#comment-14660982 ] Pengcheng Xiong commented on HIVE-11436: [~jcamachorodriguez], could you take a look? Thanks. CBO: Calcite Operator To Hive Operator (Calcite Return Path) : dealing with empty char -- Key: HIVE-11436 URL: https://issues.apache.org/jira/browse/HIVE-11436 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11436.01.patch, HIVE-11436.02.patch, HIVE-11436.03.patch, HIVE-11436.04.patch BaseCharUtils checks whether the length of a char is in between [1,255]. This causes the return path to throw an error when the length of a char is 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11406) Vectorization: StringExpr::compare() == 0 is bad for performance
[ https://issues.apache.org/jira/browse/HIVE-11406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-11406: --- Fix Version/s: 2.0.0 1.3.0 Vectorization: StringExpr::compare() == 0 is bad for performance Key: HIVE-11406 URL: https://issues.apache.org/jira/browse/HIVE-11406 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Matt McCline Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11406.01.patch {{StringExpr::compare() == 0}} is forced to evaluate the whole memory comparison loop for differing lengths of strings, though there is no possibility they will ever be equal. Add a {{StringExpr::equals}} which can be a smaller and tighter loop. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
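The performance argument in HIVE-11406 is easy to see in miniature. The sketch below is illustrative only — it is not the actual Hive StringExpr code: compare() must scan bytes to produce a lexicographic ordering, while a dedicated equals() can reject mismatched lengths before reading any memory and then run a smaller, tighter loop.

```java
// Sketch only: why equals() beats compare() == 0 for equality checks on
// byte ranges. Class and method bodies are illustrative, not Hive's code.
public class StringExprSketch {
    // compare() must scan the common prefix to establish an ordering,
    // even when the lengths already prove the values cannot be equal.
    public static int compare(byte[] a, int aLen, byte[] b, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return aLen - bLen;
    }

    // equals() bails out immediately on a length mismatch and needs no
    // sign bookkeeping inside the loop, so it compiles to tighter code.
    public static boolean equals(byte[] a, int aLen, byte[] b, int bLen) {
        if (aLen != bLen) {
            return false;
        }
        for (int i = 0; i < aLen; i++) {
            if (a[i] != b[i]) {
                return false;
            }
        }
        return true;
    }
}
```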
[jira] [Commented] (HIVE-11496) Better tests for evaluating ORC predicate pushdown
[ https://issues.apache.org/jira/browse/HIVE-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660973#comment-14660973 ] Prasanth Jayachandran commented on HIVE-11496: -- This patch adds only limited basic tests. I will add more tests after HIVE-11312, HIVE-11477, HIVE-11493 and HIVE-11494 are fixed. Better tests for evaluating ORC predicate pushdown -- Key: HIVE-11496 URL: https://issues.apache.org/jira/browse/HIVE-11496 Project: Hive Issue Type: Improvement Affects Versions: 1.3.0, 2.0.0 Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-11496.1.patch There were many regressions recently wrt ORC predicate pushdown. We don't have system tests to capture these regressions. Currently there are only junit tests for the ORC predicate pushdown feature. Since hive counters are not available during qfile test execution, there is no easy way to verify whether the ORC PPD feature worked or not. This jira is to add a post execution hook to print hive counters (esp. number of input records) to the error stream so that they will appear in qfile test output. This way we can verify ORC SARG evaluation and avoid future regressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11442) Remove commons-configuration.jar from Hive distribution
[ https://issues.apache.org/jira/browse/HIVE-11442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-11442: -- Attachment: HIVE-11442.2.patch Rerun precommit test. Remove commons-configuration.jar from Hive distribution --- Key: HIVE-11442 URL: https://issues.apache.org/jira/browse/HIVE-11442 Project: Hive Issue Type: Improvement Components: Build Infrastructure Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11442.1.patch, HIVE-11442.2.patch Some customers reported a version conflict with the Hive-bundled commons-configuration.jar. Actually commons-configuration.jar is not needed by Hive; it is a transitive dependency of Hadoop/Accumulo. Users should be able to pick up those jars from Hadoop at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11385) LLAP: clean up ORC dependencies - move encoded reader path into a cloned ReaderImpl
[ https://issues.apache.org/jira/browse/HIVE-11385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11385: Attachment: HIVE-11385.01.patch Moved the classes to storage-api that we recently integrated from master LLAP: clean up ORC dependencies - move encoded reader path into a cloned ReaderImpl --- Key: HIVE-11385 URL: https://issues.apache.org/jira/browse/HIVE-11385 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-11385.01.patch, HIVE-11385.patch Before there's storage handler module, we can clean some things up NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3725) Add support for pulling HBase columns with prefixes
[ https://issues.apache.org/jira/browse/HIVE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659550#comment-14659550 ] Lefty Leverenz commented on HIVE-3725: -- Doc note: This still needs documentation. See the Hive column mapping to hbase thread in the d...@hive.apache.org mailing list. * [Re: Hive column mapping to hbase | http://mail-archives.apache.org/mod_mbox/hive-dev/201508.mbox/%3ccahnpetsyyitxxw5iptufbmjvqecnawr04jx5+3a+vtf0mqp...@mail.gmail.com%3e] Add support for pulling HBase columns with prefixes --- Key: HIVE-3725 URL: https://issues.apache.org/jira/browse/HIVE-3725 Project: Hive Issue Type: Improvement Components: HBase Handler Affects Versions: 0.9.0 Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Labels: TODOC12 Fix For: 0.12.0 Attachments: HIVE-3725.1.patch.txt, HIVE-3725.2.patch.txt, HIVE-3725.3.patch.txt, HIVE-3725.4.patch.txt, HIVE-3725.patch.3.txt Current HBase Hive integration supports reading many values from the same row by specifying a column family. And specifying just the column family can pull in all qualifiers within the family. We should add in support to be able to specify a prefix for the qualifier and all columns that start with the prefix would automatically get pulled in. A wildcard support would be ideal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4570) More information to user on GetOperationStatus in Hive Server2 when query is still executing
[ https://issues.apache.org/jira/browse/HIVE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659856#comment-14659856 ] Amareshwari Sriramadasu commented on HIVE-4570: --- Here is the proposed modified thrift structure: {noformat}
struct TGetOperationStatusResp {
  1: required TStatus status
  2: optional TOperationState operationState

  // List of statuses of sub tasks
  3: optional string taskStatus

  // If operationState is ERROR_STATE, then the following fields may be set
  // sqlState as defined in the ISO/IEF CLI specification
  4: optional string sqlState

  // Internal error code
  5: optional i32 errorCode

  // Error message
  6: optional string errorMessage

  // When was the operation started
  7: optional i64 operationStarted

  // When was the operation completed
  8: optional i64 operationCompleted
}
{noformat} Here are a few commits in our organization's forked repo, which can be picked up as a patch here: https://github.com/inmobi/hive/commit/8eb3fd4a799157b1634876490c19061e257e83fd https://github.com/InMobi/hive/commit/99475a9ed0dc840dd5445dcf100cd7abf322afc1 https://github.com/InMobi/hive/commit/85bf27311baaa4f83d928a39b44a1a182671b66f More information to user on GetOperationStatus in Hive Server2 when query is still executing Key: HIVE-4570 URL: https://issues.apache.org/jira/browse/HIVE-4570 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Amareshwari Sriramadasu Currently in Hive Server2, when the query is still executing only the status is set as STILL_EXECUTING. This issue is to give more information to the user such as progress and running job handles, if possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11461) Transform flat AND/OR into IN struct clause
[ https://issues.apache.org/jira/browse/HIVE-11461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11461: --- Attachment: HIVE-11461.1.patch Transform flat AND/OR into IN struct clause --- Key: HIVE-11461 URL: https://issues.apache.org/jira/browse/HIVE-11461 Project: Hive Issue Type: Bug Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11461.1.patch, HIVE-11461.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11340) Create ORC based table using like clause doesn't copy compression property
[ https://issues.apache.org/jira/browse/HIVE-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11340: Affects Version/s: 0.14.0 1.0.0 Create ORC based table using like clause doesn't copy compression property -- Key: HIVE-11340 URL: https://issues.apache.org/jira/browse/HIVE-11340 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.14.0, 1.0.0 Reporter: Gaurav Kohli Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-11340.1.patch I found an issue in the “create table like” clause: it does not copy the table properties from an ORC file format based table. Steps to reproduce: Step 1: create table orc_table (time string) stored as ORC tblproperties ('orc.compress'='SNAPPY'); Step 2: create table orc_table_using_like like orc_table; Step 3: show create table orc_table_using_like; Result: createtab_stmt CREATE TABLE `orc_table_using_like`( `time` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 'hdfs://nameservice1/user/hive/warehouse/gkohli.db/orc_table_using_like' TBLPROPERTIES ( 'transient_lastDdlTime'='1437578939') Issue: the 'orc.compress'='SNAPPY' property is missing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11250) Change in spark.executor.instances (and others) doesn't take effect after RSC is launched for HS2 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659920#comment-14659920 ] Hive QA commented on HIVE-11250: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748916/HIVE-11250.1.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9324 tests executed *Failed tests:* {noformat} org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4844/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4844/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4844/ Messages: {noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat} This message is automatically generated. ATTACHMENT ID: 12748916 - PreCommit-HIVE-TRUNK-Build Change in spark.executor.instances (and others) doesn't take effect after RSC is launched for HS2 [Spark Branch] Key: HIVE-11250 URL: https://issues.apache.org/jira/browse/HIVE-11250 Project: Hive Issue Type: Bug Components: Spark Affects Versions: 1.1.0 Reporter: Xuefu Zhang Assignee: Jimmy Xiang Fix For: 1.3.0, 2.0.0 Attachments: HIVE-11250.1.patch, HIVE-11250.1.patch, HIVE-11250.1.patch Hive CLI works as expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11461) Transform flat AND/OR into IN struct clause
[ https://issues.apache.org/jira/browse/HIVE-11461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659975#comment-14659975 ] Jesus Camacho Rodriguez commented on HIVE-11461: [~gopalv], new patch should make the transformation quicker. Transform flat AND/OR into IN struct clause --- Key: HIVE-11461 URL: https://issues.apache.org/jira/browse/HIVE-11461 Project: Hive Issue Type: Bug Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-11461.1.patch, HIVE-11461.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11340) Create ORC based table using like clause doesn't copy compression property
[ https://issues.apache.org/jira/browse/HIVE-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11340: Affects Version/s: 1.2.0 Create ORC based table using like clause doesn't copy compression property -- Key: HIVE-11340 URL: https://issues.apache.org/jira/browse/HIVE-11340 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.14.0, 1.0.0, 1.2.0 Reporter: Gaurav Kohli Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-11340.1.patch I found an issue in the “create table like” clause: it does not copy the table properties from an ORC file format based table. Steps to reproduce: Step 1: create table orc_table (time string) stored as ORC tblproperties ('orc.compress'='SNAPPY'); Step 2: create table orc_table_using_like like orc_table; Step 3: show create table orc_table_using_like; Result: createtab_stmt CREATE TABLE `orc_table_using_like`( `time` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 'hdfs://nameservice1/user/hive/warehouse/gkohli.db/orc_table_using_like' TBLPROPERTIES ( 'transient_lastDdlTime'='1437578939') Issue: the 'orc.compress'='SNAPPY' property is missing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11340) Create ORC based table using like clause doesn't copy compression property
[ https://issues.apache.org/jira/browse/HIVE-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659866#comment-14659866 ] Yongzhi Chen commented on HIVE-11340: - Not sure why the build has not run yet; is the priority too low? Adding more comments to make the change easier to understand: the fix follows the same pattern as the patch for HIVE-8450. Create ORC based table using like clause doesn't copy compression property -- Key: HIVE-11340 URL: https://issues.apache.org/jira/browse/HIVE-11340 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.14.0, 1.0.0, 1.2.0 Reporter: Gaurav Kohli Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-11340.1.patch I found an issue in the “create table like” clause: it does not copy the table properties from an ORC file format based table. Steps to reproduce: Step 1: create table orc_table (time string) stored as ORC tblproperties ('orc.compress'='SNAPPY'); Step 2: create table orc_table_using_like like orc_table; Step 3: show create table orc_table_using_like; Result: createtab_stmt CREATE TABLE `orc_table_using_like`( `time` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 'hdfs://nameservice1/user/hive/warehouse/gkohli.db/orc_table_using_like' TBLPROPERTIES ( 'transient_lastDdlTime'='1437578939') Issue: the 'orc.compress'='SNAPPY' property is missing -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11484) Fix ObjectInspector for Char and VarChar
[ https://issues.apache.org/jira/browse/HIVE-11484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659891#comment-14659891 ] Amareshwari Sriramadasu commented on HIVE-11484: Here is another commit - https://github.com/InMobi/hive/commit/d7b1916da379b5a310639d479604786b05499cb2 Fix ObjectInspector for Char and VarChar Key: HIVE-11484 URL: https://issues.apache.org/jira/browse/HIVE-11484 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Amareshwari Sriramadasu The creation of HiveChar and Varchar is not happening through ObjectInspector. Here is fix we pushed internally : https://github.com/InMobi/hive/commit/fe95c7850e7130448209141155f28b25d3504216 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11491) Lazily call ASTNode::toStringTree() after tree modification
[ https://issues.apache.org/jira/browse/HIVE-11491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan resolved HIVE-11491. -- Resolution: Duplicate Accidentally created this one, this is the same as HIVE-11490. Lazily call ASTNode::toStringTree() after tree modification --- Key: HIVE-11491 URL: https://issues.apache.org/jira/browse/HIVE-11491 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Currently, we call toStringTree() as part of HIVE-11316 everytime the tree is modified. This is a bad approach as we can lazily delay this to the point when toStringTree() is called again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11448) Support vectorization of Multi-OR and Multi-AND
[ https://issues.apache.org/jira/browse/HIVE-11448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660569#comment-14660569 ] Gunther Hagleitner commented on HIVE-11448: --- [~mmccline] looks like there are a couple of plan changes now. Are those real issues or just need golden update? (cc [~gopalv]) Support vectorization of Multi-OR and Multi-AND --- Key: HIVE-11448 URL: https://issues.apache.org/jira/browse/HIVE-11448 Project: Hive Issue Type: Bug Components: Hive Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-11448.01.patch, HIVE-11448.02.patch, HIVE-11448.03.patch Support more than 2 children for OR and AND when all children are expressions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11480) CBO: Calcite Operator To Hive Operator (Calcite Return Path): char/varchar as input to GenericUDAF
[ https://issues.apache.org/jira/browse/HIVE-11480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11480: --- Attachment: HIVE-11480.01.patch CBO: Calcite Operator To Hive Operator (Calcite Return Path): char/varchar as input to GenericUDAF --- Key: HIVE-11480 URL: https://issues.apache.org/jira/browse/HIVE-11480 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11480.01.patch Some of the UDAFs cannot deal with char/varchar correctly when the return path is on, for example udaf_number_format.q. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11480) CBO: Calcite Operator To Hive Operator (Calcite Return Path): char/varchar as input to GenericUDAF
[ https://issues.apache.org/jira/browse/HIVE-11480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660333#comment-14660333 ] Pengcheng Xiong commented on HIVE-11480: [~jcamachorodriguez], could you review the patch? Thanks. CBO: Calcite Operator To Hive Operator (Calcite Return Path): char/varchar as input to GenericUDAF --- Key: HIVE-11480 URL: https://issues.apache.org/jira/browse/HIVE-11480 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11480.01.patch Some of the UDAFs cannot deal with char/varchar correctly when the return path is on, for example udaf_number_format.q. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11087) DbTxnManager exceptions should include txnid
[ https://issues.apache.org/jira/browse/HIVE-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660474#comment-14660474 ] Alan Gates commented on HIVE-11087: --- +1 DbTxnManager exceptions should include txnid Key: HIVE-11087 URL: https://issues.apache.org/jira/browse/HIVE-11087 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-11087.3.patch must include txnid in the exception so that user visible error can be correlated with log file info -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11428) Performance: Struct IN() clauses are extremely slow (~10x slower)
[ https://issues.apache.org/jira/browse/HIVE-11428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660473#comment-14660473 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-11428: -- The above test failure looks unrelated to the changes. Thanks Hari Performance: Struct IN() clauses are extremely slow (~10x slower) -- Key: HIVE-11428 URL: https://issues.apache.org/jira/browse/HIVE-11428 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.3.0, 2.0.0 Reporter: Gopal V Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-11428.1.patch, HIVE-11428.2.patch Hive today does not support tuple IN() clauses today, but provides a way to rewrite (a,b) IN (...) using complex types. select * from table where STRUCT(a,b) IN (STRUCT(1,2), STRUCT(2,3) ...); This would be fine, except it is massively slower due to ObjectConvertors and Struct constructor not being constant folded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
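A minimal sketch of the constant-folding point made in HIVE-11428 (hypothetical names; this is not Hive's ObjectConvertors code or the eventual fix): if the constant STRUCT list is rebuilt and re-converted for every input row, that per-row setup dwarfs the membership test itself; folding the constants into a set once up front amortizes the work across all rows.

```java
// Illustrative sketch: per-row reconstruction of the constant list vs.
// folding it once. Names are hypothetical, not Hive internals.
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StructInSketch {
    // Slow path: the constant STRUCT list is materialized on every call,
    // i.e. once per input row.
    public static boolean inPerRow(int a, int b, int[][] constants) {
        Set<List<Integer>> set = new HashSet<>();
        for (int[] c : constants) {
            set.add(Arrays.asList(c[0], c[1])); // rebuilt per row
        }
        return set.contains(Arrays.asList(a, b));
    }

    // Folded path: build the constant set once, then reuse it per row.
    public static Set<List<Integer>> fold(int[][] constants) {
        Set<List<Integer>> set = new HashSet<>();
        for (int[] c : constants) {
            set.add(Arrays.asList(c[0], c[1]));
        }
        return set;
    }

    public static boolean inFolded(int a, int b, Set<List<Integer>> folded) {
        return folded.contains(Arrays.asList(a, b));
    }
}
```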
[jira] [Assigned] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions
[ https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu reassigned HIVE-4897: -- Assignee: Aihua Xu Hive should handle AlreadyExists on retries when creating tables/partitions --- Key: HIVE-4897 URL: https://issues.apache.org/jira/browse/HIVE-4897 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Aihua Xu Attachments: hive-snippet.log Creating new tables/partitions may fail with an AlreadyExistsException if there is an error part way through the creation and the HMS tries again without properly cleaning up or checking if this is a retry. While partitioning a new table via a script on distributed hive (MetaStore on the same machine) there was a long timeout and then: {code} FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. AlreadyExistsException(message:Partition already exists:Partition( ... {code} I am assuming this is due to retry. Perhaps already-exists on retry could be handled better. A similar error occurred while creating a table through Impala, which issued a single createTable call that failed with an AlreadyExistsException. See the logs related to table tmp_proc_8_d2b7b0f133be455ca95615818b8a5879_7 in the attached hive-snippet.log -- This message was sent by Atlassian JIRA (v6.3.4#6332)
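One common way to handle the retry scenario described above is sketched below. This is illustrative only — class and method names are hypothetical, not the Hive metastore code: an AlreadyExists raised on a retry is treated as success when the stored definition matches what this caller was trying to create, and remains an error otherwise.

```java
// Sketch of idempotent create-with-retry handling (hypothetical names,
// not the actual HMS implementation).
import java.util.HashMap;
import java.util.Map;

public class IdempotentCreate {
    public static class AlreadyExistsException extends RuntimeException {}

    private final Map<String, String> store = new HashMap<>();

    private void create(String name, String definition) {
        if (store.containsKey(name)) {
            throw new AlreadyExistsException();
        }
        store.put(name, definition);
    }

    // AlreadyExists is only an error if the existing definition differs
    // from ours, i.e. someone else genuinely created a conflicting object.
    public boolean createWithRetry(String name, String definition, int attempts) {
        for (int i = 0; i < attempts; i++) {
            try {
                create(name, definition);
                return true;
            } catch (AlreadyExistsException e) {
                if (definition.equals(store.get(name))) {
                    return true; // our earlier (timed-out) attempt landed
                }
                throw e; // conflicting object created by someone else
            }
        }
        return false;
    }
}
```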
[jira] [Updated] (HIVE-11341) Avoid expensive resizing of ASTNode tree
[ https://issues.apache.org/jira/browse/HIVE-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11341: - Attachment: HIVE-11341.6.patch Avoid expensive resizing of ASTNode tree - Key: HIVE-11341 URL: https://issues.apache.org/jira/browse/HIVE-11341 Project: Hive Issue Type: Bug Components: Hive, Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-11341.1.patch, HIVE-11341.2.patch, HIVE-11341.3.patch, HIVE-11341.4.patch, HIVE-11341.5.patch, HIVE-11341.6.patch {code}
Stack Trace                                                                           Sample Count   Percentage(%)
parse.BaseSemanticAnalyzer.analyze(ASTNode, Context)                                  1,605          90
parse.CalcitePlanner.analyzeInternal(ASTNode)                                         1,605          90
parse.SemanticAnalyzer.analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContext)      1,605          90
parse.CalcitePlanner.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext)              1,604          90
parse.SemanticAnalyzer.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext)            1,604          90
parse.SemanticAnalyzer.genPlan(QB)                                                    1,604          90
parse.SemanticAnalyzer.genPlan(QB, boolean)                                           1,604          90
parse.SemanticAnalyzer.genBodyPlan(QB, Operator, Map)                                 1,604          90
parse.SemanticAnalyzer.genFilterPlan(ASTNode, QB, Operator, Map, boolean)             1,603          90
parse.SemanticAnalyzer.genFilterPlan(QB, ASTNode, Operator, boolean)                  1,603          90
parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, boolean)                 1,603          90
parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx)            1,603          90
parse.SemanticAnalyzer.genAllExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx)         1,603          90
parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx)                         1,603          90
parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx, TypeCheckProcFactory)   1,603          90
lib.DefaultGraphWalker.startWalking(Collection, HashMap)                              1,579          89
lib.DefaultGraphWalker.walk(Node)                                                     1,571          89
java.util.ArrayList.removeAll(Collection)                                             1,433          81
java.util.ArrayList.batchRemove(Collection, boolean)                                  1,433          81
java.util.ArrayList.contains(Object)                                                  1,228          69
java.util.ArrayList.indexOf(Object)                                                   1,228          69
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
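The profile above is dominated by ArrayList.removeAll(), whose batchRemove() performs a linear contains() scan for every element — O(n·m) overall. A sketch of the general remedy (method names are illustrative, not the actual DefaultGraphWalker patch) is to stage the already-visited nodes in a HashSet so each membership test is O(1):

```java
// Illustrative sketch of the removeAll() hotspot and its linear-time
// replacement; not the actual Hive DefaultGraphWalker code.
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WalkSketch {
    // Quadratic: ArrayList.removeAll -> batchRemove -> contains per element.
    public static List<String> pendingQuadratic(List<String> toWalk, Collection<String> done) {
        List<String> out = new ArrayList<>(toWalk);
        out.removeAll(done); // linear scan of 'done' for every element of 'out'
        return out;
    }

    // Linear: one pass to build the set, one pass to filter.
    public static List<String> pendingLinear(List<String> toWalk, Collection<String> done) {
        Set<String> doneSet = new HashSet<>(done); // single O(|done|) pass
        List<String> out = new ArrayList<>();
        for (String node : toWalk) {
            if (!doneSet.contains(node)) { // O(1) membership test
                out.add(node);
            }
        }
        return out;
    }
}
```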
[jira] [Commented] (HIVE-11341) Avoid expensive resizing of ASTNode tree
[ https://issues.apache.org/jira/browse/HIVE-11341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660386#comment-14660386 ] Hive QA commented on HIVE-11341: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12748935/HIVE-11341.5.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4846/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4846/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4846/ Messages: {noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: java.io.IOException: Could not create /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4846/succeeded/TestContribCliDriver
{noformat} This message is automatically generated.
ATTACHMENT ID: 12748935 - PreCommit-HIVE-TRUNK-Build Avoid expensive resizing of ASTNode tree - Key: HIVE-11341 URL: https://issues.apache.org/jira/browse/HIVE-11341 Project: Hive Issue Type: Bug Components: Hive, Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-11341.1.patch, HIVE-11341.2.patch, HIVE-11341.3.patch, HIVE-11341.4.patch, HIVE-11341.5.patch {code}
Stack Trace                                                                           Sample Count   Percentage(%)
parse.BaseSemanticAnalyzer.analyze(ASTNode, Context)                                  1,605          90
parse.CalcitePlanner.analyzeInternal(ASTNode)                                         1,605          90
parse.SemanticAnalyzer.analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContext)      1,605          90
parse.CalcitePlanner.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext)              1,604          90
parse.SemanticAnalyzer.genOPTree(ASTNode, SemanticAnalyzer$PlannerContext)            1,604          90
parse.SemanticAnalyzer.genPlan(QB)                                                    1,604          90
parse.SemanticAnalyzer.genPlan(QB, boolean)                                           1,604          90
parse.SemanticAnalyzer.genBodyPlan(QB, Operator, Map)                                 1,604          90
parse.SemanticAnalyzer.genFilterPlan(ASTNode, QB, Operator, Map, boolean)             1,603          90
parse.SemanticAnalyzer.genFilterPlan(QB, ASTNode, Operator, boolean)                  1,603          90
parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, boolean)                 1,603          90
parse.SemanticAnalyzer.genExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx)            1,603          90
parse.SemanticAnalyzer.genAllExprNodeDesc(ASTNode, RowResolver, TypeCheckCtx)         1,603          90
parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx)                         1,603          90
parse.TypeCheckProcFactory.genExprNode(ASTNode, TypeCheckCtx, TypeCheckProcFactory)   1,603          90
lib.DefaultGraphWalker.startWalking(Collection, HashMap)                              1,579          89
lib.DefaultGraphWalker.walk(Node)                                                     1,571          89
java.util.ArrayList.removeAll(Collection)                                             1,433          81
java.util.ArrayList.batchRemove(Collection, boolean)                                  1,433          81
java.util.ArrayList.contains(Object)                                                  1,228          69
java.util.ArrayList.indexOf(Object)                                                   1,228          69
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-4897) Hive should handle AlreadyExists on retries when creating tables/partitions
[ https://issues.apache.org/jira/browse/HIVE-4897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-4897: --- Attachment: HIVE-4897.patch Hive should handle AlreadyExists on retries when creating tables/partitions --- Key: HIVE-4897 URL: https://issues.apache.org/jira/browse/HIVE-4897 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Aihua Xu Attachments: HIVE-4897.patch, hive-snippet.log Creating new tables/partitions may fail with an AlreadyExistsException if there is an error part way through the creation and the HMS tries again without properly cleaning up or checking if this is a retry. While partitioning a new table via a script on distributed hive (MetaStore on the same machine) there was a long timeout and then: {code} FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. AlreadyExistsException(message:Partition already exists:Partition( ... {code} I am assuming this is due to retry. Perhaps already-exists on retry could be handled better. A similar error occurred while creating a table through Impala, which issued a single createTable call that failed with an AlreadyExistsException. See the logs related to table tmp_proc_8_d2b7b0f133be455ca95615818b8a5879_7 in the attached hive-snippet.log -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11359) Fix alter related Unit tests for HBase metastore
[ https://issues.apache.org/jira/browse/HIVE-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta reassigned HIVE-11359: --- Assignee: Vaibhav Gumashta Fix alter related Unit tests for HBase metastore Key: HIVE-11359 URL: https://issues.apache.org/jira/browse/HIVE-11359 Project: Hive Issue Type: Sub-task Components: HBase Metastore, Metastore Affects Versions: hbase-metastore-branch Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta alter_partition_change_col and alter1.q fail (of the 45 sampled q files; there could be other failures we haven't identified yet). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.
[ https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660089#comment-14660089 ] Xuefu Zhang commented on HIVE-11466: It seems using thrift 0.9.2 for the code generation has caused the problem. I ran the test before and after the HIVE-9152 commit on the Spark branch and can see the behavior difference. Basically, when you run the test above, you will see the following in hive.log after the commit but not before it. {code}
2015-08-06 07:17:08,960 WARN [Thread-17]: server.TThreadPoolServer (TThreadPoolServer.java:serve(206)) - Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: No underlying server socket.
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:126)
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:35)
        at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:60)
        at org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:161)
        at org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:100)
        at java.lang.Thread.run(Thread.java:722)
{code} I don't see anything obviously wrong though. [~csun], could you take a look? We can either find a fix (if that's easy), or simply regenerate the thrift code using 0.9.0. HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk. Key: HIVE-11466 URL: https://issues.apache.org/jira/browse/HIVE-11466 Project: Hive Issue Type: Bug Reporter: Sergio Peña Assignee: Xuefu Zhang An issue with the HIVE-10166 patch is that it increases the size of hive.log, causing jenkins to fail because it runs out of disk space. Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with the patch, and after other commits.
{noformat}
BEFORE HIVE-10166  13M Aug 5 11:57 ./hive-unit/target/tmp/log/hive.log
WITH HIVE-10166   2.4G Aug 5 12:07 ./hive-unit/target/tmp/log/hive.log
CURRENT HEAD      3.2G Aug 5 12:36 ./hive-unit/target/tmp/log/hive.log
{noformat}
This is just a single test, but on Jenkins, hive.log grows to more than 13G. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
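[Editor's note] The flood comes from one Thrift logger (the category behind "server.TThreadPoolServer" in the WARN line above). As an assumed stop-gap only -- this is not part of any attached patch, which instead regenerates the Thrift code -- the logger's threshold could be raised in Hive's log4j properties so the message stops filling hive.log:

```properties
# Hypothetical workaround sketch: silence only the noisy Thrift server logger.
# This hides the repeated WARN; it does not fix the underlying socket behavior.
log4j.logger.org.apache.thrift.server.TThreadPoolServer=ERROR
```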
[jira] [Commented] (HIVE-9152) Dynamic Partition Pruning [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659675#comment-14659675 ] Lefty Leverenz commented on HIVE-9152: -- The doc is done (thanks [~csun]) so I'm removing the TODOC-SPARK and TODOC-1.3 labels.
* [Configuration Properties -- hive.spark.dynamic.partition.pruning | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.spark.dynamic.partition.pruning]
* [Configuration Properties -- hive.spark.dynamic.partition.pruning.max.data.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.spark.dynamic.partition.pruning.max.data.size]

[~sladymon] added some information from the description in the patch, so you might want to review *hive.spark.dynamic.partition.pruning*.

Dynamic Partition Pruning [Spark Branch]
Key: HIVE-9152 URL: https://issues.apache.org/jira/browse/HIVE-9152 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Brock Noland Assignee: Chao Sun Labels: TODOC-SPARK, TODOC1.3 Fix For: spark-branch, 1.3.0, 2.0.0 Attachments: HIVE-9152.1-spark.patch, HIVE-9152.10-spark.patch, HIVE-9152.11-spark.patch, HIVE-9152.12-spark.patch, HIVE-9152.2-spark.patch, HIVE-9152.3-spark.patch, HIVE-9152.4-spark.patch, HIVE-9152.5-spark.patch, HIVE-9152.6-spark.patch, HIVE-9152.8-spark.patch, HIVE-9152.9-spark.patch
Tez implemented dynamic partition pruning in HIVE-7826. This is a nice optimization and we should implement the same in HOS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11416) CBO: Calcite Operator To Hive Operator (Calcite Return Path): Groupby Optimizer assumes the schema can match after removing RS and GBY
[ https://issues.apache.org/jira/browse/HIVE-11416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659743#comment-14659743 ] Jesus Camacho Rodriguez commented on HIVE-11416: [~pxiong], I left a couple of comments in RB. Thanks CBO: Calcite Operator To Hive Operator (Calcite Return Path): Groupby Optimizer assumes the schema can match after removing RS and GBY -- Key: HIVE-11416 URL: https://issues.apache.org/jira/browse/HIVE-11416 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-11416.01.patch, HIVE-11416.02.patch, HIVE-11416.03.patch, HIVE-11416.04.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11180) Enable native vectorized map join for spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-11180: -- Labels: TODOC-SPARK (was: ) Enable native vectorized map join for spark [Spark Branch] -- Key: HIVE-11180 URL: https://issues.apache.org/jira/browse/HIVE-11180 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Labels: TODOC-SPARK Fix For: spark-branch Attachments: HIVE-11180.1-spark.patch, HIVE-11180.2-spark.patch The improvement was introduced in HIVE-9824. Let's use this task to track how we can enable that for spark. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11180) Enable native vectorized map join for spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659598#comment-14659598 ] Lefty Leverenz commented on HIVE-11180: --- Doc note: This adds "and Spark" to the description of *hive.mapjoin.optimized.hashtable* in HiveConf.java, so the parameter needs to be updated (with version information) in Configuration Properties after the patch gets merged to master.
* [Configuration Properties -- hive.mapjoin.optimized.hashtable | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.mapjoin.optimized.hashtable]

Adding a TODOC-SPARK label just as a reminder -- no doc needed until merged to master.

Enable native vectorized map join for spark [Spark Branch]
Key: HIVE-11180 URL: https://issues.apache.org/jira/browse/HIVE-11180 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Labels: TODOC-SPARK Fix For: spark-branch Attachments: HIVE-11180.1-spark.patch, HIVE-11180.2-spark.patch
The improvement was introduced in HIVE-9824. Let's use this task to track how we can enable that for spark. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11180) Enable native vectorized map join for spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659627#comment-14659627 ] Rui Li commented on HIVE-11180: --- Thanks [~leftylev]. Enable native vectorized map join for spark [Spark Branch] -- Key: HIVE-11180 URL: https://issues.apache.org/jira/browse/HIVE-11180 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Labels: TODOC-SPARK Fix For: spark-branch Attachments: HIVE-11180.1-spark.patch, HIVE-11180.2-spark.patch The improvement was introduced in HIVE-9824. Let's use this task to track how we can enable that for spark. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11485) Session close should not close async SQL operations
[ https://issues.apache.org/jira/browse/HIVE-11485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Barr reassigned HIVE-11485: -- Assignee: Deepak Barr

Session close should not close async SQL operations
Key: HIVE-11485 URL: https://issues.apache.org/jira/browse/HIVE-11485 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Deepak Barr
Right now, closing a session on HiveServer2 closes all of its operations. But running queries are actually available across sessions and are not tied to a session (except at launch, which requires configuration and resources), and their status can be fetched across sessions. Yet closing the session on which an operation was launched closes that operation as well. So, we should avoid closing all operations upon closing a session. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7476) CTAS does not work properly for s3
[ https://issues.apache.org/jira/browse/HIVE-7476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660105#comment-14660105 ] Sergio Peña commented on HIVE-7476: --- +1 The patch looks good. Encrypted files will work fine. Just one quick piece of feedback: could you add more comments on the needToCopy() method? I did not know what {{boolean diffFs = !srcFs.getClass().equals(destFs.getClass());}} was doing until I saw Lenni's comment.

CTAS does not work properly for s3
Key: HIVE-7476 URL: https://issues.apache.org/jira/browse/HIVE-7476 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.1, 1.1.0 Environment: Linux Reporter: Jian Fang Assignee: Szehon Ho Attachments: HIVE-7476.1.patch, HIVE-7476.2.patch
When we use CTAS to create a new table in s3, the table location is not set correctly. As a result, the data from the existing table cannot be inserted into the newly created table. We can use the following example to reproduce this issue.
{noformat}
set hive.metastore.warehouse.dir=OUTPUT_PATH;
drop table s3_dir_test;
drop table s3_1;
drop table s3_2;
create external table s3_dir_test(strct struct<a:int, b:string, c:string>)
row format delimited fields terminated by '\t'
collection items terminated by ' '
location 'INPUT_PATH';
create table s3_1(strct struct<a:int, b:string, c:string>)
row format delimited fields terminated by '\t'
collection items terminated by ' ';
insert overwrite table s3_1 select * from s3_dir_test;
select * from s3_1;
create table s3_2 as select * from s3_1;
select * from s3_1;
select * from s3_2;
{noformat}
The data could be as follows.
{noformat}
1 abc 10.5
2 def 11.5
3 ajss 90.23232
4 djns 89.02002
5 random 2.99
6 data 3.002
7 ne 71.9084
{noformat}
The root cause is that the SemanticAnalyzer class did not handle the s3 location properly for CTAS. A patch will be provided shortly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
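[Editor's note] The one-liner Sergio asks about decides whether a move between tables can be a cheap rename or must be a real copy. As a hedged, self-contained sketch (a simplified model comparing URI schemes, not the actual Hive code, which compares the Hadoop FileSystem implementation classes), the idea behind needToCopy() looks like this:

```java
import java.net.URI;

public class NeedToCopySketch {
    // Hypothetical simplification: two paths live on different filesystems
    // when their URI schemes differ, so the data must be copied rather than
    // renamed. Hive's real check compares FileSystem classes, which catches
    // e.g. HDFS vs. S3 the same way.
    static boolean needToCopy(URI src, URI dest) {
        String srcScheme = src.getScheme() == null ? "file" : src.getScheme();
        String destScheme = dest.getScheme() == null ? "file" : dest.getScheme();
        return !srcScheme.equalsIgnoreCase(destScheme);
    }

    public static void main(String[] args) {
        // Cross-filesystem: HDFS staging dir to an S3 table location -> copy.
        System.out.println(needToCopy(URI.create("hdfs://nn/warehouse/t1"),
                                      URI.create("s3n://bucket/t2")));
        // Same filesystem: a rename is enough.
        System.out.println(needToCopy(URI.create("hdfs://nn/a"),
                                      URI.create("hdfs://nn/b")));
    }
}
```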
[jira] [Resolved] (HIVE-11326) Parquet table: where clause with partition column fails
[ https://issues.apache.org/jira/browse/HIVE-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña resolved HIVE-11326. Resolution: Duplicate Assignee: Sergio Peña Hi [~tfriedr], this issue has been fixed in HIVE-11401.

Parquet table: where clause with partition column fails
Key: HIVE-11326 URL: https://issues.apache.org/jira/browse/HIVE-11326 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 1.2.0, 1.2.1 Reporter: Thomas Friedrich Assignee: Sergio Peña Labels: parquet
Steps:
{noformat}
create table t1 (c1 int) partitioned by (part string) stored as parquet;
insert into table t1 partition (part='p1') values (1);
select * from t1 where part='p1';
{noformat}
Error message:
{noformat}
Caused by: java.lang.IllegalArgumentException: Column [part] was not found in schema!
	at parquet.Preconditions.checkArgument(Preconditions.java:55)
	at parquet.filter2.predicate.SchemaCompatibilityValidator.getColumnDescriptor(SchemaCompatibilityValidator.java:190)
	at parquet.filter2.predicate.SchemaCompatibilityValidator.validateColumn(SchemaCompatibilityValidator.java:178)
	at parquet.filter2.predicate.SchemaCompatibilityValidator.validateColumnFilterPredicate(SchemaCompatibilityValidator.java:160)
	at parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:94)
	at parquet.filter2.predicate.SchemaCompatibilityValidator.visit(SchemaCompatibilityValidator.java:59)
	at parquet.filter2.predicate.Operators$Eq.accept(Operators.java:180)
	at parquet.filter2.predicate.SchemaCompatibilityValidator.validate(SchemaCompatibilityValidator.java:64)
	at parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:59)
	at parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:40)
	at parquet.filter2.compat.FilterCompat$FilterPredicateCompat.accept(FilterCompat.java:126)
	at parquet.filter2.compat.RowGroupFilter.filterRowGroups(RowGroupFilter.java:46)
	at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:275)
	at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:99)
	at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:85)
	at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:72)
	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
{noformat}
Seems that the problem was introduced with HIVE-10252 ([~dongc]). The filter can't contain any partition columns in the case of a Parquet table. While searching for an existing JIRA, I found a similar problem reported for Spark - SPARK-6554. I think the setFilter method should remove all predicates that reference partition columns before building the FilterPredicate object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
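[Editor's note] The suggestion above -- strip partition-column predicates before building the Parquet FilterPredicate -- can be sketched in isolation. This is a hypothetical, self-contained model (plain column names standing in for Parquet's filter2 predicate tree), not the actual Hive or Parquet API:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class PartitionPredicatePruning {
    // Drop every predicate column that is a partition column: partition
    // columns never appear in the Parquet file schema, so pushing them down
    // triggers the "Column [part] was not found in schema!" error above.
    static List<String> prunePartitionPredicates(List<String> predicateColumns,
                                                 Set<String> partitionColumns) {
        return predicateColumns.stream()
                .filter(col -> !partitionColumns.contains(col))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // From the repro: c1 is a real Parquet column, part is a partition column.
        List<String> pushable = prunePartitionPredicates(
                Arrays.asList("c1", "part"),
                new HashSet<>(Arrays.asList("part")));
        System.out.println(pushable); // only c1 survives; part is left to partition pruning
    }
}
```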
[jira] [Commented] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.
[ https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660231#comment-14660231 ] Chao Sun commented on HIVE-11466: - OK, I'll take a look.

HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.
Key: HIVE-11466 URL: https://issues.apache.org/jira/browse/HIVE-11466 Project: Hive Issue Type: Bug Reporter: Sergio Peña Assignee: Xuefu Zhang
An issue with the HIVE-10166 patch is that it increases the size of hive.log, causing Jenkins to fail because it runs out of disk space. Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with the patch, and after other commits.
{noformat}
BEFORE HIVE-10166  13M Aug 5 11:57 ./hive-unit/target/tmp/log/hive.log
WITH HIVE-10166   2.4G Aug 5 12:07 ./hive-unit/target/tmp/log/hive.log
CURRENT HEAD      3.2G Aug 5 12:36 ./hive-unit/target/tmp/log/hive.log
{noformat}
This is just a single test, but on Jenkins, hive.log grows to more than 13G. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.
[ https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-11466: Attachment: HIVE-11466.patch Let's see if using Thrift 0.9.0 solves the problem.

HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.
Key: HIVE-11466 URL: https://issues.apache.org/jira/browse/HIVE-11466 Project: Hive Issue Type: Bug Reporter: Sergio Peña Assignee: Xuefu Zhang Attachments: HIVE-11466.patch
An issue with the HIVE-10166 patch is that it increases the size of hive.log, causing Jenkins to fail because it runs out of disk space. Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with the patch, and after other commits.
{noformat}
BEFORE HIVE-10166  13M Aug 5 11:57 ./hive-unit/target/tmp/log/hive.log
WITH HIVE-10166   2.4G Aug 5 12:07 ./hive-unit/target/tmp/log/hive.log
CURRENT HEAD      3.2G Aug 5 12:36 ./hive-unit/target/tmp/log/hive.log
{noformat}
This is just a single test, but on Jenkins, hive.log grows to more than 13G. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.
[ https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-11466: Attachment: (was: HIVE-11466.patch)

HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.
Key: HIVE-11466 URL: https://issues.apache.org/jira/browse/HIVE-11466 Project: Hive Issue Type: Bug Reporter: Sergio Peña Assignee: Xuefu Zhang
An issue with the HIVE-10166 patch is that it increases the size of hive.log, causing Jenkins to fail because it runs out of disk space. Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with the patch, and after other commits.
{noformat}
BEFORE HIVE-10166  13M Aug 5 11:57 ./hive-unit/target/tmp/log/hive.log
WITH HIVE-10166   2.4G Aug 5 12:07 ./hive-unit/target/tmp/log/hive.log
CURRENT HEAD      3.2G Aug 5 12:36 ./hive-unit/target/tmp/log/hive.log
{noformat}
This is just a single test, but on Jenkins, hive.log grows to more than 13G. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11466) HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.
[ https://issues.apache.org/jira/browse/HIVE-11466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-11466: Attachment: HIVE-11466.patch

HIVE-10166 generates more data on hive.log causing Jenkins to fill all the disk.
Key: HIVE-11466 URL: https://issues.apache.org/jira/browse/HIVE-11466 Project: Hive Issue Type: Bug Reporter: Sergio Peña Assignee: Xuefu Zhang Attachments: HIVE-11466.patch
An issue with the HIVE-10166 patch is that it increases the size of hive.log, causing Jenkins to fail because it runs out of disk space. Here's a test I ran with TestJdbcWithMiniHS2 before the patch, with the patch, and after other commits.
{noformat}
BEFORE HIVE-10166  13M Aug 5 11:57 ./hive-unit/target/tmp/log/hive.log
WITH HIVE-10166   2.4G Aug 5 12:07 ./hive-unit/target/tmp/log/hive.log
CURRENT HEAD      3.2G Aug 5 12:36 ./hive-unit/target/tmp/log/hive.log
{noformat}
This is just a single test, but on Jenkins, hive.log grows to more than 13G. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9152) Dynamic Partition Pruning [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-9152: - Labels: (was: TODOC-SPARK TODOC1.3) Dynamic Partition Pruning [Spark Branch] Key: HIVE-9152 URL: https://issues.apache.org/jira/browse/HIVE-9152 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Brock Noland Assignee: Chao Sun Fix For: spark-branch, 1.3.0, 2.0.0 Attachments: HIVE-9152.1-spark.patch, HIVE-9152.10-spark.patch, HIVE-9152.11-spark.patch, HIVE-9152.12-spark.patch, HIVE-9152.2-spark.patch, HIVE-9152.3-spark.patch, HIVE-9152.4-spark.patch, HIVE-9152.5-spark.patch, HIVE-9152.6-spark.patch, HIVE-9152.8-spark.patch, HIVE-9152.9-spark.patch Tez implemented dynamic partition pruning in HIVE-7826. This is a nice optimization and we should implement the same in HOS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11381) QTest combine2_hadoop20.q fails when using -Phadoop-1 profile due to HIVE-11139
[ https://issues.apache.org/jira/browse/HIVE-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660161#comment-14660161 ] Sergio Peña commented on HIVE-11381: [~jxiang] Is this an issue upstream, or can I close the ticket?

QTest combine2_hadoop20.q fails when using -Phadoop-1 profile due to HIVE-11139
Key: HIVE-11381 URL: https://issues.apache.org/jira/browse/HIVE-11381 Project: Hive Issue Type: Bug Affects Versions: 1.3.0 Reporter: Sergio Peña Assignee: Sergio Peña
The q-test {{combine2_hadoop20.q}} is failing when running the -Phadoop-1 profile tests. The test output is different due to the changes added in HIVE-11139 for more lineage information. Based on other HIVE-11139 tests, this test output only needs to be regenerated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5277) HBase handler skips rows with null valued first cells when only row key is selected
[ https://issues.apache.org/jira/browse/HIVE-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660186#comment-14660186 ] Swarnim Kulkarni commented on HIVE-5277: Just to update, this is an issue with count(*)-type queries as well, not only count(key).

HBase handler skips rows with null valued first cells when only row key is selected
Key: HIVE-5277 URL: https://issues.apache.org/jira/browse/HIVE-5277 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.11.0, 0.11.1, 0.12.0, 0.13.0 Reporter: Teddy Choi Assignee: Swarnim Kulkarni Priority: Critical Attachments: HIVE-5277.1.patch.txt, HIVE-5277.2.patch.txt
HBaseStorageHandler skips rows with null valued first cells when only the row key is selected.
{noformat}
SELECT key, col1, col2 FROM hbase_table;
key1  cell1  cell2
key2  NULL   cell3

SELECT COUNT(key) FROM hbase_table;
1
{noformat}
HiveHBaseTableInputFormat.getRecordReader makes the first cell selected to avoid skipping rows. But when the first cell is null, HBase skips that row. http://hbase.apache.org/book/perf.reading.html 12.9.6. "Optimal Loading of Row Keys" describes how to deal with this problem. I tried to find an existing issue, but I couldn't. If you find the same issue, please mark this one as a duplicate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
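[Editor's note] The skipping behavior described above can be modeled in a few lines of plain Java. This is a self-contained illustration with made-up data mirroring the example (key2 has no value in its first column), not HBase client code:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NullFirstCellCount {
    // How the buggy scan behaves: a row is only "seen" if its first cell exists.
    static long countByFirstCell(Map<String, List<String>> table) {
        return table.values().stream()
                .filter(cells -> cells.get(0) != null)
                .count();
    }

    // What COUNT(key) should return: every row key counts.
    static long countRowKeys(Map<String, List<String>> table) {
        return table.size();
    }

    public static void main(String[] args) {
        // Row key -> ordered cells; null models a column that was never written.
        Map<String, List<String>> table = new LinkedHashMap<>();
        table.put("key1", Arrays.asList("cell1", "cell2"));
        table.put("key2", Arrays.asList(null, "cell3"));

        System.out.println(countByFirstCell(table)); // 1 -- key2 is skipped, reproducing the bug
        System.out.println(countRowKeys(table));     // 2 -- the expected count
    }
}
```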