[jira] [Commented] (HIVE-10982) Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver
[ https://issues.apache.org/jira/browse/HIVE-10982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934744#comment-14934744 ] Bing Li commented on HIVE-10982: Hi, [~vgumashta] Thank you for your comment. Do you mean to add a new property to hive-site.xml, which would also control the maximum fetch size returned by HS2? Do you know the current control mechanism on HS2? > Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver > -- > > Key: HIVE-10982 > URL: https://issues.apache.org/jira/browse/HIVE-10982 > Project: Hive > Issue Type: Improvement > Components: JDBC >Affects Versions: 1.2.0, 1.2.1 >Reporter: Bing Li >Assignee: Bing Li >Priority: Critical > Attachments: HIVE-10982.1.patch > > > The current JDBC driver for Hive hard-codes the value of setFetchSize to 50, > which is a performance bottleneck. > Pentaho filed this issue as http://jira.pentaho.com/browse/PDI-11511, whose > status is open. > It has also been discussed in > http://forums.pentaho.com/showthread.php?158381-Hive-JDBC-Query-too-slow-too-many-fetches-after-query-execution-Kettle-Xform > http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3ccacq46vevgrfqg5rwxnr1psgyz7dcf07mvlo8mm2qit3anm1...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
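The hard-coded fetch size discussed above is the kind of constant that is usually made configurable through a connection property with a safe fallback. A minimal, hypothetical sketch of such property resolution — the property key "fetchSize" and the class/method names here are illustrative assumptions, not Hive's actual implementation:

```java
import java.util.Properties;

public class FetchSizeConfig {
    // Hypothetical property key; the actual patch may use a different name.
    static final String FETCH_SIZE_PROP = "fetchSize";
    static final int DEFAULT_FETCH_SIZE = 50; // the current hard-coded value

    // Resolve the fetch size from JDBC connection properties, falling back
    // to the historical default when the property is absent or invalid.
    static int resolveFetchSize(Properties props) {
        String v = props.getProperty(FETCH_SIZE_PROP);
        if (v == null) {
            return DEFAULT_FETCH_SIZE;
        }
        try {
            int n = Integer.parseInt(v.trim());
            return n > 0 ? n : DEFAULT_FETCH_SIZE;
        } catch (NumberFormatException e) {
            return DEFAULT_FETCH_SIZE;
        }
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        System.out.println(resolveFetchSize(p));   // falls back to 50
        p.setProperty(FETCH_SIZE_PROP, "1000");
        System.out.println(resolveFetchSize(p));   // configured value, 1000
    }
}
```

On the client side, a caller could equally override the value per statement via the standard JDBC `Statement.setFetchSize(int)` call once the driver honors it.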
[jira] [Commented] (HIVE-9753) Wrong results when using multiple levels of Joins. When table alias of one of the table is null with left outer joins.
[ https://issues.apache.org/jira/browse/HIVE-9753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934720#comment-14934720 ] Feng Yuan commented on HIVE-9753: - [~gopalv] > Wrong results when using multiple levels of Joins. When table alias of one of > the table is null with left outer joins. > > > Key: HIVE-9753 > URL: https://issues.apache.org/jira/browse/HIVE-9753 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0, 1.0.0 >Reporter: Pavan Srinivas >Priority: Critical > Attachments: HIVE-9753.0-0.14.0.patch, HIVE-9753.0-1.0.0.patch, > HIVE-9753.patch, table1.data, table2.data, table3.data > > > Let's take a scenario where the tables are: > {code} > drop table table1; > CREATE TABLE table1( > col1 string, > col2 string, > col3 string, > col4 string > ) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '\t' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'; > drop table table2; > CREATE TABLE table2( > col1 string, > col2 bigint, > col3 string, > col4 string > ) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '\t' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'; > drop table table3; > CREATE TABLE table3( > col1 string, > col2 int, > col3 int, > col4 string) > ROW FORMAT DELIMITED > FIELDS TERMINATED BY '\t' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'; > {code} > Query with wrong results: > {code} > SELECT t1.col1 AS dummy, > t1.expected_column AS expected_column, > t2.col4 > FROM ( > SELECT col1, > '23-1', > '23-13' as three, > col4 AS expected_column > FROM table1 > ) t1 > JOIN table2 t2 > ON cast(t2.col1 as string) = cast(t1.col1 as string) > LEFT OUTER JOIN > (SELECT col4, col1 > FROM table3 > ) t3 > ON t2.col4 = t3.col1 > ; > {code} > and 
explain output: > {code} > STAGE DEPENDENCIES: > Stage-7 is a root stage > Stage-5 depends on stages: Stage-7 > Stage-0 depends on stages: Stage-5 > STAGE PLANS: > Stage: Stage-7 > Map Reduce Local Work > Alias -> Map Local Tables: > t1:table1 > Fetch Operator > limit: -1 > t3:table3 > Fetch Operator > limit: -1 > Alias -> Map Local Operator Tree: > t1:table1 > TableScan > alias: table1 > Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE > Filter Operator > predicate: col1 is not null (type: boolean) > Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE > Select Operator > expressions: col1 (type: string) > outputColumnNames: _col0 > Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE > HashTable Sink Operator > condition expressions: > 0 > 1 {col4} > keys: > 0 _col0 (type: string) > 1 col1 (type: string) > t3:table3 > TableScan > alias: table3 > Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE > Select Operator > expressions: col1 (type: string) > outputColumnNames: _col1 > Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE > HashTable Sink Operator > condition expressions: > 0 {_col0} {_col7} {_col7} > 1 > keys: > 0 _col7 (type: string) > 1 _col1 (type: string) > Stage: Stage-5 > Map Reduce > Map Operator Tree: > TableScan > alias: t2 > Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE > Filter Operator > predicate: col1 is not null (type: boolean) > Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column > stats: NONE > Map Join Operator > condition map: > Inner Join 0 to 1 > condition expressions: > 0 {_col0} > 1 {col4} > keys: > 0 _col0 (type: string) > 1 col1 (type: string) > outputColumnNames:
[jira] [Commented] (HIVE-11930) how to prevent ppd the topN(a) udf predication in where clause?
[ https://issues.apache.org/jira/browse/HIVE-11930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934780#comment-14934780 ] Feng Yuan commented on HIVE-11930: -- Hi [~ashutoshc], this can't be used in a where clause. > how to prevent ppd the topN(a) udf predication in where clause? > --- > > Key: HIVE-11930 > URL: https://issues.apache.org/jira/browse/HIVE-11930 > Project: Hive > Issue Type: New Feature > Components: Hive >Affects Versions: 0.14.0 >Reporter: Feng Yuan >Priority: Minor > > select > a.state_date,a.customer,a.taskid,a.step_id,a.exit_title,a.pv,top1000(a.only_id) > from > ( select > t1.state_date,t1.customer,t1.taskid,t1.step_id,t1.exit_title,t1.pv,t1.only_id > from > ( select t11.state_date, >t11.customer, >t11.taskid, >t11.step_id, >t11.exit_title, >t11.pv, >concat(t11.customer,t11.taskid,t11.step_id) as > only_id >from > ( select > state_date,customer,taskid,step_id,exit_title,count(*) as pv > from bdi_fact2.mid_url_step > where exit_url!='-1' > and exit_title !='-1' > and l_date='2015-08-31' > group by > state_date,customer,taskid,step_id,exit_title > )t11 >)t1 >order by t1.only_id,t1.pv desc > )a > where a.customer='Cdianyingwang' > and a.taskid='33' > and a.step_id='0' > and top1000(a.only_id)<=10; > In the above example, the outer top1000(a.only_id)<=10 will be pushed down to: > stage 1: > ( select t11.state_date, >t11.customer, >t11.taskid, >t11.step_id, >t11.exit_title, >t11.pv, >concat(t11.customer,t11.taskid,t11.step_id) as > only_id >from > ( select > state_date,customer,taskid,step_id,exit_title,count(*) as pv > from bdi_fact2.mid_url_step > where exit_url!='-1' > and exit_title !='-1' > and l_date='2015-08-31' > group by > state_date,customer,taskid,step_id,exit_title > )t11 >)t1 > This stage has 2 reducers, so you can see it will output 20 records; > passed up to the outer stage, the final result is exactly these 20 records. > So I want to know: is there any way to hint that this topN udf predicate should not be > pushed down? 
> Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9566) HiveServer2 fails to start with NullPointerException
[ https://issues.apache.org/jira/browse/HIVE-9566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934684#comment-14934684 ] Lefty Leverenz commented on HIVE-9566: -- This was also committed to branch-1.0 (for release 1.0.2). Shouldn't 1.0.2 be listed in Fix Version/s so it will get picked up for the release notes? See commit 37206a49f1f6e12f3ac997bb04d3b383ae7781e1. > HiveServer2 fails to start with NullPointerException > > > Key: HIVE-9566 > URL: https://issues.apache.org/jira/browse/HIVE-9566 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 0.13.0, 0.14.0, 0.13.1 >Reporter: Na Yang >Assignee: Na Yang > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-9566-branch-0.13.patch, > HIVE-9566-branch-0.14.patch, HIVE-9566-trunk.patch, HIVE-9566.patch > > > hiveserver2 uses embedded metastore with default hive-site.xml configuration. > I use "hive --stop --service hiveserver2" command to stop the running > hiveserver2 process and then use "hive --start --service hiveserver2" command > to start the hiveserver2 service. I see the following exception in the > hive.log file > {noformat} > java.lang.NullPointerException > at > org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:104) > at > org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:138) > at > org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:171) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.main(RunJar.java:212) > {noformat} > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11972) [Refactor] Improve determination of dynamic partitioning columns in FileSink Operator
[ https://issues.apache.org/jira/browse/HIVE-11972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934748#comment-14934748 ] Hive QA commented on HIVE-11972: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12762542/HIVE-11972.patch {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9646 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.ql.exec.TestFileSinkOperator.testDeleteDynamicPartitioning org.apache.hadoop.hive.ql.exec.TestFileSinkOperator.testInsertDynamicPartitioning org.apache.hadoop.hive.ql.exec.TestFileSinkOperator.testNonAcidDynamicPartitioning org.apache.hadoop.hive.ql.exec.TestFileSinkOperator.testUpdateDynamicPartitioning org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5454/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5454/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5454/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12762542 - PreCommit-HIVE-TRUNK-Build > [Refactor] Improve determination of dynamic partitioning columns in FileSink > Operator > - > > Key: HIVE-11972 > URL: https://issues.apache.org/jira/browse/HIVE-11972 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-11972.patch > > > Currently it uses column names to locate DP columns, which is brittle since > column names may change during planning and optimization phases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11211) Reset the fields in JoinStatsRule in StatsRulesProcFactory
[ https://issues.apache.org/jira/browse/HIVE-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934679#comment-14934679 ] Lefty Leverenz commented on HIVE-11211: --- Fix version only shows 2.0.0, although this was also committed to branch-1 (for 1.3.0) and recently to branch-1.2 (for 1.2.2). Commit f428af1d2908588dd68eb30cde2f158bf9ef04c0 for branch-1.2 (1.2.2). Commit 64d8582cb8d357216ef7fa208f68548ceb1ef2d3 for branch-1 (1.3.0). Commit 42326958148c2558be9c3d4dfe44c9e735704617 for master (2.0.0). > Reset the fields in JoinStatsRule in StatsRulesProcFactory > -- > > Key: HIVE-11211 > URL: https://issues.apache.org/jira/browse/HIVE-11211 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.0.0 > > Attachments: HIVE-11211.02.patch, HIVE-11211.03.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11903) Add zookeeper lock metrics to HS2
[ https://issues.apache.org/jira/browse/HIVE-11903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11903: Attachment: HIVE-11903.3.patch Patch 3 removes zookeeper connection metrics and only records the lock count. > Add zookeeper lock metrics to HS2 > - > > Key: HIVE-11903 > URL: https://issues.apache.org/jira/browse/HIVE-11903 > Project: Hive > Issue Type: Sub-task > Components: Diagnosability >Reporter: Szehon Ho >Assignee: Yongzhi Chen > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11903.1.patch, HIVE-11903.2.patch, > HIVE-11903.3.patch > > > Potential metrics are active zookeeper locks taken by type. Can refine as we > go along. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11835) Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL
[ https://issues.apache.org/jira/browse/HIVE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934692#comment-14934692 ] Xuefu Zhang commented on HIVE-11835: Decimal(1,1) covers the range [-0.9, 0.9]; values beyond it are rounded (half up) to fit, or become NULL if such rounding isn't possible. As seen in the new test case, 1.0 => NULL, while 0.345 => 0.3. > Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL > - > > Key: HIVE-11835 > URL: https://issues.apache.org/jira/browse/HIVE-11835 > Project: Hive > Issue Type: Bug > Components: Types >Affects Versions: 1.2.0, 1.1.0, 2.0.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-11835.1.patch, HIVE-11835.2.patch, HIVE-11835.patch > > > Steps to reproduce: > 1. create a text file with values like 0.0, 0.00, etc. > 2. create table in hive with type decimal(1,1). > 3. run "load data local inpath ..." to load data into the table. > 4. run select * on the table. > You will see that NULL is displayed for 0.0, 0.00, .0, etc. Instead, these > should be read as 0.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
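The rounding behavior Xuefu describes can be mimicked outside Hive with `java.math.BigDecimal`. A small illustrative sketch (not Hive's actual code path) of decimal(1,1) semantics: scale to one fractional digit with HALF_UP rounding, and fall back to null when the result exceeds one-digit precision:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class Decimal11Demo {
    // Mimic the decimal(1,1) semantics described above: round to 1
    // fractional digit (half up), and return null when the rounded value
    // no longer fits in 1 total digit, i.e. |v| > 0.9.
    static BigDecimal toDecimal11(String text) {
        BigDecimal rounded = new BigDecimal(text).setScale(1, RoundingMode.HALF_UP);
        // precision 1, scale 1 => representable range is [-0.9, 0.9]
        return rounded.abs().compareTo(new BigDecimal("0.9")) > 0 ? null : rounded;
    }

    public static void main(String[] args) {
        System.out.println(toDecimal11("0.00"));  // 0.0 (the bug read this as NULL)
        System.out.println(toDecimal11("0.345")); // 0.3
        System.out.println(toDecimal11("1.0"));   // null (cannot fit)
    }
}
```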
[jira] [Commented] (HIVE-10598) Vectorization borks when column is added to table.
[ https://issues.apache.org/jira/browse/HIVE-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935122#comment-14935122 ] Hive QA commented on HIVE-10598: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12762634/HIVE-10598.06.patch {color:red}ERROR:{color} -1 due to 42 failed/errored test(s), 9633 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-auto_sortmerge_join_13.q-tez_self_join.q-orc_vectorization_ppd.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_join_partition_key org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_auto_smb_mapjoin_14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_bucketmapjoin1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_ptf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_streaming org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_auto_smb_mapjoin_14 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_leftsemi_mapjoin org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_ptf org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_ptf org.apache.hadoop.hive.ql.TestTxnCommands.testMultipleInserts org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorization org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorizationWithAcid org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorizationWithBuckets org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.majorCompactAfterAbort 
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.majorCompactWhileStreaming org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.minorCompactAfterAbort org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.minorCompactWhileStreaming org.apache.hadoop.hive.ql.txn.compactor.TestWorker.majorTableLegacy org.apache.hadoop.hive.ql.txn.compactor.TestWorker.majorTableNoBase org.apache.hadoop.hive.ql.txn.compactor.TestWorker.majorTableWithBase org.apache.hadoop.hive.ql.txn.compactor.TestWorker.majorWithAborted org.apache.hadoop.hive.ql.txn.compactor.TestWorker.majorWithOpenInMiddle org.apache.hadoop.hive.ql.txn.compactor.TestWorker.minorTableLegacy org.apache.hadoop.hive.ql.txn.compactor.TestWorker.minorTableNoBase org.apache.hadoop.hive.ql.txn.compactor.TestWorker.minorTableWithBase org.apache.hadoop.hive.ql.txn.compactor.TestWorker.minorWithAborted org.apache.hadoop.hive.ql.txn.compactor.TestWorker.minorWithOpenInMiddle org.apache.hadoop.hive.ql.txn.compactor.TestWorker2.majorTableLegacy org.apache.hadoop.hive.ql.txn.compactor.TestWorker2.majorTableNoBase org.apache.hadoop.hive.ql.txn.compactor.TestWorker2.majorTableWithBase org.apache.hadoop.hive.ql.txn.compactor.TestWorker2.majorWithAborted org.apache.hadoop.hive.ql.txn.compactor.TestWorker2.majorWithOpenInMiddle org.apache.hadoop.hive.ql.txn.compactor.TestWorker2.minorTableLegacy org.apache.hadoop.hive.ql.txn.compactor.TestWorker2.minorTableNoBase org.apache.hadoop.hive.ql.txn.compactor.TestWorker2.minorTableWithBase org.apache.hadoop.hive.ql.txn.compactor.TestWorker2.minorWithAborted org.apache.hadoop.hive.ql.txn.compactor.TestWorker2.minorWithOpenInMiddle org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5456/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5456/console Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5456/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 42 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12762634 - PreCommit-HIVE-TRUNK-Build > Vectorization borks when column is added to table. > -- > > Key: HIVE-10598 > URL: https://issues.apache.org/jira/browse/HIVE-10598 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Mithun Radhakrishnan >Assignee: Matt McCline > Attachments: HIVE-10598.01.patch, HIVE-10598.02.patch, > HIVE-10598.03.patch, HIVE-10598.04.patch, HIVE-10598.05.patch, > HIVE-10598.06.patch > > > Consider the following table definition: > {code:sql}
[jira] [Updated] (HIVE-11903) Add zookeeper lock metrics to HS2
[ https://issues.apache.org/jira/browse/HIVE-11903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11903: Description: Potential metrics are active zookeeper locks taken by type. Can refine as we go along. (was: Potential metrics are active zookeeper connections, locks taken by type, etc. Can refine as we go along.) > Add zookeeper lock metrics to HS2 > - > > Key: HIVE-11903 > URL: https://issues.apache.org/jira/browse/HIVE-11903 > Project: Hive > Issue Type: Sub-task > Components: Diagnosability >Reporter: Szehon Ho >Assignee: Yongzhi Chen > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11903.1.patch, HIVE-11903.2.patch > > > Potential metrics are active zookeeper locks taken by type. Can refine as we > go along. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11945) ORC with non-local reads may not be reusing connection to DN
[ https://issues.apache.org/jira/browse/HIVE-11945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-11945: Attachment: HIVE-11945.3.branch-1.patch Attaching rebased patch for branch-1. > ORC with non-local reads may not be reusing connection to DN > > > Key: HIVE-11945 > URL: https://issues.apache.org/jira/browse/HIVE-11945 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HIVE-11945.1.patch, HIVE-11945.2.patch, > HIVE-11945.3.branch-1.patch, HIVE-11945.3.patch > > > When “seek + readFully(buffer, offset, length)” is used, DFSInputStream ends > up going via “readWithStrategy()”. This sets up BlockReader with a length > equivalent to that of the block size. So until this position is reached, > RemoteBlockReader2.peer would not be added to the PeerCache (please refer to > RemoteBlockReader2.close() in HDFS). So eventually the next call to the same > DN would end up opening a new socket. In ORC, when it is not a data-local read, > this has the possibility of opening/closing lots of connections with the DN. > In random reads, it would be good to set this length to the amount of data > that is to be read (e.g. the pread call in DFSInputStream, which sets up the > BlockReader’s length correctly & the code path returns the Peer back to the peer > cache properly). “readFully(position, buffer, offset, length)” follows this > code path and ends up reusing the connections properly. Creating this JIRA to > fix this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
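The pread behavior described above (reading at an explicit offset without disturbing the stream's current position) has a direct local-filesystem analogue in `java.nio.channels.FileChannel`. A small illustrative sketch of the pattern using a temp file rather than HDFS — the helper name `readAt` is an assumption for this example, not an HDFS API:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionalReadDemo {
    // Positional read: fetch `len` bytes at `pos` without moving the
    // channel's current position (the java.nio analogue of pread).
    static String readAt(FileChannel ch, long pos, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        ch.read(buf, pos); // does NOT advance ch.position()
        return new String(buf.array());
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("pread", ".bin");
        Files.write(tmp, "0123456789".getBytes());
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            long before = ch.position();
            System.out.println(readAt(ch, 2, 4));        // prints "2345"
            // The stream position is untouched, so the reader's state (and,
            // by analogy, the DN connection bookkeeping) stays consistent.
            System.out.println(ch.position() == before); // prints "true"
        }
        Files.delete(tmp);
    }
}
```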
[jira] [Commented] (HIVE-11880) filter bug of UNION ALL when hive.ppd.remove.duplicatefilters=true and filter condition is type incompatible column
[ https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14935024#comment-14935024 ] WangMeng commented on HIVE-11880: - [~ashutoshc] [~jpullokkaran] I have published this patch on Review Board: https://reviews.apache.org/r/38805/ Please help review it. Thanks. > filter bug of UNION ALL when hive.ppd.remove.duplicatefilters=true and > filter condition is type incompatible column > - > > Key: HIVE-11880 > URL: https://issues.apache.org/jira/browse/HIVE-11880 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Affects Versions: 1.2.1 >Reporter: WangMeng >Assignee: WangMeng > Attachments: HIVE-11880.01.patch, HIVE-11880.02.patch, > HIVE-11880.03.patch, HIVE-11880.04.patch > > >For UNION ALL, when a union operand is a constant column (such as '0L', > BIGINT type) and its corresponding column has an incompatible type (such as INT > type), a query with a filter condition on the type-incompatible column of this UNION ALL > will cause an IndexOutOfBoundsException. > Take the TPC-H table "orders" in the following query: > the type of 'orders'.'o_custkey' is INT normally, while the type of the > corresponding constant column "0" is BIGINT ( `0L AS `o_custkey` ). > This query (with a filter on the type-incompatible column 'o_custkey') will fail > with java.lang.IndexOutOfBoundsException: > {code} > SELECT Count(1) > FROM ( > SELECT `o_orderkey` , > `o_custkey` > FROM `orders` > UNION ALL > SELECT `o_orderkey`, > 0L AS `o_custkey` > FROM `orders`) `oo` > WHERE o_custkey<10 limit 4 ; > {code} > When > {code} > set hive.ppd.remove.duplicatefilters=true > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6090) Audit logs for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14934935#comment-14934935 ] Hive QA commented on HIVE-6090: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12762581/HIVE-6090.3.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9631 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-auto_join30.q-vector_data_types.q-filter_join_breaktask.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5455/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5455/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5455/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12762581 - PreCommit-HIVE-TRUNK-Build > Audit logs for HiveServer2 > -- > > Key: HIVE-6090 > URL: https://issues.apache.org/jira/browse/HIVE-6090 > Project: Hive > Issue Type: Improvement > Components: Diagnosability, HiveServer2 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Labels: audit, hiveserver > Attachments: HIVE-6090.1.WIP.patch, HIVE-6090.1.patch, > HIVE-6090.3.patch, HIVE-6090.patch > > > HiveMetastore has audit logs and would like to audit all queries or requests > to HiveServer2 also. This will help in understanding how the APIs were used, > queries submitted, users etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11903) Add zookeeper lock metrics to HS2
[ https://issues.apache.org/jira/browse/HIVE-11903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11903: Summary: Add zookeeper lock metrics to HS2 (was: Add zookeeper metrics to HS2) > Add zookeeper lock metrics to HS2 > - > > Key: HIVE-11903 > URL: https://issues.apache.org/jira/browse/HIVE-11903 > Project: Hive > Issue Type: Sub-task > Components: Diagnosability >Reporter: Szehon Ho >Assignee: Yongzhi Chen > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11903.1.patch, HIVE-11903.2.patch > > > Potential metrics are active zookeeper connections, locks taken by type, etc. > Can refine as we go along. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7594) Hive JDBC client: "out of sequence response" on large long running query
[ https://issues.apache.org/jira/browse/HIVE-7594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935294#comment-14935294 ] Sidi RHIL commented on HIVE-7594: - Hello, I have the same problem when using Talend to read a hive table (see the log below). Did anybody find a solution for this issue? Exception in component tHiveInput_1 java.sql.SQLException: org.apache.thrift.TApplicationException: CloseOperation failed: out of sequence response at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:172) at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:191) at mon_projet.agregate_doxtl_0_1.agregate_doxtl.tHiveInput_1Process(agregate_doxtl.java:2540) at mon_projet.agregate_doxtl_0_1.agregate_doxtl.runJobInTOS(agregate_doxtl.java:3447) at mon_projet.agregate_doxtl_0_1.agregate_doxtl.main(agregate_doxtl.java:3304) Caused by: org.apache.thrift.TApplicationException: CloseOperation failed: out of sequence response at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76) at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:455) at org.apache.hive.service.cli.thrift.TCLIService$Client.CloseOperation(TCLIService.java:442) at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:166) ... 
4 more > Hive JDBC client: "out of sequence response" on large long running query > > > Key: HIVE-7594 > URL: https://issues.apache.org/jira/browse/HIVE-7594 > Project: Hive > Issue Type: Bug > Components: Clients, HiveServer2 >Affects Versions: 0.13.0 > Environment: HDP2.1 >Reporter: Hari Sekhon > > When executing a long running query in a JDBC client (Squirrel) to > HiveServer2 after several minutes I get this error in the client: > {code} > Error: org.apache.thrift.TApplicationException: ExecuteStatement failed: out > of sequence response > SQLState: 08S01 > ErrorCode: 0 > {code} > I've seen this before in, iirc when running 2 queries in 1 session but I've > closed the client and run only this single query in a new session each time. > I did a search and saw HIVE-6893 referring to a Metastore exception which I > have in some older logs but not corresponding / recent in these recent > instances, the error seems different in this case but may be related. > The query to reproduce is "select count(*) from myTable" where myTable is a > 1TB table of 620 million rows. This happens in both MR and Tez execution > engines running on Yarn. > Here are all the jars I've added to the classpath (taken from Hortonworks doc > http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1-latest/bk_dataintegration/content/ch_using-hive-2.html, > plus added hadoop-common, hive-exec and slf4j-api to solve class not found > issues on top of that): > commons-codec-1.4.jar > commons-logging-1.1.3.jar > hadoop-common-2.4.0.2.1.3.0-563.jar > hive-exec-0.13.0.2.1.3.0-563.jar > hive-jdbc-0.13.0.2.1.3.0-563.jar > hive-service-0.13.0.2.1.3.0-563.jar > httpclient-4.2.5.jar > httpcore-4.2.5.jar > libthrift-0.9.0.jar > slf4j-api-1.7.5.jar > I am seeing errors like this in the hiveserver2.log: > {code} > 2014-08-01 15:04:31,358 ERROR [pool-5-thread-3]: server.TThreadPoolServer > (TThreadPoolServer.java:run(215)) - Error occurred during processing of > message. 
> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: org.apache.thrift.transport.TTransportException > at > org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) > at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) > at > org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:182) > at > org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) > at > org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) > ... 4 more > ... > 2014-08-01 15:06:31,520 ERROR [pool-5-thread-3]:
[jira] [Commented] (HIVE-11954) Extend logic to choose side table in MapJoin Conversion algorithm
[ https://issues.apache.org/jira/browse/HIVE-11954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935266#comment-14935266 ] Hive QA commented on HIVE-11954: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12762593/HIVE-11954.01.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9646 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5457/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5457/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5457/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12762593 - PreCommit-HIVE-TRUNK-Build > Extend logic to choose side table in MapJoin Conversion algorithm > - > > Key: HIVE-11954 > URL: https://issues.apache.org/jira/browse/HIVE-11954 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11954.01.patch, HIVE-11954.patch, HIVE-11954.patch > > > Selection of side table (in memory/hash table) in MapJoin Conversion > algorithm needs to be more sophisticated. 
> In an N-way Map Join, Hive should pick as the side table (in-memory table) the input > stream that has the least cost of producing its relation (like TS(FIL|Proj)*). > A cost-based choice needs an extended cost model; without the return path it's going to > be hard to do this. > For the time being we could employ a modified cost-based algorithm for side > table selection. > The new algorithm is described below: > 1. Identify the candidate set of inputs for the side table (in memory/hash table) > from the inputs (based on conditional task size) > 2. For each of the inputs, identify its cost and memory requirement. Cost is 1 for > each heavy-weight relation op (Join, GB, PTF/Windowing, TF, etc.). The cost for > an input is the total number of heavy-weight ops in its branch. > 3. Order the set from #1 by cost & memory requirement (ascending order) > 4. Pick the first element from #3 as the side table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
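The four-step selection above can be sketched as a small, self-contained Java program. This is a minimal sketch of the proposed ordering only, not Hive's actual planner code; the Input class, its field names, and the size threshold are illustrative stand-ins:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SideTableSelector {
    // Illustrative stand-in for a join input branch; not a Hive planner type.
    static class Input {
        final String name;
        final long dataSize;      // estimated size of the produced relation
        final int heavyOpCost;    // count of heavy-weight ops (Join, GB, PTF/Windowing, TF, ...)
        final long memoryReq;     // estimated hash-table memory requirement

        Input(String name, long dataSize, int heavyOpCost, long memoryReq) {
            this.name = name;
            this.dataSize = dataSize;
            this.heavyOpCost = heavyOpCost;
            this.memoryReq = memoryReq;
        }
    }

    // Steps 1-4: filter candidates by the size threshold, order by (cost, memory), pick first.
    static Input pickSideTable(List<Input> inputs, long maxSize) {
        List<Input> candidates = new ArrayList<>();
        for (Input in : inputs) {
            if (in.dataSize <= maxSize) {                    // step 1: conditional task size check
                candidates.add(in);
            }
        }
        candidates.sort(Comparator
                .comparingInt((Input in) -> in.heavyOpCost)  // step 3: cost ascending ...
                .thenComparingLong(in -> in.memoryReq));     // ... then memory ascending
        return candidates.isEmpty() ? null : candidates.get(0);  // step 4
    }

    public static void main(String[] args) {
        List<Input> inputs = List.of(
                new Input("ts_only", 500, 0, 500),       // plain table scan: cost 0
                new Input("gb_branch", 400, 1, 400),     // contains a group-by: cost 1
                new Input("huge_scan", 9000, 0, 9000));  // too big for an in-memory hash table
        Input side = pickSideTable(inputs, 1000);
        if (!"ts_only".equals(side.name)) throw new AssertionError(side.name);
        System.out.println("side table: " + side.name);
    }
}
```

The sort key makes cheap branches (a bare table scan) win over branches that already contain heavy-weight operators, with memory requirement as the tie-breaker.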
[jira] [Commented] (HIVE-11972) [Refactor] Improve determination of dynamic partitioning columns in FileSink Operator
[ https://issues.apache.org/jira/browse/HIVE-11972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935240#comment-14935240 ] Ashutosh Chauhan commented on HIVE-11972: - [~prasanth_j] Would you like to take a look? > [Refactor] Improve determination of dynamic partitioning columns in FileSink > Operator > - > > Key: HIVE-11972 > URL: https://issues.apache.org/jira/browse/HIVE-11972 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-11972.patch > > > Currently it uses column names to locate DP columns, which is brittle since > column names may change during planning and optimization phases.
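A hypothetical illustration of why name-based dynamic-partition column lookup is brittle and how a position-based determination avoids it. The schema layout assumed here (DP columns appended after the data columns) is a simplification for the sketch, not Hive's actual FileSink descriptor:

```java
import java.util.List;

public class DpColumnSketch {
    // Name-based lookup breaks once an optimizer rewrites internal column
    // names (e.g. "p" becomes "_col2" during planning).
    static int findByName(List<String> schema, String dpColName) {
        return schema.indexOf(dpColName);
    }

    // Position-based determination: if dynamic partition columns are the
    // trailing columns of the sink schema, trailing indices stay valid
    // regardless of renaming. (An assumption of this sketch.)
    static List<String> dpColumnsByPosition(List<String> schema, int numDpCols) {
        return schema.subList(schema.size() - numDpCols, schema.size());
    }

    public static void main(String[] args) {
        // Schema after optimization: the DP column "p" was renamed to "_col2".
        List<String> schema = List.of("_col0", "_col1", "_col2");
        if (findByName(schema, "p") != -1) throw new AssertionError("name lookup should fail");
        if (!dpColumnsByPosition(schema, 1).equals(List.of("_col2"))) throw new AssertionError();
        System.out.println("ok");
    }
}
```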
[jira] [Commented] (HIVE-11973) IN operator fails when the column type is DATE
[ https://issues.apache.org/jira/browse/HIVE-11973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935249#comment-14935249 ] Yongzhi Chen commented on HIVE-11973: - The IN statement thinks a string cannot be converted to a date, because FunctionRegistry.getPrimitiveCommonCategory(TypeInfo, TypeInfo) returns null for the two types. But a string can be implicitly converted to a Date, so the primitive common category for the two should be Date. {noformat} FunctionRegistry.getPrimitiveCommonCategory(TypeInfo, TypeInfo) line: 772 FunctionRegistry.getCommonClass(TypeInfo, TypeInfo) line: 810 GenericUDFUtils$ReturnObjectInspectorResolver.update(ObjectInspector, boolean) line: 165 GenericUDFUtils$ReturnObjectInspectorResolver.update(ObjectInspector) line: 103 GenericUDFIn.initialize(ObjectInspector[]) line: 89 GenericUDFIn(GenericUDF).initializeAndFoldConstants(ObjectInspector[]) line: 139 ExprNodeGenericFuncDesc.newInstance(GenericUDF, String, List) line: 234 {noformat} > IN operator fails when the column type is DATE > --- > > Key: HIVE-11973 > URL: https://issues.apache.org/jira/browse/HIVE-11973 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.0.0 >Reporter: sanjiv singh >Assignee: Yongzhi Chen > > Test DDL : > {code} > CREATE TABLE `date_dim`( > `d_date_sk` int, > `d_date_id` string, > `d_date` date, > `d_current_week` string, > `d_current_month` string, > `d_current_quarter` string, > `d_current_year` string) ; > {code} > Hive query : > {code} > SELECT * > FROM date_dim > WHERE d_date IN ('2000-03-22','2001-03-22') ; > {code} > In 1.0.0 , the above query fails with: > {code} > FAILED: SemanticException [Error 10014]: Line 1:180 Wrong arguments > ''2001-03-22'': The arguments for IN should be the same type!
Types are: > {date IN (string, string)} > {code} > I changed the query as follows to work around the error : > {code} > SELECT * > FROM date_dim > WHERE d_date IN (CAST('2000-03-22' AS DATE) , CAST('2001-03-22' AS DATE) > ) ; > {code} > But the equality comparison works without casting : > {code} > SELECT * > FROM date_dim > WHERE d_date = '2000-03-22' ; > {code}
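The fix Yongzhi describes amounts to teaching the common-category resolution that a string is implicitly convertible to a date. A minimal sketch of that rule, assuming simplified category-name strings rather than Hive's real TypeInfo/PrimitiveCategory types (the method below is an illustration, not FunctionRegistry code):

```java
import java.util.Set;

public class CommonTypeSketch {
    // String-group types whose literals (e.g. '2000-03-22') can be
    // implicitly converted to a date.
    static final Set<String> STRING_GROUP = Set.of("STRING", "VARCHAR", "CHAR");

    // Sketch of the extended rule: in addition to returning a category for
    // identical types, treat a (DATE, string-group) pair as DATE instead of
    // returning null (which is what caused the SemanticException for IN).
    static String commonCategory(String a, String b) {
        if (a.equals(b)) return a;
        if (a.equals("DATE") && STRING_GROUP.contains(b)) return "DATE";
        if (b.equals("DATE") && STRING_GROUP.contains(a)) return "DATE";
        return null;  // previous behavior: no common category
    }

    public static void main(String[] args) {
        if (!"DATE".equals(commonCategory("DATE", "STRING"))) throw new AssertionError();
        if (commonCategory("DATE", "BOOLEAN") != null) throw new AssertionError();
        System.out.println("ok");
    }
}
```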
[jira] [Updated] (HIVE-11980) Follow up on HIVE-11696, exception is thrown from CTAS from the table with table-level serde is Parquet while partition-level serde is JSON
[ https://issues.apache.org/jira/browse/HIVE-11980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11980: Description: When we create a new table from the table with table-level serde to be Parquet and partition-level serde to be JSON, currently the following exception will be thrown if there are struct fields. Apparently, getStructFieldsDataAsList() also needs to handle the case of List in addition to ArrayWritable similar to getStructFieldData. {noformat} Caused by: java.lang.UnsupportedOperationException: Cannot inspect java.util.ArrayList at org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.getStructFieldsDataAsList(ArrayWritableObjectInspector.java:172) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:354) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:257) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:241) at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:720) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508) {noformat} was: Apparently, getStructFieldsDataAsList() also needs to handle the case of List in addition to ArrayWritable similar to getStructFieldData. 
{noformat} Caused by: java.lang.UnsupportedOperationException: Cannot inspect java.util.ArrayList at org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.getStructFieldsDataAsList(ArrayWritableObjectInspector.java:172) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:354) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:257) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:241) at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:720) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508) {noformat} > Follow up on HIVE-11696, exception is thrown from CTAS from the table with > table-level serde is Parquet while partition-level serde is JSON > --- > > Key: HIVE-11980 > URL: https://issues.apache.org/jira/browse/HIVE-11980 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-11980.patch > > > When we create a new table from the table with table-level serde to be > Parquet and partition-level serde to be JSON, currently the following > exception will be thrown if there are struct fields. > Apparently, getStructFieldsDataAsList() also needs to handle the case of List > in addition to ArrayWritable similar to getStructFieldData. 
> {noformat} > Caused by: java.lang.UnsupportedOperationException: Cannot inspect > java.util.ArrayList > at > org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.getStructFieldsDataAsList(ArrayWritableObjectInspector.java:172) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:354) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:257) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:241) > at > org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55) > at >
[jira] [Commented] (HIVE-11930) how to prevent ppd the topN(a) udf predication in where clause?
[ https://issues.apache.org/jira/browse/HIVE-11930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935230#comment-14935230 ] Ashutosh Chauhan commented on HIVE-11930: - Not in where clause, but in java source file containing your udf {{top1000}} > how to prevent ppd the topN(a) udf predication in where clause? > --- > > Key: HIVE-11930 > URL: https://issues.apache.org/jira/browse/HIVE-11930 > Project: Hive > Issue Type: New Feature > Components: Hive >Affects Versions: 0.14.0 >Reporter: Feng Yuan >Priority: Minor > > select > a.state_date,a.customer,a.taskid,a.step_id,a.exit_title,a.pv,top1000(a.only_id) > from > ( select > t1.state_date,t1.customer,t1.taskid,t1.step_id,t1.exit_title,t1.pv,t1.only_id > from > ( select t11.state_date, >t11.customer, >t11.taskid, >t11.step_id, >t11.exit_title, >t11.pv, >concat(t11.customer,t11.taskid,t11.step_id) as > only_id >from > ( select > state_date,customer,taskid,step_id,exit_title,count(*) as pv > from bdi_fact2.mid_url_step > where exit_url!='-1' > and exit_title !='-1' > and l_date='2015-08-31' > group by > state_date,customer,taskid,step_id,exit_title > )t11 >)t1 >order by t1.only_id,t1.pv desc > )a > where a.customer='Cdianyingwang' > and a.taskid='33' > and a.step_id='0' > and top1000(a.only_id)<=10; > in above example: > outer top1000(a.only_id)<=10;will ppd to: > stage 1: > ( select t11.state_date, >t11.customer, >t11.taskid, >t11.step_id, >t11.exit_title, >t11.pv, >concat(t11.customer,t11.taskid,t11.step_id) as > only_id >from > ( select > state_date,customer,taskid,step_id,exit_title,count(*) as pv > from bdi_fact2.mid_url_step > where exit_url!='-1' > and exit_title !='-1' > and l_date='2015-08-31' > group by > state_date,customer,taskid,step_id,exit_title > )t11 >)t1 > and this stage have 2 reduce,so you can see this will output 20 records, > upon to outer stage,the final results is exactly this 20 records. > so i want to know is there any way to hint this topN udf predication not to > ppd? 
> Thanks
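Ashutosh's answer refers to Hive's {{@UDFType}} annotation: predicate pushdown skips predicates containing a UDF declared non-deterministic, so annotating {{top1000}} with {{@UDFType(deterministic = false)}} in its Java source prevents the pushdown. The sketch below mirrors that mechanism with a local stand-in annotation; the UDFType annotation and UDF classes here are illustrative, not Hive's org.apache.hadoop.hive.ql.udf.UDFType or its optimizer code:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class PpdHintSketch {
    // Local stand-in for org.apache.hadoop.hive.ql.udf.UDFType.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface UDFType {
        boolean deterministic() default true;
    }

    // A top-N style UDF keeps state across rows (a counter), so it must be
    // declared non-deterministic to keep the predicate where it is written.
    @UDFType(deterministic = false)
    static class Top1000Udf { }

    // An unannotated (deterministic) UDF is fair game for pushdown.
    static class PlainUdf { }

    // Sketch of the optimizer-side check: a predicate referencing a
    // non-deterministic UDF is not pushed down.
    static boolean mayPushDown(Class<?> udfClass) {
        UDFType t = udfClass.getAnnotation(UDFType.class);
        return t == null || t.deterministic();
    }

    public static void main(String[] args) {
        if (mayPushDown(Top1000Udf.class)) throw new AssertionError("top-N must not be pushed");
        if (!mayPushDown(PlainUdf.class)) throw new AssertionError("plain UDF may be pushed");
        System.out.println("ok");
    }
}
```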
[jira] [Updated] (HIVE-11973) IN operator fails when the column type is DATE
[ https://issues.apache.org/jira/browse/HIVE-11973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-11973: Attachment: HIVE-11973.1.patch > IN operator fails when the column type is DATE > --- > > Key: HIVE-11973 > URL: https://issues.apache.org/jira/browse/HIVE-11973 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 1.0.0 >Reporter: sanjiv singh >Assignee: Yongzhi Chen > Attachments: HIVE-11973.1.patch > > > Test DDL : > {code} > CREATE TABLE `date_dim`( > `d_date_sk` int, > `d_date_id` string, > `d_date` date, > `d_current_week` string, > `d_current_month` string, > `d_current_quarter` string, > `d_current_year` string) ; > {code} > Hive query : > {code} > SELECT * > FROM date_dim > WHERE d_date IN ('2000-03-22','2001-03-22') ; > {code} > In 1.0.0 , the above query fails with: > {code} > FAILED: SemanticException [Error 10014]: Line 1:180 Wrong arguments > ''2001-03-22'': The arguments for IN should be the same type! Types are: > {date IN (string, string)} > {code} > I changed the query as follows to work around the error : > {code} > SELECT * > FROM date_dim > WHERE d_date IN (CAST('2000-03-22' AS DATE) , CAST('2001-03-22' AS DATE) > ) ; > {code} > But the equality comparison works without casting : > {code} > SELECT * > FROM date_dim > WHERE d_date = '2000-03-22' ; > {code}
[jira] [Updated] (HIVE-11980) Follow up on HIVE-11696, exception is thrown from CTAS from the table with table-level serde is Parquet while partition-level serde is JSON
[ https://issues.apache.org/jira/browse/HIVE-11980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11980: Attachment: HIVE-11980.patch Update getStructFieldsDataAsList() to also handle {{List}} as a parameter. Added a unit test for coverage. > Follow up on HIVE-11696, exception is thrown from CTAS from the table with > table-level serde is Parquet while partition-level serde is JSON > --- > > Key: HIVE-11980 > URL: https://issues.apache.org/jira/browse/HIVE-11980 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-11980.patch > > > Apparently, getStructFieldsDataAsList() also needs to handle the case of List > in addition to ArrayWritable similar to getStructFieldData. > {noformat} > Caused by: java.lang.UnsupportedOperationException: Cannot inspect > java.util.ArrayList > at > org.apache.hadoop.hive.ql.io.parquet.serde.ArrayWritableObjectInspector.getStructFieldsDataAsList(ArrayWritableObjectInspector.java:172) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:354) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:257) > at > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:241) > at > org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:720) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:813) > at > org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) > at > org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162) > at >
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:508) > {noformat}
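The fix described in the update above can be sketched as follows. Object[] stands in for ArrayWritable here, since the real method works with Hive's writable types; this is an illustrative sketch, not the actual ArrayWritableObjectInspector code:

```java
import java.util.Arrays;
import java.util.List;

public class StructFieldsSketch {
    // Sketch of the fix: getStructFieldsDataAsList should accept the struct
    // data either as a writable array (modeled here as Object[]) or as a
    // java.util.List, instead of throwing UnsupportedOperationException for
    // anything that is not an ArrayWritable.
    @SuppressWarnings("unchecked")
    static List<Object> getStructFieldsDataAsList(Object data) {
        if (data == null) {
            return null;
        }
        if (data instanceof Object[]) {   // stand-in for the ArrayWritable case
            return Arrays.asList((Object[]) data);
        }
        if (data instanceof List) {       // the newly handled case
            return (List<Object>) data;
        }
        throw new UnsupportedOperationException(
                "Cannot inspect " + data.getClass().getName());
    }

    public static void main(String[] args) {
        List<Object> fromArray = getStructFieldsDataAsList(new Object[]{"a", 1});
        List<Object> fromList = getStructFieldsDataAsList(List.of("a", 1));
        if (!fromArray.equals(fromList)) throw new AssertionError();
        System.out.println("ok");
    }
}
```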
[jira] [Commented] (HIVE-11928) ORC footer section can also exceed protobuf message limit
[ https://issues.apache.org/jira/browse/HIVE-11928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935267#comment-14935267 ] Hive QA commented on HIVE-11928: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12762650/HIVE-11928.2.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5458/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5458/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5458/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5458/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + 
[[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at a4c43f0 HIVE-11945: ORC with non-local reads may not be reusing connection to DN (Rajesh Balamohan reviewed by Sergey Shelukhin, Prasanth Jayachandran) + git clean -f -d + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at a4c43f0 HIVE-11945: ORC with non-local reads may not be reusing connection to DN (Rajesh Balamohan reviewed by Sergey Shelukhin, Prasanth Jayachandran) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12762650 - PreCommit-HIVE-TRUNK-Build > ORC footer section can also exceed protobuf message limit > - > > Key: HIVE-11928 > URL: https://issues.apache.org/jira/browse/HIVE-11928 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Jagruti Varia >Assignee: Prasanth Jayachandran > Attachments: HIVE-11928-branch-1.patch, HIVE-11928.1.patch, > HIVE-11928.1.patch, HIVE-11928.2.patch, HIVE-11928.2.patch > > > Similar to HIVE-11592 but for orc footer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11985) handle long typenames from Avro schema in metastore
[ https://issues.apache.org/jira/browse/HIVE-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935413#comment-14935413 ] Jimmy Xiang commented on HIVE-11985: I am not familiar with Avro serde either :( > handle long typenames from Avro schema in metastore > --- > > Key: HIVE-11985 > URL: https://issues.apache.org/jira/browse/HIVE-11985 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11985.patch > >
[jira] [Commented] (HIVE-11755) Incorrect method called with Kerberos enabled in AccumuloStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-11755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935442#comment-14935442 ] Josh Elser commented on HIVE-11755: --- Thanks, [~brocknoland]. Much appreciated! > Incorrect method called with Kerberos enabled in AccumuloStorageHandler > --- > > Key: HIVE-11755 > URL: https://issues.apache.org/jira/browse/HIVE-11755 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: Josh Elser >Assignee: Josh Elser > Fix For: 1.2.2 > > Attachments: HIVE-11755.001.patch, HIVE-11755.002.patch, > HIVE-11755.003.patch > > > The following exception was noticed in testing out the > AccumuloStorageHandler's OutputFormat: > {noformat} > java.lang.IllegalStateException: Connector info for AccumuloOutputFormat can > only be set once per job > at > org.apache.accumulo.core.client.mapreduce.lib.impl.ConfiguratorBase.setConnectorInfo(ConfiguratorBase.java:146) > at > org.apache.accumulo.core.client.mapred.AccumuloOutputFormat.setConnectorInfo(AccumuloOutputFormat.java:125) > at > org.apache.hadoop.hive.accumulo.mr.HiveAccumuloTableOutputFormat.configureAccumuloOutputFormat(HiveAccumuloTableOutputFormat.java:95) > at > org.apache.hadoop.hive.accumulo.mr.HiveAccumuloTableOutputFormat.checkOutputSpecs(HiveAccumuloTableOutputFormat.java:51) > at > org.apache.hadoop.hive.ql.io.HivePassThroughOutputFormat.checkOutputSpecs(HivePassThroughOutputFormat.java:46) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.checkOutputSpecs(FileSinkOperator.java:1124) > at > org.apache.hadoop.hive.ql.io.HiveOutputFormatImpl.checkOutputSpecs(HiveOutputFormatImpl.java:67) > at > org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:268) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287) > at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575) > at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570) > at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561) > at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:431) > at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) > at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311) > at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409) > at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425) > at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) > at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Job Submission failed with exception > 'java.lang.IllegalStateException(Connector info for AccumuloOutputFormat can > only be set once per job)' > {noformat} > The OutputFormat implementation already had a method in place to account for > this exception but the method accidentally
[jira] [Commented] (HIVE-11755) Incorrect method called with Kerberos enabled in AccumuloStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-11755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935439#comment-14935439 ] Brock Noland commented on HIVE-11755: - Looks reasonable. +1 > Incorrect method called with Kerberos enabled in AccumuloStorageHandler > --- > > Key: HIVE-11755 > URL: https://issues.apache.org/jira/browse/HIVE-11755 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: Josh Elser >Assignee: Josh Elser > Fix For: 1.2.2 > > Attachments: HIVE-11755.001.patch, HIVE-11755.002.patch, > HIVE-11755.003.patch > > > The following exception was noticed in testing out the > AccumuloStorageHandler's OutputFormat: > {noformat} > java.lang.IllegalStateException: Connector info for AccumuloOutputFormat can > only be set once per job > at > org.apache.accumulo.core.client.mapreduce.lib.impl.ConfiguratorBase.setConnectorInfo(ConfiguratorBase.java:146) > at > org.apache.accumulo.core.client.mapred.AccumuloOutputFormat.setConnectorInfo(AccumuloOutputFormat.java:125) > at > org.apache.hadoop.hive.accumulo.mr.HiveAccumuloTableOutputFormat.configureAccumuloOutputFormat(HiveAccumuloTableOutputFormat.java:95) > at > org.apache.hadoop.hive.accumulo.mr.HiveAccumuloTableOutputFormat.checkOutputSpecs(HiveAccumuloTableOutputFormat.java:51) > at > org.apache.hadoop.hive.ql.io.HivePassThroughOutputFormat.checkOutputSpecs(HivePassThroughOutputFormat.java:46) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.checkOutputSpecs(FileSinkOperator.java:1124) > at > org.apache.hadoop.hive.ql.io.HiveOutputFormatImpl.checkOutputSpecs(HiveOutputFormatImpl.java:67) > at > org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:268) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) > at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287) > at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575) > at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570) > at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561) > at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:431) > at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1653) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1412) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) > at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311) > at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409) > at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425) > at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) > at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Job Submission failed with exception > 'java.lang.IllegalStateException(Connector info for AccumuloOutputFormat can > only be set once per job)' > {noformat} > The OutputFormat implementation already had a method in place to account for > this exception but the method accidentally wasn't getting
[jira] [Updated] (HIVE-11988) [hive] security issue with hive & ranger for import table command
[ https://issues.apache.org/jira/browse/HIVE-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Sharma updated HIVE-11988: - Assignee: Sushanth Sowmyan > [hive] security issue with hive & ranger for import table command > - > > Key: HIVE-11988 > URL: https://issues.apache.org/jira/browse/HIVE-11988 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 0.14.0, 1.2.1 >Reporter: Deepak Sharma >Assignee: Sushanth Sowmyan >Priority: Critical > Fix For: 0.14.1, 1.2.2 > > > If a user does not have permission to create a table in Hive, and the same > user then imports data for a table using the command below, the table gets > created and the import succeeds; ideally this should not work. > STR: > > 1. put some raw data in the hdfs path /user/user1/tempdata > 2. in the Ranger policy, check that user1 does not have any permission on any table > 3. log in through user1 into beeline (creating a table fails, since the user > doesn't have permission to create a table) > create table tt1(id INT,ff String); > FAILED: HiveAccessControlException Permission denied: user user1 does not > have CREATE privilege on default/tt1 (state=42000,code=4) > 4. now try the following command to import data into a table (the table should not > exist already) > import table tt1 from '/user/user1/tempdata'; > ER: > since user1 doesn't have permission to create a table, this operation should > fail > AR: > the table is created successfully and the data is also imported !!
[jira] [Commented] (HIVE-11880) filter bug of UNION ALL when hive.ppd.remove.duplicatefilters=true and filter condition is type incompatible column
[ https://issues.apache.org/jira/browse/HIVE-11880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935812#comment-14935812 ] Hive QA commented on HIVE-11880: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764007/HIVE-11880.04.patch {color:red}ERROR:{color} -1 due to 57 failed/errored test(s), 9631 tests executed *Failed tests:* {noformat}
TestMiniTezCliDriver-vector_distinct_2.q-vector_interval_2.q-load_dyn_part2.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_explain
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_subq_not_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_alt_syntax
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonblock_op_deduplicate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_boolexpr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_views
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_test_boolean_whereclause
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_count
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_percentile
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join_nulls
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_subq_not_in
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_product_check_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_exists
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_join_nulls
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mr_diff_schema_alias
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_join_nonexistent_part
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_nulls
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_subq_not_in
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cross_join
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cross_product_check_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_alt_syntax
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_cond_pushdown_3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_udf_percentile
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementParallel
{noformat}
Test results:
[jira] [Commented] (HIVE-11964) RelOptHiveTable.hiveColStatsMap might contain mismatched column stats
[ https://issues.apache.org/jira/browse/HIVE-11964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935844#comment-14935844 ] Laljo John Pullokkaran commented on HIVE-11964: --- +1 Thanks [~ctang.ma] > RelOptHiveTable.hiveColStatsMap might contain mismatched column stats > - > > Key: HIVE-11964 > URL: https://issues.apache.org/jira/browse/HIVE-11964 > Project: Hive > Issue Type: Bug > Components: Query Planning, Statistics >Affects Versions: 1.2.1 >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-11964.patch > > > RelOptHiveTable.hiveColStatsMap might contain mismatched stats since it was > built by assuming the stats returned from > == > hiveColStats = StatsUtils.getTableColumnStats(hiveTblMetadata, > hiveNonPartitionCols, nonPartColNamesThatRqrStats); > or > HiveMetaStoreClient.getTableColumnStatistics(dbName, tableName, colNames) > == > have the same order as the requested columns. But actually the order is > non-deterministic. Therefore the returned stats should be re-ordered before > they are put in hiveColStatsMap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
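The re-ordering the description calls for can be sketched as below. `StatsReorder` and the string placeholders are hypothetical stand-ins for Hive's list of `ColumnStatisticsObj`, not the committed patch; the point is only that stats must be indexed by column name before being keyed by position.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the metastore may return column stats in any order,
// so index them by column name and emit them in the requested order before
// building a position-keyed structure like hiveColStatsMap.
public class StatsReorder {
    // returnedStats entries are {columnName, stats}; their order is non-deterministic
    static List<String> reorderByRequest(List<String> requestedCols, List<String[]> returnedStats) {
        Map<String, String> byName = new HashMap<>();
        for (String[] entry : returnedStats) {
            byName.put(entry[0], entry[1]);
        }
        List<String> ordered = new ArrayList<>();
        for (String col : requestedCols) {
            ordered.add(byName.get(col)); // null signals missing stats for this column
        }
        return ordered;
    }
}
```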
[jira] [Commented] (HIVE-11969) start Tez session in background when starting CLI
[ https://issues.apache.org/jira/browse/HIVE-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936046#comment-14936046 ] Sergey Shelukhin commented on HIVE-11969: - Seems to work as intended on cluster. [~sseth] can you take a look? Esp. wrt the right part of init being async. > start Tez session in background when starting CLI > - > > Key: HIVE-11969 > URL: https://issues.apache.org/jira/browse/HIVE-11969 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11969.01.patch, HIVE-11969.patch > > > Tez session spins up AM, which can cause delays, esp. if the cluster is very > busy. > This can be done in background, so the AM might get started while the user is > running local commands and doing other things. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11807) Set ORC buffer size in relation to set stripe size
[ https://issues.apache.org/jira/browse/HIVE-11807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936123#comment-14936123 ] Gopal V commented on HIVE-11807: [~owen.omalley]: LGTM - +1. > Set ORC buffer size in relation to set stripe size > -- > > Key: HIVE-11807 > URL: https://issues.apache.org/jira/browse/HIVE-11807 > Project: Hive > Issue Type: Improvement > Components: File Formats >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-11807.patch, HIVE-11807.patch > > > A customer produced ORC files with very small stripe sizes (10k rows/stripe) > by setting a small 64MB stripe size and 256K buffer size for a 54 column > table. At that size, each of the streams only get a buffer or two before the > stripe size is reached. The current code uses the available memory instead of > the stripe size and thus doesn't shrink the buffer size if the JVM has much > more memory than the stripe size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
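The sizing relation the report suggests can be sketched as follows. The divisor (roughly 20 buffer slots spread across a column's streams) and the clamping bounds are illustrative assumptions, not the committed ORC change; the idea is only that the buffer size should be derived from the stripe size rather than from available JVM memory.

```java
// Hypothetical sketch: derive the compression-buffer size from the configured
// stripe size so each column's streams cycle through several buffers per
// stripe, instead of sizing buffers from available memory.
public class OrcBufferSizing {
    static int estimateBufferSize(long stripeSize, int numColumns, int maxBufferSize) {
        // assume ~20 buffer slots per column across its streams (illustrative constant)
        int estimated = (int) (stripeSize / (20L * numColumns));
        // never exceed the configured maximum, never drop below a small floor
        return Math.max(4 * 1024, Math.min(estimated, maxBufferSize));
    }
}
```

For the 64MB stripe / 54-column / 256K-buffer case described above, this sketch would shrink the buffer to roughly 60KB.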
[jira] [Commented] (HIVE-11898) support default partition in metastoredirectsql
[ https://issues.apache.org/jira/browse/HIVE-11898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936125#comment-14936125 ] Sergey Shelukhin commented on HIVE-11898: - These tests pass for me. [~sushanth] can you take a look? > support default partition in metastoredirectsql > --- > > Key: HIVE-11898 > URL: https://issues.apache.org/jira/browse/HIVE-11898 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11898.01.patch, HIVE-11898.02.patch, > HIVE-11898.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11990) Loading data inpath from a temporary table dir fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11990: - Attachment: HIVE-11990.2.patch [~jdere] Patch #2 with test case. Thanks Hari > Loading data inpath from a temporary table dir fails on Windows > --- > > Key: HIVE-11990 > URL: https://issues.apache.org/jira/browse/HIVE-11990 > Project: Hive > Issue Type: Bug >Reporter: Takahiko Saito >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-11990.1.patch, HIVE-11990.2.patch > > > The query runs: > {noformat} > load data inpath 'wasb:///tmp/testtemptable/temptablemisc_5/data' overwrite > into table temp2; > {noformat} > It fails with: > {noformat} > FAILED: SemanticException [Error 10028]: Line 2:37 Path is not legal > ''wasb:///tmp/testtemptable/temptablemisc_5/data'': Move from: > wasb://humb23-hi...@humboldttesting3.blob.core.windows.net/tmp/testtemptable/temptablemisc_5/data > to: > hdfs://headnode0.humb23-hive1-ssh.h2.internal.cloudapp.net:8020/tmp/hive/hrt_qa/0d5f8b31-5908-44bf-ae4c-9eee956da066/_tmp_space.db/75b44252-42a7-4d28-baf8-4977daa5d49c > is not valid. Please check that values for params "default.fs.name" and > "hive.metastore.warehouse.dir" do not conflict. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11992) speed up and re-enable slow q test files in Hive
[ https://issues.apache.org/jira/browse/HIVE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11992: Description: Due to perceived lack of importance and long runtimes, we have disabled the following q files: CliDriver: rcfile_merge1.q,\ MinimrCliDriver: ql_rewrite_gbtoidx.q,\ ql_rewrite_gbtoidx_cbo_1.q,\ ql_rewrite_gbtoidx_cbo_2.q,\ smb_mapjoin_8.q,\ If someone thinks any of these are important, they should be re-enabled, however, their runtime should be made acceptable first (they each take 10-30 minutes right now, and should take 3 minutes at most, ideally 0-2). Please feel free to look at all of these, or file sub-tasks to look at a subset of the list. was: Due to perceived lack of importance and long runtimes, we have disabled the following q files: CliDriver: rcfile_merge1.q,\ MinimrCliDriver: ql_rewrite_gbtoidx.q,\ ql_rewrite_gbtoidx_cbo_1.q,\ ql_rewrite_gbtoidx_cbo_2.q,\ smb_mapjoin_8.q,\ If someone thinks any of these are important, they should be re-enabled, however, their runtime should be made acceptable first (they take 10-30 minutes right now, and should take 3 minutes at most, ideally 0-2). Please feel free to look at all of these, or file sub-tasks to look at a subset of the list. > speed up and re-enable slow q test files in Hive > > > Key: HIVE-11992 > URL: https://issues.apache.org/jira/browse/HIVE-11992 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Sergey Shelukhin >Priority: Minor > > Due to perceived lack of importance and long runtimes, we have disabled the > following q files: > CliDriver: > rcfile_merge1.q,\ > MinimrCliDriver: > ql_rewrite_gbtoidx.q,\ > ql_rewrite_gbtoidx_cbo_1.q,\ > ql_rewrite_gbtoidx_cbo_2.q,\ > smb_mapjoin_8.q,\ > If someone thinks any of these are important, they should be re-enabled, > however, their runtime should be made acceptable first (they each take 10-30 > minutes right now, and should take 3 minutes at most, ideally 0-2). 
> Please feel free to look at all of these, or file sub-tasks to look at a > subset of the list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11894) CBO: Calcite Operator To Hive Operator (Calcite Return Path): correct table column name in CTAS queries
[ https://issues.apache.org/jira/browse/HIVE-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11894: --- Attachment: HIVE-11894.03.patch a combined patch of 11894 and 11907, depending on the check in of 11971 > CBO: Calcite Operator To Hive Operator (Calcite Return Path): correct table > column name in CTAS queries > --- > > Key: HIVE-11894 > URL: https://issues.apache.org/jira/browse/HIVE-11894 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11894.01.patch, HIVE-11894.02.patch, > HIVE-11894.03.patch > > > To repro, run lineage2.q with return path turned on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11920) ADD JAR failing with URL schemes other than file/ivy/hdfs
[ https://issues.apache.org/jira/browse/HIVE-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935828#comment-14935828 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-11920: -- lgtm +1 > ADD JAR failing with URL schemes other than file/ivy/hdfs > - > > Key: HIVE-11920 > URL: https://issues.apache.org/jira/browse/HIVE-11920 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-11920.1.patch > > > Example stack trace below. It looks like this was introduced by HIVE-9664. > {noformat} > 2015-09-16 19:53:16,502 ERROR [main]: SessionState > (SessionState.java:printError(960)) - invalid url: > wasb:///tmp/hive-udfs-0.1.jar, expecting ( file | hdfs | ivy) as url scheme. > java.lang.RuntimeException: invalid url: wasb:///tmp/hive-udfs-0.1.jar, > expecting ( file | hdfs | ivy) as url scheme. > at > org.apache.hadoop.hive.ql.session.SessionState.getURLType(SessionState.java:1230) > at > org.apache.hadoop.hive.ql.session.SessionState.resolveAndDownload(SessionState.java:1237) > at > org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1163) > at > org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1149) > at > org.apache.hadoop.hive.ql.exec.FunctionTask.addFunctionResources(FunctionTask.java:301) > at > org.apache.hadoop.hive.ql.exec.Registry.registerToSessionRegistry(Registry.java:453) > at > org.apache.hadoop.hive.ql.exec.Registry.registerPermanentFunction(Registry.java:200) > at > org.apache.hadoop.hive.ql.exec.FunctionRegistry.registerPermanentFunction(FunctionRegistry.java:1495) > at > org.apache.hadoop.hive.ql.exec.FunctionTask.createPermanentFunction(FunctionTask.java:136) > at > org.apache.hadoop.hive.ql.exec.FunctionTask.execute(FunctionTask.java:75) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at 
org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1655) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1414) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1195) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
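One direction for the fix can be sketched as below, under the assumption that unknown schemes should fall through to the generic Hadoop FileSystem download path rather than being rejected. `classify` is a hypothetical helper for illustration, not Hive's actual `SessionState.getURLType`.

```java
import java.net.URI;

// Hypothetical sketch of the fix direction: treat "file" and "ivy" specially,
// but let any other scheme (hdfs, wasb, s3a, ...) fall through to the generic
// Hadoop FileSystem download path instead of throwing "invalid url".
public class ResourceUrlType {
    static String classify(String url) {
        String scheme = URI.create(url).getScheme();
        if (scheme == null || scheme.equalsIgnoreCase("file")) {
            return "local";
        }
        if (scheme.equalsIgnoreCase("ivy")) {
            return "ivy";
        }
        return "remote"; // previously only "hdfs" was accepted on this path
    }
}
```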
[jira] [Updated] (HIVE-11983) Hive streaming API uses incorrect logic to assign buckets to incoming records
[ https://issues.apache.org/jira/browse/HIVE-11983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11983: -- Component/s: Transactions > Hive streaming API uses incorrect logic to assign buckets to incoming records > - > > Key: HIVE-11983 > URL: https://issues.apache.org/jira/browse/HIVE-11983 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.2.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Labels: streaming, streaming_api > Attachments: HIVE-11983.patch > > > The Streaming API tries to distribute records evenly into buckets. > All records in every Transaction that is part of a TransactionBatch go to the > same bucket, and a new bucket number is chosen for each TransactionBatch. > Fix: the API needs to hash each record to determine which bucket it belongs to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
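The proposed fix ("hash each record") can be sketched as below. `Arrays.hashCode` is a stand-in for Hive's ObjectInspector-based hash of the bucketing columns; the real implementation must use the same hash that Hive applies at read time, or bucket pruning and SMB joins will still see misplaced rows.

```java
import java.util.Arrays;

// Sketch of the fix idea: hash the record's bucketing columns and map the
// hash onto [0, numBuckets) by masking the sign bit and taking the modulus.
public class BucketAssigner {
    static int bucketFor(Object[] bucketCols, int numBuckets) {
        int hash = Arrays.hashCode(bucketCols); // stand-in for Hive's own hash
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }
}
```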
[jira] [Updated] (HIVE-11960) braces in join conditions are not supported
[ https://issues.apache.org/jira/browse/HIVE-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11960: Attachment: HIVE-11960.02.patch It actually appears to be supported thru the magic of recursion. Updated the test (I just added 2 sets of braces to the original no-braces query that I used to compare) > braces in join conditions are not supported > --- > > Key: HIVE-11960 > URL: https://issues.apache.org/jira/browse/HIVE-11960 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11960.01.patch, HIVE-11960.02.patch, > HIVE-11960.patch > > > These should be supported; they are ANSI -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11903) Add lock metrics to HS2
[ https://issues.apache.org/jira/browse/HIVE-11903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-11903: - Summary: Add lock metrics to HS2 (was: Add zookeeper lock metrics to HS2) > Add lock metrics to HS2 > --- > > Key: HIVE-11903 > URL: https://issues.apache.org/jira/browse/HIVE-11903 > Project: Hive > Issue Type: Sub-task > Components: Diagnosability >Reporter: Szehon Ho >Assignee: Yongzhi Chen > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11903.1.patch, HIVE-11903.2.patch, > HIVE-11903.3.patch > > > Potential metrics are active zookeeper locks taken by type. Can refine as we > go along. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11976) Extend CBO rules to being able to apply rules only once on a given operator
[ https://issues.apache.org/jira/browse/HIVE-11976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936098#comment-14936098 ] Hive QA commented on HIVE-11976: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764060/HIVE-11976.patch {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 9631 tests executed *Failed tests:* {noformat}
TestMiniTezCliDriver-update_orig_table.q-vectorization_13.q-mapreduce2.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_flatten_and_or
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_7
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_pcr
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5461/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5461/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5461/ Messages: {noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12764060 - PreCommit-HIVE-TRUNK-Build > Extend CBO rules to being able to apply rules only once on a given operator > --- > > Key: HIVE-11976 > URL: https://issues.apache.org/jira/browse/HIVE-11976 > Project: Hive > Issue Type: New Feature > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11976.patch > > > Create a way to bail out quickly from HepPlanner if the rule has been already > applied on a certain operator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
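The bail-out described for HIVE-11976 can be sketched as a per-rule visited set keyed by operator id; `ApplyOnceGuard` is a hypothetical guard for illustration, not the actual patch against HepPlanner.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the "apply once" idea: remember which operators a
// rule has already fired on, and bail out immediately on a second match so
// the planner does not re-apply the rule to the same operator.
public class ApplyOnceGuard {
    private final Set<Integer> visited = new HashSet<>();

    // returns true the first time an operator id is seen, false afterwards
    boolean shouldApply(int operatorId) {
        return visited.add(operatorId);
    }
}
```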
[jira] [Updated] (HIVE-11993) It is not necessary to start tez session when cli with parameter "-e" or "-f"
[ https://issues.apache.org/jira/browse/HIVE-11993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated HIVE-11993: -- Description: With "-e" or "-f", Hive executes in batch mode, so I don't think it is necessary to start a Tez session when the Hive session is started. Especially when I only want to execute DDL. was:Especially when I only want to execute DDL. > It is not necessary to start tez session when cli with parameter "-e" or "-f" > - > > Key: HIVE-11993 > URL: https://issues.apache.org/jira/browse/HIVE-11993 > Project: Hive > Issue Type: Improvement > Components: CLI >Reporter: Jeff Zhang > > With "-e" or "-f", Hive executes in batch mode, so I don't think it is > necessary to start a Tez session when the Hive session is started. > Especially when I only want to execute DDL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11915) BoneCP returns closed connections from the pool
[ https://issues.apache.org/jira/browse/HIVE-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935926#comment-14935926 ] Thejas M Nair commented on HIVE-11915: -- Thanks for the update. Can you also update the log message to also indicate that there is going to be a retry ? Otherwise, users might not realize that the issue was not fatal. Also, please update it to include the full exception stack trace. That can be very useful for debugging. > BoneCP returns closed connections from the pool > --- > > Key: HIVE-11915 > URL: https://issues.apache.org/jira/browse/HIVE-11915 > Project: Hive > Issue Type: Bug >Reporter: Takahiko Saito >Assignee: Sergey Shelukhin > Attachments: HIVE-11915.01.patch, HIVE-11915.02.patch, > HIVE-11915.WIP.patch, HIVE-11915.patch > > > It's a very old bug in BoneCP and it will never be fixed... There are > multiple workarounds on the internet but according to responses they are all > unreliable. We should upgrade to HikariCP (which in turn is only supported by > DN 4), meanwhile try some shamanic rituals. In this JIRA we will try a > relatively weak drum. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
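The bounded-retry workaround under discussion can be sketched as below. `PooledConn` is a hypothetical stand-in for `java.sql.Connection` so the sketch stays self-contained; the logging comment reflects the review feedback that the message should announce the retry and carry the full stack trace.

```java
import java.util.function.Supplier;

// Hypothetical workaround sketch: BoneCP can hand back an already-closed
// connection, so validate each one and retry a bounded number of times
// rather than failing the operation outright.
public class ClosedConnectionRetry {
    static class PooledConn {
        final boolean closed;
        PooledConn(boolean closed) { this.closed = closed; }
        boolean isClosed() { return closed; }
    }

    static PooledConn getValidConnection(Supplier<PooledConn> pool, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            PooledConn c = pool.get();
            if (!c.isClosed()) {
                return c;
            }
            // log at WARN here, noting that a retry follows and including the
            // full exception context, so users don't mistake this for a fatal error
        }
        throw new IllegalStateException("pool kept returning closed connections");
    }
}
```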
[jira] [Commented] (HIVE-11985) handle long typenames from Avro schema in metastore
[ https://issues.apache.org/jira/browse/HIVE-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935949#comment-14935949 ] Sergey Shelukhin commented on HIVE-11985: - https://reviews.apache.org/r/38862/ > handle long typenames from Avro schema in metastore > --- > > Key: HIVE-11985 > URL: https://issues.apache.org/jira/browse/HIVE-11985 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11985.01.patch, HIVE-11985.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11903) Add zookeeper lock metrics to HS2
[ https://issues.apache.org/jira/browse/HIVE-11903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936037#comment-14936037 ] Szehon Ho commented on HIVE-11903: -- Thanks, looks good to me, we can remove the extra imports on CuratorFrameworkSingleton for now, but I can do it on commit. +1 > Add zookeeper lock metrics to HS2 > - > > Key: HIVE-11903 > URL: https://issues.apache.org/jira/browse/HIVE-11903 > Project: Hive > Issue Type: Sub-task > Components: Diagnosability >Reporter: Szehon Ho >Assignee: Yongzhi Chen > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11903.1.patch, HIVE-11903.2.patch, > HIVE-11903.3.patch > > > Potential metrics are active zookeeper locks taken by type. Can refine as we > go along. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11992) speed up and re-enable slow q test files in Hive
[ https://issues.apache.org/jira/browse/HIVE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936079#comment-14936079 ] Sergey Shelukhin commented on HIVE-11992: - CliDriver tests are already parallelized, but total capacity of test runners is still limited... > speed up and re-enable slow q test files in Hive > > > Key: HIVE-11992 > URL: https://issues.apache.org/jira/browse/HIVE-11992 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Sergey Shelukhin >Priority: Minor > > Due to perceived lack of importance and long runtimes, we have disabled the > following q files: > CliDriver: > rcfile_merge1.q,\ > MinimrCliDriver: > ql_rewrite_gbtoidx.q,\ > ql_rewrite_gbtoidx_cbo_1.q,\ > ql_rewrite_gbtoidx_cbo_2.q,\ > smb_mapjoin_8.q,\ > If someone thinks any of these are important, they should be re-enabled, > however, their runtime should be made acceptable first (they each take 10-30 > minutes right now, and should take 3 minutes at most, ideally 0-2). > Please feel free to look at all of these, or file sub-tasks to look at a > subset of the list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11823) create a self-contained translation for SARG to be used by metastore
[ https://issues.apache.org/jira/browse/HIVE-11823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11823: Description: See HIVE-11705. This just contains the hbase-metastore-specific methods from that patch NO PRECOMMIT TESTS was: See HIVE-11705. This just contains the hbase-metastore-specific methods from that patch > create a self-contained translation for SARG to be used by metastore > > > Key: HIVE-11823 > URL: https://issues.apache.org/jira/browse/HIVE-11823 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11823.01.patch, HIVE-11823.02.patch, > HIVE-11823.patch > > > See HIVE-11705. This just contains the hbase-metastore-specific methods from > that patch > NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11836) ORC SARG creation throws NPE for null constants with void type
[ https://issues.apache.org/jira/browse/HIVE-11836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11836: -- Component/s: Transactions > ORC SARG creation throws NPE for null constants with void type > -- > > Key: HIVE-11836 > URL: https://issues.apache.org/jira/browse/HIVE-11836 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.3.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 1.3.0 > > Attachments: HIVE-11836.1.patch > > > Queries like > {code} > select * from table where col = null > {code} > will throw the following exception > {code} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.boxLiteral(SearchArgumentImpl.java:446) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.getLiteral(SearchArgumentImpl.java:476) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.createLeaf(SearchArgumentImpl.java:524) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.createLeaf(SearchArgumentImpl.java:584) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.parse(SearchArgumentImpl.java:629) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.addChildren(SearchArgumentImpl.java:598) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.parse(SearchArgumentImpl.java:621) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.addChildren(SearchArgumentImpl.java:598) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.parse(SearchArgumentImpl.java:621) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl$ExpressionBuilder.expression(SearchArgumentImpl.java:916) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentImpl.(SearchArgumentImpl.java:953) > at > 
org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory.create(SearchArgumentFactory.java:36) > at > org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory.createFromConf(SearchArgumentFactory.java:50) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.setSearchArgument(OrcInputFormat.java:312) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1224) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1113) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249) > {code} > This issue does not happen when CBO is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
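A defensive guard along the lines of the eventual fix might look like this sketch. The enum and method are hypothetical simplifications of `SearchArgumentImpl.ExpressionBuilder.createLeaf`, not the committed patch: the point is only that a typed null (void) constant cannot be boxed, so the builder should fall back to an "unknown" leaf instead of dereferencing it.

```java
// Hypothetical sketch: when the predicate literal is a typed null (void),
// boxing it NPEs (the stack trace above), so emit an "unknown" truth value
// and let the reader evaluate the rows instead of building a real leaf.
public class SargNullGuard {
    enum TruthValue { YES_NO_NULL, LEAF }

    static TruthValue createLeaf(Object literal) {
        if (literal == null) {
            // "col = null" never matches rows, but deferring to the reader is safe
            return TruthValue.YES_NO_NULL;
        }
        return TruthValue.LEAF;
    }
}
```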
[jira] [Commented] (HIVE-11672) Hive Streaming API handles bucketing incorrectly
[ https://issues.apache.org/jira/browse/HIVE-11672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935893#comment-14935893 ] Eugene Koifman commented on HIVE-11672: --- [~roshan_naik] is this a dup of HIVE-11983? > Hive Streaming API handles bucketing incorrectly > > > Key: HIVE-11672 > URL: https://issues.apache.org/jira/browse/HIVE-11672 > Project: Hive > Issue Type: Bug > Components: HCatalog, Hive, Transactions >Affects Versions: 1.2.1 >Reporter: Raj Bains >Assignee: Roshan Naik >Priority: Critical > > Hive Streaming API allows the clients to get a random bucket and then insert > data into it. However, this leads to incorrect bucketing as Hive expects data > to be distributed into buckets based on a hash function applied to the bucket > key. The data is inserted randomly by the clients right now. They have no way > of > # Knowing what bucket a row (tuple) belongs to > # Asking for a specific bucket > There are optimizations such as Sort Merge Join and Bucket Map Join that rely > on the data being correctly distributed across buckets, and these will cause > incorrect read results if the data is not distributed correctly. > There are two obvious design choices > # Hive Streaming API should fix this internally by distributing the data > correctly > # Hive Streaming API should expose the data distribution scheme to the clients > and allow them to distribute the data correctly > The first option will mean every client thread will write to many buckets, > causing many small files in each bucket and too many connections open. This > does not seem feasible. The second option pushes more functionality into the > client of the Hive Streaming API, but can maintain high throughput and write > good sized ORC files. This option seems preferable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11952) disable q tests that are both slow and less relevant
[ https://issues.apache.org/jira/browse/HIVE-11952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11952: Fix Version/s: 2.0.0 > disable q tests that are both slow and less relevant > > > Key: HIVE-11952 > URL: https://issues.apache.org/jira/browse/HIVE-11952 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.0.0 > > Attachments: HIVE-11952.01.patch, HIVE-11952.patch > > > We will disable several tests that test obscure and old features and take > inordinate amount of time, and file JIRAs to look at their perf if someone > still cares about them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11444) ACID Compactor should generate stats/alerts
[ https://issues.apache.org/jira/browse/HIVE-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11444: -- Description: Compaction should generate stats about number of files it reads, min/max/avg size etc. It should also generate alerts if it looks like the system is not configured correctly. For example, if there are lots of delta files with very small files, it's a good sign that Streaming API is configured with batches that are too small. Simplest idea is to add another periodic task to AcidHouseKeeperService to //periodically do select count(*), min(txnid),max(txnid), type from txns group by type. //1. dump that to log file at info //2. could also keep counts for last 10min, hour, 6 hours, 24 hours, etc //2.2 if a large increase is detected - issue alert (at least to the log for now) at warn/error was: Compaction should generate stats about number of files it reads, min/max/avg size etc. It should also generate alerts if it looks like the system is not configured correctly. For example, if there are lots of delta files with very small files, it's a good sign that Streaming API is configured with batches that are too small. > ACID Compactor should generate stats/alerts > --- > > Key: HIVE-11444 > URL: https://issues.apache.org/jira/browse/HIVE-11444 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > Compaction should generate stats about number of files it reads, min/max/avg > size etc. It should also generate alerts if it looks like the system is not > configured correctly. > For example, if there are lots of delta files with very small files, it's a > good sign that Streaming API is configured with batches that are too small. > Simplest idea is to add another periodic task to AcidHouseKeeperService to > //periodically do select count(*), min(txnid),max(txnid), type from > txns group by type. > //1. 
dump that to log file at info > //2. could also keep counts for last 10min, hour, 6 hours, 24 hours, > etc > //2.2 if a large increase is detected - issue alert (at least to the > log for now) at warn/error -- This message was sent by Atlassian JIRA (v6.3.4#6332)
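The periodic summary proposed in the description above boils down to a count/min/max aggregation over transaction ids grouped by type. A minimal self-contained Java sketch of that aggregation (the {{Txn}} row type, field names, and the AcidHouseKeeperService hook are illustrative assumptions, not the actual metastore schema):

```java
import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative stand-in for rows of the metastore TXNS table; the real schema
// and the periodic-task wiring into AcidHouseKeeperService are assumptions here.
public class TxnStats {
    public static class Txn {
        final long txnId;
        final char type;
        public Txn(long txnId, char type) { this.txnId = txnId; this.type = type; }
    }

    // Per-type summary equivalent to the proposed
    //   select count(*), min(txnid), max(txnid), type from txns group by type
    public static Map<Character, LongSummaryStatistics> summarize(List<Txn> txns) {
        return txns.stream().collect(Collectors.groupingBy(
                t -> t.type,
                Collectors.summarizingLong(t -> t.txnId)));
    }

    public static void main(String[] args) {
        List<Txn> txns = Arrays.asList(new Txn(1, 'o'), new Txn(2, 'a'), new Txn(5, 'o'));
        for (Map.Entry<Character, LongSummaryStatistics> e : summarize(txns).entrySet()) {
            LongSummaryStatistics s = e.getValue();
            // In the proposal this line would go to the log at INFO level.
            System.out.printf("type=%c count=%d min=%d max=%d%n",
                    e.getKey(), s.getCount(), s.getMin(), s.getMax());
        }
    }
}
```

Keeping rolling counts for the last 10 minutes/hour/etc. would then just mean re-running this over a time-windowed subset and comparing against the previous window.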
[jira] [Updated] (HIVE-11915) BoneCP returns closed connections from the pool
[ https://issues.apache.org/jira/browse/HIVE-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11915: Attachment: HIVE-11915.02.patch Updated the patch; fixed the log, moved the retry logic closer to the code. I disagree about extra retries just-in-case - that's what retrying object store and ms client do, and 90% of the cases when I see these (and 100% for the former) are when it blindly retries for some non-recoverable exceptions. DBCP has been around for a while and doesn't merit doing things just in case, esp. if there's no way to tell what is recoverable... at least with BONECP we know what we are trying to fix. > BoneCP returns closed connections from the pool > --- > > Key: HIVE-11915 > URL: https://issues.apache.org/jira/browse/HIVE-11915 > Project: Hive > Issue Type: Bug >Reporter: Takahiko Saito >Assignee: Sergey Shelukhin > Attachments: HIVE-11915.01.patch, HIVE-11915.02.patch, > HIVE-11915.WIP.patch, HIVE-11915.patch > > > It's a very old bug in BoneCP and it will never be fixed... There are > multiple workarounds on the internet but according to responses they are all > unreliable. We should upgrade to HikariCP (which in turn is only supported by > DN 4), meanwhile try some shamanic rituals. In this JIRA we will try a > relatively weak drum.
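The "relatively weak drum" being tried here is essentially bounded retry when the pool hands back a dead connection. A hedged sketch of that shape (the {{PooledConnection}} interface is a minimal stand-in, not the BoneCP API, and the retry bound is an arbitrary illustrative choice):

```java
import java.util.function.Supplier;

// Sketch of a retry-on-closed workaround: if the pool returns an
// already-closed connection, discard it and ask again, a bounded number
// of times. This is NOT the BoneCP API, just the shape of the idea.
public class RetryOnClosed {
    public interface PooledConnection {
        boolean isClosed();
    }

    public static PooledConnection getOpenConnection(Supplier<PooledConnection> pool,
                                                     int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            PooledConnection c = pool.get();
            if (!c.isClosed()) {
                return c; // healthy connection, hand it out
            }
            // stale/closed connection: drop it and ask the pool again
        }
        throw new IllegalStateException("pool kept returning closed connections");
    }
}
```

Note this only retries the narrow, known-recoverable case (a closed connection from the pool), which matches the objection above to blanket just-in-case retries around non-recoverable exceptions.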
[jira] [Commented] (HIVE-11960) braces in join conditions are not supported
[ https://issues.apache.org/jira/browse/HIVE-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935880#comment-14935880 ] Sergey Shelukhin commented on HIVE-11960: - It's not redundant - there's ambiguity between this and virtual tables otherwise. I will look at the latter. The ((join) join) case is supported and tested, but I suspect that ((join)) won't work. Need to see if the "obvious" way to add it is good enough. > braces in join conditions are not supported > --- > > Key: HIVE-11960 > URL: https://issues.apache.org/jira/browse/HIVE-11960 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11960.01.patch, HIVE-11960.patch > > > These should be supported; they are ANSI
[jira] [Commented] (HIVE-11609) Capability to add a filter to hbase scan via composite key doesn't work
[ https://issues.apache.org/jira/browse/HIVE-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935951#comment-14935951 ] Swarnim Kulkarni commented on HIVE-11609: - [~ashutoshc] So I looked into this a little bit and it looks like the fix with HIVE-10940 isn't really going to work, mostly because it seems to be pretty tailored to Tez. The SerializeFilter is only called for TezCompiler[1]. Any reason why this is not called for the MapReduceCompiler or SparkCompiler too? As a quick hack, I tried calling it within the MapReduceCompiler in the "optimizeTaskPlan" but that doesn't seem to work very well either. Might need to dig a little bit into what's going on there. If it's ok with you, I would potentially like to log a separate bug though and tackle it there just to keep it separate from what we are trying to do here. If that works, we can re-add the "transient" and only go the SerializeFilter route. [1] https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java#L486-L490 > Capability to add a filter to hbase scan via composite key doesn't work > --- > > Key: HIVE-11609 > URL: https://issues.apache.org/jira/browse/HIVE-11609 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Reporter: Swarnim Kulkarni >Assignee: Swarnim Kulkarni > Attachments: HIVE-11609.1.patch.txt, HIVE-11609.2.patch.txt > > > It seems like the capability to add a filter to an HBase scan which was added > as part of HIVE-6411 doesn't work. This is primarily because in the > HiveHBaseInputFormat, the filter is added in getSplits instead of > getRecordReader. This works fine for start and stop keys but not for a filter > because a filter is respected only when an actual scan is performed. This is > also related to the initial refactoring that was done as part of HIVE-3420.
[jira] [Updated] (HIVE-11952) disable q tests that are both slow and less relevant
[ https://issues.apache.org/jira/browse/HIVE-11952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11952: Attachment: HIVE-11952.01.patch The patch that takes care of the other variable > disable q tests that are both slow and less relevant > > > Key: HIVE-11952 > URL: https://issues.apache.org/jira/browse/HIVE-11952 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11952.01.patch, HIVE-11952.patch > > > We will disable several tests that test obscure and old features and take > inordinate amount of time, and file JIRAs to look at their perf if someone > still cares about them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11819) HiveServer2 catches OOMs on request threads
[ https://issues.apache.org/jira/browse/HIVE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936051#comment-14936051 ] Sergey Shelukhin commented on HIVE-11819: - [~sushanth] should this be ported to branch-1? > HiveServer2 catches OOMs on request threads > --- > > Key: HIVE-11819 > URL: https://issues.apache.org/jira/browse/HIVE-11819 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.0.0 > > Attachments: HIVE-11819.01.patch, HIVE-11819.02.patch, > HIVE-11819.patch > > > ThriftCLIService methods such as ExecuteStatement are apparently capable of > catching OOMs because they get wrapped in RTE by HiveSessionProxy. > This shouldn't happen. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11823) create a self-contained translation for SARG to be used by metastore
[ https://issues.apache.org/jira/browse/HIVE-11823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11823: Attachment: HIVE-11823.02.patch Rebased the patch. [~prasanth_j] ping? > create a self-contained translation for SARG to be used by metastore > > > Key: HIVE-11823 > URL: https://issues.apache.org/jira/browse/HIVE-11823 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11823.01.patch, HIVE-11823.02.patch, > HIVE-11823.patch > > > See HIVE-11705. This just contains the hbase-metastore-specific methods from > that patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11925) Hive file format checking breaks load from named pipes
[ https://issues.apache.org/jira/browse/HIVE-11925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11925: Attachment: HIVE-11925.01.patch Fix the check to not be done for HDFS, and to handle unknown fs-es. > Hive file format checking breaks load from named pipes > -- > > Key: HIVE-11925 > URL: https://issues.apache.org/jira/browse/HIVE-11925 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11925.01.patch, HIVE-11925.patch > > > Opening the file and mucking with it when hive.fileformat.check is true (the > default) breaks the LOAD command from a named pipe. Right now, it's done for > all the text files blindly to see if they might be in some other format. > Files.getAttribute can be used to figure out if the input is a named pipe (or > a socket) and skip the format check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
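The description above suggests using file attributes to recognize named pipes (and sockets) and skip the format sniffing for them, since opening such a stream to guess its format consumes or blocks on it. A small self-contained sketch of that guard (the method name is illustrative, not Hive's actual API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Sketch: only sniff the file format for plain regular files.
// Named pipes and sockets report isOther() == true in BasicFileAttributes,
// so they are skipped instead of being opened and read.
public class FormatCheckGuard {
    public static boolean shouldSniffFormat(Path p) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(p, BasicFileAttributes.class);
        return attrs.isRegularFile() && !attrs.isOther();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("fmt", ".txt");
        // A regular temp file is safe to sniff; a FIFO created with mkfifo would not be.
        System.out.println(shouldSniffFormat(tmp));
        Files.delete(tmp);
    }
}
```

The same guard naturally covers the "unknown filesystem" case: anything that cannot be confirmed as a regular file is left alone.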
[jira] [Updated] (HIVE-11983) Hive streaming API uses incorrect logic to assign buckets to incoming records
[ https://issues.apache.org/jira/browse/HIVE-11983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated HIVE-11983: --- Attachment: HIVE-11983.patch Uploading patch > Hive streaming API uses incorrect logic to assign buckets to incoming records > - > > Key: HIVE-11983 > URL: https://issues.apache.org/jira/browse/HIVE-11983 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 1.2.1 >Reporter: Roshan Naik >Assignee: Roshan Naik > Attachments: HIVE-11983.patch > > > The Streaming API tries to distribute records evenly into buckets. > All records in every Transaction that is part of a TransactionBatch go to the > same bucket, and a new bucket number is chosen for each TransactionBatch. > Fix: API needs to hash each record to determine which bucket it belongs to. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
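The fix described above, hashing each record to a bucket, has this basic shape (the exact hash function the streaming API should use, e.g. Hive's ObjectInspector-based bucketing hash, is an assumption here; this just shows the masked-modulo convention):

```java
// Sketch of per-record bucket assignment: pick the bucket from a hash of
// the record's bucketing key instead of using one bucket per TransactionBatch.
// Masking the sign bit keeps the result in [0, numBuckets) even for
// negative hash codes.
public class BucketAssign {
    public static int bucketFor(Object bucketingKey, int numBuckets) {
        return (bucketingKey.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }
}
```

Because the assignment depends only on the key, the same key always lands in the same bucket, while distinct keys spread across all buckets regardless of which TransactionBatch they arrive in.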
[jira] [Updated] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11642: Attachment: HIVE-11642.14.patch A more recent diff > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.12.patch, HIVE-11642.13.patch, HIVE-11642.14.patch, > HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10595) Dropping a table can cause NPEs in the compactor
[ https://issues.apache.org/jira/browse/HIVE-10595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-10595: -- Component/s: Transactions > Dropping a table can cause NPEs in the compactor > > > Key: HIVE-10595 > URL: https://issues.apache.org/jira/browse/HIVE-10595 > Project: Hive > Issue Type: Bug > Components: Metastore, Transactions >Affects Versions: 0.14.0, 1.0.0, 1.1.0 >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: 1.2.0 > > Attachments: HIVE-10595.1.patch, HIVE-10595.patch > > > Reproduction: > # start metastore with compactor off > # insert enough entries in a table to trigger a compaction > # drop the table > # stop metastore > # restart metastore with compactor on > Result: NPE in the compactor threads. I suspect this would also happen if > the inserts and drops were done in between a run of the compactor, but I > haven't proven it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11355) Hive on tez: memory manager for sort buffers (input/output) and operators
[ https://issues.apache.org/jira/browse/HIVE-11355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936035#comment-14936035 ] Gopal V commented on HIVE-11355: [~vikram.dixit]: the feature seems to over-estimate sorter sizes beyond the oldgen sizes in the JVM; the Xmx is 80% of the container size, and the goal of this is only to scale the buffers down from their configured size. I noticed that occasionally the decider scales it upwards instead, with bad results. {code} ], TaskAttempt 3 failed, info=[Error: Failure while running task: attempt_1442254312093_1019_1_00_16_3:java.lang.IllegalArgumentException: tez.runtime.io.sort.mb 8187 should be larger than 0 and should be less than the available task memory (MB):6311 at com.google.common.base.Preconditions.checkArgument(Preconditions.java:92) at org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.getInitialMemoryRequirement(ExternalSorter.java:338) at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.initialize(OrderedPartitionedKVOutput.java:92) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:477) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:455) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {code} To repro, run query28 on 30Tb scale (planner on cn105). 
> Hive on tez: memory manager for sort buffers (input/output) and operators > - > > Key: HIVE-11355 > URL: https://issues.apache.org/jira/browse/HIVE-11355 > Project: Hive > Issue Type: Improvement > Components: Tez >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Attachments: HIVE-11355.1.patch, HIVE-11355.2.patch, > HIVE-11355.3.patch, HIVE-11355.4.patch, HIVE-11355.5.patch > > > We need to better manage the sort buffer allocations to ensure better > performance. Also, we need to provide configurations to certain operators to > stay within memory limits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
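The "only scale down" policy described in the comment above (the failure shows tez.runtime.io.sort.mb 8187 exceeding the 6311 MB of available task memory) can be sketched as a simple clamp. The 80% headroom factor and method names are illustrative assumptions, not the Tez memory-manager API:

```java
// Sketch of a scale-down-only sort buffer sizer: the memory manager may
// shrink the configured buffer, but must never grow it past what the task
// actually has, which is what triggered the IllegalArgumentException above.
public class SortBufferSizer {
    public static int effectiveSortMb(int configuredMb, int availableTaskMb) {
        // leave headroom so the sorter's request stays strictly below
        // the available task memory (80% is an illustrative choice)
        int ceiling = (int) (availableTaskMb * 0.8);
        return Math.max(1, Math.min(configuredMb, ceiling));
    }
}
```

With the numbers from the stack trace, effectiveSortMb(8187, 6311) would shrink the buffer rather than pass the oversized value through, and a small configured value is left untouched.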
[jira] [Updated] (HIVE-11992) speed up and re-enable slow q test files in Hive
[ https://issues.apache.org/jira/browse/HIVE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11992: Priority: Minor (was: Major) > speed up and re-enable slow q test files in Hive > > > Key: HIVE-11992 > URL: https://issues.apache.org/jira/browse/HIVE-11992 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Sergey Shelukhin >Priority: Minor > > Due to perceived lack of importance and long runtimes, we have disabled the > following q files: > CliDriver: > rcfile_merge1.q,\ > MinimrCliDriver: > ql_rewrite_gbtoidx.q,\ > ql_rewrite_gbtoidx_cbo_1.q,\ > ql_rewrite_gbtoidx_cbo_2.q,\ > smb_mapjoin_8.q,\ > If someone thinks any of these are important, they should be re-enabled, > however, their runtime should be made acceptable first (they take 10-30 > minutes right now, and should take 3 minutes at most, ideally 0-2). > Please feel free to look at all of these, or file sub-tasks to look at a > subset of the list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11960) braces in join conditions are not supported
[ https://issues.apache.org/jira/browse/HIVE-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936034#comment-14936034 ] Pengcheng Xiong commented on HIVE-11960: LGTM +1 pending QA run. > braces in join conditions are not supported > --- > > Key: HIVE-11960 > URL: https://issues.apache.org/jira/browse/HIVE-11960 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11960.01.patch, HIVE-11960.02.patch, > HIVE-11960.patch > > > These should be supported; they are ANSI -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11993) It is not necessary to start tez session when cli with parameter "-e" or "-f"
[ https://issues.apache.org/jira/browse/HIVE-11993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated HIVE-11993: -- Description: With "-e" or "-f", Hive executes in batch mode, so I don't think it is necessary to start a Tez session when the Hive session is started. Especially when I only want to execute DDL, the Tez session can be started when it is needed in this mode. was: With "-e" or "-f", hive execute under batch mode, so I don't think it is necessary to start tez session When hive session is started. Especially when I only want to execute DDL. > It is not necessary to start tez session when cli with parameter "-e" or "-f" > - > > Key: HIVE-11993 > URL: https://issues.apache.org/jira/browse/HIVE-11993 > Project: Hive > Issue Type: Improvement > Components: CLI >Reporter: Jeff Zhang > > With "-e" or "-f", Hive executes in batch mode, so I don't think it is > necessary to start a Tez session when the Hive session is started. > Especially when I only want to execute DDL, the Tez session can be started > when it is needed in this mode.
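The lazy-start idea above, deferring the expensive session launch until a query actually needs it, is the classic memoizing-supplier pattern. A hedged generic sketch ({{LazySession}} is a placeholder, not Hive's CLI code):

```java
import java.util.function.Supplier;

// Sketch of lazy session startup: wrap the expensive launch in a memoizing
// holder so a pure-DDL "-e"/"-f" run that never touches Tez never pays the
// startup cost, and a run that does need it starts the session exactly once.
public class LazySession<T> {
    private final Supplier<T> starter;
    private T session; // null until first real use

    public LazySession(Supplier<T> starter) {
        this.starter = starter;
    }

    public synchronized T get() {
        if (session == null) {
            session = starter.get(); // started only on first use
        }
        return session;
    }

    public synchronized boolean isStarted() {
        return session != null;
    }
}
```

A DDL-only batch invocation would simply never call get(), so the session supplier is never invoked.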
[jira] [Updated] (HIVE-11985) handle long typenames from Avro schema in metastore
[ https://issues.apache.org/jira/browse/HIVE-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-11985: Attachment: HIVE-11985.01.patch Need to change the type name to keep the metastore happy. Tested this on a cluster with a giant Avro schema; I can create the table, query it (it's empty though), and describe it correctly. At any rate, it's an improvement over the existing truncated type name. [~ashutoshc] do you want to review or suggest a reviewer? :) Btw, this case will also fail on Oracle (before the patch), as it doesn't allow the data to be truncated on insert. > handle long typenames from Avro schema in metastore > --- > > Key: HIVE-11985 > URL: https://issues.apache.org/jira/browse/HIVE-11985 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11985.01.patch, HIVE-11985.patch > > 
[jira] [Commented] (HIVE-11952) disable q tests that are both slow and less relevant
[ https://issues.apache.org/jira/browse/HIVE-11952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936025#comment-14936025 ] Sergey Shelukhin commented on HIVE-11952: - Tests didn't execute, as planned. Will remove from that variable and commit > disable q tests that are both slow and less relevant > > > Key: HIVE-11952 > URL: https://issues.apache.org/jira/browse/HIVE-11952 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11952.patch > > > We will disable several tests that test obscure and old features and take > inordinate amount of time, and file JIRAs to look at their perf if someone > still cares about them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11989) vector_groupby_reduce.q is failing on CLI and MiniTez drivers on master
[ https://issues.apache.org/jira/browse/HIVE-11989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935966#comment-14935966 ] Matt McCline commented on HIVE-11989: - +1 lgtm > vector_groupby_reduce.q is failing on CLI and MiniTez drivers on master > --- > > Key: HIVE-11989 > URL: https://issues.apache.org/jira/browse/HIVE-11989 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11989.01.patch > > > need to update the golden files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11970) COLUMNS_V2 table in metastore should have a longer name field
[ https://issues.apache.org/jira/browse/HIVE-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936023#comment-14936023 ] Sergey Shelukhin commented on HIVE-11970: - [~sushanth] [~thejas] ping? > COLUMNS_V2 table in metastore should have a longer name field > - > > Key: HIVE-11970 > URL: https://issues.apache.org/jira/browse/HIVE-11970 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11970.patch > > > In some cases, esp. with derived names, e.g. from Avro schemas, the column > names can be pretty long. COLUMNS_V2 name field has a very short length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11992) speed up and re-enable slow q test files in Hive
[ https://issues.apache.org/jira/browse/HIVE-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936076#comment-14936076 ] Gopal V commented on HIVE-11992: If these need to be brought back, moving these tests to a separate test-shard would do the trick - since they'll run in parallel instead of blocking other tests. > speed up and re-enable slow q test files in Hive > > > Key: HIVE-11992 > URL: https://issues.apache.org/jira/browse/HIVE-11992 > Project: Hive > Issue Type: Bug > Components: Test >Reporter: Sergey Shelukhin >Priority: Minor > > Due to perceived lack of importance and long runtimes, we have disabled the > following q files: > CliDriver: > rcfile_merge1.q,\ > MinimrCliDriver: > ql_rewrite_gbtoidx.q,\ > ql_rewrite_gbtoidx_cbo_1.q,\ > ql_rewrite_gbtoidx_cbo_2.q,\ > smb_mapjoin_8.q,\ > If someone thinks any of these are important, they should be re-enabled, > however, their runtime should be made acceptable first (they each take 10-30 > minutes right now, and should take 3 minutes at most, ideally 0-2). > Please feel free to look at all of these, or file sub-tasks to look at a > subset of the list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present
[ https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936101#comment-14936101 ] Hive QA commented on HIVE-11977: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764065/HIVE-11977.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5462/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5462/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5462/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-5462/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + 
[[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin >From https://github.com/apache/hive dc130f0..1636292 branch-1 -> origin/branch-1 a5ffa71..6a8d7e4 master -> origin/master + git reset --hard HEAD HEAD is now at a5ffa71 HIVE-11724 : WebHcat get jobs to order jobs on time order with latest at top (Kiran Kumar Kolli, reviewed by Hari Subramaniyan) + git clean -f -d Removing ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveHepPlannerContext.java Removing ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveVolcanoPlannerContext.java Removing ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRulesRegistry.java + git checkout master Already on 'master' Your branch is behind 'origin/master' by 3 commits, and can be fast-forwarded. + git reset --hard origin/master HEAD is now at 6a8d7e4 HIVE-11819 : HiveServer2 catches OOMs on request threads (Sergey Shelukhin, reviewed by Vaibhav Gumashta) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch patch: malformed patch at line 34: @@ -146,7 +156,7 @@ private boolean pathIsInPartition(Path split, String partitionPath) { patch: malformed patch at line 34: @@ -146,7 +156,7 @@ private boolean pathIsInPartition(Path split, String partitionPath) { patch: malformed patch at line 34: @@ -146,7 +156,7 @@ private boolean pathIsInPartition(Path split, String partitionPath) { The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12764065 - PreCommit-HIVE-TRUNK-Build > Hive should handle an external avro table with zero length files present > > > Key: HIVE-11977 > URL: https://issues.apache.org/jira/browse/HIVE-11977 > Project: Hive > Issue Type: Bug >Reporter: Aaron Dossett >Assignee: Aaron Dossett > Attachments: HIVE-11977.patch > > > If a zero length file is in the top level directory housing an external avro > table, all hive queries on the table fail. > This issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader > creates a new org.apache.avro.file.DataFileReader and DataFileReader throws > an exception when trying to read an empty file (because the empty file lacks > the magic number marking it as avro). > AvroGenericRecordReader should detect
[jira] [Commented] (HIVE-11962) Improve windowing_windowspec2.q tests to return consistent results
[ https://issues.apache.org/jira/browse/HIVE-11962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936218#comment-14936218 ] Hive QA commented on HIVE-11962: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764073/HIVE-11962.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 9638 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5463/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5463/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5463/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12764073 - PreCommit-HIVE-TRUNK-Build > Improve windowing_windowspec2.q tests to return consistent results > -- > > Key: HIVE-11962 > URL: https://issues.apache.org/jira/browse/HIVE-11962 > Project: Hive > Issue Type: Test > Components: Test >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu >Priority: Trivial > Attachments: HIVE-11962.patch > > > Upstream test result for windowing_windowspec2.q seems to return consistent > result while we have observe that in a different test env, the result could > be slightly different. > e.g., for the following query, the value t could be the same in each > partition of ts. 
So the row order could be either way for those rows. I haven't > looked further into why it causes that difference yet. > {noformat} > select ts, f, max(f) over (partition by ts order by t rows between 2 > preceding and 1 preceding) from over10k limit 100; > {noformat}
[jira] [Commented] (HIVE-11642) LLAP: make sure tests pass #3
[ https://issues.apache.org/jira/browse/HIVE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936304#comment-14936304 ] Hive QA commented on HIVE-11642: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764316/HIVE-11642.14.patch {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 9986 tests executed *Failed tests:* {noformat} TestCliDriver-orc_ppd_decimal.q-vector_decimal_round.q-metadata_export_drop.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynpart_sort_opt_vectorization org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_explainuser_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_metadata_only_queries org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_auto_smb_mapjoin_14 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorization_limit org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_scriptfile1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5464/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5464/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5464/ Messages: {noformat} Executing 
org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12764316 - PreCommit-HIVE-TRUNK-Build > LLAP: make sure tests pass #3 > - > > Key: HIVE-11642 > URL: https://issues.apache.org/jira/browse/HIVE-11642 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11642.01.patch, HIVE-11642.02.patch, > HIVE-11642.03.patch, HIVE-11642.04.patch, HIVE-11642.05.patch, > HIVE-11642.12.patch, HIVE-11642.13.patch, HIVE-11642.14.patch, > HIVE-11642.patch > > > Tests should pass against the most recent branch and Tez 0.8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11699) Support special characters in quoted table names
[ https://issues.apache.org/jira/browse/HIVE-11699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11699: --- Attachment: HIVE-11699.06.patch > Support special characters in quoted table names > > > Key: HIVE-11699 > URL: https://issues.apache.org/jira/browse/HIVE-11699 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11699.01.patch, HIVE-11699.02.patch, > HIVE-11699.03.patch, HIVE-11699.04.patch, HIVE-11699.05.patch, > HIVE-11699.06.patch > > > Right now table names can only be "[a-zA-z_0-9]+". This patch tries to > investigate how much change there should be if we would like to support > special characters, e.g., "/" in table names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11937) Improve StatsOptimizer to deal with query with additional constant columns
[ https://issues.apache.org/jira/browse/HIVE-11937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11937: --- Fix Version/s: 2.0.0 > Improve StatsOptimizer to deal with query with additional constant columns > -- > > Key: HIVE-11937 > URL: https://issues.apache.org/jira/browse/HIVE-11937 > Project: Hive > Issue Type: Improvement >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Fix For: 2.0.0 > > Attachments: HIVE-11937.01.patch, HIVE-11937.02.patch > > > Right now StatsOptimizer can deal with query such as "select count(1) from > src" by directly looking into the metastore. However, it can not deal with > "select '1' as one, count(1) from src" which has an additional constant > column. We may improve it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11948) Investigate TxnHandler and CompactionTxnHandler to see where we can reduce transaction isolation level
[ https://issues.apache.org/jira/browse/HIVE-11948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-11948: -- Description: at least some operations (or parts of operations) can run at READ_COMMITTED. CompactionTxnHandler.setRunAs() CompactionTxnHandler.findNextToCompact() if update stmt includes cq_state = '" + INITIATED_STATE + "'" in WHERE clause and logic to look for "next" candidate CompactionTxnHandler.markCompacted() perhaps add cq_state=WORKING_STATE in Where clause (mostly as an extra consistency check) was:at least some operations (or parts of operations) can run at READ_COMMITTED. > Investigate TxnHandler and CompactionTxnHandler to see where we can reduce > transaction isolation level > -- > > Key: HIVE-11948 > URL: https://issues.apache.org/jira/browse/HIVE-11948 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 0.14.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > > at least some operations (or parts of operations) can run at READ_COMMITTED. > CompactionTxnHandler.setRunAs() > CompactionTxnHandler.findNextToCompact() > if update stmt includes cq_state = '" + INITIATED_STATE + "'" in WHERE clause > and logic to look for "next" candidate > CompactionTxnHandler.markCompacted() > perhaps add cq_state=WORKING_STATE in Where clause (mostly as an extra > consistency check) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
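The consistency-check idea in the description (putting the expected current state, e.g. cq_state = INITIATED_STATE, in the WHERE clause of the update) can be sketched outside Hive. This is an illustrative Python/sqlite3 sketch under assumed, simplified table and state names, not the actual TxnHandler or CompactionTxnHandler code:

```python
import sqlite3

# Guarded update: name the expected current state in the WHERE clause, so a
# concurrent state change turns this statement into a no-op instead of
# silently clobbering another worker's claim.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE compaction_queue (cq_id INTEGER, cq_state TEXT)")
conn.execute("INSERT INTO compaction_queue VALUES (1, 'i')")  # 'i' = initiated

first = conn.execute(
    "UPDATE compaction_queue SET cq_state = 'w' "
    "WHERE cq_id = ? AND cq_state = 'i'", (1,)).rowcount  # claims the entry

second = conn.execute(
    "UPDATE compaction_queue SET cq_state = 'w' "
    "WHERE cq_id = ? AND cq_state = 'i'", (1,)).rowcount  # matches nothing now
```

With a guard like this, a concurrent worker re-running the same statement updates zero rows, which is part of what makes running at READ_COMMITTED (rather than a stricter isolation level) safe for these operations.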
[jira] [Resolved] (HIVE-11400) insert overwrite task always stuck at latest job
[ https://issues.apache.org/jira/browse/HIVE-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feng Yuan resolved HIVE-11400. -- Resolution: Cannot Reproduce > insert overwrite task always stuck at latest job > --- > > Key: HIVE-11400 > URL: https://issues.apache.org/jira/browse/HIVE-11400 > Project: Hive > Issue Type: Bug > Components: Hive, Query Processor >Affects Versions: 0.14.0 > Environment: hadoop 2.6.0, centos 6.5 >Reporter: Feng Yuan > Attachments: failed_logs, success_logs, task_explain > > > When I run a task like "insert overwrite table a (select * from b join > select * from c on b.id=c.id) tmp;" it gets stuck on the last job (e.g. the > parser explains that the task has 3 jobs, but the third job (or stage) never > gets executed). > There are two attached files: > 1. the hql explain file. > 2. the running logs. > You can see that stage-0 in the explain file is a Move Operation, but it never > appears in the running logs. In fact 16 of the 17 jobs complete (the 13th job > seems to get lost; I don't see it anywhere in the logs), but the 17th job > hangs forever, and is never even assigned a job id or launched! > Can anyone help with this? > Thanks very much! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11937) Improve StatsOptimizer to deal with query with additional constant columns
[ https://issues.apache.org/jira/browse/HIVE-11937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936185#comment-14936185 ] Pengcheng Xiong commented on HIVE-11937: The failed tests are unrelated. Pushed to master. Thanks [~ashutoshc] for the review! > Improve StatsOptimizer to deal with query with additional constant columns > -- > > Key: HIVE-11937 > URL: https://issues.apache.org/jira/browse/HIVE-11937 > Project: Hive > Issue Type: Improvement >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11937.01.patch, HIVE-11937.02.patch > > > Right now StatsOptimizer can deal with query such as "select count(1) from > src" by directly looking into the metastore. However, it can not deal with > "select '1' as one, count(1) from src" which has an additional constant > column. We may improve it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10965) direct SQL for stats fails in 0-column case
[ https://issues.apache.org/jira/browse/HIVE-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936395#comment-14936395 ] Lefty Leverenz commented on HIVE-10965: --- Thanks, I wasn't sure if I should change the fix version myself, but this is better. > direct SQL for stats fails in 0-column case > --- > > Key: HIVE-10965 > URL: https://issues.apache.org/jira/browse/HIVE-10965 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 1.2.1, 1.0.2 > > Attachments: HIVE-10965.01.patch, HIVE-10965.02.patch, > HIVE-10965.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11930) how to prevent ppd the topN(a) udf predication in where clause?
[ https://issues.apache.org/jira/browse/HIVE-11930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936180#comment-14936180 ] Feng Yuan commented on HIVE-11930: -- Do you mean this: @UDFType(stateful=true) public class top1000 extends UDF {} I tried that, but my SQL is: ... where a.customer='Cdianyingwang' and a.taskid='33' and a.step_id='0' and top1000(a.only_id)<=10; and the compiler says top1000 should not be placed in the where clause. > how to prevent ppd the topN(a) udf predication in where clause? > --- > > Key: HIVE-11930 > URL: https://issues.apache.org/jira/browse/HIVE-11930 > Project: Hive > Issue Type: New Feature > Components: Hive >Affects Versions: 0.14.0 >Reporter: Feng Yuan >Priority: Minor > > select > a.state_date,a.customer,a.taskid,a.step_id,a.exit_title,a.pv,top1000(a.only_id) > from > ( select > t1.state_date,t1.customer,t1.taskid,t1.step_id,t1.exit_title,t1.pv,t1.only_id > from > ( select t11.state_date, >t11.customer, >t11.taskid, >t11.step_id, >t11.exit_title, >t11.pv, >concat(t11.customer,t11.taskid,t11.step_id) as > only_id >from > ( select > state_date,customer,taskid,step_id,exit_title,count(*) as pv > from bdi_fact2.mid_url_step > where exit_url!='-1' > and exit_title !='-1' > and l_date='2015-08-31' > group by > state_date,customer,taskid,step_id,exit_title > )t11 >)t1 >order by t1.only_id,t1.pv desc > )a > where a.customer='Cdianyingwang' > and a.taskid='33' > and a.step_id='0' > and top1000(a.only_id)<=10; > In the above example: > the outer top1000(a.only_id)<=10 will be pushed down to: > stage 1: > ( select t11.state_date, >t11.customer, >t11.taskid, >t11.step_id, >t11.exit_title, >t11.pv, >concat(t11.customer,t11.taskid,t11.step_id) as > only_id >from > ( select > state_date,customer,taskid,step_id,exit_title,count(*) as pv > from bdi_fact2.mid_url_step > where exit_url!='-1' > and exit_title !='-1' > and l_date='2015-08-31' > group by > state_date,customer,taskid,step_id,exit_title > )t11 >)t1 > and this stage has 2 reducers, so you can see
that this will output 20 records, > and at the outer stage the final result is exactly those 20 records. > So I want to know: is there any way to hint that this topN UDF predicate should not be > pushed down? > Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
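The wrong-result mechanism described above can be simulated outside Hive. In this Python sketch, topn is a hypothetical stand-in for a stateful top1000-style counter UDF: evaluated once after the final single-reducer stage it keeps N rows, but pushed down and re-instantiated in each of two reducers it keeps N rows per reducer, doubling the output:

```python
def make_topn(n):
    # Stateful predicate: passes only the first n rows it is asked about,
    # mimicking a counter-based "top1000"-style UDF.
    seen = {"count": 0}
    def topn(_row):
        seen["count"] += 1
        return seen["count"] <= n
    return topn

rows = list(range(8))

# Predicate evaluated once, after the final (single-reducer) stage:
f = make_topn(2)
after = [r for r in rows if f(r)]          # 2 rows, as intended

# Predicate pushed down into a stage with 2 reducers, one counter each:
part1, part2 = rows[:4], rows[4:]
g1, g2 = make_topn(2), make_topn(2)
pushed = [r for r in part1 if g1(r)] + [r for r in part2 if g2(r)]  # 4 rows
```

This is exactly why pushing the predicate into the 2-reducer stage yields 2x the expected records in the query above.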
[jira] [Updated] (HIVE-11972) [Refactor] Improve determination of dynamic partitioning columns in FileSink Operator
[ https://issues.apache.org/jira/browse/HIVE-11972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-11972: Attachment: HIVE-11972.3.patch > [Refactor] Improve determination of dynamic partitioning columns in FileSink > Operator > - > > Key: HIVE-11972 > URL: https://issues.apache.org/jira/browse/HIVE-11972 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-11972.2.patch, HIVE-11972.3.patch, HIVE-11972.patch > > > Currently it uses column names to locate DP columns, which is brittle since > column names may change during planning and optimization phases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11977) Hive should handle an external avro table with zero length files present
[ https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Dossett updated HIVE-11977: - Attachment: HIVE-11977-002.patch > Hive should handle an external avro table with zero length files present > > > Key: HIVE-11977 > URL: https://issues.apache.org/jira/browse/HIVE-11977 > Project: Hive > Issue Type: Bug >Reporter: Aaron Dossett >Assignee: Aaron Dossett > Attachments: HIVE-11977-002.patch, HIVE-11977.patch > > > If a zero length file is in the top level directory housing an external avro > table, all hive queries on the table fail. > This issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader > creates a new org.apache.avro.file.DataFileReader and DataFileReader throws > an exception when trying to read an empty file (because the empty file lacks > the magic number marking it as avro). > AvroGenericRecordReader should detect an empty file and then behave > reasonably. > Caused by: java.io.IOException: Not a data file. > at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102) > at org.apache.avro.file.DataFileReader.(DataFileReader.java:97) > at > org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.(AvroGenericRecordReader.java:81) > at > org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246) > ... 25 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
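The fix described above amounts to a guard before handing the file to the Avro reader. A minimal Python sketch of that logic (assumed behaviour of the patch, not the actual AvroGenericRecordReader change):

```python
import os
import tempfile

AVRO_MAGIC = b"Obj\x01"  # 4-byte header every Avro object container file starts with

def open_records(path):
    # A zero-length file cannot contain the magic number; treat it as an
    # empty record stream instead of failing with "Not a data file".
    if os.path.getsize(path) == 0:
        return iter(())
    with open(path, "rb") as f:
        if f.read(4) != AVRO_MAGIC:
            raise IOError("Not a data file.")
    return iter(())  # a real reader would decode data blocks here

with tempfile.NamedTemporaryFile(suffix=".avro", delete=False) as f:
    empty_path = f.name          # created and left empty on purpose

records = list(open_records(empty_path))  # no exception, no records
```

With this guard, a query over an external table whose directory contains a stray zero-length file simply sees no rows from that file.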
[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present
[ https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14936249#comment-14936249 ] Aaron Dossett commented on HIVE-11977: -- Attached a second patch that includes a unit test and better patch formatting > Hive should handle an external avro table with zero length files present > > > Key: HIVE-11977 > URL: https://issues.apache.org/jira/browse/HIVE-11977 > Project: Hive > Issue Type: Bug >Reporter: Aaron Dossett >Assignee: Aaron Dossett > Attachments: HIVE-11977-2.patch, HIVE-11977.patch > > > If a zero length file is in the top level directory housing an external avro > table, all hive queries on the table fail. > This issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader > creates a new org.apache.avro.file.DataFileReader and DataFileReader throws > an exception when trying to read an empty file (because the empty file lacks > the magic number marking it as avro). > AvroGenericRecordReader should detect an empty file and then behave > reasonably. > Caused by: java.io.IOException: Not a data file. > at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102) > at org.apache.avro.file.DataFileReader.(DataFileReader.java:97) > at > org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.(AvroGenericRecordReader.java:81) > at > org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246) > ... 25 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11977) Hive should handle an external avro table with zero length files present
[ https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Dossett updated HIVE-11977: - Attachment: HIVE-11977-2.patch > Hive should handle an external avro table with zero length files present > > > Key: HIVE-11977 > URL: https://issues.apache.org/jira/browse/HIVE-11977 > Project: Hive > Issue Type: Bug >Reporter: Aaron Dossett >Assignee: Aaron Dossett > Attachments: HIVE-11977-2.patch, HIVE-11977.patch > > > If a zero length file is in the top level directory housing an external avro > table, all hive queries on the table fail. > This issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader > creates a new org.apache.avro.file.DataFileReader and DataFileReader throws > an exception when trying to read an empty file (because the empty file lacks > the magic number marking it as avro). > AvroGenericRecordReader should detect an empty file and then behave > reasonably. > Caused by: java.io.IOException: Not a data file. > at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102) > at org.apache.avro.file.DataFileReader.(DataFileReader.java:97) > at > org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.(AvroGenericRecordReader.java:81) > at > org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246) > ... 25 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11977) Hive should handle an external avro table with zero length files present
[ https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Dossett updated HIVE-11977: - Attachment: (was: HIVE-11977-002.patch) > Hive should handle an external avro table with zero length files present > > > Key: HIVE-11977 > URL: https://issues.apache.org/jira/browse/HIVE-11977 > Project: Hive > Issue Type: Bug >Reporter: Aaron Dossett >Assignee: Aaron Dossett > Attachments: HIVE-11977.patch > > > If a zero length file is in the top level directory housing an external avro > table, all hive queries on the table fail. > This issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader > creates a new org.apache.avro.file.DataFileReader and DataFileReader throws > an exception when trying to read an empty file (because the empty file lacks > the magic number marking it as avro). > AvroGenericRecordReader should detect an empty file and then behave > reasonably. > Caused by: java.io.IOException: Not a data file. > at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102) > at org.apache.avro.file.DataFileReader.(DataFileReader.java:97) > at > org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.(AvroGenericRecordReader.java:81) > at > org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246) > ... 25 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11445) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby distinct does not work
[ https://issues.apache.org/jira/browse/HIVE-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-11445: --- Attachment: HIVE-11445.02.patch > CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby > distinct does not work > - > > Key: HIVE-11445 > URL: https://issues.apache.org/jira/browse/HIVE-11445 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11445.01.patch, HIVE-11445.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11684) Implement limit pushdown through outer join in CBO
[ https://issues.apache.org/jira/browse/HIVE-11684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935505#comment-14935505 ] Hive QA commented on HIVE-11684: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12764000/HIVE-11684.12.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9633 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-auto_join30.q-vector_data_types.q-filter_join_breaktask.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5459/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5459/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5459/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12764000 - PreCommit-HIVE-TRUNK-Build > Implement limit pushdown through outer join in CBO > -- > > Key: HIVE-11684 > URL: https://issues.apache.org/jira/browse/HIVE-11684 > Project: Hive > Issue Type: New Feature > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11684.01.patch, HIVE-11684.02.patch, > HIVE-11684.03.patch, HIVE-11684.04.patch, HIVE-11684.05.patch, > HIVE-11684.07.patch, HIVE-11684.08.patch, HIVE-11684.09.patch, > HIVE-11684.10.patch, HIVE-11684.11.patch, HIVE-11684.12.patch, > HIVE-11684.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11971) testResultSetMetaData() in TestJdbcDriver2.java is failing on CBO AST path
[ https://issues.apache.org/jira/browse/HIVE-11971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11971: --- Summary: testResultSetMetaData() in TestJdbcDriver2.java is failing on CBO AST path (was: testResultSetMetaData() in TestJdbc2.java is failing on CBO AST path) > testResultSetMetaData() in TestJdbcDriver2.java is failing on CBO AST path > -- > > Key: HIVE-11971 > URL: https://issues.apache.org/jira/browse/HIVE-11971 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11971.01.patch > > > test is passing because wrong golden file is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11634) Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...)
[ https://issues.apache.org/jira/browse/HIVE-11634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11634: - Attachment: HIVE-11634.96.patch [~jcamachorodriguez] Can you please look at the latest patch, made the required changes. Thanks Hari > Support partition pruning for IN(STRUCT(partcol, nonpartcol..)...) > -- > > Key: HIVE-11634 > URL: https://issues.apache.org/jira/browse/HIVE-11634 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-11634.1.patch, HIVE-11634.2.patch, > HIVE-11634.3.patch, HIVE-11634.4.patch, HIVE-11634.5.patch, > HIVE-11634.6.patch, HIVE-11634.7.patch, HIVE-11634.8.patch, > HIVE-11634.9.patch, HIVE-11634.91.patch, HIVE-11634.92.patch, > HIVE-11634.93.patch, HIVE-11634.94.patch, HIVE-11634.95.patch, > HIVE-11634.96.patch > > > Currently, we do not support partition pruning for the following scenario > {code} > create table pcr_t1 (key int, value string) partitioned by (ds string); > insert overwrite table pcr_t1 partition (ds='2000-04-08') select * from src > where key < 20 order by key; > insert overwrite table pcr_t1 partition (ds='2000-04-09') select * from src > where key < 20 order by key; > insert overwrite table pcr_t1 partition (ds='2000-04-10') select * from src > where key < 20 order by key; > explain extended select ds from pcr_t1 where struct(ds, key) in > (struct('2000-04-08',1), struct('2000-04-09',2)); > {code} > If we run the above query, we see that all the partitions of table pcr_t1 are > present in the filter predicate where as we can prune partition > (ds='2000-04-10'). > The optimization is to rewrite the above query into the following. 
> {code} > explain extended select ds from pcr_t1 where (struct(ds)) IN > (struct('2000-04-08'), struct('2000-04-09')) and struct(ds, key) in > (struct('2000-04-08',1), struct('2000-04-09',2)); > {code} > The predicate (struct(ds)) IN (struct('2000-04-08'), struct('2000-04-09')) > is used by partition pruner to prune the columns which otherwise will not be > pruned. > This is an extension of the idea presented in HIVE-11573. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
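The effect of the rewrite can be sketched outside Hive: project the partition-column component out of each tuple in the multi-column IN list, and use the resulting single-column IN list to prune partitions. Illustrative Python, not the planner code:

```python
# struct(ds, key) IN (struct('2000-04-08',1), struct('2000-04-09',2))
in_list = [("2000-04-08", 1), ("2000-04-09", 2)]
partitions = ["2000-04-08", "2000-04-09", "2000-04-10"]

# Derived predicate struct(ds) IN (...): keep only the partition-column
# values; any partition whose ds is not mentioned can never match.
ds_values = {t[0] for t in in_list}
pruned = [p for p in partitions if p in ds_values]  # drops '2000-04-10'
```

The derived predicate is implied by the original one, so adding it never changes query results; it only gives the partition pruner something it can evaluate on partition columns alone.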
[jira] [Commented] (HIVE-11976) Extend CBO rules to being able to apply rules only once on a given operator
[ https://issues.apache.org/jira/browse/HIVE-11976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935574#comment-14935574 ] Laljo John Pullokkaran commented on HIVE-11976: --- Patch looks good. May be we should address the following: 1. HivePreFilter Rule the bail out condition should be modified (Pullup predicate should use the child real node). 2. Should we register child filter as well so that rule doesn't fire on child. > Extend CBO rules to being able to apply rules only once on a given operator > --- > > Key: HIVE-11976 > URL: https://issues.apache.org/jira/browse/HIVE-11976 > Project: Hive > Issue Type: New Feature > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11976.patch > > > Create a way to bail out quickly from HepPlanner if the rule has been already > applied on a certain operator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
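The bail-out described in the issue can be sketched as simple bookkeeping: remember which (rule, operator) pairs have already fired and return early on repeats. This is an assumed illustration of the mechanics, not Calcite's HepPlanner internals:

```python
applied = set()

def apply_rule(rule_name, op_id):
    # Fire a rewrite rule at most once per operator: bail out quickly if
    # this (rule, operator) pair has already been processed.
    key = (rule_name, op_id)
    if key in applied:
        return False           # already applied here: skip the rewrite
    applied.add(key)
    return True                # perform the rewrite (elided in this sketch)

first = apply_rule("HivePreFilteringRule", "FIL_7")   # fires
second = apply_rule("HivePreFilteringRule", "FIL_7")  # bails out
```

Registering the newly created child operator in the same set (as the review comment suggests) would keep the rule from immediately re-firing on its own output.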
[jira] [Commented] (HIVE-11971) testResultSetMetaData() in TestJdbc2.java is failing on CBO AST path
[ https://issues.apache.org/jira/browse/HIVE-11971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935467#comment-14935467 ] Pengcheng Xiong commented on HIVE-11971: The failed tests are unrelated. [~ashutoshc] or [~jpullokkaran], could you please take a look? Thanks. > testResultSetMetaData() in TestJdbc2.java is failing on CBO AST path > > > Key: HIVE-11971 > URL: https://issues.apache.org/jira/browse/HIVE-11971 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11971.01.patch > > > test is passing because wrong golden file is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10965) direct SQL for stats fails in 0-column case
[ https://issues.apache.org/jira/browse/HIVE-10965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935570#comment-14935570 ] Thejas M Nair commented on HIVE-10965: -- Thanks for catching that [~leftylev]! [~pxiong] was backporting some critical fixes to the 1.0 line. I had an offline discussion with him just now clarifying the process; he is going to update the fix version for a couple of other jiras that were backported. > direct SQL for stats fails in 0-column case > --- > > Key: HIVE-10965 > URL: https://issues.apache.org/jira/browse/HIVE-10965 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 1.2.1, 1.0.2 > > Attachments: HIVE-10965.01.patch, HIVE-10965.02.patch, > HIVE-10965.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11445) CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby distinct does not work
[ https://issues.apache.org/jira/browse/HIVE-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935479#comment-14935479 ] Jesus Camacho Rodriguez commented on HIVE-11445: Problem was that distinct nodes that are part of the key were being added to distExprNodes; patch solves that issue. > CBO: Calcite Operator To Hive Operator (Calcite Return Path) : groupby > distinct does not work > - > > Key: HIVE-11445 > URL: https://issues.apache.org/jira/browse/HIVE-11445 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Pengcheng Xiong >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-11445.01.patch, HIVE-11445.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11835) Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL
[ https://issues.apache.org/jira/browse/HIVE-11835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1493#comment-1493 ] Szehon Ho commented on HIVE-11835: -- Thanks for the clarification. > Type decimal(1,1) reads 0.0, 0.00, etc from text file as NULL > - > > Key: HIVE-11835 > URL: https://issues.apache.org/jira/browse/HIVE-11835 > Project: Hive > Issue Type: Bug > Components: Types >Affects Versions: 1.2.0, 1.1.0, 2.0.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-11835.1.patch, HIVE-11835.2.patch, HIVE-11835.patch > > > Steps to reproduce: > 1. create a text file with values like 0.0, 0.00, etc. > 2. create table in hive with type decimal(1,1). > 3. run "load data local inpath ..." to load data into the table. > 4. run select * on the table. > You will see that NULL is displayed for 0.0, 0.00, .0, etc. Instead, these > should be read as 0.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
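The intended lenient behaviour can be sketched with Python's decimal module: judge the parsed value against (precision, scale) rather than the literal's digit count, so "0.00" still fits decimal(1,1). This is an illustration of the expected semantics, not Hive's serde code, and read_decimal is a hypothetical helper name:

```python
from decimal import Decimal

def read_decimal(text, precision, scale):
    # Round the literal to the target scale first; only then check whether
    # the resulting value needs more digits than the precision allows.
    value = Decimal(text).quantize(Decimal(1).scaleb(-scale))
    if len(value.as_tuple().digits) > precision:
        return None  # genuinely out of range for the declared type -> NULL
    return value

zero_a = read_decimal("0.0", 1, 1)    # fits decimal(1,1)
zero_b = read_decimal("0.00", 1, 1)   # extra textual zeros, value still fits
too_big = read_decimal("1.23", 1, 1)  # 1.2 needs 2 digits -> NULL
```

The bug in the report corresponds to rejecting the literal "0.00" on its textual form before the value is even considered.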
[jira] [Resolved] (HIVE-8527) Incorrect TIMESTAMP result on JDBC direct read when next row has no (null) value for the TIMESTAMP
[ https://issues.apache.org/jira/browse/HIVE-8527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Zanter resolved HIVE-8527. Resolution: Fixed Fix Version/s: 1.1.0 I can verify that this is fixed in Hive version 1.1.0. (may have been fixed earlier than that as well.) Seems to have been fixed by the same thing that fixed HIVE-8297. > Incorrect TIMESTAMP result on JDBC direct read when next row has no (null) > value for the TIMESTAMP > -- > > Key: HIVE-8527 > URL: https://issues.apache.org/jira/browse/HIVE-8527 > Project: Hive > Issue Type: Bug > Components: JDBC >Affects Versions: 0.13.0 > Environment: Linux >Reporter: Doug Sedlak > Fix For: 1.1.0 > > > For the case: > SELECT * FROM [table] > JDBC direct reads the table backing data, versus cranking up a MR and > creating a result set. This report is another direct read JDBC issue with > TIMESTAMPS, see HIVE-8297 also. > As in title, a succeeding row with no value corrupts the value read for the > current row. To reproduce using beeline: > 1) Create this file as follows in HDFS. > $ cat > /tmp/ts2.txt > 2014-09-28 00:00:00,2014-09-28 00:00:00, > ,, > > $ hadoop fs -copyFromLocal /tmp/ts2.txt /tmp/ts2.txt > 2) In beeline load above HDFS data to a TEXTFILE table: > $ beeline > > !connect jdbc:hive2://:/ hive pass > org.apache.hive.jdbc.HiveDriver > > drop table `TIMESTAMP_TEXT2`; > > CREATE TABLE `TIMESTAMP_TEXT2` (`ts1` TIMESTAMP, `ts2` TIMESTAMP) ROW > FORMAT DELIMITED FIELDS TERMINATED BY '\054' LINES TERMINATED BY '\012' > STORED AS TEXTFILE; > > LOAD DATA INPATH '/tmp/ts2.txt' OVERWRITE INTO TABLE > `TIMESTAMP_TEXT2`; > 3) To demonstrate the corrupt data read, in beeline: > > select * from `TIMESTAMP_TEXT2`; > Note 1: The incorrect conduct demonstrated above replicates with a standalone > Java/JDBC program. > Note 2: Don't know if this is an issue with any other data types, also don't > know what releases affected, however this occurs in Hive 13. Hive CLI works > fine. 
Also works fine if you force a MR: > select * from `TIMESTAMP_TEXT2` where 1=1; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11724) WebHcat get jobs to order jobs on time order with latest at top
[ https://issues.apache.org/jira/browse/HIVE-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11724: - Labels: TODOC1.3 (was: ) > WebHcat get jobs to order jobs on time order with latest at top > --- > > Key: HIVE-11724 > URL: https://issues.apache.org/jira/browse/HIVE-11724 > Project: Hive > Issue Type: Improvement > Components: WebHCat >Affects Versions: 0.14.0 >Reporter: Kiran Kumar Kolli >Assignee: Kiran Kumar Kolli > Labels: TODOC1.3 > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-11724.1.patch, HIVE-11724.2.patch, > HIVE-11724.3.patch, HIVE-11724.4.patch, HIVE-11724.5.patch, HIVE-11724.6.patch > > > HIVE-5519 added pagination feature support to WebHcat. This implementation > returns the jobs lexicographically resulting in older jobs showing at the > top. > Improvement is to order them on time with latest at top. Typically latest > jobs (or running) ones are more relevant to the user. Time based ordering > with pagination makes more sense. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11985) handle long typenames from Avro schema in metastore
[ https://issues.apache.org/jira/browse/HIVE-11985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935699#comment-14935699 ] Sergey Shelukhin commented on HIVE-11985: - [~xuefuz] [~sachingoyal] are you familiar with it? I wonder who is. most commits on these files are pretty old, you have one in 2014 :) > handle long typenames from Avro schema in metastore > --- > > Key: HIVE-11985 > URL: https://issues.apache.org/jira/browse/HIVE-11985 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11985.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11850) On Windows, creating udf function using wasb fail throwing java.lang.RuntimeException: invalid url: wasb:///... expecting ( file | hdfs | ivy) as url scheme.
[ https://issues.apache.org/jira/browse/HIVE-11850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11850: - Summary: On Windows, creating udf function using wasb fail throwing java.lang.RuntimeException: invalid url: wasb:///... expecting ( file | hdfs | ivy) as url scheme. (was: On Humboldt, creating udf function using wasb fail throwing java.lang.RuntimeException: invalid url: wasb:///... expecting ( file | hdfs | ivy) as url scheme.) > On Windows, creating udf function using wasb fail throwing > java.lang.RuntimeException: invalid url: wasb:///... expecting ( file | hdfs > | ivy) as url scheme. > --- > > Key: HIVE-11850 > URL: https://issues.apache.org/jira/browse/HIVE-11850 > Project: Hive > Issue Type: Bug > Components: UDF >Affects Versions: 1.2.1 > Environment: Humboldt >Reporter: Takahiko Saito > Fix For: 1.2.1 > > > {noformat} > hive> drop function if exists gencounter; > OK > Time taken: 2.614 seconds > On Humboldt, creating UDF function fail as follows: > hive> create function gencounter as > 'org.apache.hive.udf.generic.GenericUDFGenCounter' using jar > 'wasb:///tmp/hive-udfs-0.1.jar'; > invalid url: wasb:///tmp/hive-udfs-0.1.jar, expecting ( file | hdfs | ivy) > as url scheme. > Failed to register default.gencounter using class > org.apache.hive.udf.generic.GenericUDFGenCounter > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.FunctionTask > {noformat} > The jar exists in wasb dir: > {noformat} > hrt_qa@headnode0:~$ hadoop fs -ls wasb:///tmp/ > Found 2 items > -rw-r--r-- 1 hrt_qa supergroup 4472 2015-09-16 11:50 > wasb:///tmp/hive-udfs-0.1.jar > drwxrwxrwx - hdfs supergroup 0 2015-09-16 12:00 > wasb:///tmp/阿䶵aa阿䶵 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
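As the error text suggests, the failure comes from a fixed whitelist of URL schemes for UDF resources. A hedged Python sketch of that check (assumed logic, not the actual FunctionTask code):

```python
from urllib.parse import urlparse

# The three schemes the error message names; wasb:// is not among them,
# so Azure blob storage paths are rejected before the jar is ever fetched.
ALLOWED_SCHEMES = {"file", "hdfs", "ivy"}

def is_valid_resource_url(url):
    return urlparse(url).scheme in ALLOWED_SCHEMES

wasb_ok = is_valid_resource_url("wasb:///tmp/hive-udfs-0.1.jar")  # rejected
hdfs_ok = is_valid_resource_url("hdfs:///tmp/hive-udfs-0.1.jar")  # accepted
```

Fixing the bug amounts to widening this check (or delegating to the configured Hadoop filesystems) rather than hard-coding the three schemes.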
[jira] [Updated] (HIVE-11990) Loading data inpath from a temporary table dir fails on Humboldt
[ https://issues.apache.org/jira/browse/HIVE-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11990:
-----------------------------------------------------
    Attachment: HIVE-11990.1.patch

[~jdere] Can you please review the change? Now that we support moving files from one file system to another, we can remove the following code in LoadSemanticAnalyzer.java:
{code}
    // only in 'local' mode do we copy stuff from one place to another.
    // reject different scheme/authority in other cases.
    if (!isLocal
        && (!StringUtils.equals(fromURI.getScheme(), toURI.getScheme()) || !StringUtils
            .equals(fromURI.getAuthority(), toURI.getAuthority()))) {
      String reason = "Move from: " + fromURI.toString() + " to: " + toURI.toString()
          + " is not valid. "
          + "Please check that values for params \"default.fs.name\" and "
          + "\"hive.metastore.warehouse.dir\" do not conflict.";
      throw new SemanticException(ErrorMsg.ILLEGAL_PATH.getMsg(ast, reason));
    }
{code}
Thanks
Hari

> Loading data inpath from a temporary table dir fails on Humboldt
> ----------------------------------------------------------------
>
>                 Key: HIVE-11990
>                 URL: https://issues.apache.org/jira/browse/HIVE-11990
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Takahiko Saito
>            Assignee: Hari Sankar Sivarama Subramaniyan
>         Attachments: HIVE-11990.1.patch
>
> The query:
> {noformat}
> load data inpath 'wasb:///tmp/testtemptable/temptablemisc_5/data' overwrite
> into table temp2;
> {noformat}
> It fails with:
> {noformat}
> FAILED: SemanticException [Error 10028]: Line 2:37 Path is not legal
> ''wasb:///tmp/testtemptable/temptablemisc_5/data'': Move from:
> wasb://humb23-hi...@humboldttesting3.blob.core.windows.net/tmp/testtemptable/temptablemisc_5/data
> to:
> hdfs://headnode0.humb23-hive1-ssh.h2.internal.cloudapp.net:8020/tmp/hive/hrt_qa/0d5f8b31-5908-44bf-ae4c-9eee956da066/_tmp_space.db/75b44252-42a7-4d28-baf8-4977daa5d49c
> is not valid. Please check that values for params "default.fs.name" and
> "hive.metastore.warehouse.dir" do not conflict.
> {noformat}
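The check being removed rejects any non-local load whose source and destination URIs differ in scheme or authority, which is exactly why a wasb:// source moving into an hdfs:// scratch dir fails. A minimal stand-alone sketch of that comparison (class and method names here are illustrative, not Hive's API):

```java
import java.net.URI;

// Sketch of a scheme/authority comparison: a non-local load is rejected
// whenever the source and destination URIs live on different file systems.
public class MoveCheck {
    public static boolean sameFileSystem(URI from, URI to) {
        return equalsNullSafe(from.getScheme(), to.getScheme())
            && equalsNullSafe(from.getAuthority(), to.getAuthority());
    }

    // Null-safe equality, like org.apache.commons.lang.StringUtils.equals.
    static boolean equalsNullSafe(String a, String b) {
        return a == null ? b == null : a.equals(b);
    }

    public static void main(String[] args) {
        // Schemes differ (wasb vs hdfs), so the SemanticException path fires.
        URI from = URI.create("wasb://container@account.blob.core.windows.net/tmp/data");
        URI to = URI.create("hdfs://headnode0:8020/tmp/hive/_tmp_space.db/x");
        System.out.println(sameFileSystem(from, to));
    }
}
```

Once cross-file-system moves are supported, this guard is overly strict, which is the rationale for deleting it in the patch.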
[jira] [Updated] (HIVE-11990) Loading data inpath from a temporary table dir fails on Windows
[ https://issues.apache.org/jira/browse/HIVE-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-11990:
-----------------------------------------------------
    Summary: Loading data inpath from a temporary table dir fails on Windows  (was: Loading data inpath from a temporary table dir fails on Humboldt)

> Loading data inpath from a temporary table dir fails on Windows
> ---------------------------------------------------------------
>
>                 Key: HIVE-11990
>                 URL: https://issues.apache.org/jira/browse/HIVE-11990
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Takahiko Saito
>            Assignee: Hari Sankar Sivarama Subramaniyan
>         Attachments: HIVE-11990.1.patch
>
> The query:
> {noformat}
> load data inpath 'wasb:///tmp/testtemptable/temptablemisc_5/data' overwrite
> into table temp2;
> {noformat}
> It fails with:
> {noformat}
> FAILED: SemanticException [Error 10028]: Line 2:37 Path is not legal
> ''wasb:///tmp/testtemptable/temptablemisc_5/data'': Move from:
> wasb://humb23-hi...@humboldttesting3.blob.core.windows.net/tmp/testtemptable/temptablemisc_5/data
> to:
> hdfs://headnode0.humb23-hive1-ssh.h2.internal.cloudapp.net:8020/tmp/hive/hrt_qa/0d5f8b31-5908-44bf-ae4c-9eee956da066/_tmp_space.db/75b44252-42a7-4d28-baf8-4977daa5d49c
> is not valid. Please check that values for params "default.fs.name" and
> "hive.metastore.warehouse.dir" do not conflict.
> {noformat}
[jira] [Commented] (HIVE-11915) BoneCP returns closed connections from the pool
[ https://issues.apache.org/jira/browse/HIVE-11915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935599#comment-14935599 ]

Thejas M Nair commented on HIVE-11915:
--------------------------------------
The retries are being set only for bonecp, but basing the log message on that seems very brittle. Other developers might add retries for other connection pooling types by setting getConnAttemptCount and easily overlook updating the log message.

Even in the case of bonecp exceptions, the error can sometimes be non-recoverable. This is a fatal error and should be rare. The delay due to retries is likely to be very small (not easily noticeable to the user), and I think that delay would be acceptable for the circumstance.

This looks like a tradeoff between easier-to-maintain code and a delay that users are unlikely to notice.

> BoneCP returns closed connections from the pool
> -----------------------------------------------
>
>                 Key: HIVE-11915
>                 URL: https://issues.apache.org/jira/browse/HIVE-11915
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Takahiko Saito
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-11915.01.patch, HIVE-11915.WIP.patch,
> HIVE-11915.patch
>
> It's a very old bug in BoneCP and it will never be fixed... There are
> multiple workarounds on the internet but according to responses they are all
> unreliable. We should upgrade to HikariCP (which in turn is only supported by
> DN 4), meanwhile try some shamanic rituals. In this JIRA we will try a
> relatively weak drum.
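The workaround being discussed is a bounded retry loop around the pool: ask for a connection, discard it if it comes back closed, and try again up to a configured attempt count. A hedged stand-alone sketch of that shape, assuming a hypothetical `ConnectionLike` stand-in for a pooled JDBC connection (this is not Hive's or BoneCP's actual API):

```java
import java.util.function.Supplier;

// Sketch of a retry-on-closed-connection workaround for a pool that can
// hand out stale connections. The pool is modeled as a Supplier; the
// attempt limit plays the role of a getConnAttemptCount-style setting.
public class PoolRetry {
    public interface ConnectionLike {
        boolean isClosed();
    }

    public static ConnectionLike getValidConnection(Supplier<ConnectionLike> pool, int attempts) {
        for (int i = 0; i < attempts; i++) {
            ConnectionLike c = pool.get();
            if (!c.isClosed()) {
                return c;  // usable connection
            }
            // Stale/closed connection handed out by the pool; try again.
        }
        throw new IllegalStateException("No valid connection after " + attempts + " attempts");
    }

    public static void main(String[] args) {
        // Simulate a pool that first returns a closed connection, then an open one.
        java.util.Iterator<ConnectionLike> handed = java.util.Arrays.asList(
            (ConnectionLike) () -> true, (ConnectionLike) () -> false).iterator();
        ConnectionLike c = getValidConnection(handed::next, 3);
        System.out.println("closed? " + c.isClosed());
    }
}
```

The comment's point is the cost/benefit: a few extra pool calls in the rare fatal case are cheap, while keying the log message to one pool implementation makes the code brittle as new pool types gain retries.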