[jira] [Commented] (HIVE-10407) separate out the timestamp ranges for testing purposes
[ https://issues.apache.org/jira/browse/HIVE-10407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504447#comment-14504447 ] Hive QA commented on HIVE-10407: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12726702/HIVE-10407.patch {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8732 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file 
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3508/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3508/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3508/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12726702 - PreCommit-HIVE-TRUNK-Build separate out the timestamp ranges for testing purposes -- Key: HIVE-10407 URL: https://issues.apache.org/jira/browse/HIVE-10407 Project: Hive Issue Type: Bug Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-10407.patch, HIVE-10407.patch, HIVE-10407.patch Some platforms have limits for date ranges, so separate out the test cases that are outside of the range 1970 to 2038. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
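The 1970-to-2038 window named in the issue lines up with platforms that store timestamps as seconds in a signed 32-bit integer, which overflows at 2038-01-19T03:14:07Z. A minimal sketch (not Hive code; class and method names are made up) of checking whether a timestamp fits that representation:

```java
import java.time.Instant;

// Illustrative sketch: a signed 32-bit second counter from the Unix
// epoch tops out at Integer.MAX_VALUE seconds (2038-01-19T03:14:07Z).
// Test cases outside this window need platform-aware handling.
public class TimeRange32 {
    static boolean fitsIn32BitSeconds(Instant t) {
        long s = t.getEpochSecond();
        return s >= Integer.MIN_VALUE && s <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(fitsIn32BitSeconds(Instant.parse("2037-12-31T00:00:00Z"))); // true
        System.out.println(fitsIn32BitSeconds(Instant.parse("2039-01-01T00:00:00Z"))); // false
    }
}
```

Note the issue's lower bound of 1970 suggests some platforms also reject pre-epoch (negative) values; the check above only enforces the signed 32-bit limit.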
[jira] [Commented] (HIVE-10347) Merge spark to trunk 4/15/2015
[ https://issues.apache.org/jira/browse/HIVE-10347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505439#comment-14505439 ] Szehon Ho commented on HIVE-10347: -- Test failures don't look related, and it's ready to go. [~xuefuz] can you take a look? Merge spark to trunk 4/15/2015 -- Key: HIVE-10347 URL: https://issues.apache.org/jira/browse/HIVE-10347 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-10347.2.patch, HIVE-10347.2.patch, HIVE-10347.3.patch, HIVE-10347.4.patch, HIVE-10347.5.patch, HIVE-10347.5.patch, HIVE-10347.6.patch, HIVE-10347.patch CLEAR LIBRARY CACHE -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505408#comment-14505408 ] Jason Dere commented on HIVE-9917: -- You mean on RB? I don't think I have access to update your RB entry. You can just create a new git diff but without the --no-prefix option, and upload that to RB. After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have correct behavior of converting int to timestamp. While the customers are using such incorrect behavior for so long, better to make it configurable so that in one release, it will default to old/inconsistent way and the next release will default to new/consistent way. And then we will deprecate it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
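For context on the --no-prefix point above: Review Board parses the default a/ and b/ path prefixes in git diff output, and a patch generated with --no-prefix loses them. A self-contained demonstration in a throwaway repository (file names and the patch name are made up):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q patchdemo && cd patchdemo
git -c user.email=qa@example.com -c user.name=qa commit -q --allow-empty -m init
echo hello > f.txt && git add f.txt
# The default diff format keeps the a/ and b/ prefixes that Review Board
# expects; adding --no-prefix here would strip them.
git diff --cached > rb.patch
grep '+++ b/f.txt' rb.patch
# prints: +++ b/f.txt
```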
[jira] [Updated] (HIVE-10233) Hive on LLAP: Memory manager
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-10233: -- Attachment: HIVE-10233-WIP.4.patch Hive on LLAP: Memory manager Key: HIVE-10233 URL: https://issues.apache.org/jira/browse/HIVE-10233 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, HIVE-10233-WIP.4.patch We need a memory manager in llap/tez to manage the usage of memory across threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10421) DROP TABLE with qualified table name ignores database name when checking partitions
[ https://issues.apache.org/jira/browse/HIVE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-10421: -- Attachment: HIVE-10421.1.patch DROP TABLE with qualified table name ignores database name when checking partitions --- Key: HIVE-10421 URL: https://issues.apache.org/jira/browse/HIVE-10421 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-10421.1.patch Hive was only recently changed to allow drop table dbname.tabname. However DDLTask.dropTable() is still using an older version of Hive.getPartitionNames(), which only took in a single string for the table name, rather than the database and table names. As a result Hive is filling in the current database name as the dbname during the listPartitions call to the MetaStore. It also appears that on the Hive Metastore side, in the non-auth path there is no validation to check that the dbname.tablename actually exists - this call simply returns back an empty list of partitions, which causes the table to be dropped without checking any of the partition information. I will open a separate issue for this one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
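The qualified-name resolution at issue can be sketched as follows (illustrative only, not Hive's actual code; the class and method are made up): a name such as db1.t1 carries its own database, while the older Hive.getPartitionNames() overload received only a bare table string, so the session's current database was substituted even for qualified names.

```java
// Illustrative sketch of resolving a possibly-qualified table name to a
// (database, table) pair. Passing only the table component downstream,
// as the older API did, silently discards the database component.
public class TableNameResolver {
    static String[] resolve(String name, String currentDb) {
        int dot = name.indexOf('.');
        if (dot >= 0) {
            // A qualified name carries its own database.
            return new String[] { name.substring(0, dot), name.substring(dot + 1) };
        }
        // An unqualified name falls back to the session's current database.
        return new String[] { currentDb, name };
    }

    public static void main(String[] args) {
        String[] r = resolve("db1.t1", "default");
        System.out.println(r[0] + "/" + r[1]); // prints db1/t1
    }
}
```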
[jira] [Commented] (HIVE-10413) [CBO] Return path assumes distinct column cant be same as grouping column
[ https://issues.apache.org/jira/browse/HIVE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505462#comment-14505462 ] Laljo John Pullokkaran commented on HIVE-10413: --- [~ashutoshc] We need to handle: 1. We need to maintain gbInfo.distExprNodes/distExprNames/distExprTypes 2. In genMapSideRS we add all of gbInfo.distExprNodes to reduce keys. This is wrong if the distinct key is already part of the GB key. [CBO] Return path assumes distinct column cant be same as grouping column - Key: HIVE-10413 URL: https://issues.apache.org/jira/browse/HIVE-10413 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10413.patch Found in cbo_udf_udaf.q tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10323) Tez merge join operator does not honor hive.join.emit.interval
[ https://issues.apache.org/jira/browse/HIVE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505559#comment-14505559 ] Gunther Hagleitner commented on HIVE-10323: --- Patch looks good. Minor nit: The condition for nextKeyGroup should be an else block. Some other considerations: - Maybe we should log emit and spill intervals. Also warn if the first is larger than the latter? - Looks like you emit before you put the current record into storage. Wouldn't it be better to do that afterwards? Biggest concern: There's not a lot of testing going on. For one thing I think you could set the emit interval low (2?) for all tez tests and see if you get bigger coverage that way. If not you should test all the combinations: left, right, outer, multi key, multi table, spill other tables, etc. Tez merge join operator does not honor hive.join.emit.interval -- Key: HIVE-10323 URL: https://issues.apache.org/jira/browse/HIVE-10323 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10323.1.patch This affects efficiency in case of skews. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8858) Visualize generated Spark plan [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505572#comment-14505572 ] Jimmy Xiang commented on HIVE-8858: --- Cool, looks good to me. +1 Visualize generated Spark plan [Spark Branch] - Key: HIVE-8858 URL: https://issues.apache.org/jira/browse/HIVE-8858 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Assignee: Chinna Rao Lalam Attachments: HIVE-8858-spark.patch, HIVE-8858.1-spark.patch, HIVE-8858.2-spark.patch, HIVE-8858.3-spark.patch, HIVE-8858.4-spark.patch The spark plan generated by SparkPlanGenerator contains info which isn't available in Hive's explain plan, such as RDD caching. Also, the graph is slightly different from the original SparkWork. Thus, it would be nice to visualize the plan as is done for SparkWork. Preferably, the visualization can happen as part of Hive explain extended. If not feasible, we at least can log this at info level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505332#comment-14505332 ] Aihua Xu commented on HIVE-9917: I see. Yeah. I was using --no-prefix. After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have correct behavior of converting int to timestamp. While the customers are using such incorrect behavior for so long, better to make it configurable so that in one release, it will default to old/inconsistent way and the next release will default to new/consistent way. And then we will deprecate it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10408) LLAP: NPE in scheduler in case of rejected tasks
[ https://issues.apache.org/jira/browse/HIVE-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-10408: -- Summary: LLAP: NPE in scheduler in case of rejected tasks (was: LLAP: query fails - NPE (old exception I posted was bogus)) LLAP: NPE in scheduler in case of rejected tasks Key: HIVE-10408 URL: https://issues.apache.org/jira/browse/HIVE-10408 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Siddharth Seth Fix For: llap Attachments: HIVE-10408.1.txt {noformat} java.lang.NullPointerException at org.apache.tez.dag.app.rm.LlapTaskSchedulerService.deallocateTask(LlapTaskSchedulerService.java:388) at org.apache.tez.dag.app.rm.TaskSchedulerEventHandler.handleTASucceeded(TaskSchedulerEventHandler.java:339) at org.apache.tez.dag.app.rm.TaskSchedulerEventHandler.handleEvent(TaskSchedulerEventHandler.java:224) at org.apache.tez.dag.app.rm.TaskSchedulerEventHandler$1.run(TaskSchedulerEventHandler.java:493) {noformat} The query, running alone on 10-node cluster, dumped 1000 mappers into running; with 3 completed it failed with that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10421) DROP TABLE with qualified table name ignores database name when checking partitions
[ https://issues.apache.org/jira/browse/HIVE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere reassigned HIVE-10421: - Assignee: Jason Dere DROP TABLE with qualified table name ignores database name when checking partitions --- Key: HIVE-10421 URL: https://issues.apache.org/jira/browse/HIVE-10421 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Hive was only recently changed to allow drop table dbname.tabname. However DDLTask.dropTable() is still using an older version of Hive.getPartitionNames(), which only took in a single string for the table name, rather than the database and table names. As a result Hive is filling in the current database name as the dbname during the listPartitions call to the MetaStore. It also appears that on the Hive Metastore side, in the non-auth path there is no validation to check that the dbname.tablename actually exists - this call simply returns back an empty list of partitions, which causes the table to be dropped without checking any of the partition information. I will open a separate issue for this one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10379) Wrong result when executing with tez
[ https://issues.apache.org/jira/browse/HIVE-10379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505303#comment-14505303 ] ErwanMAS commented on HIVE-10379: - I have downloaded the Hortonworks sandbox 2.2.4. It's fixed. It's a duplicate of HIVE- . Wrong result when executing with tez - Key: HIVE-10379 URL: https://issues.apache.org/jira/browse/HIVE-10379 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.13.0, 0.14.0 Environment: Hortonworks sandbox 2.1.1 2.2.0 Reporter: ErwanMAS Assignee: Gunther Hagleitner Fix For: 1.0.0 Attachments: HIVE-10379.1.patch I do a left join with a lateral view outer; too many rows are generated with tez. In map reduce I have 125 rows, in tez 132. Example: {noformat} drop table foo ; create table foo ( dummyfoo int ) ; insert into table foo select count(*) from foo ; select count(*) as cnt from ( select a.val,p.code from ( select cast((((one*5)+two)*5+three) as int) as val from foo lateral view outer explode(split('0,1,2,3,4',',')) tbl_1 as one lateral view outer explode(split('0,1,2,3,4',',')) tbl_2 as two lateral view outer explode(split('0,1,2,3,4',',')) tbl_3 as three ) as a left join ( select dummyfoo as code from foo ) p on p.code=a.val ) w ; set hive.execution.engine=tez; set hive.vectorized.execution.enabled=false; select count(*) as cnt from ( select a.val,p.code from ( select cast((((one*5)+two)*5+three) as int) as val from foo lateral view outer explode(split('0,1,2,3,4',',')) tbl_1 as one lateral view outer explode(split('0,1,2,3,4',',')) tbl_2 as two lateral view outer explode(split('0,1,2,3,4',',')) tbl_3 as three ) as a left join ( select dummyfoo as code from foo ) p on p.code=a.val ) w ; {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10413) [CBO] Return path assumes distinct column cant be same as grouping column
[ https://issues.apache.org/jira/browse/HIVE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-10413: -- Issue Type: Sub-task (was: Bug) Parent: HIVE-9132 [CBO] Return path assumes distinct column cant be same as grouping column - Key: HIVE-10413 URL: https://issues.apache.org/jira/browse/HIVE-10413 Project: Hive Issue Type: Sub-task Affects Versions: 1.2.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10413.patch Found in cbo_udf_udaf.q tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10233) Hive on LLAP: Memory manager
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-10233: -- Attachment: HIVE-10233-WIP-4.patch Hive on LLAP: Memory manager Key: HIVE-10233 URL: https://issues.apache.org/jira/browse/HIVE-10233 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, HIVE-10233-WIP-4.patch We need a memory manager in llap/tez to manage the usage of memory across threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10419) can't do query on partitioned view with analytic function in strictmode
[ https://issues.apache.org/jira/browse/HIVE-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hector Lagos updated HIVE-10419: Description: Hey Guys, I created the following table: CREATE TABLE t1 (id int, key string, value string) partitioned by (dt int); And after that I created a view on that table as follows: create view v1 PARTITIONED ON (dt) as SELECT * FROM ( SELECT row_number() over (partition by key order by value asc) as row_n, * FROM t1 ) t WHERE row_n = 1; We are working with hive.mapred.mode=strict and when I try to do the query select * from v1 where dt = 2 , I'm getting the following error: FAILED: SemanticException [Error 10041]: No partition predicate found for Alias v1:t:t1 Table t1 Is this a bug or a limitation of Hive when you use analytic functions in partitioned views? If I remove the row_number function it works without problems. Thanks in advance, any help will be appreciated. was: Hey Guysm Affects Version/s: 0.14.0 1.0.0 Tags: view,partition,analytical function (was: view) Summary: can't do query on partitioned view with analytic function in strictmode (was: can't do query on partitioned view with analytical function in strictmode) can't do query on partitioned view with analytic function in strictmode --- Key: HIVE-10419 URL: https://issues.apache.org/jira/browse/HIVE-10419 Project: Hive Issue Type: Bug Components: Hive, Views Affects Versions: 0.13.0, 0.14.0, 1.0.0 Environment: Cloudera 5.3.x. 
Reporter: Hector Lagos Hey Guys, I created the following table: CREATE TABLE t1 (id int, key string, value string) partitioned by (dt int); And after that I created a view on that table as follows: create view v1 PARTITIONED ON (dt) as SELECT * FROM ( SELECT row_number() over (partition by key order by value asc) as row_n, * FROM t1 ) t WHERE row_n = 1; We are working with hive.mapred.mode=strict and when I try to do the query select * from v1 where dt = 2 , I'm getting the following error: FAILED: SemanticException [Error 10041]: No partition predicate found for Alias v1:t:t1 Table t1 Is this a bug or a limitation of Hive when you use analytic functions in partitioned views? If I remove the row_number function it works without problems. Thanks in advance, any help will be appreciated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format
[ https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10331: - Attachment: HIVE-10331.06.patch The patch looks good. I made a very minor modification to the bloomfilter decimal test case which tests for both NO and YES_NO cases. ORC : Is null SARG filters out all row groups written in old ORC format --- Key: HIVE-10331 URL: https://issues.apache.org/jira/browse/HIVE-10331 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.1.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.2.0 Attachments: HIVE-10331.01.patch, HIVE-10331.02.patch, HIVE-10331.03.patch, HIVE-10331.03.patch, HIVE-10331.04.patch, HIVE-10331.05.patch, HIVE-10331.06.patch Queries are returning wrong results as all row groups get filtered out and no rows get scanned. {code} SELECT count(*) FROM store_sales WHERE ss_addr_sk IS NULL {code} With hive.optimize.index.filter disabled we get the correct results. In pickRowGroups, stats show that hasNull_ is false, while the row group actually has nulls. The same query runs fine for newly loaded ORC tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite
[ https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505331#comment-14505331 ] Laljo John Pullokkaran commented on HIVE-10416: --- [~jcamachorodriguez] Introducing a Select on top of the Sort will not work, as Tez cannot preserve ordering across a Select. CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite Key: HIVE-10416 URL: https://issues.apache.org/jira/browse/HIVE-10416 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.2.0 Attachments: HIVE-10416.patch When return path is on, if the plan's top operator is a Sort, we need to produce a SelectOp that will output exactly the columns needed by the FS. The following query reproduces the problem: {noformat} select cbo_t3.c_int, c, count(*) from (select key as a, c_int+1 as b, sum(c_int) as c from cbo_t1 where (cbo_t1.c_int + 1 >= 0) and (cbo_t1.c_int > 0 or cbo_t1.c_float >= 0) group by c_float, cbo_t1.c_int, key order by a) cbo_t1 join (select key as p, c_int+1 as q, sum(c_int) as r from cbo_t2 where (cbo_t2.c_int + 1 >= 0) and (cbo_t2.c_int > 0 or cbo_t2.c_float >= 0) group by c_float, cbo_t2.c_int, key order by q/10 desc, r asc) cbo_t2 on cbo_t1.a=p join cbo_t3 on cbo_t1.a=key where (b + cbo_t2.q >= 0) and (b > 0 or c_int >= 0) group by cbo_t3.c_int, c order by cbo_t3.c_int+c desc, c; {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10227) Concrete implementation of Export/Import based ReplicationTaskFactory
[ https://issues.apache.org/jira/browse/HIVE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505401#comment-14505401 ] Sushanth Sowmyan commented on HIVE-10227: - After some sleeping on this, I feel like I should be stricter still about erroring out whenever .create is called, so that no events get seen as getting processed, but I don't think adding one more Factory is a good way of doing that. Here's what I now think: a) We get rid of NoopFactory as a default - that should move to tests, and not stay here - we require that the config parameter be set to some instantiatable factory to use this class. b) We get rid of the explicit InvalidStateFactory I mention above, but instantiate it as an inline anonymous Factory instantiation if we fail to load whatever Factory class the user provides. Thoughts? Also, is it possible for us to do any factory refactoring in another jira? This jira is huge enough that the longer we leave it uncommitted, the more it'll be exposed to rebasing needs. Also, a couple of other jiras like HIVE-9674 are awaiting this landing before they can be made patch-available. Concrete implementation of Export/Import based ReplicationTaskFactory - Key: HIVE-10227 URL: https://issues.apache.org/jira/browse/HIVE-10227 Project: Hive Issue Type: Sub-task Components: Import/Export Affects Versions: 1.2.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-10227.2.patch, HIVE-10227.3.patch, HIVE-10227.4.patch, HIVE-10227.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions
[ https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-10384: --- Attachment: HIVE-10384.patch Looking through RetryingMetaStoreClient/IMetaStoreClient/MetaStoreClient, I think two exceptions wrapped in MetaException should be caught and retried in RetryingMetaStoreClient.invoke. One is IOException from reloginExpiringKeytabUser; the other is TTransportException from base.reconnect(). I did not see that a TTransportException could be wrapped in the InvocationTargetException. [~ekhliang] I wonder if it is the TTransportException that you meant should be but has not been retried, or is there any other. Thanks RetryingMetaStoreClient does not retry wrapped TTransportExceptions --- Key: HIVE-10384 URL: https://issues.apache.org/jira/browse/HIVE-10384 Project: Hive Issue Type: Bug Components: Clients Reporter: Eric Liang Assignee: Chaoyu Tang Attachments: HIVE-10384.patch This bug is very similar to HIVE-9436, in that a TTransportException wrapped in a MetaException will not be retried. RetryingMetaStoreClient has a block of code above the MetaException handler that retries thrift exceptions, but this doesn't work when the exception is wrapped. {code} if ((e.getCause() instanceof TApplicationException) || (e.getCause() instanceof TProtocolException) || (e.getCause() instanceof TTransportException)) { caughtException = (TException) e.getCause(); } else if ((e.getCause() instanceof MetaException) && e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) { caughtException = (MetaException) e.getCause(); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
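A minimal sketch of the idea behind retrying a wrapped exception: walk the getCause() chain until the transport exception is found. This is not the actual RetryingMetaStoreClient code, and the TTransportException here is a stand-in class so the sketch is self-contained:

```java
// Illustrative sketch: detect a transport failure buried anywhere in a
// cause chain, which is what a retry loop would need in order to retry
// a TTransportException wrapped inside a MetaException.
public class CauseUnwrap {
    static class TTransportException extends Exception {
        TTransportException(String m) { super(m); }
    }

    static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (type.isInstance(c)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Exception wrapped = new RuntimeException("MetaException stand-in",
                new TTransportException("socket closed"));
        System.out.println(hasCause(wrapped, TTransportException.class)); // true
    }
}
```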
[jira] [Commented] (HIVE-10062) HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data
[ https://issues.apache.org/jira/browse/HIVE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505579#comment-14505579 ] Gunther Hagleitner commented on HIVE-10062: --- Test failures are unrelated. There's a minor typo in the latest patch, but I can fix on commit. I'm +1 HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data - Key: HIVE-10062 URL: https://issues.apache.org/jira/browse/HIVE-10062 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Priority: Critical Attachments: HIVE-10062.01.patch, HIVE-10062.02.patch, HIVE-10062.03.patch, HIVE-10062.04.patch, HIVE-10062.05.patch In q.test environment with src table, execute the following query: {code} CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE; CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE; FROM (select 'tst1' as key, cast(count(1) as string) as value from src s1 UNION all select s2.key as key, s2.value as value from src s2) unionsrc INSERT OVERWRITE TABLE DEST1 SELECT unionsrc.key, COUNT(DISTINCT SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key INSERT OVERWRITE TABLE DEST2 SELECT unionsrc.key, unionsrc.value, COUNT(DISTINCT SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key, unionsrc.value; select * from DEST1; select * from DEST2; {code} DEST1 and DEST2 should both have 310 rows. However, DEST2 only has 1 row tst1500 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9923) No clear message when from is missing
[ https://issues.apache.org/jira/browse/HIVE-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504929#comment-14504929 ] Yongzhi Chen commented on HIVE-9923: Thanks [~szehon] and [~csun] for reviewing it. No clear message when from is missing --- Key: HIVE-9923 URL: https://issues.apache.org/jira/browse/HIVE-9923 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Jeff Zhang Assignee: Yongzhi Chen Fix For: 1.2.0 Attachments: HIVE-9923.1.patch, HIVE-9923.2.patch For the following SQL, from is missing, but it throws an NPE, which is not clear for the user. {code} hive> insert overwrite directory '/tmp/hive-3' select sb1.name, sb2.age student_bucketed sb1 join student_bucketed sb2 on sb1.name=sb2.name; FAILED: NullPointerException null {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite
[ https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10416: --- Attachment: HIVE-10416.patch [~ashutoshc], could you take a look? Thanks! CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite Key: HIVE-10416 URL: https://issues.apache.org/jira/browse/HIVE-10416 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.2.0 Attachments: HIVE-10416.patch When return path is on, if the plan's top operator is a Sort, we need to produce a SelectOp that will output exactly the columns needed by the FS. The following query reproduces the problem: {noformat} select cbo_t3.c_int, c, count(*) from (select key as a, c_int+1 as b, sum(c_int) as c from cbo_t1 where (cbo_t1.c_int + 1 >= 0) and (cbo_t1.c_int > 0 or cbo_t1.c_float >= 0) group by c_float, cbo_t1.c_int, key order by a) cbo_t1 join (select key as p, c_int+1 as q, sum(c_int) as r from cbo_t2 where (cbo_t2.c_int + 1 >= 0) and (cbo_t2.c_int > 0 or cbo_t2.c_float >= 0) group by c_float, cbo_t2.c_int, key order by q/10 desc, r asc) cbo_t2 on cbo_t1.a=p join cbo_t3 on cbo_t1.a=key where (b + cbo_t2.q >= 0) and (b > 0 or c_int >= 0) group by cbo_t3.c_int, c order by cbo_t3.c_int+c desc, c; {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-9791) insert into table throws NPE
[ https://issues.apache.org/jira/browse/HIVE-9791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen resolved HIVE-9791. Resolution: Fixed This issue should be fixed by the fixes for HIVE-9923 insert into table throws NPE Key: HIVE-9791 URL: https://issues.apache.org/jira/browse/HIVE-9791 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Alexander Pivovarov Assignee: Yongzhi Chen to reproduce NPE run the following {code} create table a as select 'A' letter; OK insert into table a select 'B' letter; FAILED: NullPointerException null -- works fine if add from table to select statement insert into table a select 'B' letter from dual; OK {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10403) Add n-way join support for Hybrid Grace Hash Join
[ https://issues.apache.org/jira/browse/HIVE-10403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504546#comment-14504546 ] Hive QA commented on HIVE-10403: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12726735/HIVE-10403.01.patch {color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 8729 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join29 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_10 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_9 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_char_mapjoin1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_nested_mapjoin {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3509/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3509/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3509/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 21 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12726735 - PreCommit-HIVE-TRUNK-Build Add n-way join support for Hybrid Grace Hash Join - Key: HIVE-10403 URL: https://issues.apache.org/jira/browse/HIVE-10403 Project: Hive Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Wei Zheng Assignee: Wei Zheng Attachments: HIVE-10403.01.patch Currently Hybrid Grace Hash Join only supports 2-way join (one big table and one small table). This task will enable n-way join (one big table and multiple small tables). 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite
[ https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505055#comment-14505055 ] Hive QA commented on HIVE-10416: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12726880/HIVE-10416.patch {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8728 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file 
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3513/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3513/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3513/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12726880 - PreCommit-HIVE-TRUNK-Build CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite Key: HIVE-10416 URL: https://issues.apache.org/jira/browse/HIVE-10416 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.2.0 Attachments: HIVE-10416.patch When return path is on, if the plan's top operator is a Sort, we need to produce a SelectOp that will output exactly the columns needed by the FS. 
The following query reproduces the problem: {noformat} select cbo_t3.c_int, c, count(*) from (select key as a, c_int+1 as b, sum(c_int) as c from cbo_t1 where (cbo_t1.c_int + 1 >= 0) and (cbo_t1.c_int > 0 or cbo_t1.c_float >= 0) group by c_float, cbo_t1.c_int, key order by a) cbo_t1 join (select key as p, c_int+1 as q, sum(c_int) as r from cbo_t2 where (cbo_t2.c_int + 1 >= 0) and (cbo_t2.c_int > 0 or cbo_t2.c_float >= 0) group by c_float, cbo_t2.c_int, key order by q/10 desc, r asc) cbo_t2 on cbo_t1.a=p join cbo_t3 on cbo_t1.a=key where (b + cbo_t2.q >= 0) and (b > 0 or c_int >= 0) group by cbo_t3.c_int, c order by cbo_t3.c_int+c desc, c; {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10235) Loop optimization for SIMD in ColumnDivideColumn.txt
[ https://issues.apache.org/jira/browse/HIVE-10235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504949#comment-14504949 ] Gopal V commented on HIVE-10235: [~chengxiang li]: Patch LGTM - +1. Not able to see a significant leap in perf on my quick tests - division doesn't seem to be a common scenario in my tests. Loop optimization for SIMD in ColumnDivideColumn.txt Key: HIVE-10235 URL: https://issues.apache.org/jira/browse/HIVE-10235 Project: Hive Issue Type: Sub-task Components: Vectorization Affects Versions: 1.1.0 Reporter: Chengxiang Li Assignee: Chengxiang Li Priority: Minor Attachments: HIVE-10235.1.patch, HIVE-10235.1.patch Found two loops which could be optimized into packed instruction sets during execution. 1. hasDivBy0 depends on the result of the previous iteration, which prevents the loop from being vectorized. {code:java} for(int i = 0; i != n; i++) { OperandType2 denom = vector2[i]; outputVector[i] = vector1[0] OperatorSymbol denom; hasDivBy0 = hasDivBy0 || (denom == 0); } {code} 2. Same as HIVE-10180, the vector2\[0\] reference prevents the JVM from optimizing the loop into a packed instruction set. {code:java} for(int i = 0; i != n; i++) { outputVector[i] = vector1[i] OperatorSymbol vector2[0]; } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
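The restructuring can be sketched in plain Java. This is a hypothetical illustration of the two fixes, not the actual code generated from the ColumnDivideColumn.txt template (here the OperatorSymbol placeholder is instantiated as division over doubles): the hasDivBy0 accumulation is split into its own reduction loop so the output loop carries no cross-iteration dependency, and the loop-invariant vector2[0] load is hoisted into a local.

```java
public class DivideLoopSketch {

    // Loop 1 restructured: the output computation has no loop-carried
    // dependency, so the JVM is free to emit packed (SIMD) instructions
    // for it; the div-by-zero flag is accumulated in a separate
    // OR-reduction pass over the denominators.
    static boolean divideColumns(double[] out, double[] vector1,
                                 double[] vector2, int n) {
        for (int i = 0; i != n; i++) {
            out[i] = vector1[i] / vector2[i];  // dependence-free, vectorizable
        }
        boolean hasDivBy0 = false;
        for (int i = 0; i != n; i++) {
            hasDivBy0 = hasDivBy0 || (vector2[i] == 0.0);  // simple reduction
        }
        return hasDivBy0;
    }

    // Loop 2 restructured: vector2[0] is loop-invariant, so hoisting it
    // into a local removes the repeated array load that defeats
    // auto-vectorization.
    static void divideByRepeatedValue(double[] out, double[] vector1,
                                      double[] vector2, int n) {
        double denom = vector2[0];  // hoisted once, outside the loop
        for (int i = 0; i != n; i++) {
            out[i] = vector1[i] / denom;
        }
    }
}
```

Whether HotSpot actually vectorizes either loop depends on the JVM version and flags; the point is only that the rewritten forms remove the two impediments cited in the description.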
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504946#comment-14504946 ] Hive QA commented on HIVE-9824: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12726853/HIVE-9824.07.patch {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8750 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_decimal_mapjoin {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3512/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3512/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3512/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12726853 - PreCommit-HIVE-TRUNK-Build LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;) -- Key: HIVE-9824 URL: https://issues.apache.org/jira/browse/HIVE-9824 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch Today's VectorMapJoinOperator is a pass-through that converts each row from a vectorized row batch in a Java Object[] row and passes it to the MapJoinOperator superclass. This enhancement creates specialized vectorized map join operator classes that are optimized. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10302) Load small tables (for map join) in executor memory only once[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-10302: --- Summary: Load small tables (for map join) in executor memory only once[Spark Branch] (was: Cache small tables in memory [Spark Branch]) Load small tables (for map join) in executor memory only once[Spark Branch] --- Key: HIVE-10302 URL: https://issues.apache.org/jira/browse/HIVE-10302 Project: Hive Issue Type: Improvement Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-10302.spark-1.patch If we can cache small tables in executor memory, we could save some time in loading them from HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10302) Load small tables (for map join) in executor memory only once [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-10302: --- Summary: Load small tables (for map join) in executor memory only once [Spark Branch] (was: Load small tables (for map join) in executor memory only once[Spark Branch]) Load small tables (for map join) in executor memory only once [Spark Branch] Key: HIVE-10302 URL: https://issues.apache.org/jira/browse/HIVE-10302 Project: Hive Issue Type: Improvement Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-10302.spark-1.patch If we can cache small tables in executor memory, we could save some time in loading them from HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10302) Load small tables (for map join) in executor memory only once [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-10302: --- Description: Usually there are multiple cores in a Spark executor, and thus it's possible that multiple map-join tasks can be running in the same executor (concurrently or sequentially). Currently, each task will load its own copy of the small tables for map join into memory, ending up with inefficiency. Ideally, we only load the small tables once and share them among the tasks running in that executor. (was: If we can cache small tables in executor memory, we could save some time in loading them from HDFS.) Load small tables (for map join) in executor memory only once [Spark Branch] Key: HIVE-10302 URL: https://issues.apache.org/jira/browse/HIVE-10302 Project: Hive Issue Type: Improvement Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: spark-branch Attachments: HIVE-10302.spark-1.patch Usually there are multiple cores in a Spark executor, and thus it's possible that multiple map-join tasks can be running in the same executor (concurrently or sequentially). Currently, each task will load its own copy of the small tables for map join into memory, ending up with inefficiency. Ideally, we only load the small tables once and share them among the tasks running in that executor. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
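The sharing scheme described above can be sketched as a JVM-wide (i.e. per-executor) cache keyed by the small table's identity. The names below are illustrative only, not the classes the attached patch actually introduces:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: one static map per executor JVM. computeIfAbsent
// guarantees the loader runs at most once per key, so concurrent map-join
// tasks for the same query block briefly and then share a single copy of
// the small table instead of each loading it from HDFS.
public class SmallTableCache {
    private static final ConcurrentHashMap<String, Object> CACHE =
            new ConcurrentHashMap<>();

    public static Object getOrLoad(String tableKey, Supplier<Object> loader) {
        return CACHE.computeIfAbsent(tableKey, k -> loader.get());
    }

    // called when the query finishes so the executor can reclaim memory
    public static void clear() {
        CACHE.clear();
    }
}
```

A real implementation additionally has to decide when a cached copy is stale (e.g. the same table reloaded for a later query), which is why the cache key would include more than the table name.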
[jira] [Updated] (HIVE-10191) ORC: Cleanup writer per-row synchronization
[ https://issues.apache.org/jira/browse/HIVE-10191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-10191: --- Attachment: HIVE-10191.3.patch ORC: Cleanup writer per-row synchronization --- Key: HIVE-10191 URL: https://issues.apache.org/jira/browse/HIVE-10191 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Gopal V Attachments: HIVE-10191.1.patch, HIVE-10191.2.patch, HIVE-10191.3.patch ORC writers were originally meant to be thread-safe, but in the present day implementation each ORC writer is entirely share-nothing which converts most of the synchronized blocks in ORC as entirely uncontested locks. These uncontested locks prevent the JVM from inlining/optimizing these methods, while adding no extra thread-safety to the ORC writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
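The cleanup amounts to the pattern below — a schematic illustration, not the actual ORC writer code: a writer that is owned by a single thread in the present-day implementation, where the per-row `synchronized` adds no safety but (as the issue argues) keeps the JVM from inlining and optimizing the hot method.

```java
// Schematic of the change: the writer is share-nothing, so the per-row
// monitor enter/exit is pure overhead.
class ShareNothingWriter {
    private long rowCount;

    // before: uncontended lock taken on every single row
    synchronized void addRowLocked() {
        rowCount++;
    }

    // after: identical logic, no per-row lock
    void addRow() {
        rowCount++;
    }

    long rowCount() {
        return rowCount;
    }
}
```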
[jira] [Updated] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliot West updated HIVE-10165: --- Description: h3. Overview I'd like to extend the [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] API so that it also supports the writing of record updates and deletes in addition to the already supported inserts. h3. Motivation We have many Hadoop processes outside of Hive that merge changed facts into existing datasets. Traditionally we achieve this by: reading in a ground-truth dataset and a modified dataset, grouping by a key, sorting by a sequence and then applying a function to determine inserted, updated, and deleted rows. However, in our current scheme we must rewrite all partitions that may potentially contain changes. In practice the number of mutated records is very small when compared with the records contained in a partition. This approach results in a number of operational issues: * Excessive write activity required for small data changes. * Downstream applications cannot robustly read these datasets while they are being updated. * Due to the scale of the updates (hundreds of partitions) the scope for contention is high. I believe we can address this problem by instead writing only the changed records to a Hive transactional table. This should drastically reduce the amount of data that we need to write and also provide a means for managing concurrent access to the data. Our existing merge processes can read and retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to an updated form of the hive-hcatalog-streaming API which will then have the required data to perform an update or insert in a transactional manner. h3. Benefits * Enables the creation of large-scale dataset merge processes * Opens up Hive transactional functionality in an accessible manner to processes that operate outside of Hive. h3.
Implementation Our changes do not break the existing API contracts. Instead our approach has been to consider the functionality offered by the existing API and our proposed API as fulfilling separate and distinct use-cases. The existing API is primarily focused on the task of continuously writing large volumes of new data into a Hive table for near-immediate analysis. Our use-case, however, is concerned more with the frequent but not continuous ingestion of mutations to a Hive table from some ETL merge process. Consequently we feel it is justifiable to add our new functionality via an alternative set of public interfaces and leave the existing API as is. This keeps both APIs clean and focused at the expense of presenting additional options to potential users. Wherever possible, shared implementation concerns have been factored out into abstract base classes that are open to third-party extension. A detailed breakdown of the changes is as follows: * We've introduced a public {{RecordMutator}} interface whose purpose is to expose insert/update/delete operations to the user. This is a counterpart to the write-only {{RecordWriter}}. We've also factored out life-cycle methods common to these two interfaces into a super {{RecordOperationWriter}} interface. Note that the row representation has been changed from {{byte[]}} to {{Object}}. Within our data processing jobs our records are often available in a strongly typed and decoded form such as a POJO or a Tuple object. Therefore it seems to make sense that we are able to pass this through to the {{OrcRecordUpdater}} without having to go through a {{byte[]}} encoding step. This of course still allows users to use {{byte[]}} if they wish. * The introduction of {{RecordMutator}} requires that insert/update/delete operations are then also exposed on a {{TransactionBatch}} type. We've done this with the introduction of a public {{MutatorTransactionBatch}} interface which is a counterpart to the write-only {{TransactionBatch}}.
We've also factored out life-cycle methods common to these two interfaces into a super {{BaseTransactionBatch}} interface. * Functionality that would be shared by implementations of both {{RecordWriters}} and {{RecordMutators}} has been factored out of {{AbstractRecordWriter}} into a new abstract base class {{AbstractOperationRecordWriter}}. The visibility is such that it is open to extension by third parties. The {{AbstractOperationRecordWriter}} also permits the setting of the {{AcidOutputFormat.Options#recordIdColumn()}} (defaulted to {{-1}}) which is a requirement for enabling updates and deletes. Additionally, these options are now fed an {{ObjectInspector}} via an abstract method so that a {{SerDe}} is not mandated (it was not required for our use-case). The {{AbstractRecordWriter}} is now much leaner, handling only the extraction of the {{ObjectInspector}} from the {{SerDe}}. * A new abstract class,
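Pieced together from the description above, the proposed interfaces might look roughly as follows. The signatures are guesses for illustration only — the actual patch on the issue defines the authoritative shapes — and the small recording implementation exists purely to demonstrate the contract:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// life-cycle methods shared by the write-only and mutating interfaces
interface RecordOperationWriter {
    void flush() throws IOException;
    void close() throws IOException;
}

// counterpart to the write-only RecordWriter; rows are Object rather than
// byte[] so callers can pass strongly typed records (POJOs, Tuples)
// straight through to the updater without an encoding step
interface RecordMutator extends RecordOperationWriter {
    void insert(long transactionId, Object record) throws IOException;
    void update(long transactionId, Object record) throws IOException;
    void delete(long transactionId, Object record) throws IOException;
}

// minimal in-memory implementation used only to exercise the contract
class RecordingMutator implements RecordMutator {
    final List<String> ops = new ArrayList<>();
    public void insert(long txnId, Object r) { ops.add("insert:" + r); }
    public void update(long txnId, Object r) { ops.add("update:" + r); }
    public void delete(long txnId, Object r) { ops.add("delete:" + r); }
    public void flush() { }
    public void close() { }
}
```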
[jira] [Commented] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505170#comment-14505170 ] Naveen Gangam commented on HIVE-10239: -- [~spena] I believe I have seen this error on my machine too, but it wasn't fatal by any means. The script executed fine after this error. I will re-run it to make sure. Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL Key: HIVE-10239 URL: https://issues.apache.org/jira/browse/HIVE-10239 Project: Hive Issue Type: Improvement Affects Versions: 1.1.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, HIVE-10239.0.patch, HIVE-10239.00.patch, HIVE-10239.patch Need to create DB-implementation specific scripts to use the framework introduced in HIVE-9800 to have any metastore schema changes tested across all supported databases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505186#comment-14505186 ] Vikram Dixit K commented on HIVE-9824: -- [~mmccline] The latest patch doesn't apply on trunk anymore. LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;) -- Key: HIVE-9824 URL: https://issues.apache.org/jira/browse/HIVE-9824 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch Today's VectorMapJoinOperator is a pass-through that converts each row from a vectorized row batch into a Java Object[] row and passes it to the MapJoinOperator superclass. This enhancement creates specialized vectorized map join operator classes that are optimized. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10339) Allow JDBC Driver to pass HTTP header Key/Value pairs
[ https://issues.apache.org/jira/browse/HIVE-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505885#comment-14505885 ] Thejas M Nair commented on HIVE-10339: -- +1 Please open a follow-up jira for adding more e2e-like tests using wiremock or equivalent. Allow JDBC Driver to pass HTTP header Key/Value pairs - Key: HIVE-10339 URL: https://issues.apache.org/jira/browse/HIVE-10339 Project: Hive Issue Type: Improvement Components: Beeline Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10339.1.patch, HIVE-10339.2.patch Currently the Beeline/ODBC driver does not support carrying user-specified HTTP headers. The Beeline JDBC driver's HTTP-mode connection string looks like jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint, When the transport mode is http, the Beeline/ODBC driver should allow the end user to send arbitrary HTTP header name/value pairs. All the Beeline driver needs to do is take the user-specified names and values and call the underlying HttpClient API to set the headers. E.g. the Beeline connection string could be jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,http.header.name1=value1, and Beeline will call the underlying API to set the HTTP header name1 to value1. This is required for the end user to send an identity in an HTTP header down to Knox via Beeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505656#comment-14505656 ] Matt McCline commented on HIVE-9824: I switched over to using https://github.com/apache/hive from using git://git.apache.org/hive.git because of the "read error: Connection reset by peer" problem. I did notice when I generated the review board patch with this command line: {noformat} git diff --no-ext-diff HEAD^ > review_board_patch_07.txt {noformat} and this command line for the actual patch: {noformat} git diff --no-ext-diff --no-prefix HEAD^ > HIVE-9824.07.patch {noformat} The files had the same length when they usually have different lengths. LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;) -- Key: HIVE-9824 URL: https://issues.apache.org/jira/browse/HIVE-9824 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch Today's VectorMapJoinOperator is a pass-through that converts each row from a vectorized row batch into a Java Object[] row and passes it to the MapJoinOperator superclass. This enhancement creates specialized vectorized map join operator classes that are optimized. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions
[ https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505753#comment-14505753 ] Eric Liang commented on HIVE-10384: --- +1, probably at least all the T* exceptions should be retried after they are unwrapped. RetryingMetaStoreClient does not retry wrapped TTransportExceptions --- Key: HIVE-10384 URL: https://issues.apache.org/jira/browse/HIVE-10384 Project: Hive Issue Type: Bug Components: Clients Reporter: Eric Liang Assignee: Chaoyu Tang Attachments: HIVE-10384.patch This bug is very similar to HIVE-9436, in that a TTransportException wrapped in a MetaException will not be retried. RetryingMetaStoreClient has a block of code above the MetaException handler that retries thrift exceptions, but this doesn't work when the exception is wrapped. {code} if ((e.getCause() instanceof TApplicationException) || (e.getCause() instanceof TProtocolException) || (e.getCause() instanceof TTransportException)) { caughtException = (TException) e.getCause(); } else if ((e.getCause() instanceof MetaException) && e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) { caughtException = (MetaException) e.getCause(); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
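A minimal sketch of the unwrapping idea: walk the whole cause chain rather than inspecting only the direct cause, so a retriable exception nested inside a MetaException is still found. In RetryingMetaStoreClient the types of interest would be the Thrift T* exceptions; the test below uses plain JDK exception types as stand-ins.

```java
// Generic cause-chain walk (assumes the chain has no cycles, which holds
// for ordinarily constructed exceptions).
public class CauseChain {
    public static boolean contains(Throwable t, Class<? extends Throwable> type) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (type.isInstance(c)) {
                return true;
            }
        }
        return false;
    }
}
```

Usage in a retry loop would then look like `if (CauseChain.contains(e, TTransportException.class)) { /* retry */ }` — a sketch of the fix's shape, not the actual patch.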
[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505623#comment-14505623 ] Aihua Xu commented on HIVE-9917: No. I mean check in this code change to trunk? After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have the correct behavior when converting int to timestamp. Since customers have relied on the incorrect behavior for so long, it is better to make this configurable so that one release defaults to the old/inconsistent way and the next release defaults to the new/consistent way. After that we will deprecate the old behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505716#comment-14505716 ] Sergey Shelukhin commented on HIVE-9824: +1. Can you file follow-up jiras for replacing the hashtable, and also for making hybrid work in all cases (if still needed)? LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;) -- Key: HIVE-9824 URL: https://issues.apache.org/jira/browse/HIVE-9824 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch, HIVE-9824.08.patch Today's VectorMapJoinOperator is a pass-through that converts each row from a vectorized row batch into a Java Object[] row and passes it to the MapJoinOperator superclass. This enhancement creates specialized vectorized map join operator classes that are optimized. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format
[ https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505787#comment-14505787 ] Hive QA commented on HIVE-10331: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12726948/HIVE-10331.06.patch {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8728 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3516/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3516/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3516/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12726948 - PreCommit-HIVE-TRUNK-Build ORC : Is null SARG filters out all row groups written in old ORC format --- Key: HIVE-10331 URL: https://issues.apache.org/jira/browse/HIVE-10331 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.1.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.2.0 Attachments: HIVE-10331.01.patch, HIVE-10331.02.patch, HIVE-10331.03.patch, HIVE-10331.03.patch, HIVE-10331.04.patch, HIVE-10331.05.patch, HIVE-10331.06.patch Queries are returning wrong results as all row groups get filtered out and no rows get scanned. {code} SELECT count(*) FROM store_sales WHERE ss_addr_sk IS NULL {code} With hive.optimize.index.filter disabled we get the correct results. In pickRowGroups, the stats show that hasNull_ is false, while the row group actually has nulls. The same query runs fine for newly loaded ORC tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
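The HIVE-10331 failure mode above can be illustrated with a small sketch. This is not Hive's actual ORC reader code (the class and method names here are made up); it only shows why an IS NULL predicate prunes every row group when the writer recorded hasNull as false:

```java
// Hypothetical sketch of IS NULL row-group pruning; names are illustrative,
// not Hive's real classes.
public class IsNullPruningSketch {

    // Minimal stand-in for per-row-group column statistics.
    static final class ColumnStats {
        final boolean hasNull;
        ColumnStats(boolean hasNull) { this.hasNull = hasNull; }
    }

    // An IS NULL SARG can only match inside a row group whose statistics
    // report at least one null. If an old writer left hasNull unset
    // (defaulting to false) even though nulls exist, this returns false
    // for every group and the scan produces zero rows.
    static boolean mayContainIsNullMatch(ColumnStats stats) {
        return stats.hasNull;
    }

    public static void main(String[] args) {
        // Old-format file: nulls are present but the flag says otherwise,
        // so the group is wrongly pruned.
        System.out.println(mayContainIsNullMatch(new ColumnStats(false))); // false
        // Correctly written file: the group survives pruning.
        System.out.println(mayContainIsNullMatch(new ColumnStats(true)));  // true
    }
}
```

This also explains why disabling hive.optimize.index.filter restores correct results: with predicate pushdown off, the stats-based pruning step is skipped entirely.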
[jira] [Updated] (HIVE-10339) Allow JDBC Driver to pass HTTP header Key/Value pairs
[ https://issues.apache.org/jira/browse/HIVE-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-10339: - Attachment: HIVE-10339.2.patch [~thejas] Addressed the review comments. Uploading patch #2 Thanks Hari Allow JDBC Driver to pass HTTP header Key/Value pairs - Key: HIVE-10339 URL: https://issues.apache.org/jira/browse/HIVE-10339 Project: Hive Issue Type: Improvement Components: Beeline Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10339.1.patch, HIVE-10339.2.patch Currently the Beeline/ODBC driver does not support carrying user-specified HTTP headers. In HTTP mode, the Beeline JDBC connection string looks like jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint. When the transport mode is http, the Beeline/ODBC driver should allow the end user to send arbitrary HTTP header name/value pairs. All the Beeline driver needs to do is take the user-specified names and values and call the underlying HTTPClient API to set the headers. E.g. the Beeline connection string could be jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,http.header.name1=value1, and Beeline will call the underlying API to set the HTTP header name1 to value1. This is required for the end user to send identity in an HTTP header down to Knox via Beeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
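The header-passing scheme HIVE-10339 describes can be sketched as follows. This is not the actual Hive JDBC driver code; only the http.header. prefix comes from the ticket, and the parsing logic is illustrative:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of extracting http.header.* pairs from a HiveServer2
// HTTP-mode JDBC URL's parameter list; not the actual Hive JDBC driver code.
public class HttpHeaderUrlSketch {

    static final String PREFIX = "http.header.";

    // Pull every "http.header.<name>=<value>" session variable out of the
    // separator-delimited parameter list after the database name.
    static Map<String, String> extractHeaders(String params) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String kv : params.split("[;,]")) {
            int eq = kv.indexOf('=');
            if (eq < 0) continue;
            String key = kv.substring(0, eq).trim();
            if (key.startsWith(PREFIX)) {
                headers.put(key.substring(PREFIX.length()), kv.substring(eq + 1).trim());
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        String params = "hive.server2.transport.mode=http;"
            + "hive.server2.thrift.http.path=http_endpoint;"
            + "http.header.name1=value1";
        // The driver would then set each pair on the underlying HttpClient
        // request, e.g. via a request interceptor.
        System.out.println(extractHeaders(params)); // {name1=value1}
    }
}
```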
[jira] [Updated] (HIVE-10426) Rework/simplify ReplicationTaskFactory instantiation
[ https://issues.apache.org/jira/browse/HIVE-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-10426: Description: Creating a new jira to continue discussions from HIVE-10227 as to what ReplicationTask.Factory instantiation should look like. (was: Creating a new jira to continue discussions of what ReplicationTask.Factory instantiation should look like.) Rework/simplify ReplicationTaskFactory instantiation Key: HIVE-10426 URL: https://issues.apache.org/jira/browse/HIVE-10426 Project: Hive Issue Type: Sub-task Components: Import/Export Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Creating a new jira to continue discussions from HIVE-10227 as to what ReplicationTask.Factory instantiation should look like. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions
[ https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505746#comment-14505746 ] Szehon Ho commented on HIVE-10384: -- Makes sense to me, +1 RetryingMetaStoreClient does not retry wrapped TTransportExceptions --- Key: HIVE-10384 URL: https://issues.apache.org/jira/browse/HIVE-10384 Project: Hive Issue Type: Bug Components: Clients Reporter: Eric Liang Assignee: Chaoyu Tang Attachments: HIVE-10384.patch This bug is very similar to HIVE-9436, in that a TTransportException wrapped in a MetaException will not be retried. RetryingMetaStoreClient has a block of code above the MetaException handler that retries thrift exceptions, but this doesn't work when the exception is wrapped.
{code}
if ((e.getCause() instanceof TApplicationException) ||
    (e.getCause() instanceof TProtocolException) ||
    (e.getCause() instanceof TTransportException)) {
  caughtException = (TException) e.getCause();
} else if ((e.getCause() instanceof MetaException) &&
    e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) {
  caughtException = (MetaException) e.getCause();
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
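The check quoted in HIVE-10384 only inspects the immediate cause, which is why a TTransportException wrapped one level deeper escapes the retry path. One possible fix shape (a sketch, not the actual patch) is to walk the whole cause chain; the class names below are stand-ins:

```java
// Sketch of why a wrapped TTransportException escapes the retry check, and
// one possible fix shape: walk the whole cause chain instead of looking only
// one level deep. Not the actual HIVE-10384 patch; exception classes are
// local stand-ins for the Thrift/metastore types.
public class CauseChainSketch {

    // Returns true if any cause in the chain is an instance of the given type.
    static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
        for (Throwable c = t.getCause(); c != null; c = c.getCause()) {
            if (type.isInstance(c)) {
                return true;
            }
        }
        return false;
    }

    // Stand-ins for MetaException / TTransportException.
    static class MetaException extends Exception {
        MetaException(Throwable cause) { super(cause); }
    }
    static class TTransportException extends Exception {}

    public static void main(String[] args) {
        // The failing shape: reflective invocation wraps a MetaException,
        // which itself wraps the transport failure. The existing one-level
        // check sees only the MetaException and does not retry.
        Exception wrapped =
            new Exception(new MetaException(new TTransportException()));
        System.out.println(hasCause(wrapped, TTransportException.class));       // true
        System.out.println(wrapped.getCause() instanceof TTransportException); // false
    }
}
```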
[jira] [Commented] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format
[ https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505795#comment-14505795 ] Prasanth Jayachandran commented on HIVE-10331: -- SVN is marked read-only for git migration. Will commit the patch once the migration is done. ORC : Is null SARG filters out all row groups written in old ORC format --- Key: HIVE-10331 URL: https://issues.apache.org/jira/browse/HIVE-10331 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.1.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.2.0 Attachments: HIVE-10331.01.patch, HIVE-10331.02.patch, HIVE-10331.03.patch, HIVE-10331.03.patch, HIVE-10331.04.patch, HIVE-10331.05.patch, HIVE-10331.06.patch Queries are returning wrong results as all row groups get filtered out and no rows get scanned. {code} SELECT count(*) FROM store_sales WHERE ss_addr_sk IS NULL {code} With hive.optimize.index.filter disabled we get the correct results. In pickRowGroups, the stats show that hasNull_ is false, while the row group actually has nulls. The same query runs fine for newly loaded ORC tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10421) DROP TABLE with qualified table name ignores database name when checking partitions
[ https://issues.apache.org/jira/browse/HIVE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505620#comment-14505620 ] Hive QA commented on HIVE-10421: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12726930/HIVE-10421.1.patch {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8727 tests executed *Failed tests:* {noformat} TestCustomAuthentication - did not produce a TEST-*.xml file TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file 
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3515/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3515/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3515/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12726930 - PreCommit-HIVE-TRUNK-Build DROP TABLE with qualified table name ignores database name when checking partitions --- Key: HIVE-10421 URL: https://issues.apache.org/jira/browse/HIVE-10421 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-10421.1.patch Hive was only recently changed to allow drop table dbname.tabname. However DDLTask.dropTable() is still using an older version of Hive.getPartitionNames(), which only took in a single string for the table name, rather than the database and table names. As a result Hive is filling in the current database name as the dbname during the listPartitions call to the MetaStore. 
It also appears that on the Hive Metastore side, in the non-auth path there is no validation to check that the dbname.tablename actually exists - this call simply returns an empty list of partitions, which causes the table to be dropped without checking any of the partition information. I will open a separate issue for this one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
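The qualified-name handling HIVE-10421 describes can be sketched as follows. This is not the actual DDLTask code; it only shows the shape of the bug, where passing a bare table name to the partition lookup makes the metastore substitute the current database:

```java
// Illustrative sketch of splitting a possibly-qualified table name so the
// database part is not silently replaced by the current database; not the
// actual DDLTask/Hive.getPartitionNames() code.
public class QualifiedNameSketch {

    // Returns {dbName, tableName}; falls back to the current database when
    // the name is unqualified.
    static String[] resolve(String name, String currentDb) {
        int dot = name.indexOf('.');
        if (dot < 0) {
            return new String[] { currentDb, name };
        }
        return new String[] { name.substring(0, dot), name.substring(dot + 1) };
    }

    public static void main(String[] args) {
        // Bug shape: passing only the table part to the partition lookup
        // makes the metastore search under the *current* database instead
        // of the one the user named.
        String[] qualified = resolve("otherdb.t1", "default");
        System.out.println(qualified[0] + "/" + qualified[1]); // otherdb/t1
        String[] plain = resolve("t1", "default");
        System.out.println(plain[0] + "/" + plain[1]);         // default/t1
    }
}
```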
[jira] [Updated] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-9824: --- Attachment: HIVE-9824.08.patch More review board changes. LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;) -- Key: HIVE-9824 URL: https://issues.apache.org/jira/browse/HIVE-9824 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch, HIVE-9824.08.patch Today's VectorMapJoinOperator is a pass-through that converts each row from a vectorized row batch into a Java Object[] row and passes it to the MapJoinOperator superclass. This enhancement creates specialized vectorized map join operator classes that are optimized. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505696#comment-14505696 ] Matt McCline commented on HIVE-9824: Actually, they are exactly 1000 bytes different... LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;) -- Key: HIVE-9824 URL: https://issues.apache.org/jira/browse/HIVE-9824 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch, HIVE-9824.08.patch Today's VectorMapJoinOperator is a pass-through that converts each row from a vectorized row batch into a Java Object[] row and passes it to the MapJoinOperator superclass. This enhancement creates specialized vectorized map join operator classes that are optimized. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10423) HIVE-7948 breaks deploy_e2e_artifacts.sh
[ https://issues.apache.org/jira/browse/HIVE-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505682#comment-14505682 ] Aswathy Chellammal Sreekumar commented on HIVE-10423: - @Eugene please review the patch, which includes a small fix to prevent the issue on rerun HIVE-7948 breaks deploy_e2e_artifacts.sh Key: HIVE-10423 URL: https://issues.apache.org/jira/browse/HIVE-10423 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Eugene Koifman Assignee: Aswathy Chellammal Sreekumar Attachments: HIVE-10423.patch HIVE-7948 added a step to download a ml-1m.zip file and unzip it. This only works if you call deploy_e2e_artifacts.sh once. If you call it again (which is very common in dev) it blocks and asks for additional input from the user because the target files already exist. This needs to be changed similarly to what we discussed for HIVE-9272, i.e. place artifacts not under source control in testdist/. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions
[ https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505761#comment-14505761 ] Eric Liang commented on HIVE-10384: --- Oh sorry, I misunderstood your comment. I believe that TTransportException is indeed thrown from within invoke(). For example, see this stack trace: {code} Got excep tion: org.apache.thrift.transport.TTransportException null org.apache.thrift.transport.TTransportException at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_tables(ThriftHiveMetastore.java:9 83) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_tables(ThriftHiveMetastore.java:969) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTables(HiveMetaStoreClient.java:1038) at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces sorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89) {code} I believe the offending method that wraps this exception is in MetaStoreUtils: logAndThrowMetaException(Exception e) throws MetaException RetryingMetaStoreClient does not retry wrapped TTransportExceptions --- Key: HIVE-10384 URL: https://issues.apache.org/jira/browse/HIVE-10384 Project: Hive Issue Type: Bug Components: Clients Reporter: Eric Liang Assignee: Chaoyu Tang Attachments: HIVE-10384.patch This bug is 
very similar to HIVE-9436, in that a TTransportException wrapped in a MetaException will not be retried. RetryingMetaStoreClient has a block of code above the MetaException handler that retries thrift exceptions, but this doesn't work when the exception is wrapped.
{code}
if ((e.getCause() instanceof TApplicationException) ||
    (e.getCause() instanceof TProtocolException) ||
    (e.getCause() instanceof TTransportException)) {
  caughtException = (TException) e.getCause();
} else if ((e.getCause() instanceof MetaException) &&
    e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) {
  caughtException = (MetaException) e.getCause();
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10399) from_unixtime_millis() Hive UDF
[ https://issues.apache.org/jira/browse/HIVE-10399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505871#comment-14505871 ] Aihua Xu commented on HIVE-10399: - I don't think you need such a UDF. You can just call cast(123.123 as timestamp) to convert a double to a timestamp. Give it a try to see if it's what you want. from_unixtime_millis() Hive UDF --- Key: HIVE-10399 URL: https://issues.apache.org/jira/browse/HIVE-10399 Project: Hive Issue Type: New Feature Components: UDF Environment: HDP 2.2 Reporter: Hari Sekhon Priority: Minor Feature request for a {code}from_unixtime_millis(){code} Hive UDF - from_unixtime() accepts only seconds since the epoch, and right now the solution is to create a custom UDF, but supporting millisecond-precision dates natively seems like quite a standard thing for Hive. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
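For reference, the conversion the requested from_unixtime_millis() UDF would perform is already available in the standard JDK on the client side; a UDF would wrap logic like this (the class and method here are illustrative, java.sql.Timestamp is the real API):

```java
import java.sql.Timestamp;

// Small demonstration of the conversion the requested UDF would perform:
// epoch milliseconds -> timestamp. Uses only the standard JDK.
public class MillisToTimestampSketch {
    static Timestamp fromUnixTimeMillis(long millis) {
        // java.sql.Timestamp's constructor takes milliseconds since the epoch.
        return new Timestamp(millis);
    }

    public static void main(String[] args) {
        Timestamp ts = fromUnixTimeMillis(1429660800123L);
        // Millisecond precision survives, unlike from_unixtime(), which
        // accepts only whole seconds.
        System.out.println(ts.getTime()); // 1429660800123
    }
}
```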
[jira] [Updated] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-10239: - Attachment: HIVE-10239.01.patch Appears that the oracle package installation fails because it cannot download the oracle-xe packages for 64 bit operating systems. On my local ubuntu VMs, I get the exact same failure except that it proceeds to install the 32-bit packages for oracle xe. I am not quite sure what OS configuration drives this. {noformat} 511808 W: Failed to fetch http://oss.oracle.com/debian/dists/unstable/main/binary-amd64/Packages HttpError404 W: Failed to fetch http://oss.oracle.com/debian/dists/unstable/non-free/binary-amd64/Packages HttpError404 E: Some index files failed to download. They have been ignored, or old ones used instead. + /bin/true + apt-get install -y --force-yes oracle-xe Reading package lists... Done Building dependency tree Reading state information... Done The following extra packages will be installed: gcc-4.9-base gcc-4.9-base:i386 libaio:i386 libc6 libc6:i386 libgcc1 libgcc1:i386 Suggested packages: glibc-doc glibc-doc:i386 locales:i386 The following NEW packages will be installed: gcc-4.9-base:i386 libaio:i386 libc6:i386 libgcc1:i386 oracle-xe:i386 The following packages will be upgraded: gcc-4.9-base libc6 libgcc1 3 upgraded, 5 newly installed, 0 to remove and 169 not upgraded. Need to get 230 MB of archives. After this operation, 415 MB of additional disk space will be used. WARNING: The following packages cannot be authenticated! libaio:i386 oracle-xe:i386 {noformat} I am uploading a patch to make it use 32-bit packages. 
Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL Key: HIVE-10239 URL: https://issues.apache.org/jira/browse/HIVE-10239 Project: Hive Issue Type: Improvement Affects Versions: 1.1.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, HIVE-10239.0.patch, HIVE-10239.00.patch, HIVE-10239.01.patch, HIVE-10239.patch Need to create DB-implementation specific scripts to use the framework introduced in HIVE-9800 to have any metastore schema changes tested across all supported databases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable
[ https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505638#comment-14505638 ] Jason Dere commented on HIVE-9917: -- The +1 is supposed to sit for a day before getting committed .. I'll get it in tomorrow After HIVE-3454 is done, make int to timestamp conversion configurable -- Key: HIVE-9917 URL: https://issues.apache.org/jira/browse/HIVE-9917 Project: Hive Issue Type: Improvement Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-9917.patch After HIVE-3454 is fixed, we will have correct behavior of converting int to timestamp. While the customers are using such incorrect behavior for so long, better to make it configurable so that in one release, it will default to old/inconsistent way and the next release will default to new/consistent way. And then we will deprecate it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions
[ https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505940#comment-14505940 ] Hive QA commented on HIVE-10384: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12726947/HIVE-10384.patch {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8728 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file 
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3517/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3517/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3517/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12726947 - PreCommit-HIVE-TRUNK-Build RetryingMetaStoreClient does not retry wrapped TTransportExceptions --- Key: HIVE-10384 URL: https://issues.apache.org/jira/browse/HIVE-10384 Project: Hive Issue Type: Bug Components: Clients Reporter: Eric Liang Assignee: Chaoyu Tang Attachments: HIVE-10384.patch This bug is very similar to HIVE-9436, in that a TTransportException wrapped in a MetaException will not be retried. RetryingMetaStoreClient has a block of code above the MetaException handler that retries thrift exceptions, but this doesn't work when the exception is wrapped.
{code}
if ((e.getCause() instanceof TApplicationException) ||
    (e.getCause() instanceof TProtocolException) ||
    (e.getCause() instanceof TTransportException)) {
  caughtException = (TException) e.getCause();
} else if ((e.getCause() instanceof MetaException) &&
    e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) {
  caughtException = (MetaException) e.getCause();
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10430) HIVE-9937 broke hadoop-1 build
[ https://issues.apache.org/jira/browse/HIVE-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10430: - Description: TestLazySimpleFast added in HIVE-9937 uses Text.copyBytes() that is not present in hadoop-1. (was: TestLazySimpleFast uses Text.copyBytes() that is not present in hadoop-1. ) HIVE-9937 broke hadoop-1 build -- Key: HIVE-10430 URL: https://issues.apache.org/jira/browse/HIVE-10430 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran TestLazySimpleFast added in HIVE-9937 uses Text.copyBytes() that is not present in hadoop-1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10430) HIVE-9937 broke hadoop-1 build
[ https://issues.apache.org/jira/browse/HIVE-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran resolved HIVE-10430. -- Resolution: Duplicate HIVE-9937 broke hadoop-1 build -- Key: HIVE-10430 URL: https://issues.apache.org/jira/browse/HIVE-10430 Project: Hive Issue Type: Bug Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran TestLazySimpleFast added in HIVE-9937 uses Text.copyBytes() that is not present in hadoop-1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10391) CBO (Calcite Return Path): HiveOpConverter always assumes that HiveFilter does not include a partition column
[ https://issues.apache.org/jira/browse/HIVE-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-10391: -- Attachment: HIVE-10391.patch CBO (Calcite Return Path): HiveOpConverter always assumes that HiveFilter does not include a partition column - Key: HIVE-10391 URL: https://issues.apache.org/jira/browse/HIVE-10391 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Laljo John Pullokkaran Fix For: 1.2.0 Attachments: HIVE-10391.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10434) Cancel connection to HS2 when remote Spark driver process has failed [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-10434: Issue Type: Sub-task (was: Improvement) Parent: HIVE-7292 Cancel connection to HS2 when remote Spark driver process has failed [Spark Branch] Key: HIVE-10434 URL: https://issues.apache.org/jira/browse/HIVE-10434 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.2.0 Reporter: Chao Sun Assignee: Chao Sun Currently in HoS, SparkClientImpl first launches a remote Driver process and then waits for it to connect back to HS2. However, in certain situations (for instance, a permission issue), the remote process may fail and exit with an error code. In this situation, the HS2 process will still wait for the process to connect, waiting for a full timeout period before it throws the exception. What makes it worse, the user may need to wait for two timeout periods: one for SparkSetReducerParallelism, and another for the actual Spark job. This can be very annoying. We should cancel the timeout task as soon as we find out that the process has failed, and set the promise as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
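The fix direction HIVE-10434 describes - fail the promise and cancel the pending timeout the moment the child process exits - can be sketched with standard java.util.concurrent types. This is illustrative only, not the actual SparkClientImpl change:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Sketch of the fix direction: instead of always waiting out the connect
// timeout, fail the connection promise as soon as the child process exits
// with an error, and cancel the pending timeout task. Illustrative only.
public class DriverWatchdogSketch {

    static CompletableFuture<String> awaitDriver(ScheduledExecutorService sched,
                                                 CompletableFuture<Integer> processExit,
                                                 long timeoutMs) {
        CompletableFuture<String> promise = new CompletableFuture<>();
        // Timeout path: fires only if nothing else completed the promise.
        ScheduledFuture<?> timeout = sched.schedule(
            () -> promise.completeExceptionally(new RuntimeException("connect timeout")),
            timeoutMs, TimeUnit.MILLISECONDS);
        // Fast-fail path: a nonzero exit code fails the promise immediately
        // and cancels the timeout, so the user does not wait the full period.
        processExit.thenAccept(code -> {
            if (code != 0 && promise.completeExceptionally(
                    new RuntimeException("driver exited with code " + code))) {
                timeout.cancel(false);
            }
        });
        return promise;
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService sched = Executors.newSingleThreadScheduledExecutor();
        CompletableFuture<Integer> exit = new CompletableFuture<>();
        CompletableFuture<String> promise = awaitDriver(sched, exit, 60_000);
        exit.complete(1); // driver process died, e.g. a permission problem
        System.out.println(promise.isCompletedExceptionally()); // true
        sched.shutdownNow();
    }
}
```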
[jira] [Commented] (HIVE-10192) insert into table failed for partitioned table.
[ https://issues.apache.org/jira/browse/HIVE-10192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506120#comment-14506120 ] Aihua Xu commented on HIVE-10192: - [~Ganesh.Sathish] Are you still having the issue? I tried a simple case and I don't see the issue. Could you please provide repro steps and sample data if you still do. insert into table failed for partitioned table. -- Key: HIVE-10192 URL: https://issues.apache.org/jira/browse/HIVE-10192 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.12.0 Environment: os-Unix Distribution-Pivotal Reporter: Ganesh Sathish When I am trying to load the data from a partitioned table in RC format to a partitioned table in ORC format, using the command below to load the data: create table ORC_Table stored as ORC as select * from RC_Table; I am facing the issue: ArrayIndexOutOfBoundsException:26 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10413) [CBO] Return path assumes distinct column cant be same as grouping column
[ https://issues.apache.org/jira/browse/HIVE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10413: Attachment: HIVE-10413.1.patch Updated patch which makes all queries in .q file pass except one with multiple distincts. [CBO] Return path assumes distinct column cant be same as grouping column - Key: HIVE-10413 URL: https://issues.apache.org/jira/browse/HIVE-10413 Project: Hive Issue Type: Sub-task Affects Versions: 1.2.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10413.1.patch, HIVE-10413.patch Found in cbo_udf_udaf.q tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10431) HIVE-9555 broke hadoop-1 build
[ https://issues.apache.org/jira/browse/HIVE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506231#comment-14506231 ] Prasanth Jayachandran commented on HIVE-10431: -- Seeing another error.. {code} RecordReaderUtils.java:[442,40] error: cannot find symbol [ERROR] class HdfsFileStatus {code} HIVE-9555 broke hadoop-1 build -- Key: HIVE-10431 URL: https://issues.apache.org/jira/browse/HIVE-10431 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Prasanth Jayachandran Assignee: Sergey Shelukhin HIVE-9555 RecordReaderUtils uses direct bytebuffer read from FSDataInputStream which is not present in hadoop-1. This breaks hadoop-1 compilation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10431) HIVE-9555 broke hadoop-1 build
[ https://issues.apache.org/jira/browse/HIVE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506233#comment-14506233 ] Prasanth Jayachandran commented on HIVE-10431: -- Ignore my previous comment.. That's not happening in trunk.. it happens only in LLAP branch. HIVE-9555 broke hadoop-1 build -- Key: HIVE-10431 URL: https://issues.apache.org/jira/browse/HIVE-10431 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Prasanth Jayachandran Assignee: Sergey Shelukhin HIVE-9555 RecordReaderUtils uses direct bytebuffer read from FSDataInputStream which is not present in hadoop-1. This breaks hadoop-1 compilation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10312) SASL.QOP in JDBC URL is ignored for Delegation token Authentication
[ https://issues.apache.org/jira/browse/HIVE-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-10312: Attachment: HIVE-10312.1.patch Seems like a reasonable patch. +1 SASL.QOP in JDBC URL is ignored for Delegation token Authentication --- Key: HIVE-10312 URL: https://issues.apache.org/jira/browse/HIVE-10312 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 1.2.0 Reporter: Mubashir Kazia Assignee: Mubashir Kazia Fix For: 1.2.0 Attachments: HIVE-10312.1.patch, HIVE-10312.1.patch When HS2 is configured for a QOP other than auth (auth-int or auth-conf), a Kerberos client connection works fine when the JDBC URL specifies the matching QOP; however, when this HS2 is accessed through Oozie (Delegation token / Digest authentication), the connection fails because the JDBC driver ignores the SASL.QOP parameter in the JDBC URL. The SASL.QOP setting should be valid for the DIGEST auth mechanism. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
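For reference, the standard javax.security.sasl API already exposes QOP as a client property, so the intent of the fix can be sketched as passing the sasl.qop value from the JDBC URL into the DIGEST-MD5 client properties just as for GSSAPI. This is a standalone sketch with a hypothetical class and method, not the actual Hive JDBC driver code:

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

public class SaslQopProps {
    // Hypothetical helper: build the SASL client properties from the sasl.qop
    // value parsed out of the JDBC URL. DIGEST-MD5 honors the same standard
    // Sasl.QOP property key that GSSAPI does.
    static Map<String, String> saslPropsFor(String qopFromUrl) {
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, qopFromUrl);      // e.g. "auth-conf" for encryption
        props.put(Sasl.SERVER_AUTH, "true");  // require mutual authentication
        return props;
    }

    public static void main(String[] args) {
        // Sasl.QOP is the standard property key understood by DIGEST-MD5 too.
        System.out.println(Sasl.QOP);                                // javax.security.sasl.qop
        System.out.println(saslPropsFor("auth-conf").get(Sasl.QOP)); // auth-conf
    }
}
```

The same property map would then be handed to Sasl.createSaslClient regardless of which mechanism was negotiated.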
[jira] [Commented] (HIVE-10428) NPE in RegexSerDe using HCat
[ https://issues.apache.org/jira/browse/HIVE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506373#comment-14506373 ] Hive QA commented on HIVE-10428: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12727013/HIVE-10428.1.patch {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8728 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3521/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3521/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3521/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12727013 - PreCommit-HIVE-TRUNK-Build NPE in RegexSerDe using HCat Key: HIVE-10428 URL: https://issues.apache.org/jira/browse/HIVE-10428 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-10428.1.patch When HCatalog reads a table that uses org.apache.hadoop.hive.serde2.RegexSerDe, it throws an exception: {noformat} 15/04/21 14:07:31 INFO security.TokenCache: Got dt for hdfs://hdpsecahdfs; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdpsecahdfs, Ident: (HDFS_DELEGATION_TOKEN token 1478 for haha) 15/04/21 14:07:31 INFO mapred.FileInputFormat: Total input paths to process : 1 Splits len : 1 SplitInfo : [hdpseca03.seca.hwxsup.com, hdpseca04.seca.hwxsup.com, hdpseca05.seca.hwxsup.com] 15/04/21 14:07:31 INFO mapreduce.InternalUtil: Initializing org.apache.hadoop.hive.serde2.RegexSerDe with properties {name=casetest.regex_table, numFiles=1, columns.types=string,string, serialization.format=1, columns=id,name, rawDataSize=0, numRows=0, output.format.string=%1$s %2$s, 
serialization.lib=org.apache.hadoop.hive.serde2.RegexSerDe, COLUMN_STATS_ACCURATE=true, totalSize=25, serialization.null.format=\N, input.regex=([^ ]*) ([^ ]*), transient_lastDdlTime=1429590172} 15/04/21 14:07:31 WARN serde2.RegexSerDe: output.format.string has been deprecated Exception in thread main java.lang.NullPointerException at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187) at com.google.common.base.Splitter.split(Splitter.java:371) at org.apache.hadoop.hive.serde2.RegexSerDe.initialize(RegexSerDe.java:155) at org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:49) at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:518) at
[jira] [Commented] (HIVE-9711) ORC Vectorization DoubleColumnVector.isRepeating=false if all entries are NaN
[ https://issues.apache.org/jira/browse/HIVE-9711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506224#comment-14506224 ] Prasanth Jayachandran commented on HIVE-9711: - SVN is currently read-only. Will commit this patch once we migrate to git. ORC Vectorization DoubleColumnVector.isRepeating=false if all entries are NaN - Key: HIVE-9711 URL: https://issues.apache.org/jira/browse/HIVE-9711 Project: Hive Issue Type: Bug Components: File Formats, Vectorization Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Gopal V Fix For: 1.2.0 Attachments: HIVE-9711.1.patch, HIVE-9711.2.patch, HIVE-9711.3.patch The isRepeating=true check uses Java equality, which results in NaN != NaN comparison operations. The noNulls case needs the current check folded into the previous loop, while the hasNulls case needs a logical AND of the isNull[] field instead of == comparisons. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
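The == pitfall behind this issue can be reproduced in a few lines of standalone Java. This is a sketch of the problem plus one possible bit-level check, not the actual Hive patch (which, as described above, folds the check into the fill loops and ANDs the isNull[] flags):

```java
public class NaNRepeatingCheck {
    // Equality-based isRepeating scans break on NaN because Java's == never
    // holds for NaN. Comparing raw bits via Double.doubleToLongBits, which
    // canonicalizes all NaN values to one bit pattern, sidesteps that.
    static boolean allRepeating(double[] vector) {
        long first = Double.doubleToLongBits(vector[0]);
        for (int i = 1; i < vector.length; i++) {
            if (Double.doubleToLongBits(vector[i]) != first) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] allNaN = {Double.NaN, Double.NaN, Double.NaN};
        // A naive == scan would mark this column non-repeating:
        System.out.println(allNaN[0] == allNaN[1]);  // false
        // The bit-level comparison recognizes it as repeating:
        System.out.println(allRepeating(allNaN));    // true
    }
}
```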
[jira] [Assigned] (HIVE-10048) JDBC - Support SSL encryption regardless of Authentication mechanism
[ https://issues.apache.org/jira/browse/HIVE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mubashir Kazia reassigned HIVE-10048: - Assignee: Mubashir Kazia JDBC - Support SSL encryption regardless of Authentication mechanism Key: HIVE-10048 URL: https://issues.apache.org/jira/browse/HIVE-10048 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 1.0.0 Reporter: Mubashir Kazia Assignee: Mubashir Kazia Labels: newbie, patch Fix For: 1.2.0 Attachments: HIVE-10048.1.patch JDBC driver currently only supports SSL Transport if the Authentication mechanism is SASL Plain with username and password. SSL transport should be decoupled from Authentication mechanism. If the customer chooses to do Kerberos Authentication and SSL encryption over the wire it should be supported. The Server side already supports this but the driver does not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10115) HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and Delegation token(DIGEST) when alternate authentication is enabled
[ https://issues.apache.org/jira/browse/HIVE-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mubashir Kazia reassigned HIVE-10115: - Assignee: Mubashir Kazia HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and Delegation token(DIGEST) when alternate authentication is enabled --- Key: HIVE-10115 URL: https://issues.apache.org/jira/browse/HIVE-10115 Project: Hive Issue Type: Improvement Components: Authentication Affects Versions: 1.1.0 Reporter: Mubashir Kazia Assignee: Mubashir Kazia Labels: patch Fix For: 1.2.0 Attachments: HIVE-10115.0.patch In a Kerberized cluster, when alternate authentication is enabled on HS2, it should also accept Kerberos authentication. This is important because when we enable LDAP authentication, HS2 stops accepting delegation token authentication, so we are forced to enter usernames and passwords in the Oozie configuration. The whole idea of SASL is that multiple authentication mechanisms can be offered. If we disable Kerberos (GSSAPI) and delegation token (DIGEST) authentication when we enable LDAP authentication, this defeats the purpose of SASL. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4625) HS2 should not attempt to get delegation token from metastore if using embedded metastore
[ https://issues.apache.org/jira/browse/HIVE-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506241#comment-14506241 ] Thejas M Nair commented on HIVE-4625: - There are 2 other functions there as well where the exception is thrown only in thrift (non-local) mode; a similar change should be made there for consistency. HS2 should not attempt to get delegation token from metastore if using embedded metastore - Key: HIVE-4625 URL: https://issues.apache.org/jira/browse/HIVE-4625 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-4625.1.patch, HIVE-4625.2.patch, HIVE-4625.3.patch, HIVE-4625.4.patch In kerberos secure mode, with doas enabled, HiveServer2 tries to get a delegation token from the metastore even if the metastore is being used in embedded mode. To avoid failure in that case, it uses a catch block for the UnsupportedOperationException that does nothing. But this leads to an error being logged by lower levels and can mislead users into thinking that there is a problem. It should check whether delegation token mode is supported with the current configuration before calling the function. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10417) Parallel Order By return wrong results for partitioned tables
[ https://issues.apache.org/jira/browse/HIVE-10417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemon Lou reassigned HIVE-10417: Assignee: Nemon Lou Parallel Order By return wrong results for partitioned tables - Key: HIVE-10417 URL: https://issues.apache.org/jira/browse/HIVE-10417 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 0.13.1, 1.0.0 Reporter: Nemon Lou Assignee: Nemon Lou Following is the script that reproduces this bug.
set hive.optimize.sampling.orderby=true;
set mapreduce.job.reduces=10;
select * from src order by key desc limit 10;
+----------+------------+
| src.key  | src.value  |
+----------+------------+
| 98       | val_98     |
| 98       | val_98     |
| 97       | val_97     |
| 97       | val_97     |
| 96       | val_96     |
| 95       | val_95     |
| 95       | val_95     |
| 92       | val_92     |
| 90       | val_90     |
| 90       | val_90     |
+----------+------------+
10 rows selected (47.916 seconds)
reset;
create table src_orc_p (key string, value string) partitioned by (kp string) stored as orc tblproperties('orc.compress'='SNAPPY');
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=1;
set hive.exec.max.dynamic.partitions=1;
insert into table src_orc_p partition(kp) select *,substring(key,1) from src distribute by substring(key,1);
set mapreduce.job.reduces=10;
set hive.optimize.sampling.orderby=true;
select * from src_orc_p order by key desc limit 10;
+----------------+------------------+-----------------+
| src_orc_p.key  | src_orc_p.value  | src_orc_p.kend  |
+----------------+------------------+-----------------+
| 0              | val_0            | 0               |
| 0              | val_0            | 0               |
| 0              | val_0            | 0               |
+----------------+------------------+-----------------+
3 rows selected (39.861 seconds)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10339) Allow JDBC Driver to pass HTTP header Key/Value pairs
[ https://issues.apache.org/jira/browse/HIVE-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506191#comment-14506191 ] Hive QA commented on HIVE-10339: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12727000/HIVE-10339.2.patch {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8729 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3519/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3519/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3519/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12727000 - PreCommit-HIVE-TRUNK-Build Allow JDBC Driver to pass HTTP header Key/Value pairs - Key: HIVE-10339 URL: https://issues.apache.org/jira/browse/HIVE-10339 Project: Hive Issue Type: Improvement Components: Beeline Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10339.1.patch, HIVE-10339.2.patch Currently the Beeline JDBC driver does not support carrying user-specified HTTP headers. The Beeline JDBC connection string in HTTP mode is jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint. When the transport mode is http, the Beeline/ODBC driver should allow the end user to send arbitrary HTTP header name/value pairs. All the driver needs to do is take the user-specified names and values and call the underlying HttpClient API to set the headers. E.g. the Beeline connection string could be jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint;http.header.name1=value1, and Beeline will set the HTTP header name1 to value1 via the underlying client. This is required for the end user to send an identity in an HTTP header down to Knox via Beeline.
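As a rough illustration of the driver-side work involved, here is a standalone sketch (hypothetical class and method names, not the actual Beeline/JDBC driver code) that pulls http.header.* pairs out of the session-variable portion of such a connection string:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HttpHeaderParamParser {
    // Hypothetical prefix convention matching the proposal above.
    static final String PREFIX = "http.header.";

    // Extract "http.header.<name>=<value>" pairs from the session-variable
    // part of a HiveServer2 JDBC URL; the driver would then set each pair
    // as a real HTTP header on the underlying client.
    static Map<String, String> parseHeaders(String sessionVars) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String kv : sessionVars.split("[;,]")) { // tolerate ; or , separators
            int eq = kv.indexOf('=');
            if (eq > 0 && kv.startsWith(PREFIX)) {
                // strip the prefix to recover the real HTTP header name
                headers.put(kv.substring(PREFIX.length(), eq), kv.substring(eq + 1));
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        String vars = "hive.server2.transport.mode=http;"
                + "hive.server2.thrift.http.path=http_endpoint;"
                + "http.header.name1=value1";
        System.out.println(parseHeaders(vars)); // {name1=value1}
    }
}
```

Ordinary session variables without the prefix are left alone, so the existing transport-mode parameters are unaffected.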
[jira] [Assigned] (HIVE-10312) SASL.QOP in JDBC URL is ignored for Delegation token Authentication
[ https://issues.apache.org/jira/browse/HIVE-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mubashir Kazia reassigned HIVE-10312: - Assignee: Mubashir Kazia SASL.QOP in JDBC URL is ignored for Delegation token Authentication --- Key: HIVE-10312 URL: https://issues.apache.org/jira/browse/HIVE-10312 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 1.2.0 Reporter: Mubashir Kazia Assignee: Mubashir Kazia Fix For: 1.2.0 Attachments: HIVE-10312.1.patch When HS2 is configured for a QOP other than auth (auth-int or auth-conf), a Kerberos client connection works fine when the JDBC URL specifies the matching QOP; however, when this HS2 is accessed through Oozie (Delegation token / Digest authentication), the connection fails because the JDBC driver ignores the SASL.QOP parameter in the JDBC URL. The SASL.QOP setting should be valid for the DIGEST auth mechanism. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-5545) HCatRecord getInteger method returns String when used on Partition columns of type INT
[ https://issues.apache.org/jira/browse/HIVE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan reassigned HIVE-5545: -- Assignee: Sushanth Sowmyan HCatRecord getInteger method returns String when used on Partition columns of type INT -- Key: HIVE-5545 URL: https://issues.apache.org/jira/browse/HIVE-5545 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Environment: hadoop-1.0.3 Reporter: Rishav Rohit Assignee: Sushanth Sowmyan HCatRecord getInteger method returns String when used on Partition columns of type INT. java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10339) Allow JDBC Driver to pass HTTP header Key/Value pairs
[ https://issues.apache.org/jira/browse/HIVE-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506436#comment-14506436 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-10339: -- The failures are unrelated to the change. Thanks Hari Allow JDBC Driver to pass HTTP header Key/Value pairs - Key: HIVE-10339 URL: https://issues.apache.org/jira/browse/HIVE-10339 Project: Hive Issue Type: Improvement Components: Beeline Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10339.1.patch, HIVE-10339.2.patch Currently the Beeline JDBC driver does not support carrying user-specified HTTP headers. The Beeline JDBC connection string in HTTP mode is jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint. When the transport mode is http, the Beeline/ODBC driver should allow the end user to send arbitrary HTTP header name/value pairs. All the driver needs to do is take the user-specified names and values and call the underlying HttpClient API to set the headers. E.g. the Beeline connection string could be jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint;http.header.name1=value1, and Beeline will set the HTTP header name1 to value1 via the underlying client. This is required for the end user to send an identity in an HTTP header down to Knox via Beeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9957) Hive 1.1.0 not compatible with Hadoop 2.4.0
[ https://issues.apache.org/jira/browse/HIVE-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506463#comment-14506463 ] Lefty Leverenz commented on HIVE-9957: -- The Hive wiki has a section explaining how to apply a patch in the How to Contribute doc: * [How To Contribute -- Applying a Patch | https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-ApplyingaPatch] Hive 1.1.0 not compatible with Hadoop 2.4.0 --- Key: HIVE-9957 URL: https://issues.apache.org/jira/browse/HIVE-9957 Project: Hive Issue Type: Bug Components: Encryption Reporter: Vivek Shrivastava Assignee: Sergio Peña Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9957.1.patch Getting this exception while accessing data through Hive. Exception in thread main java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.DFSClient.getKeyProvider()Lorg/apache/hadoop/crypto/key/KeyProvider; at org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.init(Hadoop23Shims.java:1152) at org.apache.hadoop.hive.shims.Hadoop23Shims.createHdfsEncryptionShim(Hadoop23Shims.java:1279) at org.apache.hadoop.hive.ql.session.SessionState.getHdfsEncryptionShim(SessionState.java:392) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:1756) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStagingDirectoryPathname(SemanticAnalyzer.java:1875) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1689) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1427) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10132) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10147) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:192) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222) at 
org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4625) HS2 should not attempt to get delegation token from metastore if using embedded metastore
[ https://issues.apache.org/jira/browse/HIVE-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506442#comment-14506442 ] Hive QA commented on HIVE-4625: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12727018/HIVE-4625.4.patch {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8728 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3522/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3522/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3522/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12727018 - PreCommit-HIVE-TRUNK-Build HS2 should not attempt to get delegation token from metastore if using embedded metastore - Key: HIVE-4625 URL: https://issues.apache.org/jira/browse/HIVE-4625 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-4625.1.patch, HIVE-4625.2.patch, HIVE-4625.3.patch, HIVE-4625.4.patch In kerberos secure mode, with doas enabled, Hive server2 tries to get delegation token from metastore even if the metastore is being used in embedded mode. To avoid failure in that case, it uses catch block for UnsupportedOperationException thrown that does nothing. But this leads to an error being logged by lower levels and can mislead users into thinking that there is a problem. It should check if delegation token mode is supported with current configuration before calling the function. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10434) Cancel connection when remote Spark driver process has failed [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-10434: Summary: Cancel connection when remote Spark driver process has failed [Spark Branch] (was: Cancel connection to HS2 when remote Spark driver process has failed [Spark Branch] ) Cancel connection when remote Spark driver process has failed [Spark Branch] - Key: HIVE-10434 URL: https://issues.apache.org/jira/browse/HIVE-10434 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 1.2.0 Reporter: Chao Sun Assignee: Chao Sun Attachments: HIVE-10434.1-spark.patch Currently in HoS, SparkClientImpl first launches a remote Driver process and then waits for it to connect back to the HS2. However, in certain situations (for instance, a permission issue), the remote process may fail and exit with an error code. In this situation, the HS2 process will still wait for the process to connect, waiting out a full timeout period before it throws the exception. What makes it worse, the user may need to wait for two timeout periods: one for the SparkSetReducerParallelism, and another for the actual Spark job. This could be very annoying. We should cancel the timeout task once we find out that the process has failed, and set the promise as failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
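The fail-fast idea can be sketched with plain java.util.concurrent primitives. This is a standalone illustration, not SparkClientImpl itself; the POSIX "false" command stands in for a remote driver process that dies on startup:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

public class DriverWatchdog {
    // Fail the pending "driver connected" promise as soon as the child
    // process exits abnormally, instead of letting callers block until
    // the full connection timeout expires.
    static void failFastOnExit(Process driver, CompletableFuture<Void> connected) {
        Thread monitor = new Thread(() -> {
            try {
                int code = driver.waitFor();
                if (code != 0) {
                    connected.completeExceptionally(
                        new IllegalStateException("driver exited with code " + code));
                }
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
        });
        monitor.setDaemon(true);
        monitor.start();
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<Void> connected = new CompletableFuture<>();
        // "false" exits immediately with status 1, simulating a driver
        // process that fails before it can connect back.
        Process driver = new ProcessBuilder("false").start();
        failFastOnExit(driver, connected);
        try {
            // Without the watchdog this would block for the whole timeout.
            connected.get(90, TimeUnit.SECONDS);
        } catch (ExecutionException e) {
            System.out.println("failed fast: " + e.getCause().getMessage());
        }
    }
}
```

The get() call returns almost instantly with the failure instead of blocking for 90 seconds, which is the behavior the patch aims for.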
[jira] [Updated] (HIVE-4625) HS2 should not attempt to get delegation token from metastore if using embedded metastore
[ https://issues.apache.org/jira/browse/HIVE-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-4625: Attachment: HIVE-4625.4.patch HS2 should not attempt to get delegation token from metastore if using embedded metastore - Key: HIVE-4625 URL: https://issues.apache.org/jira/browse/HIVE-4625 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-4625.1.patch, HIVE-4625.2.patch, HIVE-4625.3.patch, HIVE-4625.4.patch In kerberos secure mode, with doas enabled, Hive server2 tries to get delegation token from metastore even if the metastore is being used in embedded mode. To avoid failure in that case, it uses catch block for UnsupportedOperationException thrown that does nothing. But this leads to an error being logged by lower levels and can mislead users into thinking that there is a problem. It should check if delegation token mode is supported with current configuration before calling the function. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10391) CBO (Calcite Return Path): HiveOpConverter always assumes that HiveFilter does not include a partition column
[ https://issues.apache.org/jira/browse/HIVE-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506001#comment-14506001 ] Laljo John Pullokkaran commented on HIVE-10391: --- [~ashutoshc] Can you review the patch? CBO (Calcite Return Path): HiveOpConverter always assumes that HiveFilter does not include a partition column - Key: HIVE-10391 URL: https://issues.apache.org/jira/browse/HIVE-10391 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Pengcheng Xiong Assignee: Laljo John Pullokkaran Fix For: 1.2.0 Attachments: HIVE-10391.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506061#comment-14506061 ] Hive QA commented on HIVE-9824: ---

{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726989/HIVE-9824.08.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8750 tests executed

*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3518/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3518/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3518/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726989 - PreCommit-HIVE-TRUNK-Build

LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
--
Key: HIVE-9824
URL: https://issues.apache.org/jira/browse/HIVE-9824
Project: Hive
Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch, HIVE-9824.08.patch

Today's VectorMapJoinOperator is a pass-through that converts each row from a vectorized row batch into a Java Object[] row and passes it to the MapJoinOperator superclass. This enhancement creates specialized, optimized vectorized map join operator classes.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
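The pass-through cost that the HIVE-9824 description mentions can be pictured with a toy sketch. The types and names below are simplified stand-ins, not Hive's actual VectorizedRowBatch classes: a batch is stored column-major (one array per column), and the pass-through flattens it into one Object[] per row before the row-mode join sees it.

```java
import java.util.ArrayList;
import java.util.List;

public class PassThroughSketch {
    // Flatten a column-major batch (one array per column) into row-major
    // Object[] rows, as the pass-through VectorMapJoinOperator does.
    static List<Object[]> batchToRows(long[] keyCol, String[] valueCol, int size) {
        List<Object[]> rows = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            // One Object[] allocation plus a Long box per row -- the overhead
            // that specialized native vectorized operators are meant to avoid.
            rows.add(new Object[] { keyCol[i], valueCol[i] });
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Object[]> rows = batchToRows(new long[] {1L, 2L},
                                          new String[] {"a", "b"}, 2);
        System.out.println(rows.size());    // prints 2
        System.out.println(rows.get(1)[1]); // prints b
    }
}
```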
[jira] [Resolved] (HIVE-10354) Investigate the test failure of TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
[ https://issues.apache.org/jira/browse/HIVE-10354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu resolved HIVE-10354. - Resolution: Fixed

The issue may be related to the environment. Resolving it for now; let's reopen it if it persists.

Investigate the test failure of TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
-
Key: HIVE-10354
URL: https://issues.apache.org/jira/browse/HIVE-10354
Project: Hive
Issue Type: Bug
Reporter: Aihua Xu

It failed with:
{noformat}
java.lang.NullPointerException: null
	at org.apache.hadoop.hive.metastore.HiveMetaStore.getDelegationToken(HiveMetaStore.java:5752)
	at org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.getDelegationTokenStr(TestHadoop20SAuthBridge.java:318)
	at org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.obtainTokenAndAddIntoUGI(TestHadoop20SAuthBridge.java:339)
	at org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore(TestHadoop20SAuthBridge.java:231)
{noformat}
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10339) Allow JDBC Driver to pass HTTP header Key/Value pairs
[ https://issues.apache.org/jira/browse/HIVE-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505964#comment-14505964 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-10339: -- [~thejas] Thanks for the review. Added HIVE-10432 as the follow-up jira. Thanks Hari

Allow JDBC Driver to pass HTTP header Key/Value pairs
-
Key: HIVE-10339
URL: https://issues.apache.org/jira/browse/HIVE-10339
Project: Hive
Issue Type: Improvement
Components: Beeline
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Attachments: HIVE-10339.1.patch, HIVE-10339.2.patch

Currently the Beeline JDBC driver does not support carrying user-specified HTTP headers. The JDBC connection string in HTTP mode looks like jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint. When the transport mode is http, the Beeline/ODBC driver should allow the end user to send arbitrary HTTP header name/value pairs. All the driver needs to do is take the user-specified names and values and call the underlying HttpClient API to set the headers. E.g., the connection string could be jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint;http.header.name1=value1, and the driver would then set an HTTP header named name1 with value value1. This is required for the end user to send an identity in an HTTP header down to Knox via Beeline.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
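One way to read the proposal above is as a parsing step: recognize `http.header.<name>=<value>` pairs among the session variables of the connection string and collect them for the HTTP layer. This is a hypothetical sketch of that step only, not Hive's actual implementation; the `http.header.` prefix is taken from the example in the description, and applying the collected headers (e.g. via an HttpClient request interceptor) is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HttpHeaderParamParser {
    // Collect "http.header.<name>=<value>" pairs from the semicolon-separated
    // session-variable portion of a connection string.
    static Map<String, String> parseCustomHeaders(String sessVars) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String kv : sessVars.split(";")) {
            int eq = kv.indexOf('=');
            if (eq < 0) {
                continue; // not a key=value pair; ignore
            }
            String key = kv.substring(0, eq).trim();
            if (key.startsWith("http.header.")) {
                headers.put(key.substring("http.header.".length()),
                            kv.substring(eq + 1).trim());
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        String sessVars = "hive.server2.transport.mode=http;"
                + "hive.server2.thrift.http.path=http_endpoint;"
                + "http.header.name1=value1";
        System.out.println(parseCustomHeaders(sessVars)); // prints {name1=value1}
    }
}
```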
[jira] [Assigned] (HIVE-10370) Hive does not compile with -Phadoop-1 option
[ https://issues.apache.org/jira/browse/HIVE-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-10370: Assignee: Prasanth Jayachandran

Hive does not compile with -Phadoop-1 option
Key: HIVE-10370
URL: https://issues.apache.org/jira/browse/HIVE-10370
Project: Hive
Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Prasanth Jayachandran
Priority: Critical
Attachments: HIVE-10370.1.patch

Running into the below error while running mvn clean install -Pdist -Phadoop-1:
{code}
[ERROR] hive/serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazySimpleFast.java:[164,33] cannot find symbol
  symbol:   method copyBytes()
  location: variable serialized of type org.apache.hadoop.io.Text
{code}
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10370) Hive does not compile with -Phadoop-1 option
[ https://issues.apache.org/jira/browse/HIVE-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10370: - Attachment: HIVE-10370.1.patch

Compilation will still break because of HIVE-10431.

Hive does not compile with -Phadoop-1 option
Key: HIVE-10370
URL: https://issues.apache.org/jira/browse/HIVE-10370
Project: Hive
Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Priority: Critical
Attachments: HIVE-10370.1.patch

Running into the below error while running mvn clean install -Pdist -Phadoop-1:
{code}
[ERROR] hive/serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazySimpleFast.java:[164,33] cannot find symbol
  symbol:   method copyBytes()
  location: variable serialized of type org.apache.hadoop.io.Text
{code}
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
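The compile error above arises because Text.copyBytes() was only added in hadoop-2. A portable equivalent on both Hadoop lines is to trim the (possibly oversized) backing array returned by getBytes() down to getLength(). The sketch below models this with a plain byte[] plus a logical length, so it does not depend on the real org.apache.hadoop.io.Text class; whether the test was actually fixed this way is an assumption.

```java
import java.util.Arrays;

public class PortableTextCopy {
    // Equivalent of hadoop-2's Text.copyBytes(): copy only the logical length,
    // since the backing buffer may have spare capacity beyond it.
    static byte[] copyBytes(byte[] backing, int length) {
        return Arrays.copyOf(backing, length);
    }

    public static void main(String[] args) {
        byte[] backing = {'h', 'i', 0, 0}; // capacity 4, logical length 2
        byte[] copy = copyBytes(backing, 2);
        System.out.println(copy.length);   // prints 2
    }
}
```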
[jira] [Updated] (HIVE-10434) Cancel connection to HS2 when remote Spark driver process has failed [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-10434: Attachment: HIVE-10434.1-spark.patch

Attaching initial patch. Tested on my own cluster and it worked.

Cancel connection to HS2 when remote Spark driver process has failed [Spark Branch]
Key: HIVE-10434
URL: https://issues.apache.org/jira/browse/HIVE-10434
Project: Hive
Issue Type: Sub-task
Components: Spark
Affects Versions: 1.2.0
Reporter: Chao Sun
Assignee: Chao Sun
Attachments: HIVE-10434.1-spark.patch

Currently in HoS, SparkClientImpl first launches a remote driver process and then waits for it to connect back to HS2. However, in certain situations (for instance, a permission issue), the remote process may fail and exit with an error code. In that situation, the HS2 process will still wait for the process to connect, sitting through a full timeout period before it throws the exception. What makes it worse, the user may need to wait through two timeout periods: one for SparkSetReducerParallelism, and another for the actual Spark job. This could be very annoying. We should cancel the timeout task as soon as we find out that the process has failed, and set the promise as failed.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
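The fail-fast idea described above can be sketched as a watcher thread that observes the child process and fails the "driver connected back" promise as soon as the process dies with a non-zero exit code, rather than letting HS2 sit out the full connection timeout. All names here are illustrative, not Hive's actual API, and the demo in main assumes a Unix-like system where the `false` command exits with code 1.

```java
import java.util.concurrent.CompletableFuture;

public class DriverWatchSketch {
    static String exitMessage(int code) {
        return "Driver exited with code " + code + " before connecting";
    }

    // Daemon thread that fails the promise the moment the driver dies,
    // instead of waiting for the connection timeout to expire.
    static void watchDriver(Process driver, CompletableFuture<Void> connected) {
        Thread watcher = new Thread(() -> {
            try {
                int code = driver.waitFor();
                if (code != 0 && !connected.isDone()) {
                    connected.completeExceptionally(
                            new RuntimeException(exitMessage(code)));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        watcher.setDaemon(true);
        watcher.start();
    }

    public static void main(String[] args) throws Exception {
        Process driver = new ProcessBuilder("false").start(); // exits with 1
        CompletableFuture<Void> connected = new CompletableFuture<>();
        watchDriver(driver, connected);
        try {
            connected.get(); // fails fast instead of waiting out a timeout
        } catch (Exception e) {
            System.out.println("failed fast: " + e.getCause().getMessage());
        }
    }
}
```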
[jira] [Updated] (HIVE-10428) NPE in RegexSerDe using HCat
[ https://issues.apache.org/jira/browse/HIVE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-10428: -- Attachment: HIVE-10428.1.patch

NPE in RegexSerDe using HCat
Key: HIVE-10428
URL: https://issues.apache.org/jira/browse/HIVE-10428
Project: Hive
Issue Type: Bug
Components: Serializers/Deserializers
Reporter: Jason Dere
Assignee: Jason Dere
Attachments: HIVE-10428.1.patch

When HCatalog reads a table that uses org.apache.hadoop.hive.serde2.RegexSerDe, it throws an exception:
{noformat}
15/04/21 14:07:31 INFO security.TokenCache: Got dt for hdfs://hdpsecahdfs; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdpsecahdfs, Ident: (HDFS_DELEGATION_TOKEN token 1478 for haha)
15/04/21 14:07:31 INFO mapred.FileInputFormat: Total input paths to process : 1
Splits len : 1
SplitInfo : [hdpseca03.seca.hwxsup.com, hdpseca04.seca.hwxsup.com, hdpseca05.seca.hwxsup.com]
15/04/21 14:07:31 INFO mapreduce.InternalUtil: Initializing org.apache.hadoop.hive.serde2.RegexSerDe with properties {name=casetest.regex_table, numFiles=1, columns.types=string,string, serialization.format=1, columns=id,name, rawDataSize=0, numRows=0, output.format.string=%1$s %2$s, serialization.lib=org.apache.hadoop.hive.serde2.RegexSerDe, COLUMN_STATS_ACCURATE=true, totalSize=25, serialization.null.format=\N, input.regex=([^ ]*) ([^ ]*), transient_lastDdlTime=1429590172}
15/04/21 14:07:31 WARN serde2.RegexSerDe: output.format.string has been deprecated
Exception in thread "main" java.lang.NullPointerException
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
	at com.google.common.base.Splitter.split(Splitter.java:371)
	at org.apache.hadoop.hive.serde2.RegexSerDe.initialize(RegexSerDe.java:155)
	at org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:49)
	at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:518)
	at org.apache.hive.hcatalog.mapreduce.InternalUtil.initializeDeserializer(InternalUtil.java:156)
	at org.apache.hive.hcatalog.mapreduce.HCatRecordReader.createDeserializer(HCatRecordReader.java:127)
	at org.apache.hive.hcatalog.mapreduce.HCatRecordReader.initialize(HCatRecordReader.java:92)
	at HCatalogSQLMR.main(HCatalogSQLMR.java:81)
{noformat}
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
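The stack trace shows Guava's Preconditions.checkNotNull firing inside Splitter.split(), i.e. RegexSerDe.initialize() handed a null property value to the splitter. A defensive fix can be sketched as substituting an empty default before splitting; which exact property the HCatalog path omits is an assumption here, and this is not the code of the attached patch.

```java
import java.util.Properties;

public class NullSafeSerDeInit {
    // Split a comma-separated list property, tolerating a missing key.
    // Guava's Splitter.split(null) would throw the NPE shown above; an empty
    // default sidesteps it when HCatalog omits a property the CLI always sets.
    static String[] splitListProperty(Properties tbl, String key) {
        String raw = tbl.getProperty(key);
        if (raw == null) {
            raw = ""; // property absent in the HCatalog-supplied table props
        }
        return raw.isEmpty() ? new String[0] : raw.split(",", -1);
    }

    public static void main(String[] args) {
        Properties tbl = new Properties();
        tbl.setProperty("columns", "id,name");
        System.out.println(splitListProperty(tbl, "columns").length);          // prints 2
        System.out.println(splitListProperty(tbl, "columns.comments").length); // prints 0
    }
}
```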
[jira] [Commented] (HIVE-10431) HIVE-9555 broke hadoop-1 build
[ https://issues.apache.org/jira/browse/HIVE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505957#comment-14505957 ] Prasanth Jayachandran commented on HIVE-10431: -- [~sershe] fyi..

HIVE-9555 broke hadoop-1 build
--
Key: HIVE-10431
URL: https://issues.apache.org/jira/browse/HIVE-10431
Project: Hive
Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Sergey Shelukhin

Since HIVE-9555, RecordReaderUtils uses a direct-ByteBuffer read from FSDataInputStream, which is not present in hadoop-1. This breaks the hadoop-1 compilation.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions
[ https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-10384: --- Attachment: HIVE-10384.1.patch

Thanks [~szehon] for reviewing the code, and Eric Liang for providing the case where the TTransportException is wrapped in a MetaException which is further wrapped in an InvocationTargetException. Updated the patch to include that case as well. Thanks.

RetryingMetaStoreClient does not retry wrapped TTransportExceptions
---
Key: HIVE-10384
URL: https://issues.apache.org/jira/browse/HIVE-10384
Project: Hive
Issue Type: Bug
Components: Clients
Reporter: Eric Liang
Assignee: Chaoyu Tang
Attachments: HIVE-10384.1.patch, HIVE-10384.patch

This bug is very similar to HIVE-9436, in that a TTransportException wrapped in a MetaException will not be retried. RetryingMetaStoreClient has a block of code above the MetaException handler that retries thrift exceptions, but this doesn't work when the exception is wrapped.
{code}
if ((e.getCause() instanceof TApplicationException) ||
    (e.getCause() instanceof TProtocolException) ||
    (e.getCause() instanceof TTransportException)) {
  caughtException = (TException) e.getCause();
} else if ((e.getCause() instanceof MetaException) &&
    e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) {
  caughtException = (MetaException) e.getCause();
{code}
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
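The snippet above inspects only the immediate cause, which is exactly why a wrapped exception slips through. The general idea behind the fix can be sketched as walking the entire cause chain; the sketch below uses stand-in JDK exception classes rather than the real Thrift ones, and is not the attached patch.

```java
public class RetryableCheck {
    // Walk the full cause chain so a retryable failure wrapped one or more
    // levels deep (e.g. TTransportException inside MetaException inside
    // InvocationTargetException) is still detected.
    static boolean causeChainContains(Throwable t, Class<?> type) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (type.isInstance(cur)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // IllegalStateException plays the wrapping MetaException; the
        // ArithmeticException plays the wrapped transport-level failure.
        Throwable wrapped = new IllegalStateException(
                new RuntimeException(new ArithmeticException("transport down")));
        System.out.println(causeChainContains(wrapped, ArithmeticException.class));  // prints true
        System.out.println(causeChainContains(wrapped, InterruptedException.class)); // prints false
    }
}
```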
[jira] [Updated] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
[ https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-9824: --- Attachment: HIVE-9824.07.patch

LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
--
Key: HIVE-9824
URL: https://issues.apache.org/jira/browse/HIVE-9824
Project: Hive
Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch

Today's VectorMapJoinOperator is a pass-through that converts each row from a vectorized row batch into a Java Object[] row and passes it to the MapJoinOperator superclass. This enhancement creates specialized, optimized vectorized map join operator classes.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8858) Visualize generated Spark plan [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505272#comment-14505272 ] Jimmy Xiang commented on HIVE-8858: --- Looks good. Can this be part of explain extended? If we have to write it to the log file, should we put it in a buffer and log it in one log.info call? Another thing: in assigning those numbers, can they match the corresponding works/operators? For example, MapInput 1 corresponds to Map 1 while MapInput 2 corresponds to Map 2?

Visualize generated Spark plan [Spark Branch]
-
Key: HIVE-8858
URL: https://issues.apache.org/jira/browse/HIVE-8858
Project: Hive
Issue Type: Sub-task
Components: Spark
Reporter: Xuefu Zhang
Assignee: Chinna Rao Lalam
Attachments: HIVE-8858-spark.patch, HIVE-8858.1-spark.patch, HIVE-8858.2-spark.patch, HIVE-8858.3-spark.patch, HIVE-8858.4-spark.patch

The Spark plan generated by SparkPlanGenerator contains info which isn't available in Hive's explain plan, such as RDD caching. Also, the graph is slightly different from the original SparkWork. Thus, it would be nice to visualize the plan as is done for SparkWork. Preferably, the visualization can happen as part of Hive explain extended. If not feasible, we can at least log this at info level.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)