[jira] [Commented] (HIVE-10407) separate out the timestamp ranges for testing purposes

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504447#comment-14504447
 ] 

Hive QA commented on HIVE-10407:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726702/HIVE-10407.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8732 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3508/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3508/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3508/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726702 - PreCommit-HIVE-TRUNK-Build

 separate out the timestamp ranges for testing purposes
 --

 Key: HIVE-10407
 URL: https://issues.apache.org/jira/browse/HIVE-10407
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: HIVE-10407.patch, HIVE-10407.patch, HIVE-10407.patch


 Some platforms have limits for date ranges, so separate out the test cases 
 that are outside of the range 1970 to 2038.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10347) Merge spark to trunk 4/15/2015

2015-04-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505439#comment-14505439
 ] 

Szehon Ho commented on HIVE-10347:
--

Test failures don't look related, and it's ready to go.  [~xuefuz] can you take a 
look?

 Merge spark to trunk 4/15/2015
 --

 Key: HIVE-10347
 URL: https://issues.apache.org/jira/browse/HIVE-10347
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-10347.2.patch, HIVE-10347.2.patch, 
 HIVE-10347.3.patch, HIVE-10347.4.patch, HIVE-10347.5.patch, 
 HIVE-10347.5.patch, HIVE-10347.6.patch, HIVE-10347.patch


 CLEAR LIBRARY CACHE



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable

2015-04-21 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505408#comment-14505408
 ] 

Jason Dere commented on HIVE-9917:
--

You mean on RB? I don't think I have access to update your RB entry. You can 
just create a new git diff but without the --no-prefix option, and upload that 
to RB.

 After HIVE-3454 is done, make int to timestamp conversion configurable
 --

 Key: HIVE-9917
 URL: https://issues.apache.org/jira/browse/HIVE-9917
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9917.patch


 After HIVE-3454 is fixed, we will have the correct behavior when converting int to 
 timestamp. Since customers have been relying on the incorrect behavior for so long, 
 it is better to make it configurable so that in one release it defaults to the 
 old/inconsistent way, the next release defaults to the new/consistent way, 
 and then the old behavior is deprecated.
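
For illustration, a minimal self-contained sketch of what such a compatibility switch could look like. The flag name, and the assumption that the two modes differ in treating the integer as seconds versus milliseconds since the epoch, are illustrative guesses and are not taken from the patch or from HIVE-3454.

{code:java}
import java.sql.Timestamp;

public class IntToTimestampSketch {

    // Hypothetical flag standing in for a HiveConf property; the real property
    // name and the exact legacy/new semantics are not asserted here.
    static boolean useLegacyIntToTimestamp = true;

    static Timestamp intToTimestamp(long value) {
        // Assumption for illustration: one mode treats the integer as seconds,
        // the other as milliseconds since the epoch.
        long millis = useLegacyIntToTimestamp ? value * 1000L : value;
        return new Timestamp(millis);
    }

    public static void main(String[] args) {
        useLegacyIntToTimestamp = true;
        System.out.println("legacy mode: " + intToTimestamp(1429574400L));
        useLegacyIntToTimestamp = false;
        System.out.println("new mode:    " + intToTimestamp(1429574400L));
    }
}
{code}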



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10233) Hive on LLAP: Memory manager

2015-04-21 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10233:
--
Attachment: HIVE-10233-WIP.4.patch

 Hive on LLAP: Memory manager
 

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP.4.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10421) DROP TABLE with qualified table name ignores database name when checking partitions

2015-04-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-10421:
--
Attachment: HIVE-10421.1.patch

 DROP TABLE with qualified table name ignores database name when checking 
 partitions
 ---

 Key: HIVE-10421
 URL: https://issues.apache.org/jira/browse/HIVE-10421
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10421.1.patch


 Hive was only recently changed to allow drop table dbname.tabname. However 
 DDLTask.dropTable() is still using an older version of 
 Hive.getPartitionNames(), which only took in a single string for the table 
 name, rather than the database and table names. As a result Hive is filling 
 in the current database name as the dbname during the listPartitions call to 
 the MetaStore.
 It also appears that on the Hive Metastore side, in the non-auth path there 
 is no validation to check that the dbname.tablename actually exists - this 
 call simply returns back an empty list of partitions, which causes the table 
 to be dropped without checking any of the partition information. I will open 
 a separate issue for this one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10413) [CBO] Return path assumes distinct column cant be same as grouping column

2015-04-21 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505462#comment-14505462
 ] 

Laljo John Pullokkaran commented on HIVE-10413:
---

[~ashutoshc] We need to handle:
1. We need to maintain gbInfo.distExprNodes/distExprNames/distExprTypes. 
2. In genMapSideRS we add all of gbInfo.distExprNodes to reduce keys. This is 
wrong if the distinct key is already part of the GB key.

 [CBO] Return path assumes distinct column cant be same as grouping column
 -

 Key: HIVE-10413
 URL: https://issues.apache.org/jira/browse/HIVE-10413
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10413.patch


 Found in cbo_udf_udaf.q tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10323) Tez merge join operator does not honor hive.join.emit.interval

2015-04-21 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505559#comment-14505559
 ] 

Gunther Hagleitner commented on HIVE-10323:
---

Patch looks good. Minor nit: The condition for nextKeyGroup should be an else 
block.

Some other considerations:

- Maybe we should log the emit and spill intervals. Also warn if the first is greater 
than the latter?
- Looks like you emit before you put the current record into storage. Wouldn't 
it be better to do that afterwards?

Biggest concern: There's not a lot of testing going on. For one thing I think 
you could set the emit interval low (2?) for all tez tests and see if you get 
bigger coverage that way. If not you should test all the combinations: left, 
right, outer, multi key, multi table, spill other tables, etc.

 Tez merge join operator does not honor hive.join.emit.interval
 --

 Key: HIVE-10323
 URL: https://issues.apache.org/jira/browse/HIVE-10323
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.2.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10323.1.patch


 This affects efficiency in case of skews.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8858) Visualize generated Spark plan [Spark Branch]

2015-04-21 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505572#comment-14505572
 ] 

Jimmy Xiang commented on HIVE-8858:
---

Cool, looks good to me. +1

 Visualize generated Spark plan [Spark Branch]
 -

 Key: HIVE-8858
 URL: https://issues.apache.org/jira/browse/HIVE-8858
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chinna Rao Lalam
 Attachments: HIVE-8858-spark.patch, HIVE-8858.1-spark.patch, 
 HIVE-8858.2-spark.patch, HIVE-8858.3-spark.patch, HIVE-8858.4-spark.patch


 The spark plan generated by SparkPlanGenerator contains info which isn't 
 available in Hive's explain plan, such as RDD caching. Also, the graph is 
 slightly different from the original SparkWork. Thus, it would be nice to visualize 
 the plan as is done for SparkWork.
 Preferably, the visualization can happen as part of Hive explain extended. 
 If not feasible, we can at least log this at INFO level.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable

2015-04-21 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505332#comment-14505332
 ] 

Aihua Xu commented on HIVE-9917:


I see. Yeah. I was using --no-prefix. 

 After HIVE-3454 is done, make int to timestamp conversion configurable
 --

 Key: HIVE-9917
 URL: https://issues.apache.org/jira/browse/HIVE-9917
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9917.patch


 After HIVE-3454 is fixed, we will have the correct behavior when converting int to 
 timestamp. Since customers have been relying on the incorrect behavior for so long, 
 it is better to make it configurable so that in one release it defaults to the 
 old/inconsistent way, the next release defaults to the new/consistent way, 
 and then the old behavior is deprecated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10408) LLAP: NPE in scheduler in case of rejected tasks

2015-04-21 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-10408:
--
Summary: LLAP: NPE in scheduler in case of rejected tasks  (was: LLAP: 
query fails - NPE (old exception I posted was bogus))

 LLAP: NPE in scheduler in case of rejected tasks
 

 Key: HIVE-10408
 URL: https://issues.apache.org/jira/browse/HIVE-10408
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Siddharth Seth
 Fix For: llap

 Attachments: HIVE-10408.1.txt


 {noformat}
 java.lang.NullPointerException
 at 
 org.apache.tez.dag.app.rm.LlapTaskSchedulerService.deallocateTask(LlapTaskSchedulerService.java:388)
 at 
 org.apache.tez.dag.app.rm.TaskSchedulerEventHandler.handleTASucceeded(TaskSchedulerEventHandler.java:339)
 at 
 org.apache.tez.dag.app.rm.TaskSchedulerEventHandler.handleEvent(TaskSchedulerEventHandler.java:224)
 at 
 org.apache.tez.dag.app.rm.TaskSchedulerEventHandler$1.run(TaskSchedulerEventHandler.java:493)
 {noformat}
 The query, running alone on a 10-node cluster, dumped 1000 mappers into the 
 running state; with 3 completed, it failed with this NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10421) DROP TABLE with qualified table name ignores database name when checking partitions

2015-04-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-10421:
-

Assignee: Jason Dere

 DROP TABLE with qualified table name ignores database name when checking 
 partitions
 ---

 Key: HIVE-10421
 URL: https://issues.apache.org/jira/browse/HIVE-10421
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere

 Hive was only recently changed to allow drop table dbname.tabname. However 
 DDLTask.dropTable() is still using an older version of 
 Hive.getPartitionNames(), which only took in a single string for the table 
 name, rather than the database and table names. As a result Hive is filling 
 in the current database name as the dbname during the listPartitions call to 
 the MetaStore.
 It also appears that on the Hive Metastore side, in the non-auth path there 
 is no validation to check that the dbname.tablename actually exists - this 
 call simply returns back an empty list of partitions, which causes the table 
 to be dropped without checking any of the partition information. I will open 
 a separate issue for this one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10379) Wrong result when executing with tez

2015-04-21 Thread ErwanMAS (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505303#comment-14505303
 ] 

ErwanMAS commented on HIVE-10379:
-

I have downloaded the Hortonworks sandbox 2.2.4. It's fixed.
It's a duplicate of HIVE- .
 



 Wrong result when executing with tez 
 -

 Key: HIVE-10379
 URL: https://issues.apache.org/jira/browse/HIVE-10379
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 0.13.0, 0.14.0
 Environment: Hortonworks sandbox 2.1.1 and 2.2.0
Reporter: ErwanMAS
Assignee: Gunther Hagleitner
 Fix For: 1.0.0

 Attachments: HIVE-10379.1.patch


 I do a left join with a lateral view outer; too many rows are generated with 
 Tez.
 In MapReduce I have 125 rows, in Tez 132.
 Example :
 {noformat}
   drop table foo ;
   create table foo ( dummyfoo int  ) ;
   insert into table foo select count(*) from foo ;
   select count(*) as cnt from (
   select a.val,p.code from
 ( select cast((((one*5)+two)*5+three) as int) as val 
 from foo
 lateral view outer 
 explode(split('0,1,2,3,4',',')) tbl_1 as one
 lateral view outer 
 explode(split('0,1,2,3,4',',')) tbl_2 as two
 lateral view outer 
 explode(split('0,1,2,3,4',',')) tbl_3 as three ) as a
 left join
 ( select dummyfoo as code from foo ) p on p.code=a.val
   ) w ;
   set hive.execution.engine=tez;
   set hive.vectorized.execution.enabled=false;
   select count(*) as cnt from (
   select a.val,p.code from
 ( select cast((((one*5)+two)*5+three) as int) as val 
 from foo
 lateral view outer 
 explode(split('0,1,2,3,4',',')) tbl_1 as one
 lateral view outer 
 explode(split('0,1,2,3,4',',')) tbl_2 as two
 lateral view outer 
 explode(split('0,1,2,3,4',',')) tbl_3 as three ) as a
 left join
 ( select dummyfoo as code from foo ) p on p.code=a.val
   ) w ;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10413) [CBO] Return path assumes distinct column cant be same as grouping column

2015-04-21 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-10413:
--
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-9132

 [CBO] Return path assumes distinct column cant be same as grouping column
 -

 Key: HIVE-10413
 URL: https://issues.apache.org/jira/browse/HIVE-10413
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 1.2.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10413.patch


 Found in cbo_udf_udaf.q tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10233) Hive on LLAP: Memory manager

2015-04-21 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-10233:
--
Attachment: HIVE-10233-WIP-4.patch

 Hive on LLAP: Memory manager
 

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10419) can't do query on partitioned view with analytic function in strictmode

2015-04-21 Thread Hector Lagos (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hector Lagos updated HIVE-10419:

  Description: 
Hey Guys,

I created the following table:

CREATE TABLE t1 (id int, key string, value string) partitioned by (dt int);

And after that I created a view on that table as follows:

create view v1 PARTITIONED ON (dt)
as
SELECT * FROM (
SELECT row_number() over (partition by key order by value asc) as row_n, * FROM 
t1 
) t WHERE row_n = 1;

We are working with hive.mapred.mode=strict, and when I try to run the query 
select * from v1 where dt = 2, I'm getting the following error:

FAILED: SemanticException [Error 10041]: No partition predicate found for Alias 
v1:t:t1 Table t1

Is this a bug or a limitation of Hive when you use analytic functions in 
partitioned views? If I remove the row_number function it works without 
problems. 

Thanks in advance, any help will be appreciated. 



  was:
Hey Guysm



Affects Version/s: 0.14.0
   1.0.0
 Tags: view,partition,analytical function  (was: view)
  Summary: can't do query on partitioned view with analytic 
function in strictmode  (was: can't do query on partitioned view with 
analytical function in strictmode)

 can't do query on partitioned view with analytic function in strictmode
 ---

 Key: HIVE-10419
 URL: https://issues.apache.org/jira/browse/HIVE-10419
 Project: Hive
  Issue Type: Bug
  Components: Hive, Views
Affects Versions: 0.13.0, 0.14.0, 1.0.0
 Environment: Cloudera 5.3.x. 
Reporter: Hector Lagos

 Hey Guys,
 I created the following table:
 CREATE TABLE t1 (id int, key string, value string) partitioned by (dt int);
 And after that I created a view on that table as follows:
 create view v1 PARTITIONED ON (dt)
 as
 SELECT * FROM (
 SELECT row_number() over (partition by key order by value asc) as row_n, * 
 FROM t1 
 ) t WHERE row_n = 1;
 We are working with hive.mapred.mode=strict, and when I try to run the query 
 select * from v1 where dt = 2, I'm getting the following error:
 FAILED: SemanticException [Error 10041]: No partition predicate found for 
 Alias v1:t:t1 Table t1
 Is this a bug or a limitation of Hive when you use analytic functions in 
 partitioned views? If I remove the row_number function it works without 
 problems. 
 Thanks in advance, any help will be appreciated. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format

2015-04-21 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10331:
-
Attachment: HIVE-10331.06.patch

The patch looks good. I made a very minor modification to the bloom filter decimal 
test case so that it tests both the NO and YES_NO cases.

 ORC : Is null SARG filters out all row groups written in old ORC format
 ---

 Key: HIVE-10331
 URL: https://issues.apache.org/jira/browse/HIVE-10331
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.1.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10331.01.patch, HIVE-10331.02.patch, 
 HIVE-10331.03.patch, HIVE-10331.03.patch, HIVE-10331.04.patch, 
 HIVE-10331.05.patch, HIVE-10331.06.patch


 Queries are returning wrong results because all row groups get filtered out and 
 no rows get scanned.
 {code}
 SELECT 
   count(*)
 FROM
 store_sales
 WHERE
 ss_addr_sk IS NULL
 {code}
 With hive.optimize.index.filter disabled we get the correct results.
 In pickRowGroups the stats show that hasNull_ is false, while the row group 
 actually has nulls.
 The same query runs fine for newly loaded ORC tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-21 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505331#comment-14505331
 ] 

Laljo John Pullokkaran commented on HIVE-10416:
---

[~jcamachorodriguez] Introducing a Select on top of Sort will not work, as Tez 
cannot preserve ordering across the Select.

 CBO (Calcite Return Path): Fix return columns if Sort operator is on top of 
 plan returned by Calcite
 

 Key: HIVE-10416
 URL: https://issues.apache.org/jira/browse/HIVE-10416
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10416.patch


 When return path is on, if the plan's top operator is a Sort, we need to 
 produce a SelectOp that will output exactly the columns needed by the FS.
 The following query reproduces the problem:
 {noformat}
 select cbo_t3.c_int, c, count(*)
 from (select key as a, c_int+1 as b, sum(c_int) as c from cbo_t1
 where (cbo_t1.c_int + 1 >= 0) and (cbo_t1.c_int > 0 or cbo_t1.c_float >= 0)
 group by c_float, cbo_t1.c_int, key order by a) cbo_t1
 join (select key as p, c_int+1 as q, sum(c_int) as r from cbo_t2
 where (cbo_t2.c_int + 1 >= 0) and (cbo_t2.c_int > 0 or cbo_t2.c_float >= 0)
 group by c_float, cbo_t2.c_int, key order by q/10 desc, r asc) cbo_t2 on 
 cbo_t1.a=p
 join cbo_t3 on cbo_t1.a=key
 where (b + cbo_t2.q >= 0) and (b > 0 or c_int >= 0)
 group by cbo_t3.c_int, c order by cbo_t3.c_int+c desc, c;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10227) Concrete implementation of Export/Import based ReplicationTaskFactory

2015-04-21 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505401#comment-14505401
 ] 

Sushanth Sowmyan commented on HIVE-10227:
-

After sleeping on this, I feel like I should be stricter still about 
erroring out whenever .create is called, so that no events appear to have been 
processed, but I don't think adding one more Factory is a good way of doing 
that.

Here's what I now think:

a) We get rid of NoopFactory as a default - that should move to tests, and not 
stay here - we require that the config parameter be set to some instantiable 
factory to use this class.
b) We get rid of the explicit InvalidStateFactory I mention above, but 
instantiate it as an inline anonymous Factory if we fail to load 
whatever Factory class the user provides.

Thoughts?

Also, is it possible for us to do any factory refactoring in another jira? This 
jira is huge enough that the longer we leave it uncommitted, the more it'll be 
exposed to rebasing needs. Also, a couple of other jiras like HIVE-9674 are 
waiting for this to land before they can be made patch-available.
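
To make point (b) concrete, here is a small self-contained sketch of loading a user-supplied factory class reflectively and falling back to an anonymous fail-fast factory when the class cannot be loaded. The interface, class, and config names are made up for the example and are not the actual ReplicationTaskFactory API.

{code:java}
public class FactoryLoaderSketch {

    // Stand-in for the real factory interface; names are illustrative only.
    interface TaskFactory {
        Object create(Object event);
    }

    // If the configured class cannot be loaded, return an anonymous factory
    // that errors out on every create() call, so no event is silently "processed".
    static TaskFactory loadFactory(String configuredClassName) {
        try {
            Class<?> cls = Class.forName(configuredClassName);
            return (TaskFactory) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            final String msg = "Could not instantiate factory " + configuredClassName;
            return new TaskFactory() {
                @Override
                public Object create(Object event) {
                    throw new IllegalStateException(msg, e);
                }
            };
        }
    }

    public static void main(String[] args) {
        TaskFactory f = loadFactory("com.example.MissingFactory");  // hypothetical class name
        try {
            f.create(new Object());
        } catch (IllegalStateException expected) {
            System.out.println("fail-fast as intended: " + expected.getMessage());
        }
    }
}
{code}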

 Concrete implementation of Export/Import based ReplicationTaskFactory
 -

 Key: HIVE-10227
 URL: https://issues.apache.org/jira/browse/HIVE-10227
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10227.2.patch, HIVE-10227.3.patch, 
 HIVE-10227.4.patch, HIVE-10227.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions

2015-04-21 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-10384:
---
Attachment: HIVE-10384.patch

Looking through RetryingMetaStoreClient/IMetaStoreClient/MetaStoreClient, I think 
two exceptions wrapped in MetaException should be caught and retried in 
RetryingMetaStoreClient.invoke. One is the IOException from reloginExpiringKeytabUser, 
the other is the TTransportException from base.reconnect(). I did not see that a 
TTransportException could be wrapped in the InvocationTargetException.
[~ekhliang] I wonder if it is the TTransportException that you meant, which 
should be but has not been retried, or if there is another. Thanks

 RetryingMetaStoreClient does not retry wrapped TTransportExceptions
 ---

 Key: HIVE-10384
 URL: https://issues.apache.org/jira/browse/HIVE-10384
 Project: Hive
  Issue Type: Bug
  Components: Clients
Reporter: Eric Liang
Assignee: Chaoyu Tang
 Attachments: HIVE-10384.patch


 This bug is very similar to HIVE-9436, in that a TTransportException wrapped 
 in a MetaException will not be retried. RetryingMetaStoreClient has a block 
 of code above the MetaException handler that retries thrift exceptions, but 
 this doesn't work when the exception is wrapped.
 {code}
 if ((e.getCause() instanceof TApplicationException) ||
 (e.getCause() instanceof TProtocolException) ||
 (e.getCause() instanceof TTransportException)) {
   caughtException = (TException) e.getCause();
 } else if ((e.getCause() instanceof MetaException) &&
 e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) {
   caughtException = (MetaException) e.getCause();
 {code}
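
For illustration, a hedged, self-contained sketch of the kind of cause-chain unwrapping the description suggests is missing: a retry decision based only on the immediate e.getCause() misses a TTransportException nested one level deeper inside a MetaException. This is not the actual RetryingMetaStoreClient code; SocketException stands in for TTransportException so the example carries no Thrift dependency.

{code:java}
import java.lang.reflect.InvocationTargetException;
import java.net.SocketException;

public class RetryableCauseCheck {

    // Walk the cause chain and return the first throwable that matches one of
    // the retryable types, or null if none does.
    static Throwable findRetryableCause(Throwable t, Class<?>... retryable) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            for (Class<?> cls : retryable) {
                if (cls.isInstance(cur)) {
                    return cur;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulate the nesting described above: transport failure wrapped in a
        // (stand-in for) MetaException, wrapped again by reflection.
        Exception transport = new SocketException("connection reset");
        Exception metaWrapper = new RuntimeException("MetaException wrapper", transport);
        Exception invocation = new InvocationTargetException(metaWrapper);

        Throwable cause = findRetryableCause(invocation, SocketException.class);
        System.out.println("retryable cause found: " + cause);
    }
}
{code}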



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10062) HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data

2015-04-21 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505579#comment-14505579
 ] 

Gunther Hagleitner commented on HIVE-10062:
---

Test failures are unrelated. There's a minor typo in the latest patch, but I 
can fix on commit.

I'm +1

 HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data
 -

 Key: HIVE-10062
 URL: https://issues.apache.org/jira/browse/HIVE-10062
 Project: Hive
  Issue Type: Bug
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
Priority: Critical
 Attachments: HIVE-10062.01.patch, HIVE-10062.02.patch, 
 HIVE-10062.03.patch, HIVE-10062.04.patch, HIVE-10062.05.patch


 In q.test environment with src table, execute the following query: 
 {code}
 CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE;
 CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE;
 FROM (select 'tst1' as key, cast(count(1) as string) as value from src s1
  UNION all 
   select s2.key as key, s2.value as value from src s2) unionsrc
 INSERT OVERWRITE TABLE DEST1 SELECT unionsrc.key, COUNT(DISTINCT 
 SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key
 INSERT OVERWRITE TABLE DEST2 SELECT unionsrc.key, unionsrc.value, 
 COUNT(DISTINCT SUBSTR(unionsrc.value,5)) 
 GROUP BY unionsrc.key, unionsrc.value;
 select * from DEST1;
 select * from DEST2;
 {code}
 DEST1 and DEST2 should both have 310 rows. However, DEST2 only has 1 row: 
 tst1 500 1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9923) No clear message when from is missing

2015-04-21 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504929#comment-14504929
 ] 

Yongzhi Chen commented on HIVE-9923:


Thanks [~szehon] and [~csun] for reviewing it.

 No clear message when from is missing
 ---

 Key: HIVE-9923
 URL: https://issues.apache.org/jira/browse/HIVE-9923
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Jeff Zhang
Assignee: Yongzhi Chen
 Fix For: 1.2.0

 Attachments: HIVE-9923.1.patch, HIVE-9923.2.patch


 For the following SQL, from is missing, but it throws an NPE, which is not clear 
 to the user.
 {code}
 hive> insert overwrite directory '/tmp/hive-3' select sb1.name, sb2.age 
 student_bucketed sb1 join student_bucketed sb2 on sb1.name=sb2.name;
 FAILED: NullPointerException null
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-21 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10416:
---
Attachment: HIVE-10416.patch

[~ashutoshc], could you take a look? Thanks!

 CBO (Calcite Return Path): Fix return columns if Sort operator is on top of 
 plan returned by Calcite
 

 Key: HIVE-10416
 URL: https://issues.apache.org/jira/browse/HIVE-10416
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10416.patch


 When return path is on, if the plan's top operator is a Sort, we need to 
 produce a SelectOp that will output exactly the columns needed by the FS.
 The following query reproduces the problem:
 {noformat}
 select cbo_t3.c_int, c, count(*)
 from (select key as a, c_int+1 as b, sum(c_int) as c from cbo_t1
 where (cbo_t1.c_int + 1 >= 0) and (cbo_t1.c_int > 0 or cbo_t1.c_float >= 0)
 group by c_float, cbo_t1.c_int, key order by a) cbo_t1
 join (select key as p, c_int+1 as q, sum(c_int) as r from cbo_t2
 where (cbo_t2.c_int + 1 >= 0) and (cbo_t2.c_int > 0 or cbo_t2.c_float >= 0)
 group by c_float, cbo_t2.c_int, key order by q/10 desc, r asc) cbo_t2 on 
 cbo_t1.a=p
 join cbo_t3 on cbo_t1.a=key
 where (b + cbo_t2.q >= 0) and (b > 0 or c_int >= 0)
 group by cbo_t3.c_int, c order by cbo_t3.c_int+c desc, c;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-9791) insert into table throws NPE

2015-04-21 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen resolved HIVE-9791.

Resolution: Fixed

This issue should be fixed by the fix for HIVE-9923.

 insert into table throws NPE
 

 Key: HIVE-9791
 URL: https://issues.apache.org/jira/browse/HIVE-9791
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Alexander Pivovarov
Assignee: Yongzhi Chen

 To reproduce the NPE, run the following:
 {code}
 create table a as select 'A' letter;
 OK
 insert into table a select 'B' letter;
 FAILED: NullPointerException null
 -- works fine if add from table to select statement
 insert into table a select 'B' letter from dual;
 OK
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10403) Add n-way join support for Hybrid Grace Hash Join

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504546#comment-14504546
 ] 

Hive QA commented on HIVE-10403:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726735/HIVE-10403.01.patch

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 8729 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join29
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_char_mapjoin1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_nested_mapjoin
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3509/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3509/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3509/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726735 - PreCommit-HIVE-TRUNK-Build

 Add n-way join support for Hybrid Grace Hash Join
 -

 Key: HIVE-10403
 URL: https://issues.apache.org/jira/browse/HIVE-10403
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Wei Zheng
Assignee: Wei Zheng
 Attachments: HIVE-10403.01.patch


 Currently Hybrid Grace Hash Join only supports 2-way join (one big table and 
 one small table). This task will enable n-way join (one big table and 
 multiple small tables).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10416) CBO (Calcite Return Path): Fix return columns if Sort operator is on top of plan returned by Calcite

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505055#comment-14505055
 ] 

Hive QA commented on HIVE-10416:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726880/HIVE-10416.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8728 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3513/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3513/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3513/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726880 - PreCommit-HIVE-TRUNK-Build

 CBO (Calcite Return Path): Fix return columns if Sort operator is on top of 
 plan returned by Calcite
 

 Key: HIVE-10416
 URL: https://issues.apache.org/jira/browse/HIVE-10416
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10416.patch


 When return path is on, if the plan's top operator is a Sort, we need to 
 produce a SelectOp that will output exactly the columns needed by the FS.
 The following query reproduces the problem:
 {noformat}
 select cbo_t3.c_int, c, count(*)
 from (select key as a, c_int+1 as b, sum(c_int) as c from cbo_t1
 where (cbo_t1.c_int + 1 >= 0) and (cbo_t1.c_int > 0 or cbo_t1.c_float >= 0)
 group by c_float, cbo_t1.c_int, key order by a) cbo_t1
 join (select key as p, c_int+1 as q, sum(c_int) as r from cbo_t2
 where (cbo_t2.c_int + 1 >= 0) and (cbo_t2.c_int > 0 or cbo_t2.c_float >= 0)
 group by c_float, cbo_t2.c_int, key order by q/10 desc, r asc) cbo_t2 on 
 cbo_t1.a=p
 join cbo_t3 on cbo_t1.a=key
 where (b + cbo_t2.q >= 0) and (b > 0 or c_int >= 0)
 group by cbo_t3.c_int, c order by cbo_t3.c_int+c desc, c;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10235) Loop optimization for SIMD in ColumnDivideColumn.txt

2015-04-21 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504949#comment-14504949
 ] 

Gopal V commented on HIVE-10235:


[~chengxiang li]: Patch LGTM - +1.

I'm not able to see a significant leap in perf in my quick tests - division doesn't 
seem to be a common scenario in my workloads.

 Loop optimization for SIMD in ColumnDivideColumn.txt
 

 Key: HIVE-10235
 URL: https://issues.apache.org/jira/browse/HIVE-10235
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Affects Versions: 1.1.0
Reporter: Chengxiang Li
Assignee: Chengxiang Li
Priority: Minor
 Attachments: HIVE-10235.1.patch, HIVE-10235.1.patch


 Found two loops which could be optimized into packed instruction sets during 
 execution.
 1. hasDivBy0 depends on the result of the previous iteration, which prevents the 
 loop from being executed vectorized.
 {code:java}
 for(int i = 0; i != n; i++) {
   OperandType2 denom = vector2[i];
   outputVector[i] = vector1[0] OperatorSymbol denom;
   hasDivBy0 = hasDivBy0 || (denom == 0);
 }
 {code}
 2. Same as HIVE-10180, the vector2\[0\] reference prevents the JVM from optimizing 
 the loop into a packed instruction set.
 {code:java}
 for(int i = 0; i != n; i++) {
   outputVector[i] = vector1[i] OperatorSymbol vector2[0];
 }
 {code}
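
As a rough, self-contained sketch of the kind of restructuring discussed here (not the generated Hive template code): split the divide-by-zero check out of the arithmetic loop so that loop carries no dependency, and hoist the invariant vector2[0] read into a local.

{code:java}
public class DivideLoopSketch {

    // Rewritten form of loop 1: the division loop carries no dependency on
    // hasDivBy0, and the zero check is accumulated in a separate pass.
    static boolean scalarDivideColumn(double scalar, double[] denom, double[] out, int n) {
        for (int i = 0; i != n; i++) {
            out[i] = scalar / denom[i];
        }
        boolean hasDivBy0 = false;
        for (int i = 0; i != n; i++) {
            hasDivBy0 |= (denom[i] == 0.0);
        }
        return hasDivBy0;
    }

    // Rewritten form of loop 2: vector2[0] is read once into a local, as in
    // HIVE-10180, instead of being dereferenced on every iteration.
    static void columnDivideScalar(double[] vector1, double[] vector2, double[] out, int n) {
        final double divisor = vector2[0];
        for (int i = 0; i != n; i++) {
            out[i] = vector1[i] / divisor;
        }
    }

    public static void main(String[] args) {
        double[] denom = {1.0, 2.0, 0.0, 4.0};
        double[] out = new double[denom.length];
        System.out.println("hasDivBy0 = " + scalarDivideColumn(10.0, denom, out, denom.length));
    }
}
{code}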



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504946#comment-14504946
 ] 

Hive QA commented on HIVE-9824:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726853/HIVE-9824.07.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8750 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vector_decimal_mapjoin
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3512/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3512/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3512/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726853 - PreCommit-HIVE-TRUNK-Build

 LLAP: Native Vectorization of Map Join so previously CPU bound queries shift 
 their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
 --

 Key: HIVE-9824
 URL: https://issues.apache.org/jira/browse/HIVE-9824
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, 
 HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch


 Today's VectorMapJoinOperator is a pass-through that converts each row from a 
 vectorized row batch into a Java Object[] row and passes it to the 
 MapJoinOperator superclass.
 This enhancement creates specialized vectorized map join operator classes 
 that are optimized.
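
A toy, self-contained illustration of the contrast described above; the Batch class and method names are invented, and a real specialized operator would use optimized hash tables rather than java.util.HashMap.

{code:java}
import java.util.HashMap;
import java.util.Map;

public class MapJoinStyleSketch {

    // Illustrative columnar batch; field names are invented.
    static class Batch {
        long[] keyColumn;
        int size;
    }

    // Pass-through style: box every row into an Object[] before probing.
    static int rowModeMatches(Batch batch, Map<Long, String> smallTable) {
        int matches = 0;
        for (int i = 0; i < batch.size; i++) {
            Object[] row = {batch.keyColumn[i]};          // per-row boxing and allocation
            if (smallTable.containsKey(row[0])) {
                matches++;
            }
        }
        return matches;
    }

    // Vectorized style: probe straight from the primitive column array.
    static int vectorModeMatches(Batch batch, Map<Long, String> smallTable) {
        int matches = 0;
        long[] keys = batch.keyColumn;
        for (int i = 0; i < batch.size; i++) {
            if (smallTable.containsKey(keys[i])) {        // no intermediate Object[] row
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Batch b = new Batch();
        b.keyColumn = new long[]{1L, 2L, 3L};
        b.size = 3;
        Map<Long, String> small = new HashMap<>();
        small.put(2L, "dim");
        System.out.println(rowModeMatches(b, small) + " / " + vectorModeMatches(b, small));
    }
}
{code}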



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10302) Load small tables (for map join) in executor memory only once[Spark Branch]

2015-04-21 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-10302:
---
Summary: Load small tables (for map join) in executor memory only 
once[Spark Branch]  (was: Cache small tables in memory [Spark Branch])

 Load small tables (for map join) in executor memory only once[Spark Branch]
 ---

 Key: HIVE-10302
 URL: https://issues.apache.org/jira/browse/HIVE-10302
 Project: Hive
  Issue Type: Improvement
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: spark-branch

 Attachments: HIVE-10302.spark-1.patch


 If we can cache small tables in executor memory, we could save some time in 
 loading them from HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10302) Load small tables (for map join) in executor memory only once [Spark Branch]

2015-04-21 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-10302:
---
Summary: Load small tables (for map join) in executor memory only once 
[Spark Branch]  (was: Load small tables (for map join) in executor memory only 
once[Spark Branch])

 Load small tables (for map join) in executor memory only once [Spark Branch]
 

 Key: HIVE-10302
 URL: https://issues.apache.org/jira/browse/HIVE-10302
 Project: Hive
  Issue Type: Improvement
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: spark-branch

 Attachments: HIVE-10302.spark-1.patch


 If we can cache small tables in executor memory, we could save some time in 
 loading them from HDFS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10302) Load small tables (for map join) in executor memory only once [Spark Branch]

2015-04-21 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-10302:
---
Description: Usually there are multiple cores in a Spark executor, and thus 
it's possible that multiple map-join tasks can be running in the same executor 
(concurrently or sequentially). Currently, each task will load its own copy of 
the small tables for map join into memory, ending up with inefficiency. 
Ideally, we only load the small tables once and share them among the tasks 
running in that executor.  (was: If we can cache small tables in executor 
memory, we could save some time in loading them from HDFS.)

 Load small tables (for map join) in executor memory only once [Spark Branch]
 

 Key: HIVE-10302
 URL: https://issues.apache.org/jira/browse/HIVE-10302
 Project: Hive
  Issue Type: Improvement
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: spark-branch

 Attachments: HIVE-10302.spark-1.patch


 Usually there are multiple cores in a Spark executor, and thus it's possible 
 that multiple map-join tasks can be running in the same executor 
 (concurrently or sequentially). Currently, each task will load its own copy 
 of the small tables for map join into memory, ending up with inefficiency. 
 Ideally, we only load the small tables once and share them among the tasks 
 running in that executor.
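
As a hedged, self-contained sketch of the sharing idea in this description (names and types are made up; this is not the attached patch): keep each loaded small table in a process-wide cache keyed by its path, so tasks running in the same executor JVM reuse a single copy.

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class SmallTableCacheSketch {

    // One cache per executor JVM; tasks share entries instead of reloading them.
    private static final ConcurrentHashMap<String, Map<String, String>> CACHE =
            new ConcurrentHashMap<>();

    // computeIfAbsent applies the loader at most once per key, so concurrent
    // tasks asking for the same small table wait for a single load.
    static Map<String, String> getOrLoad(String tablePath,
                                         Function<String, Map<String, String>> loader) {
        return CACHE.computeIfAbsent(tablePath, loader);
    }

    public static void main(String[] args) {
        Function<String, Map<String, String>> load = path -> {
            System.out.println("loading " + path + " (expensive HDFS read)");
            Map<String, String> table = new HashMap<>();
            table.put("k1", "v1");                    // stand-in for the hash table contents
            return table;
        };
        getOrLoad("/warehouse/small_dim", load);      // performs the load
        getOrLoad("/warehouse/small_dim", load);      // reuses the cached copy
    }
}
{code}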



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10191) ORC: Cleanup writer per-row synchronization

2015-04-21 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-10191:
---
Attachment: HIVE-10191.3.patch

 ORC: Cleanup writer per-row synchronization
 ---

 Key: HIVE-10191
 URL: https://issues.apache.org/jira/browse/HIVE-10191
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Gopal V
 Attachments: HIVE-10191.1.patch, HIVE-10191.2.patch, 
 HIVE-10191.3.patch


 ORC writers were originally meant to be thread-safe, but in the present-day 
 implementation each ORC writer is entirely share-nothing, which turns most 
 of the synchronized blocks in ORC into entirely uncontested locks.
 These uncontested locks prevent the JVM from inlining/optimizing these 
 methods, while adding no extra thread-safety to the ORC writers.
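
A toy sketch of the pattern the description refers to, with invented class names: a per-row method that is synchronized even though each writer instance is used by a single thread, next to the share-nothing version without the uncontested lock.

{code:java}
public class WriterSyncSketch {

    static class SynchronizedWriter {
        private long rowCount;
        // Uncontested, but still a monitor enter/exit on every row.
        synchronized void addRow(long value) {
            rowCount += value;
        }
    }

    static class PlainWriter {
        private long rowCount;
        // Share-nothing writer: no lock, easier for the JIT to optimize.
        void addRow(long value) {
            rowCount += value;
        }
    }

    public static void main(String[] args) {
        SynchronizedWriter a = new SynchronizedWriter();
        PlainWriter b = new PlainWriter();
        for (long i = 0; i < 1_000_000; i++) {
            a.addRow(i);
            b.addRow(i);
        }
        System.out.println(a.rowCount + " " + b.rowCount);
    }
}
{code}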



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.

2015-04-21 Thread Elliot West (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliot West updated HIVE-10165:
---
Description: 
h3. Overview
I'd like to extend the 
[hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest]
 API so that it also supports the writing of record updates and deletes in 
addition to the already supported inserts.

h3. Motivation
We have many Hadoop processes outside of Hive that merge changed facts into 
existing datasets. Traditionally we achieve this by: reading in a ground-truth 
dataset and a modified dataset, grouping by a key, sorting by a sequence and 
then applying a function to determine inserted, updated, and deleted rows. 
However, in our current scheme we must rewrite all partitions that may 
potentially contain changes. In practice the number of mutated records is very 
small when compared with the records contained in a partition. This approach 
results in a number of operational issues:
* Excessive amount of write activity required for small data changes.
* Downstream applications cannot robustly read these datasets while they are 
being updated.
* Due to the scale of the updates (hundreds of partitions) the scope for contention 
is high. 

I believe we can address this problem by instead writing only the changed 
records to a Hive transactional table. This should drastically reduce the 
amount of data that we need to write and also provide a means for managing 
concurrent access to the data. Our existing merge processes can read and retain 
each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to an 
updated form of the hive-hcatalog-streaming API which will then have the 
required data to perform an update or insert in a transactional manner. 

h3. Benefits
* Enables the creation of large-scale dataset merge processes  
* Opens up Hive transactional functionality in an accessible manner to 
processes that operate outside of Hive.

h3. Implementation
Our changes do not break the existing API contracts. Instead our approach has 
been to consider the functionality offered by the existing API and our proposed 
API as fulfilling separate and distinct use-cases. The existing API is 
primarily focused on the task of continuously writing large volumes of new data 
into a Hive table for near-immediate analysis. Our use-case however, is 
concerned more with the frequent but not continuous ingestion of mutations to a 
Hive table from some ETL merge process. Consequently we feel it is justifiable 
to add our new functionality via an alternative set of public interfaces and 
leave the existing API as is. This keeps both APIs clean and focused at the 
expense of presenting additional options to potential users. Wherever possible, 
shared implementation concerns have been factored out into abstract base 
classes that are open to third-party extension. A detailed breakdown of the 
changes is as follows:

* We've introduced a public {{RecordMutator}} interface whose purpose is to 
expose insert/update/delete operations to the user. This is a counterpart to 
the write-only {{RecordWriter}}. We've also factored out life-cycle methods 
common to these two interfaces into a super {{RecordOperationWriter}} 
interface.  Note that the row representation has be changed from {{byte[]}} to 
{{Object}}. Within our data processing jobs our records are often available in 
a strongly typed and decoded form such as a POJO or a Tuple object. Therefore 
is seems to make sense that we are able to pass this through to the 
{{OrcRecordUpdater}} without having to go through a {{byte[]}} encoding step. 
This of course still allows users to use {{byte[]}} if they wish.
* The introduction of {{RecordMutator}} requires that insert/update/delete 
operations are then also exposed on a {{TransactionBatch}} type. We've done 
this with the introduction of a public {{MutatorTransactionBatch}} interface 
which is a counterpart to the write-only {{TransactionBatch}}. We've also 
factored out life-cycle methods common to these two interfaces into a super 
{{BaseTransactionBatch}} interface. 
* Functionality that would be shared by implementations of both 
{{RecordWriters}} and {{RecordMutators}} has been factored out of 
{{AbstractRecordWriter}} into a new abstract base class 
{{AbstractOperationRecordWriter}}. The visibility is such that it is open to 
extension by third parties. The {{AbstractOperationRecordWriter}} also permits 
the setting of the {{AcidOutputFormat.Options#recordIdColumn()}} (defaulted to 
{{-1}}) which is a requirement for enabling updates and deletes. Additionally, 
these options are now fed an {{ObjectInspector}} via an abstract method so that 
a {{SerDe}} is not mandated (it was not required for our use-case). The 
{{AbstractRecordWriter}} is now much leaner, handling only the extraction of 
the {{ObjectInspector}} from the {{SerDe}}.
* A new abstract class, 
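
To make the transaction-batch split above concrete, here is a rough sketch of how 
the described hierarchy could fit together. The type names follow the description 
above; the individual life-cycle method signatures shown are assumptions for 
illustration only, not the final API:
{code}
// Sketch of the described interface split. Type names come from the proposal
// text; the exact life-cycle method signatures are illustrative assumptions.
interface BaseTransactionBatch {            // shared life-cycle concerns
  void beginNextTransaction() throws Exception;
  void commit() throws Exception;
  void abort() throws Exception;
  void close() throws Exception;
}

interface TransactionBatch extends BaseTransactionBatch {        // existing write-only path
  void write(byte[] record) throws Exception;
}

interface MutatorTransactionBatch extends BaseTransactionBatch { // proposed mutation path
  void insert(Object record) throws Exception;
  void update(Object record) throws Exception;   // record carries its RecordIdentifier
  void delete(Object record) throws Exception;
}
{code}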

[jira] [Commented] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL

2015-04-21 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505170#comment-14505170
 ] 

Naveen Gangam commented on HIVE-10239:
--

[~spena] I believe I have seen this error on my machine too but it wasn't fatal by 
any means. The script executed fine after this error. I will re-run it to make 
sure. 

 Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and 
 PostgreSQL
 

 Key: HIVE-10239
 URL: https://issues.apache.org/jira/browse/HIVE-10239
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, 
 HIVE-10239.0.patch, HIVE-10239.00.patch, HIVE-10239.patch


 Need to create DB-implementation specific scripts to use the framework 
 introduced in HIVE-9800 to have any metastore schema changes tested across 
 all supported databases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505186#comment-14505186
 ] 

Vikram Dixit K commented on HIVE-9824:
--

[~mmccline] The latest patch doesn't apply on trunk anymore.

 LLAP: Native Vectorization of Map Join so previously CPU bound queries shift 
 their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
 --

 Key: HIVE-9824
 URL: https://issues.apache.org/jira/browse/HIVE-9824
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, 
 HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch


 Today's VectorMapJoinOperator is a pass-through that converts each row from a 
 vectorized row batch into a Java Object[] row and passes it to the 
 MapJoinOperator superclass.
 This enhancement creates specialized vectorized map join operator classes 
 that are optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10339) Allow JDBC Driver to pass HTTP header Key/Value pairs

2015-04-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505885#comment-14505885
 ] 

Thejas M Nair commented on HIVE-10339:
--

+1 
Please open a follow up jira for adding more e2e like tests using wiremock or 
equivalent.


 Allow JDBC Driver to pass HTTP header Key/Value pairs
 -

 Key: HIVE-10339
 URL: https://issues.apache.org/jira/browse/HIVE-10339
 Project: Hive
  Issue Type: Improvement
  Components: Beeline
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10339.1.patch, HIVE-10339.2.patch


 Currently the Beeline & ODBC drivers do not support carrying user-specified HTTP 
 headers.
 The Beeline JDBC driver connection string in HTTP mode is of the form 
 jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,
 When the transport mode is http, the Beeline/ODBC driver should allow the end 
 user to send arbitrary HTTP header name/value pairs.
 All the Beeline driver needs to do is take the user-specified names and values 
 and call the underlying HTTPClient API to set the headers.
 E.g. the Beeline connection string could be 
 jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,http.header.name1=value1,
 and Beeline will call the underlying client to set the HTTP header name1 to value1.
 This is required for the end user to send an identity in an HTTP header down to 
 Knox via Beeline.
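
 As a rough, hedged illustration of the intended usage: the URL below reuses the 
 http.header.name1=value1 syntax from the example above, and whether the driver 
 accepts it in exactly this form depends on the final patch; host, port and db are 
 placeholders.
 {code}
 // Hedged sketch: relies on the http.header.<name>=<value> URL syntax described
 // above; the endpoint details are placeholders.
 import java.sql.Connection;
 import java.sql.DriverManager;
 import java.sql.ResultSet;
 import java.sql.Statement;

 public class HttpHeaderJdbcExample {
   public static void main(String[] args) throws Exception {
     String url = "jdbc:hive2://host:10001/db"
         + "?hive.server2.transport.mode=http"
         + ";hive.server2.thrift.http.path=http_endpoint"
         + ";http.header.name1=value1";   // custom header carried on every HTTP request
     try (Connection conn = DriverManager.getConnection(url, "user", "");
          Statement stmt = conn.createStatement();
          ResultSet rs = stmt.executeQuery("SELECT 1")) {
       while (rs.next()) {
         System.out.println(rs.getInt(1));
       }
     }
   }
 }
 {code}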



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505656#comment-14505656
 ] 

Matt McCline commented on HIVE-9824:


I switched over to using https://github.com/apache/hive from 
git://git.apache.org/hive.git because of the "read error: Connection reset by 
peer" problem.

I did notice that when I generated the review board patch with this command line:
{noformat}
git diff --no-ext-diff HEAD^ > review_board_patch_07.txt
{noformat}

and the actual patch with this command line:
{noformat}
git diff --no-ext-diff --no-prefix HEAD^ > HIVE-9824.07.patch
{noformat}

the files had the same length, when they usually have different lengths.

 LLAP: Native Vectorization of Map Join so previously CPU bound queries shift 
 their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
 --

 Key: HIVE-9824
 URL: https://issues.apache.org/jira/browse/HIVE-9824
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, 
 HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch


 Today's VectorMapJoinOperator is a pass-through that converts each row from a 
 vectorized row batch into a Java Object[] row and passes it to the 
 MapJoinOperator superclass.
 This enhancement creates specialized vectorized map join operator classes 
 that are optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions

2015-04-21 Thread Eric Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505753#comment-14505753
 ] 

Eric Liang commented on HIVE-10384:
---

+1, probably at least all the T* exceptions should be retried after they are 
unwrapped.

 RetryingMetaStoreClient does not retry wrapped TTransportExceptions
 ---

 Key: HIVE-10384
 URL: https://issues.apache.org/jira/browse/HIVE-10384
 Project: Hive
  Issue Type: Bug
  Components: Clients
Reporter: Eric Liang
Assignee: Chaoyu Tang
 Attachments: HIVE-10384.patch


 This bug is very similar to HIVE-9436, in that a TTransportException wrapped 
 in a MetaException will not be retried. RetryingMetaStoreClient has a block 
 of code above the MetaException handler that retries thrift exceptions, but 
 this doesn't work when the exception is wrapped.
 {code}
 if ((e.getCause() instanceof TApplicationException) ||
     (e.getCause() instanceof TProtocolException) ||
     (e.getCause() instanceof TTransportException)) {
   caughtException = (TException) e.getCause();
 } else if ((e.getCause() instanceof MetaException) &&
     e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) {
   caughtException = (MetaException) e.getCause();
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable

2015-04-21 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505623#comment-14505623
 ] 

Aihua Xu commented on HIVE-9917:


No. I meant: can we check this code change in to trunk?

 After HIVE-3454 is done, make int to timestamp conversion configurable
 --

 Key: HIVE-9917
 URL: https://issues.apache.org/jira/browse/HIVE-9917
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9917.patch


 After HIVE-3454 is fixed, we will have the correct behavior when converting int to 
 timestamp. Since customers have relied on the incorrect behavior for so long, it is 
 better to make the conversion configurable, so that one release defaults to the 
 old/inconsistent behavior and the next release defaults to the new/consistent 
 behavior. We can then deprecate the old behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505716#comment-14505716
 ] 

Sergey Shelukhin commented on HIVE-9824:


+1. Can you file follow up jiras for replacing the hashtable, and also for 
making hybrid work in all cases (if still needed)?

 LLAP: Native Vectorization of Map Join so previously CPU bound queries shift 
 their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
 --

 Key: HIVE-9824
 URL: https://issues.apache.org/jira/browse/HIVE-9824
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, 
 HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch, HIVE-9824.08.patch


 Today's VectorMapJoinOperator is a pass-through that converts each row from a 
 vectorized row batch into a Java Object[] row and passes it to the 
 MapJoinOperator superclass.
 This enhancement creates specialized vectorized map join operator classes 
 that are optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505787#comment-14505787
 ] 

Hive QA commented on HIVE-10331:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726948/HIVE-10331.06.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8728 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3516/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3516/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3516/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726948 - PreCommit-HIVE-TRUNK-Build

 ORC : Is null SARG filters out all row groups written in old ORC format
 ---

 Key: HIVE-10331
 URL: https://issues.apache.org/jira/browse/HIVE-10331
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.1.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10331.01.patch, HIVE-10331.02.patch, 
 HIVE-10331.03.patch, HIVE-10331.03.patch, HIVE-10331.04.patch, 
 HIVE-10331.05.patch, HIVE-10331.06.patch


 Queries are returning wrong results as all row groups get filtered out and 
 no rows get scanned.
 {code}
 SELECT 
   count(*)
 FROM
 store_sales
 WHERE
 ss_addr_sk IS NULL
 {code}
 With hive.optimize.index.filter disabled we get the correct results.
 In pickRowGroups, the stats show that hasNull_ is false, while the row group 
 actually contains nulls.
 The same query runs fine for newly loaded ORC tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10339) Allow JDBC Driver to pass HTTP header Key/Value pairs

2015-04-21 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10339:
-
Attachment: HIVE-10339.2.patch

[~thejas] Addressed the review comments. Uploading patch #2

Thanks
Hari

 Allow JDBC Driver to pass HTTP header Key/Value pairs
 -

 Key: HIVE-10339
 URL: https://issues.apache.org/jira/browse/HIVE-10339
 Project: Hive
  Issue Type: Improvement
  Components: Beeline
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10339.1.patch, HIVE-10339.2.patch


 Currently the Beeline & ODBC drivers do not support carrying user-specified HTTP 
 headers.
 The Beeline JDBC driver connection string in HTTP mode is of the form 
 jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,
 When the transport mode is http, the Beeline/ODBC driver should allow the end 
 user to send arbitrary HTTP header name/value pairs.
 All the Beeline driver needs to do is take the user-specified names and values 
 and call the underlying HTTPClient API to set the headers.
 E.g. the Beeline connection string could be 
 jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,http.header.name1=value1,
 and Beeline will call the underlying client to set the HTTP header name1 to value1.
 This is required for the end user to send an identity in an HTTP header down to 
 Knox via Beeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10426) Rework/simplify ReplicationTaskFactory instantiation

2015-04-21 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-10426:

Description: Creating a new jira to continue discussions from HIVE-10227 as 
to what ReplicationTask.Factory instantiation should look like.  (was: Creating 
a new jira to continue discussions of what ReplicationTask.Factory 
instantiation should look like.)

 Rework/simplify ReplicationTaskFactory instantiation
 

 Key: HIVE-10426
 URL: https://issues.apache.org/jira/browse/HIVE-10426
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan

 Creating a new jira to continue discussions from HIVE-10227 as to what 
 ReplicationTask.Factory instantiation should look like.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions

2015-04-21 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505746#comment-14505746
 ] 

Szehon Ho commented on HIVE-10384:
--

Makes sense to me, +1

 RetryingMetaStoreClient does not retry wrapped TTransportExceptions
 ---

 Key: HIVE-10384
 URL: https://issues.apache.org/jira/browse/HIVE-10384
 Project: Hive
  Issue Type: Bug
  Components: Clients
Reporter: Eric Liang
Assignee: Chaoyu Tang
 Attachments: HIVE-10384.patch


 This bug is very similar to HIVE-9436, in that a TTransportException wrapped 
 in a MetaException will not be retried. RetryingMetaStoreClient has a block 
 of code above the MetaException handler that retries thrift exceptions, but 
 this doesn't work when the exception is wrapped.
 {code}
 if ((e.getCause() instanceof TApplicationException) ||
     (e.getCause() instanceof TProtocolException) ||
     (e.getCause() instanceof TTransportException)) {
   caughtException = (TException) e.getCause();
 } else if ((e.getCause() instanceof MetaException) &&
     e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) {
   caughtException = (MetaException) e.getCause();
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10331) ORC : Is null SARG filters out all row groups written in old ORC format

2015-04-21 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505795#comment-14505795
 ] 

Prasanth Jayachandran commented on HIVE-10331:
--

SVN is marked read-only for git migration. Will commit the patch once the 
migration is done.

 ORC : Is null SARG filters out all row groups written in old ORC format
 ---

 Key: HIVE-10331
 URL: https://issues.apache.org/jira/browse/HIVE-10331
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.1.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.0

 Attachments: HIVE-10331.01.patch, HIVE-10331.02.patch, 
 HIVE-10331.03.patch, HIVE-10331.03.patch, HIVE-10331.04.patch, 
 HIVE-10331.05.patch, HIVE-10331.06.patch


 Queries are returning wrong results as all row groups get filtered out and 
 no rows get scanned.
 {code}
 SELECT 
   count(*)
 FROM
 store_sales
 WHERE
 ss_addr_sk IS NULL
 {code}
 With hive.optimize.index.filter disabled we get the correct results.
 In pickRowGroups, the stats show that hasNull_ is false, while the row group 
 actually contains nulls.
 The same query runs fine for newly loaded ORC tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10421) DROP TABLE with qualified table name ignores database name when checking partitions

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505620#comment-14505620
 ] 

Hive QA commented on HIVE-10421:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726930/HIVE-10421.1.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8727 tests 
executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3515/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3515/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3515/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726930 - PreCommit-HIVE-TRUNK-Build

 DROP TABLE with qualified table name ignores database name when checking 
 partitions
 ---

 Key: HIVE-10421
 URL: https://issues.apache.org/jira/browse/HIVE-10421
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10421.1.patch


 Hive was only recently changed to allow drop table dbname.tabname. However 
 DDLTask.dropTable() is still using an older version of 
 Hive.getPartitionNames(), which only took in a single string for the table 
 name, rather than the database and table names. As a result Hive is filling 
 in the current database name as the dbname during the listPartitions call to 
 the MetaStore.
 It also appears that on the Hive Metastore side, in the non-auth path there 
 is no validation to check that the dbname.tablename actually exists - this 
 call simply returns back an empty list of partitions, which causes the table 
 to be dropped without checking any of the partition information. I will open 
 a separate issue for this one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-9824:
---
Attachment: HIVE-9824.08.patch

More review board changes.

 LLAP: Native Vectorization of Map Join so previously CPU bound queries shift 
 their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
 --

 Key: HIVE-9824
 URL: https://issues.apache.org/jira/browse/HIVE-9824
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, 
 HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch, HIVE-9824.08.patch


 Today's VectorMapJoinOperator is a pass-through that converts each row from a 
 vectorized row batch into a Java Object[] row and passes it to the 
 MapJoinOperator superclass.
 This enhancement creates specialized vectorized map join operator classes 
 that are optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505696#comment-14505696
 ] 

Matt McCline commented on HIVE-9824:


Actually, they are exactly 1000 bytes different...

 LLAP: Native Vectorization of Map Join so previously CPU bound queries shift 
 their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
 --

 Key: HIVE-9824
 URL: https://issues.apache.org/jira/browse/HIVE-9824
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, 
 HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch, HIVE-9824.08.patch


 Today's VectorMapJoinOperator is a pass-through that converts each row from a 
 vectorized row batch into a Java Object[] row and passes it to the 
 MapJoinOperator superclass.
 This enhancement creates specialized vectorized map join operator classes 
 that are optimized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10423) HIVE-7948 breaks deploy_e2e_artifacts.sh

2015-04-21 Thread Aswathy Chellammal Sreekumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505682#comment-14505682
 ] 

Aswathy Chellammal Sreekumar commented on HIVE-10423:
-

@Eugene please review the patch, which includes a small fix to prevent the issue 
on rerun

 HIVE-7948 breaks deploy_e2e_artifacts.sh
 

 Key: HIVE-10423
 URL: https://issues.apache.org/jira/browse/HIVE-10423
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Eugene Koifman
Assignee: Aswathy Chellammal Sreekumar
 Attachments: HIVE-10423.patch


 HIVE-7948 added a step to download a ml-1m.zip file and unzip it.
 This only works if you call deploy_e2e_artifacts.sh once. If you call it 
 again (which is very common in dev) it blocks and asks for additional input 
 from the user because the target files already exist.
 This needs to be changed similarly to what we discussed for HIVE-9272, i.e. 
 place artifacts that are not under source control in testdist/.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions

2015-04-21 Thread Eric Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505761#comment-14505761
 ] 

Eric Liang commented on HIVE-10384:
---

Oh sorry, I misunderstood your comment. I believe that TTransportException is 
indeed thrown from within invoke(). For example, see this stack trace:

{code}
Got exception: org.apache.thrift.transport.TTransportException null
org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_tables(ThriftHiveMetastore.java:983)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_tables(ThriftHiveMetastore.java:969)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTables(HiveMetaStoreClient.java:1038)
    at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
{code}

I believe the offending method that wraps this exception is in MetaStoreUtils:  
logAndThrowMetaException(Exception e) throws MetaException
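
As a minimal sketch (not the attached patch) of the kind of handling that would 
cover this case, assuming the fix walks the full cause chain rather than looking 
only at the immediate cause:
{code}
// Sketch only: unwrap nested causes so a TTransportException hidden inside a
// MetaException still counts as a retriable thrift transport failure.
import org.apache.thrift.transport.TTransportException;

final class RetryHelper {
  /** Returns the first TTransportException found in the cause chain, or null. */
  static TTransportException findTransportCause(Throwable t) {
    for (Throwable cur = t; cur != null; cur = cur.getCause()) {
      if (cur instanceof TTransportException) {
        return (TTransportException) cur;
      }
    }
    return null;
  }

  static boolean shouldRetry(Throwable failure) {
    return findTransportCause(failure) != null;
  }
}
{code}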


 RetryingMetaStoreClient does not retry wrapped TTransportExceptions
 ---

 Key: HIVE-10384
 URL: https://issues.apache.org/jira/browse/HIVE-10384
 Project: Hive
  Issue Type: Bug
  Components: Clients
Reporter: Eric Liang
Assignee: Chaoyu Tang
 Attachments: HIVE-10384.patch


 This bug is very similar to HIVE-9436, in that a TTransportException wrapped 
 in a MetaException will not be retried. RetryingMetaStoreClient has a block 
 of code above the MetaException handler that retries thrift exceptions, but 
 this doesn't work when the exception is wrapped.
 {code}
 if ((e.getCause() instanceof TApplicationException) ||
     (e.getCause() instanceof TProtocolException) ||
     (e.getCause() instanceof TTransportException)) {
   caughtException = (TException) e.getCause();
 } else if ((e.getCause() instanceof MetaException) &&
     e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) {
   caughtException = (MetaException) e.getCause();
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10399) from_unixtime_millis() Hive UDF

2015-04-21 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505871#comment-14505871
 ] 

Aihua Xu commented on HIVE-10399:
-

I don't think you need such a UDF. You can just call cast(123.123 as timestamp) 
to convert a double to a timestamp. Give it a try to see if it's what you want.

 from_unixtime_millis() Hive UDF
 ---

 Key: HIVE-10399
 URL: https://issues.apache.org/jira/browse/HIVE-10399
 Project: Hive
  Issue Type: New Feature
  Components: UDF
 Environment: HDP 2.2
Reporter: Hari Sekhon
Priority: Minor

 Feature request for a
 {code}from_unixtime_millis(){code}
 Hive UDF - from_unixtime() accepts only seconds since the epoch, and right now the 
 solution is to create a custom UDF, but supporting millisecond-precision 
 timestamps natively in Hive seems like quite a standard thing to have.
 Hari Sekhon
 http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL

2015-04-21 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-10239:
-
Attachment: HIVE-10239.01.patch

It appears that the oracle package installation fails because it cannot download 
the oracle-xe packages for 64-bit operating systems. On my local Ubuntu VMs, I 
get the exact same failure, except that it then proceeds to install the 32-bit 
packages for oracle-xe. I am not quite sure what OS configuration drives this. 
{noformat}
511808
W: Failed to fetch 
http://oss.oracle.com/debian/dists/unstable/main/binary-amd64/Packages  
HttpError404

W: Failed to fetch 
http://oss.oracle.com/debian/dists/unstable/non-free/binary-amd64/Packages  
HttpError404

E: Some index files failed to download. They have been ignored, or old ones 
used instead.
+ /bin/true
+ apt-get install -y --force-yes oracle-xe
Reading package lists... Done
Building dependency tree   
Reading state information... Done
The following extra packages will be installed:
  gcc-4.9-base gcc-4.9-base:i386 libaio:i386 libc6 libc6:i386 libgcc1 
libgcc1:i386
Suggested packages:
  glibc-doc glibc-doc:i386 locales:i386
The following NEW packages will be installed:
  gcc-4.9-base:i386 libaio:i386 libc6:i386 libgcc1:i386 oracle-xe:i386
The following packages will be upgraded:
  gcc-4.9-base libc6 libgcc1
3 upgraded, 5 newly installed, 0 to remove and 169 not upgraded.
Need to get 230 MB of archives.
After this operation, 415 MB of additional disk space will be used.
WARNING: The following packages cannot be authenticated!
  libaio:i386 oracle-xe:i386
{noformat}

I am uploading a patch to make it use 32-bit packages.

 Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and 
 PostgreSQL
 

 Key: HIVE-10239
 URL: https://issues.apache.org/jira/browse/HIVE-10239
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, 
 HIVE-10239.0.patch, HIVE-10239.00.patch, HIVE-10239.01.patch, HIVE-10239.patch


 Need to create DB-implementation specific scripts to use the framework 
 introduced in HIVE-9800 to have any metastore schema changes tested across 
 all supported databases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9917) After HIVE-3454 is done, make int to timestamp conversion configurable

2015-04-21 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505638#comment-14505638
 ] 

Jason Dere commented on HIVE-9917:
--

The +1 is supposed to sit for a day before getting committed .. I'll get it in 
tomorrow

 After HIVE-3454 is done, make int to timestamp conversion configurable
 --

 Key: HIVE-9917
 URL: https://issues.apache.org/jira/browse/HIVE-9917
 Project: Hive
  Issue Type: Improvement
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-9917.patch


 After HIVE-3454 is fixed, we will have the correct behavior when converting int to 
 timestamp. Since customers have relied on the incorrect behavior for so long, it is 
 better to make the conversion configurable, so that one release defaults to the 
 old/inconsistent behavior and the next release defaults to the new/consistent 
 behavior. We can then deprecate the old behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505940#comment-14505940
 ] 

Hive QA commented on HIVE-10384:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726947/HIVE-10384.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8728 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3517/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3517/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3517/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726947 - PreCommit-HIVE-TRUNK-Build

 RetryingMetaStoreClient does not retry wrapped TTransportExceptions
 ---

 Key: HIVE-10384
 URL: https://issues.apache.org/jira/browse/HIVE-10384
 Project: Hive
  Issue Type: Bug
  Components: Clients
Reporter: Eric Liang
Assignee: Chaoyu Tang
 Attachments: HIVE-10384.patch


 This bug is very similar to HIVE-9436, in that a TTransportException wrapped 
 in a MetaException will not be retried. RetryingMetaStoreClient has a block 
 of code above the MetaException handler that retries thrift exceptions, but 
 this doesn't work when the exception is wrapped.
 {code}
 if ((e.getCause() instanceof TApplicationException) ||
     (e.getCause() instanceof TProtocolException) ||
     (e.getCause() instanceof TTransportException)) {
   caughtException = (TException) e.getCause();
 } else if ((e.getCause() instanceof MetaException) &&
     e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) {
   caughtException = (MetaException) e.getCause();
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10430) HIVE-9937 broke hadoop-1 build

2015-04-21 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10430:
-
Description: TestLazySimpleFast added in HIVE-9937 uses Text.copyBytes() 
that is not present in hadoop-1.   (was: TestLazySimpleFast uses 
Text.copyBytes() that is not present in hadoop-1. )

 HIVE-9937 broke hadoop-1 build
 --

 Key: HIVE-10430
 URL: https://issues.apache.org/jira/browse/HIVE-10430
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran

 TestLazySimpleFast added in HIVE-9937 uses Text.copyBytes() that is not 
 present in hadoop-1. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10430) HIVE-9937 broke hadoop-1 build

2015-04-21 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-10430.
--
Resolution: Duplicate

 HIVE-9937 broke hadoop-1 build
 --

 Key: HIVE-10430
 URL: https://issues.apache.org/jira/browse/HIVE-10430
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran

 TestLazySimpleFast added in HIVE-9937 uses Text.copyBytes() that is not 
 present in hadoop-1. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10391) CBO (Calcite Return Path): HiveOpConverter always assumes that HiveFilter does not include a partition column

2015-04-21 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-10391:
--
Attachment: HIVE-10391.patch

 CBO (Calcite Return Path): HiveOpConverter always assumes that HiveFilter 
 does not include a partition column
 -

 Key: HIVE-10391
 URL: https://issues.apache.org/jira/browse/HIVE-10391
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Laljo John Pullokkaran
 Fix For: 1.2.0

 Attachments: HIVE-10391.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10434) Cancel connection to HS2 when remote Spark driver process has failed [Spark Branch]

2015-04-21 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-10434:

Issue Type: Sub-task  (was: Improvement)
Parent: HIVE-7292

 Cancel connection to HS2 when remote Spark driver process has failed [Spark 
 Branch] 
 

 Key: HIVE-10434
 URL: https://issues.apache.org/jira/browse/HIVE-10434
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 1.2.0
Reporter: Chao Sun
Assignee: Chao Sun

 Currently in HoS, SparkClientImpl first launches a remote Driver process 
 and then waits for it to connect back to HS2. However, in certain 
 situations (for instance, a permission issue), the remote process may fail and 
 exit with an error code. In this situation, the HS2 process will still wait for 
 the process to connect, and wait for a full timeout period before it throws 
 the exception.
 What makes it worse, the user may need to wait for two timeout periods: one for 
 the SparkSetReducerParallelism, and another for the actual Spark job. This 
 could be very annoying.
 We should cancel the timeout task once we find out that the process has 
 failed, and set the promise as failed.
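
 A rough sketch of the idea, using generic JDK types rather than the actual 
 SparkClientImpl/RPC classes (which this sketch does not claim to match):
 {code}
 // Sketch only: cancel the connection timeout as soon as the child driver
 // process is seen to exit, and fail the pending "connected" promise right away.
 import java.util.concurrent.CompletableFuture;
 import java.util.concurrent.ScheduledExecutorService;
 import java.util.concurrent.ScheduledFuture;
 import java.util.concurrent.TimeUnit;

 class DriverWatchdogSketch {
   static void watch(Process driverProcess,
                     CompletableFuture<Void> driverConnected,
                     ScheduledExecutorService scheduler,
                     long timeoutMs) {
     // Normal path: fail the promise if the driver never connects in time.
     ScheduledFuture<?> timeout = scheduler.schedule(
         () -> driverConnected.completeExceptionally(
             new RuntimeException("Timed out waiting for the driver to connect")),
         timeoutMs, TimeUnit.MILLISECONDS);

     // Fast-fail path: if the process dies first, cancel the timeout and fail now.
     new Thread(() -> {
       try {
         int exitCode = driverProcess.waitFor();
         if (exitCode != 0 && !driverConnected.isDone()) {
           timeout.cancel(false);
           driverConnected.completeExceptionally(
               new RuntimeException("Driver process exited with code " + exitCode));
         }
       } catch (InterruptedException ie) {
         Thread.currentThread().interrupt();
       }
     }, "driver-exit-watcher").start();
   }
 }
 {code}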



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10192) insert into table failed for partitioned table.

2015-04-21 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506120#comment-14506120
 ] 

Aihua Xu commented on HIVE-10192:
-

[~Ganesh.Sathish] Are you still having the issue? I tried a simple case and I 
don't see the issue. 

Could you please provide repro steps and sample data if you still do.

 insert into table  failed for partitioned table.
 --

 Key: HIVE-10192
 URL: https://issues.apache.org/jira/browse/HIVE-10192
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 0.12.0
 Environment: os-Unix
 Distribution-Pivotal
Reporter: Ganesh Sathish

 When I try to load data from a partitioned table in RC format into a 
 partitioned table in ORC format, using the command below:
 create table ORC_Table stored as ORC as select * from RC_Table;
 I hit the following issue:
 ArrayIndexOutOfBoundsException: 26



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10413) [CBO] Return path assumes distinct column cant be same as grouping column

2015-04-21 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10413:

Attachment: HIVE-10413.1.patch

Updated patch which makes all queries in .q file pass except one with multiple 
distincts.

 [CBO] Return path assumes distinct column cant be same as grouping column
 -

 Key: HIVE-10413
 URL: https://issues.apache.org/jira/browse/HIVE-10413
 Project: Hive
  Issue Type: Sub-task
Affects Versions: 1.2.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10413.1.patch, HIVE-10413.patch


 Found in cbo_udf_udaf.q tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10431) HIVE-9555 broke hadoop-1 build

2015-04-21 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506231#comment-14506231
 ] 

Prasanth Jayachandran commented on HIVE-10431:
--

Seeing another error..
{code}
RecordReaderUtils.java:[442,40] error: cannot find symbol
[ERROR] class HdfsFileStatus
{code}

 HIVE-9555 broke hadoop-1 build
 --

 Key: HIVE-10431
 URL: https://issues.apache.org/jira/browse/HIVE-10431
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Sergey Shelukhin

 HIVE-9555 RecordReaderUtils uses direct bytebuffer read from 
 FSDataInputStream which is not present in hadoop-1. This breaks hadoop-1 
 compilation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10431) HIVE-9555 broke hadoop-1 build

2015-04-21 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506233#comment-14506233
 ] 

Prasanth Jayachandran commented on HIVE-10431:
--

Ignore my previous comment.. That's not happening in trunk.. it happens only in 
LLAP branch.

 HIVE-9555 broke hadoop-1 build
 --

 Key: HIVE-10431
 URL: https://issues.apache.org/jira/browse/HIVE-10431
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Sergey Shelukhin

 HIVE-9555 RecordReaderUtils uses direct bytebuffer read from 
 FSDataInputStream which is not present in hadoop-1. This breaks hadoop-1 
 compilation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10312) SASL.QOP in JDBC URL is ignored for Delegation token Authentication

2015-04-21 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-10312:

Attachment: HIVE-10312.1.patch

Seems like a reasonable patch.

+1

 SASL.QOP in JDBC URL is ignored for Delegation token Authentication
 ---

 Key: HIVE-10312
 URL: https://issues.apache.org/jira/browse/HIVE-10312
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 1.2.0
Reporter: Mubashir Kazia
Assignee: Mubashir Kazia
 Fix For: 1.2.0

 Attachments: HIVE-10312.1.patch, HIVE-10312.1.patch


 When HS2 is configured for a QOP other than auth (auth-int or auth-conf), 
 a Kerberos client connection works fine when the JDBC URL specifies the 
 matching QOP. However, when this HS2 is accessed through Oozie (delegation 
 token / DIGEST authentication), connections fail because the JDBC driver 
 ignores the SASL.QOP parameter in the JDBC URL. The SASL.QOP setting should 
 also be honored for the DIGEST auth mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10428) NPE in RegexSerDe using HCat

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506373#comment-14506373
 ] 

Hive QA commented on HIVE-10428:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12727013/HIVE-10428.1.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8728 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3521/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3521/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3521/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12727013 - PreCommit-HIVE-TRUNK-Build

 NPE in RegexSerDe using HCat
 

 Key: HIVE-10428
 URL: https://issues.apache.org/jira/browse/HIVE-10428
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10428.1.patch


 When HCatalog reads a table that uses org.apache.hadoop.hive.serde2.RegexSerDe, 
 it throws the following exception:
 {noformat}
 15/04/21 14:07:31 INFO security.TokenCache: Got dt for hdfs://hdpsecahdfs; 
 Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdpsecahdfs, Ident: 
 (HDFS_DELEGATION_TOKEN token 1478 for haha)
 15/04/21 14:07:31 INFO mapred.FileInputFormat: Total input paths to process : 
 1
 Splits len : 1
 SplitInfo : [hdpseca03.seca.hwxsup.com, hdpseca04.seca.hwxsup.com, 
 hdpseca05.seca.hwxsup.com]
 15/04/21 14:07:31 INFO mapreduce.InternalUtil: Initializing 
 org.apache.hadoop.hive.serde2.RegexSerDe with properties 
 {name=casetest.regex_table, numFiles=1, columns.types=string,string, 
 serialization.format=1, columns=id,name, rawDataSize=0, numRows=0, 
 output.format.string=%1$s %2$s, 
 serialization.lib=org.apache.hadoop.hive.serde2.RegexSerDe, 
 COLUMN_STATS_ACCURATE=true, totalSize=25, serialization.null.format=\N, 
 input.regex=([^ ]*) ([^ ]*), transient_lastDdlTime=1429590172}
 15/04/21 14:07:31 WARN serde2.RegexSerDe: output.format.string has been 
 deprecated
 Exception in thread "main" java.lang.NullPointerException
   at 
 com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
   at com.google.common.base.Splitter.split(Splitter.java:371)
   at 
 org.apache.hadoop.hive.serde2.RegexSerDe.initialize(RegexSerDe.java:155)
   at 
 org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:49)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:518)
   at 
 

[jira] [Commented] (HIVE-9711) ORC Vectorization DoubleColumnVector.isRepeating=false if all entries are NaN

2015-04-21 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506224#comment-14506224
 ] 

Prasanth Jayachandran commented on HIVE-9711:
-

SVN is currently read-only. Will commit this patch once we migrate to git.

 ORC Vectorization DoubleColumnVector.isRepeating=false if all entries are NaN
 -

 Key: HIVE-9711
 URL: https://issues.apache.org/jira/browse/HIVE-9711
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Vectorization
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Gopal V
 Fix For: 1.2.0

 Attachments: HIVE-9711.1.patch, HIVE-9711.2.patch, HIVE-9711.3.patch


 The isRepeating=true check uses Java equality, which results in NaN != NaN 
 comparison operations.
 The noNulls case needs the current check folded into the previous loop, while 
 the hasNulls case needs a logical AND of the isNull[] field instead of == 
 comparisons.
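
 A minimal, self-contained illustration (not Hive code) of why an == based 
 isRepeating check misbehaves when every entry is NaN:
 {code}
 // NaN == NaN is false in Java, so an equality-based "all values repeat" check
 // reports false even when every entry is the same NaN value.
 public class NanRepeatCheck {
   public static void main(String[] args) {
     double[] vector = {Double.NaN, Double.NaN, Double.NaN};

     boolean repeatingByEquality = true;
     boolean repeatingByCompare = true;
     for (int i = 1; i < vector.length; i++) {
       repeatingByEquality &= (vector[i] == vector[0]);                   // NaN == NaN -> false
       repeatingByCompare &= (Double.compare(vector[i], vector[0]) == 0); // NaN treated as equal to itself
     }
     System.out.println("== check says repeating:             " + repeatingByEquality); // false
     System.out.println("Double.compare check says repeating: " + repeatingByCompare);  // true
   }
 }
 {code}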



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10048) JDBC - Support SSL encryption regardless of Authentication mechanism

2015-04-21 Thread Mubashir Kazia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubashir Kazia reassigned HIVE-10048:
-

Assignee: Mubashir Kazia

 JDBC - Support SSL encryption regardless of Authentication mechanism
 

 Key: HIVE-10048
 URL: https://issues.apache.org/jira/browse/HIVE-10048
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 1.0.0
Reporter: Mubashir Kazia
Assignee: Mubashir Kazia
  Labels: newbie, patch
 Fix For: 1.2.0

 Attachments: HIVE-10048.1.patch


 The JDBC driver currently only supports SSL transport if the authentication 
 mechanism is SASL PLAIN with username and password. SSL transport should be 
 decoupled from the authentication mechanism. If the customer chooses to use 
 Kerberos authentication with SSL encryption over the wire, that should be 
 supported. The server side already supports this but the driver does not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10115) HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and Delegation token(DIGEST) when alternate authentication is enabled

2015-04-21 Thread Mubashir Kazia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubashir Kazia reassigned HIVE-10115:
-

Assignee: Mubashir Kazia

 HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and 
 Delegation token(DIGEST) when alternate authentication is enabled
 ---

 Key: HIVE-10115
 URL: https://issues.apache.org/jira/browse/HIVE-10115
 Project: Hive
  Issue Type: Improvement
  Components: Authentication
Affects Versions: 1.1.0
Reporter: Mubashir Kazia
Assignee: Mubashir Kazia
  Labels: patch
 Fix For: 1.2.0

 Attachments: HIVE-10115.0.patch


 In a Kerberized cluster, when alternate authentication is enabled on HS2, it 
 should also accept Kerberos authentication. The reason this is important is 
 that when we enable LDAP authentication, HS2 stops accepting delegation 
 token authentication, so we are forced to enter usernames and passwords in the 
 Oozie configuration.
 The whole idea of SASL is that multiple authentication mechanisms can be 
 offered. If we disable Kerberos (GSSAPI) and delegation token (DIGEST) 
 authentication when we enable LDAP authentication, this defeats the purpose of SASL.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-4625) HS2 should not attempt to get delegation token from metastore if using embedded metastore

2015-04-21 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506241#comment-14506241
 ] 

Thejas M Nair commented on HIVE-4625:
-

There are 2 other functions there as well where the exception is thrown only in 
thrift (non-local) mode; a similar change should be made there as well for 
consistency.


 HS2 should not attempt to get delegation token from metastore if using 
 embedded metastore
 -

 Key: HIVE-4625
 URL: https://issues.apache.org/jira/browse/HIVE-4625
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Thejas M Nair
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-4625.1.patch, HIVE-4625.2.patch, HIVE-4625.3.patch, 
 HIVE-4625.4.patch


 In Kerberos secure mode, with doAs enabled, HiveServer2 tries to get a 
 delegation token from the metastore even if the metastore is being used in 
 embedded mode.
 To avoid failure in that case, it catches the resulting 
 UnsupportedOperationException and does nothing. But this leads to an error 
 being logged by lower levels and can mislead users into thinking that there 
 is a problem.
 It should check whether delegation tokens are supported with the current 
 configuration before calling the function.
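
 A hypothetical sketch of the suggested control flow, not the committed patch: 
 decide up front whether the configuration can issue delegation tokens (i.e. a 
 remote metastore is in use) instead of catching UnsupportedOperationException 
 afterwards. The helper names below are made up.
 {code}
 public class DelegationTokenFlowSketch {
   // Stand-in for "is a remote (thrift) metastore configured?"; in Hive this
   // would be derived from the metastore URI setting.
   static boolean delegationTokensSupported(boolean remoteMetastore) {
     return remoteMetastore;
   }

   static String maybeFetchToken(boolean remoteMetastore) {
     if (!delegationTokensSupported(remoteMetastore)) {
       // Embedded metastore: skip quietly, so nothing is logged as an error.
       return null;
     }
     return "token-from-remote-metastore"; // placeholder for the real RPC
   }

   public static void main(String[] args) {
     System.out.println(maybeFetchToken(false)); // embedded -> null, no noise
     System.out.println(maybeFetchToken(true));  // remote   -> token
   }
 }
 {code}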



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10417) Parallel Order By return wrong results for partitioned tables

2015-04-21 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou reassigned HIVE-10417:


Assignee: Nemon Lou

 Parallel Order By return wrong results for partitioned tables
 -

 Key: HIVE-10417
 URL: https://issues.apache.org/jira/browse/HIVE-10417
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0, 0.13.1, 1.0.0
Reporter: Nemon Lou
Assignee: Nemon Lou

 Following is the script that reproduces this bug.
 set hive.optimize.sampling.orderby=true;
 set mapreduce.job.reduces=10;
 select * from src order by key desc limit 10;
 +----------+------------+
 | src.key  | src.value  |
 +----------+------------+
 | 98       | val_98     |
 | 98       | val_98     |
 | 97       | val_97     |
 | 97       | val_97     |
 | 96       | val_96     |
 | 95       | val_95     |
 | 95       | val_95     |
 | 92       | val_92     |
 | 90       | val_90     |
 | 90       | val_90     |
 +----------+------------+
 10 rows selected (47.916 seconds)
 reset;
 create table src_orc_p (key string, value string)
 partitioned by (kp string)
 stored as orc
 tblproperties("orc.compress"="SNAPPY");
 set hive.exec.dynamic.partition.mode=nonstrict;
 set hive.exec.max.dynamic.partitions.pernode=1;
 set hive.exec.max.dynamic.partitions=1;
 insert into table src_orc_p partition(kp) select *,substring(key,1) from src 
 distribute by substring(key,1);
 set mapreduce.job.reduces=10;
 set hive.optimize.sampling.orderby=true;
 select * from src_orc_p order by key desc limit 10;
 +----------------+------------------+-----------------+
 | src_orc_p.key  | src_orc_p.value  | src_orc_p.kend  |
 +----------------+------------------+-----------------+
 | 0              | val_0            | 0               |
 | 0              | val_0            | 0               |
 | 0              | val_0            | 0               |
 +----------------+------------------+-----------------+
 3 rows selected (39.861 seconds)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10339) Allow JDBC Driver to pass HTTP header Key/Value pairs

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506191#comment-14506191
 ] 

Hive QA commented on HIVE-10339:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12727000/HIVE-10339.2.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8729 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3519/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3519/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3519/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12727000 - PreCommit-HIVE-TRUNK-Build

 Allow JDBC Driver to pass HTTP header Key/Value pairs
 -

 Key: HIVE-10339
 URL: https://issues.apache.org/jira/browse/HIVE-10339
 Project: Hive
  Issue Type: Improvement
  Components: Beeline
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10339.1.patch, HIVE-10339.2.patch


 Currently the Beeline/ODBC driver does not support carrying user-specified 
 HTTP headers.
 The Beeline JDBC connection string in HTTP mode looks like 
 jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,
 When the transport mode is http, the Beeline/ODBC driver should allow the end 
 user to send arbitrary HTTP header name/value pairs.
 All the Beeline driver needs to do is take the user-specified names and values 
 and call the underlying HttpClient API to set the headers.
 E.g. the Beeline connection string could be 
 jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,http.header.name1=value1,
 and Beeline will call the underlying client to set the HTTP header name1 to 
 value1.
 This is required for the end user to send identity in an HTTP header down to 
 Knox via Beeline.
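
 Not the Hive patch itself, just a sketch of the underlying mechanism the 
 description refers to, assuming Apache HttpClient 4.x on the classpath: a 
 request interceptor that attaches a custom header (the name1=value1 pair here 
 is a hypothetical example) to every outgoing HTTP call.
 {code}
 import org.apache.http.HttpRequest;
 import org.apache.http.HttpRequestInterceptor;
 import org.apache.http.impl.client.CloseableHttpClient;
 import org.apache.http.impl.client.HttpClientBuilder;
 import org.apache.http.protocol.HttpContext;

 public class CustomHeaderClientSketch {
   public static CloseableHttpClient buildClient() {
     // Attach the user-specified header to every request sent by this client.
     HttpRequestInterceptor headerSetter = new HttpRequestInterceptor() {
       @Override
       public void process(HttpRequest request, HttpContext context) {
         request.addHeader("name1", "value1"); // hypothetical key/value from the URL
       }
     };
     return HttpClientBuilder.create()
         .addInterceptorLast(headerSetter)
         .build();
   }
 }
 {code}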



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10312) SASL.QOP in JDBC URL is ignored for Delegation token Authentication

2015-04-21 Thread Mubashir Kazia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubashir Kazia reassigned HIVE-10312:
-

Assignee: Mubashir Kazia

 SASL.QOP in JDBC URL is ignored for Delegation token Authentication
 ---

 Key: HIVE-10312
 URL: https://issues.apache.org/jira/browse/HIVE-10312
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 1.2.0
Reporter: Mubashir Kazia
Assignee: Mubashir Kazia
 Fix For: 1.2.0

 Attachments: HIVE-10312.1.patch


 When HS2 is configured for a QOP other than auth (auth-int or auth-conf), a 
 Kerberos client connection works fine when the JDBC URL specifies the 
 matching QOP. However, when this HS2 is accessed through Oozie (delegation 
 token / DIGEST authentication), connections fail because the JDBC driver 
 ignores the SASL.QOP parameter in the JDBC URL. The SASL.QOP setting should 
 also be honored for the DIGEST auth mechanism.
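
 For context, a hedged illustration of the URLs involved: the same sasl.qop 
 setting should take effect whether the connection authenticates with a 
 Kerberos principal or with a delegation token. Host and principal values are 
 placeholders, and the auth=delegationToken parameter spelling is an assumption 
 that may differ by driver version.
 {code}
 public class SaslQopUrlSketch {
   public static void main(String[] args) {
     // Kerberos connection where sasl.qop is honoured today:
     String kerberosUrl = "jdbc:hive2://hs2.example.com:10000/default;"
         + "principal=hive/_HOST@EXAMPLE.COM;sasl.qop=auth-conf";

     // Delegation-token connection (e.g. from an Oozie action) where the same
     // sasl.qop parameter is reportedly ignored by the driver:
     String tokenUrl = "jdbc:hive2://hs2.example.com:10000/default;"
         + "auth=delegationToken;sasl.qop=auth-conf";

     System.out.println(kerberosUrl);
     System.out.println(tokenUrl);
   }
 }
 {code}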



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-5545) HCatRecord getInteger method returns String when used on Partition columns of type INT

2015-04-21 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan reassigned HIVE-5545:
--

Assignee: Sushanth Sowmyan

 HCatRecord getInteger method returns String when used on Partition columns of 
 type INT
 --

 Key: HIVE-5545
 URL: https://issues.apache.org/jira/browse/HIVE-5545
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
 Environment: hadoop-1.0.3
Reporter: Rishav Rohit
Assignee: Sushanth Sowmyan

 HCatRecord getInteger method returns String when used on Partition columns of 
 type INT.
 java.lang.ClassCastException: java.lang.String cannot be cast to 
 java.lang.Integer
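
 A small, hypothetical workaround sketch for the symptom described: partition 
 column values may come back as String rather than Integer, so coerce 
 defensively instead of casting. This is plain Java, not HCatalog API code.
 {code}
 public class PartitionValueCoercion {
   // Accepts either the expected Integer or the String that partition
   // columns are reported to return.
   static Integer asInteger(Object value) {
     if (value == null) {
       return null;
     }
     if (value instanceof Integer) {
       return (Integer) value;
     }
     if (value instanceof String) {
       return Integer.valueOf((String) value); // workaround for the reported behaviour
     }
     throw new IllegalArgumentException("Unexpected type: " + value.getClass());
   }

   public static void main(String[] args) {
     System.out.println(asInteger(42));    // 42
     System.out.println(asInteger("42"));  // 42, instead of a ClassCastException
   }
 }
 {code}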



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10339) Allow JDBC Driver to pass HTTP header Key/Value pairs

2015-04-21 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506436#comment-14506436
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10339:
--

The failures are unrelated to the change.

Thanks
Hari

 Allow JDBC Driver to pass HTTP header Key/Value pairs
 -

 Key: HIVE-10339
 URL: https://issues.apache.org/jira/browse/HIVE-10339
 Project: Hive
  Issue Type: Improvement
  Components: Beeline
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10339.1.patch, HIVE-10339.2.patch


 Currently Beeline  ODBC driver does not support carrying user specified HTTP 
 header.
 The beeline JDBC driver in HTTP mode connection string is as 
 jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,
 When transport mode is http Beeline/ODBC driver should allow end user to send 
 arbitrary HTTP Header name value pair.
 All the beeline driver needs to do is to use the user specified name values 
 and call the underlying HTTPClient API to set the header.
 E.g the Beeline connection string could be 
 jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,http.header.name1=value1,
 And the beeline will call underlying to set HTTP header to name1 and value1
 This is required for the  end user to send  identity in a HTTP header down to 
 Knox via beeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9957) Hive 1.1.0 not compatible with Hadoop 2.4.0

2015-04-21 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506463#comment-14506463
 ] 

Lefty Leverenz commented on HIVE-9957:
--

The Hive wiki has a section explaining how to apply a patch in the How to 
Contribute doc:

* [How To Contribute -- Applying a Patch | 
https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-ApplyingaPatch]

 Hive 1.1.0 not compatible with Hadoop 2.4.0
 ---

 Key: HIVE-9957
 URL: https://issues.apache.org/jira/browse/HIVE-9957
 Project: Hive
  Issue Type: Bug
  Components: Encryption
Reporter: Vivek Shrivastava
Assignee: Sergio Peña
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-9957.1.patch


 Getting this exception while accessing data through Hive. 
 Exception in thread main java.lang.NoSuchMethodError: 
 org.apache.hadoop.hdfs.DFSClient.getKeyProvider()Lorg/apache/hadoop/crypto/key/KeyProvider;
 at 
 org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.init(Hadoop23Shims.java:1152)
 at 
 org.apache.hadoop.hive.shims.Hadoop23Shims.createHdfsEncryptionShim(Hadoop23Shims.java:1279)
 at 
 org.apache.hadoop.hive.ql.session.SessionState.getHdfsEncryptionShim(SessionState.java:392)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:1756)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStagingDirectoryPathname(SemanticAnalyzer.java:1875)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1689)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1427)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10132)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10147)
 at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:192)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:222)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:421)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
 at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-4625) HS2 should not attempt to get delegation token from metastore if using embedded metastore

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506442#comment-14506442
 ] 

Hive QA commented on HIVE-4625:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12727018/HIVE-4625.4.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8728 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3522/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3522/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3522/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12727018 - PreCommit-HIVE-TRUNK-Build

 HS2 should not attempt to get delegation token from metastore if using 
 embedded metastore
 -

 Key: HIVE-4625
 URL: https://issues.apache.org/jira/browse/HIVE-4625
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Thejas M Nair
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-4625.1.patch, HIVE-4625.2.patch, HIVE-4625.3.patch, 
 HIVE-4625.4.patch


 In Kerberos secure mode, with doAs enabled, HiveServer2 tries to get a 
 delegation token from the metastore even if the metastore is being used in 
 embedded mode.
 To avoid failure in that case, it catches the resulting 
 UnsupportedOperationException and does nothing. But this leads to an error 
 being logged by lower levels and can mislead users into thinking that there 
 is a problem.
 It should check whether delegation tokens are supported with the current 
 configuration before calling the function.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10434) Cancel connection when remote Spark driver process has failed [Spark Branch]

2015-04-21 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-10434:

Summary: Cancel connection when remote Spark driver process has failed 
[Spark Branch]   (was: Cancel connection to HS2 when remote Spark driver 
process has failed [Spark Branch] )

 Cancel connection when remote Spark driver process has failed [Spark Branch] 
 -

 Key: HIVE-10434
 URL: https://issues.apache.org/jira/browse/HIVE-10434
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 1.2.0
Reporter: Chao Sun
Assignee: Chao Sun
 Attachments: HIVE-10434.1-spark.patch


 Currently in HoS, SparkClientImpl first launches a remote driver process and 
 then waits for it to connect back to HS2. However, in certain situations (for 
 instance, a permission issue), the remote process may fail and exit with an 
 error code. In this situation, the HS2 process will still wait for the 
 process to connect, and wait for a full timeout period before it throws the 
 exception.
 What makes it worse, the user may need to wait for two timeout periods: one 
 for SparkSetReducerParallelism, and another for the actual Spark job. This 
 could be very annoying.
 We should cancel the timeout task once we find out that the process has 
 failed, and set the promise as failed.
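
 A rough sketch of the behaviour described, using plain java.util.concurrent 
 types rather than the actual SparkClientImpl internals: when the child 
 process is seen to exit with a non-zero code, cancel the connection-timeout 
 task and fail the pending promise immediately instead of sitting through the 
 full timeout.
 {code}
 import java.util.concurrent.CompletableFuture;
 import java.util.concurrent.Executors;
 import java.util.concurrent.ScheduledExecutorService;
 import java.util.concurrent.ScheduledFuture;
 import java.util.concurrent.TimeUnit;

 public class DriverConnectSketch {
   public static void main(String[] args) throws Exception {
     ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
     CompletableFuture<Void> driverConnected = new CompletableFuture<>();

     // Timeout task: fail the promise if the driver never connects back.
     ScheduledFuture<?> timeoutTask = scheduler.schedule(() -> {
       driverConnected.completeExceptionally(
           new RuntimeException("Timed out waiting for the remote driver"));
     }, 90, TimeUnit.SECONDS);

     // Simulated process monitor: the remote driver process exits early.
     int exitCode = 1;
     if (exitCode != 0) {
       timeoutTask.cancel(false); // don't wait out the full timeout
       driverConnected.completeExceptionally(
           new RuntimeException("Remote driver exited with code " + exitCode));
     }

     try {
       driverConnected.get();
     } catch (Exception e) {
       System.out.println("Failed fast: " + e.getCause().getMessage());
     } finally {
       scheduler.shutdownNow();
     }
   }
 }
 {code}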



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-4625) HS2 should not attempt to get delegation token from metastore if using embedded metastore

2015-04-21 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-4625:

Attachment: HIVE-4625.4.patch

 HS2 should not attempt to get delegation token from metastore if using 
 embedded metastore
 -

 Key: HIVE-4625
 URL: https://issues.apache.org/jira/browse/HIVE-4625
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Thejas M Nair
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-4625.1.patch, HIVE-4625.2.patch, HIVE-4625.3.patch, 
 HIVE-4625.4.patch


 In Kerberos secure mode, with doAs enabled, HiveServer2 tries to get a 
 delegation token from the metastore even if the metastore is being used in 
 embedded mode.
 To avoid failure in that case, it catches the resulting 
 UnsupportedOperationException and does nothing. But this leads to an error 
 being logged by lower levels and can mislead users into thinking that there 
 is a problem.
 It should check whether delegation tokens are supported with the current 
 configuration before calling the function.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10391) CBO (Calcite Return Path): HiveOpConverter always assumes that HiveFilter does not include a partition column

2015-04-21 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506001#comment-14506001
 ] 

Laljo John Pullokkaran commented on HIVE-10391:
---

[~ashutoshc] Can you review the patch?

 CBO (Calcite Return Path): HiveOpConverter always assumes that HiveFilter 
 does not include a partition column
 -

 Key: HIVE-10391
 URL: https://issues.apache.org/jira/browse/HIVE-10391
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Laljo John Pullokkaran
 Fix For: 1.2.0

 Attachments: HIVE-10391.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506061#comment-14506061
 ] 

Hive QA commented on HIVE-9824:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726989/HIVE-9824.08.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8750 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3518/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3518/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3518/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726989 - PreCommit-HIVE-TRUNK-Build

 LLAP: Native Vectorization of Map Join so previously CPU bound queries shift 
 their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
 --

 Key: HIVE-9824
 URL: https://issues.apache.org/jira/browse/HIVE-9824
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, 
 HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch, HIVE-9824.08.patch


 Today's VectorMapJoinOperator is a pass-through that converts each row from a 
 vectorized row batch into a Java Object[] row and passes it to the 
 MapJoinOperator superclass.
 This enhancement creates specialized, optimized vectorized map join operator 
 classes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10354) Investigate the test failure of TestHadoop20SAuthBridge.testSaslWithHiveMetaStore

2015-04-21 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu resolved HIVE-10354.
-
Resolution: Fixed

The issue may be related to the environment. Resolving for now; let's reopen it 
if it persists.

 Investigate the test failure of 
 TestHadoop20SAuthBridge.testSaslWithHiveMetaStore
 -

 Key: HIVE-10354
 URL: https://issues.apache.org/jira/browse/HIVE-10354
 Project: Hive
  Issue Type: Bug
Reporter: Aihua Xu

 It failed with:
 java.lang.NullPointerException: null
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore.getDelegationToken(HiveMetaStore.java:5752)
   at 
 org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.getDelegationTokenStr(TestHadoop20SAuthBridge.java:318)
   at 
 org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.obtainTokenAndAddIntoUGI(TestHadoop20SAuthBridge.java:339)
   at 
 org.apache.hadoop.hive.thrift.TestHadoop20SAuthBridge.testSaslWithHiveMetaStore(TestHadoop20SAuthBridge.java:231)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10339) Allow JDBC Driver to pass HTTP header Key/Value pairs

2015-04-21 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505964#comment-14505964
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10339:
--

[~thejas] Thanks for the review. Added HIVE-10432 as the follow-up jira.

Thanks
Hari

 Allow JDBC Driver to pass HTTP header Key/Value pairs
 -

 Key: HIVE-10339
 URL: https://issues.apache.org/jira/browse/HIVE-10339
 Project: Hive
  Issue Type: Improvement
  Components: Beeline
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10339.1.patch, HIVE-10339.2.patch


 Currently the Beeline/ODBC driver does not support carrying user-specified 
 HTTP headers.
 The Beeline JDBC connection string in HTTP mode looks like 
 jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,
 When the transport mode is http, the Beeline/ODBC driver should allow the end 
 user to send arbitrary HTTP header name/value pairs.
 All the Beeline driver needs to do is take the user-specified names and values 
 and call the underlying HttpClient API to set the headers.
 E.g. the Beeline connection string could be 
 jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,http.header.name1=value1,
 and Beeline will call the underlying client to set the HTTP header name1 to 
 value1.
 This is required for the end user to send identity in an HTTP header down to 
 Knox via Beeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10370) Hive does not compile with -Phadoop-1 option

2015-04-21 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-10370:


Assignee: Prasanth Jayachandran

 Hive does not compile with -Phadoop-1 option
 

 Key: HIVE-10370
 URL: https://issues.apache.org/jira/browse/HIVE-10370
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Prasanth Jayachandran
Priority: Critical
 Attachments: HIVE-10370.1.patch


 Running into the below error while running mvn clean install -Pdist 
 -Phadoop-1
 {code}
 [ERROR]hive/serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazySimpleFast.java:[164,33]
  cannot find symbol
   symbol:   method copyBytes()
   location: variable serialized of type org.apache.hadoop.io.Text
 {code}
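
 A hedged illustration of the portability issue: Text.copyBytes() does not 
 exist on hadoop-1, but the same result can be obtained from the long-standing 
 getBytes()/getLength() pair. The class below is a standalone sketch, not the 
 Hive test code.
 {code}
 import java.util.Arrays;
 import org.apache.hadoop.io.Text;

 public class TextCopySketch {
   // Equivalent of the hadoop-2-only Text.copyBytes(): copy exactly
   // getLength() bytes out of the (possibly larger) backing array.
   static byte[] copyBytesCompat(Text text) {
     return Arrays.copyOf(text.getBytes(), text.getLength());
   }

   public static void main(String[] args) {
     Text serialized = new Text("hello");
     byte[] data = copyBytesCompat(serialized);
     System.out.println(data.length); // 5
   }
 }
 {code}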



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10370) Hive does not compile with -Phadoop-1 option

2015-04-21 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10370:
-
Attachment: HIVE-10370.1.patch

Compilation will still break because of HIVE-10431

 Hive does not compile with -Phadoop-1 option
 

 Key: HIVE-10370
 URL: https://issues.apache.org/jira/browse/HIVE-10370
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Priority: Critical
 Attachments: HIVE-10370.1.patch


 Running into the below error while running mvn clean install -Pdist 
 -Phadoop-1
 {code}
 [ERROR]hive/serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazySimpleFast.java:[164,33]
  cannot find symbol
   symbol:   method copyBytes()
   location: variable serialized of type org.apache.hadoop.io.Text
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10434) Cancel connection to HS2 when remote Spark driver process has failed [Spark Branch]

2015-04-21 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-10434:

Attachment: HIVE-10434.1-spark.patch

Attaching initial patch. Tested on my own cluster and it worked.

 Cancel connection to HS2 when remote Spark driver process has failed [Spark 
 Branch] 
 

 Key: HIVE-10434
 URL: https://issues.apache.org/jira/browse/HIVE-10434
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 1.2.0
Reporter: Chao Sun
Assignee: Chao Sun
 Attachments: HIVE-10434.1-spark.patch


 Currently in HoS, SparkClientImpl first launches a remote driver process and 
 then waits for it to connect back to HS2. However, in certain situations (for 
 instance, a permission issue), the remote process may fail and exit with an 
 error code. In this situation, the HS2 process will still wait for the 
 process to connect, and wait for a full timeout period before it throws the 
 exception.
 What makes it worse, the user may need to wait for two timeout periods: one 
 for SparkSetReducerParallelism, and another for the actual Spark job. This 
 could be very annoying.
 We should cancel the timeout task once we find out that the process has 
 failed, and set the promise as failed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10428) NPE in RegexSerDe using HCat

2015-04-21 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-10428:
--
Attachment: HIVE-10428.1.patch

 NPE in RegexSerDe using HCat
 

 Key: HIVE-10428
 URL: https://issues.apache.org/jira/browse/HIVE-10428
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10428.1.patch


 When HCatalog reads a table that uses 
 org.apache.hadoop.hive.serde2.RegexSerDe, the read call throws an exception:
 {noformat}
 15/04/21 14:07:31 INFO security.TokenCache: Got dt for hdfs://hdpsecahdfs; 
 Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdpsecahdfs, Ident: 
 (HDFS_DELEGATION_TOKEN token 1478 for haha)
 15/04/21 14:07:31 INFO mapred.FileInputFormat: Total input paths to process : 
 1
 Splits len : 1
 SplitInfo : [hdpseca03.seca.hwxsup.com, hdpseca04.seca.hwxsup.com, 
 hdpseca05.seca.hwxsup.com]
 15/04/21 14:07:31 INFO mapreduce.InternalUtil: Initializing 
 org.apache.hadoop.hive.serde2.RegexSerDe with properties 
 {name=casetest.regex_table, numFiles=1, columns.types=string,string, 
 serialization.format=1, columns=id,name, rawDataSize=0, numRows=0, 
 output.format.string=%1$s %2$s, 
 serialization.lib=org.apache.hadoop.hive.serde2.RegexSerDe, 
 COLUMN_STATS_ACCURATE=true, totalSize=25, serialization.null.format=\N, 
 input.regex=([^ ]*) ([^ ]*), transient_lastDdlTime=1429590172}
 15/04/21 14:07:31 WARN serde2.RegexSerDe: output.format.string has been 
 deprecated
 Exception in thread main java.lang.NullPointerException
   at 
 com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
   at com.google.common.base.Splitter.split(Splitter.java:371)
   at 
 org.apache.hadoop.hive.serde2.RegexSerDe.initialize(RegexSerDe.java:155)
   at 
 org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:49)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:518)
   at 
 org.apache.hive.hcatalog.mapreduce.InternalUtil.initializeDeserializer(InternalUtil.java:156)
   at 
 org.apache.hive.hcatalog.mapreduce.HCatRecordReader.createDeserializer(HCatRecordReader.java:127)
   at 
 org.apache.hive.hcatalog.mapreduce.HCatRecordReader.initialize(HCatRecordReader.java:92)
   at HCatalogSQLMR.main(HCatalogSQLMR.java:81)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10431) HIVE-9555 broke hadoop-1 build

2015-04-21 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505957#comment-14505957
 ] 

Prasanth Jayachandran commented on HIVE-10431:
--

[~sershe] fyi..

 HIVE-9555 broke hadoop-1 build
 --

 Key: HIVE-10431
 URL: https://issues.apache.org/jira/browse/HIVE-10431
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Sergey Shelukhin

 HIVE-9555's RecordReaderUtils uses a direct ByteBuffer read from 
 FSDataInputStream, which is not available in hadoop-1. This breaks the 
 hadoop-1 compilation.
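
 To make the incompatibility concrete, a hedged sketch of the pattern: Hadoop 2 
 streams can read directly into a ByteBuffer, while on Hadoop 1 the same 
 effect needs a plain byte[] read followed by a put. The code below uses only 
 java.io/java.nio so it stands alone and is not the Hive fix itself.
 {code}
 import java.io.IOException;
 import java.io.InputStream;
 import java.nio.ByteBuffer;

 public class ByteBufferReadFallback {
   // Hadoop-1-compatible substitute for a direct in.read(ByteBuffer): read
   // into a temporary byte[] and copy it into the buffer.
   static int readIntoBuffer(InputStream in, ByteBuffer buffer) throws IOException {
     byte[] tmp = new byte[buffer.remaining()];
     int n = in.read(tmp, 0, tmp.length);
     if (n > 0) {
       buffer.put(tmp, 0, n);
     }
     return n;
   }

   public static void main(String[] args) throws IOException {
     InputStream in = new java.io.ByteArrayInputStream("orc stripe bytes".getBytes());
     ByteBuffer buf = ByteBuffer.allocate(16);
     System.out.println(readIntoBuffer(in, buf)); // 16
   }
 }
 {code}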



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10384) RetryingMetaStoreClient does not retry wrapped TTransportExceptions

2015-04-21 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-10384:
---
Attachment: HIVE-10384.1.patch

Thanks [~szehon] for reviewing the code, and Eric Liang for providing the case 
where the TTransportException is wrapped in a MetaException which is further 
wrapped in an InvocationTargetException. Updated the patch to include that case 
as well. Thanks.

 RetryingMetaStoreClient does not retry wrapped TTransportExceptions
 ---

 Key: HIVE-10384
 URL: https://issues.apache.org/jira/browse/HIVE-10384
 Project: Hive
  Issue Type: Bug
  Components: Clients
Reporter: Eric Liang
Assignee: Chaoyu Tang
 Attachments: HIVE-10384.1.patch, HIVE-10384.patch


 This bug is very similar to HIVE-9436, in that a TTransportException wrapped 
 in a MetaException will not be retried. RetryingMetaStoreClient has a block 
 of code above the MetaException handler that retries thrift exceptions, but 
 this doesn't work when the exception is wrapped.
 {code}
 if ((e.getCause() instanceof TApplicationException) ||
     (e.getCause() instanceof TProtocolException) ||
     (e.getCause() instanceof TTransportException)) {
   caughtException = (TException) e.getCause();
 } else if ((e.getCause() instanceof MetaException) &&
     e.getCause().getMessage().matches("(?s).*JDO[a-zA-Z]*Exception.*")) {
   caughtException = (MetaException) e.getCause();
 {code}
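
 A generic, hedged illustration of the idea behind the fix (not the actual 
 RetryingMetaStoreClient code): walk the whole cause chain looking for a 
 retriable transport exception instead of inspecting only the immediate 
 getCause(). The stand-in exception class avoids a Thrift dependency.
 {code}
 public class CauseChainSketch {
   // Stand-in for org.apache.thrift.transport.TTransportException so the
   // sketch compiles without Thrift on the classpath.
   static class TransportLikeException extends RuntimeException {
     TransportLikeException(String msg) { super(msg); }
   }

   // Returns the first throwable in the cause chain that is an instance of
   // target, or null if none is found.
   static <T extends Throwable> T findCause(Throwable t, Class<T> target) {
     for (Throwable cur = t; cur != null; cur = cur.getCause()) {
       if (target.isInstance(cur)) {
         return target.cast(cur);
       }
     }
     return null;
   }

   public static void main(String[] args) {
     // Nesting from the report: InvocationTargetException -> MetaException ->
     // TTransportException, modelled here with stand-in exceptions.
     Throwable chain = new RuntimeException("InvocationTargetException (stand-in)",
         new RuntimeException("MetaException (stand-in)",
             new TransportLikeException("connection reset")));
     TransportLikeException found = findCause(chain, TransportLikeException.class);
     System.out.println(found != null ? "retry: " + found.getMessage() : "no retry");
   }
 }
 {code}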



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9824) LLAP: Native Vectorization of Map Join so previously CPU bound queries shift their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)

2015-04-21 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-9824:
---
Attachment: HIVE-9824.07.patch

 LLAP: Native Vectorization of Map Join so previously CPU bound queries shift 
 their bottleneck to I/O and make it possible for the rest of LLAP to shine ;)
 --

 Key: HIVE-9824
 URL: https://issues.apache.org/jira/browse/HIVE-9824
 Project: Hive
  Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-9824.01.patch, HIVE-9824.02.patch, 
 HIVE-9824.04.patch, HIVE-9824.06.patch, HIVE-9824.07.patch


 Today's VectorMapJoinOperator is a pass-through that converts each row from a 
 vectorized row batch into a Java Object[] row and passes it to the 
 MapJoinOperator superclass.
 This enhancement creates specialized, optimized vectorized map join operator 
 classes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8858) Visualize generated Spark plan [Spark Branch]

2015-04-21 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505272#comment-14505272
 ] 

Jimmy Xiang commented on HIVE-8858:
---

Looks good. Can this be part of explain extended? If we have to write it to the 
log file, should we put it in a buffer and log it in one log.info call? Another 
thing: when assigning those numbers, can they match the corresponding 
works/operators? For example, MapInput 1 corresponding to Map 1, and MapInput 2 
corresponding to Map 2?

 Visualize generated Spark plan [Spark Branch]
 -

 Key: HIVE-8858
 URL: https://issues.apache.org/jira/browse/HIVE-8858
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chinna Rao Lalam
 Attachments: HIVE-8858-spark.patch, HIVE-8858.1-spark.patch, 
 HIVE-8858.2-spark.patch, HIVE-8858.3-spark.patch, HIVE-8858.4-spark.patch


 The Spark plan generated by SparkPlanGenerator contains info which isn't 
 available in Hive's explain plan, such as RDD caching. Also, the graph is 
 slightly different from the original SparkWork. Thus, it would be nice to 
 visualize the plan as is done for SparkWork.
 Preferably, the visualization can happen as part of Hive explain extended. If 
 that is not feasible, we can at least log this at info level.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)