[jira] [Assigned] (HIVE-4239) Remove lock on compilation stage

2015-05-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-4239:
--

Assignee: Sergey Shelukhin

 Remove lock on compilation stage
 

 Key: HIVE-4239
 URL: https://issues.apache.org/jira/browse/HIVE-4239
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Query Processor
Reporter: Carl Steinbach
Assignee: Sergey Shelukhin





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10528) Hiveserver2 in HTTP mode is not applying auth_to_local rules

2015-05-26 Thread Abdelrahman Shettia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abdelrahman Shettia updated HIVE-10528:
---
Attachment: HIVE-10528.3.patch

 Hiveserver2 in HTTP mode is not applying auth_to_local rules
 

 Key: HIVE-10528
 URL: https://issues.apache.org/jira/browse/HIVE-10528
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0
 Environment: Centos 6
Reporter: Abdelrahman Shettia
Assignee: Abdelrahman Shettia
 Attachments: HIVE-10528.1.patch, HIVE-10528.1.patch, 
 HIVE-10528.2.patch, HIVE-10528.3.patch


 PROBLEM: When authenticating to HS2 in HTTP mode with Kerberos, auth_to_local 
 mappings do not get applied.  Because of this, various permission checks 
 that rely on the local cluster name for a user will fail.
 STEPS TO REPRODUCE:
 1.  Create a Kerberos cluster with HS2 in HTTP mode
 2.  Create a new user, test, along with a Kerberos principal for this user
 3.  Create a separate principal, mapped-test
 4.  Create an auth_to_local rule to make sure that mapped-test is mapped to 
 test
 5.  As the test user, connect to HS2 with beeline and create a simple table:
 {code}
 CREATE TABLE permtest (field1 int);
 {code}
 There is no need to load anything into this table.
 6.  Establish that it works as the test user:
 {code}
 show create table permtest;
 {code}
 7.  Drop the test identity and become mapped-test
 8.  Re-connect to HS2 with beeline, re-run the above command:
 {code}
 show create table permtest;
 {code}
 You will find that when this is done in HTTP mode, you will get an HDFS error 
 (because StorageBasedAuthorization does an HDFS permissions check) and the 
 user will be mapped-test and NOT test, as it should be.
 ANALYSIS:  This appears to be HTTP specific and the problem seems to come in 
 {{ThriftHttpServlet$HttpKerberosServerAction.getPrincipalWithoutRealmAndHost()}}:
 {code}
   try {
     fullKerberosName =
         ShimLoader.getHadoopShims().getKerberosNameShim(fullPrincipal);
   } catch (IOException e) {
     throw new HttpAuthenticationException(e);
   }
   return fullKerberosName.getServiceName();
 {code}
 getServiceName applies no auth_to_local rules.  Seems like maybe this should 
 be getShortName()?
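 A minimal sketch of that suggestion (an assumption, not the attached patch; 
 it presumes the shim's getShortName() applies the configured auth_to_local 
 rules):
 {code}
   try {
     fullKerberosName =
         ShimLoader.getHadoopShims().getKerberosNameShim(fullPrincipal);
     // getShortName() runs the principal through the configured
     // auth_to_local rules (e.g. mapped-test@REALM -> test), unlike
     // getServiceName(), which merely splits the principal string.
     return fullKerberosName.getShortName();
   } catch (IOException e) {
     throw new HttpAuthenticationException(e);
   }
 {code}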



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10828) Insert...values for fewer number of columns fail

2015-05-26 Thread Aswathy Chellammal Sreekumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aswathy Chellammal Sreekumar updated HIVE-10828:

Description: 
Schema on insert queries with fewer columns fail with the error message 
below

ERROR ql.Driver (SessionState.java:printError(957)) - FAILED: 
NullPointerException null
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:7277)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBucketingSortingDest(SemanticAnalyzer.java:6120)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6291)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8992)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8883)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9728)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9621)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10094)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:324)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10105)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Steps to reproduce:
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.enforce.bucketing=true;
drop table if exists table1; 
create table table1 (a int, b string, c string) 
   partitioned by (bkt int) 
   clustered by (a) into 2 buckets 
   stored as orc 
   tblproperties ('transactional'='true'); 
insert into table1 partition (bkt) (b, a, bkt) values 
('part one', 1, 1), ('part one', 2, 1), ('part two', 3, 2), ('part three', 
4, 3);


  was:
Schema on insert queries with fewer columns fail with the error message 
below

ERROR ql.Driver (SessionState.java:printError(957)) - FAILED: 
NullPointerException null
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:7277)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBucketingSortingDest(SemanticAnalyzer.java:6120)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6291)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8992)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8883)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9728)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9621)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10094)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:324)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10105)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at 

[jira] [Updated] (HIVE-10550) Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-26 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-10550:
-
Attachment: HIVE-10550.5-spark.patch

 Dynamic RDD caching optimization for HoS.[Spark Branch]
 ---

 Key: HIVE-10550
 URL: https://issues.apache.org/jira/browse/HIVE-10550
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
 Attachments: HIVE-10550.1-spark.patch, HIVE-10550.1.patch, 
 HIVE-10550.2-spark.patch, HIVE-10550.3-spark.patch, HIVE-10550.4-spark.patch, 
 HIVE-10550.5-spark.patch


 A Hive query may try to scan the same table multiple times, like self-join, 
 self-union, or even sharing the same subquery; [TPC-DS 
 Q39|https://github.com/hortonworks/hive-testbench/blob/hive14/sample-queries-tpcds/query39.sql]
  is an example. As you may know, Spark supports caching RDD data, which 
 means Spark puts the computed RDD data in memory and reads it from memory 
 directly the next time; this avoids the computation cost of the RDD (and 
 all the cost of its dependencies) at the cost of more memory usage. 
 By analyzing the query context, we should be able to understand which parts 
 of the query can be shared, so that we can reuse the cached RDD in the 
 generated Spark job.
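 To make the mechanism concrete, here is a toy sketch in Spark's Java API 
 (purely illustrative; the input file and local context are assumptions, not 
 Hive code):
 {code}
 import org.apache.spark.api.java.JavaRDD;
 import org.apache.spark.api.java.JavaSparkContext;

 public class CacheSketch {
   public static void main(String[] args) {
     JavaSparkContext sc = new JavaSparkContext("local", "cache-sketch");
     // The shared sub-computation: computed once, kept in memory.
     JavaRDD<String> shared = sc.textFile("input.txt")
         .filter(line -> !line.isEmpty());
     shared.cache();
     long first = shared.count();   // triggers the computation, fills the cache
     long second = shared.count();  // served from memory, no recomputation
     System.out.println(first + " / " + second);
     sc.stop();
   }
 }
 {code}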



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10761) Create codahale-based metrics system for Hive

2015-05-26 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-10761:
-
Attachment: HIVE-10761.2.patch

Tied up some loose ends, like making it take in a configured list of 
reporters and adding an end-to-end unit test for Metastore metrics; the 
latest patch should be ready for review.

 Create codahale-based metrics system for Hive
 -

 Key: HIVE-10761
 URL: https://issues.apache.org/jira/browse/HIVE-10761
 Project: Hive
  Issue Type: New Feature
  Components: Diagnosability
Reporter: Szehon Ho
Assignee: Szehon Ho
 Attachments: HIVE-10761.2.patch, HIVE-10761.patch, hms-metrics.json


 There is a current Hive metrics system that hooks up to JMX reporting, but 
 all its measurements and models are custom.
 This is to make another metrics system that will be based on Codahale (ie 
 yammer, dropwizard), which has the following advantages:
 * Well-defined metric model for frequently-needed metrics (ie JVM metrics)
 * Well-defined measurements for all metrics (ie max, mean, stddev, mean_rate, 
 etc.)
 * Built-in reporting frameworks like JMX, Console, Log, JSON webserver
 It is used in many projects, including several Apache projects like Oozie.  
 Overall, monitoring tools should find it easier to understand these common 
 metric, measurement, and reporting models.
 The existing metrics subsystem will be kept and can be enabled if backward 
 compatibility is desired.
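 As a hedged illustration of what the Codahale model provides (generic 
 Codahale usage, not the patch itself):
 {code}
 import com.codahale.metrics.ConsoleReporter;
 import com.codahale.metrics.MetricRegistry;
 import com.codahale.metrics.Timer;

 public class MetricsSketch {
   public static void main(String[] args) throws Exception {
     MetricRegistry registry = new MetricRegistry();
     Timer apiTimer = registry.timer("api_calls");
     try (Timer.Context ignored = apiTimer.time()) {
       Thread.sleep(50);  // stand-in for the timed work
     }
     // A built-in reporter prints count, mean, stddev, rates, percentiles.
     ConsoleReporter.forRegistry(registry).build().report();
   }
 }
 {code}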



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10689) HS2 metadata api calls should use HiveAuthorizer interface for authorization

2015-05-26 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-10689:
-
Attachment: HIVE-10689.1.patch

 HS2 metadata api calls should use HiveAuthorizer interface for authorization
 

 Key: HIVE-10689
 URL: https://issues.apache.org/jira/browse/HIVE-10689
 Project: Hive
  Issue Type: Bug
  Components: Authorization, SQLStandardAuthorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-10689.1.patch


 The java.sql.DatabaseMetaData APIs in JDBC result in calls to HS2 metadata 
 APIs, whose execution is via separate Hive Operation implementations 
 that don't use the Hive Driver class. Invocation of these APIs should also 
 be authorized using the HiveAuthorizer API.
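 A hedged sketch of the direction (a hypothetical helper, not the attached 
 patch; the method name and exception handling are assumptions, while the 
 authorizer calls follow Hive's existing V2 authorization interface):
 {code}
 // Hypothetical helper: authorize a metadata call through the pluggable
 // HiveAuthorizer V2 interface instead of skipping authorization.
 private void authorizeMetaGettables(String dbName, String tableName)
     throws HiveSQLException {
   try {
     HiveAuthorizer authorizer = SessionState.get().getAuthorizerV2();
     if (authorizer == null) {
       return;  // V2 authorization not configured
     }
     List<HivePrivilegeObject> inputs = Arrays.asList(new HivePrivilegeObject(
         HivePrivilegeObject.HivePrivilegeObjectType.TABLE_OR_VIEW,
         dbName, tableName));
     authorizer.checkPrivileges(HiveOperationType.SHOWTABLES, inputs,
         null, new HiveAuthzContext.Builder().build());
   } catch (Exception e) {
     throw new HiveSQLException("Authorization failed: " + e.getMessage(), e);
   }
 }
 {code}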



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10829) ATS hook fails for explainTask

2015-05-26 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10829:
---
Attachment: HIVE-10829.01.patch

 ATS hook fails for explainTask
 --

 Key: HIVE-10829
 URL: https://issues.apache.org/jira/browse/HIVE-10829
 Project: Hive
  Issue Type: Bug
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
Priority: Minor
 Attachments: HIVE-10829.01.patch


 Commands:
 create table idtable(id string);
 create table ctastable as select * from idtable;
 With ATS hook:
 2015-05-22 18:54:47,092 INFO  [ATS Logger 0]: hooks.ATSHook 
 (ATSHook.java:run(136)) - Failed to submit plan to ATS: 
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:589)
 at 
 org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:576)
 at 
 org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:821)
 at 
 org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:965)
 at 
 org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:219)
 at org.apache.hadoop.hive.ql.hooks.ATSHook$2.run(ATSHook.java:120)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10731) NullPointerException in HiveParser.g

2015-05-26 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560177#comment-14560177
 ] 

Pengcheng Xiong commented on HIVE-10731:


[~jpullokkaran], this patch also needs your review. Thanks.

 NullPointerException in HiveParser.g
 

 Key: HIVE-10731
 URL: https://issues.apache.org/jira/browse/HIVE-10731
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Affects Versions: 1.2.0
Reporter: Xiu
Assignee: Pengcheng Xiong
Priority: Minor
 Attachments: HIVE-10731.01.patch


 In HiveParser.g:
 {code:Java}
 protected boolean useSQL11ReservedKeywordsForIdentifier() {
   return !HiveConf.getBoolVar(hiveConf,
       HiveConf.ConfVars.HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS);
 }
 {code}
 NullPointerException is thrown when hiveConf is not set.
 Stack trace:
 {code:Java}
 java.lang.NullPointerException
 at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:2583)
 at 
 org.apache.hadoop.hive.ql.parse.HiveParser.useSQL11ReservedKeywordsForIdentifier(HiveParser.java:1000)
 at 
 org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.useSQL11ReservedKeywordsForIdentifier(HiveParser_IdentifiersParser.java:726)
 at 
 org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:10922)
 at 
 org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45808)
 at 
 org.apache.hadoop.hive.ql.parse.HiveParser.columnNameType(HiveParser.java:38008)
 at 
 org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeList(HiveParser.java:36167)
 at 
 org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:5214)
 at 
 org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2640)
 at 
 org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1650)
 at 
 org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
 at 
 org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
 at 
 org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
 at 
 org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:161)
 {code}
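 One possible guard, as a sketch (an assumption about the fix; the fallback 
 mirrors the configuration's default value):
 {code:Java}
 protected boolean useSQL11ReservedKeywordsForIdentifier() {
   // Fall back to the default (SQL11 reserved keywords are not usable as
   // identifiers) instead of dereferencing a null hiveConf.
   if (hiveConf == null) {
     return false;
   }
   return !HiveConf.getBoolVar(hiveConf,
       HiveConf.ConfVars.HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS);
 }
 {code}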



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10528) Hiveserver2 in HTTP mode is not applying auth_to_local rules

2015-05-26 Thread Abdelrahman Shettia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abdelrahman Shettia updated HIVE-10528:
---
Attachment: HIVE-10528.2.patch

 Hiveserver2 in HTTP mode is not applying auth_to_local rules
 

 Key: HIVE-10528
 URL: https://issues.apache.org/jira/browse/HIVE-10528
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0
 Environment: Centos 6
Reporter: Abdelrahman Shettia
Assignee: Abdelrahman Shettia
 Attachments: HIVE-10528.1.patch, HIVE-10528.1.patch, 
 HIVE-10528.2.patch


 PROBLEM: When authenticating to HS2 in HTTP mode with Kerberos, auth_to_local 
 mappings do not get applied.  Because of this, various permission checks 
 that rely on the local cluster name for a user will fail.
 STEPS TO REPRODUCE:
 1.  Create a Kerberos cluster with HS2 in HTTP mode
 2.  Create a new user, test, along with a Kerberos principal for this user
 3.  Create a separate principal, mapped-test
 4.  Create an auth_to_local rule to make sure that mapped-test is mapped to 
 test
 5.  As the test user, connect to HS2 with beeline and create a simple table:
 {code}
 CREATE TABLE permtest (field1 int);
 {code}
 There is no need to load anything into this table.
 6.  Establish that it works as the test user:
 {code}
 show create table permtest;
 {code}
 7.  Drop the test identity and become mapped-test
 8.  Re-connect to HS2 with beeline, re-run the above command:
 {code}
 show create table permtest;
 {code}
 You will find that when this is done in HTTP mode, you will get an HDFS error 
 (because StorageBasedAuthorization does an HDFS permissions check) and the 
 user will be mapped-test and NOT test, as it should be.
 ANALYSIS:  This appears to be HTTP specific and the problem seems to come in 
 {{ThriftHttpServlet$HttpKerberosServerAction.getPrincipalWithoutRealmAndHost()}}:
 {code}
   try {
     fullKerberosName =
         ShimLoader.getHadoopShims().getKerberosNameShim(fullPrincipal);
   } catch (IOException e) {
     throw new HttpAuthenticationException(e);
   }
   return fullKerberosName.getServiceName();
 {code}
 getServiceName applies no auth_to_local rules.  Seems like maybe this should 
 be getShortName()?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10819) SearchArgumentImpl for Timestamp is broken by HIVE-10286

2015-05-26 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560238#comment-14560238
 ] 

Ferdinand Xu commented on HIVE-10819:
-

Hi [~sershe], [~daijy], the problematic commit has already been reverted.
{noformat}
Repository: hive
Updated Branches:
  refs/heads/master db8067f96 -> a00bf4f87


Revert HIVE-10277: Unable to process Comment line '--' in HIVE-1.1.0 (Chinna 
via Xuefu)

This reverts commit d66a7347ab97983cc5b9fca6bdabebc81e5a77e5.
{noformat}

 SearchArgumentImpl for Timestamp is broken by HIVE-10286
 

 Key: HIVE-10819
 URL: https://issues.apache.org/jira/browse/HIVE-10819
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 1.2.1

 Attachments: HIVE-10819.1.patch, HIVE-10819.2.patch, 
 HIVE-10819.3.patch


 The workaround for the kryo bug for Timestamp was accidentally removed by 
 HIVE-10286. It needs to be brought back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10788) Change sort_array to support non-primitive types

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560240#comment-14560240
 ] 

Hive QA commented on HIVE-10788:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735427/HIVE-10788.1.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 8977 tests executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_sort_array_wrong1
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4048/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4048/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4048/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12735427 - PreCommit-HIVE-TRUNK-Build

 Change sort_array to support non-primitive types
 

 Key: HIVE-10788
 URL: https://issues.apache.org/jira/browse/HIVE-10788
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Chao Sun
Assignee: Chao Sun
 Attachments: HIVE-10788.1.patch


 Currently {{sort_array}} only supports primitive types. As we already support 
 comparison between non-primitive types, it makes sense to remove this 
 restriction.
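 A hedged example of the desired behavior once the restriction is lifted 
 (illustrative input and output, not from the patch):
 {code}
 SELECT sort_array(array(array(2, 3), array(1, 2)));
 -- expected: [[1,2],[2,3]]
 {code}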



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10828) Insert...values for fewer number of columns fail

2015-05-26 Thread Aswathy Chellammal Sreekumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aswathy Chellammal Sreekumar updated HIVE-10828:

Description: 
Schema on insert queries with fewer columns fail with the error message 
below

ERROR ql.Driver (SessionState.java:printError(957)) - FAILED: 
NullPointerException null
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:7277)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBucketingSortingDest(SemanticAnalyzer.java:6120)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6291)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8992)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8883)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9728)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9621)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10094)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:324)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10105)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

*Steps to reproduce:*

set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.enforce.bucketing=true;
drop table if exists table1; 
create table table1 (a int, b string, c string) 
   partitioned by (bkt int) 
   clustered by (a) into 2 buckets 
   stored as orc 
   tblproperties ('transactional'='true'); 
insert into table1 partition (bkt) (b, a, bkt) values 
('part one', 1, 1), ('part one', 2, 1), ('part two', 3, 2), ('part three', 
4, 3);


  was:
Schema on insert queries with fewer columns fail with the error message 
below

ERROR ql.Driver (SessionState.java:printError(957)) - FAILED: 
NullPointerException null
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:7277)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBucketingSortingDest(SemanticAnalyzer.java:6120)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6291)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8992)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8883)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9728)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9621)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10094)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:324)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10105)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at 

[jira] [Updated] (HIVE-10828) Insert...values for fewer number of columns fail

2015-05-26 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10828:
--
Description: 
Schema on insert queries with fewer columns fail with the error message 
below
{noformat}
ERROR ql.Driver (SessionState.java:printError(957)) - FAILED: 
NullPointerException null
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:7277)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBucketingSortingDest(SemanticAnalyzer.java:6120)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6291)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8992)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8883)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9728)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9621)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10094)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:324)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10105)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409)
at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}
*Steps to reproduce:*

set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.enforce.bucketing=true;
drop table if exists table1; 
create table table1 (a int, b string, c string) 
   partitioned by (bkt int) 
   clustered by (a) into 2 buckets 
   stored as orc 
   tblproperties ('transactional'='true'); 
insert into table1 partition (bkt) (b, a, bkt) values 
('part one', 1, 1), ('part one', 2, 1), ('part two', 3, 2), ('part three', 
4, 3);


  was:
Schema on insert queries with fewer columns fail with the error message 
below

ERROR ql.Driver (SessionState.java:printError(957)) - FAILED: 
NullPointerException null
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:7277)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBucketingSortingDest(SemanticAnalyzer.java:6120)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6291)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8992)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8883)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9728)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9621)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10094)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:324)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10105)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
at 

[jira] [Commented] (HIVE-10828) Insert...values for fewer number of columns fail

2015-05-26 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560290#comment-14560290
 ] 

Eugene Koifman commented on HIVE-10828:
---

Simpler repro case
{noformat}
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.cbo.enable=false;

drop table if exists acid_partitioned;
create table acid_partitioned (a int, c string)
  partitioned by (p int)
  clustered by (a) into 1 buckets;
  
insert into acid_partitioned partition (p) (a,p) values(1,1);
{noformat}

The above example disables CBO because it causes additional issues; I will 
file a separate ticket for that.

 Insert...values for fewer number of columns fail
 

 Key: HIVE-10828
 URL: https://issues.apache.org/jira/browse/HIVE-10828
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Aswathy Chellammal Sreekumar
Assignee: Eugene Koifman

 Schema on insert queries with fewer columns fail with the error message 
 below
 ERROR ql.Driver (SessionState.java:printError(957)) - FAILED: 
 NullPointerException null
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:7277)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBucketingSortingDest(SemanticAnalyzer.java:6120)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6291)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8992)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8883)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9728)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9621)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10094)
 at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:324)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10105)
 at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
 at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409)
 at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 *Steps to reproduce:*
 set hive.support.concurrency=true;
 set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
 set hive.enforce.bucketing=true;
 drop table if exists table1; 
 create table table1 (a int, b string, c string) 
partitioned by (bkt int) 
clustered by (a) into 2 buckets 
stored as orc 
tblproperties ('transactional'='true'); 
 insert into table1 partition (bkt) (b, a, bkt) values 
 ('part one', 1, 1), ('part one', 2, 1), ('part two', 3, 2), ('part 
 three', 4, 3);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9069) Simplify filter predicates for CBO

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560306#comment-14560306
 ] 

Hive QA commented on HIVE-9069:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735433/HIVE-9069.14.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8975 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_7
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_7
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4049/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4049/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4049/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12735433 - PreCommit-HIVE-TRUNK-Build

 Simplify filter predicates for CBO
 --

 Key: HIVE-9069
 URL: https://issues.apache.org/jira/browse/HIVE-9069
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.14.1

 Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch, 
 HIVE-9069.03.patch, HIVE-9069.04.patch, HIVE-9069.05.patch, 
 HIVE-9069.06.patch, HIVE-9069.07.patch, HIVE-9069.08.patch, 
 HIVE-9069.08.patch, HIVE-9069.09.patch, HIVE-9069.10.patch, 
 HIVE-9069.11.patch, HIVE-9069.12.patch, HIVE-9069.13.patch, 
 HIVE-9069.14.patch, HIVE-9069.14.patch, HIVE-9069.patch


 Simplify disjunctive filter predicates so that common conjuncts can get 
 pushed down to the scan.
 Looks like this is still an issue: some of the filters could be pushed down 
 to the scan but are not.
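 As a hedged illustration of the simplification meant here (toy predicate, 
 not from the ticket), factoring the common conjunct out of a disjunction 
 leaves a predicate the scan can evaluate:
 {code}
 -- Before: the common conjunct is buried in every OR branch.
 SELECT * FROM customer_address
 WHERE (ca_country = 'United States' AND ca_state IN ('KY', 'GA', 'NM'))
    OR (ca_country = 'United States' AND ca_state IN ('MT', 'OR', 'IN'));
 -- After: ca_country = 'United States' can be pushed to the scan.
 SELECT * FROM customer_address
 WHERE ca_country = 'United States'
   AND (ca_state IN ('KY', 'GA', 'NM') OR ca_state IN ('MT', 'OR', 'IN'));
 {code}
 The full query from the ticket follows: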
 {code}
 set hive.cbo.enable=true
 set hive.stats.fetch.column.stats=true
 set hive.exec.dynamic.partition.mode=nonstrict
 set hive.tez.auto.reducer.parallelism=true
 set hive.auto.convert.join.noconditionaltask.size=32000
 set hive.exec.reducers.bytes.per.reducer=1
 set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
 set hive.support.concurrency=false
 set hive.tez.exec.print.summary=true
 explain  
 select  substr(r_reason_desc,1,20) as r
,avg(ws_quantity) wq
,avg(wr_refunded_cash) ref
,avg(wr_fee) fee
  from web_sales, web_returns, web_page, customer_demographics cd1,
   customer_demographics cd2, customer_address, date_dim, reason 
  where web_sales.ws_web_page_sk = web_page.wp_web_page_sk
and web_sales.ws_item_sk = web_returns.wr_item_sk
and web_sales.ws_order_number = web_returns.wr_order_number
and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998
and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk 
and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk
and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk
and reason.r_reason_sk = web_returns.wr_reason_sk
and
(
 (
  cd1.cd_marital_status = 'M'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = '4 yr Degree'
  and 
  cd1.cd_education_status = cd2.cd_education_status
  and
  ws_sales_price between 100.00 and 150.00
 )
or
 (
  cd1.cd_marital_status = 'D'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = 'Primary' 
  and
  cd1.cd_education_status = cd2.cd_education_status
  and
  ws_sales_price between 50.00 and 100.00
 )
or
 (
  cd1.cd_marital_status = 'U'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = 'Advanced Degree'
  and
  cd1.cd_education_status = cd2.cd_education_status
  and
  ws_sales_price between 150.00 and 200.00
 )
)
and
(
 (
  ca_country = 'United States'
  and
  ca_state in ('KY', 'GA', 'NM')
  and ws_net_profit between 100 and 200  
 )
 or
 (
  ca_country = 'United States'
  and
  ca_state in ('MT', 'OR', 'IN')
  and ws_net_profit between 150 and 300  
 )
 or
 (
  ca_country = 'United States'
  and
  ca_state in ('WI', 'MO', 'WV')
  and ws_net_profit between 50 and 250  
 )
)
 group by r_reason_desc
 order by r, wq, ref, fee
 limit 100
 OK
 STAGE DEPENDENCIES:
   Stage-1 

[jira] [Updated] (HIVE-7723) Explain plan for complex query with lots of partitions is slow due to in-efficient collection used to find a matching ReadEntity

2015-05-26 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-7723:
--
Attachment: HIVE-7723.11.patch

 Explain plan for complex query with lots of partitions is slow due to 
 in-efficient collection used to find a matching ReadEntity
 

 Key: HIVE-7723
 URL: https://issues.apache.org/jira/browse/HIVE-7723
 Project: Hive
  Issue Type: Bug
  Components: CLI, Physical Optimizer
Affects Versions: 0.13.1
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Attachments: HIVE-7723.1.patch, HIVE-7723.10.patch, 
 HIVE-7723.11.patch, HIVE-7723.2.patch, HIVE-7723.3.patch, HIVE-7723.4.patch, 
 HIVE-7723.5.patch, HIVE-7723.6.patch, HIVE-7723.7.patch, HIVE-7723.8.patch, 
 HIVE-7723.9.patch


 Explain on TPC-DS query 64 took 11 seconds; when the CLI was profiled, it 
 showed that ReadEntity.equals was taking ~40% of the CPU.
 ReadEntity.equals is called from the snippet below.
 Again and again the set is iterated over to get the actual match; a HashMap 
 is a better option for this case, as Set doesn't have a get method.
 Also, for ReadEntity, equals is case-insensitive while hash is not, which is 
 undesired behavior.
 {code}
 public static ReadEntity addInput(Set<ReadEntity> inputs, ReadEntity newInput) {
   // If the input is already present, make sure the new parent is added to the input.
   if (inputs.contains(newInput)) {
     for (ReadEntity input : inputs) {
       if (input.equals(newInput)) {
         if ((newInput.getParents() != null) && (!newInput.getParents().isEmpty())) {
           input.getParents().addAll(newInput.getParents());
           input.setDirect(input.isDirect() || newInput.isDirect());
         }
         return input;
       }
     }
     assert false;
   } else {
     inputs.add(newInput);
     return newInput;
   }
   // make compile happy
   return null;
 }
 {code}
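 A sketch of the suggested HashMap-based alternative (hypothetical, not the 
 attached patch; it assumes callers can supply a Map keyed by the entity):
 {code}
 public static ReadEntity addInput(Map<ReadEntity, ReadEntity> inputs,
     ReadEntity newInput) {
   // O(1) lookup of the existing entry instead of iterating the whole set.
   ReadEntity existing = inputs.get(newInput);
   if (existing == null) {
     inputs.put(newInput, newInput);
     return newInput;
   }
   if ((newInput.getParents() != null) && (!newInput.getParents().isEmpty())) {
     existing.getParents().addAll(newInput.getParents());
     existing.setDirect(existing.isDirect() || newInput.isDirect());
   }
   return existing;
 }
 {code}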
 This is the query used : 
 {code}
 select cs1.product_name ,cs1.store_name ,cs1.store_zip ,cs1.b_street_number 
 ,cs1.b_streen_name ,cs1.b_city
  ,cs1.b_zip ,cs1.c_street_number ,cs1.c_street_name ,cs1.c_city 
 ,cs1.c_zip ,cs1.syear ,cs1.cnt
  ,cs1.s1 ,cs1.s2 ,cs1.s3
  ,cs2.s1 ,cs2.s2 ,cs2.s3 ,cs2.syear ,cs2.cnt
 from
 (select i_product_name as product_name ,i_item_sk as item_sk ,s_store_name as 
 store_name
  ,s_zip as store_zip ,ad1.ca_street_number as b_street_number 
 ,ad1.ca_street_name as b_streen_name
  ,ad1.ca_city as b_city ,ad1.ca_zip as b_zip ,ad2.ca_street_number as 
 c_street_number
  ,ad2.ca_street_name as c_street_name ,ad2.ca_city as c_city ,ad2.ca_zip 
 as c_zip
  ,d1.d_year as syear ,d2.d_year as fsyear ,d3.d_year as s2year ,count(*) 
 as cnt
  ,sum(ss_wholesale_cost) as s1 ,sum(ss_list_price) as s2 
 ,sum(ss_coupon_amt) as s3
   FROM   store_sales
 JOIN store_returns ON store_sales.ss_item_sk = 
 store_returns.sr_item_sk and store_sales.ss_ticket_number = 
 store_returns.sr_ticket_number
 JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk
 JOIN date_dim d1 ON store_sales.ss_sold_date_sk = d1.d_date_sk
 JOIN date_dim d2 ON customer.c_first_sales_date_sk = d2.d_date_sk 
 JOIN date_dim d3 ON customer.c_first_shipto_date_sk = d3.d_date_sk
 JOIN store ON store_sales.ss_store_sk = store.s_store_sk
 JOIN customer_demographics cd1 ON store_sales.ss_cdemo_sk= 
 cd1.cd_demo_sk
 JOIN customer_demographics cd2 ON customer.c_current_cdemo_sk = 
 cd2.cd_demo_sk
 JOIN promotion ON store_sales.ss_promo_sk = promotion.p_promo_sk
 JOIN household_demographics hd1 ON store_sales.ss_hdemo_sk = 
 hd1.hd_demo_sk
 JOIN household_demographics hd2 ON customer.c_current_hdemo_sk = 
 hd2.hd_demo_sk
 JOIN customer_address ad1 ON store_sales.ss_addr_sk = 
 ad1.ca_address_sk
 JOIN customer_address ad2 ON customer.c_current_addr_sk = 
 ad2.ca_address_sk
 JOIN income_band ib1 ON hd1.hd_income_band_sk = ib1.ib_income_band_sk
 JOIN income_band ib2 ON hd2.hd_income_band_sk = ib2.ib_income_band_sk
 JOIN item ON store_sales.ss_item_sk = item.i_item_sk
 JOIN
  (select cs_item_sk
 ,sum(cs_ext_list_price) as 
 sale,sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit) as refund
   from catalog_sales JOIN catalog_returns
   ON catalog_sales.cs_item_sk = catalog_returns.cr_item_sk
 and catalog_sales.cs_order_number = catalog_returns.cr_order_number
   group by cs_item_sk
   having 
 sum(cs_ext_list_price) > 2*sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit))
  cs_ui
 ON store_sales.ss_item_sk = cs_ui.cs_item_sk
   WHERE  
  cd1.cd_marital_status <> 

[jira] [Updated] (HIVE-7723) Explain plan for complex query with lots of partitions is slow due to in-efficient collection used to find a matching ReadEntity

2015-05-26 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-7723:
--
Attachment: (was: HIVE-7723.11.patch)

 Explain plan for complex query with lots of partitions is slow due to 
 in-efficient collection used to find a matching ReadEntity
 

 Key: HIVE-7723
 URL: https://issues.apache.org/jira/browse/HIVE-7723
 Project: Hive
  Issue Type: Bug
  Components: CLI, Physical Optimizer
Affects Versions: 0.13.1
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Attachments: HIVE-7723.1.patch, HIVE-7723.10.patch, 
 HIVE-7723.2.patch, HIVE-7723.3.patch, HIVE-7723.4.patch, HIVE-7723.5.patch, 
 HIVE-7723.6.patch, HIVE-7723.7.patch, HIVE-7723.8.patch, HIVE-7723.9.patch


 Explain on TPC-DS query 64 took 11 seconds; when the CLI was profiled, it 
 showed that ReadEntity.equals was taking ~40% of the CPU.
 ReadEntity.equals is called from the snippet below.
 Again and again the set is iterated over to get the actual match; a HashMap 
 is a better option for this case, as Set doesn't have a get method.
 Also, for ReadEntity, equals is case-insensitive while hash is not, which is 
 undesired behavior.
 {code}
 public static ReadEntity addInput(Set<ReadEntity> inputs, ReadEntity newInput) {
   // If the input is already present, make sure the new parent is added to the input.
   if (inputs.contains(newInput)) {
     for (ReadEntity input : inputs) {
       if (input.equals(newInput)) {
         if ((newInput.getParents() != null) && (!newInput.getParents().isEmpty())) {
           input.getParents().addAll(newInput.getParents());
           input.setDirect(input.isDirect() || newInput.isDirect());
         }
         return input;
       }
     }
     assert false;
   } else {
     inputs.add(newInput);
     return newInput;
   }
   // make compile happy
   return null;
 }
 {code}
 This is the query used : 
 {code}
 select cs1.product_name ,cs1.store_name ,cs1.store_zip ,cs1.b_street_number 
 ,cs1.b_streen_name ,cs1.b_city
  ,cs1.b_zip ,cs1.c_street_number ,cs1.c_street_name ,cs1.c_city 
 ,cs1.c_zip ,cs1.syear ,cs1.cnt
  ,cs1.s1 ,cs1.s2 ,cs1.s3
  ,cs2.s1 ,cs2.s2 ,cs2.s3 ,cs2.syear ,cs2.cnt
 from
 (select i_product_name as product_name ,i_item_sk as item_sk ,s_store_name as 
 store_name
  ,s_zip as store_zip ,ad1.ca_street_number as b_street_number 
 ,ad1.ca_street_name as b_streen_name
  ,ad1.ca_city as b_city ,ad1.ca_zip as b_zip ,ad2.ca_street_number as 
 c_street_number
  ,ad2.ca_street_name as c_street_name ,ad2.ca_city as c_city ,ad2.ca_zip 
 as c_zip
  ,d1.d_year as syear ,d2.d_year as fsyear ,d3.d_year as s2year ,count(*) 
 as cnt
  ,sum(ss_wholesale_cost) as s1 ,sum(ss_list_price) as s2 
 ,sum(ss_coupon_amt) as s3
   FROM   store_sales
 JOIN store_returns ON store_sales.ss_item_sk = 
 store_returns.sr_item_sk and store_sales.ss_ticket_number = 
 store_returns.sr_ticket_number
 JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk
 JOIN date_dim d1 ON store_sales.ss_sold_date_sk = d1.d_date_sk
 JOIN date_dim d2 ON customer.c_first_sales_date_sk = d2.d_date_sk 
 JOIN date_dim d3 ON customer.c_first_shipto_date_sk = d3.d_date_sk
 JOIN store ON store_sales.ss_store_sk = store.s_store_sk
 JOIN customer_demographics cd1 ON store_sales.ss_cdemo_sk= 
 cd1.cd_demo_sk
 JOIN customer_demographics cd2 ON customer.c_current_cdemo_sk = 
 cd2.cd_demo_sk
 JOIN promotion ON store_sales.ss_promo_sk = promotion.p_promo_sk
 JOIN household_demographics hd1 ON store_sales.ss_hdemo_sk = 
 hd1.hd_demo_sk
 JOIN household_demographics hd2 ON customer.c_current_hdemo_sk = 
 hd2.hd_demo_sk
 JOIN customer_address ad1 ON store_sales.ss_addr_sk = 
 ad1.ca_address_sk
 JOIN customer_address ad2 ON customer.c_current_addr_sk = 
 ad2.ca_address_sk
 JOIN income_band ib1 ON hd1.hd_income_band_sk = ib1.ib_income_band_sk
 JOIN income_band ib2 ON hd2.hd_income_band_sk = ib2.ib_income_band_sk
 JOIN item ON store_sales.ss_item_sk = item.i_item_sk
 JOIN
  (select cs_item_sk
 ,sum(cs_ext_list_price) as 
 sale,sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit) as refund
   from catalog_sales JOIN catalog_returns
   ON catalog_sales.cs_item_sk = catalog_returns.cr_item_sk
 and catalog_sales.cs_order_number = catalog_returns.cr_order_number
   group by cs_item_sk
   having 
 sum(cs_ext_list_price) > 2*sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit))
  cs_ui
 ON store_sales.ss_item_sk = cs_ui.cs_item_sk
   WHERE  
  cd1.cd_marital_status <> cd2.cd_marital_status 

[jira] [Commented] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases

2015-05-26 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560335#comment-14560335
 ] 

Laljo John Pullokkaran commented on HIVE-10811:
---

Why do we need to keep the fields from the input that are part of the 
collation but are not used by the parent? If no operators from the parent 
refer to that column, then I don't see how preserving the sort order is 
helpful.

 RelFieldTrimmer throws NoSuchElementException in some cases
 ---

 Key: HIVE-10811
 URL: https://issues.apache.org/jira/browse/HIVE-10811
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-10811.01.patch, HIVE-10811.02.patch, 
 HIVE-10811.patch


 RelFieldTrimmer runs into NoSuchElementException in some cases.
 Stack trace:
 {noformat}
 Exception in thread "main" java.lang.AssertionError: Internal error: While 
 invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
   at org.apache.calcite.util.Util.newInternal(Util.java:743)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:768)
   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109)
   at 
 org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:730)
   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145)
   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:607)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:244)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10048)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:207)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:536)
   ... 32 more
 Caused by: java.lang.AssertionError: Internal error: While invoking method 
 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
   at 

[jira] [Commented] (HIVE-10550) Dynamic RDD caching optimization for HoS.[Spark Branch]

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560342#comment-14560342
 ] 

Hive QA commented on HIVE-10550:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735497/HIVE-10550.5-spark.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8721 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucket6.q-scriptfile1_win.q-quotedid_smb.q-and-1-more - did 
not produce a TEST-*.xml file
TestMinimrCliDriver-bucketizedhiveinputformat.q-empty_dir_in_table.q - did not 
produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-infer_bucket_sort_map_operators.q-load_hdfs_file_with_space_in_the_name.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-import_exported_table.q-truncate_column_buckets.q-bucket_num_reducers2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-infer_bucket_sort_num_buckets.q-parallel_orderby.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-join1.q-infer_bucket_sort_bucketed_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-bucket5.q-infer_bucket_sort_merge.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-input16_cc.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-bucket_num_reducers.q-scriptfile1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx_cbo_2.q-bucketmapjoin6.q-bucket4.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-reduce_deduplicate.q-infer_bucket_sort_dyn_part.q-udf_using.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-uber_reduce.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-stats_counter_partitioned.q-external_table_with_space_in_location_path.q-disable_merge_for_bucketing.q-and-1-more
 - did not produce a TEST-*.xml file
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/866/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/866/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-866/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12735497 - PreCommit-HIVE-SPARK-Build

 Dynamic RDD caching optimization for HoS.[Spark Branch]
 ---

 Key: HIVE-10550
 URL: https://issues.apache.org/jira/browse/HIVE-10550
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
 Attachments: HIVE-10550.1-spark.patch, HIVE-10550.1.patch, 
 HIVE-10550.2-spark.patch, HIVE-10550.3-spark.patch, HIVE-10550.4-spark.patch, 
 HIVE-10550.5-spark.patch


 A Hive query may scan the same table multiple times, as with a self-join, a 
 self-union, or operators that share a common subquery; [TPC-DS 
 Q39|https://github.com/hortonworks/hive-testbench/blob/hive14/sample-queries-tpcds/query39.sql]
  is an example. Spark supports caching RDD data: the computed RDD is kept in 
 memory and read back directly the next time it is needed, which avoids 
 recomputing that RDD (and all of its dependencies) at the cost of extra 
 memory. By analyzing the query context, we should be able to determine which 
 parts of the query are shared, so that the generated Spark job can reuse the 
 cached RDD.
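 For illustration, a minimal sketch of the caching idea using Spark's Java 
 API (the input path, app name, and the filter are hypothetical, not from 
 the patch):
 {code}
 import org.apache.spark.SparkConf;
 import org.apache.spark.api.java.JavaRDD;
 import org.apache.spark.api.java.JavaSparkContext;
 
 public class RddReuseSketch {
   public static void main(String[] args) {
     JavaSparkContext sc = new JavaSparkContext(
         new SparkConf().setAppName("rdd-reuse").setMaster("local[*]"));
     // A subquery scanned twice (e.g. by a self-join) maps to one shared RDD.
     JavaRDD<String> shared = sc.textFile("hdfs:///warehouse/store_sales").cache();
     // Both consumers read the cached data; the scan is computed only once.
     long total = shared.count();
     long matching = shared.filter(line -> line.contains("1998")).count();
     System.out.println(total + " rows, " + matching + " matching");
     sc.stop();
   }
 }
 {code}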



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation

2015-05-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10812:

Component/s: Statistics
 Physical Optimizer

 Scaling PK/FK's selectivity for stats annotation
 

 Key: HIVE-10812
 URL: https://issues.apache.org/jira/browse/HIVE-10812
 Project: Hive
  Issue Type: Improvement
  Components: Physical Optimizer, Statistics
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Fix For: 1.2.1

 Attachments: HIVE-10812.01.patch, HIVE-10812.02.patch, 
 HIVE-10812.03.patch


 Right now, the computation of the selectivity of the FK side based on the PK 
 side does not take into consideration the range of the FK and the range of 
 the PK.
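 As a purely hypothetical sketch of such scaling (the formula in the attached 
 patches may differ), the FK-side selectivity could be scaled by the fraction 
 of the FK's value range that overlaps the PK's range:
 {code}
 // Hypothetical sketch: scale PK/FK join selectivity by range overlap.
 static double scaledSelectivity(double baseSelectivity,
     long fkMin, long fkMax, long pkMin, long pkMax) {
   long overlap = Math.min(fkMax, pkMax) - Math.max(fkMin, pkMin) + 1;
   if (overlap <= 0) {
     return 0.0; // disjoint ranges: no FK row can match a PK row
   }
   double fraction = (double) overlap / (fkMax - fkMin + 1);
   return baseSelectivity * fraction;
 }
 {code}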



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10704) Errors in Tez HashTableLoader when estimated table size is 0

2015-05-26 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560431#comment-14560431
 ] 

Alexander Pivovarov commented on HIVE-10704:


Mostafa, can you check RB link? I'm not sure it shows HIVE-10704.3.patch

 Errors in Tez HashTableLoader when estimated table size is 0
 

 Key: HIVE-10704
 URL: https://issues.apache.org/jira/browse/HIVE-10704
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Jason Dere
Assignee: Mostafa Mokhtar
 Fix For: 1.2.1

 Attachments: HIVE-10704.1.patch, HIVE-10704.2.patch, 
 HIVE-10704.3.patch


 A couple of issues:
 - If the table sizes in MapJoinOperator.getParentDataSizes() are 0 for all 
 tables, the selection of the largest small table is wrong and could pick the 
 large table (which results in an NPE)
 - The memory estimates can either divide by zero, or allocate 0 memory, if 
 the table size is 0. Try to come up with a sensible default for this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9069) Simplify filter predicates for CBO

2015-05-26 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560332#comment-14560332
 ] 

Laljo John Pullokkaran commented on HIVE-9069:
--

[~jcamachorodriguez] In extractCommonOperands, for a disjunction, if any 
operand doesn't contain any of the reduction conditions, then we can 
short-circuit and bail out.
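
A minimal sketch of that short circuit (the method and types here are 
illustrative, not the actual Hive code):
{code}
import java.util.*;

// Factor out the operands common to every disjunct of
// (A and X) or (A and Y) or ..., bailing out early.
class CommonOperands {
  static <E> List<E> extractCommonOperands(List<Set<E>> disjuncts, Set<E> candidates) {
    List<E> common = new ArrayList<>(candidates);
    for (Set<E> disjunct : disjuncts) {
      common.retainAll(disjunct);       // keep only operands this disjunct also has
      if (common.isEmpty()) {
        return Collections.emptyList(); // short circuit: nothing can be factored out
      }
    }
    return common;
  }
}
{code}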

 Simplify filter predicates for CBO
 --

 Key: HIVE-9069
 URL: https://issues.apache.org/jira/browse/HIVE-9069
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.14.1

 Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch, 
 HIVE-9069.03.patch, HIVE-9069.04.patch, HIVE-9069.05.patch, 
 HIVE-9069.06.patch, HIVE-9069.07.patch, HIVE-9069.08.patch, 
 HIVE-9069.08.patch, HIVE-9069.09.patch, HIVE-9069.10.patch, 
 HIVE-9069.11.patch, HIVE-9069.12.patch, HIVE-9069.13.patch, 
 HIVE-9069.14.patch, HIVE-9069.14.patch, HIVE-9069.patch


 Simplify disjunctive predicates so that they can get pushed down to the scan.
 Looks like this is still an issue; some of the filters could be pushed down 
 to the scan but are not.
 {code}
 set hive.cbo.enable=true
 set hive.stats.fetch.column.stats=true
 set hive.exec.dynamic.partition.mode=nonstrict
 set hive.tez.auto.reducer.parallelism=true
 set hive.auto.convert.join.noconditionaltask.size=32000
 set hive.exec.reducers.bytes.per.reducer=1
 set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
 set hive.support.concurrency=false
 set hive.tez.exec.print.summary=true
 explain  
 select  substr(r_reason_desc,1,20) as r
,avg(ws_quantity) wq
,avg(wr_refunded_cash) ref
,avg(wr_fee) fee
  from web_sales, web_returns, web_page, customer_demographics cd1,
   customer_demographics cd2, customer_address, date_dim, reason 
  where web_sales.ws_web_page_sk = web_page.wp_web_page_sk
and web_sales.ws_item_sk = web_returns.wr_item_sk
and web_sales.ws_order_number = web_returns.wr_order_number
and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998
and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk 
and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk
and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk
and reason.r_reason_sk = web_returns.wr_reason_sk
and
(
 (
  cd1.cd_marital_status = 'M'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = '4 yr Degree'
  and 
  cd1.cd_education_status = cd2.cd_education_status
  and
  ws_sales_price between 100.00 and 150.00
 )
or
 (
  cd1.cd_marital_status = 'D'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = 'Primary' 
  and
  cd1.cd_education_status = cd2.cd_education_status
  and
  ws_sales_price between 50.00 and 100.00
 )
or
 (
  cd1.cd_marital_status = 'U'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = 'Advanced Degree'
  and
  cd1.cd_education_status = cd2.cd_education_status
  and
  ws_sales_price between 150.00 and 200.00
 )
)
and
(
 (
  ca_country = 'United States'
  and
  ca_state in ('KY', 'GA', 'NM')
  and ws_net_profit between 100 and 200  
 )
 or
 (
  ca_country = 'United States'
  and
  ca_state in ('MT', 'OR', 'IN')
  and ws_net_profit between 150 and 300  
 )
 or
 (
  ca_country = 'United States'
  and
  ca_state in ('WI', 'MO', 'WV')
  and ws_net_profit between 50 and 250  
 )
)
 group by r_reason_desc
 order by r, wq, ref, fee
 limit 100
 OK
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 depends on stages: Stage-1
 STAGE PLANS:
   Stage: Stage-1
 Tez
   Edges:
        Map 9 <- Map 1 (BROADCAST_EDGE)
        Reducer 3 <- Map 13 (SIMPLE_EDGE), Map 2 (SIMPLE_EDGE)
        Reducer 4 <- Map 9 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE)
        Reducer 5 <- Map 14 (SIMPLE_EDGE), Reducer 4 (SIMPLE_EDGE)
        Reducer 6 <- Map 10 (SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map 12 
 (BROADCAST_EDGE), Reducer 5 (SIMPLE_EDGE)
        Reducer 7 <- Reducer 6 (SIMPLE_EDGE)
        Reducer 8 <- Reducer 7 (SIMPLE_EDGE)
   DagName: mmokhtar_2014161818_f5fd23ba-d783-4b13-8507-7faa65851798:1
   Vertices:
 Map 1 
 Map Operator Tree:
 TableScan
   alias: web_page
   filterExpr: wp_web_page_sk is not null (type: boolean)
   Statistics: Num rows: 4602 Data size: 2696178 Basic stats: 
 COMPLETE Column stats: COMPLETE
 

[jira] [Updated] (HIVE-686) add UDF substring_index

2015-05-26 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-686:
-
Attachment: HIVE-686.1.patch

patch #1
- derive substring_index from GenericUDF
- add JUnit and qtest tests

 add UDF substring_index
 ---

 Key: HIVE-686
 URL: https://issues.apache.org/jira/browse/HIVE-686
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Namit Jain
Assignee: Alexander Pivovarov
 Attachments: HIVE-686.1.patch, HIVE-686.patch, HIVE-686.patch


 SUBSTRING_INDEX(str,delim,count)
 Returns the substring from string str before count occurrences of the 
 delimiter delim. If count is positive, everything to the left of the final 
 delimiter (counting from the left) is returned. If count is negative, 
 everything to the right of the final delimiter (counting from the right) is 
 returned. SUBSTRING_INDEX() performs a case-sensitive match when searching 
 for delim.
 Examples:
 {code}
 SELECT SUBSTRING_INDEX('www.mysql.com', '.', 3);
 --www.mysql.com
 SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);
 --www.mysql
 SELECT SUBSTRING_INDEX('www.mysql.com', '.', 1);
 --www
 SELECT SUBSTRING_INDEX('www.mysql.com', '.', 0);
 --''
 SELECT SUBSTRING_INDEX('www.mysql.com', '.', -1);
 --com
 SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2);
 --mysql.com
 SELECT SUBSTRING_INDEX('www.mysql.com', '.', -3);
 --www.mysql.com
 {code}
 {code}
 --#delim does not exist in str
 SELECT SUBSTRING_INDEX('www.mysql.com', 'Q', 1);
 --www.mysql.com
 --#delim is 2 chars
 SELECT SUBSTRING_INDEX('www||mysql||com', '||', 2);
 --www||mysql
 --#delim is empty string
 SELECT SUBSTRING_INDEX('www.mysql.com', '', 2);
 --''
 --#str is empty string
 SELECT SUBSTRING_INDEX('', '.', 2);
 --''
 {code}
 {code}
 --#null params
 SELECT SUBSTRING_INDEX(null, '.', 1);
 --null
 SELECT SUBSTRING_INDEX('www.mysql.com', null, 1);
 --null
 SELECT SUBSTRING_INDEX('www.mysql.com', '.', null);
 --null
 {code}
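 
 A compact Java sketch of these semantics (illustrative only; the real 
 implementation is the GenericUDF in the attached patch):
 {code}
 static String substringIndex(String str, String delim, Integer count) {
   if (str == null || delim == null || count == null) return null;
   if (str.isEmpty() || delim.isEmpty() || count == 0) return "";
   if (count > 0) {
     int idx = -1;
     for (int i = 0; i < count; i++) {
       idx = str.indexOf(delim, idx + 1);
       if (idx < 0) return str;            // fewer than count occurrences
     }
     return str.substring(0, idx);         // everything left of the count-th delim
   }
   int idx = str.length();
   for (int i = 0; i < -count; i++) {
     idx = str.lastIndexOf(delim, idx - 1);
     if (idx < 0) return str;              // fewer than |count| occurrences
   }
   return str.substring(idx + delim.length()); // everything right of it
 }
 {code}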



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10807) Invalidate basic stats for insert queries if autogather=false

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560370#comment-14560370
 ] 

Hive QA commented on HIVE-10807:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735432/HIVE-10807.2.patch

{color:red}ERROR:{color} -1 due to 59 failed/errored test(s), 8974 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_null_element
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_multi_field_struct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_optional_elements
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_required_elements
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_single_field_struct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_structs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_unannotated_groups
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_unannotated_primitives
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_avro_array_of_primitives
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_avro_array_of_single_field_struct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_create
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_decimal1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_map_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_map_of_arrays_of_ints
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_map_of_maps
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_nested_complex
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_read_backward_compatible_files
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_schema_evolution
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_thrift_array_of_primitives
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_thrift_array_of_single_field_struct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_types
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAmbiguousSingleFieldGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAvroPrimitiveInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAvroSingleFieldGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testHiveRequiredGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testMultiFieldGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testNewOptionalGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testNewRequiredGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testThriftPrimitiveInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testThriftSingleFieldGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testUnannotatedListOfGroups
org.apache.hadoop.hive.ql.io.parquet.TestDataWritableWriter.testSimpleType
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testDoubleMapWithStructValue
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testMapWithComplexKey
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testNestedMap
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapOfOptionalArray
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapOfOptionalIntArray
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapOptionalPrimitive
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapRequiredPrimitive
org.apache.hadoop.hive.ql.io.parquet.TestParquetSerDe.testParquetHiveSerDe
org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testEmptyContainer
org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testNullContainer
org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testRegularMap
org.apache.hadoop.hive.ql.io.parquet.serde.TestDeepParquetHiveMapInspector.testEmptyContainer
org.apache.hadoop.hive.ql.io.parquet.serde.TestDeepParquetHiveMapInspector.testNullContainer
org.apache.hadoop.hive.ql.io.parquet.serde.TestDeepParquetHiveMapInspector.testRegularMap

[jira] [Updated] (HIVE-10807) Invalidate basic stats for insert queries if autogather=false

2015-05-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10807:

Attachment: HIVE-10807.3.patch

 Invalidate basic stats for insert queries if autogather=false
 -

 Key: HIVE-10807
 URL: https://issues.apache.org/jira/browse/HIVE-10807
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10807.2.patch, HIVE-10807.3.patch, HIVE-10807.patch


 Setting stats.autogather=false leads to incorrect basic stats in the case of 
 insert statements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.

2015-05-26 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560403#comment-14560403
 ] 

Gopal V commented on HIVE-10716:


The easiest fix to the problem seems to be an additional filter expr to produce 
an AND()
{code}
hive> explain select avg(ss_sold_date_sk) from store_sales where (case 
ss_sold_date when '1998-01-02' then 1 else null end)=1;

 Map Operator Tree:
TableScan
  alias: store_sales
  filterExpr: CASE (ss_sold_date) WHEN ('1998-01-02') THEN 
(true) ELSE (null) END (type: int)
  Statistics: Num rows: 2474913 Data size: 9899654 Basic stats: 
COMPLETE Column stats: COMPLETE
{code}

vs

{code}
hive> explain select avg(ss_sold_date_sk) from store_sales where (case 
ss_sold_date when '1998-01-02' then 1 else null end)=1 and ss_sold_time_sk > 0;
Map Operator Tree:
TableScan
  alias: store_sales
          filterExpr: ((ss_sold_date = '1998-01-02') and 
(ss_sold_time_sk > 0)) (type: boolean)
  Statistics: Num rows: 1237456 Data size: 9899654 Basic stats: 
COMPLETE Column stats: COMPLETE
  Filter Operator
            predicate: (ss_sold_time_sk > 0) (type: boolean)
{code}

[~ashutoshc]: any idea why the extra filter helps in fixing the PPD case?
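
For reference, the folding under discussion rewrites the CASE-equality into a 
plain comparison. A hypothetical sketch of the rule (Hive's actual logic lives 
in its constant-folding optimizer, not in this string form):
{code}
// (CASE col WHEN key THEN thenVal ELSE NULL END) = rhs reduces to
// (col = key) when thenVal equals rhs, because the ELSE branch yields NULL
// and NULL = rhs is never true in a filter; otherwise no row can qualify.
static String foldCaseEquals(String col, String key, int thenVal, int rhs) {
  return (thenVal == rhs) ? "(" + col + " = '" + key + "')" : "false";
}
{code}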

 Fold case/when udf for expression involving nulls in filter operator.
 -

 Key: HIVE-10716
 URL: https://issues.apache.org/jira/browse/HIVE-10716
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 1.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10716.patch


 From HIVE-10636 comments, more folding is possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10830) First column of a Hive table created with LazyBinaryColumnarSerDe is not read properly

2015-05-26 Thread lovekesh bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lovekesh bansal updated HIVE-10830:
---
Description: 
1. create external table platdev.table_target ( id INT, message String, state 
string, date string ) partitioned by (country string) row format delimited 
fields terminated by ',' stored as RCFILE location 
'/user/nikgupta/table_target' ;

2. insert overwrite table platdev.table_target partition(country) select case 
when id=13 then 15 else id end,message,state,date,country from 
platdev.table_base2 where id between 13 and 16; \n

say now my table is written by default using LazyBinaryColumnarSerDe and has 
the following data:
15  thirteen  delhi    2-12-2014   india
14  fourteen  delhi    1-1-2014    india
15  fifteen   florida  1-1-2014    us
16  sixteen   florida  2-12-2014   us

Now If I try to read the data with a mapreduce program, with map function as 
given below:

public void map(LongWritable key, BytesRefArrayWritable val, Context context)
    throws IOException, InterruptedException {
  // Walk the columns of this RCFile row and print each cell's bytes as text.
  for (int i = 0; i < val.size(); i++) {
    BytesRefWritable bytesRefread = val.get(i);
    byte[] currentCell = Arrays.copyOfRange(bytesRefread.getData(),
        bytesRefread.getStart(), bytesRefread.getStart() + bytesRefread.getLength());
    Text currentCellStr = new Text(currentCell);
    System.out.println("rowText=" + currentCellStr);
  }
  context.write(NullWritable.get(), val); // emit the row unchanged
}


and set the following job configuration parameters:

job.setInputFormatClass(RCFileMapReduceInputFormat.class);
job.setOutputFormatClass(RCFileMapReduceOutputFormat.class);
jobConf.setInt(RCFile.COLUMN_NUMBER_CONF_STR, 5);

The output shown is as follows: (LazyBinaryColumnarSerDe)
rowText=
rowText=fifteen
rowText=goa
rowText=2-2-
rowText=us

But exactly the same case using the (ColumnarSerDe) explicitly in the table 
definition would give the following output:
rowText=1
rowText=fifteen
rowText=goa
rowText=2-2-
rowText=us

The point is that the first column value is missing in the case of 
LazyBinaryColumnarSerDe.

  was:
1. create external table platdev.table_target ( id INT, message String, state 
string, date string ) partitioned by (country string) row format delimited 
fields terminated by ',' stored as RCFILE location 
'/user/nikgupta/table_target' ;

2. insert overwrite table platdev.table_target partition(country) select case 
when id=13 then 15 else id end,message,state,date,country from 
platdev.table_base2 where id between 13 and 16; \n

say now my table has the following data:
15  thirteen  delhi    2-12-2014   india
14  fourteen  delhi    1-1-2014    india
15  fifteen   florida  1-1-2014    us
16  sixteen   florida  2-12-2014   us

Now If I try to read the data with a mapreduce program, with map function as 
given below:

public void map(LongWritable key, BytesRefArrayWritable val, Context context)
    throws IOException, InterruptedException {
  // Walk the columns of this RCFile row and print each cell's bytes as text.
  for (int i = 0; i < val.size(); i++) {
    BytesRefWritable bytesRefread = val.get(i);
    byte[] currentCell = Arrays.copyOfRange(bytesRefread.getData(),
        bytesRefread.getStart(), bytesRefread.getStart() + bytesRefread.getLength());
    Text currentCellStr = new Text(currentCell);
    System.out.println("rowText=" + currentCellStr);
  }
  context.write(NullWritable.get(), val); // emit the row unchanged
}


and set the following job configuration parameters:

job.setInputFormatClass(RCFileMapReduceInputFormat.class);
job.setOutputFormatClass(RCFileMapReduceOutputFormat.class);
jobConf.setInt(RCFile.COLUMN_NUMBER_CONF_STR, 5);

The output shown is as follows:
rowText=
rowText=fifteen
rowText=goa
rowText=2-2-
rowText=us

But exactly the same case using the ColumnarSerDe explicitly in the table 
definition would give the following output:
rowText=1
rowText=fifteen
rowText=goa
rowText=2-2-
rowText=us

The point is that the first column value is missing. 


 First column of a Hive table created with LazyBinaryColumnarSerDe is not read 
 properly
 --

 Key: HIVE-10830
 URL: https://issues.apache.org/jira/browse/HIVE-10830
 Project: Hive
  Issue Type: Bug
Reporter: lovekesh bansal

 1. create external table platdev.table_target ( id INT, message String, state 
 string, date string ) partitioned by (country string) row format delimited 
 fields terminated by ',' stored as RCFILE location 
 '/user/nikgupta/table_target' ;
 2. insert overwrite table platdev.table_target partition(country) select case 
 when id=13 then 15 else id end,message,state,date,country from 
 platdev.table_base2 where id between 13 and 16; \n
 say now my table is written by 

[jira] [Commented] (HIVE-10819) SearchArgumentImpl for Timestamp is broken by HIVE-10286

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560438#comment-14560438
 ] 

Hive QA commented on HIVE-10819:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735439/HIVE-10819.3.patch

{color:red}ERROR:{color} -1 due to 59 failed/errored test(s), 8974 tests 
executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_null_element
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_multi_field_struct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_optional_elements
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_required_elements
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_single_field_struct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_structs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_unannotated_groups
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_unannotated_primitives
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_avro_array_of_primitives
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_avro_array_of_single_field_struct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_create
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_decimal1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_map_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_map_of_arrays_of_ints
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_map_of_maps
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_nested_complex
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_read_backward_compatible_files
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_schema_evolution
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_thrift_array_of_primitives
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_thrift_array_of_single_field_struct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_types
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAmbiguousSingleFieldGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAvroPrimitiveInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAvroSingleFieldGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testHiveRequiredGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testMultiFieldGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testNewOptionalGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testNewRequiredGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testThriftPrimitiveInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testThriftSingleFieldGroupInList
org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testUnannotatedListOfGroups
org.apache.hadoop.hive.ql.io.parquet.TestDataWritableWriter.testSimpleType
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testDoubleMapWithStructValue
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testMapWithComplexKey
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testNestedMap
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapOfOptionalArray
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapOfOptionalIntArray
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapOptionalPrimitive
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapRequiredPrimitive
org.apache.hadoop.hive.ql.io.parquet.TestParquetSerDe.testParquetHiveSerDe
org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testEmptyContainer
org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testNullContainer
org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testRegularMap
org.apache.hadoop.hive.ql.io.parquet.serde.TestDeepParquetHiveMapInspector.testEmptyContainer
org.apache.hadoop.hive.ql.io.parquet.serde.TestDeepParquetHiveMapInspector.testNullContainer
org.apache.hadoop.hive.ql.io.parquet.serde.TestDeepParquetHiveMapInspector.testRegularMap

[jira] [Resolved] (HIVE-10813) Fix current test failures after HIVE-8769

2015-05-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-10813.
-
   Resolution: Fixed
Fix Version/s: 1.3.0

Fixed by HIVE-10812

 Fix current test failures after HIVE-8769
 -

 Key: HIVE-10813
 URL: https://issues.apache.org/jira/browse/HIVE-10813
 Project: Hive
  Issue Type: Bug
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Fix For: 1.3.0


 We fixed the stats annotation in HIVE-8769. However, there are some newly 
 committed test cases (e.g., udf_sha1.q) that are not covered by the patch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.

2015-05-26 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560400#comment-14560400
 ] 

Gopal V commented on HIVE-10716:


[~ashutoshc]: LGTM - +1 for the count(1) case, but it looks really odd that the 
{{TableScan::filterExpr}} is not getting folded for this.

TableScan FilterExpr is populated before this folding happens, so it might just 
be an optimization ordering issue?

{code}
hive> explain select count(1) from store_sales where (case ss_sold_date when 
'x' then 1 else null end)=1;

STAGE PLANS:
  Stage: Stage-1
Tez
  Edges:
        Reducer 2 <- Map 1 (SIMPLE_EDGE)
  DagName: gopal_20150526214205_80c41d84-1694-47e9-ab24-144f8007b187:13
  Vertices:
Map 1 
Map Operator Tree:
TableScan
  alias: store_sales
  filterExpr: CASE (ss_sold_date) WHEN ('x') THEN (true) ELSE 
(null) END (type: int)
  Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL 
Column stats: COMPLETE
  Filter Operator
predicate: (ss_sold_date = 'x') (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL 
Column stats: COMPLETE
Select Operator
  Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL 
Column stats: COMPLETE
  Group By Operator
aggregations: count(1)
mode: hash
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 93 Basic stats: 
COMPLETE Column stats: COMPLETE
Reduce Output Operator
  sort order: 
  Statistics: Num rows: 1 Data size: 93 Basic stats: 
COMPLETE Column stats: COMPLETE
  value expressions: _col0 (type: bigint)
Execution mode: vectorized
Reducer 2 
Reduce Operator Tree:
  Group By Operator
aggregations: count(VALUE._col0)
{code}

 Fold case/when udf for expression involving nulls in filter operator.
 -

 Key: HIVE-10716
 URL: https://issues.apache.org/jira/browse/HIVE-10716
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 1.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-10716.patch


 From HIVE-10636 comments, more folding is possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.

2015-05-26 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10716:

Affects Version/s: (was: 1.3.0)
   1.2.0

 Fold case/when udf for expression involving nulls in filter operator.
 -

 Key: HIVE-10716
 URL: https://issues.apache.org/jira/browse/HIVE-10716
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 1.2.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 1.2.1

 Attachments: HIVE-10716.patch


 From HIVE-10636 comments, more folding is possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.

2015-05-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560457#comment-14560457
 ] 

Ashutosh Chauhan commented on HIVE-10716:
-

[~gopalv] I need to verify, but my guess is that 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java#L80
 is coming into play here.

 Fold case/when udf for expression involving nulls in filter operator.
 -

 Key: HIVE-10716
 URL: https://issues.apache.org/jira/browse/HIVE-10716
 Project: Hive
  Issue Type: New Feature
  Components: Logical Optimizer
Affects Versions: 1.2.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 1.2.1

 Attachments: HIVE-10716.patch


 From HIVE-10636 comments, more folding is possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10793) Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront

2015-05-26 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560459#comment-14560459
 ] 

Lefty Leverenz commented on HIVE-10793:
---

Doc note:  This changes the default value of 
*hive.mapjoin.optimized.hashtable.wbsize* so the wiki needs to be updated (with 
version information).

* [Configuration Properties -- hive.mapjoin.optimized.hashtable.wbsize | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.mapjoin.optimized.hashtable.wbsize]

The patch also makes minor changes to the definitions of 
*hive.mapjoin.hybridgrace.minwbsize* and 
*hive.mapjoin.hybridgrace.minnumpartitions* which do not need any doc changes.

 Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront
 

 Key: HIVE-10793
 URL: https://issues.apache.org/jira/browse/HIVE-10793
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.3.0

 Attachments: HIVE-10793.1.patch, HIVE-10793.2.patch


 HybridHashTableContainer will allocate memory based on the estimate, which 
 means that if the actual size is less than the estimate, the allocated 
 memory won't be used.
 The number of partitions is calculated based on the estimated data size
 {code}
 numPartitions = calcNumPartitions(memoryThreshold, estimatedTableSize, 
 minNumParts, minWbSize,
   nwayConf);
 {code}
 Then based on number of partitions writeBufferSize is set
 {code}
 writeBufferSize = (int)(estimatedTableSize / numPartitions);
 {code}
 Each hash partition will allocate 1 WriteBuffer, with no further allocation 
 if the estimated data size is correct.
 The suggested solution is to reduce writeBufferSize by a factor such that 
 only X% of the memory is preallocated.
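 
 A hypothetical sketch of that suggestion (the fraction parameter and the 
 clamping are illustrative, not from the attached patches):
 {code}
 // Preallocate only a fraction of the per-partition estimate; the write
 // buffers can still grow later if the estimate turns out to be accurate.
 static int initialWriteBufferSize(long estimatedTableSize, int numPartitions,
     double preallocFraction /* e.g. 0.10 for 10% */, int minWbSize) {
   long perPartition = estimatedTableSize / numPartitions;
   long scaled = (long) (perPartition * preallocFraction);
   return (int) Math.max(minWbSize, Math.min(scaled, Integer.MAX_VALUE));
 }
 {code}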



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10793) Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront

2015-05-26 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10793:
--
Labels: TODOC1.3  (was: )

 Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront
 

 Key: HIVE-10793
 URL: https://issues.apache.org/jira/browse/HIVE-10793
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
  Labels: TODOC1.3
 Fix For: 1.3.0

 Attachments: HIVE-10793.1.patch, HIVE-10793.2.patch


 HybridHashTableContainer will allocate memory based on the estimate, which 
 means that if the actual size is less than the estimate, the allocated 
 memory won't be used.
 The number of partitions is calculated based on the estimated data size
 {code}
 numPartitions = calcNumPartitions(memoryThreshold, estimatedTableSize, 
 minNumParts, minWbSize,
   nwayConf);
 {code}
 Then based on number of partitions writeBufferSize is set
 {code}
 writeBufferSize = (int)(estimatedTableSize / numPartitions);
 {code}
 Each hash partition will allocate 1 WriteBuffer, with no further allocation 
 if the estimated data size is correct.
 The suggested solution is to reduce writeBufferSize by a factor such that 
 only X% of the memory is preallocated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10775) Frequent calls to printStackTrace() obscuring legitimate problems

2015-05-26 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-10775:

Issue Type: Improvement  (was: Test)

 Frequent calls to printStackTrace() obscuring legitimate problems
 -

 Key: HIVE-10775
 URL: https://issues.apache.org/jira/browse/HIVE-10775
 Project: Hive
  Issue Type: Improvement
  Components: Metastore, Query Processor
Reporter: Andrew Cowie
Assignee: Andrew Cowie
Priority: Minor
 Attachments: HIVE-10775.1.patch


 When running test suites built on top of libraries that build on top of ... 
 that use Hive, the signal to noise ratio with exceptions flying past is 
 appalling. Most of this is down to calls to printStackTrace() embedded in 
 this library. HIVE-7697 showed someone cleaning that up and replacing with 
 logging the exception instead. That seems wise (logging can be redirected by 
 the calling test suite).
 So, if you don't object, I'll hunt down the calls to printStackTrace() and 
 replace them with LOG.warn() instead. I'm about half way through the patch 
 now.
 AfC
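 
 The change being described amounts to the following before/after (a sketch 
 using the commons-logging style Hive used at the time; the class and method 
 names are illustrative):
 {code}
 import org.apache.commons.logging.Log;
 import org.apache.commons.logging.LogFactory;
 
 class Example {
   private static final Log LOG = LogFactory.getLog(Example.class);
 
   // Before: the stack trace goes straight to stderr, outside any log config.
   void closeBefore(java.io.Closeable c) {
     try { c.close(); } catch (Exception e) { e.printStackTrace(); }
   }
 
   // After: routed through the logger, so callers can redirect or silence it.
   void closeAfter(java.io.Closeable c) {
     try { c.close(); } catch (Exception e) { LOG.warn("close failed", e); }
   }
 }
 {code}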



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10775) Frequent calls to printStackTrace() obscuring legitimate problems

2015-05-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558713#comment-14558713
 ] 

Ashutosh Chauhan commented on HIVE-10775:
-

yeah.. this is certainly useful. Thanks [~afcowie] for picking up this task.

 Frequent calls to printStackTrace() obscuring legitimate problems
 -

 Key: HIVE-10775
 URL: https://issues.apache.org/jira/browse/HIVE-10775
 Project: Hive
  Issue Type: Test
  Components: Metastore, Query Processor
Reporter: Andrew Cowie
Assignee: Andrew Cowie
Priority: Minor
 Attachments: HIVE-10775.1.patch


 When running test suites built on top of libraries that build on top of ... 
 that use Hive, the signal to noise ratio with exceptions flying past is 
 appalling. Most of this is down to calls to printStackTrace() embedded in 
 this library. HIVE-7697 showed someone cleaning that up and replacing with 
 logging the exception instead. That seems wise (logging can be redirected by 
 the calling test suite).
 So, if you don't object, I'll hunt down the calls to printStackTrace() and 
 replace them with LOG.warn() instead. I'm about half way through the patch 
 now.
 AfC



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10817) Blacklist For Bad MetaStore

2015-05-26 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-10817:
-
Attachment: HIVE-10817

 Blacklist For Bad MetaStore
 ---

 Key: HIVE-10817
 URL: https://issues.apache.org/jira/browse/HIVE-10817
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, Metastore
Affects Versions: 1.2.0
Reporter: Nemon Lou
Assignee: Nemon Lou
 Attachments: HIVE-10817


 During a reliability test, when the machine hosting one of the MetaStores 
 powered down, HiveServer2 never submitted jobs to YARN again.
 There were 100 JDBC clients (Beeline) running concurrently, and all 
 100 JDBC clients hung.
 After checking HiveServer2's thread stacks, I found that most of the threads 
 were waiting to lock AbstractService, while the one holding it was trying to 
 connect to the bad MetaStore that had been powered down. When the thread 
 holding this lock finally got a SocketTimeoutException and released the 
 lock, another thread would take the lock and again get stuck until the 
 socket timed out.
 Adding a new blacklist mechanism finally solved this issue.
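 
 A minimal sketch of such a blacklist with time-based expiry (the names and 
 the cooldown policy are hypothetical, not from the attached patch):
 {code}
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 
 // Skip URIs that recently failed; retry them after a cooldown.
 class MetaStoreBlacklist {
   private final Map<String, Long> failedAt = new ConcurrentHashMap<>();
   private final long cooldownMs;
 
   MetaStoreBlacklist(long cooldownMs) { this.cooldownMs = cooldownMs; }
 
   void markBad(String uri) { failedAt.put(uri, System.currentTimeMillis()); }
 
   boolean isUsable(String uri) {
     Long t = failedAt.get(uri);
     if (t == null) return true;
     if (System.currentTimeMillis() - t > cooldownMs) {
       failedAt.remove(uri); // cooldown elapsed: give the URI another chance
       return true;
     }
     return false;           // still blacklisted; try the next URI instead
   }
 }
 {code}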



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6791) Support variable substition for Beeline shell command

2015-05-26 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558832#comment-14558832
 ] 

Ferdinand Xu commented on HIVE-6791:


Hi [~xuefuz], I am working on the Hive CLI replacement work, and this seems 
to be a gap between the CLI and Beeline. Do you want to work on this JIRA? If 
not, I'd like to pick it up. I have a basic idea for achieving the goal: we 
can add a new command processor that adds new Hive variables to the 
hiveVariables map in SessionState. Any thoughts about it? Thank you!
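
For reference, the substitution itself is simple; a bare-bones sketch of 
resolving {{hivevar}} references against SessionState's variable map (the 
class name here is made up, and the real CLI logic also handles the 
hiveconf/system/env namespaces):
{code}
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VarSubstSketch {
  private static final Pattern VAR = Pattern.compile("\\$\\{hivevar:([\\w.]+)\\}");

  // Replace every ${hivevar:name} with its value; unknown names are left as-is.
  static String substitute(String cmd, Map<String, String> hiveVariables) {
    Matcher m = VAR.matcher(cmd);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String val = hiveVariables.getOrDefault(m.group(1), m.group(0));
      m.appendReplacement(out, Matcher.quoteReplacement(val));
    }
    return m.appendTail(out).toString();
  }
}
{code}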

 Support variable substition for Beeline shell command
 -

 Key: HIVE-6791
 URL: https://issues.apache.org/jira/browse/HIVE-6791
 Project: Hive
  Issue Type: New Feature
  Components: CLI, Clients
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang

 A follow-up task from HIVE-6694. Similar to HIVE-6570.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10819) SearchArgumentImpl for Timestamp is broken by HIVE-10286

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558759#comment-14558759
 ] 

Hive QA commented on HIVE-10819:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735252/HIVE-10819.2.patch

{color:red}ERROR:{color} -1 due to 638 failed/errored test(s), 8972 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_deserialize_map_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5

[jira] [Updated] (HIVE-10815) Let HiveMetaStoreClient Choose MetaStore Randomly

2015-05-26 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-10815:
-
Attachment: (was: HIVE-10815.patch)

 Let HiveMetaStoreClient Choose MetaStore Randomly
 -

 Key: HIVE-10815
 URL: https://issues.apache.org/jira/browse/HIVE-10815
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, Metastore
Affects Versions: 1.2.0
Reporter: Nemon Lou
Assignee: Nemon Lou
 Attachments: HIVE-10815.patch


 Currently HiveMetaStoreClient uses a fixed order to choose MetaStore URIs 
 when multiple metastores are configured.
  Choosing a MetaStore randomly would be good for load balancing.
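 
 The change amounts to shuffling the configured URI list before the client 
 walks it; a hypothetical sketch (not the attached patch):
 {code}
 import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.Collections;
 import java.util.List;
 
 // Shuffle hive.metastore.uris so each client starts its connection
 // attempts at a random server, spreading load across metastores.
 static List<String> shuffledMetastoreUris(String urisConf) {
   List<String> uris = new ArrayList<>(Arrays.asList(urisConf.split(",")));
   Collections.shuffle(uris);
   return uris;
 }
 {code}
 The client would then walk this list in order, falling back to the next URI 
 on connection failure.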



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10815) Let HiveMetaStoreClient Choose MetaStore Randomly

2015-05-26 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-10815:
-
Attachment: HIVE-10815.patch

 Let HiveMetaStoreClient Choose MetaStore Randomly
 -

 Key: HIVE-10815
 URL: https://issues.apache.org/jira/browse/HIVE-10815
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, Metastore
Affects Versions: 1.2.0
Reporter: Nemon Lou
Assignee: Nemon Lou
 Attachments: HIVE-10815.patch


 Currently HiveMetaStoreClient uses a fixed order to choose MetaStore URIs 
 when multiple metastores are configured.
  Choosing a MetaStore randomly would be good for load balancing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10775) Frequent calls to printStackTrace() obscuring legitimate problems

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558855#comment-14558855
 ] 

Hive QA commented on HIVE-10775:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735230/HIVE-10775.1.patch

{color:red}ERROR:{color} -1 due to 639 failed/errored test(s), 8971 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_deserialize_map_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5

[jira] [Updated] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.

2015-05-26 Thread Elliot West (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliot West updated HIVE-10165:
---
Attachment: HIVE-10165.5.patch

Attached [^HIVE-10165.5.patch] to fix a failing test of mine. What should I do 
with the tests that are failing in other parts of the Hive project?

 Improve hive-hcatalog-streaming extensibility and support updates and deletes.
 --

 Key: HIVE-10165
 URL: https://issues.apache.org/jira/browse/HIVE-10165
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Affects Versions: 1.2.0
Reporter: Elliot West
Assignee: Elliot West
  Labels: streaming_api
 Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, 
 HIVE-10165.5.patch


 h3. Overview
 I'd like to extend the 
 [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest]
  API so that it also supports the writing of record updates and deletes in 
 addition to the already supported inserts.
 h3. Motivation
 We have many Hadoop processes outside of Hive that merge changed facts into 
 existing datasets. Traditionally we achieve this by: reading in a 
 ground-truth dataset and a modified dataset, grouping by a key, sorting by a 
 sequence and then applying a function to determine inserted, updated, and 
 deleted rows. However, in our current scheme we must rewrite all partitions 
 that may potentially contain changes. In practice the number of mutated 
 records is very small when compared with the records contained in a 
 partition. This approach results in a number of operational issues:
 * Excessive amount of write activity required for small data changes.
 * Downstream applications cannot robustly read these datasets while they are 
 being updated.
 * Due to the scale of the updates (hundreds of partitions), the scope for 
 contention is high. 
 I believe we can address this problem by instead writing only the changed 
 records to a Hive transactional table. This should drastically reduce the 
 amount of data that we need to write and also provide a means for managing 
 concurrent access to the data. Our existing merge processes can read and 
 retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to 
 an updated form of the hive-hcatalog-streaming API which will then have the 
 required data to perform an update or insert in a transactional manner. 
 h3. Benefits
 * Enables the creation of large-scale dataset merge processes  
 * Opens up Hive transactional functionality in an accessible manner to 
 processes that operate outside of Hive.
 h3. Implementation
 Our changes do not break the existing API contracts. Instead our approach has 
 been to consider the functionality offered by the existing API and our 
 proposed API as fulfilling separate and distinct use-cases. The existing API 
 is primarily focused on the task of continuously writing large volumes of new 
 data into a Hive table for near-immediate analysis. Our use-case however, is 
 concerned more with the frequent but not continuous ingestion of mutations to 
 a Hive table from some ETL merge process. Consequently we feel it is 
 justifiable to add our new functionality via an alternative set of public 
 interfaces and leave the existing API as is. This keeps both APIs clean and 
 focused at the expense of presenting additional options to potential users. 
 Wherever possible, shared implementation concerns have been factored out into 
 abstract base classes that are open to third-party extension. A detailed 
 breakdown of the changes is as follows:
 * We've introduced a public {{RecordMutator}} interface whose purpose is to 
 expose insert/update/delete operations to the user. This is a counterpart to 
 the write-only {{RecordWriter}}. We've also factored out life-cycle methods 
 common to these two interfaces into a super {{RecordOperationWriter}} 
 interface (see the sketch after this list). Note that the row representation 
 has been changed from {{byte[]}} to {{Object}}. Within our data processing 
 jobs our records are often available in a strongly typed and decoded form 
 such as a POJO or a Tuple object. Therefore it seems to make sense that we 
 are able to pass this through to the {{OrcRecordUpdater}} without having to 
 go through a {{byte[]}} encoding step. This of course still allows users to 
 use {{byte[]}} if they wish.
 * The introduction of {{RecordMutator}} requires that insert/update/delete 
 operations are then also exposed on a {{TransactionBatch}} type. We've done 
 this with the introduction of a public {{MutatorTransactionBatch}} interface 
 which is a counterpart to the write-only {{TransactionBatch}}. We've also 
 factored out life-cycle methods common to these two interfaces into a super 
 {{BaseTransactionBatch}} interface. 
 * 
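 
 A hedged sketch of how the interfaces described above might fit together, 
 assembled purely from this description (the actual patch may well differ in 
 names and signatures):
 {code}
 import java.io.IOException;
 
 // Hypothetical sketch only -- inferred from the description, not the patch.
 // (Each interface would live in its own file; shown together for brevity.)
 
 // Life-cycle methods shared by the write-only and mutating interfaces.
 interface RecordOperationWriter {
   void flush() throws IOException;
   void close() throws IOException;
 }
 
 // Mutation counterpart to the write-only RecordWriter: rows are Objects
 // (POJO/Tuple) rather than byte[], and updates/deletes are expected to carry
 // the row's RecordIdentifier through to the OrcRecordUpdater.
 interface RecordMutator extends RecordOperationWriter {
   void insert(Object record) throws IOException;
   void update(Object record) throws IOException;
   void delete(Object record) throws IOException;
 }
 {code}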

[jira] [Updated] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases

2015-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10811:
---
Attachment: HIVE-10811.01.patch

 RelFieldTrimmer throws NoSuchElementException in some cases
 ---

 Key: HIVE-10811
 URL: https://issues.apache.org/jira/browse/HIVE-10811
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-10811.01.patch, HIVE-10811.patch


 RelFieldTrimmer runs into NoSuchElementException in some cases.
 Stack trace:
 {noformat}
 Exception in thread "main" java.lang.AssertionError: Internal error: While 
 invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
   at org.apache.calcite.util.Util.newInternal(Util.java:743)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:768)
   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109)
   at 
 org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:730)
   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145)
   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:607)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:244)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10048)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:207)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:536)
   ... 32 more
 Caused by: java.lang.AssertionError: Internal error: While invoking method 
 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
   at org.apache.calcite.util.Util.newInternal(Util.java:743)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
   at 
 

[jira] [Updated] (HIVE-9069) Simplify filter predicates for CBO

2015-05-26 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9069:
--
Attachment: HIVE-9069.14.patch

 Simplify filter predicates for CBO
 --

 Key: HIVE-9069
 URL: https://issues.apache.org/jira/browse/HIVE-9069
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.14.1

 Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch, 
 HIVE-9069.03.patch, HIVE-9069.04.patch, HIVE-9069.05.patch, 
 HIVE-9069.06.patch, HIVE-9069.07.patch, HIVE-9069.08.patch, 
 HIVE-9069.08.patch, HIVE-9069.09.patch, HIVE-9069.10.patch, 
 HIVE-9069.11.patch, HIVE-9069.12.patch, HIVE-9069.13.patch, 
 HIVE-9069.14.patch, HIVE-9069.patch


 Simplify disjunctive predicates so that they can get pushed down to the scan.
 Looks like this is still an issue; some of the filters could be pushed down 
 to the scan, as the sketch below illustrates.
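 A hedged sketch of the intended rewrite, using the predicate shared by the 
 branches of the query below (the table name {{t}} is hypothetical):
 {code}
 -- before: the conjunct shared by every disjunct is trapped inside the OR
 select * from t
 where (ca_country = 'United States' and ca_state in ('KY', 'GA', 'NM'))
    or (ca_country = 'United States' and ca_state in ('MT', 'OR', 'IN'));
 
 -- after: factoring out the common predicate lets it reach the table scan
 select * from t
 where ca_country = 'United States'
   and (ca_state in ('KY', 'GA', 'NM') or ca_state in ('MT', 'OR', 'IN'));
 {code}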
 {code}
 set hive.cbo.enable=true
 set hive.stats.fetch.column.stats=true
 set hive.exec.dynamic.partition.mode=nonstrict
 set hive.tez.auto.reducer.parallelism=true
 set hive.auto.convert.join.noconditionaltask.size=32000
 set hive.exec.reducers.bytes.per.reducer=1
 set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
 set hive.support.concurrency=false
 set hive.tez.exec.print.summary=true
 explain  
 select  substr(r_reason_desc,1,20) as r
,avg(ws_quantity) wq
,avg(wr_refunded_cash) ref
,avg(wr_fee) fee
  from web_sales, web_returns, web_page, customer_demographics cd1,
   customer_demographics cd2, customer_address, date_dim, reason 
  where web_sales.ws_web_page_sk = web_page.wp_web_page_sk
and web_sales.ws_item_sk = web_returns.wr_item_sk
and web_sales.ws_order_number = web_returns.wr_order_number
and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998
and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk 
and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk
and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk
and reason.r_reason_sk = web_returns.wr_reason_sk
and
(
 (
  cd1.cd_marital_status = 'M'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = '4 yr Degree'
  and 
  cd1.cd_education_status = cd2.cd_education_status
  and
  ws_sales_price between 100.00 and 150.00
 )
or
 (
  cd1.cd_marital_status = 'D'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = 'Primary' 
  and
  cd1.cd_education_status = cd2.cd_education_status
  and
  ws_sales_price between 50.00 and 100.00
 )
or
 (
  cd1.cd_marital_status = 'U'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = 'Advanced Degree'
  and
  cd1.cd_education_status = cd2.cd_education_status
  and
  ws_sales_price between 150.00 and 200.00
 )
)
and
(
 (
  ca_country = 'United States'
  and
  ca_state in ('KY', 'GA', 'NM')
  and ws_net_profit between 100 and 200  
 )
 or
 (
  ca_country = 'United States'
  and
  ca_state in ('MT', 'OR', 'IN')
  and ws_net_profit between 150 and 300  
 )
 or
 (
  ca_country = 'United States'
  and
  ca_state in ('WI', 'MO', 'WV')
  and ws_net_profit between 50 and 250  
 )
)
 group by r_reason_desc
 order by r, wq, ref, fee
 limit 100
 OK
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 depends on stages: Stage-1
 STAGE PLANS:
   Stage: Stage-1
 Tez
   Edges:
 Map 9 - Map 1 (BROADCAST_EDGE)
 Reducer 3 - Map 13 (SIMPLE_EDGE), Map 2 (SIMPLE_EDGE)
 Reducer 4 - Map 9 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE)
 Reducer 5 - Map 14 (SIMPLE_EDGE), Reducer 4 (SIMPLE_EDGE)
 Reducer 6 - Map 10 (SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map 12 
 (BROADCAST_EDGE), Reducer 5 (SIMPLE_EDGE)
 Reducer 7 - Reducer 6 (SIMPLE_EDGE)
 Reducer 8 - Reducer 7 (SIMPLE_EDGE)
   DagName: mmokhtar_2014161818_f5fd23ba-d783-4b13-8507-7faa65851798:1
   Vertices:
 Map 1 
 Map Operator Tree:
 TableScan
   alias: web_page
   filterExpr: wp_web_page_sk is not null (type: boolean)
   Statistics: Num rows: 4602 Data size: 2696178 Basic stats: 
 COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: wp_web_page_sk is not null (type: boolean)
 Statistics: Num rows: 4602 Data size: 18408 Basic stats: 
 COMPLETE Column stats: 

[jira] [Updated] (HIVE-10792) PPD leads to wrong answer when mapper scans the same table with multiple aliases

2015-05-26 Thread Dayue Gao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dayue Gao updated HIVE-10792:
-
Attachment: HIVE-10792.2.patch

Generated the patch against the master branch instead of the old trunk branch, 
and added 2 test cases.

 PPD leads to wrong answer when mapper scans the same table with multiple 
 aliases
 

 Key: HIVE-10792
 URL: https://issues.apache.org/jira/browse/HIVE-10792
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Query Processor
Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0, 1.1.0, 1.2.1
Reporter: Dayue Gao
Assignee: Dayue Gao
Priority: Critical
 Fix For: 1.2.1

 Attachments: HIVE-10792.1.patch, HIVE-10792.2.patch, 
 HIVE-10792.test.sql


 Here are the steps to reproduce the bug.
 First of all, prepare a simple ORC table with one row:
 {code}
 create table test_orc (c0 int, c1 int) stored as ORC;
 {code}
 Table: test_orc
 ||c0||c1||
 |0|1|
 The following SQL gets an empty result, which is not expected:
 {code}
 select * from test_orc t1
 union all
 select * from test_orc t2
 where t2.c0 = 1
 {code}
 Self join is also broken
 {code}
 set hive.auto.convert.join=false; -- force common join
 select * from test_orc t1
 left outer join test_orc t2 on (t1.c0=t2.c0 and t2.c1=0);
 {code}
 It gets an empty result while the expected answer is
 ||t1.c0||t1.c1||t2.c0||t2.c1||
 |0|1|NULL|NULL|
 In these cases, we push down predicates into OrcInputFormat. As a result, the 
 TableScanOperator for t1 can't receive its rows. A hedged workaround sketch 
 follows.
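 A possible workaround sketch, assuming storage-level predicate pushdown is 
 governed by the usual flag (this only sidesteps the bug, it is not the fix):
 {code}
 -- disable pushing filters into the ORC reader so both aliases see their rows
 set hive.optimize.index.filter=false;
 {code}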



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6991) History not able to disable/enable after session started

2015-05-26 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-6991:
---
Attachment: HIVE-6991.2.patch

If the hive.session.history.enabled property is set after the session has 
started, it won't take effect, because the history file is created only while 
starting the session.

Added a new method, updateHistory(), which will be called when the 
hive.session.history.enabled property is set.
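
A hedged sketch of what such a method might look like (names inferred from the 
comment above; the actual patch may differ):

{code}
// Hypothetical sketch, as if added inside SessionState: react to
// "set hive.session.history.enabled=..." by lazily creating or closing
// the history file mid-session.
public void updateHistory(boolean enabled) {
  if (enabled && hiveHist == null) {
    hiveHist = new HiveHistoryImpl(this);    // start writing history from now on
  } else if (!enabled && hiveHist != null) {
    hiveHist.closeStream();                  // flush and release the history file
    hiveHist = null;
  }
}
{code}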

 History not able to disable/enable after session started
 

 Key: HIVE-6991
 URL: https://issues.apache.org/jira/browse/HIVE-6991
 Project: Hive
  Issue Type: Bug
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-6991.1.patch, HIVE-6991.2.patch, HIVE-6991.patch


 By default history is disabled. Enabling history after the session has 
 started, via the command set hive.session.history.enabled=true, does not work.
 I think this will help with this user query:
 http://mail-archives.apache.org/mod_mbox/hive-user/201404.mbox/%3ccajqy7afapa_pjs6buon0o8zyt2qwfn2wt-mtznwfmurav_8...@mail.gmail.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10822) CLI start script throwing error message on console

2015-05-26 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-10822:

Attachment: HIVE-10822.patch

Updated the patch with double quotes.

 CLI start script throwing error message on console
 --

 Key: HIVE-10822
 URL: https://issues.apache.org/jira/browse/HIVE-10822
 Project: Hive
  Issue Type: Sub-task
  Components: CLI
Affects Versions: beeline-cli-branch
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10822.patch


 Starting cli throwing following message on console
 {noformat}
 [chinna@stobdtserver1 bin]$ ./hive
 ./ext/cli.sh: line 20: [: ==: unary operator expected
 {noformat}
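 A hedged sketch of the likely failure mode -- the classic bash "unary operator 
 expected" from an unquoted, empty variable, which double-quoting fixes (the 
 actual script may differ):
 {code}
 #!/bin/bash
 USE_DEPRECATED_CLI=""                          # variable unset or empty
 
 if [ $USE_DEPRECATED_CLI == "true" ]; then     # expands to: [ == "true" ] -> error
   echo "deprecated CLI"
 fi
 
 if [ "$USE_DEPRECATED_CLI" == "true" ]; then   # quoted: [ "" == "true" ] -> safe
   echo "deprecated CLI"
 fi
 {code}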



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10823) CLI start script throwing error message on console

2015-05-26 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam resolved HIVE-10823.
-
Resolution: Duplicate

 CLI start script throwing error message on console
 --

 Key: HIVE-10823
 URL: https://issues.apache.org/jira/browse/HIVE-10823
 Project: Hive
  Issue Type: Sub-task
  Components: CLI
Affects Versions: beeline-cli-branch
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam

 Starting cli throwing following message on console
 {noformat}
 [chinna@stobdtserver1 bin]$ ./hive
 ./ext/cli.sh: line 20: [: ==: unary operator expected
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10792) PPD leads to wrong answer when mapper scans the same table with multiple aliases

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559027#comment-14559027
 ] 

Hive QA commented on HIVE-10792:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735292/HIVE-10792.2.patch

{color:red}ERROR:{color} -1 due to 640 failed/errored test(s), 8975 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_deserialize_map_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5

[jira] [Commented] (HIVE-10822) CLI start script throwing error message on console

2015-05-26 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559084#comment-14559084
 ] 

Ferdinand Xu commented on HIVE-10822:
-

[~chinnalalam], I can't reproduce this issue locally. Did you use export 
USE_DEPRECATED_CLI=true to export the variable? Thank you

 CLI start script throwing error message on console
 --

 Key: HIVE-10822
 URL: https://issues.apache.org/jira/browse/HIVE-10822
 Project: Hive
  Issue Type: Sub-task
  Components: CLI
Affects Versions: beeline-cli-branch
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10822.patch


 Starting cli throwing following message on console
 {noformat}
 [chinna@stobdtserver1 bin]$ ./hive
 ./ext/cli.sh: line 20: [: ==: unary operator expected
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10822) CLI start script throwing error message on console

2015-05-26 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559089#comment-14559089
 ] 

Ferdinand Xu commented on HIVE-10822:
-

+1 for the patch. Thank you for figuring it out.

 CLI start script throwing error message on console
 --

 Key: HIVE-10822
 URL: https://issues.apache.org/jira/browse/HIVE-10822
 Project: Hive
  Issue Type: Sub-task
  Components: CLI
Affects Versions: beeline-cli-branch
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-10822.patch


 Starting cli throwing following message on console
 {noformat}
 [chinna@stobdtserver1 bin]$ ./hive
 ./ext/cli.sh: line 20: [: ==: unary operator expected
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10697) ObjectInspectorConvertors#UnionConvertor does a faulty conversion

2015-05-26 Thread Olson,Andrew (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olson,Andrew updated HIVE-10697:

Summary: ObjectInspectorConvertors#UnionConvertor does a faulty conversion  
(was: ObjecInspectorConvertors#UnionConvertor does a faulty conversion)

 ObjectInspectorConvertors#UnionConvertor does a faulty conversion
 -

 Key: HIVE-10697
 URL: https://issues.apache.org/jira/browse/HIVE-10697
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni

 Currently the UnionConvertor in the ObjectInspectorConvertors class has an 
 issue with the convert method, where it attempts to convert the 
 ObjectInspector itself instead of converting the field [1]. This should be 
 changed to convert the field itself. The current behavior can result in a 
 ClassCastException as shown below:
 {code}
 Caused by: java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyUnionObjectInspector 
 cannot be cast to org.apache.hadoop.hive.serde2.lazy.LazyString
   at 
 org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyStringObjectInspector.getPrimitiveWritableObject(LazyStringObjectInspector.java:51)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:391)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:338)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$UnionConverter.convert(ObjectInspectorConverters.java:456)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$MapConverter.convert(ObjectInspectorConverters.java:539)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:154)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:127)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518)
   ... 9 more
 {code}
 [1] 
 https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java#L466
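 A hedged sketch of the intended behavior (illustrative only, not the actual 
 Hive source): the converter should convert the field value selected by the 
 union's tag rather than the inspector itself:
 {code}
 // Hypothetical sketch of UnionConverter.convert(); inputOI is the union's
 // UnionObjectInspector and fieldConverters holds one Converter per branch.
 public Object convert(Object input) {
   byte tag = inputOI.getTag(input);         // which branch of the union is set
   Object field = inputOI.getField(input);   // the field value, not the inspector
   return new StandardUnionObjectInspector.StandardUnion(
       tag, fieldConverters.get(tag).convert(field));
 }
 {code}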



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559100#comment-14559100
 ] 

Hive QA commented on HIVE-10165:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735291/HIVE-10165.5.patch

{color:red}ERROR:{color} -1 due to 637 failed/errored test(s), 9048 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_deserialize_map_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5

[jira] [Commented] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559540#comment-14559540
 ] 

Hive QA commented on HIVE-10811:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735331/HIVE-10811.02.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 8973 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4044/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4044/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4044/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12735331 - PreCommit-HIVE-TRUNK-Build

 RelFieldTrimmer throws NoSuchElementException in some cases
 ---

 Key: HIVE-10811
 URL: https://issues.apache.org/jira/browse/HIVE-10811
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-10811.01.patch, HIVE-10811.02.patch, 
 HIVE-10811.patch


 RelFieldTrimmer runs into NoSuchElementException in some cases.
 Stack trace:
 {noformat}
 Exception in thread "main" java.lang.AssertionError: Internal error: While 
 invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
   at org.apache.calcite.util.Util.newInternal(Util.java:743)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:768)
   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109)
   at 
 org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:730)
   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145)
   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:607)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:244)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10048)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:207)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 

[jira] [Commented] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation

2015-05-26 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559538#comment-14559538
 ] 

Laljo John Pullokkaran commented on HIVE-10812:
---

+1

 Scaling PK/FK's selectivity for stats annotation
 

 Key: HIVE-10812
 URL: https://issues.apache.org/jira/browse/HIVE-10812
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-10812.01.patch, HIVE-10812.02.patch


 Right now, the computation of the selectivity of the FK side based on the PK 
 side does not take into consideration the range of the FK and the range of 
 the PK.
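 A hedged sketch of the kind of range-based scaling the description suggests 
 (my reading only, not the actual patch; all variable names are illustrative):
 {code}
 // Scale the PK-side selectivity by how much of the FK value range overlaps
 // the PK value range, i.e. the fraction of FK values that can actually join.
 double overlap = Math.max(0.0, Math.min(fkMax, pkMax) - Math.max(fkMin, pkMin));
 double scale = overlap / Math.max(fkMax - fkMin, 1.0);
 double scaledSelectivity = pkSelectivity * scale;
 {code}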



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10711) Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem

2015-05-26 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559618#comment-14559618
 ] 

Mostafa Mokhtar commented on HIVE-10711:


[~apivovarov]
do you have anymore feedback?

 Tez HashTableLoader attempts to allocate more memory than available when 
 HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem
 --

 Key: HIVE-10711
 URL: https://issues.apache.org/jira/browse/HIVE-10711
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Mostafa Mokhtar
 Fix For: 1.2.1

 Attachments: HIVE-10711.1.patch, HIVE-10711.2.patch, 
 HIVE-10711.3.patch, HIVE-10711.4.patch


 Tez HashTableLoader bases its memory allocation on 
 HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD. If this value is larger than the 
 process max memory, then the HashTableLoader can try to use more memory than 
 is available to the process.
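 A hedged sketch of the obvious guard (illustrative; the actual patch may 
 differ):
 {code}
 // Never plan hash tables for more memory than the JVM can actually offer:
 // clamp the configured threshold to the process maximum.
 long configured = HiveConf.getLongVar(
     conf, HiveConf.ConfVars.HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD);
 long processMax = Runtime.getRuntime().maxMemory();
 long effective = Math.min(configured, processMax);
 {code}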



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10749) Implement Insert ACID statement for parquet [Parquet branch]

2015-05-26 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10749:
---
Summary: Implement Insert ACID statement for parquet [Parquet branch]  
(was: Implement Insert ACID statement for parquet)

 Implement Insert ACID statement for parquet [Parquet branch]
 

 Key: HIVE-10749
 URL: https://issues.apache.org/jira/browse/HIVE-10749
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-10749.1.patch, HIVE-10749.1.patch, 
 HIVE-10749.2.patch, HIVE-10749.patch


 We need to implement the INSERT statement for the Parquet format, as ORC has.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation

2015-05-26 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10812:
---
Attachment: HIVE-10812.03.patch

 Scaling PK/FK's selectivity for stats annotation
 

 Key: HIVE-10812
 URL: https://issues.apache.org/jira/browse/HIVE-10812
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-10812.01.patch, HIVE-10812.02.patch, 
 HIVE-10812.03.patch


 Right now, the computation of the selectivity of the FK side based on the PK 
 side does not take into consideration the range of the FK and the range of 
 the PK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10749) Implement Insert ACID statement for parquet [Parquet branch]

2015-05-26 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10749:
---
Attachment: HIVE-10749.2-parquet.patch

Re-attaching the patch to allow the Jenkins job to execute tests on the 
parquet branch.

 Implement Insert ACID statement for parquet [Parquet branch]
 

 Key: HIVE-10749
 URL: https://issues.apache.org/jira/browse/HIVE-10749
 Project: Hive
  Issue Type: Sub-task
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-10749.1.patch, HIVE-10749.1.patch, 
 HIVE-10749.2-parquet.patch, HIVE-10749.2.patch, HIVE-10749.patch


 We need to implement the INSERT statement for the Parquet format, as ORC has.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10304) Add deprecation message to HiveCLI

2015-05-26 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559325#comment-14559325
 ] 

Xuefu Zhang commented on HIVE-10304:


The final decision will be to replace Hive CLI's implementation with Beeline 
(HIVE-10511). You will still have the script.

Since you have so many scripts using Hive CLI, it would be great if you could 
test them with HIVE-10511 once it is in place. Thanks.

 Add deprecation message to HiveCLI
 --

 Key: HIVE-10304
 URL: https://issues.apache.org/jira/browse/HIVE-10304
 Project: Hive
  Issue Type: Sub-task
  Components: CLI
Affects Versions: 1.1.0
Reporter: Szehon Ho
Assignee: Szehon Ho
  Labels: TODOC1.2
 Attachments: HIVE-10304.2.patch, HIVE-10304.3.patch, HIVE-10304.patch


 As Beeline is now the recommended command line tool for Hive, we should add a 
 message to HiveCLI to indicate that it is deprecated and redirect users to 
 Beeline.  
 This is not a suggestion to remove HiveCLI for now, just a helpful pointer so 
 users know that attention is now focused on Beeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10815) Let HiveMetaStoreClient Choose MetaStore Randomly

2015-05-26 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-10815:
-
Attachment: (was: HIVE-10815.patch)

 Let HiveMetaStoreClient Choose MetaStore Randomly
 -

 Key: HIVE-10815
 URL: https://issues.apache.org/jira/browse/HIVE-10815
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, Metastore
Affects Versions: 1.2.0
Reporter: Nemon Lou
Assignee: Nemon Lou
 Attachments: HIVE-10815.patch


 Currently HiveMetaStoreClient uses a fixed order to choose MetaStore URIs 
 when multiple metastores are configured.
  Choosing a MetaStore randomly would be good for load balancing.
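 A hedged sketch of the idea (illustrative only; the actual patch may differ):
 {code}
 // Shuffle the configured URIs once per client so each instance tries the
 // metastores in a random order, spreading load across them.
 // (uses java.util.{List, ArrayList, Arrays, Collections} and java.net.URI)
 List<URI> uris = new ArrayList<>(Arrays.asList(metastoreUris));
 Collections.shuffle(uris);
 for (URI uri : uris) {
   // attempt to connect; fall through to the next URI on failure
 }
 {code}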



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6991) History not able to disable/enable after session started

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559359#comment-14559359
 ] 

Hive QA commented on HIVE-6991:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735306/HIVE-6991.2.patch

{color:red}ERROR:{color} -1 due to 637 failed/errored test(s), 8973 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_deserialize_map_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5

[jira] [Commented] (HIVE-10277) Unable to process Comment line '--' in HIVE-1.1.0

2015-05-26 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559321#comment-14559321
 ] 

Xuefu Zhang commented on HIVE-10277:


Thank you. I have reverted it.

@Chinna, I'll reopen the JIRA. Could you resubmit a patch if it's still a
problem, and make sure that the tests pass?

Thanks,
Xuefu

On Tue, May 26, 2015 at 7:00 AM, Ferdinand Xu (JIRA) j...@apache.org



 Unable to process Comment line '--' in HIVE-1.1.0
 -

 Key: HIVE-10277
 URL: https://issues.apache.org/jira/browse/HIVE-10277
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.0.0
Reporter: Kaveen Raajan
Assignee: Chinna Rao Lalam
Priority: Minor
  Labels: hive
 Fix For: 1.3.0

 Attachments: HIVE-10277-1.patch, HIVE-10277.2.patch, HIVE-10277.patch


 I tried to use a comment line (*--*) in the HIVE-1.1.0 grunt shell, like:
 ~hive> --this is comment line~
 ~hive> show tables;~
 I got an error like: 
 {quote}
 NoViableAltException(-1@[])
 at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1020)
 at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199)
 at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 FAILED: ParseException line 2:0 cannot recognize input near '<EOF>' '<EOF>' '<EOF>'
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10815) Let HiveMetaStoreClient Choose MetaStore Randomly

2015-05-26 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-10815:
-
Attachment: HIVE-10815.patch

 Let HiveMetaStoreClient Choose MetaStore Randomly
 -

 Key: HIVE-10815
 URL: https://issues.apache.org/jira/browse/HIVE-10815
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2, Metastore
Affects Versions: 1.2.0
Reporter: Nemon Lou
Assignee: Nemon Lou
 Attachments: HIVE-10815.patch, HIVE-10815.patch


 Currently HiveMetaStoreClient uses a fixed order to choose MetaStore URIs 
 when multiple metastores are configured.
  Choosing a MetaStore randomly would be good for load balancing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10653) LLAP: registry logs strange lines on daemons

2015-05-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-10653:
---

Assignee: Sergey Shelukhin  (was: Gopal V)

 LLAP: registry logs strange lines on daemons
 

 Key: HIVE-10653
 URL: https://issues.apache.org/jira/browse/HIVE-10653
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin

 Discovered while looking at HIVE-10648; [~sseth] mentioned that this should 
 not be happening.
 Most of the daemons described as being killed were actually alive. 
 Several/all LLAP daemons in the cluster logged these messages at 
 approximately the same time (while AM was stuck, incidentally; perhaps they 
 were just bored with no work).
 {noformat}
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: 
 Starting to refresh ServiceInstanceSet 515383300
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker f698eaee-bf6c-484d-9b90-a60d9005760c which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn057-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 9d1f50d1-f237-43c1-a8c5-32741e82d18b which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn041-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker b8a22e2f-652a-4fde-be7a-744786bc93c9 which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn042-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 8394e271-e0d5-4589-817e-0181db0866b9 which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn056-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 1cabdcce-1089-4de6-abdf-315f18a8b4c0 which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn054-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 4027ad61-8c61-4173-90e2-d166ceaad74b which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn051-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 7f71a05f-f849-43d2-8fdb-09ba144d4b93 which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn050-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 41835ca1-69cd-4290-8c8f-8a9583a5d635 which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn053-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 54952e48-41be-48e1-922c-a39d0ee48a33 which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn055-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 980dfe6c-d03b-462b-bee3-35d183c74aee which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn052-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker d524212a-6743-4f18-bcf6-525a0d4b1a0a which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn046-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: 
 Killing service instance: DynamicServiceInstance [alive=true,
 {noformat}

[jira] [Commented] (HIVE-10244) Vectorization : TPC-DS Q80 fails with java.lang.ClassCastException when hive.vectorized.execution.reduce.enabled is enabled

2015-05-26 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559797#comment-14559797
 ] 

Laljo John Pullokkaran commented on HIVE-10244:
---

[~mmccline] How can you end up grouping id without grouping sets?
Language prevents referring to grouping id without grouping sets.

If grouping sets are present, then the previous line should bail out, right?

if (desc.isGroupingSetsPresent()) {
  LOG.info("Grouping sets not supported in vector mode");
  return false;
}

 Vectorization : TPC-DS Q80 fails with java.lang.ClassCastException when 
 hive.vectorized.execution.reduce.enabled is enabled
 ---

 Key: HIVE-10244
 URL: https://issues.apache.org/jira/browse/HIVE-10244
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Matt McCline
 Attachments: HIVE-10244.01.patch, explain_q80_vectorized_reduce_on.txt


 Query 
 {code}
 set hive.vectorized.execution.reduce.enabled=true;
 with ssr as
  (select  s_store_id as store_id,
   sum(ss_ext_sales_price) as sales,
   sum(coalesce(sr_return_amt, 0)) as returns,
   sum(ss_net_profit - coalesce(sr_net_loss, 0)) as profit
   from store_sales left outer join store_returns on
  (ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number),
  date_dim,
  store,
  item,
  promotion
  where ss_sold_date_sk = d_date_sk
and d_date between cast('1998-08-04' as date) 
   and (cast('1998-09-04' as date))
and ss_store_sk = s_store_sk
and ss_item_sk = i_item_sk
and i_current_price > 50
and ss_promo_sk = p_promo_sk
and p_channel_tv = 'N'
  group by s_store_id)
  ,
  csr as
  (select  cp_catalog_page_id as catalog_page_id,
   sum(cs_ext_sales_price) as sales,
   sum(coalesce(cr_return_amount, 0)) as returns,
   sum(cs_net_profit - coalesce(cr_net_loss, 0)) as profit
   from catalog_sales left outer join catalog_returns on
  (cs_item_sk = cr_item_sk and cs_order_number = cr_order_number),
  date_dim,
  catalog_page,
  item,
  promotion
  where cs_sold_date_sk = d_date_sk
and d_date between cast('1998-08-04' as date)
   and (cast('1998-09-04' as date))
 and cs_catalog_page_sk = cp_catalog_page_sk
and cs_item_sk = i_item_sk
and i_current_price > 50
and cs_promo_sk = p_promo_sk
and p_channel_tv = 'N'
 group by cp_catalog_page_id)
  ,
  wsr as
  (select  web_site_id,
   sum(ws_ext_sales_price) as sales,
   sum(coalesce(wr_return_amt, 0)) as returns,
   sum(ws_net_profit - coalesce(wr_net_loss, 0)) as profit
   from web_sales left outer join web_returns on
  (ws_item_sk = wr_item_sk and ws_order_number = wr_order_number),
  date_dim,
  web_site,
  item,
  promotion
  where ws_sold_date_sk = d_date_sk
and d_date between cast('1998-08-04' as date)
   and (cast('1998-09-04' as date))
 and ws_web_site_sk = web_site_sk
and ws_item_sk = i_item_sk
and i_current_price > 50
and ws_promo_sk = p_promo_sk
and p_channel_tv = 'N'
 group by web_site_id)
   select  channel
 , id
 , sum(sales) as sales
 , sum(returns) as returns
 , sum(profit) as profit
  from 
  (select 'store channel' as channel
 , concat('store', store_id) as id
 , sales
 , returns
 , profit
  from   ssr
  union all
  select 'catalog channel' as channel
 , concat('catalog_page', catalog_page_id) as id
 , sales
 , returns
 , profit
  from  csr
  union all
  select 'web channel' as channel
 , concat('web_site', web_site_id) as id
 , sales
 , returns
 , profit
  from   wsr
  ) x
  group by channel, id with rollup
  order by channel
  ,id
  limit 100
 {code}
 Exception 
 {code}
 Vertex failed, vertexName=Reducer 5, vertexId=vertex_1426707664723_1377_1_22, 
 diagnostics=[Task failed, taskId=task_1426707664723_1377_1_22_00, 
 diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
 task:java.lang.RuntimeException: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing vector batch (tag=0) 
 \N \N 0 9.285817653506076E8 4.639990363237801E7 -1.1814318134887291E8
 \N \N 0 4.682909323885761E8 2.2415242712669864E7 -5.966176123188091E7
 \N \N 0 1.2847032699693155E9 6.300096113768728E7 -5.94963316209578E8
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
   at 
 

[jira] [Updated] (HIVE-10777) LLAP: add pre-fragment and per-table cache details

2015-05-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10777:

Attachment: HIVE-10777.02.patch

Updated the name of the config setting

 LLAP: add pre-fragment and per-table cache details
 --

 Key: HIVE-10777
 URL: https://issues.apache.org/jira/browse/HIVE-10777
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap

 Attachments: HIVE-10777.01.patch, HIVE-10777.02.patch, 
 HIVE-10777.WIP.patch, HIVE-10777.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10808) Inner join on Null throwing Cast Exception

2015-05-26 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559785#comment-14559785
 ] 

Naveen Gangam commented on HIVE-10808:
--

[~swarnim] Agreed. However, we received this stack trace from a customer that 
can no longer reproduce the issue (their infra underwent some 
changes/upgrades). We have not been able to reproduce it with a test 
dataset. If I am able to reproduce this more consistently, I can create a unit 
test for it. Fair?

 Inner join on Null throwing Cast Exception
 --

 Key: HIVE-10808
 URL: https://issues.apache.org/jira/browse/HIVE-10808
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.1
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Critical
 Attachments: HIVE-10808.patch


 select
  a.col1,
  a.col2,
  a.col3,
  a.col4
  from
  tab1 a
  inner join
  (
  select
  max(x) as x
  from
  tab1
  where
  x  20130327
  ) r
  on
  a.x = r.x
  where
  a.col1 = 'F'
  and a.col3 in ('A', 'S', 'G');
 Failed Task log snippet:
 2015-05-18 19:22:17,372 INFO [main] 
 org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring retrieval request: 
 __MAP_PLAN__
 2015-05-18 19:22:17,372 INFO [main] 
 org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring cache key: 
 __MAP_PLAN__
 2015-05-18 19:22:17,457 WARN [main] org.apache.hadoop.mapred.YarnChild: 
 Exception running child : java.lang.RuntimeException: Error in configuring 
 object
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
 ... 9 more
 Caused by: java.lang.RuntimeException: Error in configuring object
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
 at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
 ... 14 more
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
 ... 17 more
 Caused by: java.lang.RuntimeException: Map operator initialization failed
 at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:157)
 ... 22 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.NullStructSerDe$NullStructSerDeObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:334)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:352)
 at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:126)
 ... 22 more
 Caused by: java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.NullStructSerDe$NullStructSerDeObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettableOI(ObjectInspectorUtils.java:)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1149)
 at 
 

[jira] [Resolved] (HIVE-10653) LLAP: registry logs strange lines on daemons

2015-05-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10653.
-
   Resolution: Fixed
Fix Version/s: llap

committed to branch

 LLAP: registry logs strange lines on daemons
 

 Key: HIVE-10653
 URL: https://issues.apache.org/jira/browse/HIVE-10653
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap


 Discovered while looking at HIVE-10648; [~sseth] mentioned that this should 
 not be happening.
 Most of the daemons described as being killed were actually alive. 
 Several/all LLAP daemons in the cluster logged these messages at 
 approximately the same time (while AM was stuck, incidentally; perhaps they 
 were just bored with no work).
 {noformat}
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: 
 Starting to refresh ServiceInstanceSet 515383300
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker f698eaee-bf6c-484d-9b90-a60d9005760c which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn057-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 9d1f50d1-f237-43c1-a8c5-32741e82d18b which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn041-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker b8a22e2f-652a-4fde-be7a-744786bc93c9 which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn042-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 8394e271-e0d5-4589-817e-0181db0866b9 which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn056-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 1cabdcce-1089-4de6-abdf-315f18a8b4c0 which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn054-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 4027ad61-8c61-4173-90e2-d166ceaad74b which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn051-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 7f71a05f-f849-43d2-8fdb-09ba144d4b93 which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn050-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 41835ca1-69cd-4290-8c8f-8a9583a5d635 which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn053-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 54952e48-41be-48e1-922c-a39d0ee48a33 which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn055-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker 980dfe6c-d03b-462b-bee3-35d183c74aee which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn052-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding 
 new worker d524212a-6743-4f18-bcf6-525a0d4b1a0a which mapped to 
 DynamicServiceInstance [alive=true, 
 host=cn046-10.l42scl.hortonworks.com:15001 with resources=memory:20480, 
 vCores:6]
 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
 org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: 
 Killing service instance: DynamicServiceInstance
 {noformat}
[jira] [Commented] (HIVE-10808) Inner join on Null throwing Cast Exception

2015-05-26 Thread Swarnim Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559800#comment-14559800
 ] 

Swarnim Kulkarni commented on HIVE-10808:
-

Sounds great. It's easier to review patches with tests on them that guarantee 
the patch actually works ;)

 Inner join on Null throwing Cast Exception
 --

 Key: HIVE-10808
 URL: https://issues.apache.org/jira/browse/HIVE-10808
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.1
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Critical
 Attachments: HIVE-10808.patch


 select
  a.col1,
  a.col2,
  a.col3,
  a.col4
  from
  tab1 a
  inner join
  (
  select
  max(x) as x
  from
  tab1
  where
  x  20130327
  ) r
  on
  a.x = r.x
  where
  a.col1 = 'F'
  and a.col3 in ('A', 'S', 'G');
 Failed Task log snippet:
 2015-05-18 19:22:17,372 INFO [main] 
 org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring retrieval request: 
 __MAP_PLAN__
 2015-05-18 19:22:17,372 INFO [main] 
 org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring cache key: 
 __MAP_PLAN__
 2015-05-18 19:22:17,457 WARN [main] org.apache.hadoop.mapred.YarnChild: 
 Exception running child : java.lang.RuntimeException: Error in configuring 
 object
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
 ... 9 more
 Caused by: java.lang.RuntimeException: Error in configuring object
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
 at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
 ... 14 more
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
 ... 17 more
 Caused by: java.lang.RuntimeException: Map operator initialization failed
 at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:157)
 ... 22 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.NullStructSerDe$NullStructSerDeObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:334)
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:352)
 at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:126)
 ... 22 more
 Caused by: java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.NullStructSerDe$NullStructSerDeObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettableOI(ObjectInspectorUtils.java:)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1149)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConvertedOI(ObjectInspectorConverters.java:219)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConvertedOI(ObjectInspectorConverters.java:183)
 at 
 

[jira] [Resolved] (HIVE-10777) LLAP: add pre-fragment and per-table cache details

2015-05-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10777.
-

committed to branch

 LLAP: add pre-fragment and per-table cache details
 --

 Key: HIVE-10777
 URL: https://issues.apache.org/jira/browse/HIVE-10777
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: llap

 Attachments: HIVE-10777.01.patch, HIVE-10777.02.patch, 
 HIVE-10777.WIP.patch, HIVE-10777.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6791) Support variable substitution for Beeline shell command

2015-05-26 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559120#comment-14559120
 ] 

Xuefu Zhang commented on HIVE-6791:
---

Yes, I just assigned it to you. Thanks.

 Support variable substitution for Beeline shell command
 -

 Key: HIVE-6791
 URL: https://issues.apache.org/jira/browse/HIVE-6791
 Project: Hive
  Issue Type: New Feature
  Components: CLI, Clients
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ferdinand Xu

 A follow-up task from HIVE-6694. Similar to HIVE-6570.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-6791) Support variable substitution for Beeline shell command

2015-05-26 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang reassigned HIVE-6791:
-

Assignee: Ferdinand Xu  (was: Xuefu Zhang)

 Support variable substitution for Beeline shell command
 -

 Key: HIVE-6791
 URL: https://issues.apache.org/jira/browse/HIVE-6791
 Project: Hive
  Issue Type: New Feature
  Components: CLI, Clients
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ferdinand Xu

 A follow-up task from HIVE-6694. Similar to HIVE-6570.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10277) Unable to process Comment line '--' in HIVE-1.1.0

2015-05-26 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559123#comment-14559123
 ] 

Ferdinand Xu commented on HIVE-10277:
-

Hi [~xuefuz], seems this commit breaks lots of tests. Could you take a look at 
it? 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4034/


 Unable to process Comment line '--' in HIVE-1.1.0
 -

 Key: HIVE-10277
 URL: https://issues.apache.org/jira/browse/HIVE-10277
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.0.0
Reporter: Kaveen Raajan
Assignee: Chinna Rao Lalam
Priority: Minor
  Labels: hive
 Fix For: 1.3.0

 Attachments: HIVE-10277-1.patch, HIVE-10277.2.patch, HIVE-10277.patch


 I tried to use a comment line (*--*) in the HIVE-1.1.0 grunt shell like,
 ~hive> --this is comment line~
 ~hive> show tables;~
 I got error like 
 {quote}
 NoViableAltException(-1@[])
 at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1020)
 at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199)
 at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 FAILED: ParseException line 2:0 cannot recognize input near '<EOF>' '<EOF>' '<EOF>'
 {quote}
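
 A minimal sketch of the kind of handling such a fix implies, assuming
 full-line comments are stripped before the command reaches the parser (the
 helper below is invented for illustration and is not the actual HIVE-10277
 patch):
 {code}
// Drop lines whose first non-blank characters are "--" so that a
// leading comment line no longer reaches the HiveParser.
static String stripFullLineComments(String command) {
  StringBuilder sb = new StringBuilder();
  for (String line : command.split("\n")) {
    if (!line.trim().startsWith("--")) {
      sb.append(line).append('\n');
    }
  }
  return sb.toString();
}
 {code}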



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10819) SearchArgumentImpl for Timestamp is broken by HIVE-10286

2015-05-26 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559696#comment-14559696
 ] 

Sergey Shelukhin commented on HIVE-10819:
-

this breaks a lot of tests...

 SearchArgumentImpl for Timestamp is broken by HIVE-10286
 

 Key: HIVE-10819
 URL: https://issues.apache.org/jira/browse/HIVE-10819
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 1.2.1

 Attachments: HIVE-10819.1.patch, HIVE-10819.2.patch


 The workaround for the kryo bug for Timestamp was accidentally removed by 
 HIVE-10286. Need to bring it back.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10711) Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem

2015-05-26 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559699#comment-14559699
 ] 

Mostafa Mokhtar commented on HIVE-10711:


[~sushanth] FYI 

[~apivovarov]
Can you please commit the change to 1.2.1?

 Tez HashTableLoader attempts to allocate more memory than available when 
 HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem
 --

 Key: HIVE-10711
 URL: https://issues.apache.org/jira/browse/HIVE-10711
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Mostafa Mokhtar
 Fix For: 1.2.1

 Attachments: HIVE-10711.1.patch, HIVE-10711.2.patch, 
 HIVE-10711.3.patch, HIVE-10711.4.patch


 Tez HashTableLoader bases its memory allocation on 
 HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD. If this value is larger than the 
 process max memory, this can result in the HashTableLoader trying to use 
 more memory than is available to the process.
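
 A minimal sketch of the clamping idea, assuming the fix caps the configured
 threshold at a fraction of the JVM heap (the class, method, and 0.8 fraction
 below are invented for illustration; only the config variable comes from the
 issue):
 {code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.conf.HiveConf;

public final class HashTableMemoryBudget {
  // Assumed headroom fraction; the actual patch may choose differently.
  private static final double USABLE_FRACTION = 0.8;

  /** Clamp the configured threshold to what this JVM can actually hold. */
  public static long effectiveThreshold(Configuration conf) {
    long configured = HiveConf.getLongVar(conf,
        HiveConf.ConfVars.HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD);
    long processMax = Runtime.getRuntime().maxMemory(); // JVM max heap
    return Math.min(configured, (long) (processMax * USABLE_FRACTION));
  }
}
 {code}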



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10793) Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront

2015-05-26 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559704#comment-14559704
 ] 

Mostafa Mokhtar commented on HIVE-10793:


[~sushanth] [~sershe]
Can this go to 1.2.1 as well?



 Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront
 

 Key: HIVE-10793
 URL: https://issues.apache.org/jira/browse/HIVE-10793
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.2.1

 Attachments: HIVE-10793.1.patch, HIVE-10793.2.patch


 HybridHashTableContainer will allocate memory based on an estimate, which 
 means that if the actual size is less than the estimate, the allocated 
 memory won't be used.
 Number of partitions is calculated based on the estimated data size:
 {code}
 numPartitions = calcNumPartitions(memoryThreshold, estimatedTableSize, 
 minNumParts, minWbSize,
   nwayConf);
 {code}
 Then, based on the number of partitions, writeBufferSize is set:
 {code}
 writeBufferSize = (int)(estimatedTableSize / numPartitions);
 {code}
 Each hash partition will allocate one WriteBuffer, with no further allocation 
 if the estimated data size is correct.
 The suggested solution is to reduce writeBufferSize by a factor such that 
 only X% of the memory is preallocated.
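
 A rough sketch of the suggested change, assuming a configurable preallocation
 fraction (the fraction parameter and method name below are invented; the rest
 mirrors the description):
 {code}
// Illustrative only: preallocate just a slice of the estimated
// per-partition size instead of the full estimate.
static int initialWriteBufferSize(long estimatedTableSize, int numPartitions,
                                  double preallocFraction, int minWbSize) {
  int full = (int) (estimatedTableSize / numPartitions); // current behavior
  int scaled = (int) (full * preallocFraction);          // only X% up front
  return Math.max(scaled, minWbSize);                    // keep a sane floor
}
 {code}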



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7723) Explain plan for complex query with lots of partitions is slow due to inefficient collection used to find a matching ReadEntity

2015-05-26 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-7723:
--
Attachment: HIVE-7723.11.patch

 Explain plan for complex query with lots of partitions is slow due to 
 inefficient collection used to find a matching ReadEntity
 

 Key: HIVE-7723
 URL: https://issues.apache.org/jira/browse/HIVE-7723
 Project: Hive
  Issue Type: Bug
  Components: CLI, Physical Optimizer
Affects Versions: 0.13.1
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Attachments: HIVE-7723.1.patch, HIVE-7723.10.patch, 
 HIVE-7723.11.patch, HIVE-7723.2.patch, HIVE-7723.3.patch, HIVE-7723.4.patch, 
 HIVE-7723.5.patch, HIVE-7723.6.patch, HIVE-7723.7.patch, HIVE-7723.8.patch, 
 HIVE-7723.9.patch


 Explain on TPC-DS query 64 took 11 seconds; when the CLI was profiled, it 
 showed that ReadEntity.equals is taking ~40% of the CPU.
 ReadEntity.equals is called from the snippet below.
 The set is iterated over again and again to get the actual match; a HashMap 
 is a better option for this case, as Set doesn't have a get method (see the 
 sketch after the snippet below).
 Also, for ReadEntity, equals is case-insensitive while hash is not, which is 
 undesired behavior.
 {code}
 public static ReadEntity addInput(Set<ReadEntity> inputs, ReadEntity newInput) {
   // If the input is already present, make sure the new parent is added to the input.
   if (inputs.contains(newInput)) {
     for (ReadEntity input : inputs) {
       if (input.equals(newInput)) {
         if ((newInput.getParents() != null) && (!newInput.getParents().isEmpty())) {
           input.getParents().addAll(newInput.getParents());
           input.setDirect(input.isDirect() || newInput.isDirect());
         }
         return input;
       }
     }
     assert false;
   } else {
     inputs.add(newInput);
     return newInput;
   }
   // make compile happy
   return null;
 }
 {code}
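
 A hedged sketch of the HashMap-based alternative mentioned above (the
 map-typed signature is illustrative, not the actual patch; ReadEntity is
 assumed to be Hive's org.apache.hadoop.hive.ql.hooks.ReadEntity, and
 java.util.Map is assumed imported):
 {code}
// O(1) lookup replaces the full scan over the Set.
public static ReadEntity addInput(Map<ReadEntity, ReadEntity> inputs,
                                  ReadEntity newInput) {
  ReadEntity existing = inputs.get(newInput);
  if (existing == null) {
    inputs.put(newInput, newInput);
    return newInput;
  }
  if (newInput.getParents() != null && !newInput.getParents().isEmpty()) {
    existing.getParents().addAll(newInput.getParents());
    existing.setDirect(existing.isDirect() || newInput.isDirect());
  }
  return existing;
}
 {code}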
 This is the query used : 
 {code}
 select cs1.product_name ,cs1.store_name ,cs1.store_zip ,cs1.b_street_number 
 ,cs1.b_streen_name ,cs1.b_city
  ,cs1.b_zip ,cs1.c_street_number ,cs1.c_street_name ,cs1.c_city 
 ,cs1.c_zip ,cs1.syear ,cs1.cnt
  ,cs1.s1 ,cs1.s2 ,cs1.s3
  ,cs2.s1 ,cs2.s2 ,cs2.s3 ,cs2.syear ,cs2.cnt
 from
 (select i_product_name as product_name ,i_item_sk as item_sk ,s_store_name as 
 store_name
  ,s_zip as store_zip ,ad1.ca_street_number as b_street_number 
 ,ad1.ca_street_name as b_streen_name
  ,ad1.ca_city as b_city ,ad1.ca_zip as b_zip ,ad2.ca_street_number as 
 c_street_number
  ,ad2.ca_street_name as c_street_name ,ad2.ca_city as c_city ,ad2.ca_zip 
 as c_zip
  ,d1.d_year as syear ,d2.d_year as fsyear ,d3.d_year as s2year ,count(*) 
 as cnt
  ,sum(ss_wholesale_cost) as s1 ,sum(ss_list_price) as s2 
 ,sum(ss_coupon_amt) as s3
   FROM   store_sales
 JOIN store_returns ON store_sales.ss_item_sk = 
 store_returns.sr_item_sk and store_sales.ss_ticket_number = 
 store_returns.sr_ticket_number
 JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk
 JOIN date_dim d1 ON store_sales.ss_sold_date_sk = d1.d_date_sk
 JOIN date_dim d2 ON customer.c_first_sales_date_sk = d2.d_date_sk 
 JOIN date_dim d3 ON customer.c_first_shipto_date_sk = d3.d_date_sk
 JOIN store ON store_sales.ss_store_sk = store.s_store_sk
 JOIN customer_demographics cd1 ON store_sales.ss_cdemo_sk= 
 cd1.cd_demo_sk
 JOIN customer_demographics cd2 ON customer.c_current_cdemo_sk = 
 cd2.cd_demo_sk
 JOIN promotion ON store_sales.ss_promo_sk = promotion.p_promo_sk
 JOIN household_demographics hd1 ON store_sales.ss_hdemo_sk = 
 hd1.hd_demo_sk
 JOIN household_demographics hd2 ON customer.c_current_hdemo_sk = 
 hd2.hd_demo_sk
 JOIN customer_address ad1 ON store_sales.ss_addr_sk = 
 ad1.ca_address_sk
 JOIN customer_address ad2 ON customer.c_current_addr_sk = 
 ad2.ca_address_sk
 JOIN income_band ib1 ON hd1.hd_income_band_sk = ib1.ib_income_band_sk
 JOIN income_band ib2 ON hd2.hd_income_band_sk = ib2.ib_income_band_sk
 JOIN item ON store_sales.ss_item_sk = item.i_item_sk
 JOIN
  (select cs_item_sk
 ,sum(cs_ext_list_price) as 
 sale,sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit) as refund
   from catalog_sales JOIN catalog_returns
   ON catalog_sales.cs_item_sk = catalog_returns.cr_item_sk
 and catalog_sales.cs_order_number = catalog_returns.cr_order_number
   group by cs_item_sk
   having 
 sum(cs_ext_list_price) > 2*sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit))
  cs_ui
 ON store_sales.ss_item_sk = cs_ui.cs_item_sk
   WHERE  
  cd1.cd_marital_status  

[jira] [Updated] (HIVE-10825) Add parquet branch profile to jenkins-submit-build.sh

2015-05-26 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10825:
---
Description: NO PRECOMMIT TESTS  (was: NO PRECOMMIT TEST)

 Add parquet branch profile to jenkins-submit-build.sh
 -

 Key: HIVE-10825
 URL: https://issues.apache.org/jira/browse/HIVE-10825
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Reporter: Sergio Peña
Assignee: Sergio Peña
Priority: Minor
 Attachments: HIVE-10825.1.patch


 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10711) Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem

2015-05-26 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559729#comment-14559729
 ] 

Alexander Pivovarov commented on HIVE-10711:


Mostafa, let's wait 24 hours before committing. Just to clarify: do you want 
me to commit it to master and then do a hotfix (cherry-pick) from master to 
branch-1.2?

 Tez HashTableLoader attempts to allocate more memory than available when 
 HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem
 --

 Key: HIVE-10711
 URL: https://issues.apache.org/jira/browse/HIVE-10711
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Mostafa Mokhtar
 Fix For: 1.2.1

 Attachments: HIVE-10711.1.patch, HIVE-10711.2.patch, 
 HIVE-10711.3.patch, HIVE-10711.4.patch


 Tez HashTableLoader bases its memory allocation on 
 HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD. If this value is larger than the 
 process max memory, this can result in the HashTableLoader trying to use 
 more memory than is available to the process.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10825) Add parquet branch profile to jenkins-submit-build.sh

2015-05-26 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559730#comment-14559730
 ] 

Szehon Ho commented on HIVE-10825:
--

+1

 Add parquet branch profile to jenkins-submit-build.sh
 -

 Key: HIVE-10825
 URL: https://issues.apache.org/jira/browse/HIVE-10825
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Reporter: Sergio Peña
Assignee: Sergio Peña
Priority: Minor
 Attachments: HIVE-10825.1.patch


 NO PRECOMMIT TESTS



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10711) Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem

2015-05-26 Thread Mostafa Mokhtar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559731#comment-14559731
 ] 

Mostafa Mokhtar commented on HIVE-10711:


Yes, please.



 Tez HashTableLoader attempts to allocate more memory than available when 
 HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem
 --

 Key: HIVE-10711
 URL: https://issues.apache.org/jira/browse/HIVE-10711
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Mostafa Mokhtar
 Fix For: 1.2.1

 Attachments: HIVE-10711.1.patch, HIVE-10711.2.patch, 
 HIVE-10711.3.patch, HIVE-10711.4.patch


 Tez HashTableLoader bases its memory allocation on 
 HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD. If this value is larger than the 
 process max memory, this can result in the HashTableLoader trying to use 
 more memory than is available to the process.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Commented] (HIVE-10711) Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem

2015-05-26 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559667#comment-14559667
 ] 

Alexander Pivovarov commented on HIVE-10711:


+1

 Tez HashTableLoader attempts to allocate more memory than available when 
 HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem
 --

 Key: HIVE-10711
 URL: https://issues.apache.org/jira/browse/HIVE-10711
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Mostafa Mokhtar
 Fix For: 1.2.1

 Attachments: HIVE-10711.1.patch, HIVE-10711.2.patch, 
 HIVE-10711.3.patch, HIVE-10711.4.patch


 Tez HashTableLoader bases its memory allocation on 
 HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD. If this value is larger than the 
 process max memory, this can result in the HashTableLoader trying to use 
 more memory than is available to the process.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10793) Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront

2015-05-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10793:

Fix Version/s: (was: 1.2.1)
   1.3.0

 Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront
 

 Key: HIVE-10793
 URL: https://issues.apache.org/jira/browse/HIVE-10793
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
 Fix For: 1.3.0

 Attachments: HIVE-10793.1.patch, HIVE-10793.2.patch


 HybridHashTableContainer will allocate memory based on an estimate, which 
 means that if the actual size is less than the estimate, the allocated 
 memory won't be used.
 Number of partitions is calculated based on the estimated data size:
 {code}
 numPartitions = calcNumPartitions(memoryThreshold, estimatedTableSize, 
 minNumParts, minWbSize,
   nwayConf);
 {code}
 Then, based on the number of partitions, writeBufferSize is set:
 {code}
 writeBufferSize = (int)(estimatedTableSize / numPartitions);
 {code}
 Each hash partition will allocate one WriteBuffer, with no further allocation 
 if the estimated data size is correct.
 The suggested solution is to reduce writeBufferSize by a factor such that 
 only X% of the memory is preallocated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10825) Add parquet branch profile to jenkins-submit-build.sh

2015-05-26 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10825:
---
Description: NO PRECOMMIT TEST

 Add parquet branch profile to jenkins-submit-build.sh
 -

 Key: HIVE-10825
 URL: https://issues.apache.org/jira/browse/HIVE-10825
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Reporter: Sergio Peña
Assignee: Sergio Peña
Priority: Minor
 Attachments: HIVE-10825.1.patch


 NO PRECOMMIT TEST



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10825) Add parquet branch profile to jenkins-submit-build.sh

2015-05-26 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10825:
---
Attachment: HIVE-10825.1.patch

 Add parquet branch profile to jenkins-submit-build.sh
 -

 Key: HIVE-10825
 URL: https://issues.apache.org/jira/browse/HIVE-10825
 Project: Hive
  Issue Type: Sub-task
  Components: Testing Infrastructure
Reporter: Sergio Peña
Assignee: Sergio Peña
Priority: Minor
 Attachments: HIVE-10825.1.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559839#comment-14559839
 ] 

Hive QA commented on HIVE-10812:




{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735375/HIVE-10812.03.patch

{color:green}SUCCESS:{color} +1 8974 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4045/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4045/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4045/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12735375 - PreCommit-HIVE-TRUNK-Build

 Scaling PK/FK's selectivity for stats annotation
 

 Key: HIVE-10812
 URL: https://issues.apache.org/jira/browse/HIVE-10812
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-10812.01.patch, HIVE-10812.02.patch, 
 HIVE-10812.03.patch


 Right now, the computation of the selectivity of the FK side based on the PK 
 side does not take into consideration the range of the FK and the range of 
 the PK.
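
 One way to picture the proposed scaling, as a hedged sketch (all names and
 the overlap heuristic below are invented for illustration; the issue only
 states that the ranges should be considered):
 {code}
// Scale the classic PK/FK selectivity (1 / ndv(PK)) by how much of the
// FK's value range actually overlaps the PK's value range.
static double scaledFkSelectivity(double fkMin, double fkMax,
                                  double pkMin, double pkMax, long pkNdv) {
  double base = 1.0 / Math.max(pkNdv, 1);
  double overlap = Math.min(fkMax, pkMax) - Math.max(fkMin, pkMin);
  double fkRange = fkMax - fkMin;
  if (overlap <= 0) return 0.0;   // disjoint ranges: expect no matches
  if (fkRange <= 0) return base;  // degenerate FK range: fall back
  return base * Math.min(overlap / fkRange, 1.0);
}
 {code}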



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.

2015-05-26 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559879#comment-14559879
 ] 

Alan Gates commented on HIVE-10165:
---

I'll review if someone else doesn't get to it first.  It will take me a few 
days to get to it as I'm out the rest of this week.

As far as the failing tests, the 5 earlier failures didn't look related to your 
patch.  Unless we really broke the trunk it's surprising to see 600+ test 
failures for your later patch.  Have you tried running some of these locally to 
see whether you can reproduce them?

 Improve hive-hcatalog-streaming extensibility and support updates and deletes.
 --

 Key: HIVE-10165
 URL: https://issues.apache.org/jira/browse/HIVE-10165
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Affects Versions: 1.2.0
Reporter: Elliot West
Assignee: Elliot West
  Labels: streaming_api
 Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, 
 HIVE-10165.5.patch


 h3. Overview
 I'd like to extend the 
 [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest]
  API so that it also supports the writing of record updates and deletes in 
 addition to the already supported inserts.
 h3. Motivation
 We have many Hadoop processes outside of Hive that merge changed facts into 
 existing datasets. Traditionally we achieve this by: reading in a 
 ground-truth dataset and a modified dataset, grouping by a key, sorting by a 
 sequence and then applying a function to determine inserted, updated, and 
 deleted rows. However, in our current scheme we must rewrite all partitions 
 that may potentially contain changes. In practice the number of mutated 
 records is very small when compared with the records contained in a 
 partition. This approach results in a number of operational issues:
 * Excessive amount of write activity required for small data changes.
 * Downstream applications cannot robustly read these datasets while they are 
 being updated.
 * Due to the scale of the updates (hundreds of partitions) the scope for 
 contention is high. 
 I believe we can address this problem by instead writing only the changed 
 records to a Hive transactional table. This should drastically reduce the 
 amount of data that we need to write and also provide a means for managing 
 concurrent access to the data. Our existing merge processes can read and 
 retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to 
 an updated form of the hive-hcatalog-streaming API which will then have the 
 required data to perform an update or insert in a transactional manner. 
 h3. Benefits
 * Enables the creation of large-scale dataset merge processes  
 * Opens up Hive transactional functionality in an accessible manner to 
 processes that operate outside of Hive.
 h3. Implementation
 Our changes do not break the existing API contracts. Instead our approach has 
 been to consider the functionality offered by the existing API and our 
 proposed API as fulfilling separate and distinct use-cases. The existing API 
 is primarily focused on the task of continuously writing large volumes of new 
 data into a Hive table for near-immediate analysis. Our use-case however, is 
 concerned more with the frequent but not continuous ingestion of mutations to 
 a Hive table from some ETL merge process. Consequently we feel it is 
 justifiable to add our new functionality via an alternative set of public 
 interfaces and leave the existing API as is. This keeps both APIs clean and 
 focused at the expense of presenting additional options to potential users. 
 Wherever possible, shared implementation concerns have been factored out into 
 abstract base classes that are open to third-party extension. A detailed 
 breakdown of the changes is as follows:
 * We've introduced a public {{RecordMutator}} interface whose purpose is to 
 expose insert/update/delete operations to the user. This is a counterpart to 
 the write-only {{RecordWriter}}. We've also factored out life-cycle methods 
 common to these two interfaces into a super {{RecordOperationWriter}} 
 interface.  Note that the row representation has been changed from {{byte[]}} 
 to {{Object}}. Within our data processing jobs our records are often 
 available in a strongly typed and decoded form such as a POJO or a Tuple 
 object. Therefore it seems to make sense that we are able to pass this 
 through to the {{OrcRecordUpdater}} without having to go through a {{byte[]}} 
 encoding step. This of course still allows users to use {{byte[]}} if they 
 wish.
 * The introduction of {{RecordMutator}} requires that insert/update/delete 
 operations are then also exposed on a {{TransactionBatch}} type. We've done 
 this 
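
 A hedged sketch of the interface shape described above (only the type names
 RecordMutator and RecordOperationWriter come from the text; the method
 signatures are guesses, not the actual patch):
 {code}
// Shared life-cycle concerns factored into a super-interface.
interface RecordOperationWriter {
  void flush() throws Exception;
  void close() throws Exception;
}

// Counterpart to the write-only RecordWriter; rows are passed as Object.
interface RecordMutator extends RecordOperationWriter {
  void insert(Object record) throws Exception;
  void update(Object record) throws Exception;
  void delete(Object record) throws Exception;
}
 {code}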

[jira] [Commented] (HIVE-7723) Explain plan for complex query with lots of partitions is slow due to inefficient collection used to find a matching ReadEntity

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559849#comment-14559849
 ] 

Hive QA commented on HIVE-7723:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735389/HIVE-7723.11.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4046/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4046/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4046/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
spark-client ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf
 [copy] Copying 11 files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
spark-client ---
[INFO] Compiling 5 source files to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes
[INFO] 
[INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client ---
[INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar
[INFO] Copying guava-14.0.1.jar to 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-1.3.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
spark-client ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client ---
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-1.3.0-SNAPSHOT.jar
 to 
/home/hiveptest/.m2/repository/org/apache/hive/spark-client/1.3.0-SNAPSHOT/spark-client-1.3.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to 
/home/hiveptest/.m2/repository/org/apache/hive/spark-client/1.3.0-SNAPSHOT/spark-client-1.3.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Query Language 1.3.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec ---
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-exec ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
Generating vector expression code
Generating vector expression test code
[INFO] Executed tasks
[INFO] 
[INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec ---
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java
 added.
[INFO] Source directory: 
/data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean
 added.
[INFO] 

[jira] [Commented] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases

2015-05-26 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559854#comment-14559854
 ] 

Laljo John Pullokkaran commented on HIVE-10811:
---

[~jcamachorodriguez] I don't get the patch.
Shouldn't we be checking collations from rel present in input?

 RelFieldTrimmer throws NoSuchElementException in some cases
 ---

 Key: HIVE-10811
 URL: https://issues.apache.org/jira/browse/HIVE-10811
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-10811.01.patch, HIVE-10811.02.patch, 
 HIVE-10811.patch


 RelFieldTrimmer runs into NoSuchElementException in some cases.
 Stack trace:
 {noformat}
 Exception in thread "main" java.lang.AssertionError: Internal error: While 
 invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
   at org.apache.calcite.util.Util.newInternal(Util.java:743)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:768)
   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109)
   at 
 org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:730)
   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145)
   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:607)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:244)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10048)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:207)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:536)
   ... 32 more
 Caused by: java.lang.AssertionError: Internal error: While invoking method 
 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
   at org.apache.calcite.util.Util.newInternal(Util.java:743)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
   

[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.

2015-05-26 Thread Elliot West (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559866#comment-14559866
 ] 

Elliot West commented on HIVE-10165:


I'm not quite sure what to do next. I have a '-1' because some (unrelated) 
tests fail. However, I (perhaps naïvely) don't believe these failures are 
connected to my patch. Could someone please review?

 Improve hive-hcatalog-streaming extensibility and support updates and deletes.
 --

 Key: HIVE-10165
 URL: https://issues.apache.org/jira/browse/HIVE-10165
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Affects Versions: 1.2.0
Reporter: Elliot West
Assignee: Elliot West
  Labels: streaming_api
 Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, 
 HIVE-10165.5.patch


 h3. Overview
 I'd like to extend the 
 [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest]
  API so that it also supports the writing of record updates and deletes in 
 addition to the already supported inserts.
 h3. Motivation
 We have many Hadoop processes outside of Hive that merge changed facts into 
 existing datasets. Traditionally we achieve this by: reading in a 
 ground-truth dataset and a modified dataset, grouping by a key, sorting by a 
 sequence and then applying a function to determine inserted, updated, and 
 deleted rows. However, in our current scheme we must rewrite all partitions 
 that may potentially contain changes. In practice the number of mutated 
 records is very small when compared with the records contained in a 
 partition. This approach results in a number of operational issues:
 * Excessive amount of write activity required for small data changes.
 * Downstream applications cannot robustly read these datasets while they are 
 being updated.
 * Due to the scale of the updates (hundreds of partitions) the scope for 
 contention is high. 
 I believe we can address this problem by instead writing only the changed 
 records to a Hive transactional table. This should drastically reduce the 
 amount of data that we need to write and also provide a means for managing 
 concurrent access to the data. Our existing merge processes can read and 
 retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to 
 an updated form of the hive-hcatalog-streaming API which will then have the 
 required data to perform an update or insert in a transactional manner. 
 h3. Benefits
 * Enables the creation of large-scale dataset merge processes  
 * Opens up Hive transactional functionality in an accessible manner to 
 processes that operate outside of Hive.
 h3. Implementation
 Our changes do not break the existing API contracts. Instead our approach has 
 been to consider the functionality offered by the existing API and our 
 proposed API as fulfilling separate and distinct use-cases. The existing API 
 is primarily focused on the task of continuously writing large volumes of new 
 data into a Hive table for near-immediate analysis. Our use-case however, is 
 concerned more with the frequent but not continuous ingestion of mutations to 
 a Hive table from some ETL merge process. Consequently we feel it is 
 justifiable to add our new functionality via an alternative set of public 
 interfaces and leave the existing API as is. This keeps both APIs clean and 
 focused at the expense of presenting additional options to potential users. 
 Wherever possible, shared implementation concerns have been factored out into 
 abstract base classes that are open to third-party extension. A detailed 
 breakdown of the changes is as follows:
 * We've introduced a public {{RecordMutator}} interface whose purpose is to 
 expose insert/update/delete operations to the user. This is a counterpart to 
 the write-only {{RecordWriter}}. We've also factored out life-cycle methods 
 common to these two interfaces into a super {{RecordOperationWriter}} 
 interface.  Note that the row representation has been changed from {{byte[]}} 
 to {{Object}}. Within our data processing jobs our records are often 
 available in a strongly typed and decoded form such as a POJO or a Tuple 
 object. Therefore it seems to make sense that we are able to pass this 
 through to the {{OrcRecordUpdater}} without having to go through a {{byte[]}} 
 encoding step. This of course still allows users to use {{byte[]}} if they 
 wish.
 * The introduction of {{RecordMutator}} requires that insert/update/delete 
 operations are then also exposed on a {{TransactionBatch}} type. We've done 
 this with the introduction of a public {{MutatorTransactionBatch}} interface 
 which is a counterpart to the write-only {{TransactionBatch}}. We've also 
 factored out life-cycle methods common to these two 
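
(The message is truncated above.) For orientation, a minimal sketch of the kind of mutation interface the description outlines; the method signatures are illustrative assumptions, not the contents of the attached patch:

{code}
// Sketch only: a counterpart to the write-only RecordWriter that also
// exposes updates and deletes, per the description above. Rows are
// passed as Object; update/delete rows are assumed to carry their
// RecordIdentifier. flush/close stand in for the shared life-cycle
// methods and are likewise assumptions.
public interface RecordMutator {
  void insert(Object record) throws IOException;
  void update(Object record) throws IOException;
  void delete(Object record) throws IOException;
  void flush() throws IOException;
  void close() throws IOException;
}
{code}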

[jira] [Commented] (HIVE-10753) hs2 jdbc url - wrong connection string causes error on beeline/jdbc/odbc client, misleading message

2015-05-26 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559893#comment-14559893
 ] 

Thejas M Nair commented on HIVE-10753:
--

+1

 hs2 jdbc url - wrong connection string causes error on beeline/jdbc/odbc 
 client, misleading message
 ---

 Key: HIVE-10753
 URL: https://issues.apache.org/jira/browse/HIVE-10753
 Project: Hive
  Issue Type: Bug
  Components: Beeline, JDBC
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10753.1.patch, HIVE-10753.2.patch


 {noformat}
 beeline -u 
 'jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http' -n 
 hdiuser
 scan complete in 15ms
 Connecting to 
 jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http
 Java heap space
 Beeline version 0.14.0.2.2.4.1-1 by Apache Hive
 0: jdbc:hive2://localhost:10001/default (closed)> ^Chdiuser@headnode0:~$ 
 But it works if I use the deprecated param - 
 hdiuser@headnode0:~$ beeline -u 
 'jdbc:hive2://localhost:10001/default?hive.server2.transport.mode=http;httpPath=/'
  -n hdiuser
 scan complete in 12ms
 Connecting to 
 jdbc:hive2://localhost:10001/default?hive.server2.transport.mode=http;httpPath=/
 15/04/28 23:16:46 [main]: WARN jdbc.Utils: * JDBC param deprecation *
 15/04/28 23:16:46 [main]: WARN jdbc.Utils: The use of 
 hive.server2.transport.mode is deprecated.
 15/04/28 23:16:46 [main]: WARN jdbc.Utils: Please use transportMode like so: 
 jdbc:hive2://host:port/dbName;transportMode=transport_mode_value
 Connected to: Apache Hive (version 0.14.0.2.2.4.1-1)
 Driver: Hive JDBC (version 0.14.0.2.2.4.1-1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 Beeline version 0.14.0.2.2.4.1-1 by Apache Hive
 0: jdbc:hive2://localhost:10001/default> show tables;
 +------------------+--+
 |     tab_name     |
 +------------------+--+
 | hivesampletable  |
 +------------------+--+
 1 row selected (18.181 seconds)
 0: jdbc:hive2://localhost:10001/default> ^Chdiuser@headnode0:~$ ^C
 {noformat}
 The reason for the above message is that the URL is wrong. The correct one is:
 {code}
 beeline -u 
 'jdbc:hive2://localhost:10001/default;httpPath=/;transportMode=http' -n 
 hdiuser
 {code}
 Note the ';' instead of the '?'. The deprecation msg prints the format as well: 
 {code}
 Please use transportMode like so: 
 jdbc:hive2://host:port/dbName;transportMode=transport_mode_value
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10809) HCat FileOutputCommitterContainer leaves behind empty _SCRATCH directories

2015-05-26 Thread Selina Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Selina Zhang updated HIVE-10809:

Attachment: HIVE-10809.2.patch

The above unit test failures do not seem relevant to this patch. 

Uploaded a new patch that adds verification in TestHCatStorer that the scratch 
directories are removed. 





 HCat FileOutputCommitterContainer leaves behind empty _SCRATCH directories
 --

 Key: HIVE-10809
 URL: https://issues.apache.org/jira/browse/HIVE-10809
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 1.2.0
Reporter: Selina Zhang
Assignee: Selina Zhang
 Attachments: HIVE-10809.1.patch, HIVE-10809.2.patch


 When static partition is added through HCatStorer or HCatWriter
 {code}
 JoinedData = LOAD '/user/selinaz/data/part-r-0' USING JsonLoader();
 STORE JoinedData INTO 'selina.joined_events_e' USING 
 org.apache.hive.hcatalog.pig.HCatStorer('author=selina');
 {code}
 The table directory looks like
 {noformat}
 drwx------   - selinaz users  0 2015-05-22 21:19 
 /user/selinaz/joined_events_e/_SCRATCH0.9157208938193798
 drwx------   - selinaz users  0 2015-05-22 21:19 
 /user/selinaz/joined_events_e/author=selina
 {noformat}
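
A minimal sketch of the kind of check the new TestHCatStorer verification could make, assuming a configured Hadoop {{FileSystem}} and JUnit in scope; the exact assertion in the patch may differ:

{code}
// Sketch only: after the store completes, assert no _SCRATCH*
// directories are left behind under the table location.
FileSystem fs = FileSystem.get(conf);
Path tableDir = new Path("/user/selinaz/joined_events_e");
for (FileStatus stat : fs.listStatus(tableDir)) {
  Assert.assertFalse("leftover scratch dir: " + stat.getPath(),
      stat.getPath().getName().startsWith("_SCRATCH"));
}
{code}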



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10244) Vectorization : TPC-DS Q80 fails with java.lang.ClassCastException when hive.vectorized.execution.reduce.enabled is enabled

2015-05-26 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559918#comment-14559918
 ] 

Matt McCline commented on HIVE-10244:
-

Ya, I know, that is what I thought.  But the new prune flag seems to be on in 
the Reducer even though isGroupingSetsPresent is false.  We should talk to the 
author and reviewer of the change.

Jedi Master [~ashutoshc], can you explain to us Padawan Learners 
[~jpullokkaran] [~mmccline] [~jcamachorodriguez] all about the prune flag?
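
Until that is understood, the failure can be sidestepped by flipping off the same session setting the repro below turns on; a workaround, not a fix:

{code}
set hive.vectorized.execution.reduce.enabled=false;
{code}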

 Vectorization : TPC-DS Q80 fails with java.lang.ClassCastException when 
 hive.vectorized.execution.reduce.enabled is enabled
 ---

 Key: HIVE-10244
 URL: https://issues.apache.org/jira/browse/HIVE-10244
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Matt McCline
 Attachments: HIVE-10244.01.patch, explain_q80_vectorized_reduce_on.txt


 Query 
 {code}
 set hive.vectorized.execution.reduce.enabled=true;
 with ssr as
  (select  s_store_id as store_id,
   sum(ss_ext_sales_price) as sales,
   sum(coalesce(sr_return_amt, 0)) as returns,
   sum(ss_net_profit - coalesce(sr_net_loss, 0)) as profit
   from store_sales left outer join store_returns on
  (ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number),
  date_dim,
  store,
  item,
  promotion
  where ss_sold_date_sk = d_date_sk
and d_date between cast('1998-08-04' as date) 
   and (cast('1998-09-04' as date))
and ss_store_sk = s_store_sk
and ss_item_sk = i_item_sk
and i_current_price > 50
and ss_promo_sk = p_promo_sk
and p_channel_tv = 'N'
  group by s_store_id)
  ,
  csr as
  (select  cp_catalog_page_id as catalog_page_id,
   sum(cs_ext_sales_price) as sales,
   sum(coalesce(cr_return_amount, 0)) as returns,
   sum(cs_net_profit - coalesce(cr_net_loss, 0)) as profit
   from catalog_sales left outer join catalog_returns on
  (cs_item_sk = cr_item_sk and cs_order_number = cr_order_number),
  date_dim,
  catalog_page,
  item,
  promotion
  where cs_sold_date_sk = d_date_sk
and d_date between cast('1998-08-04' as date)
   and (cast('1998-09-04' as date))
 and cs_catalog_page_sk = cp_catalog_page_sk
and cs_item_sk = i_item_sk
and i_current_price > 50
and cs_promo_sk = p_promo_sk
and p_channel_tv = 'N'
 group by cp_catalog_page_id)
  ,
  wsr as
  (select  web_site_id,
   sum(ws_ext_sales_price) as sales,
   sum(coalesce(wr_return_amt, 0)) as returns,
   sum(ws_net_profit - coalesce(wr_net_loss, 0)) as profit
   from web_sales left outer join web_returns on
  (ws_item_sk = wr_item_sk and ws_order_number = wr_order_number),
  date_dim,
  web_site,
  item,
  promotion
  where ws_sold_date_sk = d_date_sk
and d_date between cast('1998-08-04' as date)
   and (cast('1998-09-04' as date))
 and ws_web_site_sk = web_site_sk
and ws_item_sk = i_item_sk
and i_current_price > 50
and ws_promo_sk = p_promo_sk
and p_channel_tv = 'N'
 group by web_site_id)
   select  channel
 , id
 , sum(sales) as sales
 , sum(returns) as returns
 , sum(profit) as profit
  from 
  (select 'store channel' as channel
 , concat('store', store_id) as id
 , sales
 , returns
 , profit
  from   ssr
  union all
  select 'catalog channel' as channel
 , concat('catalog_page', catalog_page_id) as id
 , sales
 , returns
 , profit
  from  csr
  union all
  select 'web channel' as channel
 , concat('web_site', web_site_id) as id
 , sales
 , returns
 , profit
  from   wsr
  ) x
  group by channel, id with rollup
  order by channel
  ,id
  limit 100
 {code}
 Exception 
 {code}
 Vertex failed, vertexName=Reducer 5, vertexId=vertex_1426707664723_1377_1_22, 
 diagnostics=[Task failed, taskId=task_1426707664723_1377_1_22_00, 
 diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
 task:java.lang.RuntimeException: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing vector batch (tag=0) 
 \N  \N  0  9.285817653506076E8  4.639990363237801E7  -1.1814318134887291E8
 \N  \N  0  4.682909323885761E8  2.2415242712669864E7  -5.966176123188091E7
 \N  \N  0  1.2847032699693155E9  6.300096113768728E7  -5.94963316209578E8
   at 
 org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
   at 
 

[jira] [Commented] (HIVE-9069) Simplify filter predicates for CBO

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559220#comment-14559220
 ] 

Hive QA commented on HIVE-9069:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735298/HIVE-9069.14.patch

{color:red}ERROR:{color} -1 due to 636 failed/errored test(s), 8974 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_deserialize_map_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5

[jira] [Commented] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases

2015-05-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559104#comment-14559104
 ] 

Hive QA commented on HIVE-10811:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735296/HIVE-10811.01.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4040/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4040/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4040/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4040/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 1f75e34 HIVE-9605:Remove parquet nested objects from wrapper 
writable objects (Sergio Pena, reviewed by Ferdinand Xu)
+ git clean -f -d
Removing hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/
Removing hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 1f75e34 HIVE-9605:Remove parquet nested objects from wrapper 
writable objects (Sergio Pena, reviewed by Ferdinand Xu)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12735296 - PreCommit-HIVE-TRUNK-Build

 RelFieldTrimmer throws NoSuchElementException in some cases
 ---

 Key: HIVE-10811
 URL: https://issues.apache.org/jira/browse/HIVE-10811
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-10811.01.patch, HIVE-10811.patch


 RelFieldTrimmer runs into NoSuchElementException in some cases.
 Stack trace:
 {noformat}
 Exception in thread "main" java.lang.AssertionError: Internal error: While 
 invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
   at org.apache.calcite.util.Util.newInternal(Util.java:743)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820)
   at 
 
