[jira] [Commented] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2016-01-19 Thread Wan Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106468#comment-15106468
 ] 

Wan Chang commented on HIVE-11097:
--

Hi [~prasanth_j], I use Hive 0.13.1 and the bug occurs with some complex SQL. 
But I couldn't reproduce the case on the master branch, so I don't know 
whether it has been fixed yet. 

> HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases
> -
>
> Key: HIVE-11097
> URL: https://issues.apache.org/jira/browse/HIVE-11097
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0
> Environment: Hive 0.13.1, Hive 2.0.0, hadoop 2.4.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Critical
> Attachments: HIVE-11097.1.patch, HIVE-11097.2.patch, 
> HIVE-11097.3.patch
>
>
> Say we have a sql as
> {code}
> create table if not exists test_orc_src (a int, b int, c int) stored as orc;
> create table if not exists test_orc_src2 (a int, b int, d int) stored as orc;
> insert overwrite table test_orc_src select 1,2,3 from src limit 1;
> insert overwrite table test_orc_src2 select 1,2,4 from src limit 1;
> set hive.auto.convert.join = false;
> set hive.execution.engine=mr;
> select
>   tb.c
> from test.test_orc_src tb
> join (select * from test.test_orc_src2) tm
> on tb.a = tm.a
> where tb.b = 2
> {code}
> The correct result is 3 but it produced no result.
> I find that in HiveInputFormat.pushProjectionsAndFilters
> {code}
> match = splitPath.startsWith(key) || splitPathWithNoSchema.startsWith(key);
> {code}
> It uses startsWith to match the pathToAliases keys against the split path, so 
> the split for tm will match both aliases in this case.
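> For illustration, a minimal sketch (with hypothetical warehouse paths) of the 
> prefix collision: any key that is a string prefix of another table's path will 
> also match that table's splits.
> {code}
> String splitPath = "/warehouse/test.db/test_orc_src2/000000_0";
> String keyTb = "/warehouse/test.db/test_orc_src";  // alias tb
> String keyTm = "/warehouse/test.db/test_orc_src2"; // alias tm
> // startsWith matches both keys, so the split for test_orc_src2 is also
> // (wrongly) associated with the alias registered for test_orc_src:
> boolean matchTb = splitPath.startsWith(keyTb); // true (spurious)
> boolean matchTm = splitPath.startsWith(keyTm); // true (intended)
> {code}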



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12429) Switch default Hive authorization to SQLStandardAuth in 2.0

2016-01-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106568#comment-15106568
 ] 

Hive QA commented on HIVE-12429:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782970/HIVE-12429.18.patch

{color:green}SUCCESS:{color} +1 due to 54 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9992 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-tez_bmj_schema_evolution.q-orc_merge5.q-vectorization_limit.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6668/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6668/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6668/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782970 - PreCommit-HIVE-TRUNK-Build

> Switch default Hive authorization to SQLStandardAuth in 2.0
> ---
>
> Key: HIVE-12429
> URL: https://issues.apache.org/jira/browse/HIVE-12429
> Project: Hive
>  Issue Type: Task
>  Components: Authorization, Security
>Affects Versions: 2.0.0
>Reporter: Alan Gates
>Assignee: Daniel Dai
> Attachments: HIVE-12429.1.patch, HIVE-12429.10.patch, 
> HIVE-12429.11.patch, HIVE-12429.12.patch, HIVE-12429.13.patch, 
> HIVE-12429.14.patch, HIVE-12429.15.patch, HIVE-12429.16.patch, 
> HIVE-12429.17.patch, HIVE-12429.18.patch, HIVE-12429.2.patch, 
> HIVE-12429.3.patch, HIVE-12429.4.patch, HIVE-12429.5.patch, 
> HIVE-12429.6.patch, HIVE-12429.7.patch, HIVE-12429.8.patch, HIVE-12429.9.patch
>
>
> Hive's default authorization is not real security, as it does not secure a 
> number of features and anyone can grant access to any object to any user.  We 
> should switch the default to SQLStandardAuth, which provides real 
> authorization.
> As this is a backwards-incompatible change, this was hard to do previously, 
> but 2.0 gives us a place to do this type of change.
> By default authorization will still be off, as there are a few other things 
> to set when turning on authorization (such as the list of admin users).
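> For reference, enabling it involves settings along these lines (illustrative 
> values; the admin user list in particular is site-specific):
> {code}
> -- normally configured in hive-site.xml rather than per session
> set hive.security.authorization.enabled=true;
> set hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory;
> set hive.users.in.admin.role=admin;
> {code}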



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12446) Tracking jira for changes required for move to Tez 0.8.2

2016-01-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106572#comment-15106572
 ] 

Hive QA commented on HIVE-12446:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782971/HIVE-12446.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6669/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6669/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6669/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] Hive Service RPC
[INFO] Spark Remote Client
[INFO] Hive Query Language
[INFO] Hive Service
[INFO] Hive Accumulo Handler
[INFO] Hive JDBC
[INFO] Hive Beeline
[INFO] Hive CLI
[INFO] Hive Contrib
[INFO] Hive HBase Handler
[INFO] Hive HCatalog
[INFO] Hive HCatalog Core
[INFO] Hive HCatalog Pig Adapter
[INFO] Hive HCatalog Server Extensions
[INFO] Hive HCatalog Webhcat Java Client
[INFO] Hive HCatalog Webhcat
[INFO] Hive HCatalog Streaming
[INFO] Hive HPL/SQL
[INFO] Hive HWI
[INFO] Hive ODBC
[INFO] Hive Llap Server
[INFO] Hive Shims Aggregator
[INFO] Hive TestUtils
[INFO] Hive Packaging
[INFO] 
[INFO] 
[INFO] Building Hive 2.1.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive ---
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/target
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source (includes 
= [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/target/tmp/conf
 [copy] Copying 16 files to 
/data/hive-ptest/working/apache-github-source-source/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive 
---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hive ---
[INFO] Installing /data/hive-ptest/working/apache-github-source-source/pom.xml 
to 
/data/hive-ptest/working/maven/org/apache/hive/hive/2.1.0-SNAPSHOT/hive-2.1.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Shims Common 2.1.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-shims-common ---
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/shims/common/target
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/shims/common (includes = 
[datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-shims-common ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ 
hive-shims-common ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
hive-shims-common ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-github-source-source/shims/common/src/main/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-shims-common 
---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
hive-shims-common ---
[INFO] Compiling 29 source files to 
/data/hive-ptest/working/apache-github-source-source/shims/common/target/classes
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java:
 Some input files use or override a deprecated API.
[WARNING] 
/data/hive-ptest/working/apache-github-source-source/shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java:
 Recompile with 

[jira] [Updated] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same

2016-01-19 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-12736:
-
Attachment: HIVE-12736.5-spark.patch

[~xuefuz], yes, it's related; I missed something here. A Group By before a 
MapJoin is not allowed, and in MR mode a {{ReduceSinkOperator}} is used to 
check whether there is a Group By before the MapJoin, which conflicts with 
Spark mode, as mentioned before. Instead of validating MapJoin compatibility 
with other operators through {{opAllowedBeforeMapJoin()}} and 
{{opAllowedAfterMapJoin()}}, it should be easier and more proper to implement 
this through pattern matching. I didn't rewrite the validation for MR mode; I 
just added new validation logic for Spark mode based on pattern matching.
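
A rough sketch of the pattern-match idea (illustrative only, not the patch 
itself): walk up from the MapJoin operator and reject the conversion if a 
Group By appears on the path.
{code}
// Hypothetical helper; Operator and GroupByOperator are the Hive classes.
private static boolean hasGroupByBefore(Operator<?> op) {
  if (op == null) {
    return false;
  }
  if (op instanceof GroupByOperator) {
    return true;
  }
  if (op.getParentOperators() == null) {
    return false;
  }
  // recurse through every parent branch of the operator tree
  for (Operator<?> parent : op.getParentOperators()) {
    if (hasGroupByBefore(parent)) {
      return true;
    }
  }
  return false;
}
{code}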

> It seems that result of Hive on Spark be mistaken and result of Hive and Hive 
> on Spark are not the same
> ---
>
> Key: HIVE-12736
> URL: https://issues.apache.org/jira/browse/HIVE-12736
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.1, 1.2.1
>Reporter: JoneZhang
>Assignee: Chengxiang Li
> Attachments: HIVE-12736.1-spark.patch, HIVE-12736.2-spark.patch, 
> HIVE-12736.3-spark.patch, HIVE-12736.4-spark.patch, HIVE-12736.5-spark.patch
>
>
> {code}
> select  * from staff;
> 1 jone22  1
> 2 lucy21  1
> 3 hmm 22  2
> 4 james   24  3
> 5 xiaoliu 23  3
> select id,date_ from trade union all select id,"test" from trade ;
> 1 201510210908
> 2 201509080234
> 2 201509080235
> 1 test
> 2 test
> 2 test
> set hive.execution.engine=spark;
> set spark.master=local;
> select /*+mapjoin(t)*/ * from staff s join 
> (select id,date_ from trade union all select id,"test" from trade ) t on 
> s.id=t.id;
> 1 jone22  1   1   201510210908
> 2 lucy21  1   2   201509080234
> 2 lucy21  1   2   201509080235
> set hive.execution.engine=mr;
> select /*+mapjoin(t)*/ * from staff s join 
> (select id,date_ from trade union all select id,"test" from trade ) t on 
> s.id=t.id;
> FAILED: SemanticException [Error 10227]: Not all clauses are supported with 
> mapjoin hint. Please remove mapjoin hint.
> {code}
> I have two questions:
> 1. Why does the result of Hive on Spark not include the following record?
> {code}
> 1 jone22  1   1   test
> 2 lucy21  1   2   test
> 2 lucy21  1   2   test
> {code}
> 2. Why are there two different ways of dealing with the same query?
> explain 1:
> {code}
> set hive.execution.engine=spark;
> set spark.master=local;
> explain 
> select id,date_ from trade union all select id,"test" from trade;
> OK
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Spark
>   DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: trade
>   Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: id (type: int), date_ (type: string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 6 Data size: 48 Basic stats: 
> COMPLETE Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 12 Data size: 96 Basic stats: 
> COMPLETE Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.TextInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>   serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> Map 2 
> Map Operator Tree:
> TableScan
>   alias: trade
>   Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: id (type: int), 'test' (type: string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 6 Data size: 48 Basic stats: 
> COMPLETE Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 12 Data size: 96 Basic stats: 
> COMPLETE Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.TextInputFormat
>   output format: 
> 

[jira] [Commented] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same

2016-01-19 Thread Chengxiang Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106575#comment-15106575
 ] 

Chengxiang Li commented on HIVE-12736:
--

Besides, during testing I found that TestSparkNegativeCliDriver actually runs 
in MR mode; I will create another JIRA to track it.

> It seems that result of Hive on Spark be mistaken and result of Hive and Hive 
> on Spark are not the same
> ---
>
> Key: HIVE-12736
> URL: https://issues.apache.org/jira/browse/HIVE-12736
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.1, 1.2.1
>Reporter: JoneZhang
>Assignee: Chengxiang Li
> Attachments: HIVE-12736.1-spark.patch, HIVE-12736.2-spark.patch, 
> HIVE-12736.3-spark.patch, HIVE-12736.4-spark.patch, HIVE-12736.5-spark.patch
>
>
> {code}
> select  * from staff;
> 1 jone22  1
> 2 lucy21  1
> 3 hmm 22  2
> 4 james   24  3
> 5 xiaoliu 23  3
> select id,date_ from trade union all select id,"test" from trade ;
> 1 201510210908
> 2 201509080234
> 2 201509080235
> 1 test
> 2 test
> 2 test
> set hive.execution.engine=spark;
> set spark.master=local;
> select /*+mapjoin(t)*/ * from staff s join 
> (select id,date_ from trade union all select id,"test" from trade ) t on 
> s.id=t.id;
> 1 jone22  1   1   201510210908
> 2 lucy21  1   2   201509080234
> 2 lucy21  1   2   201509080235
> set hive.execution.engine=mr;
> select /*+mapjoin(t)*/ * from staff s join 
> (select id,date_ from trade union all select id,"test" from trade ) t on 
> s.id=t.id;
> FAILED: SemanticException [Error 10227]: Not all clauses are supported with 
> mapjoin hint. Please remove mapjoin hint.
> {code}
> I have two questions:
> 1. Why does the result of Hive on Spark not include the following record?
> {code}
> 1 jone22  1   1   test
> 2 lucy21  1   2   test
> 2 lucy21  1   2   test
> {code}
> 2. Why are there two different ways of dealing with the same query?
> explain 1:
> {code}
> set hive.execution.engine=spark;
> set spark.master=local;
> explain 
> select id,date_ from trade union all select id,"test" from trade;
> OK
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Spark
>   DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: trade
>   Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: id (type: int), date_ (type: string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 6 Data size: 48 Basic stats: 
> COMPLETE Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 12 Data size: 96 Basic stats: 
> COMPLETE Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.TextInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>   serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> Map 2 
> Map Operator Tree:
> TableScan
>   alias: trade
>   Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: id (type: int), 'test' (type: string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 6 Data size: 48 Basic stats: 
> COMPLETE Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 12 Data size: 96 Basic stats: 
> COMPLETE Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.TextInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>   serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> ListSink
> {code}
> explain 2:
> {code}
> set hive.execution.engine=spark;
> set spark.master=local;
> explain 
> select /*+mapjoin(t)*/ * from staff s join 
> (select id,date_ from trade union all select id,"test" from trade ) t on 
> 

[jira] [Commented] (HIVE-12864) StackOverflowError parsing queries with very large predicates

2016-01-19 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106500#comment-15106500
 ] 

Jesus Camacho Rodriguez commented on HIVE-12864:


[~pxiong], thanks for checking on this. 

The original five methods are tree traversals implemented recursively.
The new ones in the patch are similar to the original ones, but I rewrote each 
of them to be iterative (using stacks). This avoids the StackOverflowError.
Concretely:
* setUnknownTokenBoundaries(): post-order
* dump(StringBuilder sb): pre- and post-order
* toStringTree(ASTNode rootNode): pre- and post-order
* processPositionAlias(ASTNode ast): pre-order
* findSubQueries(ASTNode node, List subQueries): pre-order

These algorithms are part of the parsing logic, so you can pick any query in 
the q files (e.g. lineage2.q, lineage3.q, subquery_in.q) to walk through each 
of the algorithms.
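
For reference, the iterative pattern looks roughly like this (a sketch only, 
with a hypothetical visit() callback; the patch applies the same idea to each 
of the five methods):
{code}
import java.util.ArrayDeque;
import java.util.Deque;

// Pre-order traversal with an explicit stack: tree depth no longer
// consumes call-stack frames, so very large predicates cannot overflow.
static void preOrder(ASTNode root) {
  Deque<ASTNode> stack = new ArrayDeque<>();
  stack.push(root);
  while (!stack.isEmpty()) {
    ASTNode node = stack.pop();
    visit(node); // hypothetical callback: process node before its children
    // push children in reverse so they are visited left to right
    for (int i = node.getChildCount() - 1; i >= 0; i--) {
      stack.push((ASTNode) node.getChild(i));
    }
  }
}
{code}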

> StackOverflowError parsing queries with very large predicates
> -
>
> Key: HIVE-12864
> URL: https://issues.apache.org/jira/browse/HIVE-12864
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12864.01.patch, HIVE-12864.patch
>
>
> We have seen that queries with very large predicates might fail with the 
> following stacktrace:
> {noformat}
> 2016-01-12 05:47:36,516|beaver.machine|INFO|552|5072|Thread-22|Exception in 
> thread "main" java.lang.StackOverflowError
> 2016-01-12 05:47:36,517|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:145)
> 2016-01-12 05:47:36,517|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,517|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,517|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,517|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146)
> 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at 
> 

[jira] [Updated] (HIVE-12429) Switch default Hive authorization to SQLStandardAuth in 2.0

2016-01-19 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-12429:

Labels: TODOC2.0  (was: )

> Switch default Hive authorization to SQLStandardAuth in 2.0
> ---
>
> Key: HIVE-12429
> URL: https://issues.apache.org/jira/browse/HIVE-12429
> Project: Hive
>  Issue Type: Task
>  Components: Authorization, Security
>Affects Versions: 2.0.0
>Reporter: Alan Gates
>Assignee: Daniel Dai
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12429.1.patch, HIVE-12429.10.patch, 
> HIVE-12429.11.patch, HIVE-12429.12.patch, HIVE-12429.13.patch, 
> HIVE-12429.14.patch, HIVE-12429.15.patch, HIVE-12429.16.patch, 
> HIVE-12429.17.patch, HIVE-12429.18.patch, HIVE-12429.2.patch, 
> HIVE-12429.3.patch, HIVE-12429.4.patch, HIVE-12429.5.patch, 
> HIVE-12429.6.patch, HIVE-12429.7.patch, HIVE-12429.8.patch, HIVE-12429.9.patch
>
>
> Hive's default authorization is not real security, as it does not secure a 
> number of features and anyone can grant access to any object to any user.  We 
> should switch the default to SQLStandardAuth, which provides real 
> authorization.
> As this is a backwards-incompatible change, this was hard to do previously, 
> but 2.0 gives us a place to do this type of change.
> By default authorization will still be off, as there are a few other things 
> to set when turning on authorization (such as the list of admin users).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-1633) CombineHiveInputFormat fails with "cannot find dir for emptyFile"

2016-01-19 Thread Ergin Demirel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107521#comment-15107521
 ] 

Ergin Demirel commented on HIVE-1633:
-

We are still getting this error message when trying to load an empty 
table/file running in local mode. 
I tried adding "file://" in front of the path, though it didn't help. Can 
someone please clarify the solution here? 

Hive Version: 0.10.0+121-1.cdh4.3.0.p0.16~precise-cdh4.3.0

Thanks

{code}
java.io.FileNotFoundException: File does not exist: 
/tmp/hdfs/hive_2016-01-19_21-40-07_727_4067638808884572526/-mr-10000/1/emptyFile
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:807)
at 
org.apache.hadoop.mapred.lib.CombineFileInputFormat$OneFileInfo.<init>(CombineFileInputFormat.java:462)
at 
org.apache.hadoop.mapred.lib.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:256)
at 
org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:212)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:411)
at 
org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:377)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:387)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1091)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1083)
at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:993)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:946)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:946)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:920)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:690)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Job Submission failed with exception 'java.io.FileNotFoundException(File does 
not exist: 
/tmp/hdfs/hive_2016-01-19_21-40-07_727_4067638808884572526/-mr-10000/1/emptyFile)'
Execution failed with exit status: 1
{code}

> CombineHiveInputFormat fails with "cannot find dir for emptyFile"
> -
>
> Key: HIVE-1633
> URL: https://issues.apache.org/jira/browse/HIVE-1633
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Reporter: Amareshwari Sriramadasu
>Assignee: Sreekanth Ramakrishnan
> Fix For: 0.7.0
>
> Attachments: HIVE-1633.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12867) Semantic Exception Error Msg should be with in the range of "10000 to 19999"

2016-01-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12867:
-
Attachment: HIVE-12867.2.patch

> Semantic Exception Error Msg should be with in the range of "10000 to 19999"
> 
>
> Key: HIVE-12867
> URL: https://issues.apache.org/jira/browse/HIVE-12867
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Laljo John Pullokkaran
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12867.1.patch, HIVE-12867.2.patch
>
>
> At many places, errors encountered during semantic analysis are translated 
> into the generic error msg (GENERIC_ERROR, 40000) as opposed to a semantic 
> error msg.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12867) Semantic Exception Error Msg should be with in the range of "10000 to 19999"

2016-01-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12867:
-
Attachment: (was: HIVE-12867.2.patch)

> Semantic Exception Error Msg should be with in the range of "10000 to 19999"
> 
>
> Key: HIVE-12867
> URL: https://issues.apache.org/jira/browse/HIVE-12867
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Laljo John Pullokkaran
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12867.1.patch, HIVE-12867.2.patch
>
>
> At many places, errors encountered during semantic analysis are translated 
> into the generic error msg (GENERIC_ERROR, 40000) as opposed to a semantic 
> error msg.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12798) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver.vector* queries failures due to NPE in Vectorizer.onExpressionHasNullSafes()

2016-01-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107596#comment-15107596
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-12798:
--

I've committed this to master. cc-ing [~sershe] for approval for commit to 
branch-2.0. This can lead to an NPE even in the regular code path with 
vectorization turned on.

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver.vector* queries failures due to NPE in 
> Vectorizer.onExpressionHasNullSafes()
> ---
>
> Key: HIVE-12798
> URL: https://issues.apache.org/jira/browse/HIVE-12798
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12798.1.patch
>
>
> As of 01/04/2016, the following tests fail in the MiniTezCliDriver mode when 
> the cbo return path is enabled. We need to fix them:
> {code}
>  vector_leftsemi_mapjoin
>  vector_join_filters
>  vector_interval_mapjoin
>  vector_left_outer_join
>  vectorized_mapjoin
>  vector_inner_join
>  vectorized_context
>  tez_vector_dynpart_hashjoin_1
>  count
>  auto_sortmerge_join_6
>  skewjoin
>  vector_auto_smb_mapjoin_14
>  auto_join_filters
>  vector_outer_join0
>  vector_outer_join1
>  vector_outer_join2
>  vector_outer_join3
>  vector_outer_join4
>  vector_outer_join5
>  hybridgrace_hashjoin_1
>  vector_mapjoin_reduce
>  vectorized_nested_mapjoin
>  vector_left_outer_join2
>  vector_char_mapjoin1
>  vector_decimal_mapjoin
>  vectorized_dynamic_partition_pruning
>  vector_varchar_mapjoin1
> {code}
> This jira is intended to cover the vectorization issues related to the 
> MiniTezCliDriver failures caused by an NPE via the nullSafes array, as shown 
> below:
> {code}
> private boolean onExpressionHasNullSafes(MapJoinDesc desc) {
>  boolean[] nullSafes = desc.getNullSafes();
>  for (boolean nullSafe : nullSafes) {
> {code}
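> A defensive null check along these lines would avoid the NPE (a sketch only; 
> see the attached patch for the actual fix):
> {code}
> private boolean onExpressionHasNullSafes(MapJoinDesc desc) {
>   boolean[] nullSafes = desc.getNullSafes();
>   if (nullSafes == null) {
>     return false; // no null-safe join info populated on this path
>   }
>   for (boolean nullSafe : nullSafes) {
>     if (nullSafe) {
>       return true;
>     }
>   }
>   return false;
> }
> {code}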



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12855) LLAP: add checks when resolving UDFs to enforce whitelist

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12855:

Attachment: HIVE-12855.01.patch

> LLAP: add checks when resolving UDFs to enforce whitelist
> -
>
> Key: HIVE-12855
> URL: https://issues.apache.org/jira/browse/HIVE-12855
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12855.01.patch, HIVE-12855.part.patch
>
>
> Currently, adding a temporary UDF and calling LLAP with it (bypassing the 
> LlapDecider check, I did it by just modifying the source) only fails because 
> the class could not be found. If the UDF was accessible to LLAP, it would 
> execute. Inside the daemon, UDF instantiation should fail for custom UDFs 
> (and only succeed for whitelisted custom UDFs, once that is implemented).
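> A minimal sketch of the intended check (hypothetical names; the real 
> whitelist wiring is what this patch adds):
> {code}
> import java.util.Set;
>
> // Refuse to instantiate non-built-in UDF classes inside the daemon
> // unless they appear on an explicit whitelist.
> static void checkUdfAllowed(String udfClassName, Set<String> whitelist) {
>   boolean builtIn = udfClassName.startsWith("org.apache.hadoop.hive.ql.udf.");
>   if (!builtIn && !whitelist.contains(udfClassName)) {
>     throw new SecurityException(
>         "UDF " + udfClassName + " is not whitelisted for LLAP");
>   }
> }
> {code}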



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12855) LLAP: add checks when resolving UDFs to enforce whitelist

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12855:

Attachment: (was: HIVE-12855.patch)

> LLAP: add checks when resolving UDFs to enforce whitelist
> -
>
> Key: HIVE-12855
> URL: https://issues.apache.org/jira/browse/HIVE-12855
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12855.01.patch, HIVE-12855.part.patch
>
>
> Currently, adding a temporary UDF and calling LLAP with it (bypassing the 
> LlapDecider check, I did it by just modifying the source) only fails because 
> the class could not be found. If the UDF was accessible to LLAP, it would 
> execute. Inside the daemon, UDF instantiation should fail for custom UDFs 
> (and only succeed for whitelisted custom UDFs, once that is implemented).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12783) fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl

2016-01-19 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107462#comment-15107462
 ] 

Owen O'Malley commented on HIVE-12783:
--

I just committed this.

> fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl
> -
>
> Key: HIVE-12783
> URL: https://issues.apache.org/jira/browse/HIVE-12783
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.0.0
>Reporter: Pengcheng Xiong
>Assignee: Owen O'Malley
>Priority: Blocker
> Attachments: HIVE-12783.patch, HIVE-12783.patch, HIVE-12783.patch
>
>
> This includes
> {code}
> org.apache.hive.spark.client.TestSparkClient.testSyncRpc
> org.apache.hive.spark.client.TestSparkClient.testJobSubmission
> org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
> org.apache.hive.spark.client.TestSparkClient.testCounters
> org.apache.hive.spark.client.TestSparkClient.testRemoteClient
> org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
> org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
> org.apache.hive.spark.client.TestSparkClient.testErrorJob
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
> {code}
> All of them passed on my laptop. cc'ing [~szehon], [~xuefuz], could you 
> please take a look? Shall we ignore them? Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12682) Reducers in dynamic partitioning job spend a lot of time running hadoop.conf.Configuration.getOverlay

2016-01-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12682:
-
Attachment: HIVE-12682-branch-1.patch

> Reducers in dynamic partitioning job spend a lot of time running 
> hadoop.conf.Configuration.getOverlay
> -
>
> Key: HIVE-12682
> URL: https://issues.apache.org/jira/browse/HIVE-12682
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Carter Shanklin
>Assignee: Prasanth Jayachandran
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12682-branch-1.patch, HIVE-12682.1.patch, 
> HIVE-12682.2.patch, reducer.png
>
>
> I tested this on Hive 1.2.1 but it looks like it's still applicable to 2.0.
> I ran this query:
> {code}
> create table flights (
> …
> )
> PARTITIONED BY (Year int)
> CLUSTERED BY (Month)
> SORTED BY (DayofMonth) into 12 buckets
> STORED AS ORC
> TBLPROPERTIES("orc.bloom.filter.columns"="*")
> ;
> {code}
> (Taken from here: 
> https://github.com/t3rmin4t0r/all-airlines-data/blob/master/ddl/orc.sql)
> I profiled just the reduce phase and noticed something odd, the attached 
> graph shows where time was spent during the reducer phase.
> !reducer.png!
> Problem seems to relate to 
> https://github.com/apache/hive/blob/branch-2.0/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java#L903
> /cc [~gopalv]
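> The usual remedy for this pattern (illustrative only, hypothetical names) is 
> to read the configuration once and cache it, since every Configuration.get() 
> walks the overlay and properties maps:
> {code}
> // org.apache.hadoop.conf.Configuration: cache a value instead of
> // re-reading it per row/partition in the hot path.
> private transient Integer cachedMaxParts; // hypothetical cached setting
>
> int getMaxParts(Configuration conf) {
>   if (cachedMaxParts == null) {
>     cachedMaxParts = conf.getInt("hypothetical.max.parts", 100);
>   }
>   return cachedMaxParts;
> }
> {code}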



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12855) LLAP: add checks when resolving UDFs to enforce whitelist

2016-01-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107373#comment-15107373
 ] 

Sergey Shelukhin commented on HIVE-12855:
-

Looks like this approach won't work in the MiniLlap cluster, because the 
embedded daemon causes the global registration on the client, which causes the 
AM to fail to parse. Stupid Kryo needs a proper hook system... I think I will 
keep only the global hook for now.

> LLAP: add checks when resolving UDFs to enforce whitelist
> -
>
> Key: HIVE-12855
> URL: https://issues.apache.org/jira/browse/HIVE-12855
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12855.part.patch, HIVE-12855.patch
>
>
> Currently, adding a temporary UDF and calling LLAP with it (bypassing the 
> LlapDecider check, I did it by just modifying the source) only fails because 
> the class could not be found. If the UDF was accessible to LLAP, it would 
> execute. Inside the daemon, UDF instantiation should fail for custom UDFs 
> (and only succeed for whitelisted custom UDFs, once that is implemented).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12763) Use bit vector to track NDV

2016-01-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107384#comment-15107384
 ] 

Hive QA commented on HIVE-12763:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12783018/HIVE-12763.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 39 failed/errored test(s), 10003 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_interval_2.q-bucket3.q-vectorization_7.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_udf1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl_dp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_quoting
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compustat_avro
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_date
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_decimal
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_double
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_empty_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_long
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_display_colstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_temp_table_display_colstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_varchar_udf1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.hit
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.someWithStats
org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.someNonexistentPartitions
org.apache.hadoop.hive.metastore.hbase.TestHBaseSchemaTool.oneMondoTest
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.decimalPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.decimalTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.doublePartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.doubleTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.longPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.longTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.stringPartitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.stringTableStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.partitionStatistics
org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.tableStatistics
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6673/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6673/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6673/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 39 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12783018 - PreCommit-HIVE-TRUNK-Build

> Use bit vector to track NDV
> ---
>
> Key: HIVE-12763
> URL: https://issues.apache.org/jira/browse/HIVE-12763
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12763.01.patch, HIVE-12763.02.patch
>
>
> This will improve merging of per partitions stats. It will also help merge 
> NDV for auto-gather column stats.
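> Conceptually (a sketch, not the patch): with a Flajolet-Martin style bit 
> vector kept per partition, merging across partitions becomes a bitwise OR of 
> the sketches; plain per-partition distinct counts cannot be combined that way.
> {code}
> // Union two per-partition NDV bit-vector sketches of equal length.
> static long[] mergeNdvSketches(long[] a, long[] b) {
>   long[] merged = new long[a.length];
>   for (int i = 0; i < a.length; i++) {
>     merged[i] = a[i] | b[i];
>   }
>   return merged;
> }
> {code}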



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12805) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver skewjoin.q failure

2016-01-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107427#comment-15107427
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-12805:
--

^ Typo: patch 3

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver skewjoin.q failure
> -
>
> Key: HIVE-12805
> URL: https://issues.apache.org/jira/browse/HIVE-12805
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12805.1.patch, HIVE-12805.2.patch, 
> HIVE-12805.3.patch
>
>
> Set hive.cbo.returnpath.hiveop=true
> {code}
> FROM T1 a FULL OUTER JOIN T2 c ON c.key+1=a.key SELECT /*+ STREAMTABLE(a) */ 
> sum(hash(a.key)), sum(hash(a.val)), sum(hash(c.key))
> {code}
> The stack trace:
> {code}
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate$JoinSynthetic.process(SyntheticJoinPredicate.java:183)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at 
> org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate.transform(SyntheticJoinPredicate.java:100)
> at 
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:236)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:231)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:471)
> {code}
> Same error happens in auto_sortmerge_join_6.q.out for 
> {code}
> select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on 
> h.value = a.value
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12783) fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl

2016-01-19 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107471#comment-15107471
 ] 

Pengcheng Xiong commented on HIVE-12783:


[~owen.omalley], thanks a lot. :)

> fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl
> -
>
> Key: HIVE-12783
> URL: https://issues.apache.org/jira/browse/HIVE-12783
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.0.0
>Reporter: Pengcheng Xiong
>Assignee: Owen O'Malley
>Priority: Blocker
> Attachments: HIVE-12783.patch, HIVE-12783.patch, HIVE-12783.patch
>
>
> This includes
> {code}
> org.apache.hive.spark.client.TestSparkClient.testSyncRpc
> org.apache.hive.spark.client.TestSparkClient.testJobSubmission
> org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
> org.apache.hive.spark.client.TestSparkClient.testCounters
> org.apache.hive.spark.client.TestSparkClient.testRemoteClient
> org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
> org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
> org.apache.hive.spark.client.TestSparkClient.testErrorJob
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
> {code}
> All of them passed on my laptop. cc'ing [~szehon], [~xuefuz], could you 
> please take a look? Shall we ignore them? Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12865) Exchange partition does not show inputs field for post/pre execute hooks

2016-01-19 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12865:

Attachment: HIVE-12865.patch

> Exchange partition does not show inputs field for post/pre execute hooks
> 
>
> Key: HIVE-12865
> URL: https://issues.apache.org/jira/browse/HIVE-12865
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0
>Reporter: Paul Yang
>Assignee: Aihua Xu
> Attachments: HIVE-12865.patch
>
>
> The pre/post execute hook interface has fields that indicate which Hive 
> objects were read / written to as a result of running the query. For the 
> exchange partition operation, the read entity field is empty.
> This is an important issue as the hook interface may be configured to perform 
> critical warehouse operations.
> See
> ql/src/test/results/clientpositive/exchange_partition3.q.out
> {code}
> --- a/ql/src/test/results/clientpositive/exchange_partition3.q.out
> +++ b/ql/src/test/results/clientpositive/exchange_partition3.q.out
> @@ -65,9 +65,17 @@ ds=2013-04-05/hr=2
>  PREHOOK: query: -- This will exchange both partitions hr=1 and hr=2
>  ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH 
> TABLE exchange_part_test2
>  PREHOOK: type: ALTERTABLE_EXCHANGEPARTITION
> +PREHOOK: Output: default@exchange_part_test1
> +PREHOOK: Output: default@exchange_part_test2
>  POSTHOOK: query: -- This will exchange both partitions hr=1 and hr=2
>  ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH 
> TABLE exchange_part_test2
>  POSTHOOK: type: ALTERTABLE_EXCHANGEPARTITION
> +POSTHOOK: Output: default@exchange_part_test1
> +POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=1
> +POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=2
> +POSTHOOK: Output: default@exchange_part_test2
> +POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=1
> +POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=2
>  PREHOOK: query: SHOW PARTITIONS exchange_part_test1
>  PREHOOK: type: SHOWPARTITIONS
>  PREHOOK: Input: default@exchange_part_test1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12220) LLAP: Usability issues with hive.llap.io.cache.orc.size

2016-01-19 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107561#comment-15107561
 ] 

Prasanth Jayachandran commented on HIVE-12220:
--

+1

> LLAP: Usability issues with hive.llap.io.cache.orc.size
> ---
>
> Key: HIVE-12220
> URL: https://issues.apache.org/jira/browse/HIVE-12220
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Carter Shanklin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12220.01.patch, HIVE-12220.02.patch, 
> HIVE-12220.patch, HIVE-12220.tmp.patch
>
>
> In the llap-daemon site you need to set, among other things,
> llap.daemon.memory.per.instance.mb
> and
> hive.llap.io.cache.orc.size
> The use of hive.llap.io.cache.orc.size caused me some unnecessary problems; 
> initially I entered the value in MB rather than in bytes. Operator error, you 
> could say, but I look at this as a fraction of the other value, which is in MB.
> Second, is this really tied to ORC? E.g. when we have the vectorized text 
> reader will this data be cached as well? Or might it be in the future?
> I would like to propose instead using hive.llap.io.cache.size.mb for this 
> setting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12429) Switch default Hive authorization to SQLStandardAuth in 2.0

2016-01-19 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107402#comment-15107402
 ] 

Sushanth Sowmyan commented on HIVE-12429:
-

Thanks for the update, Daniel. LGTM. +1.

Will go ahead and commit.

> Switch default Hive authorization to SQLStandardAuth in 2.0
> ---
>
> Key: HIVE-12429
> URL: https://issues.apache.org/jira/browse/HIVE-12429
> Project: Hive
>  Issue Type: Task
>  Components: Authorization, Security
>Affects Versions: 2.0.0
>Reporter: Alan Gates
>Assignee: Daniel Dai
> Attachments: HIVE-12429.1.patch, HIVE-12429.10.patch, 
> HIVE-12429.11.patch, HIVE-12429.12.patch, HIVE-12429.13.patch, 
> HIVE-12429.14.patch, HIVE-12429.15.patch, HIVE-12429.16.patch, 
> HIVE-12429.17.patch, HIVE-12429.18.patch, HIVE-12429.2.patch, 
> HIVE-12429.3.patch, HIVE-12429.4.patch, HIVE-12429.5.patch, 
> HIVE-12429.6.patch, HIVE-12429.7.patch, HIVE-12429.8.patch, HIVE-12429.9.patch
>
>
> Hive's default authorization is not real security, as it does not secure a 
> number of features and anyone can grant access to any object to any user.  We 
> should switch the default to SQLStandardAuth, which provides real 
> authorization.
> As this is a backwards-incompatible change, this was hard to do previously, 
> but 2.0 gives us a place to do this type of change.
> By default authorization will still be off, as there are a few other things 
> to set when turning on authorization (such as the list of admin users).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12883) Support basic stats and column stats in table properties in HBaseStore

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12883:

Target Version/s: 2.0.0

> Support basic stats and column stats in table properties in HBaseStore
> --
>
> Key: HIVE-12883
> URL: https://issues.apache.org/jira/browse/HIVE-12883
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
> Attachments: HIVE-12883.01.patch, HIVE-12883.02.patch, 
> HIVE-12883.03.patch
>
>
> Need to add support for HBase store too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12805) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver skewjoin.q failure

2016-01-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12805:
-
Attachment: HIVE-12805.2.patch

Addressing [~ashutoshc]'s comments in patch 2.

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver skewjoin.q failure
> -
>
> Key: HIVE-12805
> URL: https://issues.apache.org/jira/browse/HIVE-12805
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12805.1.patch, HIVE-12805.2.patch, 
> HIVE-12805.3.patch
>
>
> Set hive.cbo.returnpath.hiveop=true
> {code}
> FROM T1 a FULL OUTER JOIN T2 c ON c.key+1=a.key SELECT /*+ STREAMTABLE(a) */ 
> sum(hash(a.key)), sum(hash(a.val)), sum(hash(c.key))
> {code}
> The stack trace:
> {code}
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate$JoinSynthetic.process(SyntheticJoinPredicate.java:183)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at 
> org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate.transform(SyntheticJoinPredicate.java:100)
> at 
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:236)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:231)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:471)
> {code}
> Same error happens in auto_sortmerge_join_6.q.out for 
> {code}
> select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on 
> h.value = a.value
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12805) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver skewjoin.q failure

2016-01-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12805:
-
Attachment: (was: HIVE-12805.2.patch)

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver skewjoin.q failure
> -
>
> Key: HIVE-12805
> URL: https://issues.apache.org/jira/browse/HIVE-12805
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12805.1.patch, HIVE-12805.2.patch, 
> HIVE-12805.3.patch
>
>
> Set hive.cbo.returnpath.hiveop=true
> {code}
> FROM T1 a FULL OUTER JOIN T2 c ON c.key+1=a.key SELECT /*+ STREAMTABLE(a) */ 
> sum(hash(a.key)), sum(hash(a.val)), sum(hash(c.key))
> {code}
> The stack trace:
> {code}
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate$JoinSynthetic.process(SyntheticJoinPredicate.java:183)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at 
> org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate.transform(SyntheticJoinPredicate.java:100)
> at 
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:236)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:231)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:471)
> {code}
> Same error happens in auto_sortmerge_join_6.q.out for 
> {code}
> select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on 
> h.value = a.value
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12805) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver skewjoin.q failure

2016-01-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12805:
-
Attachment: HIVE-12805.3.patch

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver skewjoin.q failure
> -
>
> Key: HIVE-12805
> URL: https://issues.apache.org/jira/browse/HIVE-12805
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12805.1.patch, HIVE-12805.2.patch, 
> HIVE-12805.3.patch
>
>
> Set hive.cbo.returnpath.hiveop=true
> {code}
> FROM T1 a FULL OUTER JOIN T2 c ON c.key+1=a.key SELECT /*+ STREAMTABLE(a) */ 
> sum(hash(a.key)), sum(hash(a.val)), sum(hash(c.key))
> {code}
> The stack trace:
> {code}
> java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> at java.util.ArrayList.rangeCheck(ArrayList.java:635)
> at java.util.ArrayList.get(ArrayList.java:411)
> at 
> org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate$JoinSynthetic.process(SyntheticJoinPredicate.java:183)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at 
> org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate.transform(SyntheticJoinPredicate.java:100)
> at 
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:236)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:231)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:471)
> {code}
> Same error happens in auto_sortmerge_join_6.q.out for 
> {code}
> select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on 
> h.value = a.value
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12883) Support basic stats and column stats in table properties in HBaseStore

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12883:

Priority: Blocker  (was: Major)

> Support basic stats and column stats in table properties in HBaseStore
> --
>
> Key: HIVE-12883
> URL: https://issues.apache.org/jira/browse/HIVE-12883
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Blocker
> Attachments: HIVE-12883.01.patch, HIVE-12883.02.patch, 
> HIVE-12883.03.patch
>
>
> Need to add support for HBase store too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12429) Switch default Hive authorization to SQLStandardAuth in 2.0

2016-01-19 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107445#comment-15107445
 ] 

Daniel Dai commented on HIVE-12429:
---

Thanks [~sushanth]!

> Switch default Hive authorization to SQLStandardAuth in 2.0
> ---
>
> Key: HIVE-12429
> URL: https://issues.apache.org/jira/browse/HIVE-12429
> Project: Hive
>  Issue Type: Task
>  Components: Authorization, Security
>Affects Versions: 2.0.0
>Reporter: Alan Gates
>Assignee: Daniel Dai
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12429.1.patch, HIVE-12429.10.patch, 
> HIVE-12429.11.patch, HIVE-12429.12.patch, HIVE-12429.13.patch, 
> HIVE-12429.14.patch, HIVE-12429.15.patch, HIVE-12429.16.patch, 
> HIVE-12429.17.patch, HIVE-12429.18.patch, HIVE-12429.2.patch, 
> HIVE-12429.3.patch, HIVE-12429.4.patch, HIVE-12429.5.patch, 
> HIVE-12429.6.patch, HIVE-12429.7.patch, HIVE-12429.8.patch, HIVE-12429.9.patch
>
>
> Hive's default authorization is not real security, as it does not secure a 
> number of features and anyone can grant access to any object to any user.  We 
> should switch the default to SQLStandardAuth, which provides real 
> authorization.
> As this is a backwards-incompatible change it was hard to do previously, 
> but 2.0 gives us a place to make this type of change.
> By default authorization will still be off, as there are a few other things 
> to set when turning on authorization (such as the list of admin users).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12890) Disable multi-statement transaction control statements until HIVE-11078

2016-01-19 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-12890:
--
Attachment: HIVE-12890.patch

[~alangates], could you review?

> Disable multi-statement transaction control statements until HIVE-11078
> --
>
> Key: HIVE-12890
> URL: https://issues.apache.org/jira/browse/HIVE-12890
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12890.patch
>
>
> HIVE-11077 added support for begin transaction/commit/rollback but the 
> feature is not complete w/o HIVE-11078.  Need to disable these statements to 
> prevent user confusion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12528) don't start HS2 Tez sessions in a single thread

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12528:

Attachment: HIVE-12528.05.patch

Another rebase.

> don't start HS2 Tez sessions in a single thread
> ---
>
> Key: HIVE-12528
> URL: https://issues.apache.org/jira/browse/HIVE-12528
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12528.01.patch, HIVE-12528.02.patch, 
> HIVE-12528.03.patch, HIVE-12528.04.patch, HIVE-12528.05.patch, 
> HIVE-12528.patch
>
>
> Starting sessions in parallel would improve the startup time.
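> A minimal sketch of the idea (the startSession() helper is hypothetical, not 
> the actual patch):
> {code}
> void startSessions(int numSessions) throws Exception {
>   ExecutorService pool = Executors.newFixedThreadPool(numSessions);
>   List<Future<?>> pending = new ArrayList<>();
>   for (int i = 0; i < numSessions; i++) {
>     // Each session open blocks on YARN, so submit them all concurrently
>     // instead of looping serially.
>     pending.add(pool.submit(() -> { startSession(); return null; }));
>   }
>   for (Future<?> f : pending) {
>     f.get();  // propagate any startup failure
>   }
>   pool.shutdown();
> }
> {code}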



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12867) Semantic Exception Error Msg should be within the range of "10000 to 19999"

2016-01-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12867:
-
Attachment: HIVE-12867.2.patch

> Semantic Exception Error Msg should be within the range of "10000 to 19999"
> 
>
> Key: HIVE-12867
> URL: https://issues.apache.org/jira/browse/HIVE-12867
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Laljo John Pullokkaran
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12867.1.patch, HIVE-12867.2.patch
>
>
> In many places, errors encountered during semantic analysis are translated to 
> the generic error (GENERIC_ERROR, 40000) msg as opposed to a semantic error msg.
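> For example, the intended pattern is roughly the following 
> (ErrorMsg.INVALID_TABLE and tableName are just illustrative):
> {code}
> // Bad: collapses everything into the generic error code (GENERIC_ERROR, 40000).
> throw new SemanticException(e.getMessage());
> 
> // Better: pick a specific semantic-analysis ErrorMsg, whose codes fall in
> // the 10000..19999 range.
> throw new SemanticException(ErrorMsg.INVALID_TABLE.getMsg(tableName));
> {code}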



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12865) Exchange partition does not show inputs field for post/pre execute hooks

2016-01-19 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107564#comment-15107564
 ] 

Aihua Xu commented on HIVE-12865:
-

Attached the initial patch: added the source table and the partitions to be 
exchanged as the inputs.
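
A rough sketch of the shape of the change (variable names approximate, not the 
exact diff):
{code}
// In the ALTER TABLE ... EXCHANGE PARTITION path, register what is read so
// that the pre/post execute hooks see the source objects.
inputs.add(new ReadEntity(sourceTable));
for (Partition partition : partitionsToExchange) {
  inputs.add(new ReadEntity(partition));
}
{code}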

[~ctang.ma] and [~xuefuz] Can you help review the code? 

[~pauly] Could you also take a look if this is what you are expecting?

> Exchange partition does not show inputs field for post/pre execute hooks
> 
>
> Key: HIVE-12865
> URL: https://issues.apache.org/jira/browse/HIVE-12865
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0
>Reporter: Paul Yang
>Assignee: Aihua Xu
> Attachments: HIVE-12865.patch
>
>
> The pre/post execute hook interface has fields that indicate which Hive 
> objects were read / written to as a result of running the query. For the 
> exchange partition operation, the read entity field is empty.
> This is an important issue as the hook interface may be configured to perform 
> critical warehouse operations.
> See
> ql/src/test/results/clientpositive/exchange_partition3.q.out
> {code}
> --- a/ql/src/test/results/clientpositive/exchange_partition3.q.out
> +++ b/ql/src/test/results/clientpositive/exchange_partition3.q.out
> @@ -65,9 +65,17 @@ ds=2013-04-05/hr=2
>  PREHOOK: query: -- This will exchange both partitions hr=1 and hr=2
>  ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH 
> TABLE exchange_part_test2
>  PREHOOK: type: ALTERTABLE_EXCHANGEPARTITION
> +PREHOOK: Output: default@exchange_part_test1
> +PREHOOK: Output: default@exchange_part_test2
>  POSTHOOK: query: -- This will exchange both partitions hr=1 and hr=2
>  ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH 
> TABLE exchange_part_test2
>  POSTHOOK: type: ALTERTABLE_EXCHANGEPARTITION
> +POSTHOOK: Output: default@exchange_part_test1
> +POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=1
> +POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=2
> +POSTHOOK: Output: default@exchange_part_test2
> +POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=1
> +POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=2
>  PREHOOK: query: SHOW PARTITIONS exchange_part_test1
>  PREHOOK: type: SHOWPARTITIONS
>  PREHOOK: Input: default@exchange_part_test1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11785) Support escaping carriage return and new line for LazySimpleSerDe

2016-01-19 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11785:

Release Note: Together with HIVE-12820, this change adds support for carriage 
return and newline characters in fields. Before this change, the user needed 
to preprocess the text by replacing these characters with something else in 
order for the files to be processed properly. With this change, they are 
escaped automatically if the {{serialization.escape.crlf}} serde property is 
set to true. One incompatible change: the characters '\r' and '\n' cannot be 
used as separator or field delimiter.   (was: Together with HIVE-12820, this 
change adds support for carriage return and newline characters in fields. 
Before this change, the user needed to preprocess the text by replacing these 
characters with something else in order for the files to be processed 
properly. With this change, they are escaped automatically if the 
{{serialization.escape.crlf}} serde property is set to true. One incompatible 
change: the characters '\r' and '\n' cannot be used as separator or field 
delimiter)

> Support escaping carriage return and new line for LazySimpleSerDe
> -
>
> Key: HIVE-11785
> URL: https://issues.apache.org/jira/browse/HIVE-11785
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11785.2.patch, HIVE-11785.3.patch, 
> HIVE-11785.patch, test.parquet
>
>
> Create the table and perform the queries as follows. You will see different 
> results when the setting changes. 
> The expected result should be:
> {noformat}
> 1 newline
> here
> 2 carriage return
> 3 both
> here
> {noformat}
> {noformat}
> hive> create table repo (lvalue int, charstring string) stored as parquet;
> OK
> Time taken: 0.34 seconds
> hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo;
> Loading data to table default.repo
> chgrp: changing ownership of 
> 'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not 
> belong to hive
> Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, 
> rawDataSize=0]
> OK
> Time taken: 0.732 seconds
> hive> set hive.fetch.task.conversion=more;
> hive> select * from repo;
> OK
> 1 newline
> here
> here  carriage return
> 3 both
> here
> Time taken: 0.253 seconds, Fetched: 3 row(s)
> hive> set hive.fetch.task.conversion=none;
> hive> select * from repo;
> Query ID = root_20150909113535_e081db8b-ccd9-4c44-aad9-d990ffb8edf3
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1441752031022_0006, Tracking URL = 
> http://host-10-17-81-63.coe.cloudera.com:8088/proxy/application_1441752031022_0006/
> Kill Command = 
> /opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop/bin/hadoop job  
> -kill job_1441752031022_0006
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2015-09-09 11:35:54,127 Stage-1 map = 0%,  reduce = 0%
> 2015-09-09 11:36:04,664 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.98 
> sec
> MapReduce Total cumulative CPU time: 2 seconds 980 msec
> Ended Job = job_1441752031022_0006
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1   Cumulative CPU: 2.98 sec   HDFS Read: 4251 HDFS 
> Write: 51 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 980 msec
> OK
> 1 newline
> NULL  NULL
> 2 carriage return
> NULL  NULL
> 3 both
> NULL  NULL
> Time taken: 25.131 seconds, Fetched: 6 row(s)
> hive>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-12629) hive.auto.convert.join=true makes lateral view join sql failed on spark engine on yarn

2016-01-19 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun reassigned HIVE-12629:
---

Assignee: Chao Sun  (was: Xuefu Zhang)

> hive.auto.convert.join=true makes lateral view join sql failed on spark 
> engine on yarn
> --
>
> Key: HIVE-12629
> URL: https://issues.apache.org/jira/browse/HIVE-12629
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1
>Reporter: 吴子美
>Assignee: Chao Sun
>
> I am using Hive 1.2 on Spark on YARN. 
> I found that 
> select count(1) from 
> (select  user_id from xxx group by user_id ) a join
> (select  user_id from yyy lateral view json_tuple(u, 'h') v1 as h) b
> on a.user_id=b.user_id ;
> failed in Hive on Spark on YARN, but was OK in Hive on MR.
> I tried the following sql on Spark, and it was OK:
> select count(1) from 
> (select  user_id from xxx group by user_id ) a left join
> (select  user_id from yyy lateral view json_tuple(u, 'h') v1 as h) b
> on a.user_id=b.user_id ;
> When I turn hive.auto.convert.join from true to false, everything goes OK.
> The error message in hive.log was:
> {code}
> 2015-12-09 21:10:17,190 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) - 15/12/09 21:10:17 INFO log.PerfLogger: 
> <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
> 2015-12-09 21:10:17,190 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) - 15/12/09 21:10:17 INFO exec.Utilities: 
> Serializing ReduceWork via kryo
> 2015-12-09 21:10:17,214 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) - 15/12/09 21:10:17 INFO log.PerfLogger: 
> </PERFLOG method=serializePlan duration=24 from=org.apache.hadoop.hive.ql.exec.Utilities>
> 2015-12-09 21:10:17,261 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) - 15/12/09 21:10:17 INFO client.RemoteDriver: 
> Failed to run job 8fed1ca8-834f-497f-b189-eab343440a9f
> 2015-12-09 21:10:17,261 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) - java.lang.IllegalStateException: Connection 
> already exists
> 2015-12-09 21:10:17,261 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) -  at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlan.connect(SparkPlan.java:142)
> 2015-12-09 21:10:17,261 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) -  at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:142)
> 2015-12-09 21:10:17,261 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) -  at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:106)
> 2015-12-09 21:10:17,261 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) -  at 
> org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:252)
> 2015-12-09 21:10:17,261 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) -  at 
> org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366)
> 2015-12-09 21:10:17,261 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) -  at 
> org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335)
> 2015-12-09 21:10:17,261 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) -  at 
> java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 2015-12-09 21:10:17,262 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) -  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 2015-12-09 21:10:17,262 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) -  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 2015-12-09 21:10:17,262 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(569)) -  at 
> java.lang.Thread.run(Thread.java:745)
> 2015-12-09 21:10:17,266 INFO  [RPC-Handler-3]: client.SparkClientImpl 
> (SparkClientImpl.java:handle(522)) - Received result for 
> 8fed1ca8-834f-497f-b189-eab343440a9f
> 2015-12-09 21:10:18,054 ERROR [HiveServer2-Background-Pool: Thread-43]: 
> status.SparkJobMonitor (SessionState.java:printError(960)) - Status: Failed
> 2015-12-09 21:10:18,055 INFO  [HiveServer2-Background-Pool: Thread-43]: 
> log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=SparkRunJob start=144915051 end=144918055 duration=3004 
> from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>
> 2015-12-09 21:10:18,076 ERROR [HiveServer2-Background-Pool: Thread-43]: 
> ql.Driver (SessionState.java:printError(960)) - 
> {code}

[jira] [Commented] (HIVE-12887) Handle ORC schema on read with fewer columns than file schema (after Schema Evolution changes)

2016-01-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107177#comment-15107177
 ] 

Sergey Shelukhin commented on HIVE-12887:
-

What will happen after column removal with this patch? Is a test needed?
Also, nit: please wrap the LOG.info calls that log types in if (LOG.isInfoEnabled()).
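That is, something like the following (readerTypes/fileTypes are stand-ins for 
whatever is being logged):
{code}
// Skip building the type strings entirely when INFO logging is disabled.
if (LOG.isInfoEnabled()) {
  LOG.info("Reader schema types: " + readerTypes + ", file types: " + fileTypes);
}
{code}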


> Handle ORC schema on read with fewer columns than file schema (after Schema 
> Evolution changes)
> --
>
> Key: HIVE-12887
> URL: https://issues.apache.org/jira/browse/HIVE-12887
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12887.01.patch
>
>
> Exception caused by reading after column removal.
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 10, Size: 10
>   at java.util.ArrayList.rangeCheck(ArrayList.java:653)
>   at java.util.ArrayList.get(ArrayList.java:429)
>   at java.util.Collections$UnmodifiableList.get(Collections.java:1309)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240)
>   at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.<init>(TreeReaderFactory.java:2053)
>   at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2481)
>   at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:216)
>   at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:598)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:179)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.<init>(OrcRawRecordMerger.java:222)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:442)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1285)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1165)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12682) Reducers in dynamic partitioning job spend a lot of time running hadoop.conf.Configuration.getOverlay

2016-01-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107246#comment-15107246
 ] 

Ashutosh Chauhan commented on HIVE-12682:
-

+1

> Reducers in dynamic partitioning job spend a lot of time running 
> hadoop.conf.Configuration.getOverlay
> -
>
> Key: HIVE-12682
> URL: https://issues.apache.org/jira/browse/HIVE-12682
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1
>Reporter: Carter Shanklin
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12682.1.patch, HIVE-12682.2.patch, reducer.png
>
>
> I tested this on Hive 1.2.1 but looks like it's still applicable to 2.0.
> I ran this query:
> {code}
> create table flights (
> …
> )
> PARTITIONED BY (Year int)
> CLUSTERED BY (Month)
> SORTED BY (DayofMonth) into 12 buckets
> STORED AS ORC
> TBLPROPERTIES("orc.bloom.filter.columns"="*")
> ;
> {code}
> (Taken from here: 
> https://github.com/t3rmin4t0r/all-airlines-data/blob/master/ddl/orc.sql)
> I profiled just the reduce phase and noticed something odd, the attached 
> graph shows where time was spent during the reducer phase.
> !reducer.png!
> Problem seems to relate to 
> https://github.com/apache/hive/blob/branch-2.0/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java#L903
> /cc [~gopalv]
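> One common shape of fix for this class of problem, as a hedged sketch (not 
> necessarily the committed patch): cache the configuration lookup once instead 
> of hitting the Configuration overlay per row.
> {code}
> // Configuration.get()/set() walk the properties overlay on every call, which
> // is costly when invoked once per row in the reducer's hot path.
> if (taskId == null) {
>   taskId = Utilities.getTaskId(jc);  // read the Configuration only once
> }
> // ... the per-row dynamic-partition path then uses the cached taskId ...
> {code}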



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12837) Better memory estimation/allocation for hybrid grace hash join during hash table loading

2016-01-19 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107145#comment-15107145
 ] 

Wei Zheng commented on HIVE-12837:
--

Both TestDbTxnManager2 and tez_union.q passed locally without any problem.

> Better memory estimation/allocation for hybrid grace hash join during hash 
> table loading
> 
>
> Key: HIVE-12837
> URL: https://issues.apache.org/jira/browse/HIVE-12837
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12837.1.patch, HIVE-12837.2.patch, 
> HIVE-12837.3.patch
>
>
> This is to avoid an edge case when the memory available is very little (less 
> than a single write buffer size), and we start loading the hash table. Since 
> the write buffer is lazily allocated, we will easily run out of memory before 
> even checking if we should spill any hash partition.
> e.g.
> Total memory available: 210 MB
> Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
> Size of write buffer: 8 MB (lazy allocation)
> Number of hash partitions: 16
> Number of hash partitions created in memory: 13
> Number of hash partitions created on disk: 3
> Available memory left after HybridHashTableContainer initialization: 
> 210-16*13=2MB
> Now let's say a row is to be loaded into an in-memory hash partition: it will 
> try to allocate an 8MB write buffer, but only 2MB are left, thus OOM.
> The solution is to perform the spill check earlier, so we can spill partitions 
> when memory is about to be full and avoid the OOM.
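> The arithmetic above, spelled out (numbers taken from this description):
> {code}
> long total = 210L << 20;       // 210 MB available in total
> long refArray = 16L << 20;     // ref array per in-memory hash partition
> long writeBuffer = 8L << 20;   // write buffer, allocated lazily on first insert
> int inMemPartitions = 13;
> long left = total - inMemPartitions * refArray;  // 210 - 208 = 2 MB left
> // The 8 MB allocation only happens when the first row arrives, so the spill
> // check must run before loading, not after:
> if (left < writeBuffer) {
>   // spill a partition now rather than OOM on the lazy allocation
> }
> {code}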



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11785) Support escaping carriage return and new line for LazySimpleSerDe

2016-01-19 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-11785:

Release Note: Together with HIVE-12820, this change adds support for carriage 
return and newline characters in fields. Before this change, the user needed 
to preprocess the text by replacing these characters with something else in 
order for the files to be processed properly. With this change, they are 
escaped automatically if the {{serialization.escape.crlf}} serde property is 
set to true. One incompatible change: the characters '\r' and '\n' cannot be 
used as separator or field delimiter   (was: This change disallows carriage 
return and newline characters from being used as field separators or escape 
characters. Before this change they were allowed, but such cases could easily 
lead to incorrect results if the content also contained carriage returns or 
newlines: even when they were escaped, the line-based input format MapReduce 
uses in Hive would still break lines on carriage return and newline, leading 
to incorrect results.)

> Support escaping carriage return and new line for LazySimpleSerDe
> -
>
> Key: HIVE-11785
> URL: https://issues.apache.org/jira/browse/HIVE-11785
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11785.2.patch, HIVE-11785.3.patch, 
> HIVE-11785.patch, test.parquet
>
>
> Create the table and perform the queries as follows. You will see different 
> results when the setting changes. 
> The expected result should be:
> {noformat}
> 1 newline
> here
> 2 carriage return
> 3 both
> here
> {noformat}
> {noformat}
> hive> create table repo (lvalue int, charstring string) stored as parquet;
> OK
> Time taken: 0.34 seconds
> hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo;
> Loading data to table default.repo
> chgrp: changing ownership of 
> 'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not 
> belong to hive
> Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, 
> rawDataSize=0]
> OK
> Time taken: 0.732 seconds
> hive> set hive.fetch.task.conversion=more;
> hive> select * from repo;
> OK
> 1 newline
> here
> here  carriage return
> 3 both
> here
> Time taken: 0.253 seconds, Fetched: 3 row(s)
> hive> set hive.fetch.task.conversion=none;
> hive> select * from repo;
> Query ID = root_20150909113535_e081db8b-ccd9-4c44-aad9-d990ffb8edf3
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1441752031022_0006, Tracking URL = 
> http://host-10-17-81-63.coe.cloudera.com:8088/proxy/application_1441752031022_0006/
> Kill Command = 
> /opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop/bin/hadoop job  
> -kill job_1441752031022_0006
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2015-09-09 11:35:54,127 Stage-1 map = 0%,  reduce = 0%
> 2015-09-09 11:36:04,664 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.98 
> sec
> MapReduce Total cumulative CPU time: 2 seconds 980 msec
> Ended Job = job_1441752031022_0006
> MapReduce Jobs Launched:
> Stage-Stage-1: Map: 1   Cumulative CPU: 2.98 sec   HDFS Read: 4251 HDFS 
> Write: 51 SUCCESS
> Total MapReduce CPU Time Spent: 2 seconds 980 msec
> OK
> 1 newline
> NULL  NULL
> 2 carriage return
> NULL  NULL
> 3 both
> NULL  NULL
> Time taken: 25.131 seconds, Fetched: 6 row(s)
> hive>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12883) Support basic stats and column stats in table properties in HBaseStore

2016-01-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107209#comment-15107209
 ] 

Ashutosh Chauhan commented on HIVE-12883:
-

+1

> Support basic stats and column stats in table properties in HBaseStore
> --
>
> Key: HIVE-12883
> URL: https://issues.apache.org/jira/browse/HIVE-12883
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12883.01.patch, HIVE-12883.02.patch, 
> HIVE-12883.03.patch
>
>
> Need to add support for HBase store too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12820) Remove the check if carriage return and new line are used for separator or escape character

2016-01-19 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12820:

Fix Version/s: 2.1.0

> Remove the check if carriage return and new line are used for separator or 
> escape character
> ---
>
> Key: HIVE-12820
> URL: https://issues.apache.org/jira/browse/HIVE-12820
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.1.0
>
> Attachments: HIVE-12820.2.patch, HIVE-12820.patch
>
>
> The change in HIVE-11785 doesn't allow \r or \n to be used as separator or 
> escape character, which may break some existing tables that use \r as a 
> separator or escape character, for example.
> This case actually can be supported regardless of whether 
> SERIALIZATION_ESCAPE_CRLF is set or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12798) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver.vector* queries failures due to NPE in Vectorizer.onExpressionHasNullSafes()

2016-01-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107648#comment-15107648
 ] 

Sergey Shelukhin commented on HIVE-12798:
-

+1

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver.vector* queries failures due to NPE in 
> Vectorizer.onExpressionHasNullSafes()
> ---
>
> Key: HIVE-12798
> URL: https://issues.apache.org/jira/browse/HIVE-12798
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Fix For: 2.1.0
>
> Attachments: HIVE-12798.1.patch
>
>
> As of 01/04/2016, the following tests fail in the MiniTezCliDriver mode when 
> the cbo return path is enabled. We need to fix them :
> {code}
>  vector_leftsemi_mapjoin
>  vector_join_filters
>  vector_interval_mapjoin
>  vector_left_outer_join
>  vectorized_mapjoin
>  vector_inner_join
>  vectorized_context
>  tez_vector_dynpart_hashjoin_1
>  count
>  auto_sortmerge_join_6
>  skewjoin
>  vector_auto_smb_mapjoin_14
>  auto_join_filters
>  vector_outer_join0
>  vector_outer_join1
>  vector_outer_join2
>  vector_outer_join3
>  vector_outer_join4
>  vector_outer_join5
>  hybridgrace_hashjoin_1
>  vector_mapjoin_reduce
>  vectorized_nested_mapjoin
>  vector_left_outer_join2
>  vector_char_mapjoin1
>  vector_decimal_mapjoin
>  vectorized_dynamic_partition_pruning
>  vector_varchar_mapjoin1
> {code}
> This jira is intended to cover the vectorization issues related to the 
> MiniTezCliDriver failures caused by an NPE via the nullSafes array, as shown 
> below:
> {code}
> private boolean onExpressionHasNullSafes(MapJoinDesc desc) {
>  boolean[] nullSafes = desc.getNullSafes();
>  for (boolean nullSafe : nullSafes) {
> {code}
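> A minimal null guard would look like this (a sketch; the committed fix may 
> differ):
> {code}
> private boolean onExpressionHasNullSafes(MapJoinDesc desc) {
>   boolean[] nullSafes = desc.getNullSafes();
>   if (nullSafes == null) {
>     return false;  // the return path can leave the null-safes unset
>   }
>   for (boolean nullSafe : nullSafes) {
>     if (nullSafe) {
>       return true;
>     }
>   }
>   return false;
> }
> {code}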



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location

2016-01-19 Thread Reuben Kuhnert (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuben Kuhnert updated HIVE-12891:
--
Attachment: HIVE-12891.01.19.2016.01.patch

> Hive fails when java.io.tmpdir is set to a relative location
> 
>
> Key: HIVE-12891
> URL: https://issues.apache.org/jira/browse/HIVE-12891
> Project: Hive
>  Issue Type: Bug
>Reporter: Reuben Kuhnert
>Assignee: Reuben Kuhnert
> Attachments: HIVE-12891.01.19.2016.01.patch
>
>
> The function {{SessionState.createSessionDirs}} fails when trying to create 
> directories where {{java.io.tmpdir}} is set to a relative location.
> {code}
> \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: 
> IllegalArgumentException java.net.URISyntaxException: Relative path in 
> absolute URI: 
> file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1
> ...
> Minor variations:
> \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException 
> Exception while processing Exception while writing out the local file 
> o.a.h.hive.ql/parse.SemanticException: Exception while processing exception 
> while writing out local file 
> ... 
> caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: 
> Relative path in absolute URI: 
> file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 
> at o.a.h.fs.Path.initialize (206) 
> at o.a.h.fs.Path.<init>(197)... 
> at o.a.h.hive.ql.context.getScratchDir(267) 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location

2016-01-19 Thread Reuben Kuhnert (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107807#comment-15107807
 ] 

Reuben Kuhnert commented on HIVE-12891:
---

Patch Fix: Ensure that paths are expanded to absolute locations.
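
For example (a sketch of the idea, not the exact patch; sessionId is a 
placeholder):
{code}
// A relative java.io.tmpdir such as "./tmp" later fails Path/URI construction
// with "Relative path in absolute URI", so expand it up front.
String tmpDir = System.getProperty("java.io.tmpdir");
String absoluteTmpDir = new File(tmpDir).getAbsolutePath();
Path scratchDir = new Path(absoluteTmpDir, sessionId);
{code}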

> Hive fails when java.io.tmpdir is set to a relative location
> 
>
> Key: HIVE-12891
> URL: https://issues.apache.org/jira/browse/HIVE-12891
> Project: Hive
>  Issue Type: Bug
>Reporter: Reuben Kuhnert
>Assignee: Reuben Kuhnert
> Attachments: HIVE-12891.01.19.2016.01.patch
>
>
> The function {{SessionState.createSessionDirs}} fails when trying to create 
> directories where {{java.io.tmpdir}} is set to a relative location.
> {code}
> \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: 
> IllegalArgumentException java.net.URISyntaxException: Relative path in 
> absolute URI: 
> file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1
> ...
> Minor variations:
> \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException 
> Exception while processing Exception while writing out the local file 
> o.a.h.hive.ql/parse.SemanticException: Exception while processing exception 
> while writing out local file 
> ... 
> caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: 
> Relative path in absolute URI: 
> file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 
> at o.a.h.fs.Path.initialize (206) 
> at o.a.h.fs.Path.<init>(197)... 
> at o.a.h.hive.ql.context.getScratchDir(267) 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12783) fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl

2016-01-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107877#comment-15107877
 ] 

Lefty Leverenz commented on HIVE-12783:
---

Nudging [~owen.omalley]:  this needs a status update.  Thanks.

> fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl
> -
>
> Key: HIVE-12783
> URL: https://issues.apache.org/jira/browse/HIVE-12783
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.0.0
>Reporter: Pengcheng Xiong
>Assignee: Owen O'Malley
>Priority: Blocker
> Attachments: HIVE-12783.patch, HIVE-12783.patch, HIVE-12783.patch
>
>
> This includes
> {code}
> org.apache.hive.spark.client.TestSparkClient.testSyncRpc
> org.apache.hive.spark.client.TestSparkClient.testJobSubmission
> org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
> org.apache.hive.spark.client.TestSparkClient.testCounters
> org.apache.hive.spark.client.TestSparkClient.testRemoteClient
> org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
> org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
> org.apache.hive.spark.client.TestSparkClient.testErrorJob
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
> org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
> {code}
> all of them passed on my laptop. cc'ing [~szehon], [~xuefuz], could you 
> please take a look? Shall we ignore them? Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12798) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver.vector* queries failures due to NPE in Vectorizer.onExpressionHasNullSafes()

2016-01-19 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12798:
-
Fix Version/s: 2.0.0

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver.vector* queries failures due to NPE in 
> Vectorizer.onExpressionHasNullSafes()
> ---
>
> Key: HIVE-12798
> URL: https://issues.apache.org/jira/browse/HIVE-12798
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Fix For: 2.0.0, 2.1.0
>
> Attachments: HIVE-12798.1.patch
>
>
> As of 01/04/2016, the following tests fail in the MiniTezCliDriver mode when 
> the cbo return path is enabled. We need to fix them :
> {code}
>  vector_leftsemi_mapjoin
>  vector_join_filters
>  vector_interval_mapjoin
>  vector_left_outer_join
>  vectorized_mapjoin
>  vector_inner_join
>  vectorized_context
>  tez_vector_dynpart_hashjoin_1
>  count
>  auto_sortmerge_join_6
>  skewjoin
>  vector_auto_smb_mapjoin_14
>  auto_join_filters
>  vector_outer_join0
>  vector_outer_join1
>  vector_outer_join2
>  vector_outer_join3
>  vector_outer_join4
>  vector_outer_join5
>  hybridgrace_hashjoin_1
>  vector_mapjoin_reduce
>  vectorized_nested_mapjoin
>  vector_left_outer_join2
>  vector_char_mapjoin1
>  vector_decimal_mapjoin
>  vectorized_dynamic_partition_pruning
>  vector_varchar_mapjoin1
> {code}
> This jira is intended to cover the vectorization issues related to the 
> MiniTezCliDriver failures caused by an NPE via the nullSafes array, as shown 
> below:
> {code}
> private boolean onExpressionHasNullSafes(MapJoinDesc desc) {
>  boolean[] nullSafes = desc.getNullSafes();
>  for (boolean nullSafe : nullSafes) {
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-8680) Set Max Message for Binary Thrift endpoints

2016-01-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107768#comment-15107768
 ] 

Lefty Leverenz edited comment on HIVE-8680 at 1/20/16 12:57 AM:


Doc note: the wiki documentation has been added:
* [Configuration Properties - hive.metastore.server.max.message.size | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.server.max.message.size]
* [Configuration Properties - hive.server2.thrift.max.message.size | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.thrift.max.message.size]

Removed the TODOC15.


was (Author: sladymon):
Doc note: the wiki documentation has been added:
* [Configuration Properties - hive.metastore.server.max.message.size | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.server.max.message.size]
* [Configuration Properties - hive.server2.thrift.max.message.size | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.thrift.max.message.size]
Removed the TODOC15.

> Set Max Message for Binary Thrift endpoints
> ---
>
> Key: HIVE-8680
> URL: https://issues.apache.org/jira/browse/HIVE-8680
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 1.1.0, 1.0.2
>
> Attachments: HIVE-8680.patch, HIVE-8680.patch
>
>
> Thrift has a configuration option to restrict incoming message size. If we 
> configure this we'll stop OOM'ing when someone sends us an HTTP request.
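> A sketch of the knob (assuming Thrift 0.9.x's four-argument 
> TBinaryProtocol.Factory constructor; the size here is just an example):
> {code}
> long maxMessageSize = 100L * 1024 * 1024;  // e.g. a 100 MB cap
> // Caps string/container lengths so a stray (e.g. HTTP) request can't make the
> // binary endpoint allocate a huge buffer and OOM.
> TProtocolFactory protoFactory =
>     new TBinaryProtocol.Factory(true, true, maxMessageSize, maxMessageSize);
> {code}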



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12244) Refactoring code for avoiding of comparison of Strings and do comparison on Path

2016-01-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107798#comment-15107798
 ] 

Hive QA commented on HIVE-12244:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12783127/HIVE-12244.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6676/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6676/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6676/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/src/main/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ 
udf-classloader-udf2 ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
udf-classloader-udf2 ---
[INFO] Compiling 1 source file to 
/data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/classes
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
udf-classloader-udf2 ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/src/test/resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ udf-classloader-udf2 
---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/tmp/conf
 [copy] Copying 16 files to 
/data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
udf-classloader-udf2 ---
[INFO] No sources to compile
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ 
udf-classloader-udf2 ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ udf-classloader-udf2 ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/udf-classloader-udf2-2.1.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ 
udf-classloader-udf2 ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ 
udf-classloader-udf2 ---
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/udf-classloader-udf2-2.1.0-SNAPSHOT.jar
 to 
/data/hive-ptest/working/maven/org/apache/hive/hive-it-custom-udfs/udf-classloader-udf2/2.1.0-SNAPSHOT/udf-classloader-udf2-2.1.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/pom.xml
 to 
/data/hive-ptest/working/maven/org/apache/hive/hive-it-custom-udfs/udf-classloader-udf2/2.1.0-SNAPSHOT/udf-classloader-udf2-2.1.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive Integration - HCatalog Unit Tests 2.1.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-hcatalog-it-unit 
---
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/itests/hcatalog-unit/target
[INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/itests/hcatalog-unit 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ 
hive-hcatalog-it-unit ---
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (download-spark) @ hive-hcatalog-it-unit 
---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ 
hive-hcatalog-it-unit ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ 
hive-hcatalog-it-unit ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing 
{noformat}

[jira] [Updated] (HIVE-12220) LLAP: Usability issues with hive.llap.io.cache.orc.size

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12220:

Attachment: HIVE-12220.03.patch

The same patch... HiveQA didn't run for whatever reason

> LLAP: Usability issues with hive.llap.io.cache.orc.size
> ---
>
> Key: HIVE-12220
> URL: https://issues.apache.org/jira/browse/HIVE-12220
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Carter Shanklin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12220.01.patch, HIVE-12220.02.patch, 
> HIVE-12220.03.patch, HIVE-12220.patch, HIVE-12220.tmp.patch
>
>
> In the llap-daemon site you need to set, among other things,
> llap.daemon.memory.per.instance.mb
> and
> hive.llap.io.cache.orc.size
> The use of hive.llap.io.cache.orc.size caused me some unnecessary problems: 
> initially I entered the value in MB rather than in bytes. Operator error, you 
> could say, but I look at this as a fraction of the other value, which is in MB.
> Second, is this really tied to ORC? E.g. when we have the vectorized text 
> reader will this data be cached as well? Or might it be in the future?
> I would like to propose instead using hive.llap.io.cache.size.mb for this 
> setting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases

2016-01-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107678#comment-15107678
 ] 

Hive QA commented on HIVE-11097:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12783035/HIVE-11097.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10011 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_symlink_text_input_format
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6674/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6674/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6674/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12783035 - PreCommit-HIVE-TRUNK-Build

> HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases
> -
>
> Key: HIVE-11097
> URL: https://issues.apache.org/jira/browse/HIVE-11097
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats
>Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0
> Environment: Hive 0.13.1, Hive 2.0.0, hadoop 2.4.1
>Reporter: Wan Chang
>Assignee: Wan Chang
>Priority: Critical
> Attachments: HIVE-11097.1.patch, HIVE-11097.2.patch, 
> HIVE-11097.3.patch
>
>
> Say we have a sql as
> {code}
> create table if not exists test_orc_src (a int, b int, c int) stored as orc;
> create table if not exists test_orc_src2 (a int, b int, d int) stored as orc;
> insert overwrite table test_orc_src select 1,2,3 from src limit 1;
> insert overwrite table test_orc_src2 select 1,2,4 from src limit 1;
> set hive.auto.convert.join = false;
> set hive.execution.engine=mr;
> select
>   tb.c
> from test.test_orc_src tb
> join (select * from test.test_orc_src2) tm
> on tb.a = tm.a
> where tb.b = 2
> {code}
> The correct result is 3 but it produced no result.
> I find that in HiveInputFormat.pushProjectionsAndFilters
> {code}
> match = splitPath.startsWith(key) || splitPathWithNoSchema.startsWith(key);
> {code}
> It uses startsWith to match the split path against the alias paths, so tm will 
> match two aliases in this case.
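> A boundary-aware comparison avoids the false match (a sketch; the actual 
> patch may differ):
> {code}
> // key "/warehouse/test_orc_src" must not match split path
> // "/warehouse/test_orc_src2", so require an exact match or a '/' boundary.
> static boolean pathMatches(String splitPath, String key) {
>   return splitPath.equals(key) || splitPath.startsWith(key + "/");
> }
> {code}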



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9147) Add unit test for HIVE-7323

2016-01-19 Thread Peter Slawski (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Slawski updated HIVE-9147:

Attachment: HIVE-9147.2.patch

I rebased this patch on top of the latest master branch. Please see attachments.

> Add unit test for HIVE-7323
> ---
>
> Key: HIVE-9147
> URL: https://issues.apache.org/jira/browse/HIVE-9147
> Project: Hive
>  Issue Type: Test
>  Components: Statistics
>Affects Versions: 0.14.0, 0.13.1
>Reporter: Peter Slawski
>Priority: Minor
> Attachments: HIVE-9147.1.patch, HIVE-9147.2.patch
>
>
> This unit test verifies that DateStatisticImpl doesn't store mutable objects 
> from callers for minimum and maximum values. This ensures callers cannot 
> modify the internal minimum and maximum values outside of DateStatisticImpl.
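> In outline, such a test looks like this (accessor names are approximate):
> {code}
> Date d = new Date(1000L);
> dateStats.updateDate(d);      // the implementation must not keep this reference
> d.setTime(2000L);             // the caller mutates the object afterwards
> assertEquals(1000L, dateStats.getMinimum().getTime());  // must be unaffected
> {code}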



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12727) allow full table queries in strict mode

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12727:

Attachment: HIVE-12727.01.patch

The same patch. I cannot repro the test failures locally; I need to see the logs 
here.

> allow full table queries in strict mode
> ---
>
> Key: HIVE-12727
> URL: https://issues.apache.org/jira/browse/HIVE-12727
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Attachments: HIVE-12727.01.patch, HIVE-12727.patch
>
>
> Making strict mode the default recently appears to have broken many normal 
> queries, such as some TPCDS benchmark queries, e.g. Q85:
> Response message: org.apache.hive.service.cli.HiveSQLException: Error while 
> compiling statement: FAILED: SemanticException [Error 10041]: No partition 
> predicate found for Alias "web_sales" Table "web_returns"
> We should remove this restriction from strict mode, or change the default 
> back to non-strict. Perhaps make a 3-value parameter (nonstrict, semistrict, 
> strict) for backward compatibility with people who rely on strict already.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12893) Sorted dynamic partition does not work if subset of partition columns are constant folded

2016-01-19 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107894#comment-15107894
 ] 

Prasanth Jayachandran commented on HIVE-12893:
--

[~ashutoshc] Can you please review this patch?

> Sorted dynamic partition does not work if subset of partition columns are 
> constant folded
> -
>
> Key: HIVE-12893
> URL: https://issues.apache.org/jira/browse/HIVE-12893
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12893.1.patch
>
>
> If all partition columns are constant folded, then sorted dynamic partitioning 
> should not be used, as it is similar to static partitioning. But if only a 
> subset of partition columns is constant folded, sorted dynamic partition 
> optimization will be helpful. Currently, this optimization is disabled if 
> at least one partition column is constant folded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12893) Sorted dynamic partition does not work if subset of partition columns are constant folded

2016-01-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12893:
-
Attachment: HIVE-12893.1.patch

This patch moves the SortedDynamicPartition optimizer above the 
PartitionConditionRemover optimization. Removing the partition condition after 
constant folding makes it complicated to determine the partition columns, as the 
folded columns will be removed from the row schema. This patch also disables the 
BucketingSortingReduceSinkOptimizer if the SortedDynamicPartition optimizer inserts 
a new ReduceSink. 
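To illustrate the case the patch targets, a rough sketch with invented table and column names: after the WHERE clause folds ds to a constant, hr is still a dynamic partition column, so sorted dynamic partitioning should still apply.

{code}
set hive.optimize.sort.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
-- ds is constant folded by the WHERE clause; hr remains dynamic.
insert overwrite table target partition (ds, hr)
select key, value, ds, hr from source where ds = '2016-01-19';
{code}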

> Sorted dynamic partition does not work if subset of partition columns are 
> constant folded
> -
>
> Key: HIVE-12893
> URL: https://issues.apache.org/jira/browse/HIVE-12893
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12893.1.patch
>
>
> If all partition columns are constant folded, then sorted dynamic partitioning 
> should not be used, as it is similar to static partitioning. But if only a 
> subset of partition columns is constant folded, sorted dynamic partition 
> optimization will be helpful. Currently, this optimization is disabled if 
> at least one partition column is constant folded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12893) Sorted dynamic partition does not work if subset of partition columns are constant folded

2016-01-19 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12893:
-
Reporter: Yi Zhang  (was: Prasanth Jayachandran)

> Sorted dynamic partition does not work if subset of partition columns are 
> constant folded
> -
>
> Key: HIVE-12893
> URL: https://issues.apache.org/jira/browse/HIVE-12893
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Yi Zhang
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12893.1.patch
>
>
> If all partition columns are constant folded, then sorted dynamic partitioning 
> should not be used, as it is similar to static partitioning. But if only a 
> subset of partition columns is constant folded, sorted dynamic partition 
> optimization will be helpful. Currently, this optimization is disabled if 
> at least one partition column is constant folded.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8680) Set Max Message for Binary Thrift endpoints

2016-01-19 Thread Shannon Ladymon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107768#comment-15107768
 ] 

Shannon Ladymon commented on HIVE-8680:
---

Doc note: the wiki documentation has been added:
* [Configuration Properties - hive.metastore.server.max.message.size | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.server.max.message.size]
* [Configuration Properties - hive.server2.thrift.max.message.size | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.thrift.max.message.size]
Removed the TODOC15 label.
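For reference, a hedged example of how these two properties might be set in hive-site.xml; the 100 MB values are purely illustrative, not recommended defaults:

{code}
<property>
  <name>hive.metastore.server.max.message.size</name>
  <value>104857600</value> <!-- bytes; caps incoming Thrift messages to the metastore -->
</property>
<property>
  <name>hive.server2.thrift.max.message.size</name>
  <value>104857600</value> <!-- bytes; caps incoming Thrift messages to HiveServer2 -->
</property>
{code}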

> Set Max Message for Binary Thrift endpoints
> ---
>
> Key: HIVE-8680
> URL: https://issues.apache.org/jira/browse/HIVE-8680
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 1.1.0, 1.0.2
>
> Attachments: HIVE-8680.patch, HIVE-8680.patch
>
>
> Thrift has a configuration option to restrict incoming message size. If we 
> configure this, we'll stop OOM'ing when someone sends us an HTTP request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12856) LLAP: update (add/remove) the UDFs available in LLAP when they are changed; also refresh periodically

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12856:

Description: I don't think re-querying the functions is going to scale, and 
the sessions obviously cannot notify all LLAP clusters of every change. We 
should add global versioning to metastore functions to track changes, and then 
possibly add a notification mechanism, potentially thru ZK to avoid overloading 
the metastore itself.  (was: I don't think re-querying the functions is going 
to scale, and the sessions obviously cannot notify all LLAP clusters of every 
change. We should add versioning to metastore functions to track changes, and 
then possibly add a notification mechanism, potentially thru ZK to avoid 
overloading the metastore itself.)

> LLAP: update (add/remove) the UDFs available in LLAP when they are changed; 
> also refresh periodically
> -
>
> Key: HIVE-12856
> URL: https://issues.apache.org/jira/browse/HIVE-12856
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> I don't think re-querying the functions is going to scale, and the sessions 
> obviously cannot notify all LLAP clusters of every change. We should add 
> global versioning to metastore functions to track changes, and then possibly 
> add a notification mechanism, potentially thru ZK to avoid overloading the 
> metastore itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12892) Add global change versioning to permanent functions in metastore

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12892:

Summary: Add global change versioning to permanent functions in metastore  
(was: Add change versioning to permanent functions in metastore)

> Add global change versioning to permanent functions in metastore
> 
>
> Key: HIVE-12892
> URL: https://issues.apache.org/jira/browse/HIVE-12892
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-12892) Add change versioning to permanent functions in metastore

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-12892.
-
Resolution: Won't Fix

> Add change versioning to permanent functions in metastore
> -
>
> Key: HIVE-12892
> URL: https://issues.apache.org/jira/browse/HIVE-12892
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HIVE-12892) Add global change versioning to permanent functions in metastore

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reopened HIVE-12892:
-

> Add global change versioning to permanent functions in metastore
> 
>
> Key: HIVE-12892
> URL: https://issues.apache.org/jira/browse/HIVE-12892
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12865) Exchange partition does not show inputs field for post/pre execute hooks

2016-01-19 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107830#comment-15107830
 ] 

Paul Yang commented on HIVE-12865:
--

Looks great to me - thanks Aihua!
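As a reminder of how these entities surface to users, a minimal sketch of a post-execute hook that logs them, assuming the standard ExecuteWithHookContext interface (registration via hive.exec.post.hooks; error handling omitted):

{code}
import org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext;
import org.apache.hadoop.hive.ql.hooks.HookContext;
import org.apache.hadoop.hive.ql.hooks.ReadEntity;
import org.apache.hadoop.hive.ql.hooks.WriteEntity;

// Prints the read/write entities that this fix populates for exchange partition.
public class LogEntitiesHook implements ExecuteWithHookContext {
  @Override
  public void run(HookContext ctx) throws Exception {
    for (ReadEntity in : ctx.getInputs()) {
      System.out.println("input: " + in.getName());
    }
    for (WriteEntity out : ctx.getOutputs()) {
      System.out.println("output: " + out.getName());
    }
  }
}
{code}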

> Exchange partition does not show inputs field for post/pre execute hooks
> 
>
> Key: HIVE-12865
> URL: https://issues.apache.org/jira/browse/HIVE-12865
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.12.0
>Reporter: Paul Yang
>Assignee: Aihua Xu
> Attachments: HIVE-12865.patch
>
>
> The pre/post execute hook interface has fields that indicate which Hive 
> objects were read / written to as a result of running the query. For the 
> exchange partition operation, the read entity field is empty.
> This is an important issue as the hook interface may be configured to perform 
> critical warehouse operations.
> See
> ql/src/test/results/clientpositive/exchange_partition3.q.out
> {code}
> --- a/ql/src/test/results/clientpositive/exchange_partition3.q.out
> +++ b/ql/src/test/results/clientpositive/exchange_partition3.q.out
> @@ -65,9 +65,17 @@ ds=2013-04-05/hr=2
>  PREHOOK: query: -- This will exchange both partitions hr=1 and hr=2
>  ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH 
> TABLE exchange_part_test2
>  PREHOOK: type: ALTERTABLE_EXCHANGEPARTITION
> +PREHOOK: Output: default@exchange_part_test1
> +PREHOOK: Output: default@exchange_part_test2
>  POSTHOOK: query: -- This will exchange both partitions hr=1 and hr=2
>  ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH 
> TABLE exchange_part_test2
>  POSTHOOK: type: ALTERTABLE_EXCHANGEPARTITION
> +POSTHOOK: Output: default@exchange_part_test1
> +POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=1
> +POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=2
> +POSTHOOK: Output: default@exchange_part_test2
> +POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=1
> +POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=2
>  PREHOOK: query: SHOW PARTITIONS exchange_part_test1
>  PREHOOK: type: SHOWPARTITIONS
>  PREHOOK: Input: default@exchange_part_test1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8680) Set Max Message for Binary Thrift endpoints

2016-01-19 Thread Shannon Ladymon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shannon Ladymon updated HIVE-8680:
--
Labels:   (was: TODOC15)

> Set Max Message for Binary Thrift endpoints
> ---
>
> Key: HIVE-8680
> URL: https://issues.apache.org/jira/browse/HIVE-8680
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Brock Noland
> Fix For: 1.1.0, 1.0.2
>
> Attachments: HIVE-8680.patch, HIVE-8680.patch
>
>
> Thrift has a configuration option to restrict incoming message size. If we 
> configure this, we'll stop OOM'ing when someone sends us an HTTP request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12856) LLAP: update (add/remove) the UDFs available in LLAP when they are changed; also refresh periodically

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12856:

Description: I don't think re-querying the functions is going to scale, and 
the sessions obviously cannot notify all LLAP clusters of every change. We 
should add versioning to metastore functions to track changes, and then 
possibly add a notification mechanism, potentially thru ZK to avoid overloading 
the metastore itself.

> LLAP: update (add/remove) the UDFs available in LLAP when they are changed; 
> also refresh periodically
> -
>
> Key: HIVE-12856
> URL: https://issues.apache.org/jira/browse/HIVE-12856
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> I don't think re-querying the functions is going to scale, and the sessions 
> obviously cannot notify all LLAP clusters of every change. We should add 
> versioning to metastore functions to track changes, and then possibly add a 
> notification mechanism, potentially thru ZK to avoid overloading the 
> metastore itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12855) LLAP: add checks when resolving UDFs to enforce whitelist

2016-01-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108067#comment-15108067
 ] 

Hive QA commented on HIVE-12855:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12783176/HIVE-12855.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10023 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6679/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6679/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6679/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12783176 - PreCommit-HIVE-TRUNK-Build

> LLAP: add checks when resolving UDFs to enforce whitelist
> -
>
> Key: HIVE-12855
> URL: https://issues.apache.org/jira/browse/HIVE-12855
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12855.01.patch, HIVE-12855.part.patch
>
>
> Currently, adding a temporary UDF and calling LLAP with it (bypassing the 
> LlapDecider check, I did it by just modifying the source) only fails because 
> the class could not be found. If the UDF was accessible to LLAP, it would 
> execute. Inside the daemon, UDF instantiation should fail for custom UDFs 
> (and only succeed for whitelisted custom UDFs, once that is implemented).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12889) Support COUNT(DISTINCT) for partitioning query.

2016-01-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107975#comment-15107975
 ] 

Hive QA commented on HIVE-12889:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12783137/HIVE-12889.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 10010 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vectorized_parquet.q-orc_merge6.q-vector_outer_join0.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join29
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_14
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_result_complex
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_joins_explain
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_decimal
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_aggregate
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_reduce2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_0
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_17
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6677/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6677/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6677/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12783137 - PreCommit-HIVE-TRUNK-Build

> Support COUNT(DISTINCT) for partitioning query.
> ---
>
> Key: HIVE-12889
> URL: https://issues.apache.org/jira/browse/HIVE-12889
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12889.patch
>
>
> We need to support avg(distinct), count(distinct), sum(distinct) for the 
> parent jira HIVE-9534. Separate the work for count(distinct) in this subtask.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.

2016-01-19 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-12353:
--
Attachment: HIVE-12353.4.patch

> When Compactor fails it calls CompactionTxnHandler.markedCleaned().  it 
> should not.
> ---
>
> Key: HIVE-12353
> URL: https://issues.apache.org/jira/browse/HIVE-12353
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, 
> HIVE-12353.4.patch, HIVE-12353.patch
>
>
> One of the things that this method does is delete entries from TXN_COMPONENTS 
> for the partition that it was trying to compact.
> This causes Aborted transactions in TXNS to become empty according to
> CompactionTxnHandler.cleanEmptyAbortedTxns(), which means they can now be 
> deleted.  
> Once they are deleted, data that belongs to these txns is deemed committed...
> We should extend the COMPACTION_QUEUE state with 'f' and 's' (failed, success) 
> states.  We should also not delete the entry from markedCleaned().
> We'll have a separate process that cleans 'f' and 's' records after X minutes 
> (or after > N records for a given partition exist).
> This allows SHOW COMPACTIONS to show some history info and how many times 
> compaction failed on a given partition (subject to the retention interval), so 
> that we don't have to call markCleaned() on Compactor failures, at the same 
> time preventing the Compactor from constantly getting stuck on the same bad 
> partition/table.
> Ideally we'd want to include an END_TIME field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same

2016-01-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108052#comment-15108052
 ] 

Xuefu Zhang commented on HIVE-12736:


I also tried memcheck.q, and it passed locally for me too. It doesn't seem 
related to the patch regardless.

As to the patch, it looks good to me. However, I don't know much about mapjoin 
with hints, and I'm not sure why groupby and union cannot exist before a mapjoin. 
If you have some explanation, that will help.

+1 for the patch.
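For anyone hitting the same error on MR, a sketch of the usual workaround: drop the hint and let the optimizer pick the join strategy.

{code}
set hive.execution.engine=mr;
set hive.auto.convert.join=true;
-- No mapjoin hint; the optimizer handles the join with the union subquery.
select * from staff s
join (select id, date_ from trade union all select id, "test" from trade) t
on s.id = t.id;
{code}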

> It seems that result of Hive on Spark be mistaken and result of Hive and Hive 
> on Spark are not the same
> ---
>
> Key: HIVE-12736
> URL: https://issues.apache.org/jira/browse/HIVE-12736
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.1, 1.2.1
>Reporter: JoneZhang
>Assignee: Chengxiang Li
> Attachments: HIVE-12736.1-spark.patch, HIVE-12736.2-spark.patch, 
> HIVE-12736.3-spark.patch, HIVE-12736.4-spark.patch, HIVE-12736.5-spark.patch, 
> HIVE-12736.5-spark.patch
>
>
> {code}
> select  * from staff;
> 1 jone22  1
> 2 lucy21  1
> 3 hmm 22  2
> 4 james   24  3
> 5 xiaoliu 23  3
> select id,date_ from trade union all select id,"test" from trade ;
> 1 201510210908
> 2 201509080234
> 2 201509080235
> 1 test
> 2 test
> 2 test
> set hive.execution.engine=spark;
> set spark.master=local;
> select /*+mapjoin(t)*/ * from staff s join 
> (select id,date_ from trade union all select id,"test" from trade ) t on 
> s.id=t.id;
> 1 jone22  1   1   201510210908
> 2 lucy21  1   2   201509080234
> 2 lucy21  1   2   201509080235
> set hive.execution.engine=mr;
> select /*+mapjoin(t)*/ * from staff s join 
> (select id,date_ from trade union all select id,"test" from trade ) t on 
> s.id=t.id;
> FAILED: SemanticException [Error 10227]: Not all clauses are supported with 
> mapjoin hint. Please remove mapjoin hint.
> {code}
> I have two questions:
> 1. Why does the result of Hive on Spark not include the following records?
> {code}
> 1 jone22  1   1   test
> 2 lucy21  1   2   test
> 2 lucy21  1   2   test
> {code}
> 2. Why are there two different ways of handling the same query?
> explain 1:
> {code}
> set hive.execution.engine=spark;
> set spark.master=local;
> explain 
> select id,date_ from trade union all select id,"test" from trade;
> OK
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Spark
>   DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: trade
>   Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: id (type: int), date_ (type: string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 6 Data size: 48 Basic stats: 
> COMPLETE Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 12 Data size: 96 Basic stats: 
> COMPLETE Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.TextInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>   serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> Map 2 
> Map Operator Tree:
> TableScan
>   alias: trade
>   Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: id (type: int), 'test' (type: string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 6 Data size: 48 Basic stats: 
> COMPLETE Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 12 Data size: 96 Basic stats: 
> COMPLETE Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.TextInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>   serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> ListSink
> {code}
> 

[jira] [Updated] (HIVE-12446) Tracking jira for changes required for move to Tez 0.8.2

2016-01-19 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-12446:
--
Attachment: HIVE-12446.02.patch

> Tracking jira for changes required for move to Tez 0.8.2
> 
>
> Key: HIVE-12446
> URL: https://issues.apache.org/jira/browse/HIVE-12446
> Project: Hive
>  Issue Type: Task
>  Components: llap
>Reporter: Siddharth Seth
> Attachments: HIVE-12446.02.patch, HIVE-12446.02.patch, 
> HIVE-12446.combined.1.patch, HIVE-12446.combined.1.txt, HIVE-12446.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12892) Add global change versioning to permanent functions in metastore

2016-01-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12892:

Attachment: HIVE-12892.WIP.patch

WIP patch for backup. In the first cut, the version would be queryable. Perhaps 
it could also have ZK notifications to avoid overloading the metastore with many 
subscribers connecting.
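To make the ZK idea concrete, a very rough subscriber sketch; the znode path and the refresh step are pure assumptions at this stage:

{code}
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Watches a hypothetical global version znode instead of polling the metastore.
public class FunctionVersionWatcher implements Watcher {
  private final ZooKeeper zk;

  public FunctionVersionWatcher(String connectString) throws Exception {
    zk = new ZooKeeper(connectString, 30000, this);
  }

  public byte[] readVersion() throws Exception {
    return zk.getData("/hive/functions/version", this, null);  // assumed path
  }

  @Override
  public void process(WatchedEvent event) {
    if (event.getType() == Event.EventType.NodeDataChanged) {
      // Hypothetical: re-fetch only the changed functions from the metastore.
      // A real design would also re-register the watch and handle reconnects.
    }
  }
}
{code}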

> Add global change versioning to permanent functions in metastore
> 
>
> Key: HIVE-12892
> URL: https://issues.apache.org/jira/browse/HIVE-12892
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12892.WIP.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12894) Detect whether ORC is reading from ACID table correctly for Schema Evolution

2016-01-19 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12894:

Attachment: HIVE-12894.01.patch

This patch includes the uncommitted changes for HIVE-12887, too.

> Detect whether ORC is reading from ACID table correctly for Schema Evolution
> 
>
> Key: HIVE-12894
> URL: https://issues.apache.org/jira/browse/HIVE-12894
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12894.01.patch
>
>
> Set a configuration variable with the 'transactional' property to indicate 
> that the table is ACID.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location

2016-01-19 Thread Lenni Kuff (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108163#comment-15108163
 ] 

Lenni Kuff commented on HIVE-12891:
---

Comments:
  - Do you want to expand all of these paths to absolute? Some of them are HDFS 
scratch dirs; I'm not sure if we want to support relative paths for those or just 
java.io.tmpdir (a sketch of the latter follows below).
  - Update the config documentation to mention that both relative and absolute 
paths are allowed.
  - Is it easy to add a test for this?
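On the first point, the local-only variant could be as small as normalizing the system property up front; a sketch, not the actual patch:

{code}
// Expand a possibly-relative java.io.tmpdir to an absolute local path.
String tmpDir = System.getProperty("java.io.tmpdir");
java.io.File dir = new java.io.File(tmpDir);
if (!dir.isAbsolute()) {
  System.setProperty("java.io.tmpdir", dir.getAbsolutePath());
}
{code}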

> Hive fails when java.io.tmpdir is set to a relative location
> 
>
> Key: HIVE-12891
> URL: https://issues.apache.org/jira/browse/HIVE-12891
> Project: Hive
>  Issue Type: Bug
>Reporter: Reuben Kuhnert
>Assignee: Reuben Kuhnert
> Attachments: HIVE-12891.01.19.2016.01.patch
>
>
> The function {{SessionState.createSessionDirs}} fails when trying to create 
> directories where {{java.io.tmpdir}} is set to a relative location.
> {code}
> \[uber-SubtaskRunner] ERROR o.a.h.hive.ql.Driver - FAILED: 
> IllegalArgumentException java.net.URISyntaxException: Relative path in 
> absolute URI: 
> file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1
> ...
> Minor variations:
> \[uber-SubtaskRunner] ERROR o.a.h.hive.ql.Driver - FAILED: SemanticException 
> Exception while processing Exception while writing out the local file 
> o.a.h.hive.ql.parse.SemanticException: Exception while processing exception 
> while writing out local file 
> ... 
> caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: 
> Relative path in absolute URI: 
> file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 
> at o.a.h.fs.Path.initialize (206) 
> at o.a.h.fs.Path.<init>(197)... 
> at o.a.h.hive.ql.context.getScratchDir(267) 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12855) LLAP: add checks when resolving UDFs to enforce whitelist

2016-01-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106651#comment-15106651
 ] 

Hive QA commented on HIVE-12855:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12782985/HIVE-12855.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 10025 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_bmj_schema_evolution
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dml
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_schema_evolution
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_self_join
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_dynamic_partition
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_2
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_join_part_col_char
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6670/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6670/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6670/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12782985 - PreCommit-HIVE-TRUNK-Build

> LLAP: add checks when resolving UDFs to enforce whitelist
> -
>
> Key: HIVE-12855
> URL: https://issues.apache.org/jira/browse/HIVE-12855
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12855.part.patch, HIVE-12855.patch
>
>
> Currently, adding a temporary UDF and calling LLAP with it (bypassing the 
> LlapDecider check, I did it by just modifying the source) only fails because 
> the class could not be found. If the UDF was accessible to LLAP, it would 
> execute. Inside the daemon, UDF instantiation should fail for custom UDFs 
> (and only succeed for whitelisted custom UDFs, once that is implemented).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12478) Improve Hive/Calcite Trasitive Predicate inference

2016-01-19 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106772#comment-15106772
 ] 

Jesus Camacho Rodriguez commented on HIVE-12478:


[~jpullokkaran], could you take a look at the current version of the patch?

Finally, I had to store state about transitive inference in the operator 
itself; it was the only reasonable way of implementing exhaustive PPD, 
inference, and constant propagation using HepPlanner.
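For reference, the inference at stake in the query from the description, stated informally:

{code}
-- From the join condition srcpart.ds = s.ds and the subquery filter
-- ds = '2008-04-08', the rule should infer srcpart.ds = '2008-04-08'
-- and push it down to prune srcpart's partitions.
EXPLAIN select * from srcpart join
  (select ds as ds, ds as `date` from srcpart where (ds = '2008-04-08' and value=1)) s
on (srcpart.ds = s.ds);
{code}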

> Improve Hive/Calcite Trasitive Predicate inference
> --
>
> Key: HIVE-12478
> URL: https://issues.apache.org/jira/browse/HIVE-12478
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Laljo John Pullokkaran
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12478.01.patch, HIVE-12478.02.patch, 
> HIVE-12478.03.patch, HIVE-12478.04.patch, HIVE-12478.05.patch, 
> HIVE-12478.06.patch, HIVE-12478.07.patch, HIVE-12478.08.patch, 
> HIVE-12478.patch
>
>
> HiveJoinPushTransitivePredicatesRule does not pull up predicates for 
> transitive inference if they contain more than one column.
> EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from 
> srcpart where  (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12787) Trace improvement - Inconsistent logging upon shutdown-start of the Hive metastore process

2016-01-19 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106775#comment-15106775
 ] 

Aihua Xu commented on HIVE-12787:
-

+1. Makes sense to me. 

> Trace improvement - Inconsistent logging upon shutdown-start of the Hive 
> metastore process
> --
>
> Key: HIVE-12787
> URL: https://issues.apache.org/jira/browse/HIVE-12787
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
> Attachments: HIVE-12787.1.patch
>
>
> The log at: 
> https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L793
>  logged at the start of the shutdown of the Hive metastore process can be 
> improved to match the finish of the shutdown log at: 
> https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L793
> by rephrasing from: "Shutting down the object store..." to: "Metastore 
> shutdown started...". This will match the shutdown completion log: "Metastore 
> shutdown complete.".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same

2016-01-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106714#comment-15106714
 ] 

Hive QA commented on HIVE-12736:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12783072/HIVE-12736.5-spark.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9870 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_memcheck
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1036/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1036/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-1036/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12783072 - PreCommit-HIVE-SPARK-Build

> It seems that result of Hive on Spark be mistaken and result of Hive and Hive 
> on Spark are not the same
> ---
>
> Key: HIVE-12736
> URL: https://issues.apache.org/jira/browse/HIVE-12736
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.1, 1.2.1
>Reporter: JoneZhang
>Assignee: Chengxiang Li
> Attachments: HIVE-12736.1-spark.patch, HIVE-12736.2-spark.patch, 
> HIVE-12736.3-spark.patch, HIVE-12736.4-spark.patch, HIVE-12736.5-spark.patch
>
>
> {code}
> select  * from staff;
> 1 jone22  1
> 2 lucy21  1
> 3 hmm 22  2
> 4 james   24  3
> 5 xiaoliu 23  3
> select id,date_ from trade union all select id,"test" from trade ;
> 1 201510210908
> 2 201509080234
> 2 201509080235
> 1 test
> 2 test
> 2 test
> set hive.execution.engine=spark;
> set spark.master=local;
> select /*+mapjoin(t)*/ * from staff s join 
> (select id,date_ from trade union all select id,"test" from trade ) t on 
> s.id=t.id;
> 1 jone22  1   1   201510210908
> 2 lucy21  1   2   201509080234
> 2 lucy21  1   2   201509080235
> set hive.execution.engine=mr;
> select /*+mapjoin(t)*/ * from staff s join 
> (select id,date_ from trade union all select id,"test" from trade ) t on 
> s.id=t.id;
> FAILED: SemanticException [Error 10227]: Not all clauses are supported with 
> mapjoin hint. Please remove mapjoin hint.
> {code}
> I have two questions:
> 1. Why does the result of Hive on Spark not include the following records?
> {code}
> 1 jone22  1   1   test
> 2 lucy21  1   2   test
> 2 lucy21  1   2   test
> {code}
> 2. Why are there two different ways of handling the same query?
> explain 1:
> {code}
> set hive.execution.engine=spark;
> set spark.master=local;
> explain 
> select id,date_ from trade union all select id,"test" from trade;
> OK
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Spark
>   DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: trade
>   Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: id (type: int), date_ (type: string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 6 Data size: 48 Basic stats: 
> COMPLETE Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 12 Data size: 96 Basic stats: 
> COMPLETE Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.TextInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>   serde: 
> 

[jira] [Updated] (HIVE-12478) Improve Hive/Calcite Trasitive Predicate inference

2016-01-19 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-12478:
---
Attachment: HIVE-12478.08.patch

> Improve Hive/Calcite Trasitive Predicate inference
> --
>
> Key: HIVE-12478
> URL: https://issues.apache.org/jira/browse/HIVE-12478
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Laljo John Pullokkaran
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12478.01.patch, HIVE-12478.02.patch, 
> HIVE-12478.03.patch, HIVE-12478.04.patch, HIVE-12478.05.patch, 
> HIVE-12478.06.patch, HIVE-12478.07.patch, HIVE-12478.08.patch, 
> HIVE-12478.patch
>
>
> HiveJoinPushTransitivePredicatesRule does not pull up predicates for 
> transitive inference if they contain more than one column.
> EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from 
> srcpart where  (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12244) Refactoring code for avoiding of comparison of Strings and do comparison on Path

2016-01-19 Thread Alina Abramova (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alina Abramova updated HIVE-12244:
--
Attachment: HIVE-12244.2.patch

Rebased patch
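A small sketch of the pitfall this refactoring avoids: Path normalizes its string form, while plain String comparison does not.

{code}
import org.apache.hadoop.fs.Path;

public class PathCompare {
  public static void main(String[] args) {
    String a = "/tmp/dir//file";
    String b = "/tmp/dir/file";
    System.out.println(a.equals(b));                      // false: raw strings differ
    System.out.println(new Path(a).equals(new Path(b)));  // true: Path collapses "//"
  }
}
{code}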

> Refactoring code for avoiding of comparison of Strings and do comparison on 
> Path
> 
>
> Key: HIVE-12244
> URL: https://issues.apache.org/jira/browse/HIVE-12244
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.1
>Reporter: Alina Abramova
>Assignee: Alina Abramova
>Priority: Minor
>  Labels: patch
> Fix For: 1.2.1
>
> Attachments: HIVE-12244.1.patch, HIVE-12244.2.patch
>
>
> In Hive, a String is often used to represent a path, and this causes new issues.
> We need to compare paths with equals(), but comparing Strings is often not 
> right in terms of comparing paths.
> I think if we use Path from org.apache.hadoop.fs we will avoid new problems 
> in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12887) Handle ORC schema on read with fewer columns than file schema (after Schema Evolution changes)

2016-01-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106905#comment-15106905
 ] 

Hive QA commented on HIVE-12887:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12783001/HIVE-12887.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10010 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6671/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6671/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6671/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12783001 - PreCommit-HIVE-TRUNK-Build

> Handle ORC schema on read with fewer columns than file schema (after Schema 
> Evolution changes)
> --
>
> Key: HIVE-12887
> URL: https://issues.apache.org/jira/browse/HIVE-12887
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12887.01.patch
>
>
> Exception caused by reading after column removal.
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 10, Size: 10
>   at java.util.ArrayList.rangeCheck(ArrayList.java:653)
>   at java.util.ArrayList.get(ArrayList.java:429)
>   at java.util.Collections$UnmodifiableList.get(Collections.java:1309)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240)
>   at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.<init>(TreeReaderFactory.java:2053)
>   at 
> org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2481)
>   at 
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:216)
>   at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:598)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:179)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.<init>(OrcRawRecordMerger.java:222)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:442)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1285)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1165)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12244) Refactoring code for avoiding of comparison of Strings and do comparison on Path

2016-01-19 Thread Alina Abramova (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106909#comment-15106909
 ] 

Alina Abramova commented on HIVE-12244:
---

Rebased patch was attached to the issue.

> Refactoring code for avoiding of comparison of Strings and do comparison on 
> Path
> 
>
> Key: HIVE-12244
> URL: https://issues.apache.org/jira/browse/HIVE-12244
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.1
>Reporter: Alina Abramova
>Assignee: Alina Abramova
>Priority: Minor
>  Labels: patch
> Fix For: 1.2.1
>
> Attachments: HIVE-12244.1.patch, HIVE-12244.2.patch
>
>
> In Hive, a String is often used to represent a path, and this causes new issues.
> We need to compare paths with equals(), but comparing Strings is often not 
> right in terms of comparing paths.
> I think if we use Path from org.apache.hadoop.fs we will avoid new problems 
> in the future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12889) Support COUNT(DISTINCT) for partitioning query.

2016-01-19 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12889:

Summary: Support COUNT(DISTINCT) for partitioning query.  (was: Support 
COUNT(DISTINCT) for partitioning qurery.)

> Support COUNT(DISTINCT) for partitioning query.
> ---
>
> Key: HIVE-12889
> URL: https://issues.apache.org/jira/browse/HIVE-12889
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>
> We need to support avg(distinct), count(distinct), sum(distinct) for the 
> parent jira HIVE-9534. Separate the work for count(distinct) in this subtask.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12478) Improve Hive/Calcite Trasitive Predicate inference

2016-01-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107784#comment-15107784
 ] 

Hive QA commented on HIVE-12478:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12783103/HIVE-12478.08.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10023 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_semijoin4
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6675/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6675/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6675/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12783103 - PreCommit-HIVE-TRUNK-Build

> Improve Hive/Calcite Trasitive Predicate inference
> --
>
> Key: HIVE-12478
> URL: https://issues.apache.org/jira/browse/HIVE-12478
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Laljo John Pullokkaran
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-12478.01.patch, HIVE-12478.02.patch, 
> HIVE-12478.03.patch, HIVE-12478.04.patch, HIVE-12478.05.patch, 
> HIVE-12478.06.patch, HIVE-12478.07.patch, HIVE-12478.08.patch, 
> HIVE-12478.patch
>
>
> HiveJoinPushTransitivePredicatesRule does not pull up predicates for 
> transitive inference if they contain more than one column.
> EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from 
> srcpart where  (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds);



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same

2016-01-19 Thread Chengxiang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengxiang Li updated HIVE-12736:
-
Attachment: HIVE-12736.5-spark.patch

I can't reproduce the mapjoin_memcheck.q failure locally; uploading the patch again 
to verify.

> It seems that result of Hive on Spark be mistaken and result of Hive and Hive 
> on Spark are not the same
> ---
>
> Key: HIVE-12736
> URL: https://issues.apache.org/jira/browse/HIVE-12736
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.1, 1.2.1
>Reporter: JoneZhang
>Assignee: Chengxiang Li
> Attachments: HIVE-12736.1-spark.patch, HIVE-12736.2-spark.patch, 
> HIVE-12736.3-spark.patch, HIVE-12736.4-spark.patch, HIVE-12736.5-spark.patch, 
> HIVE-12736.5-spark.patch
>
>
> {code}
> select  * from staff;
> 1 jone22  1
> 2 lucy21  1
> 3 hmm 22  2
> 4 james   24  3
> 5 xiaoliu 23  3
> select id,date_ from trade union all select id,"test" from trade ;
> 1 201510210908
> 2 201509080234
> 2 201509080235
> 1 test
> 2 test
> 2 test
> set hive.execution.engine=spark;
> set spark.master=local;
> select /*+mapjoin(t)*/ * from staff s join 
> (select id,date_ from trade union all select id,"test" from trade ) t on 
> s.id=t.id;
> 1 jone22  1   1   201510210908
> 2 lucy21  1   2   201509080234
> 2 lucy21  1   2   201509080235
> set hive.execution.engine=mr;
> select /*+mapjoin(t)*/ * from staff s join 
> (select id,date_ from trade union all select id,"test" from trade ) t on 
> s.id=t.id;
> FAILED: SemanticException [Error 10227]: Not all clauses are supported with 
> mapjoin hint. Please remove mapjoin hint.
> {code}
> I have two questions:
> 1. Why does the result of Hive on Spark not include the following records?
> {code}
> 1 jone22  1   1   test
> 2 lucy21  1   2   test
> 2 lucy21  1   2   test
> {code}
> 2. Why are there two different ways of handling the same query?
> explain 1:
> {code}
> set hive.execution.engine=spark;
> set spark.master=local;
> explain 
> select id,date_ from trade union all select id,"test" from trade;
> OK
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Spark
>   DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: trade
>   Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: id (type: int), date_ (type: string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 6 Data size: 48 Basic stats: 
> COMPLETE Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 12 Data size: 96 Basic stats: 
> COMPLETE Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.TextInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>   serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> Map 2 
> Map Operator Tree:
> TableScan
>   alias: trade
>   Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE 
> Column stats: NONE
>   Select Operator
> expressions: id (type: int), 'test' (type: string)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 6 Data size: 48 Basic stats: 
> COMPLETE Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 12 Data size: 96 Basic stats: 
> COMPLETE Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.TextInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>   serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> ListSink
> {code}
> explain 2:
> {code}
> set hive.execution.engine=spark;
> set spark.master=local;
> explain 
> select /*+mapjoin(t)*/ * from staff s join 
> (select id,date_ from trade union all select id,"test" from trade ) t on 
> s.id=t.id;

[jira] [Updated] (HIVE-12889) Support COUNT(DISTINCT) for partitioning query.

2016-01-19 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12889:

Attachment: HIVE-12889.patch

> Support COUNT(DISTINCT) for partitioning query.
> ---
>
> Key: HIVE-12889
> URL: https://issues.apache.org/jira/browse/HIVE-12889
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12889.patch
>
>
> We need to support avg(distinct), count(distinct), and sum(distinct) for the 
> parent JIRA HIVE-9534. The count(distinct) work is split out into this subtask.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5801) Support for reader/writer of ORC format for R environment

2016-01-19 Thread Jorge Martinez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106968#comment-15106968
 ] 

Jorge Martinez commented on HIVE-5801:
--

Hi [~mhausenblas], there's an R package to read CSV and ORC files from HDFS. 
It's on GitHub: https://github.com/vertica/r-dataconnector

> Support for reader/writer of ORC format for R environment
> -
>
> Key: HIVE-5801
> URL: https://issues.apache.org/jira/browse/HIVE-5801
> Project: Hive
>  Issue Type: Improvement
>Reporter: Michael Hausenblas
>Priority: Minor
>
> It would be great if the ORC format were directly accessible from R [1], 
> that is, if a reader/writer were provided for it.
> [1] http://www.r-project.org/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12889) Support COUNT(DISTINCT) for partitioning query.

2016-01-19 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106994#comment-15106994
 ] 

Aihua Xu commented on HIVE-12889:
-

Uploaded the first patch. In this patch:

1. The parser is enabled to properly parse queries such as "count(distinct) 
over (partition by c1)".
2. ORDER BY and windowing frames are not supported together with the distinct 
functions, due to performance concerns and implementation requirements.
3. We insert the distinct fields into the ORDER BY list, so during counting we 
only need to compare the current row against the previously remembered row.
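
For illustration, a minimal query of the shape this patch targets, using a 
hypothetical table t(c1 string, c2 int) that is not taken from the patch; per 
point 2 above, adding an ORDER BY or a frame to the OVER clause is expected 
to be rejected.

{code}
-- Hypothetical example of the syntax enabled by this patch:
-- a distinct count per partition, attached to every row.
select c1,
       c2,
       count(distinct c2) over (partition by c1) as cnt_distinct_c2
from t;
{code}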

> Support COUNT(DISTINCT) for partitioning query.
> ---
>
> Key: HIVE-12889
> URL: https://issues.apache.org/jira/browse/HIVE-12889
> Project: Hive
>  Issue Type: Sub-task
>  Components: PTF-Windowing
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-12889.patch
>
>
> We need to support avg(distinct), count(distinct), and sum(distinct) for the 
> parent JIRA HIVE-9534. The count(distinct) work is split out into this subtask.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12885) LDAP Authenticator improvements

2016-01-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107050#comment-15107050
 ] 

Hive QA commented on HIVE-12885:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12783013/HIVE-12885.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10025 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.TestTxnCommands.exchangePartition
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6672/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6672/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6672/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12783013 - PreCommit-HIVE-TRUNK-Build

> LDAP Authenticator improvements
> ---
>
> Key: HIVE-12885
> URL: https://issues.apache.org/jira/browse/HIVE-12885
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-12885.2.patch, HIVE-12885.patch
>
>
> Currently Hive's LDAP Atn provider assumes certain defaults to keep its 
> configuration simple.
> 1) One of the assumptions is the presence of a "distinguishedName" attribute. 
> In certain non-standard LDAP implementations, this attribute may not be 
> available. Since getNameInNamespace() returns the same value, that API should 
> be used instead of basing all LDAP searches on this attribute.
> 2) It also assumes that the "user" value being passed in will be able to bind 
> to LDAP. However, certain LDAP implementations by default only allow the full 
> DN to be used; short user names are not permitted. We need to support short 
> names too when the Hive configuration only specifies "BaseDN" (not 
> userDNPatterns). So instead of hard-coding "uid" or "CN" as the key for the 
> short usernames, it is probably better to make this a configurable parameter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12446) Tracking jira for changes required for move to Tez 0.8.2

2016-01-19 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-12446:
--
Attachment: HIVE-12446.02.patch

> Tracking jira for changes required for move to Tez 0.8.2
> 
>
> Key: HIVE-12446
> URL: https://issues.apache.org/jira/browse/HIVE-12446
> Project: Hive
>  Issue Type: Task
>  Components: llap
>Reporter: Siddharth Seth
> Attachments: HIVE-12446.02.patch, HIVE-12446.combined.1.patch, 
> HIVE-12446.combined.1.txt, HIVE-12446.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)