[jira] [Commented] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases
[ https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106468#comment-15106468 ] Wan Chang commented on HIVE-11097: -- Hi [~prasanth_j], I use Hive 0.13.1 and the bug occurs with some complex SQL, but I couldn't reproduce the case on the master branch, so I don't know whether it has been fixed yet. > HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases > - > > Key: HIVE-11097 > URL: https://issues.apache.org/jira/browse/HIVE-11097 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0 > Environment: Hive 0.13.1, Hive 2.0.0, hadoop 2.4.1 >Reporter: Wan Chang >Assignee: Wan Chang >Priority: Critical > Attachments: HIVE-11097.1.patch, HIVE-11097.2.patch, > HIVE-11097.3.patch > > > Say we have a SQL query such as > {code} > create table if not exists test_orc_src (a int, b int, c int) stored as orc; > create table if not exists test_orc_src2 (a int, b int, d int) stored as orc; > insert overwrite table test_orc_src select 1,2,3 from src limit 1; > insert overwrite table test_orc_src2 select 1,2,4 from src limit 1; > set hive.auto.convert.join = false; > set hive.execution.engine=mr; > select > tb.c > from test.test_orc_src tb > join (select * from test.test_orc_src2) tm > on tb.a = tm.a > where tb.b = 2 > {code} > The correct result is 3, but it produced no result. > I found that in HiveInputFormat.pushProjectionsAndFilters > {code} > match = splitPath.startsWith(key) || splitPathWithNoSchema.startsWith(key); > {code} > It uses startsWith to match the split path against alias paths, so tm will match two aliases > in this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
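The alias-matching failure described above can be demonstrated in isolation. The sketch below is not Hive code (the paths and helper names are hypothetical), but it shows why a plain String.startsWith comparison lets a split under test_orc_src2 match the test_orc_src alias as well, while a comparison on path-component boundaries does not:

```java
public class AliasMatchSketch {
    // Buggy check, mirroring the startsWith comparison cited in the report.
    static boolean matchesByPrefix(String splitPath, String aliasKey) {
        return splitPath.startsWith(aliasKey);
    }

    // Safer check: the alias key must end on a path-component boundary.
    static boolean matchesByComponent(String splitPath, String aliasKey) {
        return splitPath.equals(aliasKey) || splitPath.startsWith(aliasKey + "/");
    }

    public static void main(String[] args) {
        // Hypothetical warehouse layout: two tables whose names share a prefix.
        String aliasTb = "/warehouse/test_orc_src";
        String aliasTm = "/warehouse/test_orc_src2";
        String split = "/warehouse/test_orc_src2/000000_0";

        // The prefix test matches BOTH aliases for a split under test_orc_src2.
        System.out.println(matchesByPrefix(split, aliasTb));    // true (wrong)
        System.out.println(matchesByPrefix(split, aliasTm));    // true

        // The component-boundary test matches only the right alias.
        System.out.println(matchesByComponent(split, aliasTb)); // false
        System.out.println(matchesByComponent(split, aliasTm)); // true
    }
}
```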
[jira] [Commented] (HIVE-12429) Switch default Hive authorization to SQLStandardAuth in 2.0
[ https://issues.apache.org/jira/browse/HIVE-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106568#comment-15106568 ] Hive QA commented on HIVE-12429: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12782970/HIVE-12429.18.patch {color:green}SUCCESS:{color} +1 due to 54 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9992 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-tez_bmj_schema_evolution.q-orc_merge5.q-vectorization_limit.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6668/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6668/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6668/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12782970 - PreCommit-HIVE-TRUNK-Build > Switch default Hive authorization to SQLStandardAuth in 2.0 > --- > > Key: HIVE-12429 > URL: https://issues.apache.org/jira/browse/HIVE-12429 > Project: Hive > Issue Type: Task > Components: Authorization, Security >Affects Versions: 2.0.0 >Reporter: Alan Gates >Assignee: Daniel Dai > Attachments: HIVE-12429.1.patch, HIVE-12429.10.patch, > HIVE-12429.11.patch, HIVE-12429.12.patch, HIVE-12429.13.patch, > HIVE-12429.14.patch, HIVE-12429.15.patch, HIVE-12429.16.patch, > HIVE-12429.17.patch, HIVE-12429.18.patch, HIVE-12429.2.patch, > HIVE-12429.3.patch, HIVE-12429.4.patch, HIVE-12429.5.patch, > HIVE-12429.6.patch, HIVE-12429.7.patch, HIVE-12429.8.patch, HIVE-12429.9.patch > > > Hive's default authorization is not real security, as it does not secure a > number of features and anyone can grant access to any object to any user. We > should switch the default to SQLStandardAuth, which provides real > authorization. > As this is a backwards incompatible change this was hard to do previously, > but 2.0 gives us a place to do this type of change. > By default authorization will still be off, as there are a few other things > to set when turning on authorization (such as the list of admin users). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12446) Tracking jira for changes required for move to Tez 0.8.2
[ https://issues.apache.org/jira/browse/HIVE-12446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106572#comment-15106572 ] Hive QA commented on HIVE-12446: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12782971/HIVE-12446.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6669/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6669/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6669/ Messages: {noformat} This message was trimmed, see log for full details [INFO] Hive Service RPC [INFO] Spark Remote Client [INFO] Hive Query Language [INFO] Hive Service [INFO] Hive Accumulo Handler [INFO] Hive JDBC [INFO] Hive Beeline [INFO] Hive CLI [INFO] Hive Contrib [INFO] Hive HBase Handler [INFO] Hive HCatalog [INFO] Hive HCatalog Core [INFO] Hive HCatalog Pig Adapter [INFO] Hive HCatalog Server Extensions [INFO] Hive HCatalog Webhcat Java Client [INFO] Hive HCatalog Webhcat [INFO] Hive HCatalog Streaming [INFO] Hive HPL/SQL [INFO] Hive HWI [INFO] Hive ODBC [INFO] Hive Llap Server [INFO] Hive Shims Aggregator [INFO] Hive TestUtils [INFO] Hive Packaging [INFO] [INFO] [INFO] Building Hive 2.1.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive --- [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive --- [INFO] Executing tasks main: [INFO] 
Executed tasks [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/target/tmp/conf [copy] Copying 16 files to /data/hive-ptest/working/apache-github-source-source/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ hive --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive/2.1.0-SNAPSHOT/hive-2.1.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Shims Common 2.1.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-shims-common --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/shims/common/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/shims/common (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-shims-common --- [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-shims-common --- [INFO] [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-shims-common --- [INFO] Using 'UTF-8' encoding to copy filtered resources. 
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-github-source-source/shims/common/src/main/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-shims-common --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-shims-common --- [INFO] Compiling 29 source files to /data/hive-ptest/working/apache-github-source-source/shims/common/target/classes [WARNING] /data/hive-ptest/working/apache-github-source-source/shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java: Some input files use or override a deprecated API. [WARNING] /data/hive-ptest/working/apache-github-source-source/shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShimsSecure.java: Recompile with
[jira] [Updated] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same
[ https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-12736: - Attachment: HIVE-12736.5-spark.patch [~xuefuz], yes, it's related; I missed something here. Group By before MapJoin is not allowed, and in MR mode it uses {{ReduceSinkOperator}} to check whether there is a Group By before MapJoin, which conflicts with Spark mode, as mentioned before. Instead of validating MapJoin compatibility with other operators through {{opAllowedBeforeMapJoin()}} and {{opAllowedAfterMapJoin()}}, it should be easier and more proper to implement this through pattern matching. I didn't rewrite the validation for MR mode, just added new validation logic for Spark mode based on pattern matching. > It seems that result of Hive on Spark be mistaken and result of Hive and Hive > on Spark are not the same > --- > > Key: HIVE-12736 > URL: https://issues.apache.org/jira/browse/HIVE-12736 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.1, 1.2.1 >Reporter: JoneZhang >Assignee: Chengxiang Li > Attachments: HIVE-12736.1-spark.patch, HIVE-12736.2-spark.patch, > HIVE-12736.3-spark.patch, HIVE-12736.4-spark.patch, HIVE-12736.5-spark.patch > > > {code} > select * from staff; > 1 jone22 1 > 2 lucy21 1 > 3 hmm 22 2 > 4 james 24 3 > 5 xiaoliu 23 3 > select id,date_ from trade union all select id,"test" from trade ; > 1 201510210908 > 2 201509080234 > 2 201509080235 > 1 test > 2 test > 2 test > set hive.execution.engine=spark; > set spark.master=local; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > 1 jone22 1 1 201510210908 > 2 lucy21 1 2 201509080234 > 2 lucy21 1 2 201509080235 > set hive.execution.engine=mr; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > FAILED: SemanticException [Error 10227]: Not all clauses are supported with > mapjoin hint. 
Please remove mapjoin hint. > {code} > I have two questions > 1.Why result of hive on spark not include the following record? > {code} > 1 jone22 1 1 test > 2 lucy21 1 2 test > 2 lucy21 1 2 test > {code} > 2.Why there are two different ways of dealing same query? > explain 1: > {code} > set hive.execution.engine=spark; > set spark.master=local; > explain > select id,date_ from trade union all select id,"test" from trade; > OK > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Spark > DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), date_ (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Map 2 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), 'test' (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: >
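The pattern-match validation Chengxiang Li describes can be sketched on a toy operator DAG. The Op class and operator names below are hypothetical stand-ins, not Hive's actual Operator hierarchy: the idea is simply to walk each MapJoin's parent chain and reject the plan if a GroupBy appears upstream, rather than inferring its presence from a ReduceSinkOperator:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class MapJoinPatternCheck {
    // Hypothetical stand-in for an operator DAG node; not Hive's Operator class.
    static class Op {
        final String kind; // e.g. "TS", "SEL", "GBY", "MAPJOIN"
        final List<Op> parents = new ArrayList<>();
        Op(String kind, Op... ps) {
            this.kind = kind;
            for (Op p : ps) parents.add(p);
        }
    }

    // Pattern match: is there a GroupBy on any path leading into this MapJoin?
    static boolean hasGroupByAncestor(Op mapJoin) {
        Deque<Op> work = new ArrayDeque<>(mapJoin.parents);
        while (!work.isEmpty()) {
            Op op = work.pop();
            if (op.kind.equals("GBY")) {
                return true;
            }
            work.addAll(op.parents);
        }
        return false;
    }

    public static void main(String[] args) {
        // TS -> SEL -> MAPJOIN : allowed.
        Op ok = new Op("MAPJOIN", new Op("SEL", new Op("TS")));
        // TS -> GBY -> SEL -> MAPJOIN : rejected.
        Op bad = new Op("MAPJOIN", new Op("SEL", new Op("GBY", new Op("TS"))));
        System.out.println(hasGroupByAncestor(ok));  // false
        System.out.println(hasGroupByAncestor(bad)); // true
    }
}
```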
[jira] [Commented] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same
[ https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106575#comment-15106575 ] Chengxiang Li commented on HIVE-12736: -- Besides, during testing I found that TestSparkNegativeCliDriver actually runs in MR mode; I will create another JIRA to track it. > It seems that result of Hive on Spark be mistaken and result of Hive and Hive > on Spark are not the same > --- > > Key: HIVE-12736 > URL: https://issues.apache.org/jira/browse/HIVE-12736 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.1, 1.2.1 >Reporter: JoneZhang >Assignee: Chengxiang Li > Attachments: HIVE-12736.1-spark.patch, HIVE-12736.2-spark.patch, > HIVE-12736.3-spark.patch, HIVE-12736.4-spark.patch, HIVE-12736.5-spark.patch > > > {code} > select * from staff; > 1 jone22 1 > 2 lucy21 1 > 3 hmm 22 2 > 4 james 24 3 > 5 xiaoliu 23 3 > select id,date_ from trade union all select id,"test" from trade ; > 1 201510210908 > 2 201509080234 > 2 201509080235 > 1 test > 2 test > 2 test > set hive.execution.engine=spark; > set spark.master=local; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > 1 jone22 1 1 201510210908 > 2 lucy21 1 2 201509080234 > 2 lucy21 1 2 201509080235 > set hive.execution.engine=mr; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > FAILED: SemanticException [Error 10227]: Not all clauses are supported with > mapjoin hint. Please remove mapjoin hint. > {code} > I have two questions > 1.Why result of hive on spark not include the following record? > {code} > 1 jone22 1 1 test > 2 lucy21 1 2 test > 2 lucy21 1 2 test > {code} > 2.Why there are two different ways of dealing same query? 
> explain 1: > {code} > set hive.execution.engine=spark; > set spark.master=local; > explain > select id,date_ from trade union all select id,"test" from trade; > OK > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Spark > DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), date_ (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Map 2 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), 'test' (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > ListSink > {code} > explain 2: > {code} > set hive.execution.engine=spark; > set spark.master=local; > explain > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on >
[jira] [Commented] (HIVE-12864) StackOverflowError parsing queries with very large predicates
[ https://issues.apache.org/jira/browse/HIVE-12864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106500#comment-15106500 ] Jesus Camacho Rodriguez commented on HIVE-12864: [~pxiong], thanks for checking on this. The original five methods are tree traversals implemented recursively. The new ones in the patch are similar to the original ones, but I rewrote each of them to be iterative, using stacks. This avoids the StackOverflowError. Concretely: * setUnknownTokenBoundaries(): post-order * dump(StringBuilder sb): pre- and post-order * toStringTree(ASTNode rootNode): pre- and post-order * processPositionAlias(ASTNode ast): pre-order * findSubQueries(ASTNode node, List subQueries): pre-order These algorithms are part of the parsing logic, so you can pick any query in the q files (e.g. lineage2.q, lineage3.q, subquery_in.q) to walk through each of the algorithms. > StackOverflowError parsing queries with very large predicates > - > > Key: HIVE-12864 > URL: https://issues.apache.org/jira/browse/HIVE-12864 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 2.0.0, 2.1.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12864.01.patch, HIVE-12864.patch > > > We have seen that queries with very large predicates might fail with the > following stacktrace: > {noformat} > 2016-01-12 05:47:36,516|beaver.machine|INFO|552|5072|Thread-22|Exception in > thread "main" java.lang.StackOverflowError > 2016-01-12 05:47:36,517|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:145) > 2016-01-12 05:47:36,517|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,517|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 
05:47:36,517|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,517|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 
05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at > org.antlr.runtime.tree.CommonTree.setUnknownTokenBoundaries(CommonTree.java:146) > 2016-01-12 05:47:36,519|beaver.machine|INFO|552|5072|Thread-22|at >
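The recursive-to-iterative rewrite described in the comment above can be illustrated with a minimal sketch on a generic n-ary tree (not Hive's actual ASTNode class): an explicit Deque replaces the call stack, so predicate depth consumes heap instead of stack frames and the overflow in the trace cannot occur:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class IterativePreOrder {
    static class Node {
        final String token;
        final List<Node> children = new ArrayList<>();
        Node(String token) { this.token = token; }
        Node add(Node child) { children.add(child); return this; }
    }

    // Pre-order walk without recursion: an explicit stack replaces the call
    // stack, so a predicate nested tens of thousands of levels deep no longer
    // overflows the default JVM thread stack.
    static List<String> preOrder(Node root) {
        List<String> out = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            out.add(n.token);
            // Push children right-to-left so the leftmost child is visited first.
            for (int i = n.children.size() - 1; i >= 0; i--) {
                stack.push(n.children.get(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Simulate a deeply left-nested AND chain like the predicates in the report.
        Node root = new Node("AND0");
        Node cur = root;
        for (int i = 1; i < 100_000; i++) {
            Node next = new Node("AND" + i);
            cur.add(next).add(new Node("leaf" + i));
            cur = next;
        }
        // A recursive walk of this tree would overflow the default JVM stack;
        // the iterative version completes and visits every node once.
        System.out.println(preOrder(root).size());
    }
}
```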
[jira] [Updated] (HIVE-12429) Switch default Hive authorization to SQLStandardAuth in 2.0
[ https://issues.apache.org/jira/browse/HIVE-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-12429: Labels: TODOC2.0 (was: ) > Switch default Hive authorization to SQLStandardAuth in 2.0 > --- > > Key: HIVE-12429 > URL: https://issues.apache.org/jira/browse/HIVE-12429 > Project: Hive > Issue Type: Task > Components: Authorization, Security >Affects Versions: 2.0.0 >Reporter: Alan Gates >Assignee: Daniel Dai > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12429.1.patch, HIVE-12429.10.patch, > HIVE-12429.11.patch, HIVE-12429.12.patch, HIVE-12429.13.patch, > HIVE-12429.14.patch, HIVE-12429.15.patch, HIVE-12429.16.patch, > HIVE-12429.17.patch, HIVE-12429.18.patch, HIVE-12429.2.patch, > HIVE-12429.3.patch, HIVE-12429.4.patch, HIVE-12429.5.patch, > HIVE-12429.6.patch, HIVE-12429.7.patch, HIVE-12429.8.patch, HIVE-12429.9.patch > > > Hive's default authorization is not real security, as it does not secure a > number of features and anyone can grant access to any object to any user. We > should switch the default to SQLStandardAuth, which provides real > authorization. > As this is a backwards incompatible change this was hard to do previously, > but 2.0 gives us a place to do this type of change. > By default authorization will still be off, as there are a few other things > to set when turning on authorization (such as the list of admin users). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-1633) CombineHiveInputFormat fails with "cannot find dir for emptyFile"
[ https://issues.apache.org/jira/browse/HIVE-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107521#comment-15107521 ] Ergin Demirel commented on HIVE-1633: - We are still getting this error message when trying to load an empty table/file running in local mode. I tried adding "file://" in front of the path, but it didn't help. Can someone please clarify the solution here? Hive Version: 0.10.0+121-1.cdh4.3.0.p0.16~precise-cdh4.3.0 Thanks {code} java.io.FileNotFoundException: File does not exist: /tmp/hdfs/hive_2016-01-19_21-40-07_727_4067638808884572526/-mr-1/1/emptyFile at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:807) at org.apache.hadoop.mapred.lib.CombineFileInputFormat$OneFileInfo.(CombineFileInputFormat.java:462) at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:256) at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:212) at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:411) at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:377) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:387) at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1091) at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1083) at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:993) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:946) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:946) at 
org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:920) at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:448) at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:690) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:208) Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: /tmp/hdfs/hive_2016-01-19_21-40-07_727_4067638808884572526/-mr-1/1/emptyFile)' Execution failed with exit status: 1 {code} > CombineHiveInputFormat fails with "cannot find dir for emptyFile" > - > > Key: HIVE-1633 > URL: https://issues.apache.org/jira/browse/HIVE-1633 > Project: Hive > Issue Type: Bug > Components: Clients >Reporter: Amareshwari Sriramadasu >Assignee: Sreekanth Ramakrishnan > Fix For: 0.7.0 > > Attachments: HIVE-1633.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12867) Semantic Exception Error Msg should be with in the range of "10000 to 19999"
[ https://issues.apache.org/jira/browse/HIVE-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12867: - Attachment: HIVE-12867.2.patch > Semantic Exception Error Msg should be with in the range of "10000 to 19999" > > > Key: HIVE-12867 > URL: https://issues.apache.org/jira/browse/HIVE-12867 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: Laljo John Pullokkaran >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12867.1.patch, HIVE-12867.2.patch > > > At many places, errors encountered during semantic analysis are translated as the > generic error (GENERIC_ERROR, 40000) msg as opposed to a semantic error msg. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12867) Semantic Exception Error Msg should be with in the range of "10000 to 19999"
[ https://issues.apache.org/jira/browse/HIVE-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12867: - Attachment: (was: HIVE-12867.2.patch) > Semantic Exception Error Msg should be with in the range of "10000 to 19999" > > > Key: HIVE-12867 > URL: https://issues.apache.org/jira/browse/HIVE-12867 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: Laljo John Pullokkaran >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12867.1.patch, HIVE-12867.2.patch > > > At many places, errors encountered during semantic analysis are translated as the > generic error (GENERIC_ERROR, 40000) msg as opposed to a semantic error msg. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12798) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver.vector* queries failures due to NPE in Vectorizer.onExpressionHasNullSafes()
[ https://issues.apache.org/jira/browse/HIVE-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107596#comment-15107596 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-12798: -- I've committed this to master. cc-ing [~sershe] for approval for commit to branch-2.0. This can lead to an NPE even in the regular code path with vectorization turned on. > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > MiniTezCliDriver.vector* queries failures due to NPE in > Vectorizer.onExpressionHasNullSafes() > --- > > Key: HIVE-12798 > URL: https://issues.apache.org/jira/browse/HIVE-12798 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12798.1.patch > > > As of 01/04/2016, the following tests fail in the MiniTezCliDriver mode when > the cbo return path is enabled. We need to fix them : > {code} > vector_leftsemi_mapjoin > vector_join_filters > vector_interval_mapjoin > vector_left_outer_join > vectorized_mapjoin > vector_inner_join > vectorized_context > tez_vector_dynpart_hashjoin_1 > count > auto_sortmerge_join_6 > skewjoin > vector_auto_smb_mapjoin_14 > auto_join_filters > vector_outer_join0 > vector_outer_join1 > vector_outer_join2 > vector_outer_join3 > vector_outer_join4 > vector_outer_join5 > hybridgrace_hashjoin_1 > vector_mapjoin_reduce > vectorized_nested_mapjoin > vector_left_outer_join2 > vector_char_mapjoin1 > vector_decimal_mapjoin > vectorized_dynamic_partition_pruning > vector_varchar_mapjoin1 > {code} > This jira is intended to cover the vectorization issues related to the > MiniTezCliDriver failures caused by NPE via nullSafes array as shown below : > {code} > private boolean onExpressionHasNullSafes(MapJoinDesc desc) { > boolean[] nullSafes = desc.getNullSafes(); > for (boolean nullSafe : nullSafes) { > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
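The NPE in the snippet quoted above is the standard for-each-over-a-null-array failure. One plausible guard (a sketch, not the actual patch; the Desc class below is a hypothetical stand-in for MapJoinDesc) treats a null array as meaning there are no null-safe comparisons:

```java
public class NullSafesGuard {
    // Hypothetical stand-in for MapJoinDesc: getNullSafes() may return null
    // when no null-safe join keys were recorded for the join.
    static class Desc {
        private final boolean[] nullSafes;
        Desc(boolean[] nullSafes) { this.nullSafes = nullSafes; }
        boolean[] getNullSafes() { return nullSafes; }
    }

    // Guarded version of the quoted method: a null array simply means there
    // are no null-safe comparisons, so answer false instead of throwing.
    static boolean onExpressionHasNullSafes(Desc desc) {
        boolean[] nullSafes = desc.getNullSafes();
        if (nullSafes == null) {
            return false;
        }
        for (boolean nullSafe : nullSafes) {
            if (nullSafe) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(onExpressionHasNullSafes(new Desc(null)));                 // false, no NPE
        System.out.println(onExpressionHasNullSafes(new Desc(new boolean[]{false}))); // false
        System.out.println(onExpressionHasNullSafes(new Desc(new boolean[]{true})));  // true
    }
}
```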
[jira] [Updated] (HIVE-12855) LLAP: add checks when resolving UDFs to enforce whitelist
[ https://issues.apache.org/jira/browse/HIVE-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12855: Attachment: HIVE-12855.01.patch > LLAP: add checks when resolving UDFs to enforce whitelist > - > > Key: HIVE-12855 > URL: https://issues.apache.org/jira/browse/HIVE-12855 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12855.01.patch, HIVE-12855.part.patch > > > Currently, adding a temporary UDF and calling LLAP with it (bypassing the > LlapDecider check, I did it by just modifying the source) only fails because > the class could not be found. If the UDF was accessible to LLAP, it would > execute. Inside the daemon, UDF instantiation should fail for custom UDFs > (and only succeed for whitelisted custom UDFs, once that is implemented). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12855) LLAP: add checks when resolving UDFs to enforce whitelist
[ https://issues.apache.org/jira/browse/HIVE-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12855: Attachment: (was: HIVE-12855.patch) > LLAP: add checks when resolving UDFs to enforce whitelist > - > > Key: HIVE-12855 > URL: https://issues.apache.org/jira/browse/HIVE-12855 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12855.01.patch, HIVE-12855.part.patch > > > Currently, adding a temporary UDF and calling LLAP with it (bypassing the > LlapDecider check, I did it by just modifying the source) only fails because > the class could not be found. If the UDF was accessible to LLAP, it would > execute. Inside the daemon, UDF instantiation should fail for custom UDFs > (and only succeed for whitelisted custom UDFs, once that is implemented). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12783) fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl
[ https://issues.apache.org/jira/browse/HIVE-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107462#comment-15107462 ] Owen O'Malley commented on HIVE-12783: -- I just committed this. > fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl > - > > Key: HIVE-12783 > URL: https://issues.apache.org/jira/browse/HIVE-12783 > Project: Hive > Issue Type: Sub-task > Components: Test >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Owen O'Malley >Priority: Blocker > Attachments: HIVE-12783.patch, HIVE-12783.patch, HIVE-12783.patch > > > This includes > {code} > org.apache.hive.spark.client.TestSparkClient.testSyncRpc > org.apache.hive.spark.client.TestSparkClient.testJobSubmission > org.apache.hive.spark.client.TestSparkClient.testMetricsCollection > org.apache.hive.spark.client.TestSparkClient.testCounters > org.apache.hive.spark.client.TestSparkClient.testRemoteClient > org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles > org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob > org.apache.hive.spark.client.TestSparkClient.testErrorJob > org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse > org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse > {code} > all of them passed on my laptop. cc'ing [~szehon], [~xuefuz], could you > please take a look? Shall we ignore them? Thanks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12682) Reducers in dynamic partitioning job spend a lot of time running hadoop.conf.Configuration.getOverlay
[ https://issues.apache.org/jira/browse/HIVE-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12682: - Attachment: HIVE-12682-branch-1.patch > Reducers in dynamic partitioning job spend a lot of time running > hadoop.conf.Configuration.getOverlay > - > > Key: HIVE-12682 > URL: https://issues.apache.org/jira/browse/HIVE-12682 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Carter Shanklin >Assignee: Prasanth Jayachandran > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-12682-branch-1.patch, HIVE-12682.1.patch, > HIVE-12682.2.patch, reducer.png > > > I tested this on Hive 1.2.1 but looks like it's still applicable to 2.0. > I ran this query: > {code} > create table flights ( > … > ) > PARTITIONED BY (Year int) > CLUSTERED BY (Month) > SORTED BY (DayofMonth) into 12 buckets > STORED AS ORC > TBLPROPERTIES("orc.bloom.filter.columns"="*") > ; > {code} > (Taken from here: > https://github.com/t3rmin4t0r/all-airlines-data/blob/master/ddl/orc.sql) > I profiled just the reduce phase and noticed something odd, the attached > graph shows where time was spent during the reducer phase. > !reducer.png! > Problem seems to relate to > https://github.com/apache/hive/blob/branch-2.0/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java#L903 > /cc [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
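The profile above shows reducer time dominated by Configuration.getOverlay, i.e. per-record configuration access inside the write path. A hedged sketch of the general pattern and its fix — resolving a setting once before the record loop instead of on every record. The property name, Properties stand-in for Hadoop's Configuration, and record count are all illustrative, not taken from the patch:

```java
import java.util.Properties;

public class HoistConfigLookup {
    static final int RECORDS = 1_000_000;

    // Anti-pattern: a string-keyed config lookup on every record, analogous
    // to the repeated Configuration access seen in the reducer profile.
    static long slowPath(Properties conf) {
        long sum = 0;
        for (int i = 0; i < RECORDS; i++) {
            sum += Integer.parseInt(conf.getProperty("max.dynamic.partitions", "1000"));
        }
        return sum;
    }

    // Fix pattern: resolve the setting once, outside the loop.
    static long fastPath(Properties conf) {
        final int max = Integer.parseInt(conf.getProperty("max.dynamic.partitions", "1000"));
        long sum = 0;
        for (int i = 0; i < RECORDS; i++) {
            sum += max;
        }
        return sum;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("max.dynamic.partitions", "100");
        // Same result either way; the fast path does one lookup instead of a million.
        System.out.println(slowPath(conf) == fastPath(conf)); // prints true
    }
}
```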
[jira] [Commented] (HIVE-12855) LLAP: add checks when resolving UDFs to enforce whitelist
[ https://issues.apache.org/jira/browse/HIVE-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107373#comment-15107373 ] Sergey Shelukhin commented on HIVE-12855: - Looks like this approach won't work in MiniLlap cluster because the embedded daemon causes the global registration on the client, which causes AM to fail to parse. Stupid Kryo needs a proper hook system... I think I will keep only the global hook for now. > LLAP: add checks when resolving UDFs to enforce whitelist > - > > Key: HIVE-12855 > URL: https://issues.apache.org/jira/browse/HIVE-12855 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12855.part.patch, HIVE-12855.patch > > > Currently, adding a temporary UDF and calling LLAP with it (bypassing the > LlapDecider check, I did it by just modifying the source) only fails because > the class could not be found. If the UDF was accessible to LLAP, it would > execute. Inside the daemon, UDF instantiation should fail for custom UDFs > (and only succeed for whitelisted custom UDFs, once that is implemented). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12763) Use bit vector to track NDV
[ https://issues.apache.org/jira/browse/HIVE-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107384#comment-15107384 ] Hive QA commented on HIVE-12763: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12783018/HIVE-12763.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 39 failed/errored test(s), 10003 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_interval_2.q-bucket3.q-vectorization_7.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_udf1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl_dp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_quoting org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_tbllvl org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compustat_avro org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_date org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_decimal org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_double org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_empty_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_long org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_display_colstats_tbllvl org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_temp_table_display_colstats_tbllvl org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_varchar_udf1 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_mapjoin org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.hit org.apache.hadoop.hive.metastore.hbase.TestHBaseAggrStatsCacheIntegration.someWithStats org.apache.hadoop.hive.metastore.hbase.TestHBaseAggregateStatsCache.someNonexistentPartitions org.apache.hadoop.hive.metastore.hbase.TestHBaseSchemaTool.oneMondoTest org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.decimalPartitionStatistics org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.decimalTableStatistics org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.doublePartitionStatistics org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.doubleTableStatistics org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.longPartitionStatistics org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.longTableStatistics org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.stringPartitionStatistics org.apache.hadoop.hive.metastore.hbase.TestHBaseStore.stringTableStatistics org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.partitionStatistics org.apache.hadoop.hive.metastore.hbase.TestHBaseStoreIntegration.tableStatistics org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6673/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6673/console Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6673/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 39 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12783018 - PreCommit-HIVE-TRUNK-Build > Use bit vector to track NDV > --- > > Key: HIVE-12763 > URL: https://issues.apache.org/jira/browse/HIVE-12763 > Project: Hive > Issue Type: Improvement >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12763.01.patch, HIVE-12763.02.patch > > > This will improve merging of per partitions stats. It will also help merge > NDV for auto-gather column stats. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
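The motivation stated in HIVE-12763 — better merging of per-partition stats — rests on a property worth making concrete: raw NDV counts from two partitions cannot simply be added (shared values would be double-counted), but hash-based bit vector sketches merge losslessly with a bitwise OR. A small illustration using linear counting for brevity; this is an assumed stand-in sketch, not Hive's actual implementation:

```java
import java.util.BitSet;

public class NdvBitVector {
    static final int BITS = 1 << 16;

    // Hash each value to one bit position. Distinct values set (mostly)
    // distinct bits; duplicates set the same bit again, which is the point.
    static BitSet sketch(long[] values) {
        BitSet bits = new BitSet(BITS);
        for (long v : values) {
            bits.set((int) (Long.hashCode(v * 0x9E3779B97F4A7C15L) & (BITS - 1)));
        }
        return bits;
    }

    // Linear-counting estimate: n ~= -m * ln(emptyBits / m)
    static long estimate(BitSet bits) {
        double empty = BITS - bits.cardinality();
        return Math.round(-BITS * Math.log(empty / BITS));
    }

    public static void main(String[] args) {
        BitSet p1 = sketch(new long[]{1, 2, 3, 4});  // partition 1
        BitSet p2 = sketch(new long[]{3, 4, 5, 6});  // partition 2, overlaps on 3 and 4
        p1.or(p2);  // merging partition stats is just a bitwise OR
        System.out.println(estimate(p1));  // prints an estimate close to 6, not 4 + 4 = 8
    }
}
```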
[jira] [Commented] (HIVE-12805) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver skewjoin.q failure
[ https://issues.apache.org/jira/browse/HIVE-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107427#comment-15107427 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-12805: -- ^Typo : patch 3 > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > MiniTezCliDriver skewjoin.q failure > - > > Key: HIVE-12805 > URL: https://issues.apache.org/jira/browse/HIVE-12805 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12805.1.patch, HIVE-12805.2.patch, > HIVE-12805.3.patch > > > Set hive.cbo.returnpath.hiveop=true > {code} > FROM T1 a FULL OUTER JOIN T2 c ON c.key+1=a.key SELECT /*+ STREAMTABLE(a) */ > sum(hash(a.key)), sum(hash(a.val)), sum(hash(c.key)) > {code} > The stack trace: > {code} > java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) > at java.util.ArrayList.get(ArrayList.java:411) > at > org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate$JoinSynthetic.process(SyntheticJoinPredicate.java:183) > at > org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) > at > org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate.transform(SyntheticJoinPredicate.java:100) > at > 
org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:236) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:231) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:471) > {code} > Same error happens in auto_sortmerge_join_6.q.out for > {code} > select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on > h.value = a.value > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12783) fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl
[ https://issues.apache.org/jira/browse/HIVE-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107471#comment-15107471 ] Pengcheng Xiong commented on HIVE-12783: [~owen.omalley], thanks a lot. :) > fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl > - > > Key: HIVE-12783 > URL: https://issues.apache.org/jira/browse/HIVE-12783 > Project: Hive > Issue Type: Sub-task > Components: Test >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Owen O'Malley >Priority: Blocker > Attachments: HIVE-12783.patch, HIVE-12783.patch, HIVE-12783.patch > > > This includes > {code} > org.apache.hive.spark.client.TestSparkClient.testSyncRpc > org.apache.hive.spark.client.TestSparkClient.testJobSubmission > org.apache.hive.spark.client.TestSparkClient.testMetricsCollection > org.apache.hive.spark.client.TestSparkClient.testCounters > org.apache.hive.spark.client.TestSparkClient.testRemoteClient > org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles > org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob > org.apache.hive.spark.client.TestSparkClient.testErrorJob > org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse > org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse > {code} > all of them passed on my laptop. cc'ing [~szehon], [~xuefuz], could you > please take a look? Shall we ignore them? Thanks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12865) Exchange partition does not show inputs field for post/pre execute hooks
[ https://issues.apache.org/jira/browse/HIVE-12865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-12865: Attachment: HIVE-12865.patch > Exchange partition does not show inputs field for post/pre execute hooks > > > Key: HIVE-12865 > URL: https://issues.apache.org/jira/browse/HIVE-12865 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.12.0 >Reporter: Paul Yang >Assignee: Aihua Xu > Attachments: HIVE-12865.patch > > > The pre/post execute hook interface has fields that indicate which Hive > objects were read / written to as a result of running the query. For the > exchange partition operation, the read entity field is empty. > This is an important issue as the hook interface may be configured to perform > critical warehouse operations. > See > ql/src/test/results/clientpositive/exchange_partition3.q.out > {code} > --- a/ql/src/test/results/clientpositive/exchange_partition3.q.out > +++ b/ql/src/test/results/clientpositive/exchange_partition3.q.out > @@ -65,9 +65,17 @@ ds=2013-04-05/hr=2 > PREHOOK: query: -- This will exchange both partitions hr=1 and hr=2 > ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH > TABLE exchange_part_test2 > PREHOOK: type: ALTERTABLE_EXCHANGEPARTITION > +PREHOOK: Output: default@exchange_part_test1 > +PREHOOK: Output: default@exchange_part_test2 > POSTHOOK: query: -- This will exchange both partitions hr=1 and hr=2 > ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH > TABLE exchange_part_test2 > POSTHOOK: type: ALTERTABLE_EXCHANGEPARTITION > +POSTHOOK: Output: default@exchange_part_test1 > +POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=1 > +POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=2 > +POSTHOOK: Output: default@exchange_part_test2 > +POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=1 > +POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=2 > PREHOOK: 
query: SHOW PARTITIONS exchange_part_test1 > PREHOOK: type: SHOWPARTITIONS > PREHOOK: Input: default@exchange_part_test1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12220) LLAP: Usability issues with hive.llap.io.cache.orc.size
[ https://issues.apache.org/jira/browse/HIVE-12220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107561#comment-15107561 ] Prasanth Jayachandran commented on HIVE-12220: -- +1 > LLAP: Usability issues with hive.llap.io.cache.orc.size > --- > > Key: HIVE-12220 > URL: https://issues.apache.org/jira/browse/HIVE-12220 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Carter Shanklin >Assignee: Sergey Shelukhin > Attachments: HIVE-12220.01.patch, HIVE-12220.02.patch, > HIVE-12220.patch, HIVE-12220.tmp.patch > > > In the llap-daemon site you need to set, among other things, > llap.daemon.memory.per.instance.mb > and > hive.llap.io.cache.orc.size > The use of hive.llap.io.cache.orc.size caused me some unnecessary problems, > initially I entered the value in MB rather than in bytes. Operator error you > could say but I look at this as a fraction of the other value which is in mb. > Second, is this really tied to ORC? E.g. when we have the vectorized text > reader will this data be cached as well? Or might it be in the future? > I would like to propose instead using hive.llap.io.cache.size.mb for this > setting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
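The unit mismatch the reporter hit is easy to see side by side; a hypothetical hive-site.xml fragment (values illustrative) shows why entering the cache size in MB goes wrong:

```xml
<!-- One setting is in megabytes, the other in bytes -->
<property>
  <name>llap.daemon.memory.per.instance.mb</name>
  <value>4096</value> <!-- megabytes -->
</property>
<property>
  <name>hive.llap.io.cache.orc.size</name>
  <value>2147483648</value> <!-- bytes (2 GB); writing 2048 here would mean ~2 KB -->
</property>
```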
[jira] [Commented] (HIVE-12429) Switch default Hive authorization to SQLStandardAuth in 2.0
[ https://issues.apache.org/jira/browse/HIVE-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107402#comment-15107402 ] Sushanth Sowmyan commented on HIVE-12429: - Thanks for the update, Daniel. LGTM. +1. Will go ahead and commit. > Switch default Hive authorization to SQLStandardAuth in 2.0 > --- > > Key: HIVE-12429 > URL: https://issues.apache.org/jira/browse/HIVE-12429 > Project: Hive > Issue Type: Task > Components: Authorization, Security >Affects Versions: 2.0.0 >Reporter: Alan Gates >Assignee: Daniel Dai > Attachments: HIVE-12429.1.patch, HIVE-12429.10.patch, > HIVE-12429.11.patch, HIVE-12429.12.patch, HIVE-12429.13.patch, > HIVE-12429.14.patch, HIVE-12429.15.patch, HIVE-12429.16.patch, > HIVE-12429.17.patch, HIVE-12429.18.patch, HIVE-12429.2.patch, > HIVE-12429.3.patch, HIVE-12429.4.patch, HIVE-12429.5.patch, > HIVE-12429.6.patch, HIVE-12429.7.patch, HIVE-12429.8.patch, HIVE-12429.9.patch > > > Hive's default authorization is not real security, as it does not secure a > number of features and anyone can grant access to any object to any user. We > should switch the default to SQLStandardAuth, which provides real > authentication. > As this is a backwards incompatible change this was hard to do previously, > but 2.0 gives us a place to do this type of change. > By default authorization will still be off, as there are a few other things > to set when turning on authorization (such as the list of admin users). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12883) Support basic stats and column stats in table properties in HBaseStore
[ https://issues.apache.org/jira/browse/HIVE-12883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12883: Target Version/s: 2.0.0 > Support basic stats and column stats in table properties in HBaseStore > -- > > Key: HIVE-12883 > URL: https://issues.apache.org/jira/browse/HIVE-12883 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Blocker > Attachments: HIVE-12883.01.patch, HIVE-12883.02.patch, > HIVE-12883.03.patch > > > Need to add support for HBase store too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12805) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver skewjoin.q failure
[ https://issues.apache.org/jira/browse/HIVE-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12805: - Attachment: HIVE-12805.2.patch Addressing [~ashutoshc] 's comments in patch 2. > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > MiniTezCliDriver skewjoin.q failure > - > > Key: HIVE-12805 > URL: https://issues.apache.org/jira/browse/HIVE-12805 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12805.1.patch, HIVE-12805.2.patch, > HIVE-12805.3.patch > > > Set hive.cbo.returnpath.hiveop=true > {code} > FROM T1 a FULL OUTER JOIN T2 c ON c.key+1=a.key SELECT /*+ STREAMTABLE(a) */ > sum(hash(a.key)), sum(hash(a.val)), sum(hash(c.key)) > {code} > The stack trace: > {code} > java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) > at java.util.ArrayList.get(ArrayList.java:411) > at > org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate$JoinSynthetic.process(SyntheticJoinPredicate.java:183) > at > org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) > at > 
org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate.transform(SyntheticJoinPredicate.java:100) > at > org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:236) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:231) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:471) > {code} > Same error happens in auto_sortmerge_join_6.q.out for > {code} > select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on > h.value = a.value > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12805) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver skewjoin.q failure
[ https://issues.apache.org/jira/browse/HIVE-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12805: - Attachment: (was: HIVE-12805.2.patch) > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > MiniTezCliDriver skewjoin.q failure > - > > Key: HIVE-12805 > URL: https://issues.apache.org/jira/browse/HIVE-12805 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12805.1.patch, HIVE-12805.2.patch, > HIVE-12805.3.patch > > > Set hive.cbo.returnpath.hiveop=true > {code} > FROM T1 a FULL OUTER JOIN T2 c ON c.key+1=a.key SELECT /*+ STREAMTABLE(a) */ > sum(hash(a.key)), sum(hash(a.val)), sum(hash(c.key)) > {code} > The stack trace: > {code} > java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) > at java.util.ArrayList.get(ArrayList.java:411) > at > org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate$JoinSynthetic.process(SyntheticJoinPredicate.java:183) > at > org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) > at > org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate.transform(SyntheticJoinPredicate.java:100) > at > 
org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:236) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:231) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:471) > {code} > Same error happens in auto_sortmerge_join_6.q.out for > {code} > select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on > h.value = a.value > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12805) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver skewjoin.q failure
[ https://issues.apache.org/jira/browse/HIVE-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12805: - Attachment: HIVE-12805.3.patch > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > MiniTezCliDriver skewjoin.q failure > - > > Key: HIVE-12805 > URL: https://issues.apache.org/jira/browse/HIVE-12805 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12805.1.patch, HIVE-12805.2.patch, > HIVE-12805.3.patch > > > Set hive.cbo.returnpath.hiveop=true > {code} > FROM T1 a FULL OUTER JOIN T2 c ON c.key+1=a.key SELECT /*+ STREAMTABLE(a) */ > sum(hash(a.key)), sum(hash(a.val)), sum(hash(c.key)) > {code} > The stack trace: > {code} > java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) > at java.util.ArrayList.get(ArrayList.java:411) > at > org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate$JoinSynthetic.process(SyntheticJoinPredicate.java:183) > at > org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54) > at > org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54) > at > org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) > at > org.apache.hadoop.hive.ql.ppd.SyntheticJoinPredicate.transform(SyntheticJoinPredicate.java:100) > at > 
org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:236) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10170) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:231) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:471) > {code} > Same error happens in auto_sortmerge_join_6.q.out for > {code} > select count(*) FROM tbl1 a JOIN tbl2 b ON a.key = b.key join src h on > h.value = a.value > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12883) Support basic stats and column stats in table properties in HBaseStore
[ https://issues.apache.org/jira/browse/HIVE-12883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12883: Priority: Blocker (was: Major) > Support basic stats and column stats in table properties in HBaseStore > -- > > Key: HIVE-12883 > URL: https://issues.apache.org/jira/browse/HIVE-12883 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong >Priority: Blocker > Attachments: HIVE-12883.01.patch, HIVE-12883.02.patch, > HIVE-12883.03.patch > > > Need to add support for HBase store too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12429) Switch default Hive authorization to SQLStandardAuth in 2.0
[ https://issues.apache.org/jira/browse/HIVE-12429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107445#comment-15107445 ] Daniel Dai commented on HIVE-12429: --- Thanks [~sushanth]! > Switch default Hive authorization to SQLStandardAuth in 2.0 > --- > > Key: HIVE-12429 > URL: https://issues.apache.org/jira/browse/HIVE-12429 > Project: Hive > Issue Type: Task > Components: Authorization, Security >Affects Versions: 2.0.0 >Reporter: Alan Gates >Assignee: Daniel Dai > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12429.1.patch, HIVE-12429.10.patch, > HIVE-12429.11.patch, HIVE-12429.12.patch, HIVE-12429.13.patch, > HIVE-12429.14.patch, HIVE-12429.15.patch, HIVE-12429.16.patch, > HIVE-12429.17.patch, HIVE-12429.18.patch, HIVE-12429.2.patch, > HIVE-12429.3.patch, HIVE-12429.4.patch, HIVE-12429.5.patch, > HIVE-12429.6.patch, HIVE-12429.7.patch, HIVE-12429.8.patch, HIVE-12429.9.patch > > > Hive's default authorization is not real security, as it does not secure a > number of features and anyone can grant access to any object to any user. We > should switch the default to SQLStandardAuth, which provides real > authentication. > As this is a backwards incompatible change this was hard to do previously, > but 2.0 gives us a place to do this type of change. > By default authorization will still be off, as there are a few other things > to set when turning on authorization (such as the list of admin users). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12890) Disable multi-statement transaction control statements until HIVE-11078
[ https://issues.apache.org/jira/browse/HIVE-12890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-12890: -- Attachment: HIVE-12890.patch [~alangates] could you review? > Disable multi-statement transaction control statements until HIVE-11078 > -- > > Key: HIVE-12890 > URL: https://issues.apache.org/jira/browse/HIVE-12890 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 1.3.0, 2.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Blocker > Attachments: HIVE-12890.patch > > > HIVE-11077 added support for begin transaction/commit/rollback but the > feature is not complete w/o HIVE-11078. Need to disable these statements to > prevent user confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12528) don't start HS2 Tez sessions in a single thread
[ https://issues.apache.org/jira/browse/HIVE-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12528: Attachment: HIVE-12528.05.patch Another rebase. > don't start HS2 Tez sessions in a single thread > --- > > Key: HIVE-12528 > URL: https://issues.apache.org/jira/browse/HIVE-12528 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12528.01.patch, HIVE-12528.02.patch, > HIVE-12528.03.patch, HIVE-12528.04.patch, HIVE-12528.05.patch, > HIVE-12528.patch > > > Starting sessions in parallel would improve the startup time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
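The idea in HIVE-12528 — opening several HS2 Tez sessions concurrently instead of one after another in a single thread — can be sketched roughly as below. The `Session` class and `openSession` method are hypothetical stand-ins for illustration, not Hive's actual session-pool code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Rough sketch only: open n sessions in parallel on a small thread pool,
// then wait for all of them. Not Hive's actual TezSessionPool code.
public class ParallelStartupSketch {
    static class Session {
        final int id;
        Session(int id) { this.id = id; }
    }

    static Session openSession(int id) {
        return new Session(id); // real code would start a Tez AM here
    }

    static List<Session> openAll(int n, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Session>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int id = i;
                futures.add(pool.submit(() -> openSession(id)));
            }
            List<Session> sessions = new ArrayList<>();
            for (Future<Session> f : futures) {
                sessions.add(f.get()); // block until each session is up
            }
            return sessions;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Total startup time then approaches the slowest single session rather than the sum of all of them.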
[jira] [Updated] (HIVE-12867) Semantic Exception Error Msg should be within the range of "10000 to 19999"
[ https://issues.apache.org/jira/browse/HIVE-12867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12867: - Attachment: HIVE-12867.2.patch > Semantic Exception Error Msg should be within the range of "10000 to 19999" > > > Key: HIVE-12867 > URL: https://issues.apache.org/jira/browse/HIVE-12867 > Project: Hive > Issue Type: Bug >Affects Versions: 1.2.1 >Reporter: Laljo John Pullokkaran >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12867.1.patch, HIVE-12867.2.patch > > > In many places, errors encountered during semantic analysis are translated as > the generic error (GENERIC_ERROR, 4) msg as opposed to a semantic error msg. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12865) Exchange partition does not show inputs field for post/pre execute hooks
[ https://issues.apache.org/jira/browse/HIVE-12865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107564#comment-15107564 ] Aihua Xu commented on HIVE-12865: - Attached the initial patch: added the source table and the partitions to be exchanged as the inputs. [~ctang.ma] and [~xuefuz] Can you help review the code? [~pauly] Could you also take a look if this is what you are expecting? > Exchange partition does not show inputs field for post/pre execute hooks > > > Key: HIVE-12865 > URL: https://issues.apache.org/jira/browse/HIVE-12865 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.12.0 >Reporter: Paul Yang >Assignee: Aihua Xu > Attachments: HIVE-12865.patch > > > The pre/post execute hook interface has fields that indicate which Hive > objects were read / written to as a result of running the query. For the > exchange partition operation, the read entity field is empty. > This is an important issue as the hook interface may be configured to perform > critical warehouse operations. 
> See > ql/src/test/results/clientpositive/exchange_partition3.q.out > {code} > --- a/ql/src/test/results/clientpositive/exchange_partition3.q.out > +++ b/ql/src/test/results/clientpositive/exchange_partition3.q.out > @@ -65,9 +65,17 @@ ds=2013-04-05/hr=2 > PREHOOK: query: -- This will exchange both partitions hr=1 and hr=2 > ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH > TABLE exchange_part_test2 > PREHOOK: type: ALTERTABLE_EXCHANGEPARTITION > +PREHOOK: Output: default@exchange_part_test1 > +PREHOOK: Output: default@exchange_part_test2 > POSTHOOK: query: -- This will exchange both partitions hr=1 and hr=2 > ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH > TABLE exchange_part_test2 > POSTHOOK: type: ALTERTABLE_EXCHANGEPARTITION > +POSTHOOK: Output: default@exchange_part_test1 > +POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=1 > +POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=2 > +POSTHOOK: Output: default@exchange_part_test2 > +POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=1 > +POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=2 > PREHOOK: query: SHOW PARTITIONS exchange_part_test1 > PREHOOK: type: SHOWPARTITIONS > PREHOOK: Input: default@exchange_part_test1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11785) Support escaping carriage return and new line for LazySimpleSerDe
[ https://issues.apache.org/jira/browse/HIVE-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11785: Release Note: This change, in addition to HIVE-12820, adds support for carriage return and new line characters in fields. Before this change, the user needs to preprocess the text by replacing them with some characters other than carriage return and new line in order for the files to be properly processed. With this change, it will automatically escape them if the {{serialization.escape.crlf}} serde property is set to true. One incompatible change is: characters '\r' and '\n' cannot be used as separator or field delimiter. (was: This change, in addition to HIVE-12820, adds support for carriage return and new line characters in fields. Before this change, the user needs to preprocess the text by replacing them with some characters other than carriage return and new line in order for the files to be properly processed. With this change, it will automatically escape them if the {{serialization.escape.crlf}} serde property is set to true. One incompatible change is: characters '\r' and '\n' cannot be used as separator or field delimiter ) > Support escaping carriage return and new line for LazySimpleSerDe > - > > Key: HIVE-11785 > URL: https://issues.apache.org/jira/browse/HIVE-11785 > Project: Hive > Issue Type: New Feature > Components: Query Processor >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-11785.2.patch, HIVE-11785.3.patch, > HIVE-11785.patch, test.parquet > > > Create the table and perform the queries as follows. You will see different > results when the setting changes. 
> The expected result should be: > {noformat} > 1 newline > here > 2 carriage return > 3 both > here > {noformat} > {noformat} > hive> create table repo (lvalue int, charstring string) stored as parquet; > OK > Time taken: 0.34 seconds > hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo; > Loading data to table default.repo > chgrp: changing ownership of > 'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not > belong to hive > Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, > rawDataSize=0] > OK > Time taken: 0.732 seconds > hive> set hive.fetch.task.conversion=more; > hive> select * from repo; > OK > 1 newline > here > here carriage return > 3 both > here > Time taken: 0.253 seconds, Fetched: 3 row(s) > hive> set hive.fetch.task.conversion=none; > hive> select * from repo; > Query ID = root_20150909113535_e081db8b-ccd9-4c44-aad9-d990ffb8edf3 > Total jobs = 1 > Launching Job 1 out of 1 > Number of reduce tasks is set to 0 since there's no reduce operator > Starting Job = job_1441752031022_0006, Tracking URL = > http://host-10-17-81-63.coe.cloudera.com:8088/proxy/application_1441752031022_0006/ > Kill Command = > /opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop/bin/hadoop job > -kill job_1441752031022_0006 > Hadoop job information for Stage-1: number of mappers: 1; number of reducers: > 0 > 2015-09-09 11:35:54,127 Stage-1 map = 0%, reduce = 0% > 2015-09-09 11:36:04,664 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.98 > sec > MapReduce Total cumulative CPU time: 2 seconds 980 msec > Ended Job = job_1441752031022_0006 > MapReduce Jobs Launched: > Stage-Stage-1: Map: 1 Cumulative CPU: 2.98 sec HDFS Read: 4251 HDFS > Write: 51 SUCCESS > Total MapReduce CPU Time Spent: 2 seconds 980 msec > OK > 1 newline > NULL NULL > 2 carriage return > NULL NULL > 3 both > NULL NULL > Time taken: 25.131 seconds, Fetched: 6 row(s) > hive> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
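The escaping behavior described in the release note can be illustrated with a toy version. This is a sketch only — the real logic lives in LazySimpleSerDe and respects the table's configured escape character:

```java
// Toy illustration of serialization.escape.crlf: replace raw carriage returns
// and newlines with escaped forms so that line-based readers do not split the
// record mid-field. Not LazySimpleSerDe's actual implementation.
public class CrlfEscapeSketch {
    static String escape(String field, boolean escapeCrlf) {
        if (!escapeCrlf) {
            return field; // old behavior: raw \r and \n pass through and break rows
        }
        return field.replace("\r", "\\r").replace("\n", "\\n");
    }
}
```

With the property off, a field containing a newline is split into two lines by the reader, which is exactly the NULL-producing behavior shown in the transcript above.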
[jira] [Assigned] (HIVE-12629) hive.auto.convert.join=true makes lateral view join sql failed on spark engine on yarn
[ https://issues.apache.org/jira/browse/HIVE-12629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned HIVE-12629: --- Assignee: Chao Sun (was: Xuefu Zhang) > hive.auto.convert.join=true makes lateral view join sql failed on spark > engine on yarn > -- > > Key: HIVE-12629 > URL: https://issues.apache.org/jira/browse/HIVE-12629 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 1.2.1 >Reporter: 吴子美 >Assignee: Chao Sun > > I am using hive1.2 on spark on yarn. > I found > select count(1) from > (select user_id from xxx group by user_id ) a join > (select user_id from yyy lateral view json_tuple(u, 'h') v1 as h) b > on a.user_id=b.user_id ; > failed in hive on spark on yarn, but OK in hive on MR. > I tried the following sql on spark. It was OK. > select count(1) from > (select user_id from xxx group by user_id ) a left join > (select user_id from yyy lateral view json_tuple(u, 'h') v1 as h) b > on a.user_id=b.user_id ; > When I turn hive.auto.convert.join from true to false, everything goes OK. 
> The error message in hive.log was : > {code} > 2015-12-09 21:10:17,190 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - 15/12/09 21:10:17 INFO log.PerfLogger: > > 2015-12-09 21:10:17,190 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - 15/12/09 21:10:17 INFO exec.Utilities: > Serializing ReduceWork via kryo > 2015-12-09 21:10:17,214 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - 15/12/09 21:10:17 INFO log.PerfLogger: > duration=24 from=org.apache.hadoop.hive.ql.exec.Utilities> > 2015-12-09 21:10:17,261 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - 15/12/09 21:10:17 INFO client.RemoteDriver: > Failed to run job 8fed1ca8-834f-497f-b189-eab343440a9f > 2015-12-09 21:10:17,261 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - java.lang.IllegalStateException: Connection > already exists > 2015-12-09 21:10:17,261 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - at > org.apache.hadoop.hive.ql.exec.spark.SparkPlan.connect(SparkPlan.java:142) > 2015-12-09 21:10:17,261 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - at > org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:142) > 2015-12-09 21:10:17,261 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - at > org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:106) > 2015-12-09 21:10:17,261 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:252) > 2015-12-09 21:10:17,261 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:366) > 
2015-12-09 21:10:17,261 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - at > org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:335) > 2015-12-09 21:10:17,261 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - at > java.util.concurrent.FutureTask.run(FutureTask.java:262) > 2015-12-09 21:10:17,262 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > 2015-12-09 21:10:17,262 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > 2015-12-09 21:10:17,262 INFO [stderr-redir-1]: client.SparkClientImpl > (SparkClientImpl.java:run(569)) - at > java.lang.Thread.run(Thread.java:745) > 2015-12-09 21:10:17,266 INFO [RPC-Handler-3]: client.SparkClientImpl > (SparkClientImpl.java:handle(522)) - Received result for > 8fed1ca8-834f-497f-b189-eab343440a9f > 2015-12-09 21:10:18,054 ERROR [HiveServer2-Background-Pool: Thread-43]: > status.SparkJobMonitor (SessionState.java:printError(960)) - Status: Failed > 2015-12-09 21:10:18,055 INFO [HiveServer2-Background-Pool: Thread-43]: > log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - method=SparkRunJob start=144915051 end=144918055 duration=3004 > from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor> > 2015-12-09 21:10:18,076 ERROR [HiveServer2-Background-Pool: Thread-43]: > ql.Driver (SessionState.java:printError(960)) -
[jira] [Commented] (HIVE-12887) Handle ORC schema on read with fewer columns than file schema (after Schema Evolution changes)
[ https://issues.apache.org/jira/browse/HIVE-12887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107177#comment-15107177 ] Sergey Shelukhin commented on HIVE-12887: - What will happen after column removal with this patch? Is test needed? Also, nit: please surround LOG.info with types with if LOG.isInfoEnabled. > Handle ORC schema on read with fewer columns than file schema (after Schema > Evolution changes) > -- > > Key: HIVE-12887 > URL: https://issues.apache.org/jira/browse/HIVE-12887 > Project: Hive > Issue Type: Bug > Components: ORC >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-12887.01.patch > > > Exception caused by reading after column removal. > {code} > Caused by: java.lang.IndexOutOfBoundsException: Index: 10, Size: 10 > at java.util.ArrayList.rangeCheck(ArrayList.java:653) > at java.util.ArrayList.get(ArrayList.java:429) > at java.util.Collections$UnmodifiableList.get(Collections.java:1309) > at > org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240) > at > org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.(TreeReaderFactory.java:2053) > at > org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2481) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.(RecordReaderImpl.java:216) > at > org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:598) > at > org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.(OrcRawRecordMerger.java:179) > at > org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.(OrcRawRecordMerger.java:222) > at > org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.(OrcRawRecordMerger.java:442) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1285) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1165) > at > 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
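The stack trace boils down to indexing the file's type list with a column id taken from the current reader schema, which no longer matches an older file after a column was added or removed. A simplified, hypothetical illustration of the bounds check that is missing (not the actual patch):

```java
import java.util.List;

// Simplified illustration: the reader schema can reference column ids that the
// (older or newer) file schema does not contain; indexing without a bounds
// check throws IndexOutOfBoundsException, as in the stack trace above.
public class OrcSchemaSketch {
    static String typeForColumn(List<String> fileTypes, int readerColumnId) {
        if (readerColumnId >= fileTypes.size()) {
            return null; // column absent in this file; treat it as missing/null
        }
        return fileTypes.get(readerColumnId);
    }
}
```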
[jira] [Commented] (HIVE-12682) Reducers in dynamic partitioning job spend a lot of time running hadoop.conf.Configuration.getOverlay
[ https://issues.apache.org/jira/browse/HIVE-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107246#comment-15107246 ] Ashutosh Chauhan commented on HIVE-12682: - +1 > Reducers in dynamic partitioning job spend a lot of time running > hadoop.conf.Configuration.getOverlay > - > > Key: HIVE-12682 > URL: https://issues.apache.org/jira/browse/HIVE-12682 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Carter Shanklin >Assignee: Prasanth Jayachandran > Attachments: HIVE-12682.1.patch, HIVE-12682.2.patch, reducer.png > > > I tested this on Hive 1.2.1 but looks like it's still applicable to 2.0. > I ran this query: > {code} > create table flights ( > … > ) > PARTITIONED BY (Year int) > CLUSTERED BY (Month) > SORTED BY (DayofMonth) into 12 buckets > STORED AS ORC > TBLPROPERTIES("orc.bloom.filter.columns"="*") > ; > {code} > (Taken from here: > https://github.com/t3rmin4t0r/all-airlines-data/blob/master/ddl/orc.sql) > I profiled just the reduce phase and noticed something odd, the attached > graph shows where time was spent during the reducer phase. > !reducer.png! > Problem seems to relate to > https://github.com/apache/hive/blob/branch-2.0/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java#L903 > /cc [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
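The linked FileSinkOperator line sits on the per-row dynamic-partition path, so repeated Configuration lookups (each hitting getOverlay) add up across millions of rows. The general fix pattern — read the setting once and reuse the cached value — looks roughly like this; the map, key name, and class are placeholders, not Hive's actual code:

```java
import java.util.Map;

// Placeholder illustration of hoisting a configuration lookup out of a per-row
// loop. "conf" is a plain map standing in for Hadoop's Configuration, and the
// key name is made up for the example.
public class HoistConfSketch {
    static int processRows(Map<String, String> conf, String[] rows) {
        // Looked up once, not once per row as in the hot path the profile showed.
        String suffix = conf.getOrDefault("example.tmp.suffix", "_tmp");
        int handled = 0;
        for (String row : rows) {
            String name = row + suffix; // per-row work reuses the cached value
            if (!name.isEmpty()) {
                handled++;
            }
        }
        return handled;
    }
}
```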
[jira] [Commented] (HIVE-12837) Better memory estimation/allocation for hybrid grace hash join during hash table loading
[ https://issues.apache.org/jira/browse/HIVE-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107145#comment-15107145 ] Wei Zheng commented on HIVE-12837: -- Both TestDbTxnManager2 and tez_union.q passed locally without any problem. > Better memory estimation/allocation for hybrid grace hash join during hash > table loading > > > Key: HIVE-12837 > URL: https://issues.apache.org/jira/browse/HIVE-12837 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12837.1.patch, HIVE-12837.2.patch, > HIVE-12837.3.patch > > > This is to avoid an edge case when the memory available is very little (less > than a single write buffer size), and we start loading the hash table. Since > the write buffer is lazily allocated, we will easily run out of memory before > even checking if we should spill any hash partition. > e.g. > Total memory available: 210 MB > Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB > Size of write buffer: 8 MB (lazy allocation) > Number of hash partitions: 16 > Number of hash partitions created in memory: 13 > Number of hash partitions created on disk: 3 > Available memory left after HybridHashTableContainer initialization: > 210-16*13=2MB > Now let's say a row is to be loaded into a hash partition in memory, it will > try to allocate an 8MB write buffer for it, but we only have 2MB, thus OOM. > Solution is to perform the check for possible spilling earlier so we can > spill partitions if memory is about to be full, to avoid OOM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
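The arithmetic in the description (210 MB total, 16 MB of ref arrays for each of 13 in-memory partitions, leaving 2 MB against an 8 MB lazy write buffer) can be checked with a tiny sketch. The helper names are hypothetical; this is not Hive's HybridHashTableContainer code:

```java
// Back-of-the-envelope version of the memory accounting described above.
public class HybridHashMemorySketch {
    // Memory left after allocating one ref array per in-memory hash partition.
    static int mbLeftAfterInit(int totalMb, int refArrayMb, int inMemoryPartitions) {
        return totalMb - refArrayMb * inMemoryPartitions;
    }

    // The edge case: the lazily allocated write buffer is larger than what is left,
    // so loading a single row can OOM before the spill check ever runs.
    static boolean writeBufferWouldOom(int mbLeft, int writeBufferMb) {
        return writeBufferMb > mbLeft;
    }
}
```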
[jira] [Updated] (HIVE-11785) Support escaping carriage return and new line for LazySimpleSerDe
[ https://issues.apache.org/jira/browse/HIVE-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11785: Release Note: This change, in addition to HIVE-12820, adds support for carriage return and new line characters in fields. Before this change, the user needs to preprocess the text by replacing them with some characters other than carriage return and new line in order for the files to be properly processed. With this change, it will automatically escape them if the {{serialization.escape.crlf}} serde property is set to true. One incompatible change is: characters '\r' and '\n' cannot be used as separator or field delimiter (was: This change disallows carriage return and new line characters from being used as field separators or escape characters. Before this change, those were allowed, but such cases could easily lead to incorrect results if the content also contained carriage return or new line: even when carriage return or new line was escaped, the line-based input format used by MapReduce in Hive would still break lines on carriage return and new line and produce incorrect results.) > Support escaping carriage return and new line for LazySimpleSerDe > - > > Key: HIVE-11785 > URL: https://issues.apache.org/jira/browse/HIVE-11785 > Project: Hive > Issue Type: New Feature > Components: Query Processor >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-11785.2.patch, HIVE-11785.3.patch, > HIVE-11785.patch, test.parquet > > > Create the table and perform the queries as follows. You will see different > results when the setting changes. 
> The expected result should be: > {noformat} > 1 newline > here > 2 carriage return > 3 both > here > {noformat} > {noformat} > hive> create table repo (lvalue int, charstring string) stored as parquet; > OK > Time taken: 0.34 seconds > hive> load data inpath '/tmp/repo/test.parquet' overwrite into table repo; > Loading data to table default.repo > chgrp: changing ownership of > 'hdfs://nameservice1/user/hive/warehouse/repo/test.parquet': User does not > belong to hive > Table default.repo stats: [numFiles=1, numRows=0, totalSize=610, > rawDataSize=0] > OK > Time taken: 0.732 seconds > hive> set hive.fetch.task.conversion=more; > hive> select * from repo; > OK > 1 newline > here > here carriage return > 3 both > here > Time taken: 0.253 seconds, Fetched: 3 row(s) > hive> set hive.fetch.task.conversion=none; > hive> select * from repo; > Query ID = root_20150909113535_e081db8b-ccd9-4c44-aad9-d990ffb8edf3 > Total jobs = 1 > Launching Job 1 out of 1 > Number of reduce tasks is set to 0 since there's no reduce operator > Starting Job = job_1441752031022_0006, Tracking URL = > http://host-10-17-81-63.coe.cloudera.com:8088/proxy/application_1441752031022_0006/ > Kill Command = > /opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop/bin/hadoop job > -kill job_1441752031022_0006 > Hadoop job information for Stage-1: number of mappers: 1; number of reducers: > 0 > 2015-09-09 11:35:54,127 Stage-1 map = 0%, reduce = 0% > 2015-09-09 11:36:04,664 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.98 > sec > MapReduce Total cumulative CPU time: 2 seconds 980 msec > Ended Job = job_1441752031022_0006 > MapReduce Jobs Launched: > Stage-Stage-1: Map: 1 Cumulative CPU: 2.98 sec HDFS Read: 4251 HDFS > Write: 51 SUCCESS > Total MapReduce CPU Time Spent: 2 seconds 980 msec > OK > 1 newline > NULL NULL > 2 carriage return > NULL NULL > 3 both > NULL NULL > Time taken: 25.131 seconds, Fetched: 6 row(s) > hive> > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12883) Support basic stats and column stats in table properties in HBaseStore
[ https://issues.apache.org/jira/browse/HIVE-12883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107209#comment-15107209 ] Ashutosh Chauhan commented on HIVE-12883: - +1 > Support basic stats and column stats in table properties in HBaseStore > -- > > Key: HIVE-12883 > URL: https://issues.apache.org/jira/browse/HIVE-12883 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12883.01.patch, HIVE-12883.02.patch, > HIVE-12883.03.patch > > > Need to add support for HBase store too. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12820) Remove the check if carriage return and new line are used for separator or escape character
[ https://issues.apache.org/jira/browse/HIVE-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-12820: Fix Version/s: 2.1.0 > Remove the check if carriage return and new line are used for separator or > escape character > --- > > Key: HIVE-12820 > URL: https://issues.apache.org/jira/browse/HIVE-12820 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 2.0.0, 2.1.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.1.0 > > Attachments: HIVE-12820.2.patch, HIVE-12820.patch > > > The change in HIVE-11785 doesn't allow \r or \n to be used as separator or > escape character which may break some existing tables which uses \r as > separator or escape character e.g.. > This case actually can be supported regardless of SERIALIZATION_ESCAPE_CRLF > set or not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12798) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver.vector* queries failures due to NPE in Vectorizer.onExpressionHasNullSafes()
[ https://issues.apache.org/jira/browse/HIVE-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107648#comment-15107648 ] Sergey Shelukhin commented on HIVE-12798: - +1 > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > MiniTezCliDriver.vector* queries failures due to NPE in > Vectorizer.onExpressionHasNullSafes() > --- > > Key: HIVE-12798 > URL: https://issues.apache.org/jira/browse/HIVE-12798 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Fix For: 2.1.0 > > Attachments: HIVE-12798.1.patch > > > As of 01/04/2016, the following tests fail in the MiniTezCliDriver mode when > the cbo return path is enabled. We need to fix them : > {code} > vector_leftsemi_mapjoin > vector_join_filters > vector_interval_mapjoin > vector_left_outer_join > vectorized_mapjoin > vector_inner_join > vectorized_context > tez_vector_dynpart_hashjoin_1 > count > auto_sortmerge_join_6 > skewjoin > vector_auto_smb_mapjoin_14 > auto_join_filters > vector_outer_join0 > vector_outer_join1 > vector_outer_join2 > vector_outer_join3 > vector_outer_join4 > vector_outer_join5 > hybridgrace_hashjoin_1 > vector_mapjoin_reduce > vectorized_nested_mapjoin > vector_left_outer_join2 > vector_char_mapjoin1 > vector_decimal_mapjoin > vectorized_dynamic_partition_pruning > vector_varchar_mapjoin1 > {code} > This jira is intended to cover the vectorization issues related to the > MiniTezCliDriver failures caused by NPE via nullSafes array as shown below : > {code} > private boolean onExpressionHasNullSafes(MapJoinDesc desc) { > boolean[] nullSafes = desc.getNullSafes(); > for (boolean nullSafe : nullSafes) { > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
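The NPE comes from iterating the array returned by desc.getNullSafes() when the Calcite return path leaves it unset. A minimal guard, shown here on a plain array rather than Hive's MapJoinDesc, would look like this (illustrative sketch, not necessarily the shape of the actual patch):

```java
// Illustrative null guard for the snippet quoted in the issue description:
// iterate nullSafes only when the plan actually populated it.
public class NullSafeSketch {
    static boolean onExpressionHasNullSafes(boolean[] nullSafes) {
        if (nullSafes == null) {
            return false; // return-path plans may leave the array unset
        }
        for (boolean nullSafe : nullSafes) {
            if (nullSafe) {
                return true;
            }
        }
        return false;
    }
}
```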
[jira] [Updated] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location
[ https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reuben Kuhnert updated HIVE-12891: -- Attachment: HIVE-12891.01.19.2016.01.patch > Hive fails when java.io.tmpdir is set to a relative location > > > Key: HIVE-12891 > URL: https://issues.apache.org/jira/browse/HIVE-12891 > Project: Hive > Issue Type: Bug >Reporter: Reuben Kuhnert >Assignee: Reuben Kuhnert > Attachments: HIVE-12891.01.19.2016.01.patch > > > The function {{SessionState.createSessionDirs}} fails when trying to create > directories where {{java.io.tmpdir}} is set to a relative location. > {code} > \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: > IllegalArgumentException java.net.URISyntaxException: Relative path in > absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > ... > Minor variations: > \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException > Exception while processing Exception while writing out the local file > o.a.h.hive.ql/parse.SemanticException: Exception while processing exception > while writing out local file > ... > caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > at o.a.h.fs.Path.initialize (206) > at o.a.h.fs.Path.(197)... > at o.a.h.hive.ql.context.getScratchDir(267) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location
[ https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107807#comment-15107807 ] Reuben Kuhnert commented on HIVE-12891: --- Patch Fix: Ensure that paths are expanded to absolute locations. > Hive fails when java.io.tmpdir is set to a relative location > > > Key: HIVE-12891 > URL: https://issues.apache.org/jira/browse/HIVE-12891 > Project: Hive > Issue Type: Bug >Reporter: Reuben Kuhnert >Assignee: Reuben Kuhnert > Attachments: HIVE-12891.01.19.2016.01.patch > > > The function {{SessionState.createSessionDirs}} fails when trying to create > directories where {{java.io.tmpdir}} is set to a relative location. > {code} > \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: > IllegalArgumentException java.net.URISyntaxException: Relative path in > absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > ... > Minor variations: > \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException > Exception while processing Exception while writing out the local file > o.a.h.hive.ql/parse.SemanticException: Exception while processing exception > while writing out local file > ... > caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > at o.a.h.fs.Path.initialize (206) > at o.a.h.fs.Path.(197)... > at o.a.h.hive.ql.context.getScratchDir(267) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
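The fix direction stated in the comment — expand a relative java.io.tmpdir to an absolute location before a "file:" URI is built from it — can be sketched as follows. This is an illustration of the approach, not the actual HIVE-12891 patch:

```java
import java.io.File;

// Sketch only: normalize a possibly-relative tmpdir (e.g. "./tmp") to an
// absolute path, avoiding the "Relative path in absolute URI" failure when the
// path is later turned into a file: URI.
public class TmpDirSketch {
    static String toAbsolutePath(String tmpDir) {
        File dir = new File(tmpDir);
        return dir.isAbsolute() ? dir.getPath() : dir.getAbsolutePath();
    }
}
```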
[jira] [Commented] (HIVE-12783) fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl
[ https://issues.apache.org/jira/browse/HIVE-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107877#comment-15107877 ] Lefty Leverenz commented on HIVE-12783: --- Nudging [~owen.omalley]: this needs a status update. Thanks. > fix the unit test failures in TestSparkClient and TestSparkSessionManagerImpl > - > > Key: HIVE-12783 > URL: https://issues.apache.org/jira/browse/HIVE-12783 > Project: Hive > Issue Type: Sub-task > Components: Test >Affects Versions: 2.0.0 >Reporter: Pengcheng Xiong >Assignee: Owen O'Malley >Priority: Blocker > Attachments: HIVE-12783.patch, HIVE-12783.patch, HIVE-12783.patch > > > This includes > {code} > org.apache.hive.spark.client.TestSparkClient.testSyncRpc > org.apache.hive.spark.client.TestSparkClient.testJobSubmission > org.apache.hive.spark.client.TestSparkClient.testMetricsCollection > org.apache.hive.spark.client.TestSparkClient.testCounters > org.apache.hive.spark.client.TestSparkClient.testRemoteClient > org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles > org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob > org.apache.hive.spark.client.TestSparkClient.testErrorJob > org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse > org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse > {code} > all of them passed on my laptop. cc'ing [~szehon], [~xuefuz], could you > please take a look? Shall we ignore them? Thanks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12798) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver.vector* queries failures due to NPE in Vectorizer.onExpressionHasNullSafes()
[ https://issues.apache.org/jira/browse/HIVE-12798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12798: - Fix Version/s: 2.0.0 > CBO: Calcite Operator To Hive Operator (Calcite Return Path): > MiniTezCliDriver.vector* queries failures due to NPE in > Vectorizer.onExpressionHasNullSafes() > --- > > Key: HIVE-12798 > URL: https://issues.apache.org/jira/browse/HIVE-12798 > Project: Hive > Issue Type: Sub-task > Components: CBO >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Fix For: 2.0.0, 2.1.0 > > Attachments: HIVE-12798.1.patch > > > As of 01/04/2016, the following tests fail in the MiniTezCliDriver mode when > the cbo return path is enabled. We need to fix them : > {code} > vector_leftsemi_mapjoin > vector_join_filters > vector_interval_mapjoin > vector_left_outer_join > vectorized_mapjoin > vector_inner_join > vectorized_context > tez_vector_dynpart_hashjoin_1 > count > auto_sortmerge_join_6 > skewjoin > vector_auto_smb_mapjoin_14 > auto_join_filters > vector_outer_join0 > vector_outer_join1 > vector_outer_join2 > vector_outer_join3 > vector_outer_join4 > vector_outer_join5 > hybridgrace_hashjoin_1 > vector_mapjoin_reduce > vectorized_nested_mapjoin > vector_left_outer_join2 > vector_char_mapjoin1 > vector_decimal_mapjoin > vectorized_dynamic_partition_pruning > vector_varchar_mapjoin1 > {code} > This jira is intended to cover the vectorization issues related to the > MiniTezCliDriver failures caused by NPE via nullSafes array as shown below : > {code} > private boolean onExpressionHasNullSafes(MapJoinDesc desc) { > boolean[] nullSafes = desc.getNullSafes(); > for (boolean nullSafe : nullSafes) { > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-8680) Set Max Message for Binary Thrift endpoints
[ https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107768#comment-15107768 ] Lefty Leverenz edited comment on HIVE-8680 at 1/20/16 12:57 AM: Doc note: the wiki documentation has been added: * [Configuration Properties - hive.metastore.server.max.message.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.server.max.message.size] * [Configuration Properties - hive.server2.thrift.max.message.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.thrift.max.message.size] Removed the TODOC15. was (Author: sladymon): Doc note: the wiki documentation has been added: * [Configuration Properties - hive.metastore.server.max.message.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.server.max.message.size] * [Configuration Properties - hive.server2.thrift.max.message.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.thrift.max.message.size] Removed the TODOC15. > Set Max Message for Binary Thrift endpoints > --- > > Key: HIVE-8680 > URL: https://issues.apache.org/jira/browse/HIVE-8680 > Project: Hive > Issue Type: Bug >Reporter: Brock Noland >Assignee: Brock Noland > Fix For: 1.1.0, 1.0.2 > > Attachments: HIVE-8680.patch, HIVE-8680.patch > > > Thrift has a configuration open to restrict incoming message size. If we > configure this we'll stop OOM'ing when someone sends us an HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12244) Refactoring code for avoiding of comparison of Strings and do comparison on Path
[ https://issues.apache.org/jira/browse/HIVE-12244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107798#comment-15107798 ] Hive QA commented on HIVE-12244: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12783127/HIVE-12244.2.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6676/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6676/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6676/ Messages: {noformat} This message was trimmed, see log for full details [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/src/main/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ udf-classloader-udf2 --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ udf-classloader-udf2 --- [INFO] Compiling 1 source file to /data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/classes [INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ udf-classloader-udf2 --- [INFO] Using 'UTF-8' encoding to copy filtered resources. 
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/src/test/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ udf-classloader-udf2 --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/tmp/conf [copy] Copying 16 files to /data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ udf-classloader-udf2 --- [INFO] No sources to compile [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ udf-classloader-udf2 --- [INFO] Tests are skipped. 
[INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ udf-classloader-udf2 --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/udf-classloader-udf2-2.1.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ udf-classloader-udf2 --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ udf-classloader-udf2 --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/target/udf-classloader-udf2-2.1.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-it-custom-udfs/udf-classloader-udf2/2.1.0-SNAPSHOT/udf-classloader-udf2-2.1.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/itests/custom-udfs/udf-classloader-udf2/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-it-custom-udfs/udf-classloader-udf2/2.1.0-SNAPSHOT/udf-classloader-udf2-2.1.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Integration - HCatalog Unit Tests 2.1.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-hcatalog-it-unit --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/itests/hcatalog-unit/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/itests/hcatalog-unit (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-hcatalog-it-unit --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (download-spark) @ hive-hcatalog-it-unit --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ hive-hcatalog-it-unit --- [INFO] [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hive-hcatalog-it-unit --- [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] skip non existing
[jira] [Updated] (HIVE-12220) LLAP: Usability issues with hive.llap.io.cache.orc.size
[ https://issues.apache.org/jira/browse/HIVE-12220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12220: Attachment: HIVE-12220.03.patch The same patch... HiveQA didn't run for whatever reason > LLAP: Usability issues with hive.llap.io.cache.orc.size > --- > > Key: HIVE-12220 > URL: https://issues.apache.org/jira/browse/HIVE-12220 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Carter Shanklin >Assignee: Sergey Shelukhin > Attachments: HIVE-12220.01.patch, HIVE-12220.02.patch, > HIVE-12220.03.patch, HIVE-12220.patch, HIVE-12220.tmp.patch > > > In the llap-daemon site you need to set, among other things, > llap.daemon.memory.per.instance.mb > and > hive.llap.io.cache.orc.size > The use of hive.llap.io.cache.orc.size caused me some unnecessary problems, > initially I entered the value in MB rather than in bytes. Operator error you > could say but I look at this as a fraction of the other value which is in mb. > Second, is this really tied to ORC? E.g. when we have the vectorized text > reader will this data be cached as well? Or might it be in the future? > I would like to propose instead using hive.llap.io.cache.size.mb for this > setting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11097) HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases
[ https://issues.apache.org/jira/browse/HIVE-11097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107678#comment-15107678 ] Hive QA commented on HIVE-11097: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12783035/HIVE-11097.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10011 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_symlink_text_input_format org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6674/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6674/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6674/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12783035 - PreCommit-HIVE-TRUNK-Build > HiveInputFormat uses String.startsWith to compare splitPath and PathToAliases > - > > Key: HIVE-11097 > URL: https://issues.apache.org/jira/browse/HIVE-11097 > Project: Hive > Issue Type: Bug > Components: File Formats >Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0 > Environment: Hive 0.13.1, Hive 2.0.0, hadoop 2.4.1 >Reporter: Wan Chang >Assignee: Wan Chang >Priority: Critical > Attachments: HIVE-11097.1.patch, HIVE-11097.2.patch, > HIVE-11097.3.patch > > > Say we have a SQL query such as > {code} > create table if not exists test_orc_src (a int, b int, c int) stored as orc; > create table if not exists test_orc_src2 (a int, b int, d int) stored as orc; > insert overwrite table test_orc_src select 1,2,3 from src limit 1; > insert overwrite table test_orc_src2 select 1,2,4 from src limit 1; > set hive.auto.convert.join = false; > set hive.execution.engine=mr; > select > tb.c > from test.test_orc_src tb > join (select * from test.test_orc_src2) tm > on tb.a = tm.a > where tb.b = 2 > {code} > The correct result is 3 but it produced no result. > I find that in HiveInputFormat.pushProjectionsAndFilters > {code} > match = splitPath.startsWith(key) || splitPathWithNoSchema.startsWith(key); > {code} > It uses startsWith to match aliases against the path, so tm will match two aliases > in this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
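The failure mode above is easy to reproduce with plain strings: {{startsWith}} accepts any alias whose registered path is a character prefix of the split path, even when that prefix ends mid-directory-name, so the path for {{test_orc_src2}} also matches the key for {{test_orc_src}}. A sketch of the naive check versus a component-aware one (paths are illustrative, not taken from an actual Hive run):

```java
public class PrefixMatchDemo {
    // Naive check, mirroring the buggy line in pushProjectionsAndFilters.
    static boolean naiveMatch(String splitPath, String aliasPath) {
        return splitPath.startsWith(aliasPath);
    }

    // Component-aware check: the prefix must end exactly on a '/' boundary,
    // so "/w/test_orc_src" does not claim splits under "/w/test_orc_src2".
    static boolean pathMatch(String splitPath, String aliasPath) {
        return splitPath.equals(aliasPath)
            || splitPath.startsWith(aliasPath + "/");
    }

    public static void main(String[] args) {
        String split = "/warehouse/test_orc_src2/part-0";
        // test_orc_src is NOT an ancestor of this split, but naiveMatch says it is.
        System.out.println(naiveMatch(split, "/warehouse/test_orc_src"));  // true (wrong)
        System.out.println(pathMatch(split, "/warehouse/test_orc_src"));   // false
        System.out.println(pathMatch(split, "/warehouse/test_orc_src2"));  // true
    }
}
```

The attached patches presumably compare paths component-wise rather than by raw string prefix; this demo only shows why the raw-prefix comparison matches two aliases in the reported case.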
[jira] [Updated] (HIVE-9147) Add unit test for HIVE-7323
[ https://issues.apache.org/jira/browse/HIVE-9147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Slawski updated HIVE-9147: Attachment: HIVE-9147.2.patch I rebased this patch on top of the latest master branch. Please see attachments. > Add unit test for HIVE-7323 > --- > > Key: HIVE-9147 > URL: https://issues.apache.org/jira/browse/HIVE-9147 > Project: Hive > Issue Type: Test > Components: Statistics >Affects Versions: 0.14.0, 0.13.1 >Reporter: Peter Slawski >Priority: Minor > Attachments: HIVE-9147.1.patch, HIVE-9147.2.patch > > > This unit test verifies that DateStatisticImpl doesn't store mutable objects > from callers for minimum and maximum values. This ensures callers cannot > modify the internal minimum and maximum values outside of DateStatisticImpl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12727) allow full table queries in strict mode
[ https://issues.apache.org/jira/browse/HIVE-12727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12727: Attachment: HIVE-12727.01.patch The same patch. I cannot repro the test failures locally, need to see the logs here. > allow full table queries in strict mode > --- > > Key: HIVE-12727 > URL: https://issues.apache.org/jira/browse/HIVE-12727 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Blocker > Attachments: HIVE-12727.01.patch, HIVE-12727.patch > > > Making strict mode the default recently appears to have broken many normal > queries, such as some TPCDS benchmark queries, e.g. Q85: > Response message: org.apache.hive.service.cli.HiveSQLException: Error while > compiling statement: FAILED: SemanticException [Error 10041]: No partition > predicate found for Alias "web_sales" Table "web_returns" > We should remove this restriction from strict mode, or change the default > back to non-strict. Perhaps make a 3-value parameter, nonstrict, semistrict, > and strict, for backward compat for people who are relying on strict already. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12893) Sorted dynamic partition does not work if subset of partition columns are constant folded
[ https://issues.apache.org/jira/browse/HIVE-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107894#comment-15107894 ] Prasanth Jayachandran commented on HIVE-12893: -- [~ashutoshc] Can you please review this patch? > Sorted dynamic partition does not work if subset of partition columns are > constant folded > - > > Key: HIVE-12893 > URL: https://issues.apache.org/jira/browse/HIVE-12893 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Affects Versions: 1.3.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-12893.1.patch > > > If all partition columns are constant folded, then sorted dynamic partitioning > should not be used, as it is similar to static partitioning. But if only a > subset of partition columns is constant folded, sorted dynamic partition > optimization will be helpful. Currently, this optimization is disabled if > at least one partition column is constant folded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12893) Sorted dynamic partition does not work if subset of partition columns are constant folded
[ https://issues.apache.org/jira/browse/HIVE-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12893: - Attachment: HIVE-12893.1.patch This patch moves the SortedDynamicPartition optimizer above the PartitionConditionRemover optimization. Removing the partition condition after constant folding makes it complicated to determine the partition columns, as the folded columns will be removed from the row schema. This patch also disables the BucketingSortingReduceSinkOptimizer if the SortedDynamicPartition optimizer inserts a new ReduceSink. > Sorted dynamic partition does not work if subset of partition columns are > constant folded > - > > Key: HIVE-12893 > URL: https://issues.apache.org/jira/browse/HIVE-12893 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Affects Versions: 1.3.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-12893.1.patch > > > If all partition columns are constant folded, then sorted dynamic partitioning > should not be used, as it is similar to static partitioning. But if only a > subset of partition columns is constant folded, sorted dynamic partition > optimization will be helpful. Currently, this optimization is disabled if > at least one partition column is constant folded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
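The intended behavior in the description above reduces to a simple predicate: skip sorted dynamic partitioning only when *every* partition column was constant folded (the insert is effectively static), and keep it when only a subset was. A sketch of that decision with hypothetical names (not the actual SortedDynPartitionOptimizer API):

```java
public class SortedDynPartDecision {
    /**
     * Returns true when sorted dynamic partitioning should still apply.
     * foldedPartCols: partition columns removed by constant folding;
     * totalPartCols:  all partition columns of the target table.
     */
    static boolean useSortedDynamicPartitioning(int foldedPartCols, int totalPartCols) {
        // All columns folded -> effectively static partitioning, skip.
        // Before this patch the optimization was skipped as soon as
        // foldedPartCols > 0, which is the reported bug.
        return foldedPartCols < totalPartCols;
    }

    public static void main(String[] args) {
        System.out.println(useSortedDynamicPartitioning(3, 3)); // fully static: skip
        System.out.println(useSortedDynamicPartitioning(1, 3)); // subset folded: keep
    }
}
```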
[jira] [Updated] (HIVE-12893) Sorted dynamic partition does not work if subset of partition columns are constant folded
[ https://issues.apache.org/jira/browse/HIVE-12893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12893: - Reporter: Yi Zhang (was: Prasanth Jayachandran) > Sorted dynamic partition does not work if subset of partition columns are > constant folded > - > > Key: HIVE-12893 > URL: https://issues.apache.org/jira/browse/HIVE-12893 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Affects Versions: 1.3.0, 2.0.0 >Reporter: Yi Zhang >Assignee: Prasanth Jayachandran > Attachments: HIVE-12893.1.patch > > > If all partition columns are constant folded, then sorted dynamic partitioning > should not be used, as it is similar to static partitioning. But if only a > subset of partition columns is constant folded, sorted dynamic partition > optimization will be helpful. Currently, this optimization is disabled if > at least one partition column is constant folded. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8680) Set Max Message for Binary Thrift endpoints
[ https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107768#comment-15107768 ] Shannon Ladymon commented on HIVE-8680: --- Doc note: the wiki documentation has been added: * [Configuration Properties - hive.metastore.server.max.message.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.server.max.message.size] * [Configuration Properties - hive.server2.thrift.max.message.size | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.thrift.max.message.size] Removed the TODOC15. > Set Max Message for Binary Thrift endpoints > --- > > Key: HIVE-8680 > URL: https://issues.apache.org/jira/browse/HIVE-8680 > Project: Hive > Issue Type: Bug >Reporter: Brock Noland >Assignee: Brock Noland > Fix For: 1.1.0, 1.0.2 > > Attachments: HIVE-8680.patch, HIVE-8680.patch > > > Thrift has a configuration open to restrict incoming message size. If we > configure this we'll stop OOM'ing when someone sends us an HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12856) LLAP: update (add/remove) the UDFs available in LLAP when they are changed; also refresh periodically
[ https://issues.apache.org/jira/browse/HIVE-12856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12856: Description: I don't think re-querying the functions is going to scale, and the sessions obviously cannot notify all LLAP clusters of every change. We should add global versioning to metastore functions to track changes, and then possibly add a notification mechanism, potentially thru ZK to avoid overloading the metastore itself. (was: I don't think re-querying the functions is going to scale, and the sessions obviously cannot notify all LLAP clusters of every change. We should add versioning to metastore functions to track changes, and then possibly add a notification mechanism, potentially thru ZK to avoid overloading the metastore itself.) > LLAP: update (add/remove) the UDFs available in LLAP when they are changed; > also refresh periodically > - > > Key: HIVE-12856 > URL: https://issues.apache.org/jira/browse/HIVE-12856 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > I don't think re-querying the functions is going to scale, and the sessions > obviously cannot notify all LLAP clusters of every change. We should add > global versioning to metastore functions to track changes, and then possibly > add a notification mechanism, potentially thru ZK to avoid overloading the > metastore itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12892) Add global change versioning to permanent functions in metastore
[ https://issues.apache.org/jira/browse/HIVE-12892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12892: Summary: Add global change versioning to permanent functions in metastore (was: Add change versioning to permanent functions in metastore) > Add global change versioning to permanent functions in metastore > > > Key: HIVE-12892 > URL: https://issues.apache.org/jira/browse/HIVE-12892 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12892) Add change versioning to permanent functions in metastore
[ https://issues.apache.org/jira/browse/HIVE-12892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-12892. - Resolution: Won't Fix > Add change versioning to permanent functions in metastore > - > > Key: HIVE-12892 > URL: https://issues.apache.org/jira/browse/HIVE-12892 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-12892) Add global change versioning to permanent functions in metastore
[ https://issues.apache.org/jira/browse/HIVE-12892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reopened HIVE-12892: - > Add global change versioning to permanent functions in metastore > > > Key: HIVE-12892 > URL: https://issues.apache.org/jira/browse/HIVE-12892 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12865) Exchange partition does not show inputs field for post/pre execute hooks
[ https://issues.apache.org/jira/browse/HIVE-12865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107830#comment-15107830 ] Paul Yang commented on HIVE-12865: -- Looks great to me - thanks Aihua! > Exchange partition does not show inputs field for post/pre execute hooks > > > Key: HIVE-12865 > URL: https://issues.apache.org/jira/browse/HIVE-12865 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.12.0 >Reporter: Paul Yang >Assignee: Aihua Xu > Attachments: HIVE-12865.patch > > > The pre/post execute hook interface has fields that indicate which Hive > objects were read / written to as a result of running the query. For the > exchange partition operation, the read entity field is empty. > This is an important issue as the hook interface may be configured to perform > critical warehouse operations. > See > ql/src/test/results/clientpositive/exchange_partition3.q.out > {code} > --- a/ql/src/test/results/clientpositive/exchange_partition3.q.out > +++ b/ql/src/test/results/clientpositive/exchange_partition3.q.out > @@ -65,9 +65,17 @@ ds=2013-04-05/hr=2 > PREHOOK: query: -- This will exchange both partitions hr=1 and hr=2 > ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH > TABLE exchange_part_test2 > PREHOOK: type: ALTERTABLE_EXCHANGEPARTITION > +PREHOOK: Output: default@exchange_part_test1 > +PREHOOK: Output: default@exchange_part_test2 > POSTHOOK: query: -- This will exchange both partitions hr=1 and hr=2 > ALTER TABLE exchange_part_test1 EXCHANGE PARTITION (ds='2013-04-05') WITH > TABLE exchange_part_test2 > POSTHOOK: type: ALTERTABLE_EXCHANGEPARTITION > +POSTHOOK: Output: default@exchange_part_test1 > +POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=1 > +POSTHOOK: Output: default@exchange_part_test1@ds=2013-04-05/hr=2 > +POSTHOOK: Output: default@exchange_part_test2 > +POSTHOOK: Output: default@exchange_part_test2@ds=2013-04-05/hr=1 > +POSTHOOK: Output: 
default@exchange_part_test2@ds=2013-04-05/hr=2 > PREHOOK: query: SHOW PARTITIONS exchange_part_test1 > PREHOOK: type: SHOWPARTITIONS > PREHOOK: Input: default@exchange_part_test1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8680) Set Max Message for Binary Thrift endpoints
[ https://issues.apache.org/jira/browse/HIVE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shannon Ladymon updated HIVE-8680: -- Labels: (was: TODOC15) > Set Max Message for Binary Thrift endpoints > --- > > Key: HIVE-8680 > URL: https://issues.apache.org/jira/browse/HIVE-8680 > Project: Hive > Issue Type: Bug >Reporter: Brock Noland >Assignee: Brock Noland > Fix For: 1.1.0, 1.0.2 > > Attachments: HIVE-8680.patch, HIVE-8680.patch > > > Thrift has a configuration open to restrict incoming message size. If we > configure this we'll stop OOM'ing when someone sends us an HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12856) LLAP: update (add/remove) the UDFs available in LLAP when they are changed; also refresh periodically
[ https://issues.apache.org/jira/browse/HIVE-12856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12856: Description: I don't think re-querying the functions is going to scale, and the sessions obviously cannot notify all LLAP clusters of every change. We should add versioning to metastore functions to track changes, and then possibly add a notification mechanism, potentially thru ZK to avoid overloading the metastore itself. > LLAP: update (add/remove) the UDFs available in LLAP when they are changed; > also refresh periodically > - > > Key: HIVE-12856 > URL: https://issues.apache.org/jira/browse/HIVE-12856 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > I don't think re-querying the functions is going to scale, and the sessions > obviously cannot notify all LLAP clusters of every change. We should add > versioning to metastore functions to track changes, and then possibly add a > notification mechanism, potentially thru ZK to avoid overloading the > metastore itself. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12855) LLAP: add checks when resolving UDFs to enforce whitelist
[ https://issues.apache.org/jira/browse/HIVE-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108067#comment-15108067 ] Hive QA commented on HIVE-12855: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12783176/HIVE-12855.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10023 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6679/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6679/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6679/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12783176 - PreCommit-HIVE-TRUNK-Build > LLAP: add checks when resolving UDFs to enforce whitelist > - > > Key: HIVE-12855 > URL: https://issues.apache.org/jira/browse/HIVE-12855 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12855.01.patch, HIVE-12855.part.patch > > > Currently, adding a temporary UDF and calling LLAP with it (bypassing the > LlapDecider check, I did it by just modifying the source) only fails because > the class could not be found. If the UDF was accessible to LLAP, it would > execute. Inside the daemon, UDF instantiation should fail for custom UDFs > (and only succeed for whitelisted custom UDFs, once that is implemented). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12889) Support COUNT(DISTINCT) for partitioning query.
[ https://issues.apache.org/jira/browse/HIVE-12889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107975#comment-15107975 ] Hive QA commented on HIVE-12889: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12783137/HIVE-12889.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 10010 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-vectorized_parquet.q-orc_merge6.q-vector_outer_join0.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_partitioned org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join29 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_14 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_result_complex org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_joins_explain org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_decimal org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_aggregate org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_mapjoin org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_reduce2 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_0 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_17 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6677/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6677/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6677/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 23 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12783137 - PreCommit-HIVE-TRUNK-Build > Support COUNT(DISTINCT) for partitioning query. > --- > > Key: HIVE-12889 > URL: https://issues.apache.org/jira/browse/HIVE-12889 > Project: Hive > Issue Type: Sub-task > Components: PTF-Windowing >Affects Versions: 2.1.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-12889.patch > > > We need to support avg(distinct), count(distinct), sum(distinct) for the > parent jira HIVE-9534. Separate the work for count(distinct) in this subtask. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12353) When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it should not.
[ https://issues.apache.org/jira/browse/HIVE-12353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-12353: -- Attachment: HIVE-12353.4.patch > When Compactor fails it calls CompactionTxnHandler.markedCleaned(). it > should not. > --- > > Key: HIVE-12353 > URL: https://issues.apache.org/jira/browse/HIVE-12353 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Blocker > Attachments: HIVE-12353.2.patch, HIVE-12353.3.patch, > HIVE-12353.4.patch, HIVE-12353.patch > > > One of the things that this method does is delete entries from TXN_COMPONENTS > for the partition that it was trying to compact. > This causes Aborted transactions in TXNS to become empty according to > CompactionTxnHandler.cleanEmptyAbortedTxns(), which means they can now be > deleted. > Once they are deleted, data that belongs to these txns is deemed committed... > We should extend COMPACTION_QUEUE state with 'f' and 's' (failed, success) > states. We should also not delete the entry from markedCleaned(). > We'll have a separate process that cleans 'f' and 's' records after X minutes > (or after > N records for a given partition exist). > This allows SHOW COMPACTIONS to show some history info and how many times > compaction failed on a given partition (subject to the retention interval), so > that we don't have to call markCleaned() on Compactor failures, at the same > time preventing the Compactor from constantly getting stuck on the same bad > partition/table. > Ideally we'd want to include an END_TIME field. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same
[ https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108052#comment-15108052 ] Xuefu Zhang commented on HIVE-12736: I also tried memcheck.q, and it passed locally for me too. It doesn't seem related to the patch regardless. As to the patch, it looks good to me. However, I don't know much about mapjoin with hints, and I'm not sure why groupby and union cannot exist before mapjoin. If you have some explanation, that will help. +1 for the patch. > It seems that result of Hive on Spark be mistaken and result of Hive and Hive > on Spark are not the same > --- > > Key: HIVE-12736 > URL: https://issues.apache.org/jira/browse/HIVE-12736 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.1, 1.2.1 >Reporter: JoneZhang >Assignee: Chengxiang Li > Attachments: HIVE-12736.1-spark.patch, HIVE-12736.2-spark.patch, > HIVE-12736.3-spark.patch, HIVE-12736.4-spark.patch, HIVE-12736.5-spark.patch, > HIVE-12736.5-spark.patch > > > {code} > select * from staff; > 1 jone22 1 > 2 lucy21 1 > 3 hmm 22 2 > 4 james 24 3 > 5 xiaoliu 23 3 > select id,date_ from trade union all select id,"test" from trade ; > 1 201510210908 > 2 201509080234 > 2 201509080235 > 1 test > 2 test > 2 test > set hive.execution.engine=spark; > set spark.master=local; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > 1 jone22 1 1 201510210908 > 2 lucy21 1 2 201509080234 > 2 lucy21 1 2 201509080235 > set hive.execution.engine=mr; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > FAILED: SemanticException [Error 10227]: Not all clauses are supported with > mapjoin hint. Please remove mapjoin hint. > {code} > I have two questions > 1.Why result of hive on spark not include the following record? 
> {code} > 1 jone22 1 1 test > 2 lucy21 1 2 test > 2 lucy21 1 2 test > {code} > 2.Why there are two different ways of dealing same query? > explain 1: > {code} > set hive.execution.engine=spark; > set spark.master=local; > explain > select id,date_ from trade union all select id,"test" from trade; > OK > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Spark > DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), date_ (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Map 2 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), 'test' (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > ListSink > {code} >
[jira] [Updated] (HIVE-12446) Tracking jira for changes required for move to Tez 0.8.2
[ https://issues.apache.org/jira/browse/HIVE-12446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-12446: -- Attachment: HIVE-12446.02.patch > Tracking jira for changes required for move to Tez 0.8.2 > > > Key: HIVE-12446 > URL: https://issues.apache.org/jira/browse/HIVE-12446 > Project: Hive > Issue Type: Task > Components: llap >Reporter: Siddharth Seth > Attachments: HIVE-12446.02.patch, HIVE-12446.02.patch, > HIVE-12446.combined.1.patch, HIVE-12446.combined.1.txt, HIVE-12446.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12892) Add global change versioning to permanent functions in metastore
[ https://issues.apache.org/jira/browse/HIVE-12892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12892: Attachment: HIVE-12892.WIP.patch WIP patch for backup. In the first cut, the version would be queryable. Perhaps it could also have ZK notifications to avoid overloading the metastore when many subscribers connect. > Add global change versioning to permanent functions in metastore > > > Key: HIVE-12892 > URL: https://issues.apache.org/jira/browse/HIVE-12892 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12892.WIP.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12894) Detect whether ORC is reading from ACID table correctly for Schema Evolution
[ https://issues.apache.org/jira/browse/HIVE-12894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-12894: Attachment: HIVE-12894.01.patch This patch included uncommitted changes for HIVE-12887, too. > Detect whether ORC is reading from ACID table correctly for Schema Evolution > > > Key: HIVE-12894 > URL: https://issues.apache.org/jira/browse/HIVE-12894 > Project: Hive > Issue Type: Bug > Components: ORC >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-12894.01.patch > > > Set a configuration variable with the 'transactional' property to indicate the > table is ACID. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12891) Hive fails when java.io.tmpdir is set to a relative location
[ https://issues.apache.org/jira/browse/HIVE-12891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108163#comment-15108163 ] Lenni Kuff commented on HIVE-12891: --- Comments: - Do you want to expand all of these paths to absolute? Some of them are HDFS scratch dirs, not sure if we want to support relative paths for those or just java.io.tmpdir - Update the config documentation to mention that relative or absolute paths are allowed. - Is it easy to add a test for this? > Hive fails when java.io.tmpdir is set to a relative location > > > Key: HIVE-12891 > URL: https://issues.apache.org/jira/browse/HIVE-12891 > Project: Hive > Issue Type: Bug >Reporter: Reuben Kuhnert >Assignee: Reuben Kuhnert > Attachments: HIVE-12891.01.19.2016.01.patch > > > The function {{SessionState.createSessionDirs}} fails when trying to create > directories where {{java.io.tmpdir}} is set to a relative location. > {code} > \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: > IllegalArgumentException java.net.URISyntaxException: Relative path in > absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > ... > Minor variations: > \[uber-SubtaskRunner] ERROR o.a.h.hive..ql.Driver - FAILED: SemanticException > Exception while processing Exception while writing out the local file > o.a.h.hive.ql/parse.SemanticException: Exception while processing exception > while writing out local file > ... > caused by: java.lang.IllegalArgumentException: java.net.URISyntaxException: > Relative path in absolute URI: > file:./tmp///hive_2015_12_11_09-12-25_352_4325234652356-1 > at o.a.h.fs.Path.initialize (206) > at o.a.h.fs.Path.(197)... > at o.a.h.hive.ql.context.getScratchDir(267) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
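[Editorial note] The failure above happens because a file: URI built straight from a relative java.io.tmpdir takes a form like file:./tmp, which Hadoop's Path constructor rejects with "Relative path in absolute URI". A minimal sketch of the kind of fix the reviewer suggests (hypothetical class name; this is not Hive's actual patch): resolve the directory to an absolute path before constructing the URI.

```java
import java.io.File;
import java.nio.file.Paths;

public class TmpDirSketch {
    public static void main(String[] args) {
        // Simulate java.io.tmpdir pointing at a relative location.
        String tmpDir = "./tmp";
        // Resolving to an absolute path first avoids ever producing a
        // "file:./tmp"-style URI that Path parsing rejects.
        String absolute = new File(tmpDir).getAbsolutePath();
        System.out.println(Paths.get(absolute).toUri());
    }
}
```

On a POSIX system this prints an absolute file:///... URI anchored at the current working directory, which Path accepts.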
[jira] [Commented] (HIVE-12855) LLAP: add checks when resolving UDFs to enforce whitelist
[ https://issues.apache.org/jira/browse/HIVE-12855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106651#comment-15106651 ] Hive QA commented on HIVE-12855: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12782985/HIVE-12855.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 10025 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_bmj_schema_evolution org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dml org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_dynpart_hashjoin_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_schema_evolution org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_self_join org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_dynamic_partition org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_union_multiinsert org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_vector_dynpart_hashjoin_2 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_join_part_col_char org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse 
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6670/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6670/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6670/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 18 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12782985 - PreCommit-HIVE-TRUNK-Build > LLAP: add checks when resolving UDFs to enforce whitelist > - > > Key: HIVE-12855 > URL: https://issues.apache.org/jira/browse/HIVE-12855 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12855.part.patch, HIVE-12855.patch > > > Currently, adding a temporary UDF and calling LLAP with it (bypassing the > LlapDecider check, I did it by just modifying the source) only fails because > the class could not be found. If the UDF was accessible to LLAP, it would > execute. Inside the daemon, UDF instantiation should fail for custom UDFs > (and only succeed for whitelisted custom UDFs, once that is implemented). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12478) Improve Hive/Calcite Trasitive Predicate inference
[ https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106772#comment-15106772 ] Jesus Camacho Rodriguez commented on HIVE-12478: [~jpullokkaran], could you take a look at the current version of the patch? Finally, I had to store state about transitive inference in the operator itself; it was the only reasonable way of implementing exhaustive PPD, inference and constant propagation using HepPlanner. > Improve Hive/Calcite Trasitive Predicate inference > -- > > Key: HIVE-12478 > URL: https://issues.apache.org/jira/browse/HIVE-12478 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Laljo John Pullokkaran >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12478.01.patch, HIVE-12478.02.patch, > HIVE-12478.03.patch, HIVE-12478.04.patch, HIVE-12478.05.patch, > HIVE-12478.06.patch, HIVE-12478.07.patch, HIVE-12478.08.patch, > HIVE-12478.patch > > > HiveJoinPushTransitivePredicatesRule does not pull up predicates for > transitive inference if they contain more than one column. > EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from > srcpart where (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12787) Trace improvement - Inconsistent logging upon shutdown-start of the Hive metastore process
[ https://issues.apache.org/jira/browse/HIVE-12787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106775#comment-15106775 ] Aihua Xu commented on HIVE-12787: - +1. Makes sense to me. > Trace improvement - Inconsistent logging upon shutdown-start of the Hive > metastore process > -- > > Key: HIVE-12787 > URL: https://issues.apache.org/jira/browse/HIVE-12787 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 1.2.1 >Reporter: Siddharth Ahuja >Assignee: Siddharth Ahuja >Priority: Minor > Attachments: HIVE-12787.1.patch > > > The log at: > https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L793 > logged at the start of the shutdown of the Hive metastore process can be > improved to match the finish of the shutdown log at: > https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L793 > by rephrasing from: "Shutting down the object store..." to: "Metastore > shutdown started...". This will match the shutdown completion log: "Metastore > shutdown complete.". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same
[ https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106714#comment-15106714 ] Hive QA commented on HIVE-12736: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12783072/HIVE-12736.5-spark.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9870 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_memcheck org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1036/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1036/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-1036/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12783072 - PreCommit-HIVE-SPARK-Build > It seems that result of Hive on Spark be mistaken and result of Hive and Hive > on Spark are not the same > --- > > Key: HIVE-12736 > URL: https://issues.apache.org/jira/browse/HIVE-12736 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.1, 1.2.1 >Reporter: JoneZhang >Assignee: Chengxiang Li > Attachments: HIVE-12736.1-spark.patch, HIVE-12736.2-spark.patch, > HIVE-12736.3-spark.patch, HIVE-12736.4-spark.patch, HIVE-12736.5-spark.patch > > > {code} > select * from staff; > 1 jone22 1 > 2 lucy21 1 > 3 hmm 22 2 > 4 james 24 3 > 5 xiaoliu 23 3 > select id,date_ from trade union all select id,"test" from trade ; > 1 201510210908 > 2 201509080234 > 2 201509080235 > 1 test > 2 test > 2 test > set hive.execution.engine=spark; > set spark.master=local; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > 1 jone22 1 1 201510210908 > 2 lucy21 1 2 201509080234 > 2 lucy21 1 2 201509080235 > set hive.execution.engine=mr; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > FAILED: SemanticException [Error 10227]: Not all clauses are supported with > mapjoin hint. Please remove mapjoin hint. > {code} > I have two questions > 1.Why result of hive on spark not include the following record? > {code} > 1 jone22 1 1 test > 2 lucy21 1 2 test > 2 lucy21 1 2 test > {code} > 2.Why there are two different ways of dealing same query? 
> explain 1: > {code} > set hive.execution.engine=spark; > set spark.master=local; > explain > select id,date_ from trade union all select id,"test" from trade; > OK > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Spark > DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), date_ (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: >
[jira] [Updated] (HIVE-12478) Improve Hive/Calcite Trasitive Predicate inference
[ https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12478: --- Attachment: HIVE-12478.08.patch > Improve Hive/Calcite Trasitive Predicate inference > -- > > Key: HIVE-12478 > URL: https://issues.apache.org/jira/browse/HIVE-12478 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Laljo John Pullokkaran >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12478.01.patch, HIVE-12478.02.patch, > HIVE-12478.03.patch, HIVE-12478.04.patch, HIVE-12478.05.patch, > HIVE-12478.06.patch, HIVE-12478.07.patch, HIVE-12478.08.patch, > HIVE-12478.patch > > > HiveJoinPushTransitivePredicatesRule does not pull up predicates for > transitive inference if they contain more than one column. > EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from > srcpart where (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12244) Refactoring code for avoiding of comparison of Strings and do comparison on Path
[ https://issues.apache.org/jira/browse/HIVE-12244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alina Abramova updated HIVE-12244: -- Attachment: HIVE-12244.2.patch Rebased patch > Refactoring code for avoiding of comparison of Strings and do comparison on > Path > > > Key: HIVE-12244 > URL: https://issues.apache.org/jira/browse/HIVE-12244 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.1 >Reporter: Alina Abramova >Assignee: Alina Abramova >Priority: Minor > Labels: patch > Fix For: 1.2.1 > > Attachments: HIVE-12244.1.patch, HIVE-12244.2.patch > > > In Hive, String is often used to represent a path, and this causes issues. > Strings must be compared with equals(), but comparing Strings is often not correct > when comparing paths. > I think if we use Path from org.apache.hadoop.fs we will avoid such problems > in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
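[Editorial note] To illustrate why String equality is a poor proxy for path equality, here is a minimal sketch using java.nio.file as a stand-in for org.apache.hadoop.fs.Path (the Hadoop class is not assumed to be on the classpath here; the behavior shown is analogous, not identical).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class PathEqualitySketch {
    public static void main(String[] args) {
        // Two spellings of the same location: String comparison says they
        // differ, while Path parsing (which collapses redundant separators
        // and drops the trailing slash) says they are equal.
        String a = "/user/hive/warehouse/t1";
        String b = "/user/hive/warehouse//t1/";
        System.out.println(a.equals(b));   // false
        Path pa = Paths.get(a).normalize();
        Path pb = Paths.get(b).normalize();
        System.out.println(pa.equals(pb)); // true on POSIX
    }
}
```

The same mismatch is the root of bugs like the String.startsWith alias matching in HIVE-11097 above, where a prefix test on raw strings matches paths it should not.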
[jira] [Commented] (HIVE-12887) Handle ORC schema on read with fewer columns than file schema (after Schema Evolution changes)
[ https://issues.apache.org/jira/browse/HIVE-12887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106905#comment-15106905 ] Hive QA commented on HIVE-12887: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12783001/HIVE-12887.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10010 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6671/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6671/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6671/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12783001 - PreCommit-HIVE-TRUNK-Build > Handle ORC schema on read with fewer columns than file schema (after Schema > Evolution changes) > -- > > Key: HIVE-12887 > URL: https://issues.apache.org/jira/browse/HIVE-12887 > Project: Hive > Issue Type: Bug > Components: ORC >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-12887.01.patch > > > Exception caused by reading after column removal. > {code} > Caused by: java.lang.IndexOutOfBoundsException: Index: 10, Size: 10 > at java.util.ArrayList.rangeCheck(ArrayList.java:653) > at java.util.ArrayList.get(ArrayList.java:429) > at java.util.Collections$UnmodifiableList.get(Collections.java:1309) > at > org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240) > at > org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.(TreeReaderFactory.java:2053) > at > org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2481) > at > org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.(RecordReaderImpl.java:216) > at > org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:598) > at > org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.(OrcRawRecordMerger.java:179) > at > org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.(OrcRawRecordMerger.java:222) > at > org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.(OrcRawRecordMerger.java:442) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1285) > at > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1165) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
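[Editorial note] The IndexOutOfBoundsException in the stack trace above comes from indexing the file schema's subtype list with positions from a reader schema that now has fewer columns. A hypothetical, heavily simplified sketch of the guard (this is not the actual HIVE-12887 patch): project by position and stop at the shorter schema.

```java
import java.util.Arrays;
import java.util.List;

public class SchemaProjectSketch {
    // Hypothetical helper: map file columns onto a reader schema that may
    // have fewer columns (columns dropped by schema evolution).
    static List<String> project(List<String> fileCols, List<String> readerCols) {
        // Bounding the index by the shorter schema avoids the
        // IndexOutOfBoundsException seen above when a column was removed.
        int n = Math.min(fileCols.size(), readerCols.size());
        return fileCols.subList(0, n);
    }

    public static void main(String[] args) {
        List<String> fileSchema = Arrays.asList("a", "b", "c", "d");
        List<String> readerSchema = Arrays.asList("a", "b", "c");
        System.out.println(project(fileSchema, readerSchema)); // [a, b, c]
    }
}
```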
[jira] [Commented] (HIVE-12244) Refactoring code for avoiding of comparison of Strings and do comparison on Path
[ https://issues.apache.org/jira/browse/HIVE-12244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106909#comment-15106909 ] Alina Abramova commented on HIVE-12244: --- Rebased patch was attached to the issue. > Refactoring code for avoiding of comparison of Strings and do comparison on > Path > > > Key: HIVE-12244 > URL: https://issues.apache.org/jira/browse/HIVE-12244 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.1 >Reporter: Alina Abramova >Assignee: Alina Abramova >Priority: Minor > Labels: patch > Fix For: 1.2.1 > > Attachments: HIVE-12244.1.patch, HIVE-12244.2.patch > > > In Hive, String is often used to represent a path, and this causes issues. > Strings must be compared with equals(), but comparing Strings is often not correct > when comparing paths. > I think if we use Path from org.apache.hadoop.fs we will avoid such problems > in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12889) Support COUNT(DISTINCT) for partitioning query.
[ https://issues.apache.org/jira/browse/HIVE-12889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-12889: Summary: Support COUNT(DISTINCT) for partitioning query. (was: Support COUNT(DISTINCT) for partitioning qurery.) > Support COUNT(DISTINCT) for partitioning query. > --- > > Key: HIVE-12889 > URL: https://issues.apache.org/jira/browse/HIVE-12889 > Project: Hive > Issue Type: Sub-task > Components: PTF-Windowing >Affects Versions: 2.1.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > > We need to support avg(distinct), count(distinct), sum(distinct) for the > parent jira HIVE-9534. Separate the work for count(distinct) in this subtask. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12478) Improve Hive/Calcite Trasitive Predicate inference
[ https://issues.apache.org/jira/browse/HIVE-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107784#comment-15107784 ] Hive QA commented on HIVE-12478: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12783103/HIVE-12478.08.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10023 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_semijoin4 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6675/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6675/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6675/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12783103 - PreCommit-HIVE-TRUNK-Build > Improve Hive/Calcite Trasitive Predicate inference > -- > > Key: HIVE-12478 > URL: https://issues.apache.org/jira/browse/HIVE-12478 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0, 2.1.0 >Reporter: Laljo John Pullokkaran >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12478.01.patch, HIVE-12478.02.patch, > HIVE-12478.03.patch, HIVE-12478.04.patch, HIVE-12478.05.patch, > HIVE-12478.06.patch, HIVE-12478.07.patch, HIVE-12478.08.patch, > HIVE-12478.patch > > > HiveJoinPushTransitivePredicatesRule does not pull up predicates for > transitive inference if they contain more than one column. > EXPLAIN select * from srcpart join (select ds as ds, ds as `date` from > srcpart where (ds = '2008-04-08' and value=1)) s on (srcpart.ds = s.ds); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12736) It seems that result of Hive on Spark be mistaken and result of Hive and Hive on Spark are not the same
[ https://issues.apache.org/jira/browse/HIVE-12736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-12736: - Attachment: HIVE-12736.5-spark.patch I can't reproduce the failed mapjoin_memcheck.q locally, upload the patch again to verify. > It seems that result of Hive on Spark be mistaken and result of Hive and Hive > on Spark are not the same > --- > > Key: HIVE-12736 > URL: https://issues.apache.org/jira/browse/HIVE-12736 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.1, 1.2.1 >Reporter: JoneZhang >Assignee: Chengxiang Li > Attachments: HIVE-12736.1-spark.patch, HIVE-12736.2-spark.patch, > HIVE-12736.3-spark.patch, HIVE-12736.4-spark.patch, HIVE-12736.5-spark.patch, > HIVE-12736.5-spark.patch > > > {code} > select * from staff; > 1 jone22 1 > 2 lucy21 1 > 3 hmm 22 2 > 4 james 24 3 > 5 xiaoliu 23 3 > select id,date_ from trade union all select id,"test" from trade ; > 1 201510210908 > 2 201509080234 > 2 201509080235 > 1 test > 2 test > 2 test > set hive.execution.engine=spark; > set spark.master=local; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > 1 jone22 1 1 201510210908 > 2 lucy21 1 2 201509080234 > 2 lucy21 1 2 201509080235 > set hive.execution.engine=mr; > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > s.id=t.id; > FAILED: SemanticException [Error 10227]: Not all clauses are supported with > mapjoin hint. Please remove mapjoin hint. > {code} > I have two questions > 1.Why result of hive on spark not include the following record? > {code} > 1 jone22 1 1 test > 2 lucy21 1 2 test > 2 lucy21 1 2 test > {code} > 2.Why there are two different ways of dealing same query? 
> explain 1: > {code} > set hive.execution.engine=spark; > set spark.master=local; > explain > select id,date_ from trade union all select id,"test" from trade; > OK > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Spark > DagName: jonezhang_20151222191643_5301d90a-caf0-4934-8092-d165c87a4190:1 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), date_ (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Map 2 > Map Operator Tree: > TableScan > alias: trade > Statistics: Num rows: 6 Data size: 48 Basic stats: COMPLETE > Column stats: NONE > Select Operator > expressions: id (type: int), 'test' (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 6 Data size: 48 Basic stats: > COMPLETE Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 12 Data size: 96 Basic stats: > COMPLETE Column stats: NONE > table: > input format: > org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > ListSink > {code} > explain 2: > {code} > set hive.execution.engine=spark; > set spark.master=local; > explain > select /*+mapjoin(t)*/ * from staff s join > (select id,date_ from trade union all select id,"test" from trade ) t on > 
s.id=t.id;
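The join the reporter expects can be worked through in plain Python using the sample data from the report above (assumed data; this is a semantic illustration, not Hive code). It computes the result the reporter argues is correct, including the "test" rows that the Hive on Spark output dropped.

```python
# Plain-Python rendering of: staff INNER JOIN (trade UNION ALL a constant
# "test" projection of trade) ON s.id = t.id, using the rows shown in the
# JIRA description. Illustrates the expected answer to question 1.

staff = [(1, "jone", 22, 1), (2, "lucy", 21, 1), (3, "hmm", 22, 2),
         (4, "james", 24, 3), (5, "xiaoliu", 23, 3)]
trade = [(1, "201510210908"), (2, "201509080234"), (2, "201509080235")]

# UNION ALL: keep rows from both branches, no deduplication.
t = trade + [(tid, "test") for tid, _ in trade]

# Inner join on s.id = t.id.
result = [s + row for s in staff for row in t if s[0] == row[0]]
for r in sorted(result):
    print(r)
# Six rows total: three with real dates and three with "test" -- the
# latter are exactly the rows missing from the Hive on Spark output above.
```

Under UNION ALL semantics the "test" branch contributes one matching row for id 1 and two for id 2, so their absence from the Spark result indicates the union branch was lost, not filtered.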
[jira] [Updated] (HIVE-12889) Support COUNT(DISTINCT) for partitioning query.
[ https://issues.apache.org/jira/browse/HIVE-12889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-12889: Attachment: HIVE-12889.patch > Support COUNT(DISTINCT) for partitioning query. > --- > > Key: HIVE-12889 > URL: https://issues.apache.org/jira/browse/HIVE-12889 > Project: Hive > Issue Type: Sub-task > Components: PTF-Windowing >Affects Versions: 2.1.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-12889.patch > > > We need to support avg(distinct), count(distinct), sum(distinct) for the > parent jira HIVE-9534. Separate the work for count(distinct) in this subtask. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-5801) Support for reader/writer of ORC format for R environment
[ https://issues.apache.org/jira/browse/HIVE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106968#comment-15106968 ] Jorge Martinez commented on HIVE-5801: -- Hi [~mhausenblas], there's an R package to read CSV and ORC files from HDFS. It's on GitHub: https://github.com/vertica/r-dataconnector > Support for reader/writer of ORC format for R environment > - > > Key: HIVE-5801 > URL: https://issues.apache.org/jira/browse/HIVE-5801 > Project: Hive > Issue Type: Improvement >Reporter: Michael Hausenblas >Priority: Minor > > It would be great if the ORC format were directly accessible from R [1], > that is, if a reader/writer were provided for it. > [1] http://www.r-project.org/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12889) Support COUNT(DISTINCT) for partitioning query.
[ https://issues.apache.org/jira/browse/HIVE-12889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15106994#comment-15106994 ] Aihua Xu commented on HIVE-12889: - Uploaded the first patch. In this patch: 1. The parser is enabled to properly parse queries such as "count(distinct) over (partition by c1)". 2. ORDER BY and windowing frames won't work with distinct functions, due to performance concerns and implementation requirements. 3. We insert the distinct fields into the order-by list, so during counting we only need to compare the current row against the previously remembered row. > Support COUNT(DISTINCT) for partitioning query. > --- > > Key: HIVE-12889 > URL: https://issues.apache.org/jira/browse/HIVE-12889 > Project: Hive > Issue Type: Sub-task > Components: PTF-Windowing >Affects Versions: 2.1.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-12889.patch > > > We need to support avg(distinct), count(distinct), sum(distinct) for the > parent jira HIVE-9534. Separate the work for count(distinct) in this subtask. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
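The counting idea in point 3 of the comment above can be sketched in a few lines of Python (a sketch of the technique, not the Hive implementation): once the partition's rows are ordered by the distinct expression, COUNT(DISTINCT) only needs to compare each value against the previously seen one instead of maintaining a full hash set.

```python
# Sketch of the approach described in the comment: sort on the distinct
# field, then count a value only when it differs from the previous one.

def count_distinct_ordered(values):
    """Count distinct values in a sequence sorted on those values."""
    count = 0
    prev = object()  # sentinel that never equals a real value
    for v in values:
        if v != prev:
            count += 1
            prev = v
    return count

# Equivalent to ordering the partition by the distinct field first:
partition = sorted(["a", "b", "a", "c", "b", "a"])
print(count_distinct_ordered(partition))  # 3
```

This is why the patch folds the distinct fields into the order-by list: the ordering guarantee turns distinct counting into a constant-memory adjacent comparison.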
[jira] [Commented] (HIVE-12885) LDAP Authenticator improvements
[ https://issues.apache.org/jira/browse/HIVE-12885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107050#comment-15107050 ] Hive QA commented on HIVE-12885: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12783013/HIVE-12885.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10025 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.TestTxnCommands.exchangePartition org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6672/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6672/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6672/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12783013 - PreCommit-HIVE-TRUNK-Build > LDAP Authenticator improvements > --- > > Key: HIVE-12885 > URL: https://issues.apache.org/jira/browse/HIVE-12885 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Attachments: HIVE-12885.2.patch, HIVE-12885.patch > > > Currently Hive's LDAP authentication provider assumes certain defaults to keep its > configuration simple. > 1) One assumption is the presence of a "distinguishedName" attribute. > In certain non-standard LDAP implementations, this > attribute may not be available. Since getNameInNamespace() returns the same > value, that API should be used instead of basing all LDAP searches on this > attribute. > 2) It also assumes that the "user" value being passed in will be able to > bind to LDAP. However, certain LDAP implementations, by default, only allow > the full DN to be used; short user names are not permitted. We will need > to support short names too when the Hive configuration only has > "BaseDN" specified (not userDNPatterns). So instead of hard-coding "uid" or > "CN" as keys for the short usernames, it is probably better to make this a > configurable parameter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
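Point 2 of the description above can be illustrated with a small sketch: rather than hard-coding "uid" or "CN", the attribute used to locate a user by short name becomes a parameter and the LDAP search filter is built from it. This is a hypothetical illustration in Python; the names here are not the actual HiveServer2 configuration keys or code.

```python
# Hypothetical sketch: build an LDAP search filter for a short user name
# using a configurable naming attribute instead of a fixed "uid"/"CN".

def build_user_filter(short_name, id_attr="uid", object_class="person"):
    """Return an LDAP filter locating a user by short name, with the
    naming attribute supplied by configuration rather than hard-coded."""
    return "(&({}={})(objectClass={}))".format(id_attr, short_name, object_class)

# With a non-standard directory that names users by CN:
print(build_user_filter("jdoe", id_attr="CN"))
# (&(CN=jdoe)(objectClass=person))
```

The resolved entry's full DN (e.g. via getNameInNamespace() in JNDI, per point 1) would then be used for the actual bind, which also covers directories that reject short-name binds.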
[jira] [Updated] (HIVE-12446) Tracking jira for changes required for move to Tez 0.8.2
[ https://issues.apache.org/jira/browse/HIVE-12446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-12446: -- Attachment: HIVE-12446.02.patch > Tracking jira for changes required for move to Tez 0.8.2 > > > Key: HIVE-12446 > URL: https://issues.apache.org/jira/browse/HIVE-12446 > Project: Hive > Issue Type: Task > Components: llap >Reporter: Siddharth Seth > Attachments: HIVE-12446.02.patch, HIVE-12446.combined.1.patch, > HIVE-12446.combined.1.txt, HIVE-12446.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)