[jira] [Commented] (HIVE-10304) Add deprecation message to HiveCLI
[ https://issues.apache.org/jira/browse/HIVE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492794#comment-14492794 ] Szehon Ho commented on HIVE-10304: -- Hi Ferdinand, thanks for the suggestion. In my opinion, we should avoid adding a link to the Hive wiki in the Hive source code, as such links are bothersome to find and change if the wiki pages move (it involves compiling, running tests, etc.). The existing source code contains no links to the Hive wiki, although the README does. Hence I think it is better to add this information to the README; maybe one of us can do that in a follow-up JIRA. The test failures do not look related. Add deprecation message to HiveCLI -- Key: HIVE-10304 URL: https://issues.apache.org/jira/browse/HIVE-10304 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-10304.2.patch, HIVE-10304.3.patch, HIVE-10304.patch As Beeline is now the recommended command-line tool for Hive, we should add a message to HiveCLI indicating that it is deprecated and redirecting users to Beeline. This does not suggest removing HiveCLI for now; it is just a helpful pointer so users know that attention is now focused on Beeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases
[ https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-10319: --- Attachment: (was: HIVE-10319.patch) Hive CLI startup takes a long time with a large number of databases --- Key: HIVE-10319 URL: https://issues.apache.org/jira/browse/HIVE-10319 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.0.0 Reporter: Nezih Yigitbasi The Hive CLI takes a long time to start when there is a large number of databases in the DW. I think the root cause is the way permanent UDFs are loaded from the metastore. When I looked at the logs and the source code I saw that at startup Hive first gets all the databases from the metastore and then, for each database, makes a metastore call to get the permanent functions for that database [see Hive.java | https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185]. So the number of metastore calls made is on the order of the number of databases. In production we have several hundred databases, so Hive makes several hundred RPC calls during startup, taking 30+ seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
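The startup pattern described in HIVE-10319 (one metastore RPC per database) can be sketched as a small call-counting simulation. `MetastoreClient` below is a hypothetical stand-in for Hive's Thrift metastore client, and `getAllFunctions` is an assumed batched API for contrast, not a call the report confirms exists:

```java
import java.util.*;

public class StartupCalls {
    // Hypothetical stand-in for the metastore Thrift client; it only counts round trips.
    static class MetastoreClient {
        int rpcCount = 0;
        List<String> getAllDatabases() { rpcCount++; return Arrays.asList("db1", "db2", "db3"); }
        List<String> getFunctions(String db) { rpcCount++; return Collections.emptyList(); }
        List<String> getAllFunctions() { rpcCount++; return Collections.emptyList(); } // assumed batched variant
    }

    // Behaviour described in the report: one RPC to list databases, then one per database.
    static int perDatabaseCalls(MetastoreClient c) {
        for (String db : c.getAllDatabases()) {
            c.getFunctions(db);
        }
        return c.rpcCount;
    }

    // A batched alternative: a single RPC returning all permanent functions at once.
    static int batchedCalls(MetastoreClient c) {
        c.getAllFunctions();
        return c.rpcCount;
    }

    public static void main(String[] args) {
        System.out.println(perDatabaseCalls(new MetastoreClient())); // 1 + number of databases
        System.out.println(batchedCalls(new MetastoreClient()));     // always 1
    }
}
```

With N databases the left-hand path costs N+1 round trips while the batched path stays constant, which matches the 30+ second startups reported for several hundred databases.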
[jira] [Updated] (HIVE-10320) CBO (Calcite Return Path): Disable choosing streaming side at join creation time [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10320: --- Attachment: HIVE-10320.cbo.patch CBO (Calcite Return Path): Disable choosing streaming side at join creation time [CBO branch] - Key: HIVE-10320 URL: https://issues.apache.org/jira/browse/HIVE-10320 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10320.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10313) Literal Decimal ExprNodeConstantDesc should contain value of HiveDecimal instead of String
[ https://issues.apache.org/jira/browse/HIVE-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492815#comment-14492815 ] Hive QA commented on HIVE-10313: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12724977/HIVE-10313.1.patch {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8673 tests executed *Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3406/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3406/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3406/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12724977 - PreCommit-HIVE-TRUNK-Build Literal Decimal ExprNodeConstantDesc should contain value of HiveDecimal instead of String -- Key: HIVE-10313 URL: https://issues.apache.org/jira/browse/HIVE-10313 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.0.0 Reporter: Chaoyu Tang Assignee: Chaoyu Tang Attachments: HIVE-10313.1.patch, HIVE-10313.patch In TypeCheckProcFactory.NumExprProcessor, the ExprNodeConstantDesc is created from strVal:
{code}
else if (expr.getText().endsWith(BD)) { // Literal decimal
  String strVal = expr.getText().substring(0, expr.getText().length() - 2);
  HiveDecimal hd = HiveDecimal.create(strVal);
  int prec = 1;
  int scale = 0;
  if (hd != null) {
    prec = hd.precision();
    scale = hd.scale();
  }
  DecimalTypeInfo typeInfo = TypeInfoFactory.getDecimalTypeInfo(prec, scale);
  return new ExprNodeConstantDesc(typeInfo, strVal);
}
{code}
It should instead use the HiveDecimal: return new ExprNodeConstantDesc(typeInfo, hd); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
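As a sketch of what the literal handling in HIVE-10313 computes, the snippet below derives precision and scale from a decimal literal using java.math.BigDecimal as a stand-in for Hive's HiveDecimal (which wraps BigDecimal); the method name is illustrative and not Hive's API:

```java
import java.math.BigDecimal;

public class DecimalLiteral {
    // BigDecimal stands in here for HiveDecimal.create(...) on the stripped string.
    static int[] precisionAndScale(String literal) {
        // A literal such as "1.25BD" carries a two-character type suffix, removed first.
        String strVal = literal.substring(0, literal.length() - 2);
        BigDecimal hd = new BigDecimal(strVal);
        return new int[] { hd.precision(), hd.scale() };
    }

    public static void main(String[] args) {
        int[] ps = precisionAndScale("1.25BD");
        System.out.println(ps[0] + "," + ps[1]); // precision 3, scale 2
    }
}
```

The reported fix is then to store the typed decimal object (hd) rather than the raw string (strVal) in the ExprNodeConstantDesc, so downstream consumers receive a value matching the decimal type info.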
[jira] [Updated] (HIVE-10322) TestJdbcWithMiniHS2.testNewConnectionConfiguration fails
[ https://issues.apache.org/jira/browse/HIVE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-10322: --- Attachment: HIVE-10322.patch [~thejas], [~hsubramaniyan] and [~szehon], could you take a look at the patch to fix the test? It is straightforward and just removes the no-longer-available property. TestJdbcWithMiniHS2.testNewConnectionConfiguration fails Key: HIVE-10322 URL: https://issues.apache.org/jira/browse/HIVE-10322 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.2.0 Reporter: Chaoyu Tang Assignee: Chaoyu Tang Priority: Trivial Attachments: HIVE-10322.patch The test org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration failed with the following error:
{code}
org.apache.hive.service.cli.HiveSQLException: Failed to open new session: org.apache.hive.service.cli.HiveSQLException: java.lang.IllegalArgumentException: hive configuration hive.server2.thrift.http.max.worker.threads does not exists.
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:243)
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:234)
at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:513)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:188)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:233)
at org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration(TestJdbcWithMiniHS2.java:275)
Caused by: org.apache.hive.service.cli.HiveSQLException: Failed to open new session: org.apache.hive.service.cli.HiveSQLException: java.lang.IllegalArgumentException: hive configuration hive.server2.thrift.http.max.worker.threads does not exists.
{code}
It seems related to HIVE-10271 (which removed the hive.server2.thrift.http.min/max.worker.threads properties). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.
[ https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492966#comment-14492966 ] Chris Nauroth commented on HIVE-9736: - Thank you for the rebased patch. It looks great to me overall. I've entered a few comments in ReviewBoard for your consideration regarding consolidation of RPC calls and a few other minor things. https://reviews.apache.org/r/31615/ StorageBasedAuthProvider should batch namenode-calls where possible. Key: HIVE-9736 URL: https://issues.apache.org/jira/browse/HIVE-9736 Project: Hive Issue Type: Bug Components: Metastore, Security Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch Consider a table partitioned by 2 keys (dt, region). Say a dt partition could have >1 associated regions. Consider that the user does:
{code:sql}
ALTER TABLE my_table DROP PARTITION (dt='20150101');
{code}
As things stand now, {{StorageBasedAuthProvider}} will make individual {{DistributedFileSystem.listStatus()}} calls for each partition directory and authorize each one separately. It'd be faster to batch the calls and examine multiple FileStatus objects at once. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
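One way to batch the per-partition listStatus calls described in HIVE-9736 is to group partition directories by parent and list each parent once: a single listStatus of the dt=20150101 directory returns the FileStatus of every region subdirectory. The call-counting sketch below uses a hypothetical `FakeFs`, not the actual StorageBasedAuthProvider code:

```java
import java.util.*;

public class BatchedAuth {
    // Hypothetical stand-in for DistributedFileSystem; it only counts listStatus calls.
    static class FakeFs {
        int calls = 0;
        List<String> listStatus(String dir) { calls++; return Collections.emptyList(); }
    }

    static String parent(String path) {
        return path.substring(0, path.lastIndexOf('/'));
    }

    // Current behaviour described in the issue: one listStatus per partition directory.
    static int perPartition(FakeFs fs, List<String> partitionDirs) {
        for (String p : partitionDirs) fs.listStatus(p);
        return fs.calls;
    }

    // Batched alternative: one listStatus per distinct parent directory, since the
    // parent listing already yields a FileStatus for every child partition at once.
    static int perParent(FakeFs fs, List<String> partitionDirs) {
        Set<String> parents = new TreeSet<>();
        for (String p : partitionDirs) parents.add(parent(p));
        for (String dir : parents) fs.listStatus(dir);
        return fs.calls;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
            "/wh/t/dt=20150101/region=us",
            "/wh/t/dt=20150101/region=eu",
            "/wh/t/dt=20150101/region=ap");
        System.out.println(perPartition(new FakeFs(), dirs)); // 3 calls
        System.out.println(perParent(new FakeFs(), dirs));    // 1 call
    }
}
```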
[jira] [Updated] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases
[ https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-10319: --- Attachment: HIVE-10319.patch Hive CLI startup takes a long time with a large number of databases --- Key: HIVE-10319 URL: https://issues.apache.org/jira/browse/HIVE-10319 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.0.0 Reporter: Nezih Yigitbasi Attachments: HIVE-10319.patch The Hive CLI takes a long time to start when there is a large number of databases in the DW. I think the root cause is the way permanent UDFs are loaded from the metastore. When I looked at the logs and the source code I saw that at startup Hive first gets all the databases from the metastore and then, for each database, makes a metastore call to get the permanent functions for that database [see Hive.java | https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185]. So the number of metastore calls made is on the order of the number of databases. In production we have several hundred databases, so Hive makes several hundred RPC calls during startup, taking 30+ seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-5307) NPE in JsonMetaDataFormatter if Table Path is null
[ https://issues.apache.org/jira/browse/HIVE-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reuben Kuhnert reassigned HIVE-5307: Assignee: Reuben Kuhnert NPE in JsonMetaDataFormatter if Table Path is null --- Key: HIVE-5307 URL: https://issues.apache.org/jira/browse/HIVE-5307 Project: Hive Issue Type: Bug Components: HCatalog, Views Affects Versions: 0.11.0 Reporter: Branky Shao Assignee: Reuben Kuhnert Priority: Trivial When I try to get a table (actually a view) description from HCatalog in HUE, an NPE is thrown:
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.DDLTask.showTableStatus(DDLTask.java:2758)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:347)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
at org.apache.hcatalog.cli.HCatDriver.run(HCatDriver.java:43)
at org.apache.hcatalog.cli.HCatCli.processCmd(HCatCli.java:251)
at org.apache.hcatalog.cli.HCatCli.processLine(HCatCli.java:205)
at org.apache.hcatalog.cli.HCatCli.main(HCatCli.java:164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.putFileSystemsStats(JsonMetaDataFormatter.java:303)
at org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.makeOneTableStatus(JsonMetaDataFormatter.java:257)
at org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.makeAllTableStatus(JsonMetaDataFormatter.java:209)
at org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.showTableStatus(JsonMetaDataFormatter.java:192)
at org.apache.hadoop.hive.ql.exec.DDLTask.showTableStatus(DDLTask.java:2745)
... 15 more
Digging into the implementation of JsonMetaDataFormatter, I think org.apache.hadoop.hive.ql.metadata.formatting.JsonMetaDataFormatter.putFileSystemsStats should handle the case where the table path is null (as for a view). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10320) CBO (Calcite Return Path): Disable choosing streaming side at join creation time [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-10320. - Resolution: Fixed Committed to branch. CBO (Calcite Return Path): Disable choosing streaming side at join creation time [CBO branch] - Key: HIVE-10320 URL: https://issues.apache.org/jira/browse/HIVE-10320 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10320.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics
[ https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-10228: Attachment: HIVE-10228.2.patch Updated patch. Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics -- Key: HIVE-10228 URL: https://issues.apache.org/jira/browse/HIVE-10228 Project: Hive Issue Type: Sub-task Components: Import/Export Affects Versions: 1.2.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-10228.2.patch, HIVE-10228.patch We need to update a couple of Hive commands to support replication semantics. To wit, we need the following: EXPORT ... [FOR [METADATA] REPLICATION(“comment”)] Export will now support an extra optional clause to tell it that this export is being prepared for the purpose of replication. There is also an additional optional clause here that allows the export to be a metadata-only export, to handle cases of capturing the diff for alter statements, for example. Also, if done for replication, the absence of a table, or a table being a view/offline table/non-native table, is not considered an error and instead results in a successful no-op. IMPORT ... (as normal) – but handles new semantics. No syntax changes for import, but import will have to change to handle all the possible permutations of export dumps. Also, import will have to ensure that it updates the object only if the update being imported is not older than the current state of the object. DROP TABLE ... FOR REPLICATION('eventid') Drop Table now has an additional clause to specify that this drop is being done for replication purposes, and that the drop should not actually drop the table if the table is newer than the specified event id. ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid') Similarly, Drop Partition has an equivalent change to Drop Table. 
In addition, we introduce a new property, repl.last.id, which, when tagged onto table properties or partition properties on a replication destination, holds the effective state identifier of the object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
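The repl.last.id semantics described in HIVE-10228 (apply an incoming replication event only if it is newer than the object's recorded state, otherwise no-op) reduce to a simple monotonic comparison. The method name and ids below are illustrative, not Hive API:

```java
public class ReplCheck {
    // Sketch of the "newer than the recorded state" rule: a drop/import tagged with
    // an event id takes effect only if it is strictly newer than repl.last.id.
    static boolean shouldApply(long objectLastReplId, long incomingEventId) {
        return incomingEventId > objectLastReplId;
    }

    public static void main(String[] args) {
        System.out.println(shouldApply(100L, 150L)); // newer event: apply it
        System.out.println(shouldApply(200L, 150L)); // stale event: successful no-op
    }
}
```

The strict inequality also makes replays idempotent: re-delivering the same event a second time finds repl.last.id already at that value and does nothing.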
[jira] [Resolved] (HIVE-10314) CBO (Calcite Return Path): TOK_ALLCOLREF not being replaced in GroupBy clause [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-10314. - Resolution: Fixed Committed to branch. Thanks, Jesus! CBO (Calcite Return Path): TOK_ALLCOLREF not being replaced in GroupBy clause [CBO branch] -- Key: HIVE-10314 URL: https://issues.apache.org/jira/browse/HIVE-10314 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10314.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10320) CBO (Calcite Return Path): Disable choosing streaming side at join creation time [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492793#comment-14492793 ] Jesus Camacho Rodriguez commented on HIVE-10320: [~ashutoshc], can this one go in for the QA run too? This is the reason why the call to HiveMDRelSize was triggered. CC'd [~jpullokkaran] CBO (Calcite Return Path): Disable choosing streaming side at join creation time [CBO branch] - Key: HIVE-10320 URL: https://issues.apache.org/jira/browse/HIVE-10320 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10320.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10202) Beeline outputs prompt+query on standard output when used in non-interactive mode
[ https://issues.apache.org/jira/browse/HIVE-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492971#comment-14492971 ] Lefty Leverenz commented on HIVE-10202: --- Nope, no output example in the docs. But the Beeline Command Options section includes some usage notes and bug fixes, so it could be mentioned there. The current text is: {quote} Reduce the amount of informational messages displayed (true) or not (false). It also stops displaying the log messages for the query from HiveServer2 (Hive 0.14 and later). Default is false. Usage: beeline --silent=true {quote} * [HiveServer2 Clients -- Beeline Command Options | https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions] Thanks [~spena]. Beeline outputs prompt+query on standard output when used in non-interactive mode - Key: HIVE-10202 URL: https://issues.apache.org/jira/browse/HIVE-10202 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Sergio Peña Assignee: Naveen Gangam Fix For: 1.2.0 Attachments: HIVE-10202.patch When passing a SQL script file to the Hive CLI, the prompt+query is not sent to standard output or standard error. This is fine because users may want to send only the query results to standard output and parse the results from it. In the case of BeeLine, the prompt+query is sent to standard output, forcing extra parsing in user scripts to skip over the prompt+query. Another drawback is on the security side: sensitive queries are logged directly to the files where standard output is redirected. How to reproduce:
{noformat}
$ cat /tmp/query.sql
select * from test limit 1;
$ beeline --showheader=false --outputformat=tsv2 -u jdbc:hive2://localhost:1 -f /tmp/query.sql > /tmp/output.log 2> /tmp/error.log
$ cat /tmp/output.log
0: jdbc:hive2://localhost:1> select *
. . . . . . . . . . . . . . . . > from test
. . . . . . . . . . . . . . . . > limit 1;
451  451.713  false  y2dh7  [866,528,936]
0: jdbc:hive2://localhost:1>
{noformat}
We should avoid sending the prompt+query to standard output/error whenever a script file is passed to BeeLine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)
[ https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492903#comment-14492903 ] Reuben Kuhnert commented on HIVE-10190: --- Actually, I just found a simple solution to #1 as well. I'll put a new patch up here shortly. CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE) - Key: HIVE-10190 URL: https://issues.apache.org/jira/browse/HIVE-10190 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Reuben Kuhnert Priority: Trivial Labels: perfomance Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch, HIVE-10190.02.patch, HIVE-10190.03.patch
{code}
public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
  String astTree = ast.toStringTree();
  // if any of following tokens are present in AST, bail out
  String[] tokens = { TOK_CHARSETLITERAL, TOK_TABLESPLITSAMPLE };
  for (String token : tokens) {
    if (astTree.contains(token)) {
      return false;
    }
  }
  return true;
}
{code}
This is an issue for a SQL query which is bigger in AST form than in text (~700kb). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
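A cheaper alternative to the pattern flagged in HIVE-10190 (rendering the whole AST to a ~700 kB string and substring-searching it) is to walk the tree and compare token types directly, stopping at the first match. The `Node` class below is a minimal stand-in for Hive's ASTNode, and the token constants are made-up values for illustration:

```java
import java.util.*;

public class AstTokenCheck {
    // Minimal AST node; Hive's ASTNode similarly carries an integer token type.
    static class Node {
        final int type;
        final List<Node> children = new ArrayList<>();
        Node(int type) { this.type = type; }
        Node add(Node c) { children.add(c); return this; }
    }

    // Iterative depth-first search over token types: no string rendering, and it
    // short-circuits as soon as a banned token is found.
    static boolean containsToken(Node root, Set<Integer> banned) {
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (banned.contains(n.type)) return true;
            for (Node c : n.children) stack.push(c);
        }
        return false;
    }

    public static void main(String[] args) {
        final int TOK_TABLESPLITSAMPLE = 1, TOK_SELECT = 2, TOK_TABREF = 3; // illustrative ids
        Node ast = new Node(TOK_SELECT)
            .add(new Node(TOK_TABREF).add(new Node(TOK_TABLESPLITSAMPLE)));
        System.out.println(containsToken(ast, new HashSet<>(Arrays.asList(TOK_TABLESPLITSAMPLE)))); // true
    }
}
```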
[jira] [Commented] (HIVE-9622) Getting NPE when trying to restart HS2 when metastore is configured to use org.apache.hadoop.hive.thrift.DBTokenStore
[ https://issues.apache.org/jira/browse/HIVE-9622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492974#comment-14492974 ] Swarnim Kulkarni commented on HIVE-9622: [~brocknoland] Is there a current workaround for this issue? Getting NPE when trying to restart HS2 when metastore is configured to use org.apache.hadoop.hive.thrift.DBTokenStore - Key: HIVE-9622 URL: https://issues.apache.org/jira/browse/HIVE-9622 Project: Hive Issue Type: Bug Reporter: Aihua Xu Assignee: Aihua Xu Labels: HiveServer2, Security Fix For: 1.2.0 Attachments: HIVE-9622.1.patch, HIVE-9622.2.patch
# Configure the cluster to use Kerberos for HS2 and the metastore.
## http://www.cloudera.com/content/cloudera/en/documentation/cdh4/v4-3-0/CDH4-Security-Guide/cdh4sg_topic_9_1.html
## http://www.cloudera.com/content/cloudera/en/documentation/cdh4/v4-6-0/CDH4-Security-Guide/cdh4sg_topic_9_2.html
# Set the hive metastore delegation token store to org.apache.hadoop.hive.thrift.DBTokenStore in hive-site.xml:
{code}
<property>
  <name>hive.cluster.delegation.token.store.class</name>
  <value>org.apache.hadoop.hive.thrift.DBTokenStore</value>
</property>
{code}
# Then, when trying to restart the Hive service, HS2 fails to start with the NPE below:
{code}
9:43:10.711 AM ERROR org.apache.hive.service.cli.thrift.ThriftCLIService
Error: org.apache.thrift.transport.TTransportException: Failed to start token manager
at org.apache.hive.service.auth.HiveAuthFactory.<init>(HiveAuthFactory.java:107)
at org.apache.hive.service.cli.thrift.ThriftBinaryCLIService.run(ThriftBinaryCLIService.java:51)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed to initialize master key
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.startThreads(TokenStoreDelegationTokenSecretManager.java:223)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server.startDelegationTokenSecretManager(HadoopThriftAuthBridge20S.java:438)
at org.apache.hive.service.auth.HiveAuthFactory.<init>(HiveAuthFactory.java:105)
... 2 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.startThreads(TokenStoreDelegationTokenSecretManager.java:221)
... 4 more
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.thrift.DBTokenStore.invokeOnRawStore(DBTokenStore.java:145)
at org.apache.hadoop.hive.thrift.DBTokenStore.addMasterKey(DBTokenStore.java:41)
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.logUpdateMasterKey(TokenStoreDelegationTokenSecretManager.java:203)
at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.updateCurrentKey(AbstractDelegationTokenSecretManager.java:339)
... 9 more
9:43:10.719 AM INFO org.apache.hive.service.server.HiveServer2
SHUTDOWN_MSG: / SHUTDOWN_MSG: Shutting down HiveServer2 at a1909.halxg.cloudera.com/10.20.202.109 /
{code}
The problem appears to be that we didn't pass a {{RawStore}} object in the following: https://github.com/apache/hive/blob/trunk/service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java#L111 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases
[ https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492980#comment-14492980 ] Hive QA commented on HIVE-10319: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12724989/HIVE-10319.patch {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8674 tests executed *Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3407/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3407/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3407/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12724989 - PreCommit-HIVE-TRUNK-Build Hive CLI startup takes a long time with a large number of databases --- Key: HIVE-10319 URL: https://issues.apache.org/jira/browse/HIVE-10319 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.0.0 Reporter: Nezih Yigitbasi Attachments: HIVE-10319.patch The Hive CLI takes a long time to start when there is a large number of databases in the DW. I think the root cause is the way permanent UDFs are loaded from the metastore. When I looked at the logs and the source code I saw that at startup Hive first gets all the databases from the metastore and then, for each database, makes a metastore call to get the permanent functions for that database [see Hive.java | https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185]. So the number of metastore calls made is on the order of the number of databases. In production we have several hundred databases, so Hive makes several hundred RPC calls during startup, taking 30+ seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics
[ https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492982#comment-14492982 ] Hive QA commented on HIVE-10228: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12724991/HIVE-10228.2.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3408/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3408/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3408/ Messages: {noformat} This message was trimmed, see log for full details {noformat}
[jira] [Updated] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases
[ https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-10319: --- Attachment: HIVE-10319.patch Hive CLI startup takes a long time with a large number of databases --- Key: HIVE-10319 URL: https://issues.apache.org/jira/browse/HIVE-10319 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.0.0 Reporter: Nezih Yigitbasi Attachments: HIVE-10319.patch The Hive CLI takes a long time to start when there is a large number of databases in the DW. I think the root cause is the way permanent UDFs are loaded from the metastore. When I looked at the logs and the source code I see that at startup Hive first gets all the databases from the metastore and then for each database it makes a metastore call to get the permanent functions for that database [see Hive.java | https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185]. So the number of metastore calls made is in the order of the number of databases. In production we have several hundreds of databases so Hive makes several hundreds of RPC calls during startup, taking 30+ seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10202) Beeline outputs prompt+query on standard output when used in non-interactive mode
[ https://issues.apache.org/jira/browse/HIVE-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492721#comment-14492721 ] Sergio Peña commented on HIVE-10202: Hi [~leftylev]. This just changed the output of Beeline to match the Hive CLI output, but only when '--silent=true' is set. If we have a doc section that shows an output example for '--silent=true', then we should make that change in it; if not, I think it is OK not to document this. If we do have a doc section for this, could you post the URL so that I can take a look? Beeline outputs prompt+query on standard output when used in non-interactive mode - Key: HIVE-10202 URL: https://issues.apache.org/jira/browse/HIVE-10202 Project: Hive Issue Type: Bug Affects Versions: 1.0.0 Reporter: Sergio Peña Assignee: Naveen Gangam Fix For: 1.2.0 Attachments: HIVE-10202.patch When passing a SQL script file to the Hive CLI, the prompt+query is sent to neither standard output nor standard error. This is totally fine because users might want to send only the query results to standard output and parse the results from it. In the case of BeeLine, the prompt+query is sent to standard output, forcing extra parsing in user scripts to skip over the prompt+query. Another drawback is on the security side: sensitive queries are logged directly to the files where standard output is redirected. How to reproduce: {noformat}
$ cat /tmp/query.sql
select * from test limit 1;
$ beeline --showheader=false --outputformat=tsv2 -u jdbc:hive2://localhost:1 -f /tmp/query.sql > /tmp/output.log 2> /tmp/error.log
$ cat /tmp/output.log
0: jdbc:hive2://localhost:1> select *
. . . . . . . . . . . . . . . . > from test
. . . . . . . . . . . . . . . . > limit 1;
451 451.713 false y2dh7 [866,528,936]
0: jdbc:hive2://localhost:1>
{noformat} We should avoid sending the prompt+query to standard output/error whenever a script file is passed to BeeLine.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
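The fix described above amounts to gating the prompt+query echo on whether BeeLine is running interactively. A minimal sketch with hypothetical names (this is not BeeLine's actual API):

```java
public class PromptEcho {
    // Hypothetical sketch: emit the prompt+query echo only for interactive
    // sessions; in script mode return nothing, so stdout carries query
    // results only and sensitive queries never reach redirected output.
    static String echo(String prompt, String query, boolean interactive) {
        return interactive ? prompt + "> " + query : "";
    }
}
```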
[jira] [Commented] (HIVE-10066) Hive on Tez job submission through WebHCat doesn't ship Tez artifacts
[ https://issues.apache.org/jira/browse/HIVE-10066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492820#comment-14492820 ] Eugene Koifman commented on HIVE-10066: --- HIVE-9486 added dependency on hive-common.jar Hive on Tez job submission through WebHCat doesn't ship Tez artifacts - Key: HIVE-10066 URL: https://issues.apache.org/jira/browse/HIVE-10066 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-10066.2.patch, HIVE-10066.3.patch, HIVE-10066.patch From [~hitesh]: Tez is a client-side only component ( no daemons, etc ) and therefore it is meant to be installed on the gateway box ( or where its client libraries are needed by any other services’ daemons). It does not have any cluster dependencies both in terms of libraries/jars as well as configs. When it runs on a worker node, everything was pre-packaged and made available to the worker node via the distributed cache via the client code. Hence, its client-side configs are also only needed on the same (client) node as where it is installed. The only other install step needed is to have the tez tarball be uploaded to HDFS and the config has an entry “tez.lib.uris” which points to the HDFS path. We need a way to pass client jars and tez-site.xml to the LaunchMapper. We should create a general purpose mechanism here which can supply additional artifacts per job type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10242) ACID: insert overwrite prevents create table command
[ https://issues.apache.org/jira/browse/HIVE-10242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492810#comment-14492810 ] Eugene Koifman commented on HIVE-10242: --- 1. The WebHCat stuff is there because I used it for testing. It has a 1-line script to start hive services, etc. So the additions there are to be able to run it with a MySQL-based metastore to test concurrency. So this is not a product change, but it's relevant. I can split it into a separate patch if necessary. 2. The default in the LockInfo constructor matches the previous implementation. Without the default, the switch would just fall off the end and type and state would get default values, i.e. null. 3. I find immutable objects easier to understand. There is no specific functional reason in this case. 4. lockTypeComparator static final in LockInfoComparator - there is exactly the same number of instances of both, that is 1 per TxnHandler.checkLock(). Also, checkLock() issues SQL queries and most likely is being called over the network from a remote client. It seems unlikely to make any noticeable difference (even if we knew for a fact that the compiler won't inline it). So I thought code readability was more important here. 5. The changes in DbLockManager modify a package-level method to make testing easier. This allows test code to attempt to acquire locks but not block if they cannot be acquired. It's the same idea as TxnHandler.numLocksInLockTable(). ACID: insert overwrite prevents create table command Key: HIVE-10242 URL: https://issues.apache.org/jira/browse/HIVE-10242 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-10242.2.patch, HIVE-10242.3.patch, HIVE-10242.patch 1. insert overwrite table DB.T1 select ... from T2: this takes an X lock on DB.T1 and an S lock on T2. The X lock makes sense because we don't want anyone reading T1 while it's overwritten.
The S lock on T2 prevents it from being dropped while the query is in progress. 2. create table DB.T3: takes an S lock on DB. This S lock gets blocked by the X lock on T1. The S lock prevents the DB from being dropped while create table is executed. If the insert statement is long-running, this blocks DDL ops on the same database. This is a usability issue. There is no good reason why an X lock on a table within a DB and an S lock on the DB should be in conflict. (This is different from a situation where the X lock is on a partition and the S lock is on the table to which this partition belongs. There it makes sense. Basically, there is no SQL way to address all tables in a DB, but you can easily refer to all partitions of a table.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
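The lock-compatibility argument above can be sketched as follows. The enums, the dotted-path convention, and the prefix-based containment check are illustrative only (and deliberately simplified), not Hive's actual TxnHandler/DbLockManager types:

```java
// Illustrative lock levels, coarsest first.
enum LockLevel { DB, TABLE, PARTITION }
enum LockType { SHARED, EXCLUSIVE }

class LockRequest {
    final LockLevel level; final LockType type; final String path;
    LockRequest(LockLevel level, LockType type, String path) {
        this.level = level; this.type = type; this.path = path;
    }
}

public class LockCompat {
    // Returns true when the two requests may be granted concurrently.
    static boolean compatible(LockRequest a, LockRequest b) {
        // Locks on unrelated objects never conflict. (A plain prefix check
        // is a simplification; e.g. "db1.t10" vs "db1.t1" would need care.)
        if (!a.path.startsWith(b.path) && !b.path.startsWith(a.path)) return true;
        // Two shared locks always coexist.
        if (a.type == LockType.SHARED && b.type == LockType.SHARED) return true;
        // The relaxation argued for above: a DB-level S lock (taken by
        // CREATE TABLE only to keep the DB from being dropped) coexists
        // with a TABLE-level X lock inside that DB.
        LockRequest coarse = a.level.ordinal() <= b.level.ordinal() ? a : b;
        LockRequest fine = coarse == a ? b : a;
        if (coarse.level == LockLevel.DB && coarse.type == LockType.SHARED
                && fine.level == LockLevel.TABLE) return true;
        // Everything else (e.g. partition X vs. table S) still conflicts.
        return false;
    }
}
```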
[jira] [Updated] (HIVE-10268) Merge cbo branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-10268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10268: Attachment: HIVE-10268.2.patch After a few fixes. Merge cbo branch into trunk --- Key: HIVE-10268 URL: https://issues.apache.org/jira/browse/HIVE-10268 Project: Hive Issue Type: Task Components: CBO Affects Versions: cbo-branch Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10268.1.patch, HIVE-10268.2.patch, HIVE-10268.patch Merge patch generated on the basis of the diffs of trunk with the cbo-branch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases
[ https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492735#comment-14492735 ] Nezih Yigitbasi commented on HIVE-10319: The attached patch proposes a new metastore function, get_all_functions(), that returns all functions in the metastore. With this change, only a single call is made during startup to get all the functions, instead of O(# of databases) calls. Hive CLI startup takes a long time with a large number of databases --- Key: HIVE-10319 URL: https://issues.apache.org/jira/browse/HIVE-10319 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.0.0 Reporter: Nezih Yigitbasi Attachments: HIVE-10319.patch The Hive CLI takes a long time to start when there is a large number of databases in the DW. I think the root cause is the way permanent UDFs are loaded from the metastore. When I looked at the logs and the source code I see that at startup Hive first gets all the databases from the metastore and then for each database it makes a metastore call to get the permanent functions for that database [see Hive.java | https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185]. So the number of metastore calls made is in the order of the number of databases. In production we have several hundreds of databases so Hive makes several hundreds of RPC calls during startup, taking 30+ seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
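The N+1 call pattern described in this issue, and the batched alternative the patch proposes, can be sketched as follows. MetastoreClient here is an illustrative interface, not the real Thrift client API:

```java
import java.util.*;

// Hypothetical stand-in for the metastore Thrift client.
interface MetastoreClient {
    List<String> getAllDatabases();                       // 1 RPC
    List<String> getFunctions(String db, String pattern); // 1 RPC per db
    List<String> getAllFunctions();                       // 1 RPC total
}

public class FunctionPreload {
    // Before: O(number of databases) metastore round trips at startup.
    static int loadPerDatabase(MetastoreClient client) {
        int rpcCalls = 1; // getAllDatabases
        for (String db : client.getAllDatabases()) {
            client.getFunctions(db, "*");
            rpcCalls++;
        }
        return rpcCalls;
    }

    // After: a single round trip regardless of database count.
    static int loadBatched(MetastoreClient client) {
        client.getAllFunctions();
        return 1;
    }
}
```

With several hundred databases, this turns several hundred RPCs into one, which is exactly where the reported 30+ second startup cost goes.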
[jira] [Updated] (HIVE-10319) Hive CLI startup takes a long time with a large number of databases
[ https://issues.apache.org/jira/browse/HIVE-10319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-10319: --- Attachment: (was: HIVE-10319.patch) Hive CLI startup takes a long time with a large number of databases --- Key: HIVE-10319 URL: https://issues.apache.org/jira/browse/HIVE-10319 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.0.0 Reporter: Nezih Yigitbasi The Hive CLI takes a long time to start when there is a large number of databases in the DW. I think the root cause is the way permanent UDFs are loaded from the metastore. When I looked at the logs and the source code I see that at startup Hive first gets all the databases from the metastore and then for each database it makes a metastore call to get the permanent functions for that database [see Hive.java | https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L162-185]. So the number of metastore calls made is in the order of the number of databases. In production we have several hundreds of databases so Hive makes several hundreds of RPC calls during startup, taking 30+ seconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10313) Literal Decimal ExprNodeConstantDesc should contain value of HiveDecimal instead of String
[ https://issues.apache.org/jira/browse/HIVE-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492863#comment-14492863 ] Chaoyu Tang commented on HIVE-10313: The failure of test index_auto_mult_tables_compact.q does not look related to this patch, and I am not able to reproduce it. The other failures, from TestMinimrCliDriver, are also unrelated; they might be due to some build/infra issue. Literal Decimal ExprNodeConstantDesc should contain value of HiveDecimal instead of String -- Key: HIVE-10313 URL: https://issues.apache.org/jira/browse/HIVE-10313 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.0.0 Reporter: Chaoyu Tang Assignee: Chaoyu Tang Attachments: HIVE-10313.1.patch, HIVE-10313.patch In TypeCheckProcFactory.NumExprProcessor, the ExprNodeConstantDesc is created from strVal: {code}
else if (expr.getText().endsWith(BD)) {
  // Literal decimal
  String strVal = expr.getText().substring(0, expr.getText().length() - 2);
  HiveDecimal hd = HiveDecimal.create(strVal);
  int prec = 1;
  int scale = 0;
  if (hd != null) {
    prec = hd.precision();
    scale = hd.scale();
  }
  DecimalTypeInfo typeInfo = TypeInfoFactory.getDecimalTypeInfo(prec, scale);
  return new ExprNodeConstantDesc(typeInfo, strVal);
}
{code} It should use the HiveDecimal instead: return new ExprNodeConstantDesc(typeInfo, hd); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
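To see why storing strVal is problematic: the decimal type is derived from the parsed value, while the constant keeps the raw string, so the two can disagree on precision/scale. A sketch using java.math.BigDecimal as a stand-in, on the assumption that HiveDecimal normalizes trailing zeros similarly:

```java
import java.math.BigDecimal;

public class DecimalLiteral {
    // Precision/scale as the parser computes them: from the parsed value,
    // with trailing zeros normalized away.
    static int[] parsedType(String strVal) {
        BigDecimal d = new BigDecimal(strVal).stripTrailingZeros();
        return new int[] { d.precision(), d.scale() };
    }

    // Precision/scale implied by the raw literal string that the
    // ExprNodeConstantDesc actually stores.
    static int[] rawType(String strVal) {
        BigDecimal d = new BigDecimal(strVal);
        return new int[] { d.precision(), d.scale() };
    }
}
```

For a literal such as 1.230BD, the computed type is decimal(3,2) while the stored string "1.230" would need decimal(4,3): storing the parsed value keeps the type and the value consistent.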
[jira] [Commented] (HIVE-10322) TestJdbcWithMiniHS2.testNewConnectionConfiguration fails
[ https://issues.apache.org/jira/browse/HIVE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493003#comment-14493003 ] Vaibhav Gumashta commented on HIVE-10322: - +1 TestJdbcWithMiniHS2.testNewConnectionConfiguration fails Key: HIVE-10322 URL: https://issues.apache.org/jira/browse/HIVE-10322 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.2.0 Reporter: Chaoyu Tang Assignee: Chaoyu Tang Priority: Trivial Attachments: HIVE-10322.patch Fix test org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration failed with following error: {code} org.apache.hive.service.cli.HiveSQLException: Failed to open new session: org.apache.hive.service.cli.HiveSQLException: java.lang.IllegalArgumentException: hive configuration hive.server2.thrift.http.max.worker.threads does not exists. at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:243) at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:234) at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:513) at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:188) at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) at java.sql.DriverManager.getConnection(DriverManager.java:571) at java.sql.DriverManager.getConnection(DriverManager.java:233) at org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration(TestJdbcWithMiniHS2.java:275) Caused by: org.apache.hive.service.cli.HiveSQLException: Failed to open new session: org.apache.hive.service.cli.HiveSQLException: java.lang.IllegalArgumentException: hive configuration hive.server2.thrift.http.max.worker.threads does not exists. {code} It seems related to HIVE-10271(remove hive.server2.thrift.http.min/max.worker.threads properties) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10316) same query works with TEXTFILE and fails with ORC
[ https://issues.apache.org/jira/browse/HIVE-10316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493032#comment-14493032 ] Sergey Shelukhin commented on HIVE-10316: - [~prasanth_j] [~owen.omalley] fyi same query works with TEXTFILE and fails with ORC - Key: HIVE-10316 URL: https://issues.apache.org/jira/browse/HIVE-10316 Project: Hive Issue Type: Bug Components: Compression Affects Versions: 0.14.0 Environment: hortonworks HDP 2.2 running on Linux Reporter: Philippe Verhaeghe See also related answer in mailing list: http://mail-archives.apache.org/mod_mbox/hive-user/201504.mbox/%3CD15184D6.27779%25gopal%40hortonworks.com%3E I'm getting an error in Hive when executing a query on a table in ORC format. After several trials, I managed to run the same query on the same table in TEXTFILE format. I've been able to reproduce the error with the simple SQL script below. I create the same table in TEXTFILE and in ORC, and I run a SELECT … GROUP BY on the tables. The first SELECT, issued on the TEXTFILE table, succeeds. The second SELECT, issued on the ORC table, fails. NB: There is a CONCAT in the query.
If I remove the CONCAT the query runs OK with both tables … Example script to reproduce the error:
USE pvr_temp;
DROP TABLE IF EXISTS students_text;
CREATE TABLE students_text (name VARCHAR(64), age INT, datetime TIMESTAMP, gpa DECIMAL(3, 2)) STORED AS TEXTFILE;
INSERT INTO TABLE students_text VALUES ('fred flintstone', 35, '2015-04-13 13:40:00', 1.28), ('barney rubble', 32, '2015-04-13 13:40:00', 2.32);
SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_text GROUP BY CONCAT(TO_DATE(datetime), '-');
DROP TABLE IF EXISTS students_orc;
CREATE TABLE students_orc (name VARCHAR(64), age INT, datetime TIMESTAMP, gpa DECIMAL(3, 2)) STORED AS ORC;
INSERT INTO TABLE students_orc VALUES ('fred flintstone', 35, '2015-04-13 13:40:00', 1.28), ('barney rubble', 32, '2015-04-13 13:40:00', 2.32);
SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_orc GROUP BY CONCAT(TO_DATE(datetime), '-');
Log where you can see the error: [pvr@tpcalr01s ~]$ cat test.log
scan complete in 9ms
Connecting to jdbc:hive2://tpcrmm03s:1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hive/lib/hive-jdbc-0.14.0.2.2.0.0-2041-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Connected to: Apache Hive (version 0.14.0.2.2.0.0-2041)
Driver: Hive JDBC (version 0.14.0.2.2.0.0-2041)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://tpcrmm03s:1> USE pvr_temp;
No rows affected (0.061 seconds)
0: jdbc:hive2://tpcrmm03s:1> DROP TABLE IF EXISTS students_text;
No rows affected (0.12 seconds)
0: jdbc:hive2://tpcrmm03s:1> CREATE TABLE students_text (name VARCHAR(64), age INT, datetime TIMESTAMP, gpa DECIMAL(3, 2)) STORED AS TEXTFILE;
No rows affected (0.057 seconds)
0: jdbc:hive2://tpcrmm03s:1> INSERT INTO TABLE students_text VALUES ('fred flintstone', 35, '2015-04-13 13:40:00', 1.28), ('barney rubble', 32, '2015-04-13 13:40:00', 2.32);
INFO : Tez session hasn't been created yet. Opening session
INFO :
INFO : Status: Running (Executing on YARN cluster with App id application_1428656093356_0047)
INFO : Map 1: -/-
INFO : Map 1: 0/1
No rows affected (14.134 seconds)
INFO : Map 1: 0/1
INFO : Map 1: 0(+1)/1
INFO : Map 1: 0(+1)/1
INFO : Map 1: 1/1
INFO : Loading data to table pvr_temp.students_text from hdfs://tpcrmm01s.priv.atos.fr:8020/tmp/hive/hive/bf19c354-de67-45ae-a3e4-cd57d81acd71/hive_2015-04-13_14-15-08_445_2811483497310651606-20/-ext-1
INFO : Table pvr_temp.students_text stats: [numFiles=1, numRows=2, totalSize=86, rawDataSize=84]
0: jdbc:hive2://tpcrmm03s:1> SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_text GROUP BY CONCAT(TO_DATE(datetime), '-');
INFO : Session is already open
INFO :
INFO : Status: Running (Executing on YARN cluster with App id application_1428656093356_0047)
INFO : Map 1: -/- Reducer 2: 0/1
INFO : Map 1: 0/1 Reducer 2: 0/1
INFO : Map 1: 0(+1)/1 Reducer 2: 0/1
INFO : Map 1: 1/1 Reducer 2: 0(+1)/1
INFO : Map 1: 1/1 Reducer 2: 1/1
+--------------+------+
| _c0          | _c1  |
+--------------+------+
| 2015-04-13-  | 3.6  |
+--------------+------+
1 row selected (3.258 seconds)
0: jdbc:hive2://tpcrmm03s:1> DROP TABLE IF
[jira] [Commented] (HIVE-10309) TestJdbcWithMiniHS2.java broken because of the removal of hive.server2.thrift.http.max.worker.threads
[ https://issues.apache.org/jira/browse/HIVE-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493049#comment-14493049 ] Vaibhav Gumashta commented on HIVE-10309: - Since this is causing precommit to fail, I'll commit the fix right away (the fix is trivial). TestJdbcWithMiniHS2.java broken because of the removal of hive.server2.thrift.http.max.worker.threads -- Key: HIVE-10309 URL: https://issues.apache.org/jira/browse/HIVE-10309 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-10309.1.patch HIVE-10271 removed the hive.server2.thrift.http.min/max.worker.threads properties; however, these properties are used in a few more places in the Hive code, for example TestJdbcWithMiniHS2.java. We need to fix these as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10322) TestJdbcWithMiniHS2.testNewConnectionConfiguration fails
[ https://issues.apache.org/jira/browse/HIVE-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493027#comment-14493027 ] Vaibhav Gumashta commented on HIVE-10322: - Thanks [~hsubramaniyan]. In that case we'll stick to that since it was created before and has had a UT run. I'll close this as dup. TestJdbcWithMiniHS2.testNewConnectionConfiguration fails Key: HIVE-10322 URL: https://issues.apache.org/jira/browse/HIVE-10322 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.2.0 Reporter: Chaoyu Tang Assignee: Chaoyu Tang Priority: Trivial Attachments: HIVE-10322.patch Fix test org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration failed with following error: {code} org.apache.hive.service.cli.HiveSQLException: Failed to open new session: org.apache.hive.service.cli.HiveSQLException: java.lang.IllegalArgumentException: hive configuration hive.server2.thrift.http.max.worker.threads does not exists. at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:243) at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:234) at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:513) at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:188) at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) at java.sql.DriverManager.getConnection(DriverManager.java:571) at java.sql.DriverManager.getConnection(DriverManager.java:233) at org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration(TestJdbcWithMiniHS2.java:275) Caused by: org.apache.hive.service.cli.HiveSQLException: Failed to open new session: org.apache.hive.service.cli.HiveSQLException: java.lang.IllegalArgumentException: hive configuration hive.server2.thrift.http.max.worker.threads does not exists. {code} It seems related to HIVE-10271(remove hive.server2.thrift.http.min/max.worker.threads properties) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10268) Merge cbo branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-10268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10268: Attachment: HIVE-10268.3.patch Merge cbo branch into trunk --- Key: HIVE-10268 URL: https://issues.apache.org/jira/browse/HIVE-10268 Project: Hive Issue Type: Task Components: CBO Affects Versions: cbo-branch Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10268.1.patch, HIVE-10268.2.patch, HIVE-10268.3.patch, HIVE-10268.patch Merge patch generated on basis of diffs of trunk with cbo-branch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10318) The HMS upgrade test does not test patches that affect the upgrade test scripts
[ https://issues.apache.org/jira/browse/HIVE-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493181#comment-14493181 ] Sergio Peña commented on HIVE-10318: [~szehon] [~brocknoland] Could you help me review this patch? The HMS upgrade test does not test patches that affect the upgrade test scripts --- Key: HIVE-10318 URL: https://issues.apache.org/jira/browse/HIVE-10318 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-10318.1.patch When making a change to the HMS upgrade scripts, such as adding a new DB server, the HMS upgrade test does not test that change. To make it work, the patch needs to be committed to trunk first. This is not desired, as we need to test the upgrade change before committing it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10317) The HMS upgrade test fails when applying patches with '-p1' instead of '-p0'
[ https://issues.apache.org/jira/browse/HIVE-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña resolved HIVE-10317. Resolution: Duplicate The HMS upgrade test fails when applying patches with '-p1' instead of '-p0' Key: HIVE-10317 URL: https://issues.apache.org/jira/browse/HIVE-10317 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Sergio Peña Assignee: Sergio Peña Patches uploaded to jira might be applied to the source code using 'patch -p0' or 'patch -p1' directly on the root directory of the branch. The HMS upgrade test is using only '-p1', so some patches failed. We need to support the '-p0' on the HMS upgrade test as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics
[ https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493208#comment-14493208 ] Hive QA commented on HIVE-10228: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725034/HIVE-10228.3.patch {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8677 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3411/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3411/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3411/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12725034 - PreCommit-HIVE-TRUNK-Build Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics -- Key: HIVE-10228 URL: https://issues.apache.org/jira/browse/HIVE-10228 Project: Hive Issue Type: Sub-task Components: Import/Export Affects Versions: 1.2.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch We need to update a couple of hive commands to support replication semantics. To wit, we need the following: EXPORT ... [FOR [METADATA] REPLICATION(“comment”)] Export will now support an extra optional clause to tell it that this export is being prepared for the purpose of replication. There is also an additional optional clause here, that allows for the export to be a metadata-only export, to handle cases of capturing the diff for alter statements, for example. Also, if done for replication, the non-presence of a table, or a table being a view/offline table/non-native table is not considered an error, and instead, will result in a successful no-op. IMPORT ... 
(as normal) – but handles new semantics. No syntax changes for import, but import will have to change to handle all the permutations of export dumps possible. Also, import will have to ensure that it updates the object only if the update being imported is not older than the state of the object. DROP TABLE ... FOR REPLICATION('eventid') Drop Table now has an additional clause to specify that this drop is being done for replication purposes, and that the drop should not actually remove the table if the table is newer than the specified event id. ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid') Similarly, Drop Partition has an equivalent change to Drop Table. In addition, we introduce a new property
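The import-ordering rule described above (apply a replicated update only when it is not older than the object's current state) can be sketched as follows. This is an illustrative sketch, not Hive's implementation; the `event_id` field and the helper names are hypothetical.

```python
# Hedged sketch of the replication ordering rule: an imported update is
# applied only if its event id is not older than the object's current state.

def should_apply_update(current_event_id, incoming_event_id):
    """True when the incoming replication dump is at least as new as the
    object's current state; otherwise the import must be a no-op."""
    return incoming_event_id >= current_event_id

def apply_import(obj_state, incoming_event_id):
    """obj_state is an illustrative dict like {'event_id': 42}."""
    if should_apply_update(obj_state['event_id'], incoming_event_id):
        obj_state['event_id'] = incoming_event_id
        return True   # update applied
    return False      # stale update ignored
```

Under this rule, replaying an out-of-order dump is harmless: the stale update is simply skipped rather than clobbering newer state.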
[jira] [Commented] (HIVE-10243) Introduce JoinAlgorithm Interface
[ https://issues.apache.org/jira/browse/HIVE-10243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493004#comment-14493004 ] Laljo John Pullokkaran commented on HIVE-10243: --- [~jcamachorodriguez] Could you update the patch? I seem to be hitting compile issues. Introduce JoinAlgorithm Interface - Key: HIVE-10243 URL: https://issues.apache.org/jira/browse/HIVE-10243 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Laljo John Pullokkaran Assignee: Jesus Camacho Rodriguez Fix For: 1.2.0 Attachments: HIVE-10243.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10299) Enable new cost model for Tez execution engine
[ https://issues.apache.org/jira/browse/HIVE-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10299: Attachment: HIVE-10299.2.patch With new cost model. Enable new cost model for Tez execution engine -- Key: HIVE-10299 URL: https://issues.apache.org/jira/browse/HIVE-10299 Project: Hive Issue Type: Task Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10299.2.patch, HIVE-10299.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10309) TestJdbcWithMiniHS2.java broken because of the removal of hive.server2.thrift.http.max.worker.threads
[ https://issues.apache.org/jira/browse/HIVE-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493038#comment-14493038 ] Naveen Gangam commented on HIVE-10309: -- Should we revert fix for HIVE-10271 until this is resolved? This is causing the pre-commits to fail. Thanks TestJdbcWithMiniHS2.java broken because of the removal of hive.server2.thrift.http.max.worker.threads -- Key: HIVE-10309 URL: https://issues.apache.org/jira/browse/HIVE-10309 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-10309.1.patch HIVE-10271 removed hive.server2.thrift.http.min/max.worker.threads properties, however these properties are used in a few more places in hive code. For example, TestJdbcWithMiniHS2.java . We need to fix these as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10309) TestJdbcWithMiniHS2.java broken because of the removal of hive.server2.thrift.http.max.worker.threads
[ https://issues.apache.org/jira/browse/HIVE-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493039#comment-14493039 ] Vaibhav Gumashta commented on HIVE-10309: - Just looked at the test logs. TestSSL.testSSLFetchHttp doesn't look like it's related to this patch, but might need separate investigation. +1 for this. TestJdbcWithMiniHS2.java broken because of the removal of hive.server2.thrift.http.max.worker.threads -- Key: HIVE-10309 URL: https://issues.apache.org/jira/browse/HIVE-10309 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-10309.1.patch HIVE-10271 removed hive.server2.thrift.http.min/max.worker.threads properties, however these properties are used in a few more places in hive code. For example, TestJdbcWithMiniHS2.java . We need to fix these as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10272) Some HCat tests fail under windows
[ https://issues.apache.org/jira/browse/HIVE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-10272: Attachment: HIVE-10272.2.patch Updated patch, imports were needed. Some HCat tests fail under windows -- Key: HIVE-10272 URL: https://issues.apache.org/jira/browse/HIVE-10272 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-10272.2.patch, HIVE-10272.patch Some HCat tests fail under windows with errors like this: {noformat} java.lang.RuntimeException: java.lang.IllegalArgumentException: Pathname /D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/scratchdir from D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/scratchdir is not a valid DFS filename. at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:197) at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106) at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305) at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424) at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:594) at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:552) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:504) at org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.setup(TestHCatLoaderEncryption.java:185) {noformat} We need to sanitize HiveConf objects with WindowsPathUtil.convertPathsFromWindowsToHdfs if running under windows before we use them to instantiate a SessionState/Driver -- This message was sent by Atlassian JIRA (v6.3.4#6332)
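The fix described is to run HiveConf path values through WindowsPathUtil.convertPathsFromWindowsToHdfs before creating a SessionState/Driver. As a rough illustration of the kind of normalization involved (a toy sketch only; the real WindowsPathUtil logic may differ), one might strip the drive letter and normalize separators so a path like `/D:/w/hv/...` becomes a valid absolute DFS path:

```python
import re

def to_hdfs_path(win_path):
    """Toy conversion, not Hive's actual implementation: normalize
    backslashes and drop a leading drive letter ("D:" or "/D:") so the
    result is an absolute, DFS-valid path."""
    p = win_path.replace('\\', '/')
    p = re.sub(r'^/?[A-Za-z]:', '', p)  # strip the Windows drive prefix
    return p if p.startswith('/') else '/' + p
```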
[jira] [Commented] (HIVE-9710) HiveServer2 should support cookie based authentication, when using HTTP transport.
[ https://issues.apache.org/jira/browse/HIVE-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493137#comment-14493137 ] Vaibhav Gumashta commented on HIVE-9710: [~hsubramaniyan] I have one minor comment on the new test you added. Rest looks good. HiveServer2 should support cookie based authentication, when using HTTP transport. -- Key: HIVE-9710 URL: https://issues.apache.org/jira/browse/HIVE-9710 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 1.2.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9710.1.patch, HIVE-9710.2.patch, HIVE-9710.3.patch, HIVE-9710.4.patch, HIVE-9710.5.patch, HIVE-9710.6.patch, HIVE-9710.7.patch HiveServer2 should generate cookies and validate the client cookie send to it so that it need not perform User/Password or a Kerberos based authentication on each HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
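The mechanism described – issue a cookie once, then validate it on later HTTP requests so the server can skip user/password or Kerberos authentication each time – can be sketched with an HMAC-signed cookie. This is a generic illustration, not the HIVE-9710 implementation; the secret, cookie format, and expiry are made up.

```python
import hmac, hashlib, time

SECRET = b"server-secret"  # illustrative only; a real server uses a random key

def make_cookie(user, now=None):
    """Issue a signed cookie so subsequent HTTP requests can skip full auth."""
    ts = str(int(now if now is not None else time.time()))
    sig = hmac.new(SECRET, f"{user}:{ts}".encode(), hashlib.sha256).hexdigest()
    return f"{user}:{ts}:{sig}"

def validate_cookie(cookie, max_age=86400, now=None):
    """Return the user name if the cookie is authentic and unexpired, else None."""
    try:
        user, ts, sig = cookie.rsplit(":", 2)
    except ValueError:
        return None
    expected = hmac.new(SECRET, f"{user}:{ts}".encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered or forged cookie
    cur = now if now is not None else time.time()
    if cur - int(ts) > max_age:
        return None  # expired; fall back to full authentication
    return user
```

The point of the design is that per-request validation is a single HMAC computation instead of a round trip to a credential store or KDC.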
[jira] [Updated] (HIVE-10318) The HMS upgrade test does not test patches that affect the upgrade test scripts
[ https://issues.apache.org/jira/browse/HIVE-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-10318: --- Attachment: HIVE-10318.1.patch The HMS upgrade test does not test patches that affect the upgrade test scripts --- Key: HIVE-10318 URL: https://issues.apache.org/jira/browse/HIVE-10318 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-10318.1.patch When a change is made to the HMS upgrade scripts, such as adding a new DB server, the HMS upgrade test does not exercise that change. To make it work, the patch first needs to be committed to trunk. This is not desirable, as we need to test the upgrade change before committing it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10316) same query works with TEXTFILE and fails with ORC
[ https://issues.apache.org/jira/browse/HIVE-10316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493043#comment-14493043 ] Prasanth Jayachandran commented on HIVE-10316: -- Looks like a vectorization issue {code} Unsuported vector output type: StringGroup {code} [~mmccline] fyi.. same query works with TEXTFILE and fails with ORC - Key: HIVE-10316 URL: https://issues.apache.org/jira/browse/HIVE-10316 Project: Hive Issue Type: Bug Components: Compression Affects Versions: 0.14.0 Environment: hortonworks HDP 2.2 running on Linux Reporter: Philippe Verhaeghe See also related answer in mailing list : http://mail-archives.apache.org/mod_mbox/hive-user/201504.mbox/%3CD15184D6.27779%25gopal%40hortonworks.com%3E I’m getting an error in Hive when executing a query on a table in ORC format. After several trials, I succeeded in running the same query on the same table in TEXTFILE format. I've been able to reproduce the error with the simple SQL script below. I create the same table in TEXTFILE and in ORC and I run a SELECT … GROUP BY on the tables. The first SELECT issued on the TEXTFILE table succeeds. The second SELECT issued on the ORC table fails. NB: There is a CONCAT in the query.
If I remove the CONCAT the query is running ok with both tables … Example script to reproduce the error : USE pvr_temp; DROP TABLE IF EXISTS students_text; CREATE TABLE students_text (name VARCHAR(64), age INT, datetime TIMESTAMP, gpa DECIMAL(3, 2)) STORED AS TEXTFILE; INSERT INTO TABLE students_text VALUES ('fred flintstone', 35, '2015-04-13 13:40:00', 1.28), ('barney rubble', 32, '2015-04-13 13:40:00', 2.32); SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_text GROUP BY CONCAT(TO_DATE(datetime), '-'); DROP TABLE IF EXISTS students_orc; CREATE TABLE students_orc (name VARCHAR(64), age INT, datetime TIMESTAMP, gpa DECIMAL(3, 2)) STORED AS ORC; INSERT INTO TABLE students_orc VALUES ('fred flintstone', 35, '2015-04-13 13:40:00', 1.28), ('barney rubble', 32, '2015-04-13 13:40:00', 2.32); SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_orc GROUP BY CONCAT(TO_DATE(datetime), '-'); Log where you can see the error : [pvr@tpcalr01s ~]$ cat test.log scan complete in 9ms Connecting to jdbc:hive2://tpcrmm03s:1 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hive/lib/hive-jdbc-0.14.0.2.2.0.0-2041-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] Connected to: Apache Hive (version 0.14.0.2.2.0.0-2041) Driver: Hive JDBC (version 0.14.0.2.2.0.0-2041) Transaction isolation: TRANSACTION_REPEATABLE_READ 0: jdbc:hive2://tpcrmm03s:1 USE pvr_temp; No rows affected (0.061 seconds) 0: jdbc:hive2://tpcrmm03s:1 DROP TABLE IF EXISTS students_text; No rows affected (0.12 seconds) 0: jdbc:hive2://tpcrmm03s:1 CREATE TABLE students_text (name VARCHAR(64), age INT, datetime TIMESTAMP, gpa DECIMAL(3, 2)) STORED AS TEXTFILE; No rows affected (0.057 seconds) 0: jdbc:hive2://tpcrmm03s:1 INSERT INTO TABLE students_text VALUES ('fred flintstone', 35, '2015-04-13 13:40:00', 1.28), ('barney rubble', 32, '2015-04-13 13:40:00', 2.32); INFO : Tez session hasn't been created yet. Opening session INFO : INFO : Status: Running (Executing on YARN cluster with App id application_1428656093356_0047) INFO : Map 1: -/- INFO : Map 1: 0/1 No rows affected (14.134 seconds) INFO : Map 1: 0/1 INFO : Map 1: 0(+1)/1 INFO : Map 1: 0(+1)/1 INFO : Map 1: 1/1 INFO : Loading data to table pvr_temp.students_text from hdfs://tpcrmm01s.priv.atos.fr:8020/tmp/hive/hive/bf19c354-de67-45ae-a3e4-cd57d81acd71/hive_2015-04-13_14-15-08_445_2811483497310651606-20/-ext-1 INFO : Table pvr_temp.students_text stats: [numFiles=1, numRows=2, totalSize=86, rawDataSize=84] 0: jdbc:hive2://tpcrmm03s:1 SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_text GROUP BY CONCAT(TO_DATE(datetime), '-'); INFO : Session is already open INFO : INFO : Status: Running (Executing on YARN cluster with App id application_1428656093356_0047) INFO : Map 1: -/- Reducer 2: 0/1 INFO : Map 1: 0/1 Reducer 2: 0/1 INFO : Map 1: 0(+1)/1 Reducer 2: 0/1 INFO : Map 1: 1/1 Reducer 2: 0(+1)/1 INFO : Map 1: 1/1 Reducer 2: 1/1 +--+--+--+ | _c0 | _c1 | +--+--+--+ | 2015-04-13- | 3.6 | +--+--+--+
[jira] [Updated] (HIVE-10309) TestJdbcWithMiniHS2.java broken because of the removal of hive.server2.thrift.http.max.worker.threads
[ https://issues.apache.org/jira/browse/HIVE-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-10309: Component/s: JDBC HiveServer2 TestJdbcWithMiniHS2.java broken because of the removal of hive.server2.thrift.http.max.worker.threads -- Key: HIVE-10309 URL: https://issues.apache.org/jira/browse/HIVE-10309 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-10309.1.patch HIVE-10271 removed hive.server2.thrift.http.min/max.worker.threads properties, however these properties are used in a few more places in hive code. For example, TestJdbcWithMiniHS2.java . We need to fix these as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10309) TestJdbcWithMiniHS2.java broken because of the removal of hive.server2.thrift.http.max.worker.threads
[ https://issues.apache.org/jira/browse/HIVE-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493028#comment-14493028 ] Vaibhav Gumashta commented on HIVE-10309: - org.apache.hive.jdbc.TestSSL.testSSLFetchHttp is a related failure. [~hsubramaniyan] this might need some investigation. TestJdbcWithMiniHS2.java broken because of the removal of hive.server2.thrift.http.max.worker.threads -- Key: HIVE-10309 URL: https://issues.apache.org/jira/browse/HIVE-10309 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.0 Attachments: HIVE-10309.1.patch HIVE-10271 removed hive.server2.thrift.http.min/max.worker.threads properties, however these properties are used in a few more places in hive code. For example, TestJdbcWithMiniHS2.java . We need to fix these as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10268) Merge cbo branch into trunk
[ https://issues.apache.org/jira/browse/HIVE-10268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493114#comment-14493114 ] Hive QA commented on HIVE-10268: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725011/HIVE-10268.2.patch {color:red}ERROR:{color} -1 due to 50 failed/errored test(s), 8674 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_colname org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_window org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_resolution org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join33 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_alt_syntax org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf_streaming org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quotedid_basic org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_explain_rewrite org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_6_subq org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_ptf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_streaming org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_join org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ptf 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ptf_streaming org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_in org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_ptf org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join32 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join32_lessSize org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join33 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_alt_syntax org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ptf org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ptf_streaming org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_subquery_in org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_ptf org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration {noformat} Test results:
[jira] [Updated] (HIVE-10233) Hive on LLAP: Memory manager
[ https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-10233: -- Attachment: HIVE-10233-WIP.patch Hive on LLAP: Memory manager Key: HIVE-10233 URL: https://issues.apache.org/jira/browse/HIVE-10233 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10233-WIP.patch We need a memory manager in llap/tez to manage the usage of memory across threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10326) CBO (Calcite Return Path): Invoke Hive's Cumulative Cost
[ https://issues.apache.org/jira/browse/HIVE-10326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran resolved HIVE-10326. --- Resolution: Fixed CBO (Calcite Return Path): Invoke Hive's Cumulative Cost - Key: HIVE-10326 URL: https://issues.apache.org/jira/browse/HIVE-10326 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 1.2.0 Attachments: HIVE-10326.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10308) Vectorization execution throws java.lang.IllegalArgumentException: Unsupported complex type: MAP
[ https://issues.apache.org/jira/browse/HIVE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493304#comment-14493304 ] Matt McCline commented on HIVE-10308: - The Vectorizer class is supposed to *exclude* the MAP data type from vectorization. We should never execute a Map or Reduce vertex with a MAP type. So, I am a -1 on this patch. Some verification path is missing in the Vectorizer class. Vectorization execution throws java.lang.IllegalArgumentException: Unsupported complex type: MAP Key: HIVE-10308 URL: https://issues.apache.org/jira/browse/HIVE-10308 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0, 0.13.1, 1.2.0, 1.1.0 Reporter: Selina Zhang Assignee: Selina Zhang Attachments: HIVE-10308.1.patch Steps to reproduce: CREATE TABLE test_orc (a INT, b MAP<INT, STRING>) STORED AS ORC; INSERT OVERWRITE TABLE test_orc SELECT 1, MAP(1, 'one', 2, 'two') FROM src LIMIT 1; CREATE TABLE test (key INT); INSERT OVERWRITE TABLE test SELECT 1 FROM src LIMIT 1; set hive.vectorized.execution.enabled=true; set hive.auto.convert.join=false; select l.key from test l left outer join test_orc r on (l.key= r.a) where r.a is not null; Stack trace: Caused by: java.lang.IllegalArgumentException: Unsupported complex type: MAP at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.genVectorExpressionWritable(VectorExpressionWriterFactory.java:456) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.processVectorInspector(VectorExpressionWriterFactory.java:1191) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:58) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438) at 
org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:198) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
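The comment above argues that the Vectorizer's validation should have rejected this plan before execution ever started. A minimal sketch of such a gate follows; the names are hypothetical and the real Vectorizer walks operator trees rather than flat type lists, but the idea is the same: refuse vectorization whenever a column uses an unsupported complex type such as MAP.

```python
# Hypothetical validation gate (illustrative, not Hive's Vectorizer):
# vectorize only when every column type is one the vectorized
# operators support; complex types like MAP disqualify the whole plan.

SUPPORTED_TYPES = {'int', 'bigint', 'string', 'double', 'timestamp', 'decimal'}

def can_vectorize(column_types):
    """Return False if any column uses an unsupported (e.g. complex) type.
    'map<int,string>' is reduced to its base name 'map' before the check."""
    return all(t.split('<')[0] in SUPPORTED_TYPES for t in column_types)
```

Failing this check cleanly (falling back to row-mode execution) is preferable to hitting an IllegalArgumentException deep inside operator initialization, as in the stack trace above.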
[jira] [Updated] (HIVE-10288) Cannot call permanent UDFs
[ https://issues.apache.org/jira/browse/HIVE-10288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nezih Yigitbasi updated HIVE-10288: --- Description: Just pulled the trunk and built the hive binary. If I create a permanent udf and exit the cli, and then open the cli and try calling the udf it fails with the exception below. However, the call succeeds if I call the udf right after registering the permanent udf (without exiting the cli). The call also succeeds with the apache-hive-1.0.0 release. {code} 15-04-13 17:04:54,004 INFO org.apache.hadoop.hive.ql.log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - /PERFLOG method=parse start=1428969893115 end=1428969894004 duration=889 from=org.apache.hadoop.hive.ql.Driver 2015-04-13 17:04:54,007 DEBUG org.apache.hadoop.hive.ql.Driver (Driver.java:recordValidTxns(939)) - Encoding valid txns info 9223372036854775807: 2015-04-13 17:04:54,007 INFO org.apache.hadoop.hive.ql.log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver 2015-04-13 17:04:54,052 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:analyzeInternal(9997)) - Starting Semantic Analysis 2015-04-13 17:04:54,053 DEBUG org.apache.hadoop.hive.ql.exec.FunctionRegistry (FunctionRegistry.java:getGenericUDAFResolver(942)) - Looking up GenericUDAF: hour_now 2015-04-13 17:04:54,053 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:genResolvedParseTree(9980)) - Completed phase 1 of Semantic Analysis 2015-04-13 17:04:54,053 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(1530)) - Get metadata for source tables 2015-04-13 17:04:54,054 INFO org.apache.hadoop.hive.metastore.HiveMetaStore (HiveMetaStore.java:logInfo(744)) - 0: get_table : db=default tbl=test_table 2015-04-13 17:04:54,054 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(369)) - ugi=nyigitbasi 
ip=unknown-ip-addr cmd=get_table : db=default tbl=test_table 2015-04-13 17:04:54,054 DEBUG org.apache.hadoop.hive.metastore.ObjectStore (ObjectStore.java:debugLog(6776)) - Open transaction: count = 1, isActive = true at: org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:927) 2015-04-13 17:04:54,054 DEBUG org.apache.hadoop.hive.metastore.ObjectStore (ObjectStore.java:debugLog(6776)) - Open transaction: count = 2, isActive = true at: org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:990) 2015-04-13 17:04:54,104 DEBUG org.apache.hadoop.hive.metastore.ObjectStore (ObjectStore.java:debugLog(6776)) - Commit transaction: count = 1, isactive true at: org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:998) 2015-04-13 17:04:54,232 DEBUG org.apache.hadoop.hive.metastore.ObjectStore (ObjectStore.java:debugLog(6776)) - Commit transaction: count = 0, isactive true at: org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:929) 2015-04-13 17:04:54,242 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(1682)) - Get metadata for subqueries 2015-04-13 17:04:54,247 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:getMetaData(1706)) - Get metadata for destination tables 2015-04-13 17:04:54,256 INFO org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:genResolvedParseTree(9984)) - Completed getting MetaData in Semantic Analysis 2015-04-13 17:04:54,259 INFO org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer (CalcitePlanner.java:canHandleAstForCbo(369)) - Not invoking CBO because the statement has too few joins 2015-04-13 17:04:54,344 DEBUG org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe (LazySimpleSerDe.java:initialize(135)) - org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe initialized with: columnNames=[_c0, _c1] columnTypes=[int, int] separator=[[B@6e6d4780] nullstring=\N lastColumnTakesRest=false timestampFormats=null 
2015-04-13 17:04:54,406 DEBUG org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:genTablePlan(9458)) - Created Table Plan for test_table TS[0] 2015-04-13 17:04:54,410 DEBUG org.apache.hadoop.hive.ql.parse.CalcitePlanner (SemanticAnalyzer.java:genBodyPlan(8815)) - RR before GB test_table{(_c0,_c0: int)(_c1,_c1: int)(block__offset__inside__file,BLOCK__OFFSET__INSIDE__FILE: bigint)(input__file__name,INPUT__FILE__NAME: string)(row__id,ROW__ID: structtransactionid:bigint,bucketid:int,rowid:bigint)} after GB test_table{(_c0,_c0: int)(_c1,_c1: int)(block__offset__inside__file,BLOCK__OFFSET__INSIDE__FILE: bigint)(input__file__name,INPUT__FILE__NAME: string)(row__id,ROW__ID: structtransactionid:bigint,bucketid:int,rowid:bigint)} 2015-04-13 17:04:54,410 DEBUG org.apache.hadoop.hive.ql.parse.CalcitePlanner
[jira] [Commented] (HIVE-10299) Enable new cost model for Tez execution engine
[ https://issues.apache.org/jira/browse/HIVE-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493466#comment-14493466 ] Hive QA commented on HIVE-10299: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725064/HIVE-10299.2.patch {color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 8674 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_alt_syntax org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_join org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_join org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadataonly1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join32_lessSize org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_alt_syntax org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3414/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3414/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3414/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 24 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12725064 - PreCommit-HIVE-TRUNK-Build Enable new cost model for Tez execution engine -- Key: HIVE-10299 URL: https://issues.apache.org/jira/browse/HIVE-10299 Project: Hive Issue Type: Task Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10299.2.patch, HIVE-10299.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10028) LLAP: Create a fixed size execution queue for daemons
[ https://issues.apache.org/jira/browse/HIVE-10028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10028: - Attachment: HIVE-10028.1.patch LLAP: Create a fixed size execution queue for daemons - Key: HIVE-10028 URL: https://issues.apache.org/jira/browse/HIVE-10028 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Prasanth Jayachandran Fix For: llap Attachments: HIVE-10028.1.patch Currently, this is unbounded. This should be a configurable size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10308) Vectorization execution throws java.lang.IllegalArgumentException: Unsupported complex type: MAP
[ https://issues.apache.org/jira/browse/HIVE-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493272#comment-14493272 ] Selina Zhang commented on HIVE-10308: - Actually, the test results show only one test failure, and that failure is 33 days old, so it should be irrelevant. Vectorization execution throws java.lang.IllegalArgumentException: Unsupported complex type: MAP Key: HIVE-10308 URL: https://issues.apache.org/jira/browse/HIVE-10308 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 0.14.0, 0.13.1, 1.2.0, 1.1.0 Reporter: Selina Zhang Assignee: Selina Zhang Attachments: HIVE-10308.1.patch Steps to reproduce: CREATE TABLE test_orc (a INT, b MAP<INT, STRING>) STORED AS ORC; INSERT OVERWRITE TABLE test_orc SELECT 1, MAP(1, 'one', 2, 'two') FROM src LIMIT 1; CREATE TABLE test (key INT); INSERT OVERWRITE TABLE test SELECT 1 FROM src LIMIT 1; set hive.vectorized.execution.enabled=true; set hive.auto.convert.join=false; select l.key from test l left outer join test_orc r on (l.key = r.a) where r.a is not null; Stack trace: Caused by: java.lang.IllegalArgumentException: Unsupported complex type: MAP at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.genVectorExpressionWritable(VectorExpressionWriterFactory.java:456) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory.processVectorInspector(VectorExpressionWriterFactory.java:1191) at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.initializeOp(VectorReduceSinkOperator.java:58) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:442) at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:198) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
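The stack trace above comes from the vectorized expression writers rejecting complex column categories. A much-simplified sketch of that kind of guard follows; the class and method names here are illustrative only and are not Hive's actual VectorExpressionWriterFactory code.

```java
import java.util.Arrays;
import java.util.List;

// Simplified illustration of the failure mode in the stack trace above: a
// writer factory that handles only primitive column categories and throws
// IllegalArgumentException for complex ones such as MAP. Names and the
// string-based category check are assumptions, not Hive's factory logic.
public class VectorWriterGuard {

    private static final List<String> COMPLEX = Arrays.asList("MAP", "LIST", "STRUCT", "UNION");

    /** Throws if the given column type category is a complex type. */
    public static void checkSupported(String typeCategory) {
        if (COMPLEX.contains(typeCategory.toUpperCase())) {
            throw new IllegalArgumentException("Unsupported complex type: " + typeCategory);
        }
    }

    /** Returns true when every column category can take the vectorized path. */
    public static boolean allSupported(List<String> categories) {
        for (String c : categories) {
            if (COMPLEX.contains(c.toUpperCase())) {
                return false;
            }
        }
        return true;
    }
}
```

In the repro above, the joined ORC table carries a MAP column, so a check of this shape fails during operator initialization even though the query itself never touches the map.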
[jira] [Updated] (HIVE-10326) CBO (Calcite Return Path): Invoke Hive's Cumulative Cost
[ https://issues.apache.org/jira/browse/HIVE-10326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-10326: -- Attachment: HIVE-10326.patch CBO (Calcite Return Path): Invoke Hive's Cumulative Cost - Key: HIVE-10326 URL: https://issues.apache.org/jira/browse/HIVE-10326 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Laljo John Pullokkaran Assignee: Laljo John Pullokkaran Fix For: 1.2.0 Attachments: HIVE-10326.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10272) Some HCat tests fail under windows
[ https://issues.apache.org/jira/browse/HIVE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493309#comment-14493309 ] Hive QA commented on HIVE-10272: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725039/HIVE-10272.2.patch {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8674 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3412/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3412/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3412/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12725039 - PreCommit-HIVE-TRUNK-Build Some HCat tests fail under windows -- Key: HIVE-10272 URL: https://issues.apache.org/jira/browse/HIVE-10272 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-10272.2.patch, HIVE-10272.patch Some HCat tests fail under windows with errors like this: {noformat} java.lang.RuntimeException: java.lang.IllegalArgumentException: Pathname /D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/scratchdir from D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/scratchdir is not a valid DFS filename. 
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:197) at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106) at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305) at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424) at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:594) at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:552) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:504) at org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.setup(TestHCatLoaderEncryption.java:185) {noformat} We need to sanitize HiveConf objects with WindowsPathUtil.convertPathsFromWindowsToHdfs if running under windows before we use them to instantiate a SessionState/Driver -- This message was sent by Atlassian JIRA (v6.3.4#6332)
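The sanitization the last paragraph calls for can be approximated with a small helper. This is an illustrative sketch only: the class name and the exact rewriting rules (backslash-to-slash, dropping the drive-letter prefix) are assumptions for demonstration, not the behavior of the real WindowsPathUtil.convertPathsFromWindowsToHdfs.

```java
// Illustrative sketch of the kind of path rewriting the ticket describes:
// turning a Windows-style scratch dir such as D:\w\hv\target\tmp\scratchdir
// into a slash-separated absolute path HDFS will accept. This approximates,
// and is NOT, the real WindowsPathUtil implementation.
public class ScratchDirSanitizer {

    /** Convert a Windows path into an HDFS-acceptable absolute path. */
    public static String toHdfsPath(String windowsPath) {
        // Normalize separators first: D:\w\hv -> D:/w/hv
        String p = windowsPath.replace('\\', '/');
        // Drop a leading drive-letter prefix ("D:"), which DFS rejects
        // ("Pathname /D:/... is not a valid DFS filename" in the trace above).
        if (p.length() >= 2 && Character.isLetter(p.charAt(0)) && p.charAt(1) == ':') {
            p = p.substring(2);
        }
        // Ensure the result is absolute.
        return p.startsWith("/") ? p : "/" + p;
    }
}
```

A fix along these lines would run over the relevant HiveConf path entries before SessionState.start() is called, so createRootHDFSDir never sees a drive-letter path.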
[jira] [Assigned] (HIVE-10324) Hive metatool should take table_param_key to allow for changes to avro serde's schema url key
[ https://issues.apache.org/jira/browse/HIVE-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu reassigned HIVE-10324: --- Assignee: Ferdinand Xu Hive metatool should take table_param_key to allow for changes to avro serde's schema url key - Key: HIVE-10324 URL: https://issues.apache.org/jira/browse/HIVE-10324 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Ferdinand Xu HIVE-3443 added support to change the serdeParams from the 'metatool updateLocation' command. However, in Avro it is possible to specify the schema via the tableParams: {noformat} CREATE TABLE `testavro`( `test` string COMMENT 'from deserializer') ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' TBLPROPERTIES ( 'avro.schema.url'='hdfs://namenode:8020/tmp/test.avsc', 'kite.compression.type'='snappy', 'transient_lastDdlTime'='1427996456') {noformat} Hence, for those tables 'metatool updateLocation' will not help. This is necessary in cases such as upgrading the namenode to HA, where the absolute paths have changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10324) Hive metatool should take table_param_key to allow for changes to avro serde's schema url key
[ https://issues.apache.org/jira/browse/HIVE-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493342#comment-14493342 ] Ferdinand Xu commented on HIVE-10324: - Hi [~szehon], I will take a look at it. Hive metatool should take table_param_key to allow for changes to avro serde's schema url key - Key: HIVE-10324 URL: https://issues.apache.org/jira/browse/HIVE-10324 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Ferdinand Xu HIVE-3443 added support to change the serdeParams from the 'metatool updateLocation' command. However, in Avro it is possible to specify the schema via the tableParams: {noformat} CREATE TABLE `testavro`( `test` string COMMENT 'from deserializer') ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' TBLPROPERTIES ( 'avro.schema.url'='hdfs://namenode:8020/tmp/test.avsc', 'kite.compression.type'='snappy', 'transient_lastDdlTime'='1427996456') {noformat} Hence, for those tables 'metatool updateLocation' will not help. This is necessary in cases such as upgrading the namenode to HA, where the absolute paths have changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
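The rewrite the metatool would need to apply to the 'avro.schema.url' table property when the namenode authority changes can be sketched with java.net.URI. The helper name and the URI-reconstruction approach are assumptions for illustration, not the actual metatool code.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Sketch of the table-property rewrite HIVE-10324 asks for: given the old
// value of 'avro.schema.url' and a new filesystem root (e.g. an HA
// nameservice), produce the updated URL. Hypothetical helper, not metatool code.
public class AvroSchemaUrlRewriter {

    /**
     * Replace the scheme and authority of schemaUrl with those of newRoot,
     * keeping the path, e.g. hdfs://namenode:8020/tmp/test.avsc with
     * newRoot hdfs://nameservice1 becomes hdfs://nameservice1/tmp/test.avsc.
     */
    public static String rewrite(String schemaUrl, String newRoot) {
        URI old = URI.create(schemaUrl);
        URI root = URI.create(newRoot);
        try {
            return new URI(root.getScheme(), root.getAuthority(),
                           old.getPath(), old.getQuery(), old.getFragment()).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

The metatool side of the fix would then iterate over tables whose tableParams carry the key and write the rewritten value back through the metastore, mirroring what updateLocation already does for serdeParams.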
[jira] [Commented] (HIVE-10272) Some HCat tests fail under windows
[ https://issues.apache.org/jira/browse/HIVE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493344#comment-14493344 ] Sushanth Sowmyan commented on HIVE-10272: - Since I have Thejas' and Hari's +1 on this issue, and have had only a minimal change from the previous patch, and this patch is minor and affects only windows tests for HCat, I'm going to go ahead and commit it. Some HCat tests fail under windows -- Key: HIVE-10272 URL: https://issues.apache.org/jira/browse/HIVE-10272 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Minor Attachments: HIVE-10272.2.patch, HIVE-10272.patch Some HCat tests fail under windows with errors like this: {noformat} java.lang.RuntimeException: java.lang.IllegalArgumentException: Pathname /D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/scratchdir from D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/scratchdir is not a valid DFS filename. at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:197) at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106) at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305) at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424) at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:594) at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:552) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:504) at org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.setup(TestHCatLoaderEncryption.java:185) {noformat} We need to sanitize HiveConf objects with 
WindowsPathUtil.convertPathsFromWindowsToHdfs if running under windows before we use them to instantiate a SessionState/Driver -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10316) same query works with TEXTFILE and fails with ORC
[ https://issues.apache.org/jira/browse/HIVE-10316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493263#comment-14493263 ] Matt McCline commented on HIVE-10316: - The error is occurring in ~ql.exec.vector.VectorColumnSetInfo.addKey, which uses VectorizationContext.isStringFamily; that method and the rest of the code were fixed to use string_family rather than StringGroup. same query works with TEXTFILE and fails with ORC - Key: HIVE-10316 URL: https://issues.apache.org/jira/browse/HIVE-10316 Project: Hive Issue Type: Bug Components: Compression Affects Versions: 0.14.0 Environment: hortonworks HDP 2.2 running on Linux Reporter: Philippe Verhaeghe See also the related answer in the mailing list: http://mail-archives.apache.org/mod_mbox/hive-user/201504.mbox/%3CD15184D6.27779%25gopal%40hortonworks.com%3E I’m getting an error in Hive when executing a query on a table in ORC format. After several trials, I succeeded in running the same query on the same table in TEXTFILE format. I’ve been able to reproduce the error with the simple SQL script below. I create the same table in TEXTFILE and in ORC and run a SELECT … GROUP BY on the tables. The first SELECT, issued on the TEXTFILE table, succeeds. The second SELECT, issued on the ORC table, fails. NB: There is a CONCAT in the query. 
If I remove the CONCAT, the query runs OK with both tables. Example script to reproduce the error: USE pvr_temp; DROP TABLE IF EXISTS students_text; CREATE TABLE students_text (name VARCHAR(64), age INT, datetime TIMESTAMP, gpa DECIMAL(3, 2)) STORED AS TEXTFILE; INSERT INTO TABLE students_text VALUES ('fred flintstone', 35, '2015-04-13 13:40:00', 1.28), ('barney rubble', 32, '2015-04-13 13:40:00', 2.32); SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_text GROUP BY CONCAT(TO_DATE(datetime), '-'); DROP TABLE IF EXISTS students_orc; CREATE TABLE students_orc (name VARCHAR(64), age INT, datetime TIMESTAMP, gpa DECIMAL(3, 2)) STORED AS ORC; INSERT INTO TABLE students_orc VALUES ('fred flintstone', 35, '2015-04-13 13:40:00', 1.28), ('barney rubble', 32, '2015-04-13 13:40:00', 2.32); SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_orc GROUP BY CONCAT(TO_DATE(datetime), '-'); Log where you can see the error: [pvr@tpcalr01s ~]$ cat test.log scan complete in 9ms Connecting to jdbc:hive2://tpcrmm03s:1 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hive/lib/hive-jdbc-0.14.0.2.2.0.0-2041-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] Connected to: Apache Hive (version 0.14.0.2.2.0.0-2041) Driver: Hive JDBC (version 0.14.0.2.2.0.0-2041) Transaction isolation: TRANSACTION_REPEATABLE_READ 0: jdbc:hive2://tpcrmm03s:1 USE pvr_temp; No rows affected (0.061 seconds) 0: jdbc:hive2://tpcrmm03s:1 DROP TABLE IF EXISTS students_text; No rows affected (0.12 seconds) 0: jdbc:hive2://tpcrmm03s:1 CREATE TABLE students_text (name VARCHAR(64), age INT, datetime TIMESTAMP, gpa DECIMAL(3, 2)) STORED AS TEXTFILE; No rows affected (0.057 seconds) 0: jdbc:hive2://tpcrmm03s:1 INSERT INTO TABLE students_text VALUES ('fred flintstone', 35, '2015-04-13 13:40:00', 1.28), ('barney rubble', 32, '2015-04-13 13:40:00', 2.32); INFO : Tez session hasn't been created yet. Opening session INFO : INFO : Status: Running (Executing on YARN cluster with App id application_1428656093356_0047) INFO : Map 1: -/- INFO : Map 1: 0/1 No rows affected (14.134 seconds) INFO : Map 1: 0/1 INFO : Map 1: 0(+1)/1 INFO : Map 1: 0(+1)/1 INFO : Map 1: 1/1 INFO : Loading data to table pvr_temp.students_text from hdfs://tpcrmm01s.priv.atos.fr:8020/tmp/hive/hive/bf19c354-de67-45ae-a3e4-cd57d81acd71/hive_2015-04-13_14-15-08_445_2811483497310651606-20/-ext-1 INFO : Table pvr_temp.students_text stats: [numFiles=1, numRows=2, totalSize=86, rawDataSize=84] 0: jdbc:hive2://tpcrmm03s:1 SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_text GROUP BY CONCAT(TO_DATE(datetime), '-'); INFO : Session is already open INFO : INFO : Status: Running (Executing on YARN cluster with App id application_1428656093356_0047) INFO : Map 1: -/- Reducer 2: 0/1 INFO : Map 1: 0/1 Reducer 2: 0/1 INFO : Map 1: 0(+1)/1 Reducer 2: 0/1 INFO : Map 1: 1/1 Reducer 2: 0(+1)/1 INFO : Map 1: 1/1 Reducer 2: 1/1 +--+--+--+ |
[jira] [Commented] (HIVE-10272) Some HCat tests fail under windows
[ https://issues.apache.org/jira/browse/HIVE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493337#comment-14493337 ] Sushanth Sowmyan commented on HIVE-10272: - Per http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3412/testReport, all tests passed, the issue above is that TestMiniMRCliDriver does not seem to be working generally, and is not connected to this patch. Some HCat tests fail under windows -- Key: HIVE-10272 URL: https://issues.apache.org/jira/browse/HIVE-10272 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Minor Attachments: HIVE-10272.2.patch, HIVE-10272.patch Some HCat tests fail under windows with errors like this: {noformat} java.lang.RuntimeException: java.lang.IllegalArgumentException: Pathname /D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/scratchdir from D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/scratchdir is not a valid DFS filename. 
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:197) at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106) at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305) at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424) at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:594) at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:552) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:504) at org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.setup(TestHCatLoaderEncryption.java:185) {noformat} We need to sanitize HiveConf objects with WindowsPathUtil.convertPathsFromWindowsToHdfs if running under windows before we use them to instantiate a SessionState/Driver -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10274) Send context and description to tez via dag info
[ https://issues.apache.org/jira/browse/HIVE-10274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493282#comment-14493282 ] Gunther Hagleitner commented on HIVE-10274: --- minimr test failures are unrelated. code only affects tez. Send context and description to tez via dag info Key: HIVE-10274 URL: https://issues.apache.org/jira/browse/HIVE-10274 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-10274.1.patch tez has a way to specify context and description (which is shown in the ui) for each dag. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9710) HiveServer2 should support cookie based authentication, when using HTTP transport.
[ https://issues.apache.org/jira/browse/HIVE-9710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-9710: Attachment: HIVE-9710.8.patch [~vgumashta] Have addressed the comments from the previous patch. Thanks! -Hari HiveServer2 should support cookie based authentication, when using HTTP transport. -- Key: HIVE-9710 URL: https://issues.apache.org/jira/browse/HIVE-9710 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 1.2.0 Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-9710.1.patch, HIVE-9710.2.patch, HIVE-9710.3.patch, HIVE-9710.4.patch, HIVE-9710.5.patch, HIVE-9710.6.patch, HIVE-9710.7.patch, HIVE-9710.8.patch HiveServer2 should generate cookies and validate the client cookie sent to it so that it need not perform User/Password or Kerberos-based authentication on each HTTP request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
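The scheme described above can be sketched with an HMAC-signed cookie: the server issues a signed token after the first authenticated request, and later requests presenting a valid, unexpired token skip the expensive Kerberos/LDAP check. The cookie layout ("user&expiry&signature") and class name are assumptions for illustration; this is not HiveServer2's actual implementation.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch of cookie-based HTTP auth: issue() signs "user&expiry" with a server
// secret; validate() re-signs the payload and compares, so a tampered or
// foreign cookie (or an expired one) forces a fall back to full authentication.
public class CookieSignerSketch {
    private final byte[] secret;

    public CookieSignerSketch(byte[] secret) { this.secret = secret.clone(); }

    public String issue(String user, long expiryMillis) {
        String payload = user + "&" + expiryMillis;
        return payload + "&" + sign(payload);
    }

    /** Returns the user name if the cookie verifies and is unexpired, else null. */
    public String validate(String cookie, long nowMillis) {
        int lastSep = cookie.lastIndexOf('&');
        if (lastSep < 0) return null;
        String payload = cookie.substring(0, lastSep);
        String sig = cookie.substring(lastSep + 1);
        if (!sign(payload).equals(sig)) return null;  // tampered or foreign cookie
        String[] parts = payload.split("&");
        if (parts.length != 2 || Long.parseLong(parts[1]) < nowMillis) return null;
        return parts[0];
    }

    private String sign(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return Base64.getUrlEncoder().encodeToString(
                    mac.doFinal(payload.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```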
[jira] [Updated] (HIVE-10272) Some HCat tests fail under windows
[ https://issues.apache.org/jira/browse/HIVE-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-10272: Priority: Minor (was: Major) Some HCat tests fail under windows -- Key: HIVE-10272 URL: https://issues.apache.org/jira/browse/HIVE-10272 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Priority: Minor Attachments: HIVE-10272.2.patch, HIVE-10272.patch Some HCat tests fail under windows with errors like this: {noformat} java.lang.RuntimeException: java.lang.IllegalArgumentException: Pathname /D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/scratchdir from D:/w/hv/hcatalog/hcatalog-pig-adapter/target/tmp/scratchdir is not a valid DFS filename. at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:197) at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106) at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305) at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424) at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:594) at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:552) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:504) at org.apache.hive.hcatalog.pig.TestHCatLoaderEncryption.setup(TestHCatLoaderEncryption.java:185) {noformat} We need to sanitize HiveConf objects with WindowsPathUtil.convertPathsFromWindowsToHdfs if running under windows before we use them to instantiate a SessionState/Driver -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10325) Remove ExprNodeNullEvaluator
[ https://issues.apache.org/jira/browse/HIVE-10325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10325: Attachment: HIVE-10325.patch Remove ExprNodeNullEvaluator Key: HIVE-10325 URL: https://issues.apache.org/jira/browse/HIVE-10325 Project: Hive Issue Type: Task Components: Query Processor Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10325.patch since its purpose can instead be served by ExprNodeConstantEvaluator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10303) HIVE-9471 broke forward compatibility of ORC files
[ https://issues.apache.org/jira/browse/HIVE-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493354#comment-14493354 ] Prasanth Jayachandran commented on HIVE-10303: -- [~owen.omalley] can you take a look? HIVE-9471 broke forward compatibility of ORC files -- Key: HIVE-10303 URL: https://issues.apache.org/jira/browse/HIVE-10303 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 1.2.0 Reporter: Owen O'Malley Assignee: Prasanth Jayachandran Fix For: 1.2.0 Attachments: HIVE-10303.1.patch The change suppresses the streams in ORC files for ORC dictionaries with 0 entries. This causes NPE on ORC readers for all versions of Hive 0.11 to 1.1 and needs to be reverted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10303) HIVE-9471 broke forward compatibility of ORC files
[ https://issues.apache.org/jira/browse/HIVE-10303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10303: - Attachment: HIVE-10303.1.patch This patch reverts the writer side changes of HIVE-9471 to restore compatibility with old readers. HIVE-9471 broke forward compatibility of ORC files -- Key: HIVE-10303 URL: https://issues.apache.org/jira/browse/HIVE-10303 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 1.2.0 Reporter: Owen O'Malley Assignee: Prasanth Jayachandran Fix For: 1.2.0 Attachments: HIVE-10303.1.patch The change suppresses the streams in ORC files for ORC dictionaries with 0 entries. This causes NPE on ORC readers for all versions of Hive 0.11 to 1.1 and needs to be reverted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics
[ https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493210#comment-14493210 ] Sushanth Sowmyan commented on HIVE-10228: - Note : Visiting the test report page http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3411/testReport/ shows 0 failures. The issues reported above are from TestMinimrCliDriver not producing TEST-*.xml files, which are unrelated to this patch. Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics -- Key: HIVE-10228 URL: https://issues.apache.org/jira/browse/HIVE-10228 Project: Hive Issue Type: Sub-task Components: Import/Export Affects Versions: 1.2.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch We need to update a couple of hive commands to support replication semantics. To wit, we need the following: EXPORT ... [FOR [METADATA] REPLICATION(“comment”)] Export will now support an extra optional clause to tell it that this export is being prepared for the purpose of replication. There is also an additional optional clause here, that allows for the export to be a metadata-only export, to handle cases of capturing the diff for alter statements, for example. Also, if done for replication, the non-presence of a table, or a table being a view/offline table/non-native table is not considered an error, and instead, will result in a successful no-op. IMPORT ... (as normal) – but handles new semantics No syntax changes for import, but import will have to change to be able to handle all the permutations of export dumps possible. Also, import will have to ensure that it should update the object only if the update being imported is not older than the state of the object. DROP TABLE ... 
FOR REPLICATION('eventid') Drop Table now has an additional clause, to specify that this drop table is being done for replication purposes, and that the drop should not actually drop the table if the table is newer than the specified event id. ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid') Similarly, Drop Partition has an equivalent change to Drop Table. = In addition, we introduce a new property repl.last.id, which, when tagged on to table properties or partition properties on a replication destination, holds the effective state identifier of the object. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
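The event-id guards described above reduce to a pair of comparisons against the object's recorded repl.last.id. A minimal sketch, with method and parameter names that are illustrative rather than the patch's actual code:

```java
// Sketch of the replication guards described above: an imported update is
// applied only when it is not older than the destination object's state
// (repl.last.id), and a DROP ... FOR REPLICATION('eventid') is a no-op when
// the object is newer than the driving event. Names are illustrative only.
public class ReplStateGuard {

    /** Apply an imported update only if it is not older than the object's state. */
    public static boolean shouldApplyUpdate(long replLastId, long incomingEventId) {
        return incomingEventId >= replLastId;
    }

    /** A replication-scoped drop proceeds only when the object is not newer than the event. */
    public static boolean shouldDrop(long replLastId, long dropEventId) {
        return replLastId <= dropEventId;
    }
}
```

Both checks make replayed, out-of-order events safe: a stale update or drop simply becomes a successful no-op on the destination.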
[jira] [Commented] (HIVE-10028) LLAP: Create a fixed size execution queue for daemons
[ https://issues.apache.org/jira/browse/HIVE-10028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493234#comment-14493234 ] Prasanth Jayachandran commented on HIVE-10028: -- [~seth.siddha...@gmail.com] can you take a look at this patch? LLAP: Create a fixed size execution queue for daemons - Key: HIVE-10028 URL: https://issues.apache.org/jira/browse/HIVE-10028 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Prasanth Jayachandran Fix For: llap Attachments: HIVE-10028.1.patch Currently, this is unbounded. This should be a configurable size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10028) LLAP: Create a fixed size execution queue for daemons
[ https://issues.apache.org/jira/browse/HIVE-10028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493236#comment-14493236 ] Prasanth Jayachandran commented on HIVE-10028: -- Notification to the AM on pre-emption and rejection is yet to be done. Also, priorities for tasks in the wait queue need to be improved. Pre-emption is disabled by default, so in the default mode this should work exactly the same as what we have currently. LLAP: Create a fixed size execution queue for daemons - Key: HIVE-10028 URL: https://issues.apache.org/jira/browse/HIVE-10028 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Prasanth Jayachandran Fix For: llap Attachments: HIVE-10028.1.patch Currently, this is unbounded. This should be a configurable size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
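The bounding behavior the ticket asks for can be shown with plain java.util.concurrent: a ThreadPoolExecutor over a bounded ArrayBlockingQueue rejects work once all threads are busy and the queue is full. This only illustrates the fixed-size-queue idea; the real LLAP daemon patch (wait-queue priorities, AM pre-emption notifications) is more involved, and the class name here is made up.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Demonstrates a fixed-size execution queue: one worker thread plus a queue
// bounded at 2, so the 4th submission has nowhere to go and the default
// AbortPolicy throws RejectedExecutionException.
public class BoundedExecDemo {

    /** Submit more tasks than (threads + queue capacity) and report whether one was rejected. */
    public static boolean overflowIsRejected() {
        ThreadPoolExecutor exec = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2));   // execution queue bounded at 2
        CountDownLatch release = new CountDownLatch(1);
        boolean rejected = false;
        try {
            // Runs on the single worker and blocks, pinning the thread.
            exec.execute(() -> { try { release.await(); } catch (InterruptedException ignored) {} });
            exec.execute(() -> {});  // fills queue slot 1
            exec.execute(() -> {});  // fills queue slot 2
            exec.execute(() -> {});  // no room left: AbortPolicy throws
        } catch (RejectedExecutionException e) {
            rejected = true;
        } finally {
            release.countDown();
            exec.shutdownNow();
        }
        return rejected;
    }
}
```

A configurable capacity (the ticket's ask) is just the ArrayBlockingQueue size; a custom RejectedExecutionHandler is where the AM notification mentioned in the comment would hook in.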
[jira] [Commented] (HIVE-10316) same query works with TEXTFILE and fails with ORC
[ https://issues.apache.org/jira/browse/HIVE-10316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493258#comment-14493258 ] Matt McCline commented on HIVE-10316: - There is HIVE-8886 ("Some Vectorized String CONCAT expressions result in runtime error Vectorization: Unsuported vector output type: StringGroup") that was fixed in trunk (at least). It could be the same issue. same query works with TEXTFILE and fails with ORC - Key: HIVE-10316 URL: https://issues.apache.org/jira/browse/HIVE-10316 Project: Hive Issue Type: Bug Components: Compression Affects Versions: 0.14.0 Environment: hortonworks HDP 2.2 running on Linux Reporter: Philippe Verhaeghe See also the related answer in the mailing list: http://mail-archives.apache.org/mod_mbox/hive-user/201504.mbox/%3CD15184D6.27779%25gopal%40hortonworks.com%3E I’m getting an error in Hive when executing a query on a table in ORC format. After several trials, I succeeded in running the same query on the same table in TEXTFILE format. I’ve been able to reproduce the error with the simple SQL script below. I create the same table in TEXTFILE and in ORC and run a SELECT … GROUP BY on the tables. The first SELECT, issued on the TEXTFILE table, succeeds. The second SELECT, issued on the ORC table, fails. NB: There is a CONCAT in the query. 
If I remove the CONCAT, the query runs OK with both tables. Example script to reproduce the error: USE pvr_temp; DROP TABLE IF EXISTS students_text; CREATE TABLE students_text (name VARCHAR(64), age INT, datetime TIMESTAMP, gpa DECIMAL(3, 2)) STORED AS TEXTFILE; INSERT INTO TABLE students_text VALUES ('fred flintstone', 35, '2015-04-13 13:40:00', 1.28), ('barney rubble', 32, '2015-04-13 13:40:00', 2.32); SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_text GROUP BY CONCAT(TO_DATE(datetime), '-'); DROP TABLE IF EXISTS students_orc; CREATE TABLE students_orc (name VARCHAR(64), age INT, datetime TIMESTAMP, gpa DECIMAL(3, 2)) STORED AS ORC; INSERT INTO TABLE students_orc VALUES ('fred flintstone', 35, '2015-04-13 13:40:00', 1.28), ('barney rubble', 32, '2015-04-13 13:40:00', 2.32); SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_orc GROUP BY CONCAT(TO_DATE(datetime), '-'); Log where you can see the error: [pvr@tpcalr01s ~]$ cat test.log scan complete in 9ms Connecting to jdbc:hive2://tpcrmm03s:1 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/hdp/2.2.0.0-2041/hive/lib/hive-jdbc-0.14.0.2.2.0.0-2041-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] Connected to: Apache Hive (version 0.14.0.2.2.0.0-2041) Driver: Hive JDBC (version 0.14.0.2.2.0.0-2041) Transaction isolation: TRANSACTION_REPEATABLE_READ 0: jdbc:hive2://tpcrmm03s:1 USE pvr_temp; No rows affected (0.061 seconds) 0: jdbc:hive2://tpcrmm03s:1 DROP TABLE IF EXISTS students_text; No rows affected (0.12 seconds) 0: jdbc:hive2://tpcrmm03s:1 CREATE TABLE students_text (name VARCHAR(64), age INT, datetime TIMESTAMP, gpa DECIMAL(3, 2)) STORED AS TEXTFILE; No rows affected (0.057 seconds) 0: jdbc:hive2://tpcrmm03s:1 INSERT INTO TABLE students_text VALUES ('fred flintstone', 35, '2015-04-13 13:40:00', 1.28), ('barney rubble', 32, '2015-04-13 13:40:00', 2.32); INFO : Tez session hasn't been created yet. Opening session INFO : INFO : Status: Running (Executing on YARN cluster with App id application_1428656093356_0047) INFO : Map 1: -/- INFO : Map 1: 0/1 No rows affected (14.134 seconds) INFO : Map 1: 0/1 INFO : Map 1: 0(+1)/1 INFO : Map 1: 0(+1)/1 INFO : Map 1: 1/1 INFO : Loading data to table pvr_temp.students_text from hdfs://tpcrmm01s.priv.atos.fr:8020/tmp/hive/hive/bf19c354-de67-45ae-a3e4-cd57d81acd71/hive_2015-04-13_14-15-08_445_2811483497310651606-20/-ext-1 INFO : Table pvr_temp.students_text stats: [numFiles=1, numRows=2, totalSize=86, rawDataSize=84] 0: jdbc:hive2://tpcrmm03s:1 SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_text GROUP BY CONCAT(TO_DATE(datetime), '-'); INFO : Session is already open INFO : INFO : Status: Running (Executing on YARN cluster with App id application_1428656093356_0047) INFO : Map 1: -/- Reducer 2: 0/1 INFO : Map 1: 0/1 Reducer 2: 0/1 INFO : Map 1: 0(+1)/1 Reducer 2: 0/1 INFO : Map 1: 1/1 Reducer 2: 0(+1)/1 INFO : Map 1: 1/1 Reducer 2: 1/1 +--+--+--+ | _c0 | _c1 |
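[Editor's note] If this is indeed the vectorized CONCAT bug (HIVE-8886) that Matt McCline mentions above, a commonly suggested workaround on affected releases is to disable vectorized execution for the session before running the ORC query. This is a sketch, not a confirmed fix for this particular report:

{code}
-- Workaround sketch: turn off vectorization so the ORC query takes the
-- non-vectorized code path (the HIVE-8886 failure is specific to
-- vectorized CONCAT expressions).
SET hive.vectorized.execution.enabled=false;

SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa)
FROM students_orc
GROUP BY CONCAT(TO_DATE(datetime), '-');
{code}

If the query succeeds with vectorization disabled, that strongly suggests the same root cause; the proper fix is to pick up a release containing the HIVE-8886 patch.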
[jira] [Commented] (HIVE-10324) Hive metatool should take table_param_key to allow for changes to avro serde's schema url key
[ https://issues.apache.org/jira/browse/HIVE-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493298#comment-14493298 ] Szehon Ho commented on HIVE-10324: -- FYI [~Ferd] or [~dongc]: any interest in looking at this? Hive metatool should take table_param_key to allow for changes to avro serde's schema url key - Key: HIVE-10324 URL: https://issues.apache.org/jira/browse/HIVE-10324 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.1.0 Reporter: Szehon Ho HIVE-3443 added support for changing the serdeParams from the 'metatool updateLocation' command. However, in Avro it is possible to specify the schema via the tableParams: {noformat} CREATE TABLE `testavro`( `test` string COMMENT 'from deserializer') ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' TBLPROPERTIES ( 'avro.schema.url'='hdfs://namenode:8020/tmp/test.avsc', 'kite.compression.type'='snappy', 'transient_lastDdlTime'='1427996456') {noformat} Hence for those tables 'metatool updateLocation' will not help. This is necessary in cases like upgrading the namenode to HA, where the absolute paths have changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
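[Editor's note] Until metatool supports table parameters, the schema URL can be patched per table with plain DDL. A manual sketch, reusing the example table and the new namenode address mentioned later in this thread (both are examples from this issue, not real deployments):

{code}
-- Manual workaround sketch: rewrite the table-level avro.schema.url
-- directly, one table at a time, instead of a bulk metatool update.
ALTER TABLE testavro
SET TBLPROPERTIES ('avro.schema.url'='hdfs://namenode2:8020/tmp/test.avsc');
{code}

This does not scale to hundreds of tables, which is exactly why a metatool option for table params is being requested.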
[jira] [Commented] (HIVE-10316) same query works with TEXTFILE and fails with ORC
[ https://issues.apache.org/jira/browse/HIVE-10316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493330#comment-14493330 ] Matt McCline commented on HIVE-10316: - Commit date into trunk is December 9th, 2014. Tagged as in release-1.1.0 branch, too. Please ask Hortonworks about where the fix is available via its supported releases. Thanks. same query works with TEXTFILE and fails with ORC - Key: HIVE-10316 URL: https://issues.apache.org/jira/browse/HIVE-10316 Project: Hive Issue Type: Bug Components: Compression Affects Versions: 0.14.0 Environment: hortonworks HDP 2.2 running on Linux Reporter: Philippe Verhaeghe -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10328) Enable new return path for cbo
[ https://issues.apache.org/jira/browse/HIVE-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10328: Attachment: HIVE-10328.patch With both new cost model as well as new return path. Enable new return path for cbo -- Key: HIVE-10328 URL: https://issues.apache.org/jira/browse/HIVE-10328 Project: Hive Issue Type: Task Components: CBO Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10328.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10304) Add deprecation message to HiveCLI
[ https://issues.apache.org/jira/browse/HIVE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493527#comment-14493527 ] Lefty Leverenz commented on HIVE-10304: --- Doc note: At the least, the CLI wikidoc should include this warning for 1.2.0 with a link to the Beeline doc. Since the CLI doc already has a stub of a Beeline section, perhaps that should be moved before the CLI section with some rewriting. Also, other places that mention the CLI should be updated -- Getting Started and the Tutorial for starters, maybe a few more. * [Hive CLI | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli] * Getting Started ** [Running Hive | https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-RunningHive] ** [Configuration Management Overview | https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-ConfigurationManagementOverview] ** [Runtime Configuration | https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-RuntimeConfiguration] ** [Error Logs | https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-ErrorLogs] * Tutorial ** [Built In Operators and Functions | https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-BuiltInOperatorsandFunctions] ** [Simple Query | https://cwiki.apache.org/confluence/display/Hive/Tutorial#Tutorial-SimpleQuery] Add deprecation message to HiveCLI -- Key: HIVE-10304 URL: https://issues.apache.org/jira/browse/HIVE-10304 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-10304.2.patch, HIVE-10304.3.patch, HIVE-10304.patch As Beeline is now the recommended command line tool to Hive, we should add a message to HiveCLI to indicate that it is deprecated and redirect them to Beeline. 
This does not suggest removing HiveCLI for now; it is just a helpful pointer so users know to focus their attention on Beeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10310) Support GROUPING() and GROUP_ID() in HIVE
[ https://issues.apache.org/jira/browse/HIVE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493542#comment-14493542 ] sanjiv singh commented on HIVE-10310: - [~jpullokkaran] Not sure if it is unclear from the JIRA description or whether I am not describing the problem clearly. I haven't asked for Grouping Sets or Grouping ID. This is about the aggregate functions GROUPING() and GROUP_ID(). Also, the example I have given is not about Grouping Sets and Grouping ID, but about GROUPING() and GROUP_ID(). Let me know if there are any questions. Support GROUPING() and GROUP_ID() in HIVE - Key: HIVE-10310 URL: https://issues.apache.org/jira/browse/HIVE-10310 Project: Hive Issue Type: New Feature Components: Parser, SQL Reporter: sanjiv singh Priority: Minor I have lots of queries using the GROUPING() function. They fail on Hive simply because GROUPING() is not supported in Hive. See the query below: SELECT fact_1_id, fact_2_id, GROUPING(fact_1_id) AS f1g, GROUPING(fact_2_id) AS f2g FROM dimension_tab GROUP BY CUBE (fact_1_id, fact_2_id) ORDER BY fact_1_id, fact_2_id; In order to run all such queries in Hive, they need to be transformed to Hive syntax. See the transformed query below, compatible with Hive; the equivalent has been derived using a CASE statement: SELECT fact_1_id, fact_2_id, (case when (GROUPING__ID & 1) = 0 then 1 else 0 end) as f1g, (case when (GROUPING__ID & 2) = 0 then 1 else 0 end) as f2g FROM dimension_tab GROUP BY fact_1_id, fact_2_id WITH CUBE ORDER BY fact_1_id, fact_2_id; It would be great if GROUPING() were implemented in Hive. I see two ways to do it: 1) Handle it at the parser level. 2) Add GROUPING() as an aggregate function to Hive (recommended). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
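[Editor's note] Because the bit encoding of the GROUPING__ID virtual column has changed between Hive releases, it is worth inspecting the raw values on your release before relying on a bit-mask rewrite like the one in the issue description. A calibration sketch reusing the query's table (column and table names are the examples from this issue):

{code}
-- Calibration sketch: print the raw GROUPING__ID for each rollup row.
-- With two GROUP BY keys, WITH CUBE emits four grouping sets; map the
-- observed GROUPING__ID values to rows where fact_1_id / fact_2_id are
-- NULL to confirm which bit corresponds to which key on your release.
SELECT fact_1_id, fact_2_id, GROUPING__ID
FROM dimension_tab
GROUP BY fact_1_id, fact_2_id WITH CUBE;
{code}

Once the bit order is confirmed, the CASE-based rewrite above can be applied with confidence.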
[jira] [Commented] (HIVE-10273) Union with partition tables which have no data fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493573#comment-14493573 ] Hive QA commented on HIVE-10273: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725097/HIVE-10273.4.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3417/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3417/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3417/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-3417/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d 
apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeConstantDesc.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorFactory.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeNullEvaluator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/scheduler/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/thirdparty itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-jmh/target itests/hive-unit/target itests/custom-serde/target itests/util/target itests/qtest-spark/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target accumulo-handler/target hwi/target common/target common/src/gen spark-client/target contrib/target service/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1673357. At revision 1673357. 
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12725097 - PreCommit-HIVE-TRUNK-Build Union with partition tables which have no data fails with NPE - Key: HIVE-10273 URL: https://issues.apache.org/jira/browse/HIVE-10273 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10273.1.patch, HIVE-10273.2.patch, HIVE-10273.3.patch, HIVE-10273.4.patch As shown in the test case in the patch below, when we have partitioned tables which have no data, we fail with an NPE with the following stack trace: {code} NullPointerException null java.lang.NullPointerException at
[jira] [Commented] (HIVE-10325) Remove ExprNodeNullEvaluator
[ https://issues.apache.org/jira/browse/HIVE-10325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493570#comment-14493570 ] Hive QA commented on HIVE-10325: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725091/HIVE-10325.patch {color:red}ERROR:{color} -1 due to 178 failed/errored test(s), 8688 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file 
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_change_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_cascade org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguitycheck org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cast_to_int org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_simple_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_subq_not_in org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_colstats_all_nulls org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_func1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_genericudf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explode_null org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into_with_schema org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nullsafe org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view_outer 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_load_dyn_part14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_null_cast org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_null_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_num_op_type_conv org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_null_check org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_vectorization_ppd org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_decimal org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_write_correct_definition_levels org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_constant_expr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_type_conversions_1
[jira] [Updated] (HIVE-10284) enable container reuse for grace hash join
[ https://issues.apache.org/jira/browse/HIVE-10284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-10284: -- Attachment: HIVE-10284.5.patch enable container reuse for grace hash join --- Key: HIVE-10284 URL: https://issues.apache.org/jira/browse/HIVE-10284 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-10284.1.patch, HIVE-10284.2.patch, HIVE-10284.3.patch, HIVE-10284.4.patch, HIVE-10284.5.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10304) Add deprecation message to HiveCLI
[ https://issues.apache.org/jira/browse/HIVE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-10304: -- Labels: TODOC1.2 (was: ) Add deprecation message to HiveCLI -- Key: HIVE-10304 URL: https://issues.apache.org/jira/browse/HIVE-10304 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC1.2 Fix For: 1.2.0 Attachments: HIVE-10304.2.patch, HIVE-10304.3.patch, HIVE-10304.patch As Beeline is now the recommended command line tool for Hive, we should add a message to HiveCLI indicating that it is deprecated and redirecting users to Beeline. This does not suggest removing HiveCLI for now; it is just a helpful pointer so users know to focus their attention on Beeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10318) The HMS upgrade test does not test patches that affect the upgrade test scripts
[ https://issues.apache.org/jira/browse/HIVE-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493521#comment-14493521 ] Hive QA commented on HIVE-10318: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12725069/HIVE-10318.1.patch {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8688 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3415/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3415/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3415/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12725069 - PreCommit-HIVE-TRUNK-Build The HMS upgrade test does not test patches that affect the upgrade test scripts --- Key: HIVE-10318 URL: https://issues.apache.org/jira/browse/HIVE-10318 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-10318.1.patch When making a change to the HMS upgrade scripts, such as adding a new DB server, the HMS upgrade test does not exercise that change. To make it work, the patch needs to be committed to trunk first. This is not desired, as we need to test the upgrade change before committing it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10324) Hive metatool should take table_param_key to allow for changes to avro serde's schema url key
[ https://issues.apache.org/jira/browse/HIVE-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493606#comment-14493606 ] Szehon Ho commented on HIVE-10324: -- Hm, I think it's supposed to update the MTable as well? Right now it's updating only MStorageDescriptor. I am doing something like: metatool -updateLocation hdfs://namenode2:8020 hdfs://namenode:8020 -tablePropKey avro.schema.url -serdePropKey avro.schema.url, with the expectation that afterwards the table's Avro schema URL points at namenode2. Hive metatool should take table_param_key to allow for changes to avro serde's schema url key - Key: HIVE-10324 URL: https://issues.apache.org/jira/browse/HIVE-10324 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Ferdinand Xu -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled
[ https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-9695: - Assignee: Laljo John Pullokkaran (was: Gunther Hagleitner) Redundant filter operator in reducer Vertex when CBO is disabled Key: HIVE-9695 URL: https://issues.apache.org/jira/browse/HIVE-9695 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Laljo John Pullokkaran Fix For: 1.2.0 There is a redundant filter operator in reducer Vertex when CBO is disabled. Query {code} select ss_item_sk, ss_ticket_number, ss_store_sk from store_sales a, store_returns b, store where a.ss_item_sk = b.sr_item_sk and a.ss_ticket_number = b.sr_ticket_number and ss_sold_date_sk between 2450816 and 2451500 and sr_returned_date_sk between 2450816 and 2451500 and s_store_sk = ss_store_sk; {code} Plan snippet {code} Statistics: Num rows: 57439344 Data size: 1838059008 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (((((_col1 = _col27) and (_col8 = _col34)) and _col22 BETWEEN 2450816 AND 2451500) and _col45 BETWEEN 2450816 AND 2451500) and (_col49 = _col6)) (type: boolean) {code} Full plan with CBO disabled {code} STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez Edges: Reducer 2 - Map 1 (SIMPLE_EDGE), Map 3 (BROADCAST_EDGE), Map 4 (SIMPLE_EDGE) DagName: mmokhtar_20150214182626_ad6820c7-b667-4652-ab25-cb60deed1a6d:13 Vertices: Map 1 Map Operator Tree: TableScan alias: b filterExpr: ((sr_item_sk is not null and sr_ticket_number is not null) and sr_returned_date_sk BETWEEN 2450816 AND 2451500) (type: boolean) Statistics: Num rows: 2370038095 Data size: 170506118656 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (sr_item_sk is not null and sr_ticket_number is not null) (type: boolean) Statistics: Num rows: 706893063 Data size: 6498502768 Basic stats: COMPLETE Column 
stats: COMPLETE Reduce Output Operator key expressions: sr_item_sk (type: int), sr_ticket_number (type: int) sort order: ++ Map-reduce partition columns: sr_item_sk (type: int), sr_ticket_number (type: int) Statistics: Num rows: 706893063 Data size: 6498502768 Basic stats: COMPLETE Column stats: COMPLETE value expressions: sr_returned_date_sk (type: int) Execution mode: vectorized Map 3 Map Operator Tree: TableScan alias: store filterExpr: s_store_sk is not null (type: boolean) Statistics: Num rows: 1704 Data size: 3256276 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: s_store_sk is not null (type: boolean) Statistics: Num rows: 1704 Data size: 6816 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: s_store_sk (type: int) sort order: + Map-reduce partition columns: s_store_sk (type: int) Statistics: Num rows: 1704 Data size: 6816 Basic stats: COMPLETE Column stats: COMPLETE Execution mode: vectorized Map 4 Map Operator Tree: TableScan alias: a filterExpr: (((ss_item_sk is not null and ss_ticket_number is not null) and ss_store_sk is not null) and ss_sold_date_sk BETWEEN 2450816 AND 2451500) (type: boolean) Statistics: Num rows: 28878719387 Data size: 2405805439460 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: ((ss_item_sk is not null and ss_ticket_number is not null) and ss_store_sk is not null) (type: boolean) Statistics: Num rows: 8405840828 Data size: 110101408700 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: ss_item_sk (type: int), ss_ticket_number (type: int) sort order: ++ Map-reduce partition columns: ss_item_sk (type:
[jira] [Resolved] (HIVE-10315) CBO (Calcite Return Path): HiveRelSize accessing columns without available stats [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-10315. - Resolution: Fixed Committed to branch. Thanks, Jesus! CBO (Calcite Return Path): HiveRelSize accessing columns without available stats [CBO branch] -- Key: HIVE-10315 URL: https://issues.apache.org/jira/browse/HIVE-10315 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10315.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10310) Support GROUPING() and GROUP_ID() in HIVE
[ https://issues.apache.org/jira/browse/HIVE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492694#comment-14492694 ] Laljo John Pullokkaran commented on HIVE-10310: --- Also see ./ql/bin/src/test/queries/clientpositive/groupby_grouping_id2.q Support GROUPING() and GROUP_ID() in HIVE - Key: HIVE-10310 URL: https://issues.apache.org/jira/browse/HIVE-10310 Project: Hive Issue Type: New Feature Components: Parser, SQL Reporter: sanjiv singh Priority: Minor I have lots of queries using the GROUPING() function that fail on Hive, just because GROUPING() is not supported in Hive. See the query below: SELECT fact_1_id, fact_2_id, GROUPING(fact_1_id) AS f1g, GROUPING(fact_2_id) AS f2g FROM dimension_tab GROUP BY CUBE (fact_1_id, fact_2_id) ORDER BY fact_1_id, fact_2_id; To run all such queries in Hive, they need to be transformed to Hive syntax. See the transformed query below, compatible with Hive; the equivalent has been derived using a CASE statement and the bitwise AND operator: SELECT fact_1_id, fact_2_id, (case when (GROUPING__ID & 1) = 0 then 1 else 0 end) as f1g, (case when (GROUPING__ID & 2) = 0 then 1 else 0 end) as f2g FROM dimension_tab GROUP BY fact_1_id, fact_2_id WITH CUBE ORDER BY fact_1_id, fact_2_id; It would be great if GROUPING() were implemented in Hive. I see two ways to do it: 1) handle it at the parser level; 2) add GROUPING() as an aggregate function to Hive (recommended). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
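The CASE-based rewrite in the issue description boils down to bitmask arithmetic on GROUPING__ID. A minimal Python sketch of the logic the transformed query encodes (the function name is illustrative, not a Hive API):

```python
def grouping(grouping_id: int, bit_mask: int) -> int:
    """Mirror the CASE expression from the rewritten query:
    1 when the column's bit in GROUPING__ID is 0, else 0."""
    return 1 if (grouping_id & bit_mask) == 0 else 0

# fact_1_id uses mask 1, fact_2_id uses mask 2, as in the query above.
print(grouping(0b00, 1), grouping(0b00, 2))  # 1 1
print(grouping(0b11, 1), grouping(0b11, 2))  # 0 0
```

This is only the per-row arithmetic; a proper GROUPING() UDF in Hive would additionally need the planner to supply the grouping-set bitmap, which is what the second proposed option covers.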
[jira] [Commented] (HIVE-10304) Add deprecation message to HiveCLI
[ https://issues.apache.org/jira/browse/HIVE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14491994#comment-14491994 ] Hive QA commented on HIVE-10304: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12724872/HIVE-10304.3.patch {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8673 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml 
file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file org.apache.hive.jdbc.TestJdbcWithMiniHS2.testNewConnectionConfiguration {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3403/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3403/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3403/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12724872 - PreCommit-HIVE-TRUNK-Build Add deprecation message to HiveCLI -- Key: HIVE-10304 URL: https://issues.apache.org/jira/browse/HIVE-10304 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-10304.2.patch, HIVE-10304.3.patch, HIVE-10304.patch As Beeline is now the recommended command line tool to Hive, we should add a message to HiveCLI to indicate that it is deprecated and redirect them to Beeline. This is not suggesting to remove HiveCLI for now, but just a helpful direction for user to know the direction to focus attention in Beeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-3296) comments before a set statement in a test do not work
[ https://issues.apache.org/jira/browse/HIVE-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492104#comment-14492104 ] Per Ullberg commented on HIVE-3296: --- This also fails in the CLI so it's not isolated to the Testing Infrastructure. {code} hive> -- this is a comment select * from some_table limit 1; OK ... hive> -- this is another comment set foo=bar; NoViableAltException(213@[]) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:900) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:190) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:435) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:353) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:979) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1022) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:915) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:905) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:208) FAILED: ParseException line 2:0 cannot recognize input near 'set' 'foo' '=' {code} Running CDH5beta1-Packaging-Hive-2013-10-27_18-38-32/hive-0.11.0-cdh5.0.0-beta-1 comments before a set statement in a test do not work - Key: HIVE-3296 URL: 
https://issues.apache.org/jira/browse/HIVE-3296 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Namit Jain Assignee: Srinivas Vemuri The following test fails: -- test comment set hive.test=foo; create table tst(key string); The output is: FAILED: ParseException line 2:0 cannot recognize input near 'set' 'hive' '.' However, if the comment position is changed it works fine. The following test: -- test comment create table tst(key string); set hive.test=foo; works fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10310) Support GROUPING() and GROUP_ID() in HIVE
[ https://issues.apache.org/jira/browse/HIVE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14491949#comment-14491949 ] Ferdinand Xu commented on HIVE-10310: - Hi [~sanjiv singh], is this what you want? https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup Support GROUPING() and GROUP_ID() in HIVE - Key: HIVE-10310 URL: https://issues.apache.org/jira/browse/HIVE-10310 Project: Hive Issue Type: New Feature Components: Parser, SQL Reporter: sanjiv singh Priority: Minor I have lots of queries using the GROUPING() function that fail on Hive, just because GROUPING() is not supported in Hive. See the query below: SELECT fact_1_id, fact_2_id, GROUPING(fact_1_id) AS f1g, GROUPING(fact_2_id) AS f2g FROM dimension_tab GROUP BY CUBE (fact_1_id, fact_2_id) ORDER BY fact_1_id, fact_2_id; To run all such queries in Hive, they need to be transformed to Hive syntax. See the transformed query below, compatible with Hive; the equivalent has been derived using a CASE statement and the bitwise AND operator: SELECT fact_1_id, fact_2_id, (case when (GROUPING__ID & 1) = 0 then 1 else 0 end) as f1g, (case when (GROUPING__ID & 2) = 0 then 1 else 0 end) as f2g FROM dimension_tab GROUP BY fact_1_id, fact_2_id WITH CUBE ORDER BY fact_1_id, fact_2_id; It would be great if GROUPING() were implemented in Hive. I see two ways to do it: 1) handle it at the parser level; 2) add GROUPING() as an aggregate function to Hive (recommended). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9005) HiveSever2 error with Illegal Operation state transition from CLOSED to ERROR
[ https://issues.apache.org/jira/browse/HIVE-9005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492321#comment-14492321 ] Sunil Kumar commented on HIVE-9005: --- Hi, we are getting the below error on Hive 1.0.0 for a long-running query through the JDBC client, while the same query gets executed successfully from Beeline. Can you please let me know the reason? 2015-04-13 11:32:41,622 INFO [HiveServer2-Background-Pool: Thread-42]: ql.Driver (SessionState.java:printInfo(824)) - MapReduce Jobs Launched: 2015-04-13 11:32:41,622 INFO [HiveServer2-Background-Pool: Thread-42]: ql.Driver (SessionState.java:printInfo(824)) - Stage-Stage-1: Map: 100 Reduce: 1 Cumulative CPU: 288.74 sec HDFS Read: 162865697 HDFS Write: 127 SUCCESS 2015-04-13 11:32:41,622 INFO [HiveServer2-Background-Pool: Thread-42]: ql.Driver (SessionState.java:printInfo(824)) - Stage-Stage-4: Map: 100 Reduce: 1 Cumulative CPU: 291.33 sec HDFS Read: 162865697 HDFS Write: 127 SUCCESS 2015-04-13 11:32:41,623 INFO [HiveServer2-Background-Pool: Thread-42]: ql.Driver (SessionState.java:printInfo(824)) - Stage-Stage-7: Map: 100 Reduce: 1 Cumulative CPU: 290.74 sec HDFS Read: 162865697 HDFS Write: 127 SUCCESS 2015-04-13 11:32:41,623 INFO [HiveServer2-Background-Pool: Thread-42]: ql.Driver (SessionState.java:printInfo(824)) - Stage-Stage-9: Map: 100 Reduce: 1 Cumulative CPU: 294.8 sec HDFS Read: 162865697 HDFS Write: 127 SUCCESS 2015-04-13 11:32:41,623 INFO [HiveServer2-Background-Pool: Thread-42]: ql.Driver (SessionState.java:printInfo(824)) - Stage-Stage-10: Map: 100 Reduce: 1 Cumulative CPU: 71.42 sec HDFS Read: 37431861 HDFS Write: 0 FAIL 2015-04-13 11:32:41,623 INFO [HiveServer2-Background-Pool: Thread-42]: ql.Driver (SessionState.java:printInfo(824)) - Total MapReduce CPU Time Spent: 20 minutes 37 seconds 30 msec 2015-04-13 11:32:41,623 INFO [HiveServer2-Background-Pool: Thread-42]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=releaseLocks 
from=org.apache.hadoop.hive.ql.Driver 2015-04-13 11:32:41,623 INFO [HiveServer2-Background-Pool: Thread-42]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - /PERFLOG method=releaseLocks start=1428904961623 end=1428904961623 duration=0 from=org.apache.hadoop.hive.ql.Driver 2015-04-13 11:32:41,657 ERROR [HiveServer2-Background-Pool: Thread-42]: operation.Operation (SQLOperation.java:run(199)) - Error running hive query: org.apache.hive.service.cli.HiveSQLException: Illegal Operation state transition from CLOSED to ERROR at org.apache.hive.service.cli.OperationState.validateTransition(OperationState.java:91) at org.apache.hive.service.cli.OperationState.validateTransition(OperationState.java:97) at org.apache.hive.service.cli.operation.Operation.setState(Operation.java:125) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:157) at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69) at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:536) at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) HiveSever2 error with Illegal Operation state transition from CLOSED to ERROR --- Key: HIVE-9005 URL: https://issues.apache.org/jira/browse/HIVE-9005 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Binglin Chang Attachments: 
HIVE-9005.1.patch {noformat} 2014-12-02 11:25:40,855 WARN [HiveServer2-Background-Pool: Thread-17]: ql.Driver (DriverContext.java:shutdown(137)) - Shutting down task : Stage-1:MAPRED 2014-12-02 11:25:41,898 INFO [HiveServer2-Background-Pool: Thread-30]: exec.Task (SessionState.java:printInfo(536)) - Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0 2014-12-02 11:25:41,942 WARN [HiveServer2-Background-Pool: Thread-30]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use
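The stack trace above shows OperationState.validateTransition rejecting a CLOSED → ERROR transition. A hedged Python sketch of such a lifecycle validator (state names taken from the log; the allowed-transition table below is illustrative, not Hive's exact state machine):

```python
class IllegalStateTransition(Exception):
    pass

# Illustrative subset of allowed operation-lifecycle transitions.
ALLOWED = {
    "INITIALIZED": {"RUNNING", "CANCELED", "CLOSED"},
    "RUNNING": {"FINISHED", "ERROR", "CANCELED", "CLOSED"},
    "FINISHED": {"CLOSED"},
    "ERROR": {"CLOSED"},
    "CLOSED": set(),  # terminal: nothing, not even ERROR, may follow
}

def validate_transition(old: str, new: str) -> None:
    """Raise when the transition is not in the allowed table."""
    if new not in ALLOWED.get(old, set()):
        raise IllegalStateTransition(
            f"Illegal Operation state transition from {old} to {new}")

validate_transition("RUNNING", "ERROR")  # a legal transition, no error
try:
    validate_transition("CLOSED", "ERROR")  # reproduces the reported failure
except IllegalStateTransition as e:
    print(e)
```

The race Sunil hits is a query failing after the client has already closed the operation; the fix discussed in HIVE-9005 is about handling that race, not relaxing the terminal-state rule.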
[jira] [Commented] (HIVE-10242) ACID: insert overwrite prevents create table command
[ https://issues.apache.org/jira/browse/HIVE-10242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492301#comment-14492301 ] Alan Gates commented on HIVE-10242: --- Why is the templeton stuff in this patch? I suspect it doesn't belong here. TxnHandler, in the LockInfo constructor. If you're going to add defaults for those switch cases they should throw errors rather than return null. Why did you change the fields in LockInfo to final? It's fine, I'm just curious. Why make a new class LockTypeComparator? This is in the tight loop of determining locks, so we want it to go as fast as we can. Rather than requiring an object creation and function call every time through this it's better to inline that code. I don't understand the changes to DbLockManager. What is the purpose of the changes here? ACID: insert overwrite prevents create table command Key: HIVE-10242 URL: https://issues.apache.org/jira/browse/HIVE-10242 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 1.0.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-10242.2.patch, HIVE-10242.3.patch, HIVE-10242.patch 1. insert overwrite table DB.T1 select ... from T2: this takes an X lock on DB.T1 and an S lock on T2. The X lock makes sense because we don't want anyone reading T1 while it's overwritten. The S lock on T2 prevents it from being dropped while the query is in progress. 2. create table DB.T3: takes an S lock on DB. This S lock gets blocked by the X lock on T1. The S lock prevents the DB from being dropped while create table is executed. If the insert statement is long running, this blocks DDL ops on the same database. This is a usability issue. There is no good reason why an X lock on a table within a DB and an S lock on the DB should be in conflict. (This is different from a situation where the X lock is on a partition and the S lock is on the table to which this partition belongs. Here it makes sense. 
Basically there is no SQL way to address all tables in a DB but you can easily refer to all partitions of a table) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
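Eugene's proposal amounts to narrowing the lock-conflict rule: an X lock on a table should not conflict with an S lock on its enclosing database, while an X lock on a partition still conflicts with an S lock on its table. A rough Python sketch of that rule (the dotted-path object model and helper names are hypothetical, not Hive's LockInfo):

```python
def is_ancestor(a: str, b: str) -> bool:
    """True when target a (e.g. 'db') encloses target b (e.g. 'db.t1')."""
    return b.startswith(a + ".")

def level(tgt: str) -> int:
    return tgt.count(".")  # 0 = database, 1 = table, 2 = partition

def conflicts(lock1, lock2) -> bool:
    (t1, tgt1), (t2, tgt2) = lock1, lock2
    if "X" not in (t1, t2):
        return False                      # shared locks never conflict
    if tgt1 == tgt2:
        return True                       # same object: X conflicts with anything
    # normalize so lock1 is on the enclosing (ancestor) object
    if is_ancestor(tgt2, tgt1):
        (t1, tgt1), (t2, tgt2) = (t2, tgt2), (t1, tgt1)
    if not is_ancestor(tgt1, tgt2):
        return False                      # unrelated objects
    # Per the JIRA: S on a DB vs X on one of its tables need not conflict,
    # since SQL offers no way to address all tables of a DB at once.
    if t1 == "S" and level(tgt1) == 0 and level(tgt2) == 1 and t2 == "X":
        return False
    return True

print(conflicts(("X", "db.t1"), ("S", "db")))        # False: DDL on db can proceed
print(conflicts(("X", "db.t1.p=1"), ("S", "db.t1"))) # True: partition X blocks table S
```

The asymmetry between the two cases is exactly the distinction drawn in the issue description.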
[jira] [Commented] (HIVE-10277) Unable to process Comment line '--' in HIVE-1.1.0
[ https://issues.apache.org/jira/browse/HIVE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492310#comment-14492310 ] Kaveen Raajan commented on HIVE-10277: -- Here I removed all lines containing '--' and passed the rest of the string to processCmd() in CliDriver.java https://issues.apache.org/jira/secure/attachment/12724891/HIVE-10277-1.patch It is working fine. Unable to process Comment line '--' in HIVE-1.1.0 - Key: HIVE-10277 URL: https://issues.apache.org/jira/browse/HIVE-10277 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.0.0 Reporter: Kaveen Raajan Assignee: Chinna Rao Lalam Priority: Minor Labels: hive Attachments: HIVE-10277-1.patch, HIVE-10277.patch I tried to use a comment line (*--*) in the HIVE-1.1.0 shell like, ~hive>--this is comment line~ ~hive>show tables;~ I got an error like {quote} NoViableAltException(-1@[]) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1020) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) FAILED: ParseException line 2:0 cannot recognize input near 'EOF' 'EOF' 'EOF' {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
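The approach Kaveen describes — dropping full-line '--' comments before handing the command to processCmd() — can be sketched in Python (the actual change belongs in CliDriver.java; this naive version would also strip a line beginning with '--' inside a multi-line string literal, so the real patch needs more care):

```python
def strip_comment_lines(command: str) -> str:
    """Remove lines that start with '--' so a comment preceding a
    command like 'set foo=bar;' never reaches the SQL parser,
    avoiding the NoViableAltException seen in HIVE-3296/HIVE-10277."""
    kept = [line for line in command.splitlines()
            if not line.lstrip().startswith("--")]
    return "\n".join(kept)

cmd = "-- this is a comment\nset hive.test=foo;"
print(strip_comment_lines(cmd))  # set hive.test=foo;
```

Only leading full-line comments are affected; statements with trailing content on the same line pass through untouched.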
[jira] [Updated] (HIVE-10313) Literal Decimal ExprNodeConstantDesc should contain value of HiveDecimal instead of String
[ https://issues.apache.org/jira/browse/HIVE-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-10313: --- Attachment: HIVE-10313.patch Patch has also been uploaded to https://reviews.apache.org/r/33128/ and requested for review. Thanks in advance. Literal Decimal ExprNodeConstantDesc should contain value of HiveDecimal instead of String -- Key: HIVE-10313 URL: https://issues.apache.org/jira/browse/HIVE-10313 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.0.0 Reporter: Chaoyu Tang Assignee: Chaoyu Tang Attachments: HIVE-10313.patch In TypeCheckProcFactory.NumExprProcessor, the ExprNodeConstantDesc is created from strVal: {code} else if (expr.getText().endsWith("BD")) { // Literal decimal String strVal = expr.getText().substring(0, expr.getText().length() - 2); HiveDecimal hd = HiveDecimal.create(strVal); int prec = 1; int scale = 0; if (hd != null) { prec = hd.precision(); scale = hd.scale(); } DecimalTypeInfo typeInfo = TypeInfoFactory.getDecimalTypeInfo(prec, scale); return new ExprNodeConstantDesc(typeInfo, strVal); } {code} It should use HiveDecimal: return new ExprNodeConstantDesc(typeInfo, hd); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
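The fix swaps the String constant for the HiveDecimal value. The surrounding precision/scale derivation for a 'BD'-suffixed literal can be mirrored in Python with the standard decimal module (a sketch of the logic only, not Hive code; HiveDecimal.create may normalize differently):

```python
from decimal import Decimal

def parse_decimal_literal(text: str):
    """Parse a literal like '12.345BD' into (value, precision, scale),
    mirroring the derivation in NumExprProcessor. The bug in HIVE-10313
    is that the constant node carried the string; it should carry `value`."""
    assert text.endswith("BD")
    value = Decimal(text[:-2])            # strip the 'BD' suffix
    sign, digits, exponent = value.as_tuple()
    scale = max(0, -exponent)             # digits after the decimal point
    precision = max(len(digits), scale)   # total significant digits
    return value, precision, scale

print(parse_decimal_literal("12.345BD"))  # (Decimal('12.345'), 5, 3)
```

Carrying the typed value rather than its string form is what lets later constant folding and vectorized paths treat the literal as a decimal.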
[jira] [Updated] (HIVE-10315) CBO (Calcite Return Path): HiveRelSize accessing columns without available stats [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10315: --- Attachment: HIVE-10315.cbo.patch CBO (Calcite Return Path): HiveRelSize accessing columns without available stats [CBO branch] -- Key: HIVE-10315 URL: https://issues.apache.org/jira/browse/HIVE-10315 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10315.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10315) CBO (Calcite Return Path): HiveRelSize accessing columns without available stats [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10315: --- Attachment: (was: HIVE-10315.cbo.patch) CBO (Calcite Return Path): HiveRelSize accessing columns without available stats [CBO branch] -- Key: HIVE-10315 URL: https://issues.apache.org/jira/browse/HIVE-10315 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10315.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)
[ https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-10190: --- Assignee: Reuben Kuhnert (was: Pengcheng Xiong) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE) - Key: HIVE-10190 URL: https://issues.apache.org/jira/browse/HIVE-10190 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Reuben Kuhnert Priority: Trivial Labels: perfomance Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch, HIVE-10190.02.patch, HIVE-10190.03.patch {code} public static boolean validateASTForUnsupportedTokens(ASTNode ast) { String astTree = ast.toStringTree(); // if any of following tokens are present in AST, bail out String[] tokens = { TOK_CHARSETLITERAL, TOK_TABLESPLITSAMPLE }; for (String token : tokens) { if (astTree.contains(token)) { return false; } } return true; } {code} This is an issue for a SQL query which is bigger in AST form than in text (~700kb). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
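Since the concern above is that ast.toStringTree().contains(token) serializes the whole tree (~700 KB for the reported query) on every check, one plausible alternative — a sketch, not the actual HIVE-10190 patch — is a short-circuiting walk over the tree:

```python
class Node:
    """Tiny stand-in for an AST node; the .token/.children shape is
    hypothetical, chosen only to illustrate the traversal."""
    def __init__(self, token, children=()):
        self.token = token
        self.children = list(children)

def contains_token(node, bad_tokens) -> bool:
    """Depth-first search that stops at the first unsupported token,
    avoiding serializing the entire AST to a string."""
    stack = [node]
    while stack:
        n = stack.pop()
        if n.token in bad_tokens:
            return True
        stack.extend(n.children)
    return False

ast = Node("TOK_QUERY", [Node("TOK_SELECT"), Node("TOK_TABLESPLITSAMPLE")])
print(contains_token(ast, {"TOK_CHARSETLITERAL", "TOK_TABLESPLITSAMPLE"}))  # True
```

The walk also avoids false positives that a substring match could produce if one token name were a prefix of another.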
[jira] [Commented] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL
[ https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492451#comment-14492451 ] Sergio Peña commented on HIVE-10239: The metastore upgrade test failed, but it did not publish the results on this ticket. http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-METASTORE-Test/34/console The failure was because it tried to patch the file with 'patch -p1' instead of -p0. Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL Key: HIVE-10239 URL: https://issues.apache.org/jira/browse/HIVE-10239 Project: Hive Issue Type: Improvement Affects Versions: 1.1.0 Reporter: Naveen Gangam Assignee: Naveen Gangam Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, HIVE-10239.patch Need to create DB-implementation specific scripts to use the framework introduced in HIVE-9800 to have any metastore schema changes tested across all supported databases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10317) The HMS upgrade test fails when applying patches with '-p1' instead of '-p0'
[ https://issues.apache.org/jira/browse/HIVE-10317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-10317: --- Component/s: Testing Infrastructure The HMS upgrade test fails when applying patches with '-p1' instead of '-p0' Key: HIVE-10317 URL: https://issues.apache.org/jira/browse/HIVE-10317 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Sergio Peña Assignee: Sergio Peña Patches uploaded to jira might be applied to the source code using 'patch -p0' or 'patch -p1' directly on the root directory of the branch. The HMS upgrade test is using only '-p1', so some patches failed. We need to support the '-p0' on the HMS upgrade test as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10314) CBO (Calcite Return Path): TOK_ALLCOLREF not being replaced in GroupBy clause [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10314: --- Summary: CBO (Calcite Return Path): TOK_ALLCOLREF not being replaced in GroupBy clause [CBO branch] (was: CBO (Calcite Return Path): TOK_ALLCOLREF no being replaced in GroupBy clause [CBO branch]) CBO (Calcite Return Path): TOK_ALLCOLREF not being replaced in GroupBy clause [CBO branch] -- Key: HIVE-10314 URL: https://issues.apache.org/jira/browse/HIVE-10314 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10148) update of bucketing column should not be allowed
[ https://issues.apache.org/jira/browse/HIVE-10148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492624#comment-14492624 ] Eugene Koifman commented on HIVE-10148: --- fixed Jira name update of bucketing column should not be allowed Key: HIVE-10148 URL: https://issues.apache.org/jira/browse/HIVE-10148 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 1.1.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 1.2.0 Attachments: HIVE-10148.2.patch, HIVE-10148.3.patch, HIVE-10148.4.patch, HIVE-10148.5.patch, HIVE-10148.6.patch, HIVE-10148.patch update tbl set a = 5; should raise an error if 'a' is a bucketing column. Such operation is not supported but currently not checked for. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10062) HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data
[ https://issues.apache.org/jira/browse/HIVE-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492637#comment-14492637 ] Pengcheng Xiong commented on HIVE-10062: [~hagleitn], the test failures are not related. Could you please review the code again? Thanks. HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data - Key: HIVE-10062 URL: https://issues.apache.org/jira/browse/HIVE-10062 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Priority: Critical Attachments: HIVE-10062.01.patch, HIVE-10062.02.patch, HIVE-10062.03.patch, HIVE-10062.04.patch In q.test environment with src table, execute the following query: {code} CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE; CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE; FROM (select 'tst1' as key, cast(count(1) as string) as value from src s1 UNION all select s2.key as key, s2.value as value from src s2) unionsrc INSERT OVERWRITE TABLE DEST1 SELECT unionsrc.key, COUNT(DISTINCT SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key INSERT OVERWRITE TABLE DEST2 SELECT unionsrc.key, unionsrc.value, COUNT(DISTINCT SUBSTR(unionsrc.value,5)) GROUP BY unionsrc.key, unionsrc.value; select * from DEST1; select * from DEST2; {code} DEST1 and DEST2 should both have 310 rows. However, DEST2 only has 1 row: tst1 500 1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10313) Literal Decimal ExprNodeConstantDesc should contain value of HiveDecimal instead of String
[ https://issues.apache.org/jira/browse/HIVE-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-10313: --- Attachment: HIVE-10313.1.patch Fixed two failed test outputs (vector_decimal_2.q.out). The failure from TestJdbcWithMiniHS2#testNewConnectionConfiguration seems not related to this patch. It fails w/ or w/o this patch. Literal Decimal ExprNodeConstantDesc should contain value of HiveDecimal instead of String -- Key: HIVE-10313 URL: https://issues.apache.org/jira/browse/HIVE-10313 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.0.0 Reporter: Chaoyu Tang Assignee: Chaoyu Tang Attachments: HIVE-10313.1.patch, HIVE-10313.patch In TypeCheckProcFactory.NumExprProcessor, the ExprNodeConstantDesc is created from strVal: {code} else if (expr.getText().endsWith("BD")) { // Literal decimal String strVal = expr.getText().substring(0, expr.getText().length() - 2); HiveDecimal hd = HiveDecimal.create(strVal); int prec = 1; int scale = 0; if (hd != null) { prec = hd.precision(); scale = hd.scale(); } DecimalTypeInfo typeInfo = TypeInfoFactory.getDecimalTypeInfo(prec, scale); return new ExprNodeConstantDesc(typeInfo, strVal); } {code} It should use HiveDecimal: return new ExprNodeConstantDesc(typeInfo, hd); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10315) CBO (Calcite Return Path): HiveRelSize accessing columns without available stats [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10315: --- Attachment: (was: HIVE-10315.cbo.patch) CBO (Calcite Return Path): HiveRelSize accessing columns without available stats [CBO branch] -- Key: HIVE-10315 URL: https://issues.apache.org/jira/browse/HIVE-10315 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10315.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10315) CBO (Calcite Return Path): HiveRelSize accessing columns without available stats [CBO branch]
[ https://issues.apache.org/jira/browse/HIVE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10315: --- Attachment: HIVE-10315.cbo.patch CBO (Calcite Return Path): HiveRelSize accessing columns without available stats [CBO branch] -- Key: HIVE-10315 URL: https://issues.apache.org/jira/browse/HIVE-10315 Project: Hive Issue Type: Sub-task Components: CBO Affects Versions: cbo-branch Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: cbo-branch Attachments: HIVE-10315.cbo.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)