[jira] [Assigned] (HIVE-4239) Remove lock on compilation stage
[ https://issues.apache.org/jira/browse/HIVE-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin reassigned HIVE-4239:
--------------------------------------

    Assignee: Sergey Shelukhin

Remove lock on compilation stage
--------------------------------

    Key: HIVE-4239
    URL: https://issues.apache.org/jira/browse/HIVE-4239
    Project: Hive
    Issue Type: Bug
    Components: HiveServer2, Query Processor
    Reporter: Carl Steinbach
    Assignee: Sergey Shelukhin

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-10528) Hiveserver2 in HTTP mode is not applying auth_to_local rules
[ https://issues.apache.org/jira/browse/HIVE-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdelrahman Shettia updated HIVE-10528:
---------------------------------------

    Attachment: HIVE-10528.3.patch

Hiveserver2 in HTTP mode is not applying auth_to_local rules
------------------------------------------------------------

    Key: HIVE-10528
    URL: https://issues.apache.org/jira/browse/HIVE-10528
    Project: Hive
    Issue Type: Bug
    Components: HiveServer2
    Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0
    Environment: Centos 6
    Reporter: Abdelrahman Shettia
    Assignee: Abdelrahman Shettia
    Attachments: HIVE-10528.1.patch, HIVE-10528.1.patch, HIVE-10528.2.patch, HIVE-10528.3.patch

PROBLEM:

When authenticating to HS2 in HTTP mode with Kerberos, auth_to_local mappings are not applied. As a result, the various permission checks that rely on a user's local cluster name fail.

STEPS TO REPRODUCE:

1. Create a kerberos cluster and HS2 in HTTP mode
2. Create a new user, test, along with a kerberos principal for this user
3. Create a separate principal, mapped-test
4. Create an auth_to_local rule to make sure that mapped-test is mapped to test
5. As the test user, connect to HS2 with beeline and create a simple table:
{code}
CREATE TABLE permtest (field1 int);
{code}
There is no need to load anything into this table.
6. Establish that it works as the test user:
{code}
show create table permtest;
{code}
7. Drop the test identity and become mapped-test
8. Re-connect to HS2 with beeline and re-run the above command:
{code}
show create table permtest;
{code}
When this is done in HTTP mode, you will get an HDFS error (because StorageBasedAuthorization does an HDFS permissions check) and the user will be mapped-test, NOT test as it should be.

ANALYSIS:

This appears to be HTTP-specific, and the problem seems to be in {{ThriftHttpServlet$HttpKerberosServerAction.getPrincipalWithoutRealmAndHost()}}:
{code}
try {
  fullKerberosName = ShimLoader.getHadoopShims().getKerberosNameShim(fullPrincipal);
} catch (IOException e) {
  throw new HttpAuthenticationException(e);
}
return fullKerberosName.getServiceName();
{code}
getServiceName() applies no auth_to_local rules. It seems this should be getShortName() instead.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
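[Editor's note] The analysis above hinges on the difference between a principal's service name (no mapping applied) and its short name (auth_to_local mapping applied). A simplified Python sketch of that difference, not Hadoop's implementation: real auth_to_local rules use the `RULE:[n:pattern](regex)s/.../.../` syntax, which is reduced here to a plain substitution table for illustration.

```python
# Illustration of why getServiceName() and getShortName() disagree for a
# mapped principal. The "rules" dict is a hypothetical stand-in for real
# auth_to_local rules.

def service_name(full_principal: str) -> str:
    """Like KerberosName.getServiceName(): first component of the
    principal, with NO auth_to_local mapping applied."""
    return full_principal.split("/")[0].split("@")[0]

def short_name(full_principal: str, auth_to_local: dict) -> str:
    """Like KerberosName.getShortName(): strip realm/host, then apply
    the local-name mapping."""
    base = service_name(full_principal)
    return auth_to_local.get(base, base)

rules = {"mapped-test": "test"}  # hypothetical rule: mapped-test -> test
print(service_name("mapped-test@EXAMPLE.COM"))       # mapped-test
print(short_name("mapped-test@EXAMPLE.COM", rules))  # test
```

This is exactly the behavior in the bug report: the service name stays `mapped-test`, while the short name correctly becomes `test`.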
[jira] [Updated] (HIVE-10828) Insert...values for fewer number of columns fail
[ https://issues.apache.org/jira/browse/HIVE-10828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aswathy Chellammal Sreekumar updated HIVE-10828:
------------------------------------------------

Description:

Schema-on-insert queries that specify fewer columns than the target table fail with the error message below:
{noformat}
ERROR ql.Driver (SessionState.java:printError(957)) - FAILED: NullPointerException null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:7277)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBucketingSortingDest(SemanticAnalyzer.java:6120)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6291)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8992)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8883)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9728)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9621)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10094)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:324)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10105)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}

Steps to reproduce:
{code}
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.enforce.bucketing=true;

drop table if exists table1;
create table table1 (a int, b string, c string) partitioned by (bkt int) clustered by (a) into 2 buckets stored as orc tblproperties ('transactional'='true');
insert into table_1 partition (bkt) (b, a, bkt) values ('part one', 1, 1), ('part one', 2, 1), ('part two', 3, 2), ('part three', 4, 3);
{code}
[jira] [Updated] (HIVE-10550) Dynamic RDD caching optimization for HoS.[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengxiang Li updated HIVE-10550:
---------------------------------

    Attachment: HIVE-10550.5-spark.patch

Dynamic RDD caching optimization for HoS.[Spark Branch]
-------------------------------------------------------

    Key: HIVE-10550
    URL: https://issues.apache.org/jira/browse/HIVE-10550
    Project: Hive
    Issue Type: Sub-task
    Components: Spark
    Reporter: Chengxiang Li
    Assignee: Chengxiang Li
    Attachments: HIVE-10550.1-spark.patch, HIVE-10550.1.patch, HIVE-10550.2-spark.patch, HIVE-10550.3-spark.patch, HIVE-10550.4-spark.patch, HIVE-10550.5-spark.patch

A Hive query may scan the same table multiple times, e.g. in a self-join or a self-union, or several parts of the query may share the same subquery; [TPC-DS Q39|https://github.com/hortonworks/hive-testbench/blob/hive14/sample-queries-tpcds/query39.sql] is an example. Spark supports caching RDD data: it keeps the computed RDD data in memory and serves it from memory the next time it is needed, which avoids recomputing that RDD (and all of its dependencies) at the cost of extra memory usage. By analyzing the query context, we should be able to identify which parts of a query can be shared, so that the generated Spark job can reuse the cached RDD.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
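[Editor's note] The optimization described above boils down to computing a shared intermediate once and reusing it, instead of re-evaluating it per consumer. A minimal Python sketch of the idea (not the Spark API; `expensive_scan` is a hypothetical stand-in for a table scan or shared subquery, and the counter just makes the recomputation visible):

```python
# Track how many times the "table scan" actually runs.
calls = {"scans": 0}

def expensive_scan():
    # Stand-in for scanning a table / evaluating a shared subquery.
    calls["scans"] += 1
    return [1, 2, 3, 4]

# Without caching: a self-join-style query evaluates the scan twice.
left, right = expensive_scan(), expensive_scan()
assert calls["scans"] == 2

# With caching (Spark's rdd.cache() plays this role): scan once, reuse
# the in-memory result for both sides, trading memory for compute.
calls["scans"] = 0
cached = expensive_scan()
left, right = cached, cached
assert calls["scans"] == 1
```

In Hive-on-Spark terms, the planner would detect the shared work during query analysis and mark the corresponding RDD for caching in the generated job.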
[jira] [Updated] (HIVE-10761) Create codahale-based metrics system for Hive
[ https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szehon Ho updated HIVE-10761:
-----------------------------

    Attachment: HIVE-10761.2.patch

Tied up some loose ends: the system now takes in a configured list of reporters, and there is an end-to-end unit test for Metastore metrics. The latest patch should be ready for review.

Create codahale-based metrics system for Hive
---------------------------------------------

    Key: HIVE-10761
    URL: https://issues.apache.org/jira/browse/HIVE-10761
    Project: Hive
    Issue Type: New Feature
    Components: Diagnosability
    Reporter: Szehon Ho
    Assignee: Szehon Ho
    Attachments: HIVE-10761.2.patch, HIVE-10761.patch, hms-metrics.json

There is a current Hive metrics system that hooks up to JMX reporting, but all of its measurements and models are custom. This issue adds another metrics system based on Codahale (i.e. Yammer, Dropwizard), which has the following advantages:
* A well-defined metric model for frequently-needed metrics (e.g. JVM metrics)
* Well-defined measurements for all metrics (e.g. max, mean, stddev, mean_rate, etc.)
* Built-in reporting frameworks such as JMX, Console, Log, and a JSON webserver

It is used by many projects, including several Apache projects such as Oozie. Overall, monitoring tools should find these common metric, measurement, and reporting models easier to understand. The existing metrics subsystem will be kept and can be enabled if backward compatibility is desired.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
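[Editor's note] To make the "well-defined measurements" point concrete: a Codahale-style timer records durations and derives a standard set of aggregates (count, max, mean, stddev; rates are omitted here since they need a clock). A rough self-contained Python sketch, illustrative only and not the Codahale implementation:

```python
import statistics

class Timer:
    """Toy metric: records durations (in seconds) and derives
    Codahale-style summary measurements from them."""

    def __init__(self):
        self.durations = []

    def update(self, seconds):
        self.durations.append(seconds)

    def snapshot(self):
        d = self.durations
        return {
            "count": len(d),
            "max": max(d),
            "mean": statistics.mean(d),
            # Sample stddev needs at least two observations.
            "stddev": statistics.stdev(d) if len(d) > 1 else 0.0,
        }

t = Timer()
for s in (0.1, 0.2, 0.3):   # e.g. three timed metastore calls
    t.update(s)
print(t.snapshot())
```

The benefit the ticket describes is that every metric exposes this same, predictable measurement shape, so monitoring tools do not need per-project custom models.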
[jira] [Updated] (HIVE-10689) HS2 metadata api calls should use HiveAuthorizer interface for authorization
[ https://issues.apache.org/jira/browse/HIVE-10689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated HIVE-10689:
---------------------------------

    Attachment: HIVE-10689.1.patch

HS2 metadata api calls should use HiveAuthorizer interface for authorization
----------------------------------------------------------------------------

    Key: HIVE-10689
    URL: https://issues.apache.org/jira/browse/HIVE-10689
    Project: Hive
    Issue Type: Bug
    Components: Authorization, SQLStandardAuthorization
    Reporter: Thejas M Nair
    Assignee: Thejas M Nair
    Attachments: HIVE-10689.1.patch

The java.sql.DatabaseMetaData APIs in the JDBC driver result in calls to the HS2 metadata APIs, which are executed via separate Hive Operation implementations that do not use the Hive Driver class. Invocation of these APIs should also be authorized using the HiveAuthorizer API.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-10829) ATS hook fails for explainTask
[ https://issues.apache.org/jira/browse/HIVE-10829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pengcheng Xiong updated HIVE-10829:
-----------------------------------

    Attachment: HIVE-10829.01.patch

ATS hook fails for explainTask
------------------------------

    Key: HIVE-10829
    URL: https://issues.apache.org/jira/browse/HIVE-10829
    Project: Hive
    Issue Type: Bug
    Reporter: Pengcheng Xiong
    Assignee: Pengcheng Xiong
    Priority: Minor
    Attachments: HIVE-10829.01.patch

Commands:
{code}
create table idtable(id string);
create table ctastable as select * from idtable;
{code}
With the ATS hook enabled:
{noformat}
2015-05-22 18:54:47,092 INFO [ATS Logger 0]: hooks.ATSHook (ATSHook.java:run(136)) - Failed to submit plan to ATS: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:589)
    at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:576)
    at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:821)
    at org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:965)
    at org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:219)
    at org.apache.hadoop.hive.ql.hooks.ATSHook$2.run(ATSHook.java:120)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-10731) NullPointerException in HiveParser.g
[ https://issues.apache.org/jira/browse/HIVE-10731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560177#comment-14560177 ]

Pengcheng Xiong commented on HIVE-10731:
----------------------------------------

[~jpullokkaran], this patch also needs your review. Thanks.

NullPointerException in HiveParser.g
------------------------------------

    Key: HIVE-10731
    URL: https://issues.apache.org/jira/browse/HIVE-10731
    Project: Hive
    Issue Type: Bug
    Components: Query Planning
    Affects Versions: 1.2.0
    Reporter: Xiu
    Assignee: Pengcheng Xiong
    Priority: Minor
    Attachments: HIVE-10731.01.patch

In HiveParser.g:
{code:Java}
protected boolean useSQL11ReservedKeywordsForIdentifier() {
    return !HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS);
}
{code}
A NullPointerException is thrown when hiveConf is not set. Stack trace:
{code:Java}
java.lang.NullPointerException
    at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:2583)
    at org.apache.hadoop.hive.ql.parse.HiveParser.useSQL11ReservedKeywordsForIdentifier(HiveParser.java:1000)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.useSQL11ReservedKeywordsForIdentifier(HiveParser_IdentifiersParser.java:726)
    at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:10922)
    at org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45808)
    at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameType(HiveParser.java:38008)
    at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeList(HiveParser.java:36167)
    at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:5214)
    at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2640)
    at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1650)
    at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:161)
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
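[Editor's note] The NPE occurs because the method dereferences an unset configuration; the usual fix pattern is a null guard that falls back to the shipped default. A hedged Python sketch of that guard (the real fix belongs in the Java code in HiveParser.g; the dict-based config and the default value here are illustrative stand-ins, not Hive's `HiveConf`):

```python
# Illustrative default; the guard pattern, not this constant, is the point.
DEFAULT_SUPPORT_SQL11_RESERVED_KEYWORDS = True

def use_sql11_reserved_keywords_for_identifier(hive_conf):
    # Guard against an unset configuration instead of dereferencing it:
    # the original code called HiveConf.getBoolVar(hiveConf, ...) and
    # threw a NullPointerException when hiveConf was null.
    if hive_conf is None:
        return not DEFAULT_SUPPORT_SQL11_RESERVED_KEYWORDS
    return not hive_conf.get("hive.support.sql11.reserved.keywords",
                             DEFAULT_SUPPORT_SQL11_RESERVED_KEYWORDS)

# Unset configuration no longer raises; it behaves like the default.
print(use_sql11_reserved_keywords_for_identifier(None))  # False
```

An alternative design is to require callers to inject the configuration before parsing and fail fast with a clear error; the guard above merely makes the failure mode non-fatal.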
[jira] [Updated] (HIVE-10528) Hiveserver2 in HTTP mode is not applying auth_to_local rules
[ https://issues.apache.org/jira/browse/HIVE-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abdelrahman Shettia updated HIVE-10528:
---------------------------------------

    Attachment: HIVE-10528.2.patch

Hiveserver2 in HTTP mode is not applying auth_to_local rules
------------------------------------------------------------

    Key: HIVE-10528
    URL: https://issues.apache.org/jira/browse/HIVE-10528
    Project: Hive
    Issue Type: Bug
    Components: HiveServer2
    Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0
    Environment: Centos 6
    Reporter: Abdelrahman Shettia
    Assignee: Abdelrahman Shettia
    Attachments: HIVE-10528.1.patch, HIVE-10528.1.patch, HIVE-10528.2.patch

PROBLEM:

When authenticating to HS2 in HTTP mode with Kerberos, auth_to_local mappings are not applied. As a result, the various permission checks that rely on a user's local cluster name fail.

STEPS TO REPRODUCE:

1. Create a kerberos cluster and HS2 in HTTP mode
2. Create a new user, test, along with a kerberos principal for this user
3. Create a separate principal, mapped-test
4. Create an auth_to_local rule to make sure that mapped-test is mapped to test
5. As the test user, connect to HS2 with beeline and create a simple table:
{code}
CREATE TABLE permtest (field1 int);
{code}
There is no need to load anything into this table.
6. Establish that it works as the test user:
{code}
show create table permtest;
{code}
7. Drop the test identity and become mapped-test
8. Re-connect to HS2 with beeline and re-run the above command:
{code}
show create table permtest;
{code}
When this is done in HTTP mode, you will get an HDFS error (because StorageBasedAuthorization does an HDFS permissions check) and the user will be mapped-test, NOT test as it should be.

ANALYSIS:

This appears to be HTTP-specific, and the problem seems to be in {{ThriftHttpServlet$HttpKerberosServerAction.getPrincipalWithoutRealmAndHost()}}:
{code}
try {
  fullKerberosName = ShimLoader.getHadoopShims().getKerberosNameShim(fullPrincipal);
} catch (IOException e) {
  throw new HttpAuthenticationException(e);
}
return fullKerberosName.getServiceName();
{code}
getServiceName() applies no auth_to_local rules. It seems this should be getShortName() instead.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-10819) SearchArgumentImpl for Timestamp is broken by HIVE-10286
[ https://issues.apache.org/jira/browse/HIVE-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560238#comment-14560238 ]

Ferdinand Xu commented on HIVE-10819:
-------------------------------------

Hi [~sershe], [~daijy], the problematic commit has already been reverted.
{noformat}
Repository: hive
Updated Branches:
  refs/heads/master db8067f96 - a00bf4f87

Revert HIVE-10277: Unable to process Comment line '--' in HIVE-1.1.0 (Chinna via Xuefu)

This reverts commit d66a7347ab97983cc5b9fca6bdabebc81e5a77e5.
{noformat}

SearchArgumentImpl for Timestamp is broken by HIVE-10286
--------------------------------------------------------

    Key: HIVE-10819
    URL: https://issues.apache.org/jira/browse/HIVE-10819
    Project: Hive
    Issue Type: Bug
    Reporter: Daniel Dai
    Assignee: Daniel Dai
    Fix For: 1.2.1
    Attachments: HIVE-10819.1.patch, HIVE-10819.2.patch, HIVE-10819.3.patch

The workaround for the kryo bug for Timestamp was accidentally removed by HIVE-10286. We need to bring it back.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-10788) Change sort_array to support non-primitive types
[ https://issues.apache.org/jira/browse/HIVE-10788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560240#comment-14560240 ]

Hive QA commented on HIVE-10788:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735427/HIVE-10788.1.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 8977 tests executed

*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udf_sort_array_wrong1
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4048/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4048/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4048/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12735427 - PreCommit-HIVE-TRUNK-Build

Change sort_array to support non-primitive types
------------------------------------------------

    Key: HIVE-10788
    URL: https://issues.apache.org/jira/browse/HIVE-10788
    Project: Hive
    Issue Type: Bug
    Components: UDF
    Reporter: Chao Sun
    Assignee: Chao Sun
    Attachments: HIVE-10788.1.patch

Currently {{sort_array}} only supports primitive types. As we already support comparison between non-primitive types, it makes sense to remove this restriction.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
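[Editor's note] For intuition, sorting an array of non-primitive elements only requires a well-defined element comparison, e.g. field-by-field for structs. A tiny Python analogy (illustrative only; Hive implements the comparison through its ObjectInspector machinery, not shown here):

```python
# Tuples compare field by field, much like structs; this is the kind of
# ordering sort_array would need for array<struct<...>> inputs.
structs = [(2, "b"), (1, "c"), (1, "a")]
print(sorted(structs))  # [(1, 'a'), (1, 'c'), (2, 'b')]
```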
[jira] [Updated] (HIVE-10828) Insert...values for fewer number of columns fail
[ https://issues.apache.org/jira/browse/HIVE-10828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-10828:
----------------------------------

Description:

Schema-on-insert queries that specify fewer columns than the target table fail with the error message below:
{noformat}
ERROR ql.Driver (SessionState.java:printError(957)) - FAILED: NullPointerException null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:7277)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBucketingSortingDest(SemanticAnalyzer.java:6120)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6291)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8992)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8883)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9728)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9621)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10094)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:324)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10105)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}

*Steps to reproduce:*
{code}
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.enforce.bucketing=true;

drop table if exists table1;
create table table1 (a int, b string, c string) partitioned by (bkt int) clustered by (a) into 2 buckets stored as orc tblproperties ('transactional'='true');
insert into table_1 partition (bkt) (b, a, bkt) values ('part one', 1, 1), ('part one', 2, 1), ('part two', 3, 2), ('part three', 4, 3);
{code}
[jira] [Commented] (HIVE-10828) Insert...values for fewer number of columns fail
[ https://issues.apache.org/jira/browse/HIVE-10828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560290#comment-14560290 ] Eugene Koifman commented on HIVE-10828: --- Simpler repro case {noformat} set hive.enforce.bucketing=true; set hive.exec.dynamic.partition.mode=nonstrict; set hive.cbo.enable=false; drop table if exists acid_partitioned; create table acid_partitioned (a int, c string) partitioned by (p int) clustered by (a) into 1 buckets; insert into acid_partitioned partition (p) (a,p) values(1,1); {noformat} above example disables CBO because it causes additional issues. will file separate ticket for that Insert...values for fewer number of columns fail Key: HIVE-10828 URL: https://issues.apache.org/jira/browse/HIVE-10828 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Reporter: Aswathy Chellammal Sreekumar Assignee: Eugene Koifman Schema on insert queries with fewer number of columns fails with below error message ERROR ql.Driver (SessionState.java:printError(957)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlan(SemanticAnalyzer.java:7277) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBucketingSortingDest(SemanticAnalyzer.java:6120) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6291) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8992) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8883) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9728) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9621) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10094) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:324) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10105) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311) at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409) at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) *Steps to reproduce:* set hive.support.concurrency=true; set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; set hive.enforce.bucketing=true; drop table if exists table1; create table table1 (a int, b string, c string) partitioned by (bkt int) clustered by (a) into 2 buckets stored as orc 
tblproperties ('transactional'='true'); insert into table1 partition (bkt) (b, a, bkt) values ('part one', 1, 1), ('part one', 2, 1), ('part two', 3, 2), ('part three', 4, 3); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
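For reference, the schema-on-insert semantics the repro relies on — target columns listed in the INSERT are mapped to the table schema and unspecified columns are filled with NULL — can be sketched outside Hive. The helper below is hypothetical and only illustrates the column-to-schema mapping that the failing genReduceSinkPlan path has to handle; it is not Hive code.

```java
import java.util.Arrays;
import java.util.List;

public class SchemaOnInsert {
    // Expand a row given for a subset of target columns to the full schema,
    // filling unspecified columns with NULL (what INSERT ... (cols) VALUES implies).
    static Object[] expandRow(List<String> schema, List<String> targetCols, Object[] values) {
        if (targetCols.size() != values.length)
            throw new IllegalArgumentException("column/value count mismatch");
        Object[] row = new Object[schema.size()];   // all slots start as null
        for (int i = 0; i < targetCols.size(); i++) {
            int pos = schema.indexOf(targetCols.get(i));
            if (pos < 0)
                throw new IllegalArgumentException("unknown column " + targetCols.get(i));
            row[pos] = values[i];
        }
        return row;
    }
}
```

For the repro's `insert into table1 partition (bkt) (b, a, bkt) values ('part one', 1, 1)`, the schema `(a, b, c, bkt)` expands to `(1, 'part one', NULL, 1)` — column `c` was not specified, so it becomes NULL.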
[jira] [Commented] (HIVE-9069) Simplify filter predicates for CBO
[ https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560306#comment-14560306 ] Hive QA commented on HIVE-9069: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735433/HIVE-9069.14.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8975 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_7 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_7 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4049/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4049/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4049/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12735433 - PreCommit-HIVE-TRUNK-Build Simplify filter predicates for CBO -- Key: HIVE-9069 URL: https://issues.apache.org/jira/browse/HIVE-9069 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Jesus Camacho Rodriguez Fix For: 0.14.1 Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch, HIVE-9069.03.patch, HIVE-9069.04.patch, HIVE-9069.05.patch, HIVE-9069.06.patch, HIVE-9069.07.patch, HIVE-9069.08.patch, HIVE-9069.08.patch, HIVE-9069.09.patch, HIVE-9069.10.patch, HIVE-9069.11.patch, HIVE-9069.12.patch, HIVE-9069.13.patch, HIVE-9069.14.patch, HIVE-9069.14.patch, HIVE-9069.patch Simplify predicates for disjunctive predicates so that can get pushed down to the scan. Looks like this is still an issue, some of the filters can be pushed down to the scan. {code} set hive.cbo.enable=true set hive.stats.fetch.column.stats=true set hive.exec.dynamic.partition.mode=nonstrict set hive.tez.auto.reducer.parallelism=true set hive.auto.convert.join.noconditionaltask.size=32000 set hive.exec.reducers.bytes.per.reducer=1 set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager set hive.support.concurrency=false set hive.tez.exec.print.summary=true explain select substr(r_reason_desc,1,20) as r ,avg(ws_quantity) wq ,avg(wr_refunded_cash) ref ,avg(wr_fee) fee from web_sales, web_returns, web_page, customer_demographics cd1, customer_demographics cd2, customer_address, date_dim, reason where web_sales.ws_web_page_sk = web_page.wp_web_page_sk and web_sales.ws_item_sk = web_returns.wr_item_sk and web_sales.ws_order_number = web_returns.wr_order_number and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998 and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk and reason.r_reason_sk = web_returns.wr_reason_sk and ( ( cd1.cd_marital_status = 'M' and cd1.cd_marital_status 
= cd2.cd_marital_status and cd1.cd_education_status = '4 yr Degree' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 100.00 and 150.00 ) or ( cd1.cd_marital_status = 'D' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = 'Primary' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 50.00 and 100.00 ) or ( cd1.cd_marital_status = 'U' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = 'Advanced Degree' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 150.00 and 200.00 ) ) and ( ( ca_country = 'United States' and ca_state in ('KY', 'GA', 'NM') and ws_net_profit between 100 and 200 ) or ( ca_country = 'United States' and ca_state in ('MT', 'OR', 'IN') and ws_net_profit between 150 and 300 ) or ( ca_country = 'United States' and ca_state in ('WI', 'MO', 'WV') and ws_net_profit between 50 and 250 ) ) group by r_reason_desc order by r, wq, ref, fee limit 100 OK STAGE DEPENDENCIES: Stage-1
[jira] [Updated] (HIVE-7723) Explain plan for complex query with lots of partitions is slow due to in-efficient collection used to find a matching ReadEntity
[ https://issues.apache.org/jira/browse/HIVE-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Mokhtar updated HIVE-7723: -- Attachment: HIVE-7723.11.patch Explain plan for complex query with lots of partitions is slow due to in-efficient collection used to find a matching ReadEntity Key: HIVE-7723 URL: https://issues.apache.org/jira/browse/HIVE-7723 Project: Hive Issue Type: Bug Components: CLI, Physical Optimizer Affects Versions: 0.13.1 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Attachments: HIVE-7723.1.patch, HIVE-7723.10.patch, HIVE-7723.11.patch, HIVE-7723.2.patch, HIVE-7723.3.patch, HIVE-7723.4.patch, HIVE-7723.5.patch, HIVE-7723.6.patch, HIVE-7723.7.patch, HIVE-7723.8.patch, HIVE-7723.9.patch Explain on TPC-DS query 64 took 11 seconds; when the CLI was profiled, it showed that ReadEntity.equals is taking ~40% of the CPU. ReadEntity.equals is called from the snippet below. Again and again the set is iterated over to get the actual match; a HashMap is a better option for this case, as Set doesn't have a get method. Also, for ReadEntity, equals is case-insensitive while hash is not, which is undesired behavior. {code} public static ReadEntity addInput(Set<ReadEntity> inputs, ReadEntity newInput) { // If the input is already present, make sure the new parent is added to the input. 
if (inputs.contains(newInput)) { for (ReadEntity input : inputs) { if (input.equals(newInput)) { if ((newInput.getParents() != null) && (!newInput.getParents().isEmpty())) { input.getParents().addAll(newInput.getParents()); input.setDirect(input.isDirect() || newInput.isDirect()); } return input; } } assert false; } else { inputs.add(newInput); return newInput; } // make compile happy return null; } {code} This is the query used: {code} select cs1.product_name ,cs1.store_name ,cs1.store_zip ,cs1.b_street_number ,cs1.b_streen_name ,cs1.b_city ,cs1.b_zip ,cs1.c_street_number ,cs1.c_street_name ,cs1.c_city ,cs1.c_zip ,cs1.syear ,cs1.cnt ,cs1.s1 ,cs1.s2 ,cs1.s3 ,cs2.s1 ,cs2.s2 ,cs2.s3 ,cs2.syear ,cs2.cnt from (select i_product_name as product_name ,i_item_sk as item_sk ,s_store_name as store_name ,s_zip as store_zip ,ad1.ca_street_number as b_street_number ,ad1.ca_street_name as b_streen_name ,ad1.ca_city as b_city ,ad1.ca_zip as b_zip ,ad2.ca_street_number as c_street_number ,ad2.ca_street_name as c_street_name ,ad2.ca_city as c_city ,ad2.ca_zip as c_zip ,d1.d_year as syear ,d2.d_year as fsyear ,d3.d_year as s2year ,count(*) as cnt ,sum(ss_wholesale_cost) as s1 ,sum(ss_list_price) as s2 ,sum(ss_coupon_amt) as s3 FROM store_sales JOIN store_returns ON store_sales.ss_item_sk = store_returns.sr_item_sk and store_sales.ss_ticket_number = store_returns.sr_ticket_number JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk JOIN date_dim d1 ON store_sales.ss_sold_date_sk = d1.d_date_sk JOIN date_dim d2 ON customer.c_first_sales_date_sk = d2.d_date_sk JOIN date_dim d3 ON customer.c_first_shipto_date_sk = d3.d_date_sk JOIN store ON store_sales.ss_store_sk = store.s_store_sk JOIN customer_demographics cd1 ON store_sales.ss_cdemo_sk= cd1.cd_demo_sk JOIN customer_demographics cd2 ON customer.c_current_cdemo_sk = cd2.cd_demo_sk JOIN promotion ON store_sales.ss_promo_sk = promotion.p_promo_sk JOIN household_demographics hd1 ON store_sales.ss_hdemo_sk = hd1.hd_demo_sk 
JOIN household_demographics hd2 ON customer.c_current_hdemo_sk = hd2.hd_demo_sk JOIN customer_address ad1 ON store_sales.ss_addr_sk = ad1.ca_address_sk JOIN customer_address ad2 ON customer.c_current_addr_sk = ad2.ca_address_sk JOIN income_band ib1 ON hd1.hd_income_band_sk = ib1.ib_income_band_sk JOIN income_band ib2 ON hd2.hd_income_band_sk = ib2.ib_income_band_sk JOIN item ON store_sales.ss_item_sk = item.i_item_sk JOIN (select cs_item_sk ,sum(cs_ext_list_price) as sale,sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit) as refund from catalog_sales JOIN catalog_returns ON catalog_sales.cs_item_sk = catalog_returns.cr_item_sk and catalog_sales.cs_order_number = catalog_returns.cr_order_number group by cs_item_sk having sum(cs_ext_list_price) > 2*sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit)) cs_ui ON store_sales.ss_item_sk = cs_ui.cs_item_sk WHERE cd1.cd_marital_status
[jira] [Updated] (HIVE-7723) Explain plan for complex query with lots of partitions is slow due to in-efficient collection used to find a matching ReadEntity
[ https://issues.apache.org/jira/browse/HIVE-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Mokhtar updated HIVE-7723: -- Attachment: (was: HIVE-7723.11.patch) Explain plan for complex query with lots of partitions is slow due to in-efficient collection used to find a matching ReadEntity Key: HIVE-7723 URL: https://issues.apache.org/jira/browse/HIVE-7723 Project: Hive Issue Type: Bug Components: CLI, Physical Optimizer Affects Versions: 0.13.1 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Attachments: HIVE-7723.1.patch, HIVE-7723.10.patch, HIVE-7723.2.patch, HIVE-7723.3.patch, HIVE-7723.4.patch, HIVE-7723.5.patch, HIVE-7723.6.patch, HIVE-7723.7.patch, HIVE-7723.8.patch, HIVE-7723.9.patch Explain on TPC-DS query 64 took 11 seconds; when the CLI was profiled, it showed that ReadEntity.equals is taking ~40% of the CPU. ReadEntity.equals is called from the snippet below. Again and again the set is iterated over to get the actual match; a HashMap is a better option for this case, as Set doesn't have a get method. Also, for ReadEntity, equals is case-insensitive while hash is not, which is undesired behavior. {code} public static ReadEntity addInput(Set<ReadEntity> inputs, ReadEntity newInput) { // If the input is already present, make sure the new parent is added to the input. 
if (inputs.contains(newInput)) { for (ReadEntity input : inputs) { if (input.equals(newInput)) { if ((newInput.getParents() != null) && (!newInput.getParents().isEmpty())) { input.getParents().addAll(newInput.getParents()); input.setDirect(input.isDirect() || newInput.isDirect()); } return input; } } assert false; } else { inputs.add(newInput); return newInput; } // make compile happy return null; } {code} This is the query used: {code} select cs1.product_name ,cs1.store_name ,cs1.store_zip ,cs1.b_street_number ,cs1.b_streen_name ,cs1.b_city ,cs1.b_zip ,cs1.c_street_number ,cs1.c_street_name ,cs1.c_city ,cs1.c_zip ,cs1.syear ,cs1.cnt ,cs1.s1 ,cs1.s2 ,cs1.s3 ,cs2.s1 ,cs2.s2 ,cs2.s3 ,cs2.syear ,cs2.cnt from (select i_product_name as product_name ,i_item_sk as item_sk ,s_store_name as store_name ,s_zip as store_zip ,ad1.ca_street_number as b_street_number ,ad1.ca_street_name as b_streen_name ,ad1.ca_city as b_city ,ad1.ca_zip as b_zip ,ad2.ca_street_number as c_street_number ,ad2.ca_street_name as c_street_name ,ad2.ca_city as c_city ,ad2.ca_zip as c_zip ,d1.d_year as syear ,d2.d_year as fsyear ,d3.d_year as s2year ,count(*) as cnt ,sum(ss_wholesale_cost) as s1 ,sum(ss_list_price) as s2 ,sum(ss_coupon_amt) as s3 FROM store_sales JOIN store_returns ON store_sales.ss_item_sk = store_returns.sr_item_sk and store_sales.ss_ticket_number = store_returns.sr_ticket_number JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk JOIN date_dim d1 ON store_sales.ss_sold_date_sk = d1.d_date_sk JOIN date_dim d2 ON customer.c_first_sales_date_sk = d2.d_date_sk JOIN date_dim d3 ON customer.c_first_shipto_date_sk = d3.d_date_sk JOIN store ON store_sales.ss_store_sk = store.s_store_sk JOIN customer_demographics cd1 ON store_sales.ss_cdemo_sk= cd1.cd_demo_sk JOIN customer_demographics cd2 ON customer.c_current_cdemo_sk = cd2.cd_demo_sk JOIN promotion ON store_sales.ss_promo_sk = promotion.p_promo_sk JOIN household_demographics hd1 ON store_sales.ss_hdemo_sk = hd1.hd_demo_sk 
JOIN household_demographics hd2 ON customer.c_current_hdemo_sk = hd2.hd_demo_sk JOIN customer_address ad1 ON store_sales.ss_addr_sk = ad1.ca_address_sk JOIN customer_address ad2 ON customer.c_current_addr_sk = ad2.ca_address_sk JOIN income_band ib1 ON hd1.hd_income_band_sk = ib1.ib_income_band_sk JOIN income_band ib2 ON hd2.hd_income_band_sk = ib2.ib_income_band_sk JOIN item ON store_sales.ss_item_sk = item.i_item_sk JOIN (select cs_item_sk ,sum(cs_ext_list_price) as sale,sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit) as refund from catalog_sales JOIN catalog_returns ON catalog_sales.cs_item_sk = catalog_returns.cr_item_sk and catalog_sales.cs_order_number = catalog_returns.cr_order_number group by cs_item_sk having sum(cs_ext_list_price) > 2*sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit)) cs_ui ON store_sales.ss_item_sk = cs_ui.cs_item_sk WHERE cd1.cd_marital_status <> cd2.cd_marital_status
[jira] [Commented] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases
[ https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560335#comment-14560335 ] Laljo John Pullokkaran commented on HIVE-10811: --- Why do we need to keep the fields from the input that are part of the collation but are not used by the parent? If no operators from the parent refer to that column, then I don't see how preserving the sort order is helpful. RelFieldTrimmer throws NoSuchElementException in some cases --- Key: HIVE-10811 URL: https://issues.apache.org/jira/browse/HIVE-10811 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10811.01.patch, HIVE-10811.02.patch, HIVE-10811.patch RelFieldTrimmer runs into NoSuchElementException in some cases. Stack trace: {noformat} Exception in thread "main" java.lang.AssertionError: Internal error: While invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)' at org.apache.calcite.util.Util.newInternal(Util.java:743) at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543) at org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269) at org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:768) at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109) at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:730) at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145) at 
org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:607) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:244) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10048) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:207) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:536) ... 32 more Caused by: java.lang.AssertionError: Internal error: While invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)' at
[jira] [Commented] (HIVE-10550) Dynamic RDD caching optimization for HoS.[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560342#comment-14560342 ] Hive QA commented on HIVE-10550: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735497/HIVE-10550.5-spark.patch {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8721 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucket6.q-scriptfile1_win.q-quotedid_smb.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-bucketizedhiveinputformat.q-empty_dir_in_table.q - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-infer_bucket_sort_map_operators.q-load_hdfs_file_with_space_in_the_name.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-import_exported_table.q-truncate_column_buckets.q-bucket_num_reducers2.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-infer_bucket_sort_num_buckets.q-parallel_orderby.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-join1.q-infer_bucket_sort_bucketed_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-bucket5.q-infer_bucket_sort_merge.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-input16_cc.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-bucket_num_reducers.q-scriptfile1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-ql_rewrite_gbtoidx_cbo_2.q-bucketmapjoin6.q-bucket4.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-reduce_deduplicate.q-infer_bucket_sort_dyn_part.q-udf_using.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-uber_reduce.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more 
- did not produce a TEST-*.xml file TestMinimrCliDriver-stats_counter_partitioned.q-external_table_with_space_in_location_path.q-disable_merge_for_bucketing.q-and-1-more - did not produce a TEST-*.xml file org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/866/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/866/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-866/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12735497 - PreCommit-HIVE-SPARK-Build Dynamic RDD caching optimization for HoS.[Spark Branch] --- Key: HIVE-10550 URL: https://issues.apache.org/jira/browse/HIVE-10550 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Attachments: HIVE-10550.1-spark.patch, HIVE-10550.1.patch, HIVE-10550.2-spark.patch, HIVE-10550.3-spark.patch, HIVE-10550.4-spark.patch, HIVE-10550.5-spark.patch A Hive query may scan the same table multiple times, as in a self-join or self-union, or even share the same subquery; [TPC-DS Q39|https://github.com/hortonworks/hive-testbench/blob/hive14/sample-queries-tpcds/query39.sql] is an example. As you may know, Spark supports caching RDD data, which means Spark puts the calculated RDD data in memory and reads it from memory directly the next time; this avoids the calculation cost of that RDD (and all the cost of its dependencies) at the cost of more memory usage. 
By analyzing the query context, we should be able to determine which parts of the query can be shared, so that we can reuse the cached RDD in the generated Spark job. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
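The analysis described above — finding parts of the plan that are evaluated more than once — can be sketched as a tree walk that counts structurally identical subtrees and flags the repeated ones as caching candidates. This is an illustrative sketch only; the `Node` class is hypothetical and not Hive's operator or SparkPlan representation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedSubtreeFinder {
    // Minimal stand-in for an operator-tree node (hypothetical, not Hive's Operator).
    static class Node {
        final String op;            // e.g. "SCAN(store_sales)", "FILTER(d_year=1998)"
        final List<Node> children = new ArrayList<>();
        Node(String op, Node... kids) { this.op = op; for (Node k : kids) children.add(k); }
    }

    // Serialize each subtree to a canonical signature and count occurrences;
    // a signature seen more than once marks a subtree worth computing once,
    // caching, and reusing (the RDD-cache candidates).
    static List<String> cachingCandidates(Node root) {
        Map<String, Integer> counts = new HashMap<>();
        signature(root, counts);
        List<String> shared = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet())
            if (e.getValue() > 1) shared.add(e.getKey());
        return shared;
    }

    // Post-order serialization: a node's signature embeds its children's signatures,
    // so two subtrees get the same signature iff they are structurally identical.
    static String signature(Node n, Map<String, Integer> counts) {
        StringBuilder sb = new StringBuilder(n.op).append('[');
        for (Node c : n.children) sb.append(signature(c, counts)).append(',');
        String sig = sb.append(']').toString();
        counts.merge(sig, 1, Integer::sum);
        return sig;
    }
}
```

For a self-join, the two identical scan subtrees produce the same signature, so the scan is reported once as a candidate to cache and reuse.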
[jira] [Updated] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation
[ https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10812: Component/s: Statistics Physical Optimizer Scaling PK/FK's selectivity for stats annotation Key: HIVE-10812 URL: https://issues.apache.org/jira/browse/HIVE-10812 Project: Hive Issue Type: Improvement Components: Physical Optimizer, Statistics Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 1.2.1 Attachments: HIVE-10812.01.patch, HIVE-10812.02.patch, HIVE-10812.03.patch Right now, the computation of the selectivity of the FK side based on the PK side does not take into consideration the range of the FK and the range of the PK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
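A hedged sketch of what "taking the ranges into consideration" could mean: scale the base PK/FK selectivity by the fraction of the FK value range that overlaps the PK value range, so FK values that cannot match any surviving PK value are discounted. The formula and names below are illustrative only, not the patch's actual implementation.

```java
public class PkFkSelectivity {
    // Fraction of the FK value range [fkMin, fkMax] covered by the PK range [pkMin, pkMax].
    static double rangeOverlapFraction(double fkMin, double fkMax, double pkMin, double pkMax) {
        double lo = Math.max(fkMin, pkMin);
        double hi = Math.min(fkMax, pkMax);
        if (hi < lo) return 0.0;        // disjoint ranges: no FK value can match
        if (fkMax == fkMin) return 1.0; // degenerate FK range sitting inside the PK range
        return (hi - lo) / (fkMax - fkMin);
    }

    // Baseline FK-side selectivity is the fraction of PK rows surviving local filters;
    // scale it by the range overlap so out-of-range FK values are discounted.
    static double fkSideSelectivity(long filteredPkRows, long totalPkRows,
                                    double fkMin, double fkMax, double pkMin, double pkMax) {
        double base = (double) filteredPkRows / totalPkRows;
        return base * rangeOverlapFraction(fkMin, fkMax, pkMin, pkMax);
    }
}
```

With disjoint ranges the scaled selectivity drops to 0, and with fully nested ranges it reduces to the unscaled baseline.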
[jira] [Commented] (HIVE-10704) Errors in Tez HashTableLoader when estimated table size is 0
[ https://issues.apache.org/jira/browse/HIVE-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560431#comment-14560431 ] Alexander Pivovarov commented on HIVE-10704: Mostafa, can you check the RB link? I'm not sure it shows HIVE-10704.3.patch Errors in Tez HashTableLoader when estimated table size is 0 Key: HIVE-10704 URL: https://issues.apache.org/jira/browse/HIVE-10704 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Mostafa Mokhtar Fix For: 1.2.1 Attachments: HIVE-10704.1.patch, HIVE-10704.2.patch, HIVE-10704.3.patch A couple of issues: - If the table sizes in MapJoinOperator.getParentDataSizes() are 0 for all tables, the largest small table selection is wrong and could select the large table (which results in an NPE) - The memory estimates can either divide by zero or allocate 0 memory if the table size is 0. Try to come up with a sensible default for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
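The fix direction described in the issue can be illustrated with a sketch: clamp estimated sizes to a floor before choosing the big table and before dividing up the memory budget, so a 0 estimate can neither skew the size comparison nor cause a divide-by-zero. All names and the 1 MB default here are hypothetical, not Hive's actual HashTableLoader code.

```java
public class HashTableSizing {
    // Hypothetical floor for a table whose estimated size is 0 or unknown.
    static final long DEFAULT_TABLE_SIZE = 1024L * 1024L; // 1 MB

    static long clamp(long estimated) {
        return estimated <= 0 ? DEFAULT_TABLE_SIZE : estimated;
    }

    // Pick the position of the largest table; ties go to the lowest position,
    // so all-zero estimates still yield a deterministic, valid choice.
    static int pickBigTable(long[] estimatedSizes) {
        int big = 0;
        for (int i = 1; i < estimatedSizes.length; i++) {
            if (clamp(estimatedSizes[i]) > clamp(estimatedSizes[big])) big = i;
        }
        return big;
    }

    // Split the memory budget across the small tables proportionally to their
    // clamped sizes; the total is strictly positive, so no divide-by-zero.
    static long[] memoryPerSmallTable(long[] estimatedSizes, int bigTablePos, long budget) {
        long total = 0;
        for (int i = 0; i < estimatedSizes.length; i++)
            if (i != bigTablePos) total += clamp(estimatedSizes[i]);
        long[] alloc = new long[estimatedSizes.length];
        for (int i = 0; i < estimatedSizes.length; i++)
            if (i != bigTablePos) alloc[i] = budget * clamp(estimatedSizes[i]) / total;
        return alloc;
    }
}
```

With all estimates at 0, every table clamps to the same floor, so the small tables simply split the budget evenly instead of triggering an NPE or a zero allocation.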
[jira] [Commented] (HIVE-9069) Simplify filter predicates for CBO
[ https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560332#comment-14560332 ] Laljo John Pullokkaran commented on HIVE-9069: -- [~jcamachorodriguez] In extractCommonOperands, for a disjunction, if any operand doesn't have any of the reductionCondition, then we can short-circuit and bail out. Simplify filter predicates for CBO -- Key: HIVE-9069 URL: https://issues.apache.org/jira/browse/HIVE-9069 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Jesus Camacho Rodriguez Fix For: 0.14.1 Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch, HIVE-9069.03.patch, HIVE-9069.04.patch, HIVE-9069.05.patch, HIVE-9069.06.patch, HIVE-9069.07.patch, HIVE-9069.08.patch, HIVE-9069.08.patch, HIVE-9069.09.patch, HIVE-9069.10.patch, HIVE-9069.11.patch, HIVE-9069.12.patch, HIVE-9069.13.patch, HIVE-9069.14.patch, HIVE-9069.14.patch, HIVE-9069.patch Simplify predicates for disjunctive predicates so that they can get pushed down to the scan. Looks like this is still an issue; some of the filters can be pushed down to the scan. 
{code} set hive.cbo.enable=true set hive.stats.fetch.column.stats=true set hive.exec.dynamic.partition.mode=nonstrict set hive.tez.auto.reducer.parallelism=true set hive.auto.convert.join.noconditionaltask.size=32000 set hive.exec.reducers.bytes.per.reducer=1 set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager set hive.support.concurrency=false set hive.tez.exec.print.summary=true explain select substr(r_reason_desc,1,20) as r ,avg(ws_quantity) wq ,avg(wr_refunded_cash) ref ,avg(wr_fee) fee from web_sales, web_returns, web_page, customer_demographics cd1, customer_demographics cd2, customer_address, date_dim, reason where web_sales.ws_web_page_sk = web_page.wp_web_page_sk and web_sales.ws_item_sk = web_returns.wr_item_sk and web_sales.ws_order_number = web_returns.wr_order_number and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998 and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk and reason.r_reason_sk = web_returns.wr_reason_sk and ( ( cd1.cd_marital_status = 'M' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = '4 yr Degree' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 100.00 and 150.00 ) or ( cd1.cd_marital_status = 'D' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = 'Primary' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 50.00 and 100.00 ) or ( cd1.cd_marital_status = 'U' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = 'Advanced Degree' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 150.00 and 200.00 ) ) and ( ( ca_country = 'United States' and ca_state in ('KY', 'GA', 'NM') and ws_net_profit between 100 and 200 ) or ( ca_country = 'United States' and ca_state in ('MT', 'OR', 'IN') and ws_net_profit 
between 150 and 300 ) or ( ca_country = 'United States' and ca_state in ('WI', 'MO', 'WV') and ws_net_profit between 50 and 250 ) ) group by r_reason_desc order by r, wq, ref, fee limit 100 OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez Edges: Map 9 - Map 1 (BROADCAST_EDGE) Reducer 3 - Map 13 (SIMPLE_EDGE), Map 2 (SIMPLE_EDGE) Reducer 4 - Map 9 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE) Reducer 5 - Map 14 (SIMPLE_EDGE), Reducer 4 (SIMPLE_EDGE) Reducer 6 - Map 10 (SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map 12 (BROADCAST_EDGE), Reducer 5 (SIMPLE_EDGE) Reducer 7 - Reducer 6 (SIMPLE_EDGE) Reducer 8 - Reducer 7 (SIMPLE_EDGE) DagName: mmokhtar_2014161818_f5fd23ba-d783-4b13-8507-7faa65851798:1 Vertices: Map 1 Map Operator Tree: TableScan alias: web_page filterExpr: wp_web_page_sk is not null (type: boolean) Statistics: Num rows: 4602 Data size: 2696178 Basic stats: COMPLETE Column stats: COMPLETE
[jira] [Updated] (HIVE-686) add UDF substring_index
[ https://issues.apache.org/jira/browse/HIVE-686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-686: - Attachment: HIVE-686.1.patch patch #1 - derive substring_index from GenericUDF - add JUnit and qtest tests add UDF substring_index --- Key: HIVE-686 URL: https://issues.apache.org/jira/browse/HIVE-686 Project: Hive Issue Type: New Feature Components: UDF Reporter: Namit Jain Assignee: Alexander Pivovarov Attachments: HIVE-686.1.patch, HIVE-686.patch, HIVE-686.patch SUBSTRING_INDEX(str,delim,count) Returns the substring from string str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. SUBSTRING_INDEX() performs a case-sensitive match when searching for delim. Examples: {code} SELECT SUBSTRING_INDEX('www.mysql.com', '.', 3); --www.mysql.com SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2); --www.mysql SELECT SUBSTRING_INDEX('www.mysql.com', '.', 1); --www SELECT SUBSTRING_INDEX('www.mysql.com', '.', 0); --'' SELECT SUBSTRING_INDEX('www.mysql.com', '.', -1); --com SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2); --mysql.com SELECT SUBSTRING_INDEX('www.mysql.com', '.', -3); --www.mysql.com {code} {code} --#delim does not exist in str SELECT SUBSTRING_INDEX('www.mysql.com', 'Q', 1); --www.mysql.com --#delim is 2 chars SELECT SUBSTRING_INDEX('www||mysql||com', '||', 2); --www||mysql --#delim is empty string SELECT SUBSTRING_INDEX('www.mysql.com', '', 2); --'' --#str is empty string SELECT SUBSTRING_INDEX('', '.', 2); --'' {code} {code} --#null params SELECT SUBSTRING_INDEX(null, '.', 1); --null SELECT SUBSTRING_INDEX('www.mysql.com', null, 1); --null SELECT SUBSTRING_INDEX('www.mysql.com', '.', null); --null {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
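The documented semantics above can be captured in a small Java helper. This is a sketch mirroring MySQL's SUBSTRING_INDEX behavior and the examples listed, not the GenericUDF from the attached patch; the class and method names are hypothetical.

```java
public class SubstringIndex {
    // Returns the substring of str before `count` occurrences of delim.
    // Positive count: everything left of the count-th delimiter from the left.
    // Negative count: everything right of the |count|-th delimiter from the right.
    // If str contains fewer than |count| delimiters, the whole string is returned.
    public static String substringIndex(String str, String delim, int count) {
        if (str == null || delim == null) return null;   // null in, null out
        if (count == 0 || str.isEmpty() || delim.isEmpty()) return "";
        if (count > 0) {
            int idx = -1;
            for (int i = 0; i < count; i++) {
                idx = str.indexOf(delim, idx + 1);
                if (idx < 0) return str;                  // fewer than count delimiters
            }
            return str.substring(0, idx);
        } else {
            int idx = str.length();
            for (int i = 0; i < -count; i++) {
                idx = str.lastIndexOf(delim, idx - 1);
                if (idx < 0) return str;                  // fewer than |count| delimiters
            }
            return str.substring(idx + delim.length());
        }
    }
}
```

This reproduces the documented edge cases: count 0 and empty delim yield the empty string, a missing delimiter yields the whole input, and any null argument yields null.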
[jira] [Commented] (HIVE-10807) Invalidate basic stats for insert queries if autogather=false
[ https://issues.apache.org/jira/browse/HIVE-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560370#comment-14560370 ] Hive QA commented on HIVE-10807: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735432/HIVE-10807.2.patch {color:red}ERROR:{color} -1 due to 59 failed/errored test(s), 8974 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_into1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_null_element org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_multi_field_struct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_optional_elements org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_required_elements org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_single_field_struct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_structs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_unannotated_groups org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_unannotated_primitives org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_avro_array_of_primitives org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_avro_array_of_single_field_struct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_create org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_decimal1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_map_null org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_map_of_arrays_of_ints org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_map_of_maps org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_nested_complex 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_read_backward_compatible_files org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_schema_evolution org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_thrift_array_of_primitives org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_thrift_array_of_single_field_struct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_types org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parquet_join org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAmbiguousSingleFieldGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAvroPrimitiveInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAvroSingleFieldGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testHiveRequiredGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testMultiFieldGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testNewOptionalGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testNewRequiredGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testThriftPrimitiveInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testThriftSingleFieldGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testUnannotatedListOfGroups org.apache.hadoop.hive.ql.io.parquet.TestDataWritableWriter.testSimpleType org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testDoubleMapWithStructValue org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testMapWithComplexKey org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testNestedMap 
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapOfOptionalArray org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapOfOptionalIntArray org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapOptionalPrimitive org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapRequiredPrimitive org.apache.hadoop.hive.ql.io.parquet.TestParquetSerDe.testParquetHiveSerDe org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testEmptyContainer org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testNullContainer org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testRegularMap org.apache.hadoop.hive.ql.io.parquet.serde.TestDeepParquetHiveMapInspector.testEmptyContainer org.apache.hadoop.hive.ql.io.parquet.serde.TestDeepParquetHiveMapInspector.testNullContainer org.apache.hadoop.hive.ql.io.parquet.serde.TestDeepParquetHiveMapInspector.testRegularMap
[jira] [Updated] (HIVE-10807) Invalidate basic stats for insert queries if autogather=false
[ https://issues.apache.org/jira/browse/HIVE-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10807: Attachment: HIVE-10807.3.patch Invalidate basic stats for insert queries if autogather=false - Key: HIVE-10807 URL: https://issues.apache.org/jira/browse/HIVE-10807 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 1.2.0 Reporter: Gopal V Assignee: Ashutosh Chauhan Attachments: HIVE-10807.2.patch, HIVE-10807.3.patch, HIVE-10807.patch Setting stats.autogather=false leads to incorrect basic stats in the case of insert statements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.
[ https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560403#comment-14560403 ] Gopal V commented on HIVE-10716: The easiest fix to the problem seems to be an additional filter expr to produce an AND()
{code}
hive> explain select avg(ss_sold_date_sk) from store_sales where (case ss_sold_date when '1998-01-02' then 1 else null end)=1;
Map Operator Tree:
  TableScan
    alias: store_sales
    filterExpr: CASE (ss_sold_date) WHEN ('1998-01-02') THEN (true) ELSE (null) END (type: int)
    Statistics: Num rows: 2474913 Data size: 9899654 Basic stats: COMPLETE Column stats: COMPLETE
{code}
vs
{code}
hive> explain select avg(ss_sold_date_sk) from store_sales where (case ss_sold_date when '1998-01-02' then 1 else null end)=1 and ss_sold_time_sk > 0;
Map Operator Tree:
  TableScan
    alias: store_sales
    filterExpr: ((ss_sold_date = '1998-01-02') and (ss_sold_time_sk > 0)) (type: boolean)
    Statistics: Num rows: 1237456 Data size: 9899654 Basic stats: COMPLETE Column stats: COMPLETE
    Filter Operator
      predicate: (ss_sold_time_sk > 0) (type: boolean)
{code}
[~ashutoshc]: any idea why the extra filter helps in fixing the PPD case? Fold case/when udf for expression involving nulls in filter operator. - Key: HIVE-10716 URL: https://issues.apache.org/jira/browse/HIVE-10716 Project: Hive Issue Type: New Feature Components: Logical Optimizer Affects Versions: 1.3.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10716.patch From HIVE-10636 comments, more folding is possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
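Why the fold is legal: in a WHERE clause a row is kept only when the predicate evaluates to TRUE, and both FALSE and NULL drop the row, so {{(CASE ss_sold_date WHEN '1998-01-02' THEN 1 ELSE NULL END) = 1}} keeps exactly the same rows as {{ss_sold_date = '1998-01-02'}}. A minimal Java sketch of the three-valued evaluation (a boxed {{Boolean}} null stands in for SQL NULL; all names here are illustrative, not Hive code):

```java
public class CaseFold {
    // SQL evaluation of: (CASE x WHEN v THEN 1 ELSE NULL END) = 1
    public static Boolean caseThenCompare(String x, String v) {
        if (x == null) return null;           // no WHEN matches -> ELSE NULL; NULL = 1 is NULL
        if (x.equals(v)) return Boolean.TRUE; // THEN 1; 1 = 1 is TRUE
        return null;                          // ELSE NULL; NULL = 1 is NULL
    }

    // Folded predicate: x = v (NULL when x is NULL)
    public static Boolean folded(String x, String v) {
        return x == null ? null : x.equals(v);
    }

    // A WHERE clause keeps a row only when the predicate is TRUE;
    // FALSE and NULL both discard it.
    public static boolean keepsRow(Boolean predicate) {
        return Boolean.TRUE.equals(predicate);
    }
}
```

The two predicates differ as values (NULL vs FALSE on a non-matching row) but agree on which rows survive the filter, which is all a Filter Operator cares about.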
[jira] [Updated] (HIVE-10830) First column of a Hive table created with LazyBinaryColumnarSerDe is not read properly
[ https://issues.apache.org/jira/browse/HIVE-10830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lovekesh bansal updated HIVE-10830: --- Description:
1. create external table platdev.table_target ( id INT, message String, state string, date string ) partitioned by (country string) row format delimited fields terminated by ',' stored as RCFILE location '/user/nikgupta/table_target' ;
2. insert overwrite table platdev.table_target partition(country) select case when id=13 then 15 else id end,message,state,date,country from platdev.table_base2 where id between 13 and 16;
Say now my table is written by default using LazyBinaryColumnarSerDe and has the following data:
15 thirteen delhi 2-12-2014 india
14 fourteen delhi 1-1-2014 india
15 fifteen florida 1-1-2014 us
16 sixteen florida 2-12-2014 us
Now if I try to read the data with a MapReduce program, with the map function given below:
{code}
public void map(LongWritable key, BytesRefArrayWritable val, Context context) throws IOException, InterruptedException {
  for (int i = 0; i < val.size(); i++) {
    BytesRefWritable bytesRefread = val.get(i);
    byte[] currentCell = Arrays.copyOfRange(bytesRefread.getData(), bytesRefread.getStart(), bytesRefread.getStart() + bytesRefread.getLength());
    Text currentCellStr = new Text(currentCell);
    System.out.println("rowText=" + currentCellStr);
  }
  context.write(NullWritable.get(), bytes);
}
{code}
and set the following job configuration parameters:
{code}
job.setInputFormatClass(RCFileMapReduceInputFormat.class);
job.setOutputFormatClass(RCFileMapReduceOutputFormat.class);
jobConf.setInt(RCFile.COLUMN_NUMBER_CONF_STR, 5);
{code}
The output shown is as follows (LazyBinaryColumnarSerDe): rowText= rowText=fifteen rowText=goa rowText=2-2- rowText=us
But exactly the same case using ColumnarSerDe explicitly in the table definition would give the following output: rowText=1 rowText=fifteen rowText=goa rowText=2-2- rowText=us
The point is that the first column value is missing in the case of LazyBinaryColumnarSerDe.
was:
1. create external table platdev.table_target ( id INT, message String, state string, date string ) partitioned by (country string) row format delimited fields terminated by ',' stored as RCFILE location '/user/nikgupta/table_target' ;
2. insert overwrite table platdev.table_target partition(country) select case when id=13 then 15 else id end,message,state,date,country from platdev.table_base2 where id between 13 and 16;
Say now my table has the following data:
15 thirteen delhi 2-12-2014 india
14 fourteen delhi 1-1-2014 india
15 fifteen florida 1-1-2014 us
16 sixteen florida 2-12-2014 us
Now if I try to read the data with a MapReduce program, with the map function given below:
{code}
public void map(LongWritable key, BytesRefArrayWritable val, Context context) throws IOException, InterruptedException {
  for (int i = 0; i < val.size(); i++) {
    BytesRefWritable bytesRefread = val.get(i);
    byte[] currentCell = Arrays.copyOfRange(bytesRefread.getData(), bytesRefread.getStart(), bytesRefread.getStart() + bytesRefread.getLength());
    Text currentCellStr = new Text(currentCell);
    System.out.println("rowText=" + currentCellStr);
  }
  context.write(NullWritable.get(), bytes);
}
{code}
and set the following job configuration parameters:
{code}
job.setInputFormatClass(RCFileMapReduceInputFormat.class);
job.setOutputFormatClass(RCFileMapReduceOutputFormat.class);
jobConf.setInt(RCFile.COLUMN_NUMBER_CONF_STR, 5);
{code}
The output shown is as follows: rowText= rowText=fifteen rowText=goa rowText=2-2- rowText=us
But exactly the same case using ColumnarSerDe explicitly in the table definition would give the following output: rowText=1 rowText=fifteen rowText=goa rowText=2-2- rowText=us
The point is that the first column value is missing.
First column of a Hive table created with LazyBinaryColumnarSerDe is not read properly -- Key: HIVE-10830 URL: https://issues.apache.org/jira/browse/HIVE-10830 Project: Hive Issue Type: Bug Reporter: lovekesh bansal 1.
create external table platdev.table_target ( id INT, message String, state string, date string ) partitioned by (country string) row format delimited fields terminated by ',' stored as RCFILE location '/user/nikgupta/table_target' ; 2. insert overwrite table platdev.table_target partition(country) select case when id=13 then 15 else id end,message,state,date,country from platdev.table_base2 where id between 13 and 16; \n say now my table is written by
[jira] [Commented] (HIVE-10819) SearchArgumentImpl for Timestamp is broken by HIVE-10286
[ https://issues.apache.org/jira/browse/HIVE-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560438#comment-14560438 ] Hive QA commented on HIVE-10819: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735439/HIVE-10819.3.patch {color:red}ERROR:{color} -1 due to 59 failed/errored test(s), 8974 tests executed *Failed tests:* {noformat} TestCustomAuthentication - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_null_element org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_multi_field_struct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_optional_elements org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_required_elements org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_single_field_struct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_structs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_unannotated_groups org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_array_of_unannotated_primitives org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_avro_array_of_primitives org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_avro_array_of_single_field_struct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_create org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_decimal1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_map_null org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_map_of_arrays_of_ints org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_map_of_maps org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_nested_complex 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_read_backward_compatible_files org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_schema_evolution org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_thrift_array_of_primitives org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_thrift_array_of_single_field_struct org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_types org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parquet_join org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAmbiguousSingleFieldGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAvroPrimitiveInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAvroSingleFieldGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testHiveRequiredGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testMultiFieldGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testNewOptionalGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testNewRequiredGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testThriftPrimitiveInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testThriftSingleFieldGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testUnannotatedListOfGroups org.apache.hadoop.hive.ql.io.parquet.TestDataWritableWriter.testSimpleType org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testDoubleMapWithStructValue org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testMapWithComplexKey org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testNestedMap 
org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapOfOptionalArray org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapOfOptionalIntArray org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapOptionalPrimitive org.apache.hadoop.hive.ql.io.parquet.TestMapStructures.testStringMapRequiredPrimitive org.apache.hadoop.hive.ql.io.parquet.TestParquetSerDe.testParquetHiveSerDe org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testEmptyContainer org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testNullContainer org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testRegularMap org.apache.hadoop.hive.ql.io.parquet.serde.TestDeepParquetHiveMapInspector.testEmptyContainer org.apache.hadoop.hive.ql.io.parquet.serde.TestDeepParquetHiveMapInspector.testNullContainer org.apache.hadoop.hive.ql.io.parquet.serde.TestDeepParquetHiveMapInspector.testRegularMap
[jira] [Resolved] (HIVE-10813) Fix current test failures after HIVE-8769
[ https://issues.apache.org/jira/browse/HIVE-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-10813. - Resolution: Fixed Fix Version/s: 1.3.0 Fixed by HIVE-10812 Fix current test failures after HIVE-8769 - Key: HIVE-10813 URL: https://issues.apache.org/jira/browse/HIVE-10813 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 1.3.0 We fix the stats annotation in HIVE-8769. However, there are some newly committed test cases (e.g., udf_sha1.q) that are not covered in the patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.
[ https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560400#comment-14560400 ] Gopal V commented on HIVE-10716: [~ashutoshc]: LGTM - +1 for the count(1) case, but it looks really odd that the {{TableScan::filterExpr}} is not getting folded for this. TableScan FilterExpr is populated before this folding happens, so it might just be an optimization ordering issue?
{code}
hive> explain select count(1) from store_sales where (case ss_sold_date when 'x' then 1 else null end)=1;
STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Reducer 2 <- Map 1 (SIMPLE_EDGE)
      DagName: gopal_20150526214205_80c41d84-1694-47e9-ab24-144f8007b187:13
      Vertices:
        Map 1
          Map Operator Tree:
            TableScan
              alias: store_sales
              filterExpr: CASE (ss_sold_date) WHEN ('x') THEN (true) ELSE (null) END (type: int)
              Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
              Filter Operator
                predicate: (ss_sold_date = 'x') (type: boolean)
                Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
                Select Operator
                  Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
                  Group By Operator
                    aggregations: count(1)
                    mode: hash
                    outputColumnNames: _col0
                    Statistics: Num rows: 1 Data size: 93 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      sort order:
                      Statistics: Num rows: 1 Data size: 93 Basic stats: COMPLETE Column stats: COMPLETE
                      value expressions: _col0 (type: bigint)
          Execution mode: vectorized
        Reducer 2
          Reduce Operator Tree:
            Group By Operator
              aggregations: count(VALUE._col0)
{code}
Fold case/when udf for expression involving nulls in filter operator. - Key: HIVE-10716 URL: https://issues.apache.org/jira/browse/HIVE-10716 Project: Hive Issue Type: New Feature Components: Logical Optimizer Affects Versions: 1.3.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-10716.patch From HIVE-10636 comments, more folding is possible. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10704) Errors in Tez HashTableLoader when estimated table size is 0
[ https://issues.apache.org/jira/browse/HIVE-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560432#comment-14560432 ] Alexander Pivovarov commented on HIVE-10704: Mostafa, can you check RB link? I'm not sure it shows HIVE-10704.3.patch Errors in Tez HashTableLoader when estimated table size is 0 Key: HIVE-10704 URL: https://issues.apache.org/jira/browse/HIVE-10704 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Mostafa Mokhtar Fix For: 1.2.1 Attachments: HIVE-10704.1.patch, HIVE-10704.2.patch, HIVE-10704.3.patch Couple of issues: - If the table sizes in MapJoinOperator.getParentDataSizes() are 0 for all tables, the largest small table selection is wrong and could select the large table (which results in NPE) - The memory estimates can either divide-by-zero, or allocate 0 memory if the table size is 0. Try to come up with a sensible default for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
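The two failure modes in the description suggest the same defensive shape: when every estimate is 0, fall back to a default so the small-table selection cannot land on the big table and the memory division cannot divide by zero. A hedged sketch of that idea ({{DEFAULT_TABLE_SIZE}} and the helper names are illustrative assumptions, not the actual HIVE-10704 patch):

```java
public class HashTableSizing {
    // Assumed fallback when a table's estimated size is 0 (illustrative value).
    static final long DEFAULT_TABLE_SIZE = 1024L * 1024L;

    // Never treat an estimate of 0 as "this table is empty, ignore it".
    public static long effectiveSize(long estimatedSize) {
        return estimatedSize > 0 ? estimatedSize : DEFAULT_TABLE_SIZE;
    }

    // Pick the largest *small* table; the big-table position is excluded
    // outright, so an all-zero estimate can never select it (the NPE case).
    public static int pickBiggestSmallTable(long[] sizes, int bigTablePos) {
        int best = -1;
        long bestSize = -1;
        for (int pos = 0; pos < sizes.length; pos++) {
            if (pos == bigTablePos) continue;
            long s = effectiveSize(sizes[pos]);
            if (s > bestSize) { bestSize = s; best = pos; }
        }
        return best;
    }

    // Total small-table size used as a divisor for memory apportioning;
    // clamped so it is never 0 (the divide-by-zero case).
    public static long totalSmallTableSize(long[] sizes, int bigTablePos) {
        long total = 0;
        for (int pos = 0; pos < sizes.length; pos++) {
            if (pos != bigTablePos) total += effectiveSize(sizes[pos]);
        }
        return Math.max(1, total);
    }
}
```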
[jira] [Updated] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.
[ https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10716: Affects Version/s: (was: 1.3.0) 1.2.0 Fold case/when udf for expression involving nulls in filter operator. - Key: HIVE-10716 URL: https://issues.apache.org/jira/browse/HIVE-10716 Project: Hive Issue Type: New Feature Components: Logical Optimizer Affects Versions: 1.2.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 1.2.1 Attachments: HIVE-10716.patch From HIVE-10636 comments, more folding is possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.
[ https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560457#comment-14560457 ] Ashutosh Chauhan commented on HIVE-10716: - [~gopalv] I need to verify, but my guess is https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java#L80 is coming in play here. Fold case/when udf for expression involving nulls in filter operator. - Key: HIVE-10716 URL: https://issues.apache.org/jira/browse/HIVE-10716 Project: Hive Issue Type: New Feature Components: Logical Optimizer Affects Versions: 1.2.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 1.2.1 Attachments: HIVE-10716.patch From HIVE-10636 comments, more folding is possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10793) Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront
[ https://issues.apache.org/jira/browse/HIVE-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560459#comment-14560459 ] Lefty Leverenz commented on HIVE-10793: --- Doc note: This changes the default value of *hive.mapjoin.optimized.hashtable.wbsize* so the wiki needs to be updated (with version information). * [Configuration Properties -- hive.mapjoin.optimized.hashtable.wbsize | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.mapjoin.optimized.hashtable.wbsize] The patch also makes minor changes to the definitions of *hive.mapjoin.hybridgrace.minwbsize* and *hive.mapjoin.hybridgrace.minnumpartitions* which do not need any doc changes. Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront Key: HIVE-10793 URL: https://issues.apache.org/jira/browse/HIVE-10793 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.3.0 Attachments: HIVE-10793.1.patch, HIVE-10793.2.patch HybridHashTableContainer will allocate memory based on estimate, which means if the actual is less than the estimate the allocated memory won't be used. Number of partitions is calculated based on estimated data size {code} numPartitions = calcNumPartitions(memoryThreshold, estimatedTableSize, minNumParts, minWbSize, nwayConf); {code} Then based on number of partitions writeBufferSize is set {code} writeBufferSize = (int)(estimatedTableSize / numPartitions); {code} Each hash partition will allocate 1 WriteBuffer, with no further allocation if the estimate data size is correct. Suggested solution is to reduce writeBufferSize by a factor such that only X% of the memory is preallocated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
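One way to implement the suggested "only preallocate X% of the memory" factor is to cap the per-partition write buffer against a fraction of the memory budget. This is a sketch only: {{PREALLOC_FRACTION}} and the clamping bounds are illustrative assumptions, not Hive's actual defaults or the committed patch.

```java
public class WriteBufferSizing {
    // Assumed illustrative fraction of the memory budget that may be
    // preallocated across all hash-partition write buffers.
    static final double PREALLOC_FRACTION = 0.5;

    public static int writeBufferSize(long estimatedTableSize, int numPartitions,
                                      long memoryThreshold, int minWbSize) {
        // Original behavior: writeBufferSize = (int) (estimatedTableSize / numPartitions),
        // which preallocates the full estimate even when the actual data is smaller.
        long perPartition = estimatedTableSize / Math.max(1, numPartitions);
        // Cap so numPartitions preallocated buffers use at most
        // PREALLOC_FRACTION of the memory budget.
        long cap = (long) (memoryThreshold * PREALLOC_FRACTION) / Math.max(1, numPartitions);
        // Never shrink below the configured minimum write-buffer size.
        return (int) Math.max(minWbSize, Math.min(perPartition, cap));
    }
}
```

With a 1 GB estimate, 16 partitions, a 256 MB budget, and a 64 KB floor, the naive division would preallocate 64 MB per partition; the cap holds each buffer to 8 MB instead.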
[jira] [Updated] (HIVE-10793) Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront
[ https://issues.apache.org/jira/browse/HIVE-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-10793: -- Labels: TODOC1.3 (was: ) Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront Key: HIVE-10793 URL: https://issues.apache.org/jira/browse/HIVE-10793 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Labels: TODOC1.3 Fix For: 1.3.0 Attachments: HIVE-10793.1.patch, HIVE-10793.2.patch HybridHashTableContainer will allocate memory based on estimate, which means if the actual is less than the estimate the allocated memory won't be used. Number of partitions is calculated based on estimated data size {code} numPartitions = calcNumPartitions(memoryThreshold, estimatedTableSize, minNumParts, minWbSize, nwayConf); {code} Then based on number of partitions writeBufferSize is set {code} writeBufferSize = (int)(estimatedTableSize / numPartitions); {code} Each hash partition will allocate 1 WriteBuffer, with no further allocation if the estimate data size is correct. Suggested solution is to reduce writeBufferSize by a factor such that only X% of the memory is preallocated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10775) Frequent calls to printStackTrace() obscuring legitimate problems
[ https://issues.apache.org/jira/browse/HIVE-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-10775: Issue Type: Improvement (was: Test) Frequent calls to printStackTrace() obscuring legitimate problems - Key: HIVE-10775 URL: https://issues.apache.org/jira/browse/HIVE-10775 Project: Hive Issue Type: Improvement Components: Metastore, Query Processor Reporter: Andrew Cowie Assignee: Andrew Cowie Priority: Minor Attachments: HIVE-10775.1.patch When running test suites built on top of libraries that build on top of ... that use Hive, the signal to noise ratio with exceptions flying past is appalling. Most of this is down to calls to printStackTrace() embedded in this library. HIVE-7697 showed someone cleaning that up and replacing with logging the exception instead. That seems wise (logging can be redirected by the calling test suite). So, if you don't object, I'll hunt down the calls to printStackTrace() and replace them with LOG.warn() instead. I'm about half way through the patch now. AfC -- This message was sent by Atlassian JIRA (v6.3.4#6332)
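The kind of replacement described can be illustrated with a tiny example ({{java.util.logging}} is used here only to keep the sketch self-contained; the actual patch targets Hive's own logging setup, and the class and method names below are invented for illustration):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietErrors {
    private static final Logger LOG = Logger.getLogger(QuietErrors.class.getName());

    public static String parse(String s) {
        try {
            return Integer.valueOf(s).toString();
        } catch (NumberFormatException e) {
            // Before: e.printStackTrace();
            // That writes straight to stderr, bypassing whatever log
            // configuration the calling test suite set up.
            // After: route through the logger, so callers can redirect,
            // filter, or silence it -- the stack trace is still attached.
            LOG.log(Level.WARNING, "could not parse " + s, e);
            return null;
        }
    }
}
```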
[jira] [Commented] (HIVE-10775) Frequent calls to printStackTrace() obscuring legitimate problems
[ https://issues.apache.org/jira/browse/HIVE-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558713#comment-14558713 ] Ashutosh Chauhan commented on HIVE-10775: - yeah.. this is certainly useful. Thanks [~afcowie] for picking up this task. Frequent calls to printStackTrace() obscuring legitimate problems - Key: HIVE-10775 URL: https://issues.apache.org/jira/browse/HIVE-10775 Project: Hive Issue Type: Test Components: Metastore, Query Processor Reporter: Andrew Cowie Assignee: Andrew Cowie Priority: Minor Attachments: HIVE-10775.1.patch When running test suites built on top of libraries that build on top of ... that use Hive, the signal to noise ratio with exceptions flying past is appalling. Most of this is down to calls to printStackTrace() embedded in this library. HIVE-7697 showed someone cleaning that up and replacing with logging the exception instead. That seems wise (logging can be redirected by the calling test suite). So, if you don't object, I'll hunt down the calls to printStackTrace() and replace them with LOG.warn() instead. I'm about half way through the patch now. AfC -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10817) Blacklist For Bad MetaStore
[ https://issues.apache.org/jira/browse/HIVE-10817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemon Lou updated HIVE-10817: - Attachment: HIVE-10817 Blacklist For Bad MetaStore --- Key: HIVE-10817 URL: https://issues.apache.org/jira/browse/HIVE-10817 Project: Hive Issue Type: Improvement Components: HiveServer2, Metastore Affects Versions: 1.2.0 Reporter: Nemon Lou Assignee: Nemon Lou Attachments: HIVE-10817 During a reliability test, when one of the MetaStore machines lost power, HiveServer2 never submitted jobs to YARN again. There were 100 JDBC clients (Beeline) running concurrently, and all 100 of them hung. After checking HiveServer2's thread stack, I found that most of the threads were waiting to lock AbstractService while the thread holding the lock was trying to connect to the bad MetaStore that had lost power. When that thread finally got a SocketTimeoutException and released the lock, another thread would acquire it and again get stuck until the socket timed out. Adding a new blacklist mechanism finally solved this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
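A hypothetical sketch of the mechanism described: after a connection timeout, record the metastore URI on a blacklist for a cool-down period, so other threads skip the dead endpoint instead of serializing behind the same lock and timing out one by one. The class, method names, and cool-down policy are assumptions for illustration, not the attached patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MetaStoreBlacklist {
    // URI -> timestamp (ms) until which the endpoint should be avoided.
    private final Map<String, Long> badUntil = new ConcurrentHashMap<>();
    private final long coolDownMs;

    public MetaStoreBlacklist(long coolDownMs) {
        this.coolDownMs = coolDownMs;
    }

    // Called after a SocketTimeoutException (or similar) on this URI.
    public void markBad(String uri, long nowMs) {
        badUntil.put(uri, nowMs + coolDownMs);
    }

    // Connection selection consults this before attempting a URI; once the
    // cool-down elapses the endpoint becomes eligible again (so a recovered
    // metastore is not blacklisted forever).
    public boolean isUsable(String uri, long nowMs) {
        Long until = badUntil.get(uri);
        return until == null || nowMs >= until;
    }
}
```

Time is passed in explicitly rather than read from {{System.currentTimeMillis()}} so the cool-down behavior is testable; a real implementation would likely also bound the map's size.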
[jira] [Commented] (HIVE-6791) Support variable substitution for Beeline shell command
[ https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14558832#comment-14558832 ] Ferdinand Xu commented on HIVE-6791: Hi [~xuefuz], I am working on the Hive CLI replacement work. It seems this is a gap between the CLI and Beeline. Do you want to work on this JIRA? If not, I'd like to pick it up. I have a basic idea to achieve the goal: we can add a new command processor that adds new Hive variables to the hiveVariables in SessionState. Any thoughts about it? Thank you! Support variable substitution for Beeline shell command - Key: HIVE-6791 URL: https://issues.apache.org/jira/browse/HIVE-6791 Project: Hive Issue Type: New Feature Components: CLI, Clients Affects Versions: 0.14.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang A follow-up task from HIVE-6694. Similar to HIVE-6570. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10819) SearchArgumentImpl for Timestamp is broken by HIVE-10286
[ https://issues.apache.org/jira/browse/HIVE-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558759#comment-14558759 ] Hive QA commented on HIVE-10819: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735252/HIVE-10819.2.patch {color:red}ERROR:{color} -1 due to 638 failed/errored test(s), 8972 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_deserialize_map_null org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
[jira] [Updated] (HIVE-10815) Let HiveMetaStoreClient Choose MetaStore Randomly
[ https://issues.apache.org/jira/browse/HIVE-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemon Lou updated HIVE-10815: - Attachment: (was: HIVE-10815.patch) Let HiveMetaStoreClient Choose MetaStore Randomly - Key: HIVE-10815 URL: https://issues.apache.org/jira/browse/HIVE-10815 Project: Hive Issue Type: Improvement Components: HiveServer2, Metastore Affects Versions: 1.2.0 Reporter: Nemon Lou Assignee: Nemon Lou Attachments: HIVE-10815.patch Currently HiveMetaStoreClient uses a fixed order to choose MetaStore URIs when multiple metastores are configured. Choosing a MetaStore randomly would be good for load balancing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10815) Let HiveMetaStoreClient Choose MetaStore Randomly
[ https://issues.apache.org/jira/browse/HIVE-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemon Lou updated HIVE-10815: - Attachment: HIVE-10815.patch Let HiveMetaStoreClient Choose MetaStore Randomly - Key: HIVE-10815 URL: https://issues.apache.org/jira/browse/HIVE-10815 Project: Hive Issue Type: Improvement Components: HiveServer2, Metastore Affects Versions: 1.2.0 Reporter: Nemon Lou Assignee: Nemon Lou Attachments: HIVE-10815.patch Currently HiveMetaStoreClient uses a fixed order to choose MetaStore URIs when multiple metastores are configured. Choosing a MetaStore randomly would be good for load balancing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
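The load-balancing idea in this issue can be sketched as a small standalone example. The class and method names below are hypothetical illustrations, not code from the attached patch: the point is simply that shuffling a copy of the configured URI list gives each client its own random connection order instead of everyone hammering the first metastore.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MetastoreUriPicker {
    // Shuffle a copy of the configured URIs so each client tries them in its
    // own random order; the caller would fall back to the next URI on failure.
    public static List<URI> randomizedOrder(List<URI> configured) {
        List<URI> uris = new ArrayList<>(configured);
        Collections.shuffle(uris);
        return uris;
    }

    public static void main(String[] args) {
        List<URI> configured = Arrays.asList(
                URI.create("thrift://metastore1:9083"),
                URI.create("thrift://metastore2:9083"),
                URI.create("thrift://metastore3:9083"));
        System.out.println(randomizedOrder(configured));
    }
}
```

Because the shuffle happens once per client rather than per call, a single client still benefits from sticky connections while the population of clients spreads roughly evenly across the configured metastores.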
[jira] [Commented] (HIVE-10775) Frequent calls to printStackTrace() obscuring legitimate problems
[ https://issues.apache.org/jira/browse/HIVE-10775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558855#comment-14558855 ] Hive QA commented on HIVE-10775: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735230/HIVE-10775.1.patch {color:red}ERROR:{color} -1 due to 639 failed/errored test(s), 8971 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_deserialize_map_null org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
[jira] [Updated] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliot West updated HIVE-10165: --- Attachment: HIVE-10165.5.patch Attached [^HIVE-10165.5.patch] to fix a failing test of mine. What should I do with the tests that are failing in other parts of the Hive project? Improve hive-hcatalog-streaming extensibility and support updates and deletes. -- Key: HIVE-10165 URL: https://issues.apache.org/jira/browse/HIVE-10165 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 1.2.0 Reporter: Elliot West Assignee: Elliot West Labels: streaming_api Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, HIVE-10165.5.patch h3. Overview I'd like to extend the [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] API so that it also supports the writing of record updates and deletes in addition to the already supported inserts. h3. Motivation We have many Hadoop processes outside of Hive that merge changed facts into existing datasets. Traditionally we achieve this by: reading in a ground-truth dataset and a modified dataset, grouping by a key, sorting by a sequence and then applying a function to determine inserted, updated, and deleted rows. However, in our current scheme we must rewrite all partitions that may potentially contain changes. In practice the number of mutated records is very small when compared with the records contained in a partition. This approach results in a number of operational issues: * Excessive amount of write activity required for small data changes. * Downstream applications cannot robustly read these datasets while they are being updated. * Due to the scale of the updates (hundreds of partitions) the scope for contention is high. I believe we can address this problem by instead writing only the changed records to a Hive transactional table. 
This should drastically reduce the amount of data that we need to write and also provide a means for managing concurrent access to the data. Our existing merge processes can read and retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to an updated form of the hive-hcatalog-streaming API which will then have the required data to perform an update or insert in a transactional manner. h3. Benefits * Enables the creation of large-scale dataset merge processes * Opens up Hive transactional functionality in an accessible manner to processes that operate outside of Hive. h3. Implementation Our changes do not break the existing API contracts. Instead our approach has been to consider the functionality offered by the existing API and our proposed API as fulfilling separate and distinct use-cases. The existing API is primarily focused on the task of continuously writing large volumes of new data into a Hive table for near-immediate analysis. Our use-case, however, is concerned more with the frequent but not continuous ingestion of mutations to a Hive table from some ETL merge process. Consequently we feel it is justifiable to add our new functionality via an alternative set of public interfaces and leave the existing API as is. This keeps both APIs clean and focused at the expense of presenting additional options to potential users. Wherever possible, shared implementation concerns have been factored out into abstract base classes that are open to third-party extension. A detailed breakdown of the changes is as follows: * We've introduced a public {{RecordMutator}} interface whose purpose is to expose insert/update/delete operations to the user. This is a counterpart to the write-only {{RecordWriter}}. We've also factored out life-cycle methods common to these two interfaces into a super {{RecordOperationWriter}} interface. Note that the row representation has been changed from {{byte[]}} to {{Object}}. 
Within our data processing jobs our records are often available in a strongly typed and decoded form such as a POJO or a Tuple object. Therefore it seems to make sense that we are able to pass this through to the {{OrcRecordUpdater}} without having to go through a {{byte[]}} encoding step. This of course still allows users to use {{byte[]}} if they wish. * The introduction of {{RecordMutator}} requires that insert/update/delete operations are then also exposed on a {{TransactionBatch}} type. We've done this with the introduction of a public {{MutatorTransactionBatch}} interface which is a counterpart to the write-only {{TransactionBatch}}. We've also factored out life-cycle methods common to these two interfaces into a super {{BaseTransactionBatch}} interface. *
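A minimal sketch of the mutation interface described above might look like the following. Only the name {{RecordMutator}} and the byte[]-to-Object change come from the description; the method signatures and the toy implementation are assumptions for illustration, not the actual patch:

```java
import java.util.ArrayList;
import java.util.List;

public class MutatorSketch {
    // Counterpart to the write-only RecordWriter: rows are passed as Object
    // rather than byte[], as the description proposes.
    interface RecordMutator {
        void insert(Object record);
        void update(Object record);
        void delete(Object record);
    }

    // Toy implementation that just records the operations; a real one would
    // route each call to OrcRecordUpdater inside a transaction batch.
    static class LoggingMutator implements RecordMutator {
        final List<String> ops = new ArrayList<>();
        public void insert(Object record) { ops.add("INSERT " + record); }
        public void update(Object record) { ops.add("UPDATE " + record); }
        public void delete(Object record) { ops.add("DELETE " + record); }
    }

    public static void main(String[] args) {
        LoggingMutator m = new LoggingMutator();
        m.insert("row1");  // new fact
        m.update("row2");  // changed fact, located via its retained RecordIdentifier
        m.delete("row3");  // retracted fact
        System.out.println(m.ops);
    }
}
```

The design point this illustrates is that a merge process which already knows each record's {{ROW_ID}}/{{RecordIdentifier}} can express its full insert/update/delete delta through one interface, without the existing insert-only API changing at all.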
[jira] [Updated] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases
[ https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-10811: --- Attachment: HIVE-10811.01.patch RelFieldTrimmer throws NoSuchElementException in some cases --- Key: HIVE-10811 URL: https://issues.apache.org/jira/browse/HIVE-10811 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10811.01.patch, HIVE-10811.patch RelFieldTrimmer runs into NoSuchElementException in some cases. Stack trace: {noformat} Exception in thread main java.lang.AssertionError: Internal error: While invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)' at org.apache.calcite.util.Util.newInternal(Util.java:743) at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543) at org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269) at org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:768) at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109) at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:730) at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145) at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:607) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:244) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10048) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:207) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:536) ... 
32 more Caused by: java.lang.AssertionError: Internal error: While invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)' at org.apache.calcite.util.Util.newInternal(Util.java:743) at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543) at org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269) at
[jira] [Updated] (HIVE-9069) Simplify filter predicates for CBO
[ https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9069: -- Attachment: HIVE-9069.14.patch Simplify filter predicates for CBO -- Key: HIVE-9069 URL: https://issues.apache.org/jira/browse/HIVE-9069 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Jesus Camacho Rodriguez Fix For: 0.14.1 Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch, HIVE-9069.03.patch, HIVE-9069.04.patch, HIVE-9069.05.patch, HIVE-9069.06.patch, HIVE-9069.07.patch, HIVE-9069.08.patch, HIVE-9069.08.patch, HIVE-9069.09.patch, HIVE-9069.10.patch, HIVE-9069.11.patch, HIVE-9069.12.patch, HIVE-9069.13.patch, HIVE-9069.14.patch, HIVE-9069.patch Simplify predicates for disjunctive predicates so that can get pushed down to the scan. Looks like this is still an issue, some of the filters can be pushed down to the scan. {code} set hive.cbo.enable=true set hive.stats.fetch.column.stats=true set hive.exec.dynamic.partition.mode=nonstrict set hive.tez.auto.reducer.parallelism=true set hive.auto.convert.join.noconditionaltask.size=32000 set hive.exec.reducers.bytes.per.reducer=1 set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager set hive.support.concurrency=false set hive.tez.exec.print.summary=true explain select substr(r_reason_desc,1,20) as r ,avg(ws_quantity) wq ,avg(wr_refunded_cash) ref ,avg(wr_fee) fee from web_sales, web_returns, web_page, customer_demographics cd1, customer_demographics cd2, customer_address, date_dim, reason where web_sales.ws_web_page_sk = web_page.wp_web_page_sk and web_sales.ws_item_sk = web_returns.wr_item_sk and web_sales.ws_order_number = web_returns.wr_order_number and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998 and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk and customer_address.ca_address_sk = 
web_returns.wr_refunded_addr_sk and reason.r_reason_sk = web_returns.wr_reason_sk and ( ( cd1.cd_marital_status = 'M' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = '4 yr Degree' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 100.00 and 150.00 ) or ( cd1.cd_marital_status = 'D' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = 'Primary' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 50.00 and 100.00 ) or ( cd1.cd_marital_status = 'U' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = 'Advanced Degree' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 150.00 and 200.00 ) ) and ( ( ca_country = 'United States' and ca_state in ('KY', 'GA', 'NM') and ws_net_profit between 100 and 200 ) or ( ca_country = 'United States' and ca_state in ('MT', 'OR', 'IN') and ws_net_profit between 150 and 300 ) or ( ca_country = 'United States' and ca_state in ('WI', 'MO', 'WV') and ws_net_profit between 50 and 250 ) ) group by r_reason_desc order by r, wq, ref, fee limit 100 OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez Edges: Map 9 - Map 1 (BROADCAST_EDGE) Reducer 3 - Map 13 (SIMPLE_EDGE), Map 2 (SIMPLE_EDGE) Reducer 4 - Map 9 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE) Reducer 5 - Map 14 (SIMPLE_EDGE), Reducer 4 (SIMPLE_EDGE) Reducer 6 - Map 10 (SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map 12 (BROADCAST_EDGE), Reducer 5 (SIMPLE_EDGE) Reducer 7 - Reducer 6 (SIMPLE_EDGE) Reducer 8 - Reducer 7 (SIMPLE_EDGE) DagName: mmokhtar_2014161818_f5fd23ba-d783-4b13-8507-7faa65851798:1 Vertices: Map 1 Map Operator Tree: TableScan alias: web_page filterExpr: wp_web_page_sk is not null (type: boolean) Statistics: Num rows: 4602 Data size: 2696178 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: wp_web_page_sk is not null (type: 
boolean) Statistics: Num rows: 4602 Data size: 18408 Basic stats: COMPLETE Column stats:
[jira] [Updated] (HIVE-10792) PPD leads to wrong answer when mapper scans the same table with multiple aliases
[ https://issues.apache.org/jira/browse/HIVE-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dayue Gao updated HIVE-10792: - Attachment: HIVE-10792.2.patch Generated the patch against the master branch instead of the old trunk branch, and added 2 test cases. PPD leads to wrong answer when mapper scans the same table with multiple aliases Key: HIVE-10792 URL: https://issues.apache.org/jira/browse/HIVE-10792 Project: Hive Issue Type: Bug Components: File Formats, Query Processor Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.0.0, 1.2.0, 1.1.0, 1.2.1 Reporter: Dayue Gao Assignee: Dayue Gao Priority: Critical Fix For: 1.2.1 Attachments: HIVE-10792.1.patch, HIVE-10792.2.patch, HIVE-10792.test.sql Here are the steps to reproduce the bug. First of all, prepare a simple ORC table with one row {code} create table test_orc (c0 int, c1 int) stored as ORC; {code} Table: test_orc ||c0||c1|| |0|1| The following SQL gets an empty result, which is not expected {code} select * from test_orc t1 union all select * from test_orc t2 where t2.c0 = 1 {code} Self join is also broken {code} set hive.auto.convert.join=false; -- force common join select * from test_orc t1 left outer join test_orc t2 on (t1.c0=t2.c0 and t2.c1=0); {code} It gets an empty result, while the expected answer is ||t1.c0||t1.c1||t2.c0||t2.c1|| |0|1|NULL|NULL| In these cases, we push down predicates into OrcInputFormat. As a result, the TableScanOperator for t1 can't receive its rows. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
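The failure mode in this issue can be modeled outside of Hive. The sketch below is not Hive code; it is a hypothetical illustration of why pushing one alias's predicate into a scan shared by two aliases is wrong: the filter runs before the rows fan out, so the unfiltered alias is starved too.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class SharedScanPushdown {
    // test_orc holds a single row with c0 = 0.
    static final List<Integer> ROWS = List.of(0);

    // Buggy behaviour: t2's pushed-down predicate filters the shared scan,
    // so t1 (which has no predicate of its own) never receives its rows.
    static List<Integer> rowsSeenByT1WithPushdown(Predicate<Integer> t2Filter) {
        List<Integer> seen = new ArrayList<>();
        for (int r : ROWS) {
            if (t2Filter.test(r)) {
                seen.add(r);
            }
        }
        return seen;
    }

    // Correct behaviour: scan everything once, apply alias predicates after
    // the rows fan out to each TableScanOperator.
    static List<Integer> rowsSeenByT1WithoutPushdown() {
        return new ArrayList<>(ROWS);
    }

    public static void main(String[] args) {
        // t2's predicate is c0 = 1; the lone row has c0 = 0.
        System.out.println(rowsSeenByT1WithPushdown(r -> r == 1));
        System.out.println(rowsSeenByT1WithoutPushdown());
    }
}
```

With pushdown, t1 sees an empty list even though it asked for every row, matching the empty union-all and self-join results in the reproduction above.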
[jira] [Updated] (HIVE-6991) History not able to disable/enable after session started
[ https://issues.apache.org/jira/browse/HIVE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-6991: --- Attachment: HIVE-6991.2.patch Setting the hive.session.history.enabled property after the session has started won't take effect, because the history file is created only while starting the session. Added a new method, updateHistory(), which will be called when the hive.session.history.enabled property is set. History not able to disable/enable after session started Key: HIVE-6991 URL: https://issues.apache.org/jira/browse/HIVE-6991 Project: Hive Issue Type: Bug Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-6991.1.patch, HIVE-6991.2.patch, HIVE-6991.patch By default history is disabled. Enabling history after the session has started via set hive.session.history.enabled=true does not work. I think it will help with this user query http://mail-archives.apache.org/mod_mbox/hive-user/201404.mbox/%3ccajqy7afapa_pjs6buon0o8zyt2qwfn2wt-mtznwfmurav_8...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
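The shape of the fix described above can be sketched as follows. Only the method name updateHistory() comes from the comment; the class, the boolean parameter, and the state handling are assumptions for illustration, standing in for the real history-file creation and teardown:

```java
public class SessionHistorySketch {
    private boolean historyOpen = false;

    // Re-check the flag whenever hive.session.history.enabled is set after
    // session start; the real patch would create or close the history file here.
    public void updateHistory(boolean enabled) {
        if (enabled && !historyOpen) {
            historyOpen = true;   // open the history log for this session
        } else if (!enabled && historyOpen) {
            historyOpen = false;  // close and release the history log
        }
    }

    public boolean isHistoryOpen() {
        return historyOpen;
    }

    public static void main(String[] args) {
        SessionHistorySketch session = new SessionHistorySketch();
        session.updateHistory(true);   // user runs: set hive.session.history.enabled=true
        System.out.println(session.isHistoryOpen());
        session.updateHistory(false);  // user runs: set hive.session.history.enabled=false
        System.out.println(session.isHistoryOpen());
    }
}
```

Hooking the check into the property-set path, rather than only session startup, is what lets the toggle take effect mid-session.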
[jira] [Updated] (HIVE-10822) CLI start script throwing error message on console
[ https://issues.apache.org/jira/browse/HIVE-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-10822: Attachment: HIVE-10822.patch Updated the patch with double quotes. CLI start script throwing error message on console -- Key: HIVE-10822 URL: https://issues.apache.org/jira/browse/HIVE-10822 Project: Hive Issue Type: Sub-task Components: CLI Affects Versions: beeline-cli-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-10822.patch Starting the CLI throws the following message on the console {noformat} [chinna@stobdtserver1 bin]$ ./hive ./ext/cli.sh: line 20: [: ==: unary operator expected {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10823) CLI start script throwing error message on console
[ https://issues.apache.org/jira/browse/HIVE-10823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam resolved HIVE-10823. - Resolution: Duplicate CLI start script throwing error message on console -- Key: HIVE-10823 URL: https://issues.apache.org/jira/browse/HIVE-10823 Project: Hive Issue Type: Sub-task Components: CLI Affects Versions: beeline-cli-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Starting the CLI throws the following message on the console {noformat} [chinna@stobdtserver1 bin]$ ./hive ./ext/cli.sh: line 20: [: ==: unary operator expected {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10792) PPD leads to wrong answer when mapper scans the same table with multiple aliases
[ https://issues.apache.org/jira/browse/HIVE-10792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559027#comment-14559027 ] Hive QA commented on HIVE-10792: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735292/HIVE-10792.2.patch {color:red}ERROR:{color} -1 due to 640 failed/errored test(s), 8975 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_deserialize_map_null org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
[jira] [Commented] (HIVE-10822) CLI start script throwing error message on console
[ https://issues.apache.org/jira/browse/HIVE-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559084#comment-14559084 ] Ferdinand Xu commented on HIVE-10822: - [~chinnalalam], I can't reproduce this issue locally. Did you set export USE_DEPRECATED_CLI=true before starting? Thank you CLI start script throwing error message on console -- Key: HIVE-10822 URL: https://issues.apache.org/jira/browse/HIVE-10822 Project: Hive Issue Type: Sub-task Components: CLI Affects Versions: beeline-cli-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-10822.patch Starting the CLI throws the following message on the console {noformat} [chinna@stobdtserver1 bin]$ ./hive ./ext/cli.sh: line 20: [: ==: unary operator expected {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10822) CLI start script throwing error message on console
[ https://issues.apache.org/jira/browse/HIVE-10822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559089#comment-14559089 ] Ferdinand Xu commented on HIVE-10822: - +1 for the patch. Thank you for figuring it out. CLI start script throwing error message on console -- Key: HIVE-10822 URL: https://issues.apache.org/jira/browse/HIVE-10822 Project: Hive Issue Type: Sub-task Components: CLI Affects Versions: beeline-cli-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-10822.patch Starting the CLI throws the following message on the console {noformat} [chinna@stobdtserver1 bin]$ ./hive ./ext/cli.sh: line 20: [: ==: unary operator expected {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
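For context, `[: ==: unary operator expected` is the classic symptom of an unquoted variable inside a `[ ... ]` test. The sketch below is illustrative only: it is not the actual contents of ext/cli.sh, and the variable name is borrowed from the discussion above.

```shell
#!/bin/sh
# Hypothetical reproduction of "[: ==: unary operator expected".
# When USE_DEPRECATED_CLI is unset, the unquoted expansion collapses to
#   [ == true ]
# and `[` complains about the stray `==` with no left-hand operand.
buggy_check() {
  if [ $USE_DEPRECATED_CLI == "true" ]; then
    echo "old cli"
  fi
}

# Quoting the variable (and using the portable `=`) keeps the test valid
# even when the variable is empty or unset.
safe_check() {
  if [ "$USE_DEPRECATED_CLI" = "true" ]; then
    echo "old cli"
  else
    echo "beeline"
  fi
}

USE_DEPRECATED_CLI=true
safe_check   # prints "old cli"
```

With quoting in place the test degrades gracefully to the `beeline` branch when the variable is missing, instead of aborting with a parse error from `[`.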
[jira] [Updated] (HIVE-10697) ObjectInspectorConvertors#UnionConvertor does a faulty conversion
[ https://issues.apache.org/jira/browse/HIVE-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Olson,Andrew updated HIVE-10697: Summary: ObjectInspectorConvertors#UnionConvertor does a faulty conversion (was: ObjecInspectorConvertors#UnionConvertor does a faulty conversion) ObjectInspectorConvertors#UnionConvertor does a faulty conversion - Key: HIVE-10697 URL: https://issues.apache.org/jira/browse/HIVE-10697 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Currently the UnionConvertor in the ObjectInspectorConvertors class has an issue with the convert method where it attempts to convert the objectinspector itself instead of converting the field.[1]. This should be changed to convert the field itself. This could result in a ClassCastException as shown below: {code} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyUnionObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.lazy.LazyString at org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyStringObjectInspector.getPrimitiveWritableObject(LazyStringObjectInspector.java:51) at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:391) at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$TextConverter.convert(PrimitiveObjectInspectorConverter.java:338) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$UnionConverter.convert(ObjectInspectorConverters.java:456) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$MapConverter.convert(ObjectInspectorConverters.java:539) at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:395) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.readRow(MapOperator.java:154) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.access$200(MapOperator.java:127) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:518) ... 9 more {code} [1] https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorConverters.java#L466 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
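To make the intended fix concrete, here is a deliberately simplified toy model, not the actual Hive `ObjectInspectorConverters` code: a union converter should pick the field converter by the union's tag and convert the field's value, never hand the inspector object itself to a field converter (which is what produced the ClassCastException above).

```java
import java.util.List;
import java.util.function.Function;

// Toy model of tagged-union conversion (illustrative only; the real code is
// in org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters).
public class UnionConvertSketch {
    // A union value carries a tag selecting which of its fields is populated.
    static final class ToyUnion {
        final int tag;
        final Object field;
        ToyUnion(int tag, Object field) { this.tag = tag; this.field = field; }
    }

    // Correct behavior: dispatch on the tag and convert the *field value*.
    static Object convert(ToyUnion u, List<Function<Object, Object>> fieldConverters) {
        return fieldConverters.get(u.tag).apply(u.field);
    }

    public static void main(String[] args) {
        List<Function<Object, Object>> convs =
            List.of(o -> o.toString(), o -> ((Number) o).doubleValue());
        System.out.println(convert(new ToyUnion(0, 42), convs)); // prints 42
    }
}
```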
[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559100#comment-14559100 ] Hive QA commented on HIVE-10165: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735291/HIVE-10165.5.patch {color:red}ERROR:{color} -1 due to 637 failed/errored test(s), 9048 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_deserialize_map_null org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
[jira] [Commented] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases
[ https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559540#comment-14559540 ] Hive QA commented on HIVE-10811: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735331/HIVE-10811.02.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 8973 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4044/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4044/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4044/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12735331 - PreCommit-HIVE-TRUNK-Build RelFieldTrimmer throws NoSuchElementException in some cases --- Key: HIVE-10811 URL: https://issues.apache.org/jira/browse/HIVE-10811 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10811.01.patch, HIVE-10811.02.patch, HIVE-10811.patch RelFieldTrimmer runs into NoSuchElementException in some cases. 
Stack trace: {noformat} Exception in thread main java.lang.AssertionError: Internal error: While invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)' at org.apache.calcite.util.Util.newInternal(Util.java:743) at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543) at org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269) at org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:768) at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109) at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:730) at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145) at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:607) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:244) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10048) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:207) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at 
org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
[jira] [Commented] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation
[ https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559538#comment-14559538 ] Laljo John Pullokkaran commented on HIVE-10812: --- +1 Scaling PK/FK's selectivity for stats annotation Key: HIVE-10812 URL: https://issues.apache.org/jira/browse/HIVE-10812 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-10812.01.patch, HIVE-10812.02.patch Right now, the computation of the selectivity of the FK side based on the PK side does not take the range of the FK and the range of the PK into consideration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
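The range-aware scaling being discussed can be pictured with a small sketch. The formula below is an assumption for illustration, not the formula in the patch: estimate what fraction of the FK's value range overlaps the PK's range, and use that fraction to scale the join selectivity.

```java
// Illustrative range-overlap scaling for PK/FK join selectivity. This is an
// assumed formula for explanation, not the one committed in HIVE-10812.
public class PkFkRangeScaling {
    // Fraction of the FK's [fkMin, fkMax] range covered by the PK's range.
    static double overlapFraction(double fkMin, double fkMax,
                                  double pkMin, double pkMax) {
        double fkRange = fkMax - fkMin;
        if (fkRange <= 0) {
            return 1.0; // degenerate FK range: no scaling
        }
        double overlap = Math.min(fkMax, pkMax) - Math.max(fkMin, pkMin);
        return Math.max(0.0, Math.min(1.0, overlap / fkRange));
    }

    public static void main(String[] args) {
        // FK values span [0, 100], PK values span [50, 100]: half overlap.
        System.out.println(overlapFraction(0, 100, 50, 100)); // prints 0.5
    }
}
```

Without such scaling, a FK column whose values mostly fall outside the PK's range gets the same selectivity estimate as one fully contained in it.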
[jira] [Commented] (HIVE-10711) Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem
[ https://issues.apache.org/jira/browse/HIVE-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559618#comment-14559618 ] Mostafa Mokhtar commented on HIVE-10711: [~apivovarov] do you have any more feedback? Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem -- Key: HIVE-10711 URL: https://issues.apache.org/jira/browse/HIVE-10711 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Mostafa Mokhtar Fix For: 1.2.1 Attachments: HIVE-10711.1.patch, HIVE-10711.2.patch, HIVE-10711.3.patch, HIVE-10711.4.patch Tez HashTableLoader bases its memory allocation on HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD. If this value is larger than the process max memory, then this can result in the HashTableLoader trying to use more memory than is available to the process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
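The direction the issue title points at can be sketched as a simple clamp. This is assumed logic for illustration, not the actual HashTableLoader code, and the headroom fraction is an invented parameter: never size hash tables from the configured threshold alone, cap it by what the process can actually offer.

```java
// Illustrative clamp of a configured join-conversion threshold to the
// process memory limit. Names and the 0.9 headroom fraction are assumptions,
// not the actual HashTableLoader implementation.
public class HashTableMemoryClamp {
    // `configuredThreshold` models the noconditional-task size setting;
    // `processMaxMemory` models Runtime.getRuntime().maxMemory().
    static long effectiveMemory(long configuredThreshold, long processMaxMemory) {
        long usable = (long) (processMaxMemory * 0.9); // leave some headroom
        return Math.min(configuredThreshold, usable);
    }

    public static void main(String[] args) {
        // A 10 GB threshold against a 1 GB heap is clamped to ~0.9 GB.
        System.out.println(effectiveMemory(10L << 30, 1L << 30));
    }
}
```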
[jira] [Updated] (HIVE-10749) Implement Insert ACID statement for parquet [Parquet branch]
[ https://issues.apache.org/jira/browse/HIVE-10749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-10749: --- Summary: Implement Insert ACID statement for parquet [Parquet branch] (was: Implement Insert ACID statement for parquet) Implement Insert ACID statement for parquet [Parquet branch] Key: HIVE-10749 URL: https://issues.apache.org/jira/browse/HIVE-10749 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-10749.1.patch, HIVE-10749.1.patch, HIVE-10749.2.patch, HIVE-10749.patch We need to implement insert statement for parquet format like ORC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation
[ https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-10812: --- Attachment: HIVE-10812.03.patch Scaling PK/FK's selectivity for stats annotation Key: HIVE-10812 URL: https://issues.apache.org/jira/browse/HIVE-10812 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-10812.01.patch, HIVE-10812.02.patch, HIVE-10812.03.patch Right now, the computation of the selectivity of the FK side based on the PK side does not take the range of the FK and the range of the PK into consideration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10749) Implement Insert ACID statement for parquet [Parquet branch]
[ https://issues.apache.org/jira/browse/HIVE-10749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-10749: --- Attachment: HIVE-10749.2-parquet.patch Re-attaching patch to allow jenkins job to execute tests on parquet branch Implement Insert ACID statement for parquet [Parquet branch] Key: HIVE-10749 URL: https://issues.apache.org/jira/browse/HIVE-10749 Project: Hive Issue Type: Sub-task Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-10749.1.patch, HIVE-10749.1.patch, HIVE-10749.2-parquet.patch, HIVE-10749.2.patch, HIVE-10749.patch We need to implement insert statement for parquet format like ORC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10304) Add deprecation message to HiveCLI
[ https://issues.apache.org/jira/browse/HIVE-10304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559325#comment-14559325 ] Xuefu Zhang commented on HIVE-10304: The final decision is to replace Hive CLI's implementation with Beeline (HIVE-10511). You will still have the script. Since you have so many scripts using Hive CLI, it would be great if you could test them once HIVE-10511 is in place. Thanks. Add deprecation message to HiveCLI -- Key: HIVE-10304 URL: https://issues.apache.org/jira/browse/HIVE-10304 Project: Hive Issue Type: Sub-task Components: CLI Affects Versions: 1.1.0 Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC1.2 Attachments: HIVE-10304.2.patch, HIVE-10304.3.patch, HIVE-10304.patch As Beeline is now the recommended command line tool for Hive, we should add a message to HiveCLI to indicate that it is deprecated and redirect users to Beeline. This is not a suggestion to remove HiveCLI for now, just a helpful pointer so users know to focus their attention on Beeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10815) Let HiveMetaStoreClient Choose MetaStore Randomly
[ https://issues.apache.org/jira/browse/HIVE-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemon Lou updated HIVE-10815: - Attachment: (was: HIVE-10815.patch) Let HiveMetaStoreClient Choose MetaStore Randomly - Key: HIVE-10815 URL: https://issues.apache.org/jira/browse/HIVE-10815 Project: Hive Issue Type: Improvement Components: HiveServer2, Metastore Affects Versions: 1.2.0 Reporter: Nemon Lou Assignee: Nemon Lou Attachments: HIVE-10815.patch Currently HiveMetaStoreClient uses a fixed order to choose MetaStore URIs when multiple metastores are configured. Choosing a MetaStore randomly would be good for load balancing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
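The proposed behavior can be sketched like this. It is an assumed implementation for illustration, not the code in the attached patch: shuffle the configured URI list once per client, then try URIs in that order, so connections spread across metastores instead of all landing on the first configured URI.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative randomization of metastore URIs for load balancing.
// A sketch of the idea in HIVE-10815, not its actual patch.
public class MetastoreUriOrder {
    static List<String> randomizedOrder(List<String> configuredUris, Random rnd) {
        List<String> order = new ArrayList<>(configuredUris);
        Collections.shuffle(order, rnd); // the first URI is tried first
        return order;
    }

    public static void main(String[] args) {
        List<String> uris = List.of("thrift://ms1:9083", "thrift://ms2:9083",
                                    "thrift://ms3:9083");
        System.out.println(randomizedOrder(uris, new Random()));
    }
}
```

Shuffling a copy keeps the configured list itself untouched, and failover semantics stay the same: on connection failure the client simply advances to the next URI in its shuffled order.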
[jira] [Commented] (HIVE-6991) History not able to disable/enable after session started
[ https://issues.apache.org/jira/browse/HIVE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559359#comment-14559359 ] Hive QA commented on HIVE-6991: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735306/HIVE-6991.2.patch {color:red}ERROR:{color} -1 due to 637 failed/errored test(s), 8973 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_deserialize_map_null org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
[jira] [Commented] (HIVE-10277) Unable to process Comment line '--' in HIVE-1.1.0
[ https://issues.apache.org/jira/browse/HIVE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559321#comment-14559321 ] Xuefu Zhang commented on HIVE-10277: Thank you. I have reverted it. @Chinna, I'll reopen the JIRA. Could you resubmit a patch if it's still a problem, and make sure that the tests pass. Thanks, Xuefu On Tue, May 26, 2015 at 7:00 AM, Ferdinand Xu (JIRA) j...@apache.org Unable to process Comment line '--' in HIVE-1.1.0 - Key: HIVE-10277 URL: https://issues.apache.org/jira/browse/HIVE-10277 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.0.0 Reporter: Kaveen Raajan Assignee: Chinna Rao Lalam Priority: Minor Labels: hive Fix For: 1.3.0 Attachments: HIVE-10277-1.patch, HIVE-10277.2.patch, HIVE-10277.patch I tried to use a comment line (*--*) in the HIVE-1.1.0 shell like, hive> --this is comment line hive> show tables; I got an error like {quote} NoViableAltException(-1@[]) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1020) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615) at
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) FAILED: ParseException line 2:0 cannot recognize input near '<EOF>' '<EOF>' '<EOF>' {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
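One way the fix can be pictured, as an assumed sketch rather than the attached HIVE-10277 patch: drop whole-line `--` comments from the CLI input before it reaches the parser, so a lone comment line no longer yields the `cannot recognize input` ParseException.

```java
// Illustrative pre-parse comment stripping. Assumed logic for explanation;
// the committed fix may work differently.
public class CliCommentStripper {
    static String stripCommentLines(String input) {
        StringBuilder sb = new StringBuilder();
        for (String line : input.split("\n", -1)) {
            if (line.trim().startsWith("--")) {
                continue; // whole-line comment: skip it
            }
            if (sb.length() > 0) {
                sb.append('\n');
            }
            sb.append(line);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripCommentLines("--this is comment line\nshow tables;"));
        // prints: show tables;
    }
}
```

Note this sketch deliberately only handles comments that occupy a whole line; stripping trailing `--` comments safely would require tokenizer awareness so that `--` inside string literals is left alone.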
[jira] [Updated] (HIVE-10815) Let HiveMetaStoreClient Choose MetaStore Randomly
[ https://issues.apache.org/jira/browse/HIVE-10815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemon Lou updated HIVE-10815: - Attachment: HIVE-10815.patch Let HiveMetaStoreClient Choose MetaStore Randomly - Key: HIVE-10815 URL: https://issues.apache.org/jira/browse/HIVE-10815 Project: Hive Issue Type: Improvement Components: HiveServer2, Metastore Affects Versions: 1.2.0 Reporter: Nemon Lou Assignee: Nemon Lou Attachments: HIVE-10815.patch, HIVE-10815.patch Currently HiveMetaStoreClient uses a fixed order to choose MetaStore URIs when multiple metastores are configured. Choosing a MetaStore randomly would be good for load balancing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10653) LLAP: registry logs strange lines on daemons
[ https://issues.apache.org/jira/browse/HIVE-10653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-10653: --- Assignee: Sergey Shelukhin (was: Gopal V) LLAP: registry logs strange lines on daemons Key: HIVE-10653 URL: https://issues.apache.org/jira/browse/HIVE-10653 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Discovered while looking at HIVE-10648; [~sseth] mentioned that this should not be happening. Most of the daemons described as being killed were actually alive. Several/all LLAP daemons in the cluster logged these messages at approximately the same time (while AM was stuck, incidentally; perhaps they were just bored with no work). {noformat} 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Starting to refresh ServiceInstanceSet 515383300 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker f698eaee-bf6c-484d-9b90-a60d9005760c which mapped to DynamicServiceInstance [alive=true, host=cn057-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 9d1f50d1-f237-43c1-a8c5-32741e82d18b which mapped to DynamicServiceInstance [alive=true, host=cn041-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker b8a22e2f-652a-4fde-be7a-744786bc93c9 which mapped to DynamicServiceInstance [alive=true, host=cn042-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 8394e271-e0d5-4589-817e-0181db0866b9 which mapped to DynamicServiceInstance [alive=true, host=cn056-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 1cabdcce-1089-4de6-abdf-315f18a8b4c0 which mapped to DynamicServiceInstance [alive=true, host=cn054-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 4027ad61-8c61-4173-90e2-d166ceaad74b which mapped to DynamicServiceInstance [alive=true, host=cn051-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 7f71a05f-f849-43d2-8fdb-09ba144d4b93 which mapped to DynamicServiceInstance [alive=true, host=cn050-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 41835ca1-69cd-4290-8c8f-8a9583a5d635 which mapped to DynamicServiceInstance [alive=true, host=cn053-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 54952e48-41be-48e1-922c-a39d0ee48a33 which mapped to DynamicServiceInstance [alive=true, host=cn055-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new 
worker 980dfe6c-d03b-462b-bee3-35d183c74aee which mapped to DynamicServiceInstance [alive=true, host=cn052-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker d524212a-6743-4f18-bcf6-525a0d4b1a0a which mapped to DynamicServiceInstance [alive=true, host=cn046-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Killing service instance: DynamicServiceInstance [alive=true,
[jira] [Commented] (HIVE-10244) Vectorization : TPC-DS Q80 fails with java.lang.ClassCastException when hive.vectorized.execution.reduce.enabled is enabled
[ https://issues.apache.org/jira/browse/HIVE-10244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559797#comment-14559797 ] Laljo John Pullokkaran commented on HIVE-10244: --- [~mmccline] How can you end up with grouping id without grouping sets? The language prevents referring to grouping id without grouping sets. If grouping sets are present then the previous line should bail out, right? if (desc.isGroupingSetsPresent()) { LOG.info("Grouping sets not supported in vector mode"); return false; } Vectorization : TPC-DS Q80 fails with java.lang.ClassCastException when hive.vectorized.execution.reduce.enabled is enabled --- Key: HIVE-10244 URL: https://issues.apache.org/jira/browse/HIVE-10244 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Matt McCline Attachments: HIVE-10244.01.patch, explain_q80_vectorized_reduce_on.txt Query {code} set hive.vectorized.execution.reduce.enabled=true; with ssr as (select s_store_id as store_id, sum(ss_ext_sales_price) as sales, sum(coalesce(sr_return_amt, 0)) as returns, sum(ss_net_profit - coalesce(sr_net_loss, 0)) as profit from store_sales left outer join store_returns on (ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number), date_dim, store, item, promotion where ss_sold_date_sk = d_date_sk and d_date between cast('1998-08-04' as date) and (cast('1998-09-04' as date)) and ss_store_sk = s_store_sk and ss_item_sk = i_item_sk and i_current_price > 50 and ss_promo_sk = p_promo_sk and p_channel_tv = 'N' group by s_store_id) , csr as (select cp_catalog_page_id as catalog_page_id, sum(cs_ext_sales_price) as sales, sum(coalesce(cr_return_amount, 0)) as returns, sum(cs_net_profit - coalesce(cr_net_loss, 0)) as profit from catalog_sales left outer join catalog_returns on (cs_item_sk = cr_item_sk and cs_order_number = cr_order_number), date_dim, catalog_page, item, promotion where cs_sold_date_sk = d_date_sk and d_date between cast('1998-08-04' as date) and 
(cast('1998-09-04' as date)) and cs_catalog_page_sk = cp_catalog_page_sk and cs_item_sk = i_item_sk and i_current_price > 50 and cs_promo_sk = p_promo_sk and p_channel_tv = 'N' group by cp_catalog_page_id) , wsr as (select web_site_id, sum(ws_ext_sales_price) as sales, sum(coalesce(wr_return_amt, 0)) as returns, sum(ws_net_profit - coalesce(wr_net_loss, 0)) as profit from web_sales left outer join web_returns on (ws_item_sk = wr_item_sk and ws_order_number = wr_order_number), date_dim, web_site, item, promotion where ws_sold_date_sk = d_date_sk and d_date between cast('1998-08-04' as date) and (cast('1998-09-04' as date)) and ws_web_site_sk = web_site_sk and ws_item_sk = i_item_sk and i_current_price > 50 and ws_promo_sk = p_promo_sk and p_channel_tv = 'N' group by web_site_id) select channel , id , sum(sales) as sales , sum(returns) as returns , sum(profit) as profit from (select 'store channel' as channel , concat('store', store_id) as id , sales , returns , profit from ssr union all select 'catalog channel' as channel , concat('catalog_page', catalog_page_id) as id , sales , returns , profit from csr union all select 'web channel' as channel , concat('web_site', web_site_id) as id , sales , returns , profit from wsr ) x group by channel, id with rollup order by channel ,id limit 100 {code} Exception {code} Vertex failed, vertexName=Reducer 5, vertexId=vertex_1426707664723_1377_1_22, diagnostics=[Task failed, taskId=task_1426707664723_1377_1_22_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) \N\N09.285817653506076E84.639990363237801E7-1.1814318134887291E8 \N\N04.682909323885761E82.2415242712669864E7-5.966176123188091E7 \N\N01.2847032699693155E96.300096113768728E7-5.94963316209578E8 at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at
[jira] [Updated] (HIVE-10777) LLAP: add pre-fragment and per-table cache details
[ https://issues.apache.org/jira/browse/HIVE-10777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10777: Attachment: HIVE-10777.02.patch Updated the name of the config setting LLAP: add pre-fragment and per-table cache details -- Key: HIVE-10777 URL: https://issues.apache.org/jira/browse/HIVE-10777 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap Attachments: HIVE-10777.01.patch, HIVE-10777.02.patch, HIVE-10777.WIP.patch, HIVE-10777.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10808) Inner join on Null throwing Cast Exception
[ https://issues.apache.org/jira/browse/HIVE-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559785#comment-14559785 ] Naveen Gangam commented on HIVE-10808: -- [~swarnim] Agreed. However, we received this stack trace from a customer that can no longer reproduce the issue (their infra underwent some changes/upgrades). We have not been able to reproduce this using a test dataset. If I am able to reproduce this more consistently, I can create a unit test for this. Fair? Inner join on Null throwing Cast Exception -- Key: HIVE-10808 URL: https://issues.apache.org/jira/browse/HIVE-10808 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.1 Reporter: Naveen Gangam Assignee: Naveen Gangam Priority: Critical Attachments: HIVE-10808.patch select a.col1, a.col2, a.col3, a.col4 from tab1 a inner join ( select max(x) as x from tab1 where x 20130327 ) r on a.x = r.x where a.col1 = 'F' and a.col3 in ('A', 'S', 'G'); Failed Task log snippet: 2015-05-18 19:22:17,372 INFO [main] org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring retrieval request: __MAP_PLAN__ 2015-05-18 19:22:17,372 INFO [main] org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring cache key: __MAP_PLAN__ 2015-05-18 19:22:17,457 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 14 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) ... 17 more Caused by: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:157) ... 
22 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.NullStructSerDe$NullStructSerDeObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:334) at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:352) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:126) ... 22 more Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.NullStructSerDe$NullStructSerDeObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettableOI(ObjectInspectorUtils.java:) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1149) at
[jira] [Resolved] (HIVE-10653) LLAP: registry logs strange lines on daemons
[ https://issues.apache.org/jira/browse/HIVE-10653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-10653. - Resolution: Fixed Fix Version/s: llap committed to branch LLAP: registry logs strange lines on daemons Key: HIVE-10653 URL: https://issues.apache.org/jira/browse/HIVE-10653 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap Discovered while looking at HIVE-10648; [~sseth] mentioned that this should not be happening. Most of the daemons described as being killed were actually alive. Several/all LLAP daemons in the cluster logged these messages at approximately the same time (while AM was stuck, incidentally; perhaps they were just bored with no work). {noformat} 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Starting to refresh ServiceInstanceSet 515383300 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker f698eaee-bf6c-484d-9b90-a60d9005760c which mapped to DynamicServiceInstance [alive=true, host=cn057-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 9d1f50d1-f237-43c1-a8c5-32741e82d18b which mapped to DynamicServiceInstance [alive=true, host=cn041-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker b8a22e2f-652a-4fde-be7a-744786bc93c9 which mapped to DynamicServiceInstance [alive=true, host=cn042-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO 
org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 8394e271-e0d5-4589-817e-0181db0866b9 which mapped to DynamicServiceInstance [alive=true, host=cn056-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 1cabdcce-1089-4de6-abdf-315f18a8b4c0 which mapped to DynamicServiceInstance [alive=true, host=cn054-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 4027ad61-8c61-4173-90e2-d166ceaad74b which mapped to DynamicServiceInstance [alive=true, host=cn051-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 7f71a05f-f849-43d2-8fdb-09ba144d4b93 which mapped to DynamicServiceInstance [alive=true, host=cn050-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 41835ca1-69cd-4290-8c8f-8a9583a5d635 which mapped to DynamicServiceInstance [alive=true, host=cn053-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker 54952e48-41be-48e1-922c-a39d0ee48a33 which mapped to DynamicServiceInstance [alive=true, host=cn055-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new 
worker 980dfe6c-d03b-462b-bee3-35d183c74aee which mapped to DynamicServiceInstance [alive=true, host=cn052-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Adding new worker d524212a-6743-4f18-bcf6-525a0d4b1a0a which mapped to DynamicServiceInstance [alive=true, host=cn046-10.l42scl.hortonworks.com:15001 with resources=memory:20480, vCores:6] 2015-05-07 12:14:30,016 [LlapYarnRegistryRefresher()] INFO org.apache.hadoop.hive.llap.daemon.registry.impl.LlapYarnRegistryImpl: Killing service instance: DynamicServiceInstance
[jira] [Commented] (HIVE-10808) Inner join on Null throwing Cast Exception
[ https://issues.apache.org/jira/browse/HIVE-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559800#comment-14559800 ] Swarnim Kulkarni commented on HIVE-10808: - Sounds great. Easier to review patches with tests on it which guarantee that the patch actually works ;) Inner join on Null throwing Cast Exception -- Key: HIVE-10808 URL: https://issues.apache.org/jira/browse/HIVE-10808 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.1 Reporter: Naveen Gangam Assignee: Naveen Gangam Priority: Critical Attachments: HIVE-10808.patch select a.col1, a.col2, a.col3, a.col4 from tab1 a inner join ( select max(x) as x from tab1 where x 20130327 ) r on a.x = r.x where a.col1 = 'F' and a.col3 in ('A', 'S', 'G'); Failed Task log snippet: 2015-05-18 19:22:17,372 INFO [main] org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring retrieval request: __MAP_PLAN__ 2015-05-18 19:22:17,372 INFO [main] org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring cache key: __MAP_PLAN__ 2015-05-18 19:22:17,457 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163) Caused by: java.lang.reflect.InvocationTargetException at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) ... 9 more Caused by: java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) ... 14 more Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) ... 17 more Caused by: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:157) ... 22 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.NullStructSerDe$NullStructSerDeObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:334) at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:352) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:126) ... 
22 more Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.NullStructSerDe$NullStructSerDeObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettableOI(ObjectInspectorUtils.java:) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1149) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConvertedOI(ObjectInspectorConverters.java:219) at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConvertedOI(ObjectInspectorConverters.java:183) at
[jira] [Resolved] (HIVE-10777) LLAP: add pre-fragment and per-table cache details
[ https://issues.apache.org/jira/browse/HIVE-10777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-10777. - committed to branch LLAP: add pre-fragment and per-table cache details -- Key: HIVE-10777 URL: https://issues.apache.org/jira/browse/HIVE-10777 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap Attachments: HIVE-10777.01.patch, HIVE-10777.02.patch, HIVE-10777.WIP.patch, HIVE-10777.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6791) Support variable substitution for Beeline shell command
[ https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559120#comment-14559120 ] Xuefu Zhang commented on HIVE-6791: --- Yes, I just assigned it to you. Thanks. Support variable substitution for Beeline shell command - Key: HIVE-6791 URL: https://issues.apache.org/jira/browse/HIVE-6791 Project: Hive Issue Type: New Feature Components: CLI, Clients Affects Versions: 0.14.0 Reporter: Xuefu Zhang Assignee: Ferdinand Xu A follow-up task from HIVE-6694. Similar to HIVE-6570. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-6791) Support variable substitution for Beeline shell command
[ https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang reassigned HIVE-6791: - Assignee: Ferdinand Xu (was: Xuefu Zhang) Support variable substitution for Beeline shell command - Key: HIVE-6791 URL: https://issues.apache.org/jira/browse/HIVE-6791 Project: Hive Issue Type: New Feature Components: CLI, Clients Affects Versions: 0.14.0 Reporter: Xuefu Zhang Assignee: Ferdinand Xu A follow-up task from HIVE-6694. Similar to HIVE-6570. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
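HIVE-6791 asks for variable substitution in Beeline shell commands. As an illustration of the requested behavior only (this is not the HIVE-6791 patch, and the class and method names below are hypothetical), a minimal `${var}` substitution pass could look like:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of ${var} substitution for a command line; not
// the actual HIVE-6791 implementation.
public class VarSubSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} with its value; unknown names are left as-is. */
    static String substitute(String cmd, Map<String, String> vars) {
        Matcher m = VAR.matcher(cmd);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String replacement = vars.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

The sketch only shows the shape of the feature; in Hive itself the substitution machinery referenced from HIVE-6694 and HIVE-6570 would be reused.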
[jira] [Commented] (HIVE-10277) Unable to process Comment line '--' in HIVE-1.1.0
[ https://issues.apache.org/jira/browse/HIVE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559123#comment-14559123 ] Ferdinand Xu commented on HIVE-10277: - Hi [~xuefuz], seems this commit breaks lots of tests. Could you take a look at it? http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4034/ Unable to process Comment line '--' in HIVE-1.1.0 - Key: HIVE-10277 URL: https://issues.apache.org/jira/browse/HIVE-10277 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.0.0 Reporter: Kaveen Raajan Assignee: Chinna Rao Lalam Priority: Minor Labels: hive Fix For: 1.3.0 Attachments: HIVE-10277-1.patch, HIVE-10277.2.patch, HIVE-10277.patch I tried to use a comment line (*--*) in the HIVE-1.1.0 shell like: hive> --this is comment line hive> show tables; I got an error like {quote} NoViableAltException(-1@[]) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1020) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:199) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:393) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:307) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1112) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1160) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1039) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:207) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:754) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) FAILED: ParseException line 2:0 cannot recognize input near 'EOF' 'EOF' 'EOF' {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
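The parse failure above happens because full-line "--" comments reach the SQL parser. As a hedged sketch of the general fix idea only (not the actual HIVE-10277 patch, which changed the CLI driver), stripping comment-only lines before parsing looks like:

```java
// Illustrative only: drop lines that contain nothing but a "--" comment
// before the text is handed to the parser. Not the HIVE-10277 patch.
public class CommentStripSketch {
    static String stripCommentLines(String script) {
        StringBuilder sb = new StringBuilder();
        for (String line : script.split("\n")) {
            if (!line.trim().startsWith("--")) {  // keep everything except comment-only lines
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }
}
```

Note that this naive version would also strip a line whose leading "--" sits inside a multi-line string literal; a real fix has to be comment-aware at the lexer level.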
[jira] [Commented] (HIVE-10819) SearchArgumentImpl for Timestamp is broken by HIVE-10286
[ https://issues.apache.org/jira/browse/HIVE-10819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559696#comment-14559696 ] Sergey Shelukhin commented on HIVE-10819: - this breaks a lot of tests... SearchArgumentImpl for Timestamp is broken by HIVE-10286 Key: HIVE-10819 URL: https://issues.apache.org/jira/browse/HIVE-10819 Project: Hive Issue Type: Bug Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 1.2.1 Attachments: HIVE-10819.1.patch, HIVE-10819.2.patch The workaround for the kryo bug for Timestamp was accidentally removed by HIVE-10286. Need to bring it back. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10711) Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem
[ https://issues.apache.org/jira/browse/HIVE-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559699#comment-14559699 ] Mostafa Mokhtar commented on HIVE-10711: [~sushanth] FYI [~apivovarov] Can you please commit the change to 1.2.1? Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem -- Key: HIVE-10711 URL: https://issues.apache.org/jira/browse/HIVE-10711 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Mostafa Mokhtar Fix For: 1.2.1 Attachments: HIVE-10711.1.patch, HIVE-10711.2.patch, HIVE-10711.3.patch, HIVE-10711.4.patch Tez HashTableLoader bases its memory allocation on HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD. If this value is larger than the process max memory then this can result in the HashTableLoader trying to use more memory than is available to the process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
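The failure mode described in HIVE-10711 (a configured threshold larger than the JVM can provide) is the classic case for clamping a configured budget to the process limit. A minimal sketch of that idea, with hypothetical names and not the actual committed patch:

```java
// Illustrative sketch: never plan a hash-table budget above what the
// process can actually provide. Names are hypothetical; this is not
// the committed HIVE-10711 change.
public class MemoryClampSketch {
    /** Returns the usable budget in bytes, capped at the process maximum. */
    static long clampToProcessMemory(long configuredThreshold, long processMaxMemory) {
        return Math.min(configuredThreshold, processMaxMemory);
    }
}
```

In a real loader the second argument would come from something like Runtime.getRuntime().maxMemory() rather than being passed in directly.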
[jira] [Commented] (HIVE-10793) Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront
[ https://issues.apache.org/jira/browse/HIVE-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559704#comment-14559704 ] Mostafa Mokhtar commented on HIVE-10793: [~sushanth] [~sershe] Can this go to 1.2.1 as well? Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront Key: HIVE-10793 URL: https://issues.apache.org/jira/browse/HIVE-10793 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.2.1 Attachments: HIVE-10793.1.patch, HIVE-10793.2.patch HybridHashTableContainer will allocate memory based on an estimate, which means that if the actual size is less than the estimate, the allocated memory won't be used. The number of partitions is calculated based on the estimated data size {code} numPartitions = calcNumPartitions(memoryThreshold, estimatedTableSize, minNumParts, minWbSize, nwayConf); {code} Then, based on the number of partitions, writeBufferSize is set {code} writeBufferSize = (int)(estimatedTableSize / numPartitions); {code} Each hash partition will allocate 1 WriteBuffer, with no further allocation if the estimated data size is correct. The suggested solution is to reduce writeBufferSize by a factor such that only X% of the memory is preallocated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
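The suggestion in the ticket above, preallocating only a fraction of the estimated per-partition size, can be sketched as follows (hypothetical names; not the committed HIVE-10793 patch):

```java
// Illustrative sketch of scaling the initial write-buffer size so that
// only a fraction of the estimated memory is preallocated up front.
public class WriteBufferSketch {
    static int scaledWriteBufferSize(long estimatedTableSize, int numPartitions,
                                     double preallocateFraction) {
        long perPartition = estimatedTableSize / numPartitions;  // original sizing
        // Reserve only a fraction now; buffers can grow if the estimate holds.
        return (int) (perPartition * preallocateFraction);
    }
}
```

With preallocateFraction = 1.0 this degenerates to the original behavior quoted in the description.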
[jira] [Updated] (HIVE-7723) Explain plan for complex query with lots of partitions is slow due to inefficient collection used to find a matching ReadEntity
[ https://issues.apache.org/jira/browse/HIVE-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Mokhtar updated HIVE-7723: -- Attachment: HIVE-7723.11.patch Explain plan for complex query with lots of partitions is slow due to inefficient collection used to find a matching ReadEntity Key: HIVE-7723 URL: https://issues.apache.org/jira/browse/HIVE-7723 Project: Hive Issue Type: Bug Components: CLI, Physical Optimizer Affects Versions: 0.13.1 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Attachments: HIVE-7723.1.patch, HIVE-7723.10.patch, HIVE-7723.11.patch, HIVE-7723.2.patch, HIVE-7723.3.patch, HIVE-7723.4.patch, HIVE-7723.5.patch, HIVE-7723.6.patch, HIVE-7723.7.patch, HIVE-7723.8.patch, HIVE-7723.9.patch Explain on TPC-DS query 64 took 11 seconds; when the CLI was profiled it showed that ReadEntity.equals is taking ~40% of the CPU. ReadEntity.equals is called from the snippet below. Again and again the set is iterated over to get the actual match; a HashMap is a better option for this case as Set doesn't have a get method. Also, for ReadEntity, equals is case-insensitive while hash is case-sensitive, which is an undesired behavior. {code} public static ReadEntity addInput(Set<ReadEntity> inputs, ReadEntity newInput) { // If the input is already present, make sure the new parent is added to the input. 
if (inputs.contains(newInput)) { for (ReadEntity input : inputs) { if (input.equals(newInput)) { if ((newInput.getParents() != null) && (!newInput.getParents().isEmpty())) { input.getParents().addAll(newInput.getParents()); input.setDirect(input.isDirect() || newInput.isDirect()); } return input; } } assert false; } else { inputs.add(newInput); return newInput; } // make compile happy return null; } {code} This is the query used: {code} select cs1.product_name ,cs1.store_name ,cs1.store_zip ,cs1.b_street_number ,cs1.b_streen_name ,cs1.b_city ,cs1.b_zip ,cs1.c_street_number ,cs1.c_street_name ,cs1.c_city ,cs1.c_zip ,cs1.syear ,cs1.cnt ,cs1.s1 ,cs1.s2 ,cs1.s3 ,cs2.s1 ,cs2.s2 ,cs2.s3 ,cs2.syear ,cs2.cnt from (select i_product_name as product_name ,i_item_sk as item_sk ,s_store_name as store_name ,s_zip as store_zip ,ad1.ca_street_number as b_street_number ,ad1.ca_street_name as b_streen_name ,ad1.ca_city as b_city ,ad1.ca_zip as b_zip ,ad2.ca_street_number as c_street_number ,ad2.ca_street_name as c_street_name ,ad2.ca_city as c_city ,ad2.ca_zip as c_zip ,d1.d_year as syear ,d2.d_year as fsyear ,d3.d_year as s2year ,count(*) as cnt ,sum(ss_wholesale_cost) as s1 ,sum(ss_list_price) as s2 ,sum(ss_coupon_amt) as s3 FROM store_sales JOIN store_returns ON store_sales.ss_item_sk = store_returns.sr_item_sk and store_sales.ss_ticket_number = store_returns.sr_ticket_number JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk JOIN date_dim d1 ON store_sales.ss_sold_date_sk = d1.d_date_sk JOIN date_dim d2 ON customer.c_first_sales_date_sk = d2.d_date_sk JOIN date_dim d3 ON customer.c_first_shipto_date_sk = d3.d_date_sk JOIN store ON store_sales.ss_store_sk = store.s_store_sk JOIN customer_demographics cd1 ON store_sales.ss_cdemo_sk= cd1.cd_demo_sk JOIN customer_demographics cd2 ON customer.c_current_cdemo_sk = cd2.cd_demo_sk JOIN promotion ON store_sales.ss_promo_sk = promotion.p_promo_sk JOIN household_demographics hd1 ON store_sales.ss_hdemo_sk = hd1.hd_demo_sk 
JOIN household_demographics hd2 ON customer.c_current_hdemo_sk = hd2.hd_demo_sk JOIN customer_address ad1 ON store_sales.ss_addr_sk = ad1.ca_address_sk JOIN customer_address ad2 ON customer.c_current_addr_sk = ad2.ca_address_sk JOIN income_band ib1 ON hd1.hd_income_band_sk = ib1.ib_income_band_sk JOIN income_band ib2 ON hd2.hd_income_band_sk = ib2.ib_income_band_sk JOIN item ON store_sales.ss_item_sk = item.i_item_sk JOIN (select cs_item_sk ,sum(cs_ext_list_price) as sale,sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit) as refund from catalog_sales JOIN catalog_returns ON catalog_sales.cs_item_sk = catalog_returns.cr_item_sk and catalog_sales.cs_order_number = catalog_returns.cr_order_number group by cs_item_sk having sum(cs_ext_list_price) > 2*sum(cr_refunded_cash+cr_reversed_charge+cr_store_credit)) cs_ui ON store_sales.ss_item_sk = cs_ui.cs_item_sk WHERE cd1.cd_marital_status
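The description's suggestion, replacing the linear scan over the Set with a map keyed by the entity itself, can be sketched like this (a hypothetical illustration of the data-structure change, not the committed HIVE-7723 patch):

```java
import java.util.Map;

// Illustrative: a map keyed by the entity gives O(1) retrieval of the
// existing element, replacing the O(n) iteration in addInput above.
public class AddInputSketch {
    static <T> T addOrMerge(Map<T, T> inputs, T newInput) {
        T existing = inputs.get(newInput);  // single hash lookup, no scan
        if (existing != null) {
            return existing;                // caller merges parents into this one
        }
        inputs.put(newInput, newInput);
        return newInput;
    }
}
```

For ReadEntity specifically this also requires equals and hashCode to agree on case handling, which is the other inconsistency the ticket calls out.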
[jira] [Updated] (HIVE-10825) Add parquet branch profile to jenkins-submit-build.sh
[ https://issues.apache.org/jira/browse/HIVE-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-10825: --- Description: NO PRECOMMIT TESTS (was: NO PRECOMMIT TEST) Add parquet branch profile to jenkins-submit-build.sh - Key: HIVE-10825 URL: https://issues.apache.org/jira/browse/HIVE-10825 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Sergio Peña Assignee: Sergio Peña Priority: Minor Attachments: HIVE-10825.1.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10711) Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem
[ https://issues.apache.org/jira/browse/HIVE-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559729#comment-14559729 ] Alexander Pivovarov commented on HIVE-10711: Mostafa, let's wait 24 hours before commit. Just to clarify: do you want me to commit it to master and then do a hotfix (cherry-pick) from master to branch-1.2? Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem -- Key: HIVE-10711 URL: https://issues.apache.org/jira/browse/HIVE-10711 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Mostafa Mokhtar Fix For: 1.2.1 Attachments: HIVE-10711.1.patch, HIVE-10711.2.patch, HIVE-10711.3.patch, HIVE-10711.4.patch Tez HashTableLoader bases its memory allocation on HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD. If this value is larger than the process max memory then this can result in the HashTableLoader trying to use more memory than available to the process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
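The bug described above is that the loader trusts the configured threshold even when it exceeds the JVM's actual limit. A hedged sketch of the fix's intent (names and the safety factor are assumptions, not Hive's actual code or config) is to clamp the configured value to the memory really available to the process:

```java
// Illustrative only: clamp the configured join-conversion threshold to the
// memory actually available to the process, so the HashTableLoader never
// plans a hash table larger than the JVM heap. `usableFraction` is a
// hypothetical safety factor, not an existing Hive setting.
public class HashTableMemCap {
    public static long effectiveThreshold(long configuredThreshold,
                                          long processMaxMemory,
                                          double usableFraction) {
        long usable = (long) (processMaxMemory * usableFraction);
        return Math.min(configuredThreshold, usable);
    }
}
```

In a real JVM the second argument would come from something like `Runtime.getRuntime().maxMemory()`.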
[jira] [Commented] (HIVE-10825) Add parquet branch profile to jenkins-submit-build.sh
[ https://issues.apache.org/jira/browse/HIVE-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559730#comment-14559730 ] Szehon Ho commented on HIVE-10825: -- +1 Add parquet branch profile to jenkins-submit-build.sh - Key: HIVE-10825 URL: https://issues.apache.org/jira/browse/HIVE-10825 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Sergio Peña Assignee: Sergio Peña Priority: Minor Attachments: HIVE-10825.1.patch NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10711) Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem
[ https://issues.apache.org/jira/browse/HIVE-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559731#comment-14559731 ] Mostafa Mokhtar commented on HIVE-10711: Yes, please. Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem -- Key: HIVE-10711 URL: https://issues.apache.org/jira/browse/HIVE-10711 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Mostafa Mokhtar Fix For: 1.2.1 Attachments: HIVE-10711.1.patch, HIVE-10711.2.patch, HIVE-10711.3.patch, HIVE-10711.4.patch Tez HashTableLoader bases its memory allocation on HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD. If this value is larger than the process max memory then this can result in the HashTableLoader trying to use more memory than available to the process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10711) Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem
[ https://issues.apache.org/jira/browse/HIVE-10711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559667#comment-14559667 ] Alexander Pivovarov commented on HIVE-10711: +1 Tez HashTableLoader attempts to allocate more memory than available when HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD exceeds process max mem -- Key: HIVE-10711 URL: https://issues.apache.org/jira/browse/HIVE-10711 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Mostafa Mokhtar Fix For: 1.2.1 Attachments: HIVE-10711.1.patch, HIVE-10711.2.patch, HIVE-10711.3.patch, HIVE-10711.4.patch Tez HashTableLoader bases its memory allocation on HIVECONVERTJOINNOCONDITIONALTASKTHRESHOLD. If this value is larger than the process max memory then this can result in the HashTableLoader trying to use more memory than available to the process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10793) Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront
[ https://issues.apache.org/jira/browse/HIVE-10793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10793: Fix Version/s: (was: 1.2.1) 1.3.0 Hybrid Hybrid Grace Hash Join : Don't allocate all hash table memory upfront Key: HIVE-10793 URL: https://issues.apache.org/jira/browse/HIVE-10793 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Mostafa Mokhtar Fix For: 1.3.0 Attachments: HIVE-10793.1.patch, HIVE-10793.2.patch HybridHashTableContainer will allocate memory based on estimate, which means if the actual is less than the estimate the allocated memory won't be used. Number of partitions is calculated based on estimated data size {code} numPartitions = calcNumPartitions(memoryThreshold, estimatedTableSize, minNumParts, minWbSize, nwayConf); {code} Then based on number of partitions writeBufferSize is set {code} writeBufferSize = (int)(estimatedTableSize / numPartitions); {code} Each hash partition will allocate 1 WriteBuffer, with no further allocation if the estimate data size is correct. Suggested solution is to reduce writeBufferSize by a factor such that only X% of the memory is preallocated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
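The HIVE-10793 description above quotes the current sizing formulas and suggests shrinking `writeBufferSize` by a factor so only part of the memory is preallocated. A minimal sketch of that suggestion (the fraction parameter is an assumption for illustration, not an existing Hive config) looks like:

```java
// Illustrative sketch of the suggested solution: size each partition's initial
// write buffer at only a fraction of the estimated per-partition data size, so
// an over-estimate no longer pins all of the hash table memory upfront.
public class WriteBufferSizing {
    public static int writeBufferSize(long estimatedTableSize, int numPartitions,
                                      double preallocFraction) {
        long perPartition = estimatedTableSize / numPartitions; // current formula
        return (int) (perPartition * preallocFraction);          // reduced preallocation
    }
}
```

With `preallocFraction = 1.0` this reduces to the quoted `estimatedTableSize / numPartitions`; smaller fractions trade a few extra buffer allocations for less wasted memory when the estimate is high.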
[jira] [Updated] (HIVE-10825) Add parquet branch profile to jenkins-submit-build.sh
[ https://issues.apache.org/jira/browse/HIVE-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-10825: --- Description: NO PRECOMMIT TEST Add parquet branch profile to jenkins-submit-build.sh - Key: HIVE-10825 URL: https://issues.apache.org/jira/browse/HIVE-10825 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Sergio Peña Assignee: Sergio Peña Priority: Minor Attachments: HIVE-10825.1.patch NO PRECOMMIT TEST -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10825) Add parquet branch profile to jenkins-submit-build.sh
[ https://issues.apache.org/jira/browse/HIVE-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-10825: --- Attachment: HIVE-10825.1.patch Add parquet branch profile to jenkins-submit-build.sh - Key: HIVE-10825 URL: https://issues.apache.org/jira/browse/HIVE-10825 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Sergio Peña Assignee: Sergio Peña Priority: Minor Attachments: HIVE-10825.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation
[ https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559839#comment-14559839 ] Hive QA commented on HIVE-10812: {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735375/HIVE-10812.03.patch {color:green}SUCCESS:{color} +1 8974 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4045/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4045/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4045/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12735375 - PreCommit-HIVE-TRUNK-Build Scaling PK/FK's selectivity for stats annotation Key: HIVE-10812 URL: https://issues.apache.org/jira/browse/HIVE-10812 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-10812.01.patch, HIVE-10812.02.patch, HIVE-10812.03.patch Right now, the computation of the selectivity of the FK side based on the PK side does not take into consideration the range of the FK and the range of the PK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
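The HIVE-10812 description says the FK-side selectivity should account for how the FK value range relates to the PK value range. One way that scaling could look, purely as an illustration (the method name, inputs, and formula are assumptions, not Hive's actual stats-annotation code):

```java
// Illustrative only: scale a base FK-side selectivity by the fraction of the
// FK value range that overlaps the PK value range. Disjoint ranges mean no
// FK row can match a PK row, so the selectivity collapses to zero.
public class PkFkSelectivity {
    public static double scaledSelectivity(double baseSelectivity,
                                           long fkMin, long fkMax,
                                           long pkMin, long pkMax) {
        long overlap = Math.min(fkMax, pkMax) - Math.max(fkMin, pkMin) + 1;
        if (overlap <= 0) {
            return 0.0; // ranges do not intersect
        }
        long fkRange = fkMax - fkMin + 1;
        return baseSelectivity * ((double) overlap / fkRange);
    }
}
```

The point of the improvement is exactly this kind of correction: an unscaled PK/FK selectivity over-counts matches when only part of the FK range can actually find partners on the PK side.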
[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559879#comment-14559879 ] Alan Gates commented on HIVE-10165: --- I'll review if someone else doesn't get to it first. It will take me a few days to get to it as I'm out the rest of this week. As far as the failing tests, the 5 earlier failures didn't look related to your patch. Unless we really broke the trunk it's surprising to see 600+ test failures for your later patch. Have you tried running some of these locally to see whether you can reproduce them? Improve hive-hcatalog-streaming extensibility and support updates and deletes. -- Key: HIVE-10165 URL: https://issues.apache.org/jira/browse/HIVE-10165 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 1.2.0 Reporter: Elliot West Assignee: Elliot West Labels: streaming_api Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, HIVE-10165.5.patch h3. Overview I'd like to extend the [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] API so that it also supports the writing of record updates and deletes in addition to the already supported inserts. h3. Motivation We have many Hadoop processes outside of Hive that merge changed facts into existing datasets. Traditionally we achieve this by: reading in a ground-truth dataset and a modified dataset, grouping by a key, sorting by a sequence and then applying a function to determine inserted, updated, and deleted rows. However, in our current scheme we must rewrite all partitions that may potentially contain changes. In practice the number of mutated records is very small when compared with the records contained in a partition. This approach results in a number of operational issues: * Excessive amount of write activity required for small data changes. * Downstream applications cannot robustly read these datasets while they are being updated. 
* Due to the scale of the updates (hundreds of partitions) the scope for contention is high. I believe we can address this problem by instead writing only the changed records to a Hive transactional table. This should drastically reduce the amount of data that we need to write and also provide a means for managing concurrent access to the data. Our existing merge processes can read and retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to an updated form of the hive-hcatalog-streaming API which will then have the required data to perform an update or insert in a transactional manner. h3. Benefits * Enables the creation of large-scale dataset merge processes * Opens up Hive transactional functionality in an accessible manner to processes that operate outside of Hive. h3. Implementation Our changes do not break the existing API contracts. Instead our approach has been to consider the functionality offered by the existing API and our proposed API as fulfilling separate and distinct use-cases. The existing API is primarily focused on the task of continuously writing large volumes of new data into a Hive table for near-immediate analysis. Our use-case, however, is concerned more with the frequent but not continuous ingestion of mutations to a Hive table from some ETL merge process. Consequently we feel it is justifiable to add our new functionality via an alternative set of public interfaces and leave the existing API as is. This keeps both APIs clean and focused at the expense of presenting additional options to potential users. Wherever possible, shared implementation concerns have been factored out into abstract base classes that are open to third-party extension. A detailed breakdown of the changes is as follows: * We've introduced a public {{RecordMutator}} interface whose purpose is to expose insert/update/delete operations to the user. This is a counterpart to the write-only {{RecordWriter}}. 
We've also factored out life-cycle methods common to these two interfaces into a super {{RecordOperationWriter}} interface. Note that the row representation has been changed from {{byte[]}} to {{Object}}. Within our data processing jobs our records are often available in a strongly typed and decoded form such as a POJO or a Tuple object. Therefore it seems to make sense that we are able to pass this through to the {{OrcRecordUpdater}} without having to go through a {{byte[]}} encoding step. This of course still allows users to use {{byte[]}} if they wish. * The introduction of {{RecordMutator}} requires that insert/update/delete operations are then also exposed on a {{TransactionBatch}} type. We've done this
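The proposed {{RecordMutator}} described above could take roughly the following shape. The interface name comes from the description, but the method signatures are assumptions, and the in-memory implementation exists only to show how a caller might drive it:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed mutation API: a counterpart to the
// write-only RecordWriter that also exposes update and delete. Rows are
// Object, per the description, so callers can pass POJOs or Tuples directly.
interface RecordMutator {
    void insert(Object record);
    void update(Object record);
    void delete(Object record);
}

// Toy implementation that records operations; a real one would delegate to
// something like OrcRecordUpdater inside a transaction batch.
public class RecordingMutator implements RecordMutator {
    public final List<String> ops = new ArrayList<>();

    @Override public void insert(Object record) { ops.add("insert:" + record); }
    @Override public void update(Object record) { ops.add("update:" + record); }
    @Override public void delete(Object record) { ops.add("delete:" + record); }
}
```

A merge process would obtain a mutator from a transaction batch, replay its computed insert/update/delete stream against it, and commit the batch.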
[jira] [Commented] (HIVE-7723) Explain plan for complex query with lots of partitions is slow due to in-efficient collection used to find a matching ReadEntity
[ https://issues.apache.org/jira/browse/HIVE-7723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559849#comment-14559849 ] Hive QA commented on HIVE-7723: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735389/HIVE-7723.11.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4046/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4046/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4046/ Messages: {noformat} This message was trimmed, see log for full details [WARNING] /data/hive-ptest/working/apache-github-source-source/spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java: Recompile with -Xlint:unchecked for details. [INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ spark-client --- [INFO] Using 'UTF-8' encoding to copy filtered resources. 
[INFO] Copying 1 resource [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [copy] Copying 11 files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ spark-client --- [INFO] Compiling 5 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes [INFO] [INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client --- [INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar [INFO] Copying guava-14.0.1.jar to /data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client --- [INFO] Tests are skipped. 
[INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-1.3.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ spark-client --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-1.3.0-SNAPSHOT.jar to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/1.3.0-SNAPSHOT/spark-client-1.3.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/1.3.0-SNAPSHOT/spark-client-1.3.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Query Language 1.3.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec --- [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-exec --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen Generating vector expression code Generating vector expression test code [INFO] Executed tasks [INFO] [INFO] --- 
build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec --- [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java added. [INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean added. [INFO]
[jira] [Commented] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases
[ https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559854#comment-14559854 ] Laljo John Pullokkaran commented on HIVE-10811: --- [~jcamachorodriguez] I don't get the patch. Shouldn't we be checking collations from rel present in input? RelFieldTrimmer throws NoSuchElementException in some cases --- Key: HIVE-10811 URL: https://issues.apache.org/jira/browse/HIVE-10811 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10811.01.patch, HIVE-10811.02.patch, HIVE-10811.patch RelFieldTrimmer runs into NoSuchElementException in some cases. Stack trace: {noformat} Exception in thread main java.lang.AssertionError: Internal error: While invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)' at org.apache.calcite.util.Util.newInternal(Util.java:743) at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543) at org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269) at org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:768) at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109) at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:730) at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145) at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105) at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:607) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:244) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10048) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:207) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at 
org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:536) ... 32 more Caused by: java.lang.AssertionError: Internal error: While invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)' at org.apache.calcite.util.Util.newInternal(Util.java:743) at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559866#comment-14559866 ] Elliot West commented on HIVE-10165: I'm not quite sure what to do next. I have a '-1' because some (unrelated) tests fail. However I (perhaps naïvely) don't believe this is connected to my patch. Could someone please review? Improve hive-hcatalog-streaming extensibility and support updates and deletes. -- Key: HIVE-10165 URL: https://issues.apache.org/jira/browse/HIVE-10165 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 1.2.0 Reporter: Elliot West Assignee: Elliot West Labels: streaming_api Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, HIVE-10165.5.patch h3. Overview I'd like to extend the [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] API so that it also supports the writing of record updates and deletes in addition to the already supported inserts. h3. Motivation We have many Hadoop processes outside of Hive that merge changed facts into existing datasets. Traditionally we achieve this by: reading in a ground-truth dataset and a modified dataset, grouping by a key, sorting by a sequence and then applying a function to determine inserted, updated, and deleted rows. However, in our current scheme we must rewrite all partitions that may potentially contain changes. In practice the number of mutated records is very small when compared with the records contained in a partition. This approach results in a number of operational issues: * Excessive amount of write activity required for small data changes. * Downstream applications cannot robustly read these datasets while they are being updated. * Due to the scale of the updates (hundreds of partitions) the scope for contention is high. I believe we can address this problem by instead writing only the changed records to a Hive transactional table. 
This should drastically reduce the amount of data that we need to write and also provide a means for managing concurrent access to the data. Our existing merge processes can read and retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to an updated form of the hive-hcatalog-streaming API which will then have the required data to perform an update or insert in a transactional manner. h3. Benefits * Enables the creation of large-scale dataset merge processes * Opens up Hive transactional functionality in an accessible manner to processes that operate outside of Hive. h3. Implementation Our changes do not break the existing API contracts. Instead our approach has been to consider the functionality offered by the existing API and our proposed API as fulfilling separate and distinct use-cases. The existing API is primarily focused on the task of continuously writing large volumes of new data into a Hive table for near-immediate analysis. Our use-case, however, is concerned more with the frequent but not continuous ingestion of mutations to a Hive table from some ETL merge process. Consequently we feel it is justifiable to add our new functionality via an alternative set of public interfaces and leave the existing API as is. This keeps both APIs clean and focused at the expense of presenting additional options to potential users. Wherever possible, shared implementation concerns have been factored out into abstract base classes that are open to third-party extension. A detailed breakdown of the changes is as follows: * We've introduced a public {{RecordMutator}} interface whose purpose is to expose insert/update/delete operations to the user. This is a counterpart to the write-only {{RecordWriter}}. We've also factored out life-cycle methods common to these two interfaces into a super {{RecordOperationWriter}} interface. Note that the row representation has been changed from {{byte[]}} to {{Object}}. 
Within our data processing jobs our records are often available in a strongly typed and decoded form such as a POJO or a Tuple object. Therefore it seems to make sense that we are able to pass this through to the {{OrcRecordUpdater}} without having to go through a {{byte[]}} encoding step. This of course still allows users to use {{byte[]}} if they wish. * The introduction of {{RecordMutator}} requires that insert/update/delete operations are then also exposed on a {{TransactionBatch}} type. We've done this with the introduction of a public {{MutatorTransactionBatch}} interface which is a counterpart to the write-only {{TransactionBatch}}. We've also factored out life-cycle methods common to these two
[jira] [Commented] (HIVE-10753) hs2 jdbc url - wrong connection string cause error on beeline/jdbc/odbc client, misleading message
[ https://issues.apache.org/jira/browse/HIVE-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559893#comment-14559893 ] Thejas M Nair commented on HIVE-10753: -- +1 hs2 jdbc url - wrong connection string cause error on beeline/jdbc/odbc client, misleading message --- Key: HIVE-10753 URL: https://issues.apache.org/jira/browse/HIVE-10753 Project: Hive Issue Type: Bug Components: Beeline, JDBC Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10753.1.patch, HIVE-10753.2.patch {noformat} beeline -u 'jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http' -n hdiuser scan complete in 15ms Connecting to jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http Java heap space Beeline version 0.14.0.2.2.4.1-1 by Apache Hive 0: jdbc:hive2://localhost:10001/default (closed) ^Chdiuser@headnode0:~$ But it works if I use the deprecated param - hdiuser@headnode0:~$ beeline -u 'jdbc:hive2://localhost:10001/default?hive.server2.transport.mode=http;httpPath=/' -n hdiuser scan complete in 12ms Connecting to jdbc:hive2://localhost:10001/default?hive.server2.transport.mode=http;httpPath=/ 15/04/28 23:16:46 [main]: WARN jdbc.Utils: * JDBC param deprecation * 15/04/28 23:16:46 [main]: WARN jdbc.Utils: The use of hive.server2.transport.mode is deprecated. 15/04/28 23:16:46 [main]: WARN jdbc.Utils: Please use transportMode like so: jdbc:hive2://host:port/dbName;transportMode=transport_mode_value Connected to: Apache Hive (version 0.14.0.2.2.4.1-1) Driver: Hive JDBC (version 0.14.0.2.2.4.1-1) Transaction isolation: TRANSACTION_REPEATABLE_READ Beeline version 0.14.0.2.2.4.1-1 by Apache Hive 0: jdbc:hive2://localhost:10001/default show tables; +--+--+ | tab_name | +--+--+ | hivesampletable | +--+--+ 1 row selected (18.181 seconds) 0: jdbc:hive2://localhost:10001/default ^Chdiuser@headnode0:~$ ^C {noformat} The reason for the above message is : The url is wrong. 
Correct one: {code} beeline -u 'jdbc:hive2://localhost:10001/default;httpPath=/;transportMode=http' -n hdiuser {code} Note the ; instead of ?. The deprecation msg prints the format as well: {code} Please use transportMode like so: jdbc:hive2://host:port/dbName;transportMode=transport_mode_value {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10809) HCat FileOutputCommitterContainer leaves behind empty _SCRATCH directories
[ https://issues.apache.org/jira/browse/HIVE-10809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Selina Zhang updated HIVE-10809: Attachment: HIVE-10809.2.patch The above unit test failures do not seem relevant to this patch. Uploaded a new patch that adds verification in TestHCatStorer that the scratch directories are removed. HCat FileOutputCommitterContainer leaves behind empty _SCRATCH directories -- Key: HIVE-10809 URL: https://issues.apache.org/jira/browse/HIVE-10809 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 1.2.0 Reporter: Selina Zhang Assignee: Selina Zhang Attachments: HIVE-10809.1.patch, HIVE-10809.2.patch When static partition is added through HCatStorer or HCatWriter {code} JoinedData = LOAD '/user/selinaz/data/part-r-0' USING JsonLoader(); STORE JoinedData INTO 'selina.joined_events_e' USING org.apache.hive.hcatalog.pig.HCatStorer('author=selina'); {code} The table directory looks like {noformat} drwx-- - selinaz users 0 2015-05-22 21:19 /user/selinaz/joined_events_e/_SCRATCH0.9157208938193798 drwx-- - selinaz users 0 2015-05-22 21:19 /user/selinaz/joined_events_e/author=selina {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10244) Vectorization : TPC-DS Q80 fails with java.lang.ClassCastException when hive.vectorized.execution.reduce.enabled is enabled
[ https://issues.apache.org/jira/browse/HIVE-10244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559918#comment-14559918 ] Matt McCline commented on HIVE-10244: - Ya, I know, that is what I thought. But the new prune flag seems to be on in the Reducer even though isGroupingSetsPresent is false. We should talk to the author and reviewer of the change. Jedi Master [~ashutoshc], can you explain to us Padawan Learners [~jpullokkaran] [~mmccline] [~jcamachorodriguez] all about the prune flag? Vectorization : TPC-DS Q80 fails with java.lang.ClassCastException when hive.vectorized.execution.reduce.enabled is enabled --- Key: HIVE-10244 URL: https://issues.apache.org/jira/browse/HIVE-10244 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Matt McCline Attachments: HIVE-10244.01.patch, explain_q80_vectorized_reduce_on.txt Query {code}
set hive.vectorized.execution.reduce.enabled=true;
with ssr as
 (select s_store_id as store_id,
         sum(ss_ext_sales_price) as sales,
         sum(coalesce(sr_return_amt, 0)) as returns,
         sum(ss_net_profit - coalesce(sr_net_loss, 0)) as profit
  from store_sales left outer join store_returns
       on (ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number),
       date_dim, store, item, promotion
  where ss_sold_date_sk = d_date_sk
        and d_date between cast('1998-08-04' as date) and (cast('1998-09-04' as date))
        and ss_store_sk = s_store_sk
        and ss_item_sk = i_item_sk
        and i_current_price > 50
        and ss_promo_sk = p_promo_sk
        and p_channel_tv = 'N'
  group by s_store_id),
 csr as
 (select cp_catalog_page_id as catalog_page_id,
         sum(cs_ext_sales_price) as sales,
         sum(coalesce(cr_return_amount, 0)) as returns,
         sum(cs_net_profit - coalesce(cr_net_loss, 0)) as profit
  from catalog_sales left outer join catalog_returns
       on (cs_item_sk = cr_item_sk and cs_order_number = cr_order_number),
       date_dim, catalog_page, item, promotion
  where cs_sold_date_sk = d_date_sk
        and d_date between cast('1998-08-04' as date) and (cast('1998-09-04' as date))
        and cs_catalog_page_sk = cp_catalog_page_sk
        and cs_item_sk = i_item_sk
        and i_current_price > 50
        and cs_promo_sk = p_promo_sk
        and p_channel_tv = 'N'
  group by cp_catalog_page_id),
 wsr as
 (select web_site_id,
         sum(ws_ext_sales_price) as sales,
         sum(coalesce(wr_return_amt, 0)) as returns,
         sum(ws_net_profit - coalesce(wr_net_loss, 0)) as profit
  from web_sales left outer join web_returns
       on (ws_item_sk = wr_item_sk and ws_order_number = wr_order_number),
       date_dim, web_site, item, promotion
  where ws_sold_date_sk = d_date_sk
        and d_date between cast('1998-08-04' as date) and (cast('1998-09-04' as date))
        and ws_web_site_sk = web_site_sk
        and ws_item_sk = i_item_sk
        and i_current_price > 50
        and ws_promo_sk = p_promo_sk
        and p_channel_tv = 'N'
  group by web_site_id)
select channel, id, sum(sales) as sales, sum(returns) as returns, sum(profit) as profit
from (select 'store channel' as channel, concat('store', store_id) as id, sales, returns, profit from ssr
      union all
      select 'catalog channel' as channel, concat('catalog_page', catalog_page_id) as id, sales, returns, profit from csr
      union all
      select 'web channel' as channel, concat('web_site', web_site_id) as id, sales, returns, profit from wsr) x
group by channel, id with rollup
order by channel, id
limit 100
{code} Exception {code} Vertex failed, vertexName=Reducer 5, vertexId=vertex_1426707664723_1377_1_22, diagnostics=[Task failed, taskId=task_1426707664723_1377_1_22_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) \N\N09.285817653506076E84.639990363237801E7-1.1814318134887291E8 \N\N04.682909323885761E82.2415242712669864E7-5.966176123188091E7 \N\N01.2847032699693155E96.300096113768728E7-5.94963316209578E8 at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at
[jira] [Commented] (HIVE-9069) Simplify filter predicates for CBO
[ https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559220#comment-14559220 ] Hive QA commented on HIVE-9069: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735298/HIVE-9069.14.patch {color:red}ERROR:{color} -1 due to 636 failed/errored test(s), 8974 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alias_casted_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table2_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_numbuckets_partitioned_table_h23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_protect_mode org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_serde2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_add_column3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_change_schema org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_comments org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_compression_enabled_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_date org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_deserialize_map_null org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_evolved_schemas org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_joins_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_sanity_test org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_timestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_type_evolution org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_output_format org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
[jira] [Commented] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases
[ https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559104#comment-14559104 ] Hive QA commented on HIVE-10811: {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12735296/HIVE-10811.01.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4040/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4040/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4040/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4040/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 1f75e34 HIVE-9605:Remove parquet nested objects from wrapper writable objects (Sergio Pena, reviewed by Ferdinand Xu) + git clean -f -d Removing hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/ Removing hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/ + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at 1f75e34 HIVE-9605:Remove parquet nested objects from wrapper writable objects (Sergio Pena, reviewed by Ferdinand Xu) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12735296 - PreCommit-HIVE-TRUNK-Build RelFieldTrimmer throws NoSuchElementException in some cases --- Key: HIVE-10811 URL: https://issues.apache.org/jira/browse/HIVE-10811 Project: Hive Issue Type: Bug Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10811.01.patch, HIVE-10811.patch RelFieldTrimmer runs into NoSuchElementException in some cases. 
Stack trace: {noformat} Exception in thread "main" java.lang.AssertionError: Internal error: While invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
 at org.apache.calcite.util.Util.newInternal(Util.java:743)
 at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
 at org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
 at org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175)
 at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947)
 at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820) at