[jira] [Commented] (HIVE-6981) Remove old website from SVN
[ https://issues.apache.org/jira/browse/HIVE-6981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172046#comment-14172046 ] Lefty Leverenz commented on HIVE-6981: -- Step 5 in the Publishing section of How to Release is also obsolete (not to mention most of the other steps): {quote} 5. Prepare to edit the website. {{svn co https://svn.apache.org/repos/asf/hive/site}} {quote} We need complete reviews of both the How to Commit and How to Release wikidocs. Should this be a new jira ticket? Quick reference: * [How to Commit | https://cwiki.apache.org/confluence/display/Hive/HowToCommit] ** [How to Commit -- Committing Documentation | https://cwiki.apache.org/confluence/display/Hive/HowToCommit#HowToCommit-CommittingDocumentation] * [How to Release | https://cwiki.apache.org/confluence/display/Hive/HowToRelease] ** [How to Release -- Publishing | https://cwiki.apache.org/confluence/display/Hive/HowToRelease#HowToRelease-Publishing] * [How to edit the website | https://cwiki.apache.org/confluence/display/Hive/How+to+edit+the+website] Remove old website from SVN --- Key: HIVE-6981 URL: https://issues.apache.org/jira/browse/HIVE-6981 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland Command to do the removal: {noformat} svn delete https://svn.apache.org/repos/asf/hive/site/ --message "HIVE-6981 - Remove old website from SVN" {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8320) Error in MetaException(message:Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out)
[ https://issues.apache.org/jira/browse/HIVE-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gavin kim updated HIVE-8320: Attachment: HIVE-8320.1.patch I reworked getMetastoreClient in HiveSessionImpl.java to use the thread-local method (i.e. Hive.get). HiveServer2's metastore client used to be a per-session property; after this patch the metastore client is a per-thread resource. I have not found any problems in my testing yet. Is this suitable for Hive's coding patterns? Error in MetaException(message:Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out) -- Key: HIVE-8320 URL: https://issues.apache.org/jira/browse/HIVE-8320 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.13.1 Reporter: gavin kim Assignee: gavin kim Priority: Minor Labels: patch Fix For: 0.13.1 Attachments: 0001-make-to-synchronize-hiveserver2-session-s-metastore-.patch, HIVE-8320.1.patch I'm using Hive 0.13.1 in a CDH environment. When using Hue's Beeswax, HiveServer2 sometimes throws a MetaException, and after that Hive metadata requests time out. The error log detail is below.
2014-09-29 12:05:44,829 ERROR hive.log: Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
	at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
	at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:826)
	at org.apache.hive.service.cli.operation.GetSchemasOperation.run(GetSchemasOperation.java:62)
	at org.apache.hive.service.cli.session.HiveSessionImpl.runOperationWithLogCapture(HiveSessionImpl.java:562)
	at org.apache.hive.service.cli.session.HiveSessionImpl.getSchemas(HiveSessionImpl.java:315)
	at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
	at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
	at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
	at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
	at com.sun.proxy.$Proxy13.getSchemas(Unknown Source)
	at org.apache.hive.service.cli.CLIService.getSchemas(CLIService.java:273)
	at org.apache.hive.service.cli.thrift.ThriftCLIService.GetSchemas(ThriftCLIService.java:402)
	at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1429)
	at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1414)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at
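The per-thread pattern the patch describes (one metastore client per thread instead of one per session, as with Hive.get) can be sketched roughly as follows. The class and names here are illustrative stand-ins, not Hive's actual HiveSessionImpl code:

```java
// Illustrative sketch: a per-thread client holder in the style of Hive.get().
// ThreadLocalClientHolder and MetaStoreClient are hypothetical names.
public class ThreadLocalClientHolder {

    // Stand-in for an expensive, non-thread-safe Thrift metastore client.
    static class MetaStoreClient {
    }

    // Each thread lazily creates and then reuses its own client, so a
    // timed-out connection on one thread cannot wedge another thread.
    private static final ThreadLocal<MetaStoreClient> CLIENT =
        ThreadLocal.withInitial(MetaStoreClient::new);

    public static MetaStoreClient get() {
        return CLIENT.get();
    }

    public static void main(String[] args) throws InterruptedException {
        MetaStoreClient a = get();
        System.out.println(get() == a);  // prints true: same thread reuses its client
        final MetaStoreClient[] other = new MetaStoreClient[1];
        Thread t = new Thread(() -> other[0] = get());
        t.start();
        t.join();
        System.out.println(other[0] == a);  // prints false: other thread got its own
    }
}
```

The trade-off is that a wedged client now affects only the thread that owns it, at the cost of one client (and one connection) per worker thread.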
[jira] [Commented] (HIVE-8320) Error in MetaException(message:Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out)
[ https://issues.apache.org/jira/browse/HIVE-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172051#comment-14172051 ] gavin kim commented on HIVE-8320: - And I thank you again for your detailed reply. It's very helpful to me. :) Error in MetaException(message:Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out) -- Key: HIVE-8320 URL: https://issues.apache.org/jira/browse/HIVE-8320 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.13.1 Reporter: gavin kim Assignee: gavin kim Priority: Minor Labels: patch Fix For: 0.13.1 Attachments: 0001-make-to-synchronize-hiveserver2-session-s-metastore-.patch, HIVE-8320.1.patch I'm using Hive 0.13.1 in a CDH environment. When using Hue's Beeswax, HiveServer2 sometimes throws a MetaException, and after that Hive metadata requests time out.
[jira] [Updated] (HIVE-8320) Error in MetaException(message:Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out)
[ https://issues.apache.org/jira/browse/HIVE-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gavin kim updated HIVE-8320: Status: Patch Available (was: Open) Error in MetaException(message:Got exception: org.apache.thrift.transport.TTransportException java.net.SocketTimeoutException: Read timed out) -- Key: HIVE-8320 URL: https://issues.apache.org/jira/browse/HIVE-8320 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.13.1 Reporter: gavin kim Assignee: gavin kim Priority: Minor Labels: patch Fix For: 0.13.1 Attachments: 0001-make-to-synchronize-hiveserver2-session-s-metastore-.patch, HIVE-8320.1.patch I'm using Hive 0.13.1 in a CDH environment. When using Hue's Beeswax, HiveServer2 sometimes throws a MetaException, and after that Hive metadata requests time out.
[jira] [Updated] (HIVE-8450) Create table like does not copy over table properties
[ https://issues.apache.org/jira/browse/HIVE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-8450: Assignee: Navis Status: Patch Available (was: Open) Create table like does not copy over table properties - Key: HIVE-8450 URL: https://issues.apache.org/jira/browse/HIVE-8450 Project: Hive Issue Type: Bug Affects Versions: 0.13.1, 0.14.0 Reporter: Brock Noland Assignee: Navis Priority: Critical Attachments: HIVE-8450.1.patch.txt Assuming t2 is an Avro-backed table, the following: {{create table t1 like t2}} should create an Avro-backed table, but the schema.url.* properties are not being copied correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
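The expected LIKE semantics for table parameters can be sketched outside Hive: the new table should start from a copy of the source table's parameter map rather than an empty one. The helper name and property value below are hypothetical, purely for illustration:

```java
import java.util.HashMap;
import java.util.Map;

public class LikeTableParams {

    // Sketch of the expected CREATE TABLE t1 LIKE t2 behavior: t1's parameters
    // begin as a copy of t2's, so storage-handler settings (e.g. a schema URL
    // property) survive the copy.
    static Map<String, String> paramsForLikeTable(Map<String, String> source) {
        // Copy rather than alias, so later edits to t1's parameters
        // cannot mutate t2's.
        return new HashMap<>(source);
    }

    public static void main(String[] args) {
        Map<String, String> t2 = new HashMap<>();
        t2.put("avro.schema.url", "hdfs:///schemas/emp.avsc"); // hypothetical value
        Map<String, String> t1 = paramsForLikeTable(t2);
        System.out.println(t1.get("avro.schema.url"));
    }
}
```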
[jira] [Updated] (HIVE-8450) Create table like does not copy over table properties
[ https://issues.apache.org/jira/browse/HIVE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-8450: Attachment: HIVE-8450.1.patch.txt Create table like does not copy over table properties - Key: HIVE-8450 URL: https://issues.apache.org/jira/browse/HIVE-8450 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 0.13.1 Reporter: Brock Noland Priority: Critical Attachments: HIVE-8450.1.patch.txt Assuming t2 is an Avro-backed table, the following: {{create table t1 like t2}} should create an Avro-backed table, but the schema.url.* properties are not being copied correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7733) Ambiguous column reference error on query
[ https://issues.apache.org/jira/browse/HIVE-7733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172059#comment-14172059 ] Hive QA commented on HIVE-7733: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12674705/HIVE-7733.7.patch.txt {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6557 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1271/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1271/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1271/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12674705 Ambiguous column reference error on query - Key: HIVE-7733 URL: https://issues.apache.org/jira/browse/HIVE-7733 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jason Dere Assignee: Navis Attachments: HIVE-7733.1.patch.txt, HIVE-7733.2.patch.txt, HIVE-7733.3.patch.txt, HIVE-7733.4.patch.txt, HIVE-7733.5.patch.txt, HIVE-7733.6.patch.txt, HIVE-7733.7.patch.txt {noformat}
CREATE TABLE agg1 (
  col0 INT,
  col1 STRING,
  col2 DOUBLE
);

explain
SELECT single_use_subq11.a1 AS a1,
       single_use_subq11.a2 AS a2
FROM (SELECT Sum(agg1.col2) AS a1
      FROM agg1
      GROUP BY agg1.col0) single_use_subq12
JOIN (SELECT alias.a2 AS a0,
             alias.a1 AS a1,
             alias.a1 AS a2
      FROM (SELECT agg1.col1 AS a0,
                   '42' AS a1,
                   agg1.col0 AS a2
            FROM agg1
            UNION ALL
            SELECT agg1.col1 AS a0,
                   '41' AS a1,
                   agg1.col0 AS a2
            FROM agg1) alias
      GROUP BY alias.a2, alias.a1) single_use_subq11
ON ( single_use_subq11.a0 = single_use_subq11.a0 );
{noformat} Gets the following error: FAILED: SemanticException [Error 10007]: Ambiguous column reference a2 It looks like this query worked in 0.12 but started failing with this error in 0.13 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8465) Fix some minor test fails on trunk
[ https://issues.apache.org/jira/browse/HIVE-8465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-8465: Summary: Fix some minor test fails on trunk (was: Fix some minor test fails) Fix some minor test fails on trunk -- Key: HIVE-8465 URL: https://issues.apache.org/jira/browse/HIVE-8465 Project: Hive Issue Type: Task Components: Tests Reporter: Navis Assignee: Navis Priority: Minor org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8465) Fix some minor test fails
Navis created HIVE-8465: --- Summary: Fix some minor test fails Key: HIVE-8465 URL: https://issues.apache.org/jira/browse/HIVE-8465 Project: Hive Issue Type: Task Components: Tests Reporter: Navis Assignee: Navis Priority: Minor org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8083) Authorization DDLs should not enforce hive identifier syntax for user or group
[ https://issues.apache.org/jira/browse/HIVE-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-8083: - Labels: (was: TODOC14) Authorization DDLs should not enforce hive identifier syntax for user or group -- Key: HIVE-8083 URL: https://issues.apache.org/jira/browse/HIVE-8083 Project: Hive Issue Type: Bug Components: SQL, SQLStandardAuthorization Affects Versions: 0.13.0, 0.13.1 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.14.0 Attachments: HIVE-8083.1.patch, HIVE-8083.2.patch, HIVE-8083.3.patch The compiler expects principals (user, group, and role) to be Hive identifiers in authorization DDLs. Users and groups are entities that belong to an external namespace, and we can't expect them to follow Hive identifier syntax rules. For example, a userid or group can contain '-', which is not allowed by the compiler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8465) Fix some minor test fails on trunk
[ https://issues.apache.org/jira/browse/HIVE-8465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-8465: Attachment: HIVE-8465.1.patch.txt Fix some minor test fails on trunk -- Key: HIVE-8465 URL: https://issues.apache.org/jira/browse/HIVE-8465 Project: Hive Issue Type: Task Components: Tests Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-8465.1.patch.txt org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8465) Fix some minor test fails on trunk
[ https://issues.apache.org/jira/browse/HIVE-8465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-8465: Status: Patch Available (was: Open) Fix some minor test fails on trunk -- Key: HIVE-8465 URL: https://issues.apache.org/jira/browse/HIVE-8465 Project: Hive Issue Type: Task Components: Tests Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-8465.1.patch.txt org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7733) Ambiguous column reference error on query
[ https://issues.apache.org/jira/browse/HIVE-7733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172076#comment-14172076 ] Navis commented on HIVE-7733: - I cannot reproduce the failure of TestStreaming.testTransactionBatchAbort; the other failures are tracked at https://issues.apache.org/jira/browse/HIVE-8465 Ambiguous column reference error on query - Key: HIVE-7733 URL: https://issues.apache.org/jira/browse/HIVE-7733 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Jason Dere Assignee: Navis Attachments: HIVE-7733.1.patch.txt, HIVE-7733.2.patch.txt, HIVE-7733.3.patch.txt, HIVE-7733.4.patch.txt, HIVE-7733.5.patch.txt, HIVE-7733.6.patch.txt, HIVE-7733.7.patch.txt {noformat} CREATE TABLE agg1 ( col0 INT, col1 STRING, col2 DOUBLE ); explain SELECT single_use_subq11.a1 AS a1, single_use_subq11.a2 AS a2 FROM (SELECT Sum(agg1.col2) AS a1 FROM agg1 GROUP BY agg1.col0) single_use_subq12 JOIN (SELECT alias.a2 AS a0, alias.a1 AS a1, alias.a1 AS a2 FROM (SELECT agg1.col1 AS a0, '42' AS a1, agg1.col0 AS a2 FROM agg1 UNION ALL SELECT agg1.col1 AS a0, '41' AS a1, agg1.col0 AS a2 FROM agg1) alias GROUP BY alias.a2, alias.a1) single_use_subq11 ON ( single_use_subq11.a0 = single_use_subq11.a0 ); {noformat} Gets the following error: FAILED: SemanticException [Error 10007]: Ambiguous column reference a2 It looks like this query worked in 0.12 but started failing with this error in 0.13 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8083) Authorization DDLs should not enforce hive identifier syntax for user or group
[ https://issues.apache.org/jira/browse/HIVE-8083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172077#comment-14172077 ] Lefty Leverenz commented on HIVE-8083: -- Docs look good. I added links back to this jira and removed the jira's TODOC14 label. Thanks [~prasadm]. Authorization DDLs should not enforce hive identifier syntax for user or group -- Key: HIVE-8083 URL: https://issues.apache.org/jira/browse/HIVE-8083 Project: Hive Issue Type: Bug Components: SQL, SQLStandardAuthorization Affects Versions: 0.13.0, 0.13.1 Reporter: Prasad Mujumdar Assignee: Prasad Mujumdar Fix For: 0.14.0 Attachments: HIVE-8083.1.patch, HIVE-8083.2.patch, HIVE-8083.3.patch The compiler expects principals (user, group, and role) to be Hive identifiers in authorization DDLs. Users and groups are entities that belong to an external namespace, and we can't expect them to follow Hive identifier syntax rules. For example, a userid or group can contain '-', which is not allowed by the compiler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-8455: Attachment: hive on spark job status.PNG Print Spark job progress format info on the console[Spark Branch] - Key: HIVE-8455 URL: https://issues.apache.org/jira/browse/HIVE-8455 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8455.1-spark.patch, hive on spark job status.PNG We added support for Spark job status monitoring in HIVE-7439, but the job progress format info is not printed on the console, so users may be confused about what the progress info means. I would like to add the job progress format info here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-8455: Attachment: HIVE-8455.2-spark.patch Print Spark job progress format info on the console[Spark Branch] - Key: HIVE-8455 URL: https://issues.apache.org/jira/browse/HIVE-8455 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive on spark job status.PNG We added support for Spark job status monitoring in HIVE-7439, but the job progress format info is not printed on the console, so users may be confused about what the progress info means. I would like to add the job progress format info here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172102#comment-14172102 ] Chengxiang Li commented on HIVE-8455: - I've added a date stamp, and the stage cost time is printed after a stage finishes. I tried printing the key every 15 lines, but it clashed with the job progress info. Since the key info is printed at the beginning and the progress info is fairly self-explanatory, I think we may not actually need this. What do you think? Print Spark job progress format info on the console[Spark Branch] - Key: HIVE-8455 URL: https://issues.apache.org/jira/browse/HIVE-8455 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive on spark job status.PNG We added support for Spark job status monitoring in HIVE-7439, but the job progress format info is not printed on the console, so users may be confused about what the progress info means. I would like to add the job progress format info here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 26732: HIVE-8455 Print Spark job progress format info on the console[Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26732/ --- Review request for hive, Brock Noland and Szehon Ho. Bugs: HIVE-8455 https://issues.apache.org/jira/browse/HIVE-8455 Repository: hive-git Description --- 1. Add a date stamp to the progress info. 2. Print the stage cost time after the stage finishes. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobMonitor.java b092abc ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkJobStatus.java 8717fe2 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkProgress.java 36322eb ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/SparkStageProgress.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/SimpleSparkJobStatus.java c7ef83c Diff: https://reviews.apache.org/r/26732/diff/ Testing --- Thanks, chengxiang li
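The console line the review describes (a timestamp plus per-stage progress, with elapsed time once a stage finishes) can be sketched as follows. The class name and exact format are illustrative assumptions, not the patch's actual SparkJobMonitor code:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class ProgressLine {

    // Log4j-style timestamp, matching the format seen in Hive's logs.
    static final DateTimeFormatter TS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Renders one console line: timestamp, stage completed/total task counts,
    // and the stage's elapsed time once it has finished.
    static String render(LocalDateTime now, String stage, int done, int total,
                         double elapsedSec) {
        String progress = stage + ": " + done + "/" + total;
        if (done == total) {
            progress += " Finished in " + elapsedSec + "s";
        }
        return TS.format(now) + "\t" + progress;
    }

    public static void main(String[] args) {
        System.out.println(render(LocalDateTime.now(), "Stage-1", 3, 10, 0));
        System.out.println(render(LocalDateTime.now(), "Stage-1", 10, 10, 12.4));
    }
}
```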
[jira] [Commented] (HIVE-2906) Support providing some table properties by user via SQL
[ https://issues.apache.org/jira/browse/HIVE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172104#comment-14172104 ] cw commented on HIVE-2906: -- Hi [~navis]. We found that we cannot use the identifier 'user' as a table alias because FromClauseParser.g was changed in this patch: {noformat}
-    : tabname=tableName (ts=tableSample)? (KW_AS? alias=identifier)?
-    -> ^(TOK_TABREF $tabname $ts? $alias?)
+    : tabname=tableName (props=tableProperties)? (ts=tableSample)? (KW_AS? alias=Identifier)?
+    -> ^(TOK_TABREF $tabname $props? $ts? $alias?)
{noformat} It changed the lowercase 'identifier' to an uppercase 'Identifier'. Is that intended, or just a mistake? Support providing some table properties by user via SQL --- Key: HIVE-2906 URL: https://issues.apache.org/jira/browse/HIVE-2906 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.4.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.5.patch, HIVE-2906.D2499.6.patch, HIVE-2906.D2499.7.patch Some properties need to be provided to the StorageHandler by the user at runtime: an address for a remote resource, a retry count for access, a maximum version count (for HBase), etc. For example, {code} select emp.empno, emp.ename from hbase_emp ('max.version'='3') emp; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-2906) Support providing some table properties by user via SQL
[ https://issues.apache.org/jira/browse/HIVE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172115#comment-14172115 ] Navis commented on HIVE-2906: - [~cwsteinbach] It appears to have been a mistake; the identifier rule did not exist when this patch was first created (see the first patch). It needs a fix. Would you do that, or shall I? Support providing some table properties by user via SQL --- Key: HIVE-2906 URL: https://issues.apache.org/jira/browse/HIVE-2906 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.4.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.5.patch, HIVE-2906.D2499.6.patch, HIVE-2906.D2499.7.patch Some properties need to be provided to the StorageHandler by the user at runtime: an address for a remote resource, a retry count for access, a maximum version count (for HBase), etc. For example, {code} select emp.empno, emp.ename from hbase_emp ('max.version'='3') emp; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-2906) Support providing some table properties by user via SQL
[ https://issues.apache.org/jira/browse/HIVE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172119#comment-14172119 ] cw commented on HIVE-2906: -- [~navis] I would like to create an issue and submit a patch to fix it. Support providing some table properties by user via SQL --- Key: HIVE-2906 URL: https://issues.apache.org/jira/browse/HIVE-2906 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.4.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.5.patch, HIVE-2906.D2499.6.patch, HIVE-2906.D2499.7.patch Some properties are needed to be provided to StorageHandler by user in runtime. It might be an address for remote resource or retry count for access or maximum version count(for hbase), etc. For example, {code} select emp.empno, emp.ename from hbase_emp ('max.version'='3') emp; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8466) nonReserved keywords can not be used as table alias
cw created HIVE-8466: Summary: nonReserved keywords can not be used as table alias Key: HIVE-8466 URL: https://issues.apache.org/jira/browse/HIVE-8466 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.13.1, 0.13.0, 0.12.0 Reporter: cw Priority: Minor There is a small mistake in the patch for HIVE-2906. See the change in FromClauseParser.g: -: tabname=tableName (ts=tableSample)? (KW_AS? alias=identifier)? -- ^(TOK_TABREF $tabname $ts? $alias?) +: tabname=tableName (props=tableProperties)? (ts=tableSample)? (KW_AS? alias=Identifier)? +- ^(TOK_TABREF $tabname $props? $ts? $alias?) With 'identifier' changed to 'Identifier', we cannot use nonReserved keywords as table aliases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8466) nonReserved keywords can not be used as table alias
[ https://issues.apache.org/jira/browse/HIVE-8466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cw updated HIVE-8466: - Attachment: HIVE-8466.1.patch Commit a patch. nonReserved keywords can not be used as table alias --- Key: HIVE-8466 URL: https://issues.apache.org/jira/browse/HIVE-8466 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.12.0, 0.13.0, 0.13.1 Reporter: cw Priority: Minor Attachments: HIVE-8466.1.patch There is a small mistake in the patch of issue HIVE-2906. See the change of FromClauseParser.g -: tabname=tableName (ts=tableSample)? (KW_AS? alias=identifier)? -- ^(TOK_TABREF $tabname $ts? $alias?) +: tabname=tableName (props=tableProperties)? (ts=tableSample)? (KW_AS? alias=Identifier)? +- ^(TOK_TABREF $tabname $props? $ts? $alias?) With the 'identifier' changed to 'Identifier' we can not use nonReserved keywords as table alias. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8343) Return value from BlockingQueue.offer() is not checked in DynamicPartitionPruner
[ https://issues.apache.org/jira/browse/HIVE-8343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172126#comment-14172126 ] Hive QA commented on HIVE-8343: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12674723/HIVE-8343.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6559 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.testStatsAfterCompactionPartTbl org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1272/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1272/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1272/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12674723 Return value from BlockingQueue.offer() is not checked in DynamicPartitionPruner Key: HIVE-8343 URL: https://issues.apache.org/jira/browse/HIVE-8343 Project: Hive Issue Type: Bug Reporter: Ted Yu Assignee: JongWon Park Priority: Minor Attachments: HIVE-8343.patch In addEvent() and processVertex(), there is call such as the following: {code} queue.offer(event); {code} The return value should be checked. If false is returned, event would not have been queued. 
Take a look at line 328 in: http://fuseyism.com/classpath/doc/java/util/concurrent/LinkedBlockingQueue-source.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
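The failure mode flagged here can be shown with a minimal, self-contained sketch (the class and event names below are invented for illustration and are not taken from DynamicPartitionPruner): on a bounded queue, offer() returns false instead of blocking when the queue is full, so an unchecked call can silently drop an event.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class OfferReturnCheck {
    public static void main(String[] args) {
        // A bounded queue with capacity 1 makes the dropped-event case easy to see.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);

        boolean first = queue.offer("event-1");   // accepted: queue was empty
        boolean second = queue.offer("event-2");  // rejected: queue is full, offer() does not block

        System.out.println(first);   // true
        System.out.println(second);  // false

        // Checking the return value lets the caller retry, log, or fail loudly
        // instead of silently losing the event. A blocking put() is another option.
        if (!second) {
            System.out.println("event-2 was not queued");
        }
    }
}
```

With an unbounded LinkedBlockingQueue the call can still fail once the (default Integer.MAX_VALUE) capacity is reached, which is why the return value should be checked regardless.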
[jira] [Updated] (HIVE-8466) nonReserved keywords can not be used as table alias
[ https://issues.apache.org/jira/browse/HIVE-8466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cw updated HIVE-8466: - Status: Patch Available (was: Open) nonReserved keywords can not be used as table alias --- Key: HIVE-8466 URL: https://issues.apache.org/jira/browse/HIVE-8466 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.13.1, 0.13.0, 0.12.0 Reporter: cw Priority: Minor Attachments: HIVE-8466.1.patch There is a small mistake in the patch of issue HIVE-2906. See the change of FromClauseParser.g -: tabname=tableName (ts=tableSample)? (KW_AS? alias=identifier)? -- ^(TOK_TABREF $tabname $ts? $alias?) +: tabname=tableName (props=tableProperties)? (ts=tableSample)? (KW_AS? alias=Identifier)? +- ^(TOK_TABREF $tabname $props? $ts? $alias?) With the 'identifier' changed to 'Identifier' we can not use nonReserved keywords as table alias. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-2906) Support providing some table properties by user via SQL
[ https://issues.apache.org/jira/browse/HIVE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172132#comment-14172132 ] cw commented on HIVE-2906: -- [~navis] I created an issue here: https://issues.apache.org/jira/browse/HIVE-8466 Support providing some table properties by user via SQL --- Key: HIVE-2906 URL: https://issues.apache.org/jira/browse/HIVE-2906 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.4.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.5.patch, HIVE-2906.D2499.6.patch, HIVE-2906.D2499.7.patch Some properties are needed to be provided to StorageHandler by user in runtime. It might be an address for remote resource or retry count for access or maximum version count(for hbase), etc. For example, {code} select emp.empno, emp.ename from hbase_emp ('max.version'='3') emp; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-2906) Support providing some table properties by user via SQL
[ https://issues.apache.org/jira/browse/HIVE-2906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172135#comment-14172135 ] Navis commented on HIVE-2906: - [~cwsteinbach] Good. Thanks! Support providing some table properties by user via SQL --- Key: HIVE-2906 URL: https://issues.apache.org/jira/browse/HIVE-2906 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Fix For: 0.12.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.4.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2906.D2499.5.patch, HIVE-2906.D2499.6.patch, HIVE-2906.D2499.7.patch Some properties are needed to be provided to StorageHandler by user in runtime. It might be an address for remote resource or retry count for access or maximum version count(for hbase), etc. For example, {code} select emp.empno, emp.ename from hbase_emp ('max.version'='3') emp; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-8456: Attachment: HIVE-8456.2-spark.patch Support Hive Counter to collect spark job metric[Spark Branch] -- Key: HIVE-8456 URL: https://issues.apache.org/jira/browse/HIVE-8456 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch Several Hive query metric in Hive operators is collected by Hive Counter, such as CREATEDFILES and DESERIALIZE_ERRORS, Besides, Hive use Counter as an option to collect table stats info. Spark support Accumulator which is pretty similiar with Hive Counter, we could try to enable Hive Counter based on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 26733: HIVE-8456 Support Hive Counter to collect spark job metric[Spark Branch]
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26733/ --- Review request for hive, Brock Noland and Szehon Ho. Bugs: HIVE-8456 https://issues.apache.org/jira/browse/HIVE-8456 Repository: hive-git Description --- Several Hive query metric in Hive operators is collected by Hive Counter, such as CREATEDFILES and DESERIALIZE_ERRORS, Besides, Hive use Counter as an option to collect table stats info. Spark support Accumulator which is pretty similiar with Hive Counter, we could try to enable Hive Counter based on it. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/counter/SparkCounter.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/counter/SparkCounterGroup.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/spark/counter/SparkCounters.java PRE-CREATION Diff: https://reviews.apache.org/r/26733/diff/ Testing --- Thanks, chengxiang li
[jira] [Commented] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172155#comment-14172155 ] Hive QA commented on HIVE-8455: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12674942/HIVE-8455.2-spark.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6769 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_tez_smb_1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/219/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/219/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-219/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12674942 Print Spark job progress format info on the console[Spark Branch] - Key: HIVE-8455 URL: https://issues.apache.org/jira/browse/HIVE-8455 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive on spark job status.PNG We have add support of spark job status monitoring on HIVE-7439, but not print job progress format info on the console, user may confuse about what the progress info means, so I would like to add job progress format info here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172158#comment-14172158 ] Chengxiang Li commented on HIVE-8456: - Since Spark, unlike MR/Tez, does not use Counters to collect framework/job metrics, Hive on Spark only uses Counters in a few simple cases. I wrote a simple implementation based on Spark accumulators that fits those requirements. Support Hive Counter to collect spark job metric[Spark Branch] -- Key: HIVE-8456 URL: https://issues.apache.org/jira/browse/HIVE-8456 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch Several Hive query metric in Hive operators is collected by Hive Counter, such as CREATEDFILES and DESERIALIZE_ERRORS, Besides, Hive use Counter as an option to collect table stats info. Spark support Accumulator which is pretty similiar with Hive Counter, we could try to enable Hive Counter based on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
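As a rough illustration of the counter-group idea referenced above: the sketch below is a conceptual stand-in only. The class and method names are invented here, and it is backed by plain AtomicLongs, whereas the actual SparkCounter/SparkCounterGroup classes in the patch delegate to Spark accumulators so that increments from distributed tasks aggregate on the driver.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Conceptual sketch of a named counter group, like the Hive Counters
// (e.g. CREATEDFILES, DESERIALIZE_ERRORS) mentioned in the issue.
public class SimpleCounterGroup {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // Create the counter on first use, then add the delta.
    public void increment(String name, long delta) {
        counters.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(delta);
    }

    // Unknown counters read as zero rather than throwing.
    public long getValue(String name) {
        AtomicLong c = counters.get(name);
        return c == null ? 0L : c.get();
    }

    public static void main(String[] args) {
        SimpleCounterGroup group = new SimpleCounterGroup();
        group.increment("CREATEDFILES", 1);
        group.increment("DESERIALIZE_ERRORS", 2);
        group.increment("DESERIALIZE_ERRORS", 1);
        System.out.println(group.getValue("DESERIALIZE_ERRORS")); // 3
    }
}
```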
[jira] [Updated] (HIVE-2573) Create per-session function registry
[ https://issues.apache.org/jira/browse/HIVE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-2573: - Attachment: HIVE-2573.7.patch Fix test failures. TestHiveServerSessions seems to pass when I move it to itests/ Create per-session function registry - Key: HIVE-2573 URL: https://issues.apache.org/jira/browse/HIVE-2573 Project: Hive Issue Type: Improvement Components: Server Infrastructure Reporter: Navis Assignee: Navis Priority: Minor Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2573.D3231.1.patch, HIVE-2573.1.patch.txt, HIVE-2573.2.patch.txt, HIVE-2573.3.patch.txt, HIVE-2573.4.patch.txt, HIVE-2573.5.patch, HIVE-2573.6.patch, HIVE-2573.7.patch Currently the function registry is a shared resource and could be overridden by other users when using HiveServer. Providing a per-session function registry would prevent this situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-8456: Attachment: HIVE-8456.2-spark.patch Support Hive Counter to collect spark job metric[Spark Branch] -- Key: HIVE-8456 URL: https://issues.apache.org/jira/browse/HIVE-8456 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch Several Hive query metric in Hive operators is collected by Hive Counter, such as CREATEDFILES and DESERIALIZE_ERRORS, Besides, Hive use Counter as an option to collect table stats info. Spark support Accumulator which is pretty similiar with Hive Counter, we could try to enable Hive Counter based on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chengxiang Li updated HIVE-8456: Attachment: (was: HIVE-8456.2-spark.patch) Support Hive Counter to collect spark job metric[Spark Branch] -- Key: HIVE-8456 URL: https://issues.apache.org/jira/browse/HIVE-8456 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch Several Hive query metric in Hive operators is collected by Hive Counter, such as CREATEDFILES and DESERIALIZE_ERRORS, Besides, Hive use Counter as an option to collect table stats info. Spark support Accumulator which is pretty similiar with Hive Counter, we could try to enable Hive Counter based on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8284) Equality comparison is done between two floating point variables in HiveRelMdUniqueKeys#getUniqueKeys()
[ https://issues.apache.org/jira/browse/HIVE-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172210#comment-14172210 ] Hive QA commented on HIVE-8284: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12674726/HIVE-8284.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6559 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1273/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1273/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1273/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12674726 Equality comparison is done between two floating point variables in HiveRelMdUniqueKeys#getUniqueKeys() --- Key: HIVE-8284 URL: https://issues.apache.org/jira/browse/HIVE-8284 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Ted Yu Assignee: JongWon Park Priority: Minor Fix For: 0.14.0 Attachments: HIVE-8284.patch {code} double numRows = tScan.getRows(); ... double r = cStat.getRange().maxValue.doubleValue() - cStat.getRange().minValue.doubleValue() + 1; isKey = (numRows == r); {code} The equality check should use a small epsilon as tolerance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
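A small sketch of the tolerance-based comparison the report asks for. The helper name and the epsilon value below are illustrative assumptions, not taken from any Hive patch; the variables mirror the shape of the quoted snippet, where both numRows and r are produced by floating-point arithmetic.

```java
public class EpsilonEquality {
    // Tolerance-based comparison; 1e-9 is an illustrative epsilon,
    // not the value any Hive patch actually chose.
    static boolean nearlyEqual(double a, double b, double eps) {
        return Math.abs(a - b) <= eps;
    }

    public static void main(String[] args) {
        // Classic IEEE 754 example: 0.1 + 0.2 is not exactly 0.3.
        double numRows = 0.1 + 0.2;
        double r = 0.3;

        System.out.println(numRows == r);                  // false: exact == fails
        System.out.println(nearlyEqual(numRows, r, 1e-9)); // true: tolerance succeeds
    }
}
```

In HiveRelMdUniqueKeys#getUniqueKeys() the same substitution would replace `isKey = (numRows == r);` with a call of this shape.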
[jira] [Commented] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172213#comment-14172213 ] Hive QA commented on HIVE-8456: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12674958/HIVE-8456.2-spark.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6769 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_tez_smb_1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/220/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/220/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-220/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12674958 Support Hive Counter to collect spark job metric[Spark Branch] -- Key: HIVE-8456 URL: https://issues.apache.org/jira/browse/HIVE-8456 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch Several Hive query metric in Hive operators is collected by Hive Counter, such as CREATEDFILES and DESERIALIZE_ERRORS, Besides, Hive use Counter as an option to collect table stats info. Spark support Accumulator which is pretty similiar with Hive Counter, we could try to enable Hive Counter based on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172248#comment-14172248 ] Hive QA commented on HIVE-8456: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12674960/HIVE-8456.2-spark.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6769 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_tez_smb_1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/221/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/221/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-221/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12674960 Support Hive Counter to collect spark job metric[Spark Branch] -- Key: HIVE-8456 URL: https://issues.apache.org/jira/browse/HIVE-8456 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch Several Hive query metric in Hive operators is collected by Hive Counter, such as CREATEDFILES and DESERIALIZE_ERRORS, Besides, Hive use Counter as an option to collect table stats info. Spark support Accumulator which is pretty similiar with Hive Counter, we could try to enable Hive Counter based on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8341) Transaction information in config file can grow excessively large
[ https://issues.apache.org/jira/browse/HIVE-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172265#comment-14172265 ] Hive QA commented on HIVE-8341: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12674786/HIVE-8341.2.patch {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 6564 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel org.apache.hadoop.hive.ql.exec.TestOperators.testScriptOperatorEnvVarsProcessing org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.testStatsAfterCompactionPartTbl org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1274/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1274/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1274/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12674786 Transaction information in config file can grow excessively large - Key: HIVE-8341 URL: https://issues.apache.org/jira/browse/HIVE-8341 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Attachments: HIVE-8341.2.patch, HIVE-8341.patch In our testing we have seen cases where the transaction list grows very large. We need a more efficient way of communicating the list. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-7467) When querying HBase table, task fails with exception: java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString
[ https://issues.apache.org/jira/browse/HIVE-7467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang resolved HIVE-7467. --- Resolution: Later I see. Thanks for bringing up this issue. There is not much we can do in Hive side as I know for now. Thanks. When querying HBase table, task fails with exception: java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString --- Key: HIVE-7467 URL: https://issues.apache.org/jira/browse/HIVE-7467 Project: Hive Issue Type: Bug Components: Spark Environment: Spark-1.0.0, HBase-0.98.2 Reporter: Rui Li Assignee: Jimmy Xiang When I run select count( * ) on an HBase table, spark task fails with: {quote} java.lang.IllegalAccessError: com/google/protobuf/HBaseZeroCopyByteString at org.apache.hadoop.hbase.protobuf.RequestConverter.buildRegionSpecifier(RequestConverter.java:910) at org.apache.hadoop.hbase.protobuf.RequestConverter.buildGetRowOrBeforeRequest(RequestConverter.java:131) at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRowOrBefore(ProtobufUtil.java:1403) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1181) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1059) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1016) at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:326) at org.apache.hadoop.hbase.client.HTable.init(HTable.java:192) at org.apache.hadoop.hbase.client.HTable.init(HTable.java:165) at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat.getRecordReader(HiveHBaseTableInputFormat.java:93) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:241) at org.apache.spark.rdd.HadoopRDD$$anon$1.init(HadoopRDD.scala:193) at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:184) at 
org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:93) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) at org.apache.spark.rdd.RDD.iterator(RDD.scala:229) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99) at org.apache.spark.scheduler.Task.run(Task.scala:51) {quote} NO PRECOMMIT TESTS. This is for spark branch only. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8362) Investigate flaky test parallel.q
[ https://issues.apache.org/jira/browse/HIVE-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172330#comment-14172330 ] Hive QA commented on HIVE-8362: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12674791/HIVE-8362.3.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6559 tests executed *Failed tests:* {noformat} org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1275/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1275/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1275/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12674791 Investigate flaky test parallel.q - Key: HIVE-8362 URL: https://issues.apache.org/jira/browse/HIVE-8362 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Affects Versions: 0.14.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Attachments: HIVE-8362.1-spark.patch, HIVE-8362.2.patch, HIVE-8362.3.patch, HIVE-8362.patch Test parallel.q is flaky. 
It fails sometimes with error like: {noformat} Failed tests: TestSparkCliDriver.testCliDriver_parallel:120-runTest:146 Unexpected exception junit.framework.AssertionFailedError: Client Execution results failed with error code = 1 See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports or ./itests/qtest/target/surefire-reports/ for specific test cases logs. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 26661: HIVE-7873 Re-enable lazy HiveBaseFunctionResultList
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26661/ --- (Updated Oct. 15, 2014, 12:55 p.m.) Review request for hive and Xuefu Zhang. Changes --- The new patch made HiveKVResultCache fully thread safe as Xuefu suggested. Bugs: HIVE-7873 https://issues.apache.org/jira/browse/HIVE-7873 Repository: hive-git Description --- Re-enabled lazy HiveBaseFunctionResultList. A separate RowContainer is used to work around the no-write-after-read limitation of RowContainer. The patch also fixed a concurrency issue in HiveKVResultCache. Synchronized is used instead of reentrant lock since I assume there won't be many threads to access the cache. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java 0df2580 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java a6b9037 ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 496a11f Diff: https://reviews.apache.org/r/26661/diff/ Testing --- Unit test, some simple perf test. Thanks, Jimmy Xiang
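The synchronized-over-ReentrantLock choice described in the change note can be sketched generically. This is an invented illustration of the pattern, not the actual HiveKVResultCache code; the class and method names are made up, and intrinsic locking is used because, as the author notes, few threads are expected to touch the cache.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Invented sketch: a FIFO result cache whose methods are all guarded by the
// instance's intrinsic lock, making add/next/hasNext safe to call from
// multiple threads without an explicit ReentrantLock.
public class SyncResultCache<T> {
    private final Deque<T> buffer = new ArrayDeque<>();

    public synchronized void add(T value) {
        buffer.addLast(value);
    }

    public synchronized boolean hasNext() {
        return !buffer.isEmpty();
    }

    public synchronized T next() {
        return buffer.pollFirst(); // null when empty
    }

    public static void main(String[] args) {
        SyncResultCache<Integer> cache = new SyncResultCache<>();
        cache.add(1);
        cache.add(2);
        System.out.println(cache.next()); // 1
        System.out.println(cache.next()); // 2
    }
}
```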
[jira] [Updated] (HIVE-7873) Re-enable lazy HiveBaseFunctionResultList
[ https://issues.apache.org/jira/browse/HIVE-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-7873: -- Attachment: HIVE-7873.3-spark.patch Thanks Xuefu for the review. Attached patch 3 that addressed review comments. Re-enable lazy HiveBaseFunctionResultList - Key: HIVE-7873 URL: https://issues.apache.org/jira/browse/HIVE-7873 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Jimmy Xiang Labels: Spark-M4, spark Attachments: HIVE-7873.1-spark.patch, HIVE-7873.2-spark.patch, HIVE-7873.2-spark.patch, HIVE-7873.3-spark.patch We removed this optimization in HIVE-7799. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 26661: HIVE-7873 Re-enable lazy HiveBaseFunctionResultList
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26661/#review56698 --- Ship it! Ship It! - Xuefu Zhang On Oct. 15, 2014, 12:55 p.m., Jimmy Xiang wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26661/ --- (Updated Oct. 15, 2014, 12:55 p.m.) Review request for hive and Xuefu Zhang. Bugs: HIVE-7873 https://issues.apache.org/jira/browse/HIVE-7873 Repository: hive-git Description --- Re-enabled lazy HiveBaseFunctionResultList. A separate RowContainer is used to work around the no-write-after-read limitation of RowContainer. The patch also fixed a concurrency issue in HiveKVResultCache. Synchronized is used instead of reentrant lock since I assume there won't be many threads to access the cache. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java 0df2580 ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveKVResultCache.java a6b9037 ql/src/test/org/apache/hadoop/hive/ql/exec/spark/TestHiveKVResultCache.java 496a11f Diff: https://reviews.apache.org/r/26661/diff/ Testing --- Unit test, some simple perf test. Thanks, Jimmy Xiang
[jira] [Commented] (HIVE-7873) Re-enable lazy HiveBaseFunctionResultList
[ https://issues.apache.org/jira/browse/HIVE-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172356#comment-14172356 ] Xuefu Zhang commented on HIVE-7873: --- +1 pending on test. Re-enable lazy HiveBaseFunctionResultList - Key: HIVE-7873 URL: https://issues.apache.org/jira/browse/HIVE-7873 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Jimmy Xiang Labels: Spark-M4, spark Attachments: HIVE-7873.1-spark.patch, HIVE-7873.2-spark.patch, HIVE-7873.2-spark.patch, HIVE-7873.3-spark.patch We removed this optimization in HIVE-7799. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-8448: --- Attachment: HIVE-8448.1.patch Need code review. Union All might not work due to the type conversion issue - Key: HIVE-8448 URL: https://issues.apache.org/jira/browse/HIVE-8448 Project: Hive Issue Type: Bug Reporter: Chaoyu Tang Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-8448.1.patch create table t1 (val date); insert overwrite table t1 select '2014-10-10' from src limit 1; create table t2 (val varchar(10)); insert overwrite table t2 select '2014-10-10' from src limit 1; == Query: select t.val from (select val from t1 union all select val from t1 union all select val from t2 union all select val from t1) t; == Will throw exception: {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133) ... 
22 more {code} It was because at the query parse step getCommonClassForUnionAll is used, but at execution getCommonClass is used. They are not used consistently in union handling. The latter does not support the implicit conversion from date to string, which is the cause of the problem. The change to fix this particular union issue might be simple, but I noticed that there are three versions of getCommonClass: getCommonClass, getCommonClassForComparison, and getCommonClassForUnionAll, and I wonder whether they need to be cleaned up and refactored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
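The parse-time versus execution-time inconsistency described above can be illustrated with a small sketch. The two functions below are hypothetical stand-ins modeled on the getCommonClassForUnionAll / getCommonClass variants named in the comment; the coercion rules shown are simplified for illustration and are not Hive's actual logic.

```python
def get_common_class_for_union_all(a, b):
    # Planner-side rule (illustrative): permissive, allows date -> string.
    if a == b:
        return a
    if {a, b} <= {"date", "string", "varchar"}:
        return "string"
    return None

def get_common_class(a, b):
    # Execution-side rule (illustrative): stricter, no date -> string.
    if a == b:
        return a
    if {a, b} <= {"string", "varchar"}:
        return "string"
    return None

# UNION ALL branch types from the repro: t1 is date, t2 is varchar(10).
branches = ["date", "date", "varchar", "date"]

plan_type = branches[0]
for t in branches[1:]:
    plan_type = get_common_class_for_union_all(plan_type, t)
print(plan_type)  # -> "string": the analyzer accepts the plan

exec_type = branches[0]
for t in branches[1:]:
    exec_type = get_common_class(exec_type, t)
print(exec_type)  # -> None: no common type, i.e. "Incompatible types"
```

The mismatch is exactly the bug shape reported: the query passes semantic analysis under the permissive rule, then fails at operator initialization under the strict one.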
[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-8448: --- Affects Version/s: 0.13.1 Status: Patch Available (was: Open) When SemanticAnalyzer checks the union plan, it uses getCommonClassForUnionAll, but when evaluating the union operator it uses getCommonClass. This inconsistency causes some queries with multiple UNION ALLs on a date-type column to pass the analyzer but fail at execution time with HiveException: Incompatible types. Fixed by using a new updateForUnionAll method, which uses getCommonClassForUnionAll to update the columns for the union operator. Union All might not work due to the type conversion issue - Key: HIVE-8448 URL: https://issues.apache.org/jira/browse/HIVE-8448 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Chaoyu Tang Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-8448.1.patch create table t1 (val date); insert overwrite table t1 select '2014-10-10' from src limit 1; create table t2 (val varchar(10)); insert overwrite table t2 select '2014-10-10' from src limit 1; == Query: select t.val from (select val from t1 union all select val from t1 union all select val from t2 union all select val from t1) t; == Will throw exception: {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420) at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133) ... 22 more {code} It was because at the query parse step getCommonClassForUnionAll is used, but at execution getCommonClass is used. They are not used consistently in union handling. The latter does not support the implicit conversion from date to string, which is the cause of the problem. The change to fix this particular union issue might be simple, but I noticed that there are three versions of getCommonClass: getCommonClass, getCommonClassForComparison, and getCommonClassForUnionAll, and I wonder whether they need to be cleaned up and refactored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7873) Re-enable lazy HiveBaseFunctionResultList
[ https://issues.apache.org/jira/browse/HIVE-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172391#comment-14172391 ] Hive QA commented on HIVE-7873: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12674987/HIVE-7873.3-spark.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6770 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_tez_smb_1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/222/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/222/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-222/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12674987 Re-enable lazy HiveBaseFunctionResultList - Key: HIVE-7873 URL: https://issues.apache.org/jira/browse/HIVE-7873 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Jimmy Xiang Labels: Spark-M4, spark Attachments: HIVE-7873.1-spark.patch, HIVE-7873.2-spark.patch, HIVE-7873.2-spark.patch, HIVE-7873.3-spark.patch We removed this optimization in HIVE-7799. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8429) Add records in/out counters
[ https://issues.apache.org/jira/browse/HIVE-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172408#comment-14172408 ] Hive QA commented on HIVE-8429: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12674794/HIVE-8429.4.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 6558 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1276/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1276/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1276/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12674794 Add records in/out counters --- Key: HIVE-8429 URL: https://issues.apache.org/jira/browse/HIVE-8429 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-8429.1.patch, HIVE-8429.2.patch, HIVE-8429.3.patch, HIVE-8429.4.patch We don't do counters for input/output records right now. That would help for debugging though (if it can be done with minimal overhead). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172409#comment-14172409 ] Xuefu Zhang edited comment on HIVE-8455 at 10/15/14 2:32 PM: - {quote} I think we may not actually need this, what do you think? {quote} IMO, it's okay to just print it at the beginning. Once the user understands it, it might actually become annoying to see more of it. And at this stage, we don't have to make everything perfect. We can always come back to improve it. was (Author: xuefuz): {quote} I think we may not actually need this, what do you think? {quote} IMO, it's okay to just print it at the beginning. Once the user understands it, it might actually become annoying to see more of it. Print Spark job progress format info on the console[Spark Branch] - Key: HIVE-8455 URL: https://issues.apache.org/jira/browse/HIVE-8455 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive on spark job status.PNG We added support for Spark job status monitoring in HIVE-7439, but did not print the job progress format info on the console, so users may be confused about what the progress info means; I would like to add the job progress format info here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8395) CBO: enable by default
[ https://issues.apache.org/jira/browse/HIVE-8395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172411#comment-14172411 ] Hive QA commented on HIVE-8395: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12674845/HIVE-8395.05.patch Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1277/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1277/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1277/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-maven-3.0.5/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.6.0_34/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee 
/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-1277/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'ql/src/test/org/apache/hadoop/hive/ql/exec/TestOperators.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecReducer.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapOperator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorFilterOperator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileRecordProcessor.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ReduceRecordProcessor.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordProcessor.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target 
itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit-hadoop2/target itests/hive-minikdc/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/core/target hcatalog/streaming/target hcatalog/server-extensions/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target accumulo-handler/target hwi/target common/target common/src/gen service/target contrib/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1632054. At revision 1632054. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh +
[jira] [Updated] (HIVE-7873) Re-enable lazy HiveBaseFunctionResultList [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-7873: -- Summary: Re-enable lazy HiveBaseFunctionResultList [Spark Branch] (was: Re-enable lazy HiveBaseFunctionResultList) Re-enable lazy HiveBaseFunctionResultList [Spark Branch] Key: HIVE-7873 URL: https://issues.apache.org/jira/browse/HIVE-7873 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Jimmy Xiang Labels: Spark-M4, spark Attachments: HIVE-7873.1-spark.patch, HIVE-7873.2-spark.patch, HIVE-7873.2-spark.patch, HIVE-7873.3-spark.patch We removed this optimization in HIVE-7799. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection
[ https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6715: Attachment: (was: HIVE-6715.2.patch) Hive JDBC should include username into open session request for non-sasl connection --- Key: HIVE-6715 URL: https://issues.apache.org/jira/browse/HIVE-6715 Project: Hive Issue Type: Bug Components: JDBC Reporter: Srinath Assignee: Prasad Mujumdar Attachments: HIVE-6715.1.patch The only parameter from sessVars that's being set in HiveConnection.openSession() is HS2_PROXY_USER. HIVE_AUTH_USER must also be set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection
[ https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6715: Attachment: HIVE-6715.2.patch [~prasadm] Thanks for the patch. Looks good to me. +1 I have rebased it for the latest trunk. Hive JDBC should include username into open session request for non-sasl connection --- Key: HIVE-6715 URL: https://issues.apache.org/jira/browse/HIVE-6715 Project: Hive Issue Type: Bug Components: JDBC Reporter: Srinath Assignee: Prasad Mujumdar Attachments: HIVE-6715.1.patch The only parameter from sessVars that's being set in HiveConnection.openSession() is HS2_PROXY_USER. HIVE_AUTH_USER must also be set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection
[ https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6715: Attachment: HIVE-6715.2.patch Hive JDBC should include username into open session request for non-sasl connection --- Key: HIVE-6715 URL: https://issues.apache.org/jira/browse/HIVE-6715 Project: Hive Issue Type: Bug Components: JDBC Reporter: Srinath Assignee: Prasad Mujumdar Attachments: HIVE-6715.1.patch, HIVE-6715.2.patch The only parameter from sessVars that's being set in HiveConnection.openSession() is HS2_PROXY_USER. HIVE_AUTH_USER must also be set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection
[ https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6715: Status: Patch Available (was: Open) Hive JDBC should include username into open session request for non-sasl connection --- Key: HIVE-6715 URL: https://issues.apache.org/jira/browse/HIVE-6715 Project: Hive Issue Type: Bug Components: JDBC Reporter: Srinath Assignee: Prasad Mujumdar Attachments: HIVE-6715.1.patch, HIVE-6715.2.patch The only parameter from sessVars that's being set in HiveConnection.openSession() is HS2_PROXY_USER. HIVE_AUTH_USER must also be set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7914) Simplify join predicates for CBO to avoid cross products
[ https://issues.apache.org/jira/browse/HIVE-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-7914: --- Resolution: Fixed Fix Version/s: (was: 0.14.0) 0.15.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, John! [~vikram.dixit] It will be good to have this in 0.14. Simplify join predicates for CBO to avoid cross products Key: HIVE-7914 URL: https://issues.apache.org/jira/browse/HIVE-7914 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 0.13.1 Reporter: Mostafa Mokhtar Assignee: Laljo John Pullokkaran Fix For: 0.15.0 Attachments: HIVE-7914.patch Simplify disjunctive join predicates to avoid cross products. For TPC-DS query 13 we generate a cross product. The join predicates on (store_sales x customer_demographics), (store_sales x household_demographics) and (store_sales x customer_address) can be pulled up to avoid the cross products. {code} select avg(ss_quantity) ,avg(ss_ext_sales_price) ,avg(ss_ext_wholesale_cost) ,sum(ss_ext_wholesale_cost) from store_sales ,store ,customer_demographics ,household_demographics ,customer_address ,date_dim where store.s_store_sk = store_sales.ss_store_sk and store_sales.ss_sold_date_sk = date_dim.d_date_sk and date_dim.d_year = 2001 and((store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and customer_demographics.cd_marital_status = 'M' and customer_demographics.cd_education_status = '4 yr Degree' and store_sales.ss_sales_price between 100.00 and 150.00 and household_demographics.hd_dep_count = 3 )or (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and customer_demographics.cd_marital_status = 'D' and customer_demographics.cd_education_status = 'Primary' and store_sales.ss_sales_price between 50.00 and 100.00 and household_demographics.hd_dep_count = 1 ) or 
(store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and customer_demographics.cd_demo_sk = ss_cdemo_sk and customer_demographics.cd_marital_status = 'U' and customer_demographics.cd_education_status = 'Advanced Degree' and store_sales.ss_sales_price between 150.00 and 200.00 and household_demographics.hd_dep_count = 1 )) and((store_sales.ss_addr_sk = customer_address.ca_address_sk and customer_address.ca_country = 'United States' and customer_address.ca_state in ('KY', 'GA', 'NM') and store_sales.ss_net_profit between 100 and 200 ) or (store_sales.ss_addr_sk = customer_address.ca_address_sk and customer_address.ca_country = 'United States' and customer_address.ca_state in ('MT', 'OR', 'IN') and store_sales.ss_net_profit between 150 and 300 ) or (store_sales.ss_addr_sk = customer_address.ca_address_sk and customer_address.ca_country = 'United States' and customer_address.ca_state in ('WI', 'MO', 'WV') and store_sales.ss_net_profit between 50 and 250 )) ; {code} This is the plan currently generated without any predicate simplification {code} Warning: Map Join MAPJOIN[59][bigTable=?] in task 'Map 8' is a cross product Warning: Map Join MAPJOIN[58][bigTable=?] 
in task 'Map 8' is a cross product Warning: Shuffle Join JOIN[29][tables = [$hdt$_5, $hdt$_6]] in Stage 'Reducer 2' is a cross product OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez Edges: Map 7 - Map 8 (BROADCAST_EDGE) Map 8 - Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE) Reducer 2 - Map 1 (SIMPLE_EDGE), Map 4 (BROADCAST_EDGE), Map 7 (SIMPLE_EDGE) Reducer 3 - Reducer 2 (SIMPLE_EDGE) DagName: mmokhtar_20140828155050_7059c24b-501b-4683-86c0-4f3c023f0b0e:1 Vertices: Map 1 Map Operator Tree: TableScan alias: customer_address Statistics: Num rows: 4000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: ca_address_sk (type: int), ca_state (type: string), ca_country (type: string) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 4000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: Statistics: Num rows: 4000 Data size: 40595195284 Basic stats:
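The transformation described in HIVE-7914, pulling the equi-join conjuncts that every disjunct shares above the OR so the planner can use them as join keys instead of producing a cross product, can be illustrated with a small sketch. This is an illustrative Python fragment, not Hive's optimizer code; the predicate strings are abbreviated from the query above.

```python
# Sketch: when every disjunct of an OR shares some conjuncts (here, the
# equi-join keys), those conjuncts can be factored out and pulled above
# the OR, so the join no longer degenerates into a cross product.

def factor_common_conjuncts(disjuncts):
    """disjuncts: list of AND-lists (each a list of predicate strings).

    Returns (common, simplified_disjuncts), where `common` is the sorted
    list of conjuncts present in every disjunct, and each simplified
    disjunct keeps only its remaining, disjunct-specific conjuncts.
    """
    common = set(disjuncts[0])
    for d in disjuncts[1:]:
        common &= set(d)
    simplified = [[p for p in d if p not in common] for d in disjuncts]
    return sorted(common), simplified

# Modeled on query 13: each disjunct repeats the same join predicates.
disjuncts = [
    ["ss_hdemo_sk = hd_demo_sk", "cd_demo_sk = ss_cdemo_sk",
     "cd_marital_status = 'M'", "ss_sales_price BETWEEN 100 AND 150"],
    ["ss_hdemo_sk = hd_demo_sk", "cd_demo_sk = ss_cdemo_sk",
     "cd_marital_status = 'D'", "ss_sales_price BETWEEN 50 AND 100"],
    ["ss_hdemo_sk = hd_demo_sk", "cd_demo_sk = ss_cdemo_sk",
     "cd_marital_status = 'U'", "ss_sales_price BETWEEN 150 AND 200"],
]

common, rest = factor_common_conjuncts(disjuncts)
print(common)
# The two equi-join predicates are now top-level AND conjuncts and can
# drive a real join instead of a cross product.
```

Because (A and X) or (A and Y) is equivalent to A and (X or Y), this factoring is always semantics-preserving; the payoff is that A, the join condition, becomes visible to the join planner.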
[jira] [Resolved] (HIVE-7913) Simplify filter predicates for CBO
[ https://issues.apache.org/jira/browse/HIVE-7913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-7913. Resolution: Duplicate Fixed via HIVE-7914 Simplify filter predicates for CBO -- Key: HIVE-7913 URL: https://issues.apache.org/jira/browse/HIVE-7913 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 0.13.1 Reporter: Mostafa Mokhtar Assignee: Laljo John Pullokkaran Fix For: 0.14.0 Simplify disjunctive predicates so that they can get pushed down to the scan. For TPC-DS query 13 we push down predicates of the following form: where cd_marital_status in ('M','D','U') etc. {code} select avg(ss_quantity) ,avg(ss_ext_sales_price) ,avg(ss_ext_wholesale_cost) ,sum(ss_ext_wholesale_cost) from store_sales ,store ,customer_demographics ,household_demographics ,customer_address ,date_dim where store.s_store_sk = store_sales.ss_store_sk and store_sales.ss_sold_date_sk = date_dim.d_date_sk and date_dim.d_year = 2001 and((store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and customer_demographics.cd_marital_status = 'M' and customer_demographics.cd_education_status = '4 yr Degree' and store_sales.ss_sales_price between 100.00 and 150.00 and household_demographics.hd_dep_count = 3 )or (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and customer_demographics.cd_marital_status = 'D' and customer_demographics.cd_education_status = 'Primary' and store_sales.ss_sales_price between 50.00 and 100.00 and household_demographics.hd_dep_count = 1 ) or (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and customer_demographics.cd_demo_sk = ss_cdemo_sk and customer_demographics.cd_marital_status = 'U' and customer_demographics.cd_education_status = 'Advanced Degree' and store_sales.ss_sales_price between 150.00 and 200.00 and household_demographics.hd_dep_count = 1 )) 
and((store_sales.ss_addr_sk = customer_address.ca_address_sk and customer_address.ca_country = 'United States' and customer_address.ca_state in ('KY', 'GA', 'NM') and store_sales.ss_net_profit between 100 and 200 ) or (store_sales.ss_addr_sk = customer_address.ca_address_sk and customer_address.ca_country = 'United States' and customer_address.ca_state in ('MT', 'OR', 'IN') and store_sales.ss_net_profit between 150 and 300 ) or (store_sales.ss_addr_sk = customer_address.ca_address_sk and customer_address.ca_country = 'United States' and customer_address.ca_state in ('WI', 'MO', 'WV') and store_sales.ss_net_profit between 50 and 250 )) ; {code} This is the plan currently generated without any predicate simplification {code} STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez Edges: Map 7 - Map 8 (BROADCAST_EDGE) Map 8 - Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE) Reducer 2 - Map 1 (SIMPLE_EDGE), Map 4 (BROADCAST_EDGE), Map 7 (SIMPLE_EDGE) Reducer 3 - Reducer 2 (SIMPLE_EDGE) DagName: mmokhtar_20140828155050_7059c24b-501b-4683-86c0-4f3c023f0b0e:1 Vertices: Map 1 Map Operator Tree: TableScan alias: customer_address Statistics: Num rows: 4000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: ca_address_sk (type: int), ca_state (type: string), ca_country (type: string) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 4000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: Statistics: Num rows: 4000 Data size: 40595195284 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: int), _col1 (type: string), _col2 (type: string) Execution mode: vectorized Map 4 Map Operator Tree: TableScan alias: date_dim filterExpr: ((d_year = 2001) and d_date_sk is not null) (type: boolean) Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: 
((d_year = 2001) and d_date_sk is not null) (type: boolean)
[jira] [Updated] (HIVE-6715) Hive JDBC should include username into open session request for non-sasl connection
[ https://issues.apache.org/jira/browse/HIVE-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6715: Attachment: HIVE-6715.3.patch HIVE-6715.3.patch - In current trunk, setting the auth property via conf for MiniHS2 is what works (HiveAuthFactory no longer creates a new hiveconf). Updating test case. Hive JDBC should include username into open session request for non-sasl connection --- Key: HIVE-6715 URL: https://issues.apache.org/jira/browse/HIVE-6715 Project: Hive Issue Type: Bug Components: JDBC Reporter: Srinath Assignee: Prasad Mujumdar Attachments: HIVE-6715.1.patch, HIVE-6715.2.patch, HIVE-6715.3.patch The only parameter from sessVars that's being set in HiveConnection.openSession() is HS2_PROXY_USER. HIVE_AUTH_USER must also be set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8467) Table Copy - Background, incremental data load
Rajat Venkatesh created HIVE-8467: - Summary: Table Copy - Background, incremental data load Key: HIVE-8467 URL: https://issues.apache.org/jira/browse/HIVE-8467 Project: Hive Issue Type: New Feature Reporter: Rajat Venkatesh Traditionally, Hive and other tools in the Hadoop eco-system haven't required a load stage. However, with recent developments, Hive is much more performant when data is stored in specific formats like ORC, Parquet, Avro etc. Technologies like Presto also work much better with certain data formats. At the same time, data is generated or obtained from 3rd parties in non-optimal formats such as CSV, tab-delimited or JSON. Many times, it's not an option to change the data format at the source. We've found that users either use sub-optimal formats or spend a large amount of effort creating and maintaining copies. We want to propose a new construct - Table Copy - to help “load” data into an optimal storage format. I am going to attach a PDF document with a lot more details, especially addressing how this is different from bulk loads in relational DBs or materialized views. Looking forward to hearing whether others see a similar need to formalize conversion of data to different storage formats. If yes, are the details in the PDF document a good start? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8467) Table Copy - Background, incremental data load
[ https://issues.apache.org/jira/browse/HIVE-8467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajat Venkatesh updated HIVE-8467: -- Attachment: Table Copies.pdf Table Copy - Background, incremental data load -- Key: HIVE-8467 URL: https://issues.apache.org/jira/browse/HIVE-8467 Project: Hive Issue Type: New Feature Reporter: Rajat Venkatesh Attachments: Table Copies.pdf Traditionally, Hive and other tools in the Hadoop eco-system haven't required a load stage. However, with recent developments, Hive is much more performant when data is stored in specific formats like ORC, Parquet, Avro etc. Technologies like Presto also work much better with certain data formats. At the same time, data is generated or obtained from 3rd parties in non-optimal formats such as CSV, tab-delimited or JSON. Often it's not an option to change the data format at the source. We've found that users either use sub-optimal formats or spend a large amount of effort creating and maintaining copies. We want to propose a new construct - Table Copy - to help “load” data into an optimal storage format. I am going to attach a PDF document with a lot more details, especially addressing how this is different from bulk loads in relational DBs or materialized views. Looking forward to hearing whether others see a similar need to formalize conversion of data to different storage formats. If yes, are the details in the PDF document a good start? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8433) CBO loses a column during AST conversion
[ https://issues.apache.org/jira/browse/HIVE-8433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172508#comment-14172508 ] Hive QA commented on HIVE-8433: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12674858/HIVE-8433.patch {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 6560 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_correctness org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hive.beeline.TestSchemaTool.testSchemaInit org.apache.hive.beeline.TestSchemaTool.testSchemaUpgrade {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1278/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1278/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1278/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12674858 CBO loses a column during AST conversion Key: HIVE-8433 URL: https://issues.apache.org/jira/browse/HIVE-8433 Project: Hive Issue Type: Bug Components: CBO Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Attachments: HIVE-8433.patch {noformat} SELECT CAST(value AS BINARY), value FROM src ORDER BY value LIMIT 100 {noformat} returns only one column. 
Final CBO plan is {noformat} HiveSortRel(sort0=[$1], dir0=[ASC]): rowcount = 500.0, cumulative cost = {24858.432393688767 rows, 500.0 cpu, 0.0 io}, id = 44 HiveProjectRel(value=[CAST($0):BINARY(2147483647) NOT NULL], value1=[$0]): rowcount = 500.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 42 HiveProjectRel(value=[$1]): rowcount = 500.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io}, id = 40 HiveTableScanRel(table=[[default.src]]): rowcount = 500.0, cumulative cost = {0}, id = 0 {noformat} but the resulting AST has only one column. Must be some bug in conversion, probably related to the name collision in the schema, judging by the alias of the column for the binary-cast value in the AST {noformat} TOK_QUERY TOK_FROM TOK_SUBQUERY TOK_QUERY TOK_FROM TOK_TABREF TOK_TABNAME default src src TOK_INSERT TOK_DESTINATION TOK_DIR TOK_TMP_FILE TOK_SELECT TOK_SELEXPR . TOK_TABLE_OR_COL src value value $hdt$_0 TOK_INSERT TOK_DESTINATION TOK_DIR TOK_TMP_FILE TOK_SELECT TOK_SELEXPR TOK_FUNCTION TOK_BINARY . TOK_TABLE_OR_COL $hdt$_0 value value TOK_ORDERBY TOK_TABSORTCOLNAMEASC TOK_TABLE_OR_COL value TOK_LIMIT 100 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8450) Create table like does not copy over table properties
[ https://issues.apache.org/jira/browse/HIVE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172558#comment-14172558 ] Brock Noland commented on HIVE-8450: Nice work!! +1 pending tests Create table like does not copy over table properties - Key: HIVE-8450 URL: https://issues.apache.org/jira/browse/HIVE-8450 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 0.13.1 Reporter: Brock Noland Assignee: Navis Priority: Critical Attachments: HIVE-8450.1.patch.txt Assuming t2 is an Avro-backed table, the following: {{create table t1 like t2}} should create an Avro-backed table, but the schema.url.* is not being copied correctly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8468) TestSchemaTool is failing after committing the version number change
Brock Noland created HIVE-8468: -- Summary: TestSchemaTool is failing after committing the version number change Key: HIVE-8468 URL: https://issues.apache.org/jira/browse/HIVE-8468 Project: Hive Issue Type: Bug Reporter: Brock Noland -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8468) TestSchemaTool is failing after committing the version number change
[ https://issues.apache.org/jira/browse/HIVE-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172563#comment-14172563 ] Brock Noland commented on HIVE-8468: I think this is related to HIVE-8381. FYI [~vikram.dixit] TestSchemaTool is failing after committing the version number change Key: HIVE-8468 URL: https://issues.apache.org/jira/browse/HIVE-8468 Project: Hive Issue Type: Bug Reporter: Brock Noland -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8468) TestSchemaTool is failing after committing the version number change
[ https://issues.apache.org/jira/browse/HIVE-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8468: --- Description: Logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1278/failed/TestSchemaTool/TEST-TestSchemaTool-TEST-org.apache.hive.beeline.TestSchemaTool.xml TestSchemaTool is failing after committing the version number change Key: HIVE-8468 URL: https://issues.apache.org/jira/browse/HIVE-8468 Project: Hive Issue Type: Bug Reporter: Brock Noland Logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1278/failed/TestSchemaTool/TEST-TestSchemaTool-TEST-org.apache.hive.beeline.TestSchemaTool.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172619#comment-14172619 ] Brock Noland commented on HIVE-8455: Thanks guys! This was my wishlist from yesterday after watching a 10+ minute job run. It's certainly possible that not all of the items are needed. Let's go ahead with this patch! +1 Print Spark job progress format info on the console[Spark Branch] - Key: HIVE-8455 URL: https://issues.apache.org/jira/browse/HIVE-8455 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive on spark job status.PNG We added support for Spark job status monitoring in HIVE-7439, but the job progress format info is not printed on the console, so users may be confused about what the progress info means; I would like to add the job progress format info here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
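For illustration, a self-explanatory per-stage progress line of the kind being discussed might be rendered as below. This is a hedged sketch; the exact format Hive on Spark prints is defined by the patch, not by this example, and the `format` helper is hypothetical.

```java
// Sketch: render per-stage task progress so the console numbers are
// self-explanatory. The "completed(+running)/total" format is illustrative only.
public class ProgressFormatSketch {
    static String format(String stage, int completed, int running, int total) {
        return String.format("%s: %d(+%d)/%d tasks finished", stage, completed, running, total);
    }

    public static void main(String[] args) {
        String line = format("Stage-1", 3, 2, 10);
        System.out.println(line); // Stage-1: 3(+2)/10 tasks finished
    }
}
```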
New Feature Request
Hello Hive Dev, I filed a JIRA for a new feature in Hive - Table Copy. At Qubole, we've noticed that many users want to convert data to one of the more efficient data formats (e.g. ORC) supported by Hive. Similarly, one of the prereqs for having a good experience on Presto is to convert the data to ORC. So we've tried to formalize the process of converting data to a more efficient format. We have a prototype that some of our users are trying out. Please take a look at https://issues.apache.org/jira/browse/HIVE-8467 We would love to get your feedback on whether such a feature would be useful to the larger Hive community. -- Rajat Venkatesh | Engg Lead Qubole Inc | www.qubole.com
[jira] [Commented] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172623#comment-14172623 ] Brock Noland commented on HIVE-8455: Thank you Chengxiang and Xuefu! I have committed this to spark. Print Spark job progress format info on the console[Spark Branch] - Key: HIVE-8455 URL: https://issues.apache.org/jira/browse/HIVE-8455 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Fix For: 0.15.0 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive on spark job status.PNG We added support for Spark job status monitoring in HIVE-7439, but the job progress format info is not printed on the console, so users may be confused about what the progress info means; I would like to add the job progress format info here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8455) Print Spark job progress format info on the console[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8455: --- Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) Print Spark job progress format info on the console[Spark Branch] - Key: HIVE-8455 URL: https://issues.apache.org/jira/browse/HIVE-8455 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Fix For: 0.15.0 Attachments: HIVE-8455.1-spark.patch, HIVE-8455.2-spark.patch, hive on spark job status.PNG We added support for Spark job status monitoring in HIVE-7439, but the job progress format info is not printed on the console, so users may be confused about what the progress info means; I would like to add the job progress format info here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8387: - Status: Patch Available (was: Open) add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8387: - Attachment: HIVE-8387.patch [~thejas], [~sushanth] Could one of you review this please add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
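A generic retry wrapper of the kind HIVE-8387 proposes might look like the sketch below. It is an illustration under the assumption that only transient errors should be retried; the real patch's retry policy, backoff, and exception filtering may differ, and `withRetries` is a hypothetical helper, not WebHCat's API.

```java
import java.util.concurrent.Callable;

// Sketch: retry an operation a bounded number of times with a fixed sleep
// between attempts. Real code would catch only transient ZooKeeper errors
// (e.g. connection-loss KeeperExceptions), not every Exception.
public class RetrySketch {
    static <T> T withRetries(Callable<T> op, int maxAttempts, long sleepMs) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) Thread.sleep(sleepMs);
            }
        }
        throw last; // all attempts exhausted
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulate a flaky ZK interaction that succeeds on the third attempt.
        String result = withRetries(() -> {
            if (++calls[0] < 3) throw new RuntimeException("transient");
            return "ok";
        }, 5, 1);
        if (!"ok".equals(result) || calls[0] != 3) throw new AssertionError();
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```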
[jira] [Updated] (HIVE-8362) Investigate flaky test parallel.q
[ https://issues.apache.org/jira/browse/HIVE-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8362: --- Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) Thank you Jimmy for fixing this long-standing issue! I have committed it to trunk! Investigate flaky test parallel.q - Key: HIVE-8362 URL: https://issues.apache.org/jira/browse/HIVE-8362 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Affects Versions: 0.14.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.15.0 Attachments: HIVE-8362.1-spark.patch, HIVE-8362.2.patch, HIVE-8362.3.patch, HIVE-8362.patch Test parallel.q is flaky. It sometimes fails with an error like: {noformat} Failed tests: TestSparkCliDriver.testCliDriver_parallel:120-runTest:146 Unexpected exception junit.framework.AssertionFailedError: Client Execution results failed with error code = 1 See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports or ./itests/qtest/target/surefire-reports/ for specific test cases logs. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172650#comment-14172650 ] Szehon Ho commented on HIVE-8448: - This looks fine to me. My only style feedback is to please follow the common style convention in GenericUDFUtils, such as putting braces around if {...} else {...} and a space after if. Also, the method name capitalization should be updatePriv, although I would suggest renaming it to avoid confusion (when I first read it, I thought it was updating a privilege). To be honest I'm not an expert on union, so I wonder if [~jdere] or [~navis] would have any further comments? If not, +1 after these changes, pending the tests. Union All might not work due to the type conversion issue - Key: HIVE-8448 URL: https://issues.apache.org/jira/browse/HIVE-8448 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Chaoyu Tang Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-8448.1.patch create table t1 (val date); insert overwrite table t1 select '2014-10-10' from src limit 1; create table t2 (val varchar(10)); insert overwrite table t2 select '2014-10-10' from src limit 1; == Query: select t.val from (select val from t1 union all select val from t1 union all select val from t2 union all select val from t1) t; == Will throw exception: {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464) at 
org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133) ... 22 more {code} This is because getCommonClassForUnionAll is used at the query parse step, but getCommonClass is used at execution; they are not used consistently in union. The latter does not support the implicit conversion from date to string, which is the cause of the problem. The change to fix this particular union issue might be simple, but I noticed that there are three versions of getCommonClass: getCommonClass, getCommonClassForComparison, getCommonClassForUnionAll, and I wonder if they need to be cleaned up and refactored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
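The inconsistency described in the issue can be illustrated with a toy common-type rule. This is a hedged sketch, not Hive's actual FunctionRegistry logic; the type set and the unification rule are invented for illustration only.

```java
// Sketch: a union-all common-type rule where DATE and the string family unify
// to STRING (what a getCommonClassForUnionAll-style rule permits), whereas a
// stricter rule with no DATE-to-string conversion would reject the pair.
public class UnionTypeSketch {
    enum T { DATE, VARCHAR, STRING }

    static T commonTypeForUnionAll(T a, T b) {
        if (a == b) return a;
        // In this sketch, any mix of DATE/VARCHAR/STRING falls back to STRING.
        return T.STRING;
    }

    public static void main(String[] args) {
        if (commonTypeForUnionAll(T.DATE, T.VARCHAR) != T.STRING) throw new AssertionError();
        if (commonTypeForUnionAll(T.DATE, T.DATE) != T.DATE) throw new AssertionError();
        System.out.println("DATE union VARCHAR -> " + commonTypeForUnionAll(T.DATE, T.VARCHAR));
    }
}
```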
[jira] [Updated] (HIVE-7858) Parquet compression should be configurable via table property
[ https://issues.apache.org/jira/browse/HIVE-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7858: --- Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) Thank you so much! I have committed this to trunk! Parquet compression should be configurable via table property - Key: HIVE-7858 URL: https://issues.apache.org/jira/browse/HIVE-7858 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Ferdinand Xu Fix For: 0.15.0 Attachments: HIVE-7858.1.patch, HIVE-7858.patch, HIVE-7858.patch ORC supports the orc.compress table property: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC {noformat} create table Addresses ( name string, street string, city string, state string, zip int ) stored as orc tblproperties (orc.compress=NONE); {noformat} I think it'd be great to support the same for Parquet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8469) Add parquet.compression as a Serde Property
Brock Noland created HIVE-8469: -- Summary: Add parquet.compression as a Serde Property Key: HIVE-8469 URL: https://issues.apache.org/jira/browse/HIVE-8469 Project: Hive Issue Type: Improvement Reporter: Brock Noland Priority: Minor In HIVE-8450 we are annotating the serdes with their properties. We should add compression for parquet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7858) Parquet compression should be configurable via table property
[ https://issues.apache.org/jira/browse/HIVE-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172685#comment-14172685 ] Brock Noland commented on HIVE-7858: FYI that I created HIVE-8469 to add this as an annotated serde property post HIVE-8450 Parquet compression should be configurable via table property - Key: HIVE-7858 URL: https://issues.apache.org/jira/browse/HIVE-7858 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Ferdinand Xu Fix For: 0.15.0 Attachments: HIVE-7858.1.patch, HIVE-7858.patch, HIVE-7858.patch ORC supports the orc.compress table property: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC {noformat} create table Addresses ( name string, street string, city string, state string, zip int ) stored as orc tblproperties (orc.compress=NONE); {noformat} I think it'd be great to support the same for Parquet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7858) Parquet compression should be configurable via table property
[ https://issues.apache.org/jira/browse/HIVE-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7858: --- Release Note: The property parquet.compression can now be configured as a table property. Parquet compression should be configurable via table property - Key: HIVE-7858 URL: https://issues.apache.org/jira/browse/HIVE-7858 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Ferdinand Xu Fix For: 0.15.0 Attachments: HIVE-7858.1.patch, HIVE-7858.patch, HIVE-7858.patch ORC supports the orc.compress table property: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC {noformat} create table Addresses ( name string, street string, city string, state string, zip int ) stored as orc tblproperties (orc.compress=NONE); {noformat} I think it'd be great to support the same for Parquet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8470) Orc writer cant handle column of type void
Ashutosh Chauhan created HIVE-8470: -- Summary: Orc writer cant handle column of type void Key: HIVE-8470 URL: https://issues.apache.org/jira/browse/HIVE-8470 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.13.0, 0.12.0, 0.11.0, 0.14.0 Reporter: Ashutosh Chauhan e.g, insert into table t1 select null from src; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8470) Orc writer cant handle column of type void
[ https://issues.apache.org/jira/browse/HIVE-8470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172700#comment-14172700 ] Ashutosh Chauhan commented on HIVE-8470: Stack trace: {code} Caused by: java.lang.IllegalArgumentException: Bad primitive category VOID at org.apache.hadoop.hive.ql.io.orc.WriterImpl.createTreeWriter(WriterImpl.java:1842) at org.apache.hadoop.hive.ql.io.orc.WriterImpl.access$1500(WriterImpl.java:106) at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.init(WriterImpl.java:1592) at org.apache.hadoop.hive.ql.io.orc.WriterImpl.createTreeWriter(WriterImpl.java:1846) at org.apache.hadoop.hive.ql.io.orc.WriterImpl.init(WriterImpl.java:203) at org.apache.hadoop.hive.ql.io.orc.OrcFile.createWriter(OrcFile.java:415) at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:84) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:671) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:799) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:799) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:799) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:536) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:85) {code} Orc writer cant handle column of type void -- Key: HIVE-8470 URL: https://issues.apache.org/jira/browse/HIVE-8470 Project: Hive Issue Type: Bug Components: File Formats Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.14.0 Reporter: Ashutosh Chauhan e.g, insert into table t1 select null from src; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
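One defensive approach to the failure above is to reject VOID columns before the writer is created, so the user can cast the literal instead (e.g. `select cast(null as string)`). The check below is a sketch of that idea, not the actual WriterImpl code; `checkNoVoidColumns` and its string-based category list are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: scan a schema's primitive category names and fail fast on VOID with
// a friendlier message than the IllegalArgumentException in the stack trace.
public class VoidColumnCheckSketch {
    static void checkNoVoidColumns(List<String> categories) {
        for (int i = 0; i < categories.size(); i++) {
            if ("VOID".equalsIgnoreCase(categories.get(i))) {
                throw new IllegalArgumentException(
                    "Column " + i + " has type VOID; cast the NULL literal to a concrete type");
            }
        }
    }

    public static void main(String[] args) {
        checkNoVoidColumns(Arrays.asList("INT", "STRING")); // passes silently
        boolean caught = false;
        try {
            checkNoVoidColumns(Arrays.asList("STRING", "VOID"));
        } catch (IllegalArgumentException e) {
            caught = true;
        }
        if (!caught) throw new AssertionError();
        System.out.println("VOID column rejected as expected");
    }
}
```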
[jira] [Commented] (HIVE-8456) Support Hive Counter to collect spark job metric[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172706#comment-14172706 ] Xuefu Zhang commented on HIVE-8456: --- [~lirui] would you like to review the patch? Thanks. Support Hive Counter to collect spark job metric[Spark Branch] -- Key: HIVE-8456 URL: https://issues.apache.org/jira/browse/HIVE-8456 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Labels: Spark-M3 Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch Several Hive query metrics in Hive operators are collected by Hive Counters, such as CREATEDFILES and DESERIALIZE_ERRORS. Besides, Hive uses Counters as an option to collect table stats info. Spark supports Accumulators, which are pretty similar to Hive Counters, so we could try to enable Hive Counters based on them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
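The proposed mapping can be sketched as a counter facade over an add-only value, mirroring how a Spark Accumulator only supports "add" on workers and "value" on the driver. This is illustrative only; neither Spark's actual Accumulator API nor Hive's Counter plumbing is shown, and the `CounterFacadeSketch` class is hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Sketch: named counters backed by add-only values, the shape an
// accumulator-backed Hive Counter implementation would take.
public class CounterFacadeSketch {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    void increment(String name, long delta) {
        // add-only, like Accumulator.add() on a worker
        counters.computeIfAbsent(name, k -> new LongAdder()).add(delta);
    }

    long value(String name) {
        // read side, like Accumulator.value() on the driver
        LongAdder a = counters.get(name);
        return a == null ? 0L : a.sum();
    }

    public static void main(String[] args) {
        CounterFacadeSketch c = new CounterFacadeSketch();
        c.increment("CREATED_FILES", 1);
        c.increment("CREATED_FILES", 2);
        if (c.value("CREATED_FILES") != 3L) throw new AssertionError();
        System.out.println("CREATED_FILES = " + c.value("CREATED_FILES"));
    }
}
```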
[jira] [Commented] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172714#comment-14172714 ] Thejas M Nair commented on HIVE-8387: - [~ekoifman] Can you please add a review board link ? add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-8448: --- Attachment: HIVE-8448.2.patch fixes after review Union All might not work due to the type conversion issue - Key: HIVE-8448 URL: https://issues.apache.org/jira/browse/HIVE-8448 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Chaoyu Tang Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-8448.1.patch, HIVE-8448.2.patch create table t1 (val date); insert overwrite table t1 select '2014-10-10' from src limit 1; create table t2 (val varchar(10)); insert overwrite table t2 select '2014-10-10' from src limit 1; == Query: select t.val from (select val from t1 union all select val from t1 union all select val from t2 union all select val from t1) t; == Will throw exception: {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133) ... 
22 more {code} This is because getCommonClassForUnionAll is used at the query parse step, but getCommonClass is used at execution; they are not used consistently in union. The latter does not support the implicit conversion from date to string, which is the cause of the problem. The change to fix this particular union issue might be simple, but I noticed that there are three versions of getCommonClass: getCommonClass, getCommonClassForComparison, getCommonClassForUnionAll, and I wonder if they need to be cleaned up and refactored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Update: Hive user group meeting tonight
Hi all, Quick update, you should be able to attend as long as you have an ID with you. (Sorry about the confusion.) For those willing to dial in, the info is on the meetup page. http://www.meetup.com/Hive-User-Group-Meeting/events/202007872/ Regards, Xuefu
[jira] [Commented] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172726#comment-14172726 ] Yongzhi Chen commented on HIVE-8448: Thanks [~szehon], I attached the new patch following your review advice, and I also submitted a review request for the jira: https://reviews.apache.org/r/26763/ Union All might not work due to the type conversion issue - Key: HIVE-8448 URL: https://issues.apache.org/jira/browse/HIVE-8448 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Chaoyu Tang Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-8448.1.patch, HIVE-8448.2.patch create table t1 (val date); insert overwrite table t1 select '2014-10-10' from src limit 1; create table t2 (val varchar(10)); insert overwrite table t2 select '2014-10-10' from src limit 1; == Query: select t.val from (select val from t1 union all select val from t1 union all select val from t2 union all select val from t1) t; == Will throw exception: {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443) at 
org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133) ... 22 more {code} This is because getCommonClassForUnionAll is used at the query parse step, but getCommonClass is used at execution; they are not used consistently in union. The latter does not support the implicit conversion from date to string, which is the cause of the problem. The change to fix this particular union issue might be simple, but I noticed that there are three versions of getCommonClass: getCommonClass, getCommonClassForComparison, getCommonClassForUnionAll, and I wonder if they need to be cleaned up and refactored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8122) Make use of SearchArgument classes
[ https://issues.apache.org/jira/browse/HIVE-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172733#comment-14172733 ] Brock Noland commented on HIVE-8122: You are right. I created this JIRA to use something better than the string we are currently using. However, if we used SearchArgument then I think the parquet project would need that. We should create FilterPredicate and give that to Parquet. Thank you for your detailed analysis! Make use of SearchArgument classes -- Key: HIVE-8122 URL: https://issues.apache.org/jira/browse/HIVE-8122 Project: Hive Issue Type: Sub-task Reporter: Brock Noland Assignee: Ferdinand Xu ParquetSerde could be much cleaner if we used SearchArgument and associated classes like ORC does: https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
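A predicate structure of the kind being discussed can be sketched as a tiny composable tree that is evaluated against a row. This illustrates the idea only; it is neither Hive's SearchArgument API nor Parquet's FilterPredicate, and the `eq` helper is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Sketch: compose leaf comparisons into an AND tree and evaluate it against a
// row map, the way a SearchArgument-style structure lets a reader skip data
// that cannot match the pushed-down filter.
public class PredicateTreeSketch {
    static Predicate<Map<String, Object>> eq(String col, Object val) {
        return row -> val.equals(row.get(col));
    }

    public static void main(String[] args) {
        Predicate<Map<String, Object>> p =
            eq("state", "CA").and(row -> (int) row.get("zip") > 90000);

        Map<String, Object> row = new HashMap<>();
        row.put("state", "CA");
        row.put("zip", 94105);
        if (!p.test(row)) throw new AssertionError();
        System.out.println("row matches: " + p.test(row));
    }
}
```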
[jira] [Commented] (HIVE-7914) Simplify join predicates for CBO to avoid cross products
[ https://issues.apache.org/jira/browse/HIVE-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172736#comment-14172736 ] Mostafa Mokhtar commented on HIVE-7914: --- Issue resolved
{code}
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Map 4 <- Map 1 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE), Map 3 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE)
        Reducer 5 <- Map 4 (SIMPLE_EDGE)
      DagName: mmokhtar_2014101512_452c339a-3fa1-4ae4-99ed-0fb052342532:1
      Vertices:
        Map 1
          Map Operator Tree:
              TableScan
                alias: household_demographics
                filterExpr: hd_demo_sk is not null (type: boolean)
                Statistics: Num rows: 7200 Data size: 770400 Basic stats: COMPLETE Column stats: COMPLETE
                Filter Operator
                  predicate: hd_demo_sk is not null (type: boolean)
                  Statistics: Num rows: 7200 Data size: 57600 Basic stats: COMPLETE Column stats: COMPLETE
                  Select Operator
                    expressions: hd_demo_sk (type: int), hd_dep_count (type: int)
                    outputColumnNames: _col0, _col1
                    Statistics: Num rows: 7200 Data size: 57600 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      key expressions: _col0 (type: int)
                      sort order: +
                      Map-reduce partition columns: _col0 (type: int)
                      Statistics: Num rows: 7200 Data size: 57600 Basic stats: COMPLETE Column stats: COMPLETE
                      value expressions: _col1 (type: int)
          Execution mode: vectorized
        Map 2
          Map Operator Tree:
              TableScan
                alias: customer_address
                filterExpr: ((ca_country = 'United States') and ca_address_sk is not null) (type: boolean)
                Statistics: Num rows: 80 Data size: 811903688 Basic stats: COMPLETE Column stats: COMPLETE
                Filter Operator
                  predicate: ((ca_country = 'United States') and ca_address_sk is not null) (type: boolean)
                  Statistics: Num rows: 40 Data size: 7480 Basic stats: COMPLETE Column stats: COMPLETE
                  Select Operator
                    expressions: ca_address_sk (type: int), ca_state (type: string)
                    outputColumnNames: _col0, _col1
                    Statistics: Num rows: 40 Data size: 3600 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      key expressions: _col0 (type: int)
                      sort order: +
                      Map-reduce partition columns: _col0 (type: int)
                      Statistics: Num rows: 40 Data size: 3600 Basic stats: COMPLETE Column stats: COMPLETE
                      value expressions: _col1 (type: string)
          Execution mode: vectorized
        Map 3
          Map Operator Tree:
              TableScan
                alias: date_dim
                filterExpr: ((d_year = 2001) and d_date_sk is not null) (type: boolean)
                Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE
                Filter Operator
                  predicate: ((d_year = 2001) and d_date_sk is not null) (type: boolean)
                  Statistics: Num rows: 652 Data size: 5216 Basic stats: COMPLETE Column stats: COMPLETE
                  Select Operator
                    expressions: d_date_sk (type: int)
                    outputColumnNames: _col0
                    Statistics: Num rows: 652 Data size: 2608 Basic stats: COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      key expressions: _col0 (type: int)
                      sort order: +
                      Map-reduce partition columns: _col0 (type: int)
                      Statistics: Num rows: 652 Data size: 2608 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: _col0 (type: int)
                      outputColumnNames: _col0
                      Statistics: Num rows: 652 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
                      Group By Operator
                        keys: _col0 (type: int)
                        mode: hash
                        outputColumnNames: _col0
                        Statistics: Num rows: 652 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
                        Dynamic Partitioning Event Operator
                          Target
[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-8448: --- Status: Open (was: Patch Available) Union All might not work due to the type conversion issue - Key: HIVE-8448 URL: https://issues.apache.org/jira/browse/HIVE-8448 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Chaoyu Tang Assignee: Yongzhi Chen Priority: Minor Attachments: HIVE-8448.1.patch, HIVE-8448.2.patch create table t1 (val date); insert overwrite table t1 select '2014-10-10' from src limit 1; create table t2 (val varchar(10)); insert overwrite table t2 select '2014-10-10' from src limit 1; == Query: select t.val from (select val from t1 union all select val from t1 union all select val from t2 union all select val from t1) t; == Will throw exception: {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Incompatible types for union operator at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:86) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:443) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380) at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133) ... 
22 more {code} This is because getCommonClassForUnionAll is used at the query-parse step, but getCommonClass is used at execution; they are not used consistently for union. The latter does not support the implicit conversion from date to string, which is the cause of the problem. The change to fix this particular union issue might be simple, but I noticed that there are three versions of getCommonClass: getCommonClass, getCommonClassForComparison, and getCommonClassForUnionAll, and wonder if they need to be cleaned up and refactored. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
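Until the conversion functions are unified, a workaround for the repro above is to make the branch types agree explicitly. This is only an illustrative sketch against the t1/t2 tables defined in the description, assuming string is the intended common type:
{code}
-- Cast every branch to string so the union operator never needs
-- the implicit date -> string conversion that getCommonClass rejects:
select t.val from (
  select cast(val as string) as val from t1
  union all
  select cast(val as string) as val from t1
  union all
  select cast(val as string) as val from t2
  union all
  select cast(val as string) as val from t1
) t;
{code}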
[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-8448: --- Attachment: HIVE-8448.3.patch need review
[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-8448: --- Status: Patch Available (was: Open) Fixed code style after review
[jira] [Commented] (HIVE-8467) Table Copy - Background, incremental data load
[ https://issues.apache.org/jira/browse/HIVE-8467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172767#comment-14172767 ] Julian Hyde commented on HIVE-8467: --- I see this as a particular kind of materialized view. In general, a materialized view is a table whose contents are guaranteed to be the same as executing a particular query. In this case, that query is simply 'select * from t'. We don't have materialized view support yet, but I have been working on lattices in Calcite (formerly known as Optiq) (see OPTIQ-344) and there is a lot of interest in adding them to Hive. Each materialized tile in a lattice is a materialized view of the form 'select d1, d2, sum(m1), count(m2) from t group by d1, d2'. So, let's talk about whether we could change the syntax to 'create materialized view' and still deliver the functionality you need. Of course, if the user enters anything other than 'select * from t order by k1, k2' they would get an error. In terms of query planning, I strongly recommend that you build on the CBO work powered by Calcite. Let's suppose there is a table T and a copy C. After translating the query to a Calcite RelNode tree, there will be a TableAccessRel(T). After reading the metadata, we should create a TableAccessRel(C) and tell Calcite that it is equivalent. That's all you need to do. Calcite will take it from there. Assuming the stats indicate that C is better (and they should, right, because the ORC representation will be smaller?) then the query will end up using C. But if, say, T has a partitioning scheme which is more suitable for a particular query, then Calcite will choose T. Table Copy - Background, incremental data load -- Key: HIVE-8467 URL: https://issues.apache.org/jira/browse/HIVE-8467 Project: Hive Issue Type: New Feature Reporter: Rajat Venkatesh Attachments: Table Copies.pdf Traditionally, Hive and other tools in the Hadoop eco-system haven't required a load stage.
However, with recent developments, Hive is much more performant when data is stored in specific formats like ORC, Parquet, Avro etc. Technologies like Presto also work much better with certain data formats. At the same time, data is generated or obtained from 3rd parties in non-optimal formats such as CSV, tab-delimited or JSON. Often it is not an option to change the data format at the source. We've found that users either use sub-optimal formats or spend a large amount of effort creating and maintaining copies. We want to propose a new construct - Table Copy - to help “load” data into an optimal storage format. I am going to attach a PDF document with a lot more details, especially addressing how this is different from bulk loads in relational DBs or materialized views. Looking forward to hearing whether others see a similar need to formalize conversion of data to different storage formats. If so, are the details in the PDF document a good start? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
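The 'create materialized view' framing suggested in the comment above could look roughly like the following. The syntax is purely a hypothetical sketch (Hive had no such statement at this time, and the table names and STORED AS placement are assumptions):
{code}
-- Hypothetical syntax sketch: a "table copy" expressed as a
-- materialized view whose defining query is a simple scan of t.
create materialized view t_orc
stored as orc
as select * from t;

-- A query against t could then be answered from t_orc once the
-- planner knows the two relations are equivalent:
select count(*) from t;
{code}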
[jira] [Updated] (HIVE-8428) PCR doesnt remove filters involving casts
[ https://issues.apache.org/jira/browse/HIVE-8428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8428: --- Status: Open (was: Patch Available) PCR doesnt remove filters involving casts - Key: HIVE-8428 URL: https://issues.apache.org/jira/browse/HIVE-8428 Project: Hive Issue Type: Improvement Components: Logical Optimizer Affects Versions: 0.13.0, 0.12.0, 0.11.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-8428.1.patch, HIVE-8428.2.patch, HIVE-8428.patch e.g., select key,value from srcpart where hr = cast(11 as double); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
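In the example query, the cast keeps PCR from recognizing the predicate as a pure partition condition. As an illustrative sketch (assuming srcpart is partitioned on hr, as in the Hive test schema), the cast-free form of the same filter is one that PCR can already remove:
{code}
-- Partition filter PCR can remove, pruning partitions directly:
select key, value from srcpart where hr = '11';

-- Equivalent filter that this issue targets: the cast should be
-- constant-folded so the filter is removed the same way:
select key, value from srcpart where hr = cast(11 as double);
{code}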
[jira] [Updated] (HIVE-8428) PCR doesnt remove filters involving casts
[ https://issues.apache.org/jira/browse/HIVE-8428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8428: --- Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-8428) PCR doesnt remove filters involving casts
[ https://issues.apache.org/jira/browse/HIVE-8428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8428: --- Attachment: HIVE-8428.3.patch
[jira] [Commented] (HIVE-8428) PCR doesnt remove filters involving casts
[ https://issues.apache.org/jira/browse/HIVE-8428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172781#comment-14172781 ] Ashutosh Chauhan commented on HIVE-8428: Failures are because of other bugs: * orc_ppd_decimal : HIVE-8460 * orc_vectorization_ppd : HIVE-8470 * parallel : HIVE-8362
[jira] [Commented] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172785#comment-14172785 ] Eugene Koifman commented on HIVE-8387: -- https://reviews.apache.org/r/26771/ add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
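As a sketch of the kind of retry logic being proposed here (this is not the actual HIVE-8387 patch; the class and method names are hypothetical), a generic bounded-retry helper for transient ZK errors might look like:
{code}
// Hypothetical sketch: bounded retry with exponential backoff for
// transient failures, similar in spirit to what ZooKeeperStorage needs.
import java.util.concurrent.Callable;

public class RetryUtil {
    // Runs the task up to maxAttempts times, sleeping baseDelayMs,
    // then 2x, 4x, ... between attempts; rethrows the last failure.
    public static <T> T withRetries(Callable<T> task, int maxAttempts,
                                    long baseDelayMs) throws Exception {
        Exception last = null;
        long delay = baseDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2;  // exponential backoff
                }
            }
        }
        throw last;
    }
}
{code}
A production version would retry only exceptions known to be transient (e.g. ZooKeeper connection-loss) rather than all exceptions.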
[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-8448: --- Attachment: (was: HIVE-8448.3.patch)
[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-8448: --- Attachment: (was: HIVE-8448.2.patch)
[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-8448: --- Attachment: (was: HIVE-8448.1.patch)
[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-8448: --- Attachment: HIVE-8448.4.patch need code review
[jira] [Updated] (HIVE-8448) Union All might not work due to the type conversion issue
[ https://issues.apache.org/jira/browse/HIVE-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-8448: --- Status: Open (was: Patch Available)