[jira] [Updated] (HIVE-4212) sort merge join should work for outer joins for more than 8 inputs
[ https://issues.apache.org/jira/browse/HIVE-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-4212: - Resolution: Fixed Fix Version/s: 0.11.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed. Thanks Gang Tim Liu sort merge join should work for outer joins for more than 8 inputs -- Key: HIVE-4212 URL: https://issues.apache.org/jira/browse/HIVE-4212 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Fix For: 0.11.0 Attachments: hive.4212.1.patch, hive.4212.2.patch, hive.4212.3.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3348) semi-colon in comments in .q file does not work
[ https://issues.apache.org/jira/browse/HIVE-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3348: - Resolution: Fixed Fix Version/s: 0.11.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed. Thanks Nick semi-colon in comments in .q file does not work --- Key: HIVE-3348 URL: https://issues.apache.org/jira/browse/HIVE-3348 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Namit Jain Assignee: Nick Collins Fix For: 0.11.0 Attachments: hive-3348.patch -- comment ; -- comment select count(1) from src; The above test file fails
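The failing .q file above trips on the semicolon inside the `-- comment ;` line. A minimal sketch of a comment-aware statement splitter illustrates the intended behavior; this is hypothetical illustration code, not Hive's actual test-driver logic:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (not Hive's CliDriver/QTestUtil code): split a .q
// script into statements while dropping "--" comment lines first, so a ";"
// inside a comment does not terminate a statement prematurely.
public class QFileSplitter {
    public static List<String> split(String script) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : script.split("\n")) {
            if (line.trim().startsWith("--")) {
                continue; // drop whole-line comments, semicolons and all
            }
            for (char c : line.toCharArray()) {
                if (c == ';') {
                    String stmt = current.toString().trim();
                    if (!stmt.isEmpty()) {
                        statements.add(stmt);
                    }
                    current.setLength(0);
                } else {
                    current.append(c);
                }
            }
            current.append(' ');
        }
        String tail = current.toString().trim();
        if (!tail.isEmpty()) {
            statements.add(tail);
        }
        return statements;
    }

    public static void main(String[] args) {
        // the exact script from the issue description
        String q = "-- comment ;\n-- comment\nselect count(1) from src;\n";
        System.out.println(split(q)); // [select count(1) from src]
    }
}
```

With a naive split on ';' alone, the first comment line would produce an empty "statement" and confuse the driver; stripping comments before splitting avoids that.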
[jira] [Created] (HIVE-4232) JDBC2 HiveConnection has odd defaults
Chris Drome created HIVE-4232: - Summary: JDBC2 HiveConnection has odd defaults Key: HIVE-4232 URL: https://issues.apache.org/jira/browse/HIVE-4232 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Chris Drome HiveConnection defaults to using a plain SASL transport if auth is not set. To get a raw transport, auth must be set to noSasl; furthermore, noSasl is case-sensitive. The code tries to infer Kerberos or plain authentication based on the presence of principal. There is no provision for specifying the QOP level.
[jira] [Updated] (HIVE-4232) JDBC2 HiveConnection has odd defaults
[ https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-4232: -- Assignee: Chris Drome JDBC2 HiveConnection has odd defaults - Key: HIVE-4232 URL: https://issues.apache.org/jira/browse/HIVE-4232 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Chris Drome Assignee: Chris Drome HiveConnection defaults to using a plain SASL transport if auth is not set. To get a raw transport, auth must be set to noSasl; furthermore, noSasl is case-sensitive. The code tries to infer Kerberos or plain authentication based on the presence of principal. There is no provision for specifying the QOP level.
[jira] [Commented] (HIVE-4232) JDBC2 HiveConnection has odd defaults
[ https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13613592#comment-13613592 ] Chris Drome commented on HIVE-4232: --- The patch proposes that if auth is not specified or auth=none, the transport will default to TSocket. If auth=kerberos|plain, then the Kerberos SASL transport or the plain SASL transport will be used. If auth=kerberos, then principal must also be specified. We propose controlling the QOP with another parameter, qop=auth|auth-int|auth-conf. The patch also takes care of case-sensitive value comparisons. I feel that these changes result in a more reasonable set of defaults and don't rely upon inferring Kerberos or plain based on the existence of other parameters. I will rebase HIVE-4225 if these proposed changes are accepted. JDBC2 HiveConnection has odd defaults - Key: HIVE-4232 URL: https://issues.apache.org/jira/browse/HIVE-4232 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Chris Drome Assignee: Chris Drome Attachments: HIVE-4232.patch HiveConnection defaults to using a plain SASL transport if auth is not set. To get a raw transport, auth must be set to noSasl; furthermore, noSasl is case-sensitive. The code tries to infer Kerberos or plain authentication based on the presence of principal. There is no provision for specifying the QOP level.
[jira] [Updated] (HIVE-4232) JDBC2 HiveConnection has odd defaults
[ https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-4232: -- Attachment: HIVE-4232.patch Uploaded a patch which implements the proposed changes. JDBC2 HiveConnection has odd defaults - Key: HIVE-4232 URL: https://issues.apache.org/jira/browse/HIVE-4232 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Chris Drome Assignee: Chris Drome Attachments: HIVE-4232.patch HiveConnection defaults to using a plain SASL transport if auth is not set. To get a raw transport, auth must be set to noSasl; furthermore, noSasl is case-sensitive. The code tries to infer Kerberos or plain authentication based on the presence of principal. There is no provision for specifying the QOP level.
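The proposed defaults from the comment above can be sketched as a small selection routine. This is an illustrative reconstruction of the described behavior, not the actual HiveConnection code from the patch:

```java
import java.util.Locale;
import java.util.Map;

// Illustrative sketch of the defaults proposed in this thread (not the
// actual HIVE-4232 patch): choose a transport from the "auth" JDBC session
// variable, comparing values case-insensitively, and require "principal"
// only when auth=kerberos.
public class AuthSelector {
    public static String chooseTransport(Map<String, String> sessVars) {
        String auth = sessVars.getOrDefault("auth", "none").toLowerCase(Locale.ROOT);
        switch (auth) {
            case "none":
                return "raw";           // plain TSocket, the proposed default
            case "kerberos":
                if (!sessVars.containsKey("principal")) {
                    throw new IllegalArgumentException("auth=kerberos requires principal");
                }
                return "kerberos-sasl"; // Kerberos SASL transport
            case "plain":
                return "plain-sasl";    // Plain SASL transport
            default:
                throw new IllegalArgumentException("unknown auth: " + auth);
        }
    }

    public static void main(String[] args) {
        System.out.println(chooseTransport(Map.of()));                 // raw
        System.out.println(chooseTransport(Map.of("auth", "PLAIN")));  // plain-sasl
    }
}
```

The case-insensitive comparison addresses the "noSasl is case sensitive" complaint in the issue description, and the explicit `none` default removes the need to infer the mechanism from the presence of other parameters.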
[jira] [Commented] (HIVE-4225) HiveServer2 does not support SASL QOP
[ https://issues.apache.org/jira/browse/HIVE-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13613595#comment-13613595 ] Chris Drome commented on HIVE-4225: --- I will rebase this patch if the changes in HIVE-4232 are acceptable. HiveServer2 does not support SASL QOP - Key: HIVE-4225 URL: https://issues.apache.org/jira/browse/HIVE-4225 Project: Hive Issue Type: Bug Components: HiveServer2, Shims Affects Versions: 0.11.0 Reporter: Chris Drome Assignee: Chris Drome Fix For: 0.11.0 Attachments: HIVE-4225.patch HiveServer2 implements Kerberos authentication through the SASL framework, but does not support setting QOP.
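For context, a QOP setting is ultimately handed to Java's SASL layer as a property map. A minimal sketch, assuming only the standard `javax.security.sasl` API (the shim-level wiring in the actual HIVE-4225 patch is not shown here):

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

// Sketch of how the proposed qop=auth|auth-int|auth-conf parameter would map
// onto the standard Java SASL properties; illustrative, not the patch itself.
public class QopProps {
    public static Map<String, String> saslProps(String qop) {
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, qop);            // "auth", "auth-int", or "auth-conf"
        props.put(Sasl.SERVER_AUTH, "true"); // request mutual authentication
        return props;
    }
}
```

Here `auth` gives authentication only, `auth-int` adds integrity protection, and `auth-conf` adds confidentiality (encryption) on the wire.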
[jira] [Updated] (HIVE-4042) ignore mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4042: --- Resolution: Fixed Fix Version/s: 0.11.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Namit! ignore mapjoin hint --- Key: HIVE-4042 URL: https://issues.apache.org/jira/browse/HIVE-4042 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Fix For: 0.11.0 Attachments: hive.4042.10.patch, hive.4042.11.patch, hive.4042.12.patch, hive.4042.1.patch, hive.4042.2.patch, hive.4042.3.patch, hive.4042.4.patch, hive.4042.5.patch, hive.4042.6.patch, hive.4042.7.patch, hive.4042.8.patch, hive.4042.9.patch After HIVE-3784, in a production environment, it can become difficult to deploy since a lot of production queries can break.
[jira] [Updated] (HIVE-3562) Some limit can be pushed down to map stage
[ https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-3562: -- Attachment: HIVE-3562.D5967.4.patch navis updated the revision HIVE-3562 [jira] Some limit can be pushed down to map stage. 1. Used Heap for ORDER BY and Map for GROUP BY 2. Added tests for spill/break 3. Changed to use percentage for memory threshold Reviewers: tarball, JIRA REVISION DETAIL https://reviews.facebook.net/D5967 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D5967?vs=24861id=30483#toc AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java conf/hive-default.xml.template ql/build.xml ql/ivy.xml ql/src/java/org/apache/hadoop/hive/ql/exec/ExtractOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/ForwardOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHashForGBY.java ql/src/java/org/apache/hadoop/hive/ql/io/HiveKey.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java ql/src/test/queries/clientpositive/limit_pushdown.q ql/src/test/results/clientpositive/limit_pushdown.q.out To: JIRA, tarball, navis Cc: njain Some limit can be pushed down to map stage -- Key: HIVE-3562 URL: https://issues.apache.org/jira/browse/HIVE-3562 Project: Hive Issue Type: Bug Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3562.D5967.1.patch, HIVE-3562.D5967.2.patch, HIVE-3562.D5967.3.patch, HIVE-3562.D5967.4.patch Queries with limit clause (with reasonable number), for example {noformat} select * from src order by key 
limit 10; {noformat} makes the operator tree TS-SEL-RS-EXT-LIMIT-FS, but LIMIT can be partially computed in RS, reducing the size of the shuffle: TS-SEL-RS(TOP-N)-EXT-LIMIT-FS
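The RS(TOP-N) idea can be illustrated with a bounded heap: each mapper retains only the N smallest keys, so at most N rows per mapper are shuffled instead of the full input. This is a toy illustration of the technique, not the TopNHash implementation from the patch:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Toy sketch of map-side top-N for "order by key limit N" (illustrative,
// not Hive's TopNHash): a max-heap bounded at N keeps the N smallest keys
// seen so far; everything else is dropped before the shuffle.
public class TopNSketch {
    public static List<Integer> topN(Iterable<Integer> keys, int n) {
        // max-heap holding the current N smallest keys
        PriorityQueue<Integer> heap = new PriorityQueue<>(Collections.reverseOrder());
        for (int key : keys) {
            if (heap.size() < n) {
                heap.add(key);
            } else if (key < heap.peek()) {
                heap.poll();   // evict the largest retained key
                heap.add(key);
            }
        }
        List<Integer> result = new ArrayList<>(heap);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(topN(List.of(5, 1, 9, 3, 7, 2), 3)); // [1, 2, 3]
    }
}
```

The reducer still applies the final LIMIT, but it now merges at most N rows per mapper rather than the whole dataset, which is exactly the shuffle reduction the issue describes.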
[jira] [Commented] (HIVE-4007) Create abstract classes for serializer and deserializer
[ https://issues.apache.org/jira/browse/HIVE-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13613612#comment-13613612 ] Ashutosh Chauhan commented on HIVE-4007: Cool. +1 running tests. Create abstract classes for serializer and deserializer --- Key: HIVE-4007 URL: https://issues.apache.org/jira/browse/HIVE-4007 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.4007.1.patch, hive.4007.2.patch, hive.4007.3.patch, hive.4007.4.patch Currently, it is very difficult to change the Serializer/Deserializer interface, since all the SerDes directly implement the interface. Instead, we should have abstract classes for implementing these interfaces. In case of an interface change, only the abstract class and the relevant SerDe need to change.
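The abstract-class pattern the issue proposes can be sketched with a toy interface (hypothetical names, not Hive's actual SerDe API): a later addition to the interface is absorbed once in the base class instead of breaking every implementation.

```java
// Toy illustration of the HIVE-4007 proposal: SerDes extend an abstract base
// rather than implementing the interface directly, so interface evolution
// only touches the base class and the SerDes that care.
interface Deserializer {
    Object deserialize(byte[] raw);
    long deserializedBytes(); // imagine this method was added later
}

abstract class AbstractDeserializer implements Deserializer {
    // the late addition gets a safe default here, once
    @Override
    public long deserializedBytes() {
        return 0L;
    }
}

class TextDeserializer extends AbstractDeserializer {
    // existing SerDes compile unchanged; they only implement what they need
    @Override
    public Object deserialize(byte[] raw) {
        return new String(raw);
    }
}
```

Without the base class, adding `deserializedBytes()` would be a source-incompatible change for every SerDe implementing the interface directly, which is exactly the difficulty the description calls out.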
[jira] [Updated] (HIVE-3980) Cleanup after HIVE-3403
[ https://issues.apache.org/jira/browse/HIVE-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3980: --- Resolution: Fixed Fix Version/s: 0.11.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Namit! Cleanup after HIVE-3403 --- Key: HIVE-3980 URL: https://issues.apache.org/jira/browse/HIVE-3980 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Fix For: 0.11.0 Attachments: hive.3980.1.patch, hive.3980.2.patch, hive.3980.3.patch, hive.3980.4.patch There have been a lot of comments on HIVE-3403, which involve changing variable names/function names/adding more comments/general cleanup etc. Since HIVE-3403 involves a lot of refactoring, it was fairly difficult to address the comments there, since refreshing becomes impossible. This jira is to track those cleanups.
[jira] [Commented] (HIVE-4197) Bring windowing support inline with SQL Standard
[ https://issues.apache.org/jira/browse/HIVE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13613618#comment-13613618 ] Ashutosh Chauhan commented on HIVE-4197: https://reviews.facebook.net/D9717 Review request on behalf of [~rhbutani] Bring windowing support inline with SQL Standard Key: HIVE-4197 URL: https://issues.apache.org/jira/browse/HIVE-4197 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Attachments: WindowingSpecification.pdf The current behavior differs from the Standard in several significant places. Please review the attached doc; there are still a few open issues. Once we agree on the behavior, we can proceed with fixing the implementation.
[jira] [Updated] (HIVE-3381) Result of outer join is not valid
[ https://issues.apache.org/jira/browse/HIVE-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-3381: -- Attachment: HIVE-3381.D5565.7.patch navis updated the revision HIVE-3381 [jira] Result of outer join is not valid. Rebased to trunk (HIVE-3980) added test (Thanks Vikram) Reviewers: ashutoshc, JIRA REVISION DETAIL https://reviews.facebook.net/D5565 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D5565?vs=30351id=30495#toc BRANCH DPAL-1739 ARCANIST PROJECT hive AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java ql/src/test/queries/clientpositive/mapjoin_test_outer.q ql/src/test/results/clientpositive/auto_join21.q.out ql/src/test/results/clientpositive/auto_join29.q.out ql/src/test/results/clientpositive/auto_join7.q.out ql/src/test/results/clientpositive/auto_join_filters.q.out ql/src/test/results/clientpositive/join21.q.out ql/src/test/results/clientpositive/join7.q.out ql/src/test/results/clientpositive/join_1to1.q.out ql/src/test/results/clientpositive/join_filters.q.out ql/src/test/results/clientpositive/join_filters_overlap.q.out ql/src/test/results/clientpositive/mapjoin1.q.out ql/src/test/results/clientpositive/mapjoin_test_outer.q.out To: JIRA, ashutoshc, navis Cc: njain Result of outer join is not valid - Key: HIVE-3381 URL: https://issues.apache.org/jira/browse/HIVE-3381 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Priority: Critical Attachments: HIVE-3381.D5565.3.patch, HIVE-3381.D5565.4.patch, 
HIVE-3381.D5565.5.patch, HIVE-3381.D5565.6.patch, HIVE-3381.D5565.7.patch, mapjoin_testOuter.q Outer joins, especially full outer joins or outer join with filter on 'ON clause' is not showing proper results. For example, query in test join_1to1.q {code} SELECT * FROM join_1to1_1 a full outer join join_1to1_2 b on a.key1 = b.key1 and a.value = 66 and b.value = 66 ORDER BY a.key1 ASC, a.key2 ASC, a.value ASC, b.key1 ASC, b.key2 ASC, b.value ASC; {code} results {code} NULL NULLNULLNULLNULL66 NULL NULLNULLNULL10050 66 NULL NULLNULL10 10010 66 NULL NULLNULL30 10030 88 NULL NULLNULL35 10035 88 NULL NULLNULL40 10040 88 NULL NULLNULL40 10040 88 NULL NULLNULL50 10050 88 NULL NULLNULL50 10050 88 NULL NULLNULL50 10050 88 NULL NULLNULL70 10040 88 NULL NULLNULL70 10040 88 NULL NULLNULL70 10040 88 NULL NULLNULL70 10040 88 NULL NULL66 NULLNULLNULL NULL 10050 66 NULLNULLNULL 5 10005 66 5 10005 66 1510015 66 NULLNULLNULL 2010020 66 20 10020 66 2510025 88 NULLNULLNULL 3010030 66 NULLNULLNULL 3510035 88 NULLNULLNULL 4010040 66 NULLNULLNULL 4010040 66 40 10040 66 4010040 88 NULLNULLNULL 4010040 88 NULLNULLNULL 5010050 66 NULLNULLNULL 5010050 66 50 10050 66 5010050 66 50 10050 66 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 6010040 66 60 10040 66 6010040 66 60 10040 66 6010040 66 60 10040 66 6010040 66 60 10040 66 7010040 66 NULLNULLNULL 7010040 66 NULLNULLNULL 7010040 66 NULLNULLNULL 7010040 66 NULLNULLNULL 8010040 88 NULLNULLNULL 8010040 88 NULLNULLNULL 8010040 88 NULLNULLNULL 8010040 88 NULLNULLNULL {code} but it seemed not right. This should be {code} NULL NULLNULLNULLNULL66 NULL NULLNULLNULL10050 66 NULL NULLNULL10 10010 66 NULL NULLNULL25 10025 66 NULL
[jira] [Created] (HIVE-4233) The TGT gotten from class 'CLIService' should be renewed on time?
dong created HIVE-4233: -- Summary: The TGT gotten from class 'CLIService' should be renewed on time? Key: HIVE-4233 URL: https://issues.apache.org/jira/browse/HIVE-4233 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.10.0 Environment: CentOS release 6.3 (Final) jdk1.6.0_31 HiveServer2 0.10.0-cdh4.2.0 Kerberos Security Reporter: dong Priority: Critical When HiveServer2 has been running for more than 7 days, connecting with the beeline shell fails for all operations. The HiveServer2 log shows this was caused by a Kerberos auth failure; the exception stack trace is: 2013-03-26 11:55:20,932 ERROR hive.ql.metadata.Hive: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1084) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.init(RetryingMetaStoreClient.java:51) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:61) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2140) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2151) at org.apache.hadoop.hive.ql.metadata.Hive.getDelegationToken(Hive.java:2275) at org.apache.hive.service.cli.CLIService.getDelegationTokenFromMetaStore(CLIService.java:358) at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:127) at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1073) at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1058) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:565) at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedConstructorAccessor52.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1082) ... 16 more Caused by: java.lang.IllegalStateException: This ticket is no longer valid at javax.security.auth.kerberos.KerberosTicket.toString(KerberosTicket.java:601) at java.lang.String.valueOf(String.java:2826) at java.lang.StringBuilder.append(StringBuilder.java:115) at sun.security.jgss.krb5.SubjectComber.findAux(SubjectComber.java:120) at sun.security.jgss.krb5.SubjectComber.find(SubjectComber.java:41) at sun.security.jgss.krb5.Krb5Util.getTicket(Krb5Util.java:130) at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:328) at java.security.AccessController.doPrivileged(Native Method) at sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:325) at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:128) at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:106) at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:172) at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:209) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:195) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:162) at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:175) at 
org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253) at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) at java.security.AccessController.doPrivileged(Native Method)
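The fix the reporter is asking about amounts to re-logging in before the ticket expires. A hypothetical sketch of the renewal-window check (the real fix would go through Hadoop's UserGroupInformation; the 80% threshold here is an illustrative assumption):

```java
// Hypothetical sketch of a TGT renewal check: trigger a relogin once the
// ticket has passed a fraction of its lifetime, well before the
// IllegalStateException("This ticket is no longer valid") above can occur.
public class TgtRenewalCheck {
    static final double RENEW_WINDOW = 0.8; // relogin at 80% of ticket life (assumed)

    public static boolean shouldRelogin(long issuedMs, long expiresMs, long nowMs) {
        long lifetime = expiresMs - issuedMs;
        return nowMs >= issuedMs + (long) (lifetime * RENEW_WINDOW);
    }

    public static void main(String[] args) {
        long issued = 0L, expires = 10_000L;
        System.out.println(shouldRelogin(issued, expires, 7_000L)); // false
        System.out.println(shouldRelogin(issued, expires, 8_500L)); // true
    }
}
```

A background thread in the server would run this check periodically and relogin from the keytab when it returns true, so long-running daemons like HiveServer2 never hold an expired TGT.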
[jira] [Commented] (HIVE-1558) introducing the dual table
[ https://issues.apache.org/jira/browse/HIVE-1558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13614159#comment-13614159 ] Maxime LANCIAUX commented on HIVE-1558: --- What do you think about modifying Hive.g and adding a token for DUAL? introducing the dual table Key: HIVE-1558 URL: https://issues.apache.org/jira/browse/HIVE-1558 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Ning Zhang Assignee: Marcin Kurczych The dual table in MySQL and Oracle is very convenient for testing UDFs or constructing rows without reading any other tables. If dual is the only data source, we could leverage local mode execution.
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #332
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/332/ -- [...truncated 5078 lines...] A ql/src/gen/thrift/gen-py/queryplan/constants.py A ql/src/gen/thrift/gen-py/queryplan/__init__.py A ql/src/gen/thrift/gen-cpp A ql/src/gen/thrift/gen-cpp/queryplan_constants.h A ql/src/gen/thrift/gen-cpp/queryplan_types.cpp A ql/src/gen/thrift/gen-cpp/queryplan_types.h A ql/src/gen/thrift/gen-cpp/queryplan_constants.cpp A ql/src/gen/thrift/gen-rb A ql/src/gen/thrift/gen-rb/queryplan_types.rb A ql/src/gen/thrift/gen-rb/queryplan_constants.rb A ql/src/gen/thrift/gen-javabean A ql/src/gen/thrift/gen-javabean/org A ql/src/gen/thrift/gen-javabean/org/apache A ql/src/gen/thrift/gen-javabean/org/apache/hadoop A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/QueryPlan.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Adjacency.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Graph.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Task.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/AdjacencyType.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Stage.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/TaskType.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Query.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/StageType.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/NodeType.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/Operator.java A ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java A ql/src/gen/thrift/gen-php A 
ql/src/gen/thrift/gen-php/queryplan A ql/src/gen/thrift/gen-php/queryplan/queryplan_types.php A ql/src/gen-javabean A ql/src/gen-javabean/org A ql/src/gen-javabean/org/apache A ql/src/gen-javabean/org/apache/hadoop A ql/src/gen-javabean/org/apache/hadoop/hive A ql/src/gen-javabean/org/apache/hadoop/hive/ql A ql/src/gen-javabean/org/apache/hadoop/hive/ql/plan A ql/src/gen-javabean/org/apache/hadoop/hive/ql/plan/api A ql/src/gen-php A ql/build.xml A ql/if A ql/if/queryplan.thrift A pdk A pdk/ivy.xml A pdk/scripts A pdk/scripts/class-registration.xsl A pdk/scripts/build-plugin.xml A pdk/scripts/README A pdk/src A pdk/src/java A pdk/src/java/org A pdk/src/java/org/apache A pdk/src/java/org/apache/hive A pdk/src/java/org/apache/hive/pdk A pdk/src/java/org/apache/hive/pdk/FunctionExtractor.java A pdk/src/java/org/apache/hive/pdk/HivePdkUnitTest.java A pdk/src/java/org/apache/hive/pdk/HivePdkUnitTests.java A pdk/src/java/org/apache/hive/pdk/PluginTest.java A pdk/test-plugin A pdk/test-plugin/test A pdk/test-plugin/test/cleanup.sql A pdk/test-plugin/test/onerow.txt A pdk/test-plugin/test/setup.sql A pdk/test-plugin/src A pdk/test-plugin/src/org A pdk/test-plugin/src/org/apache A pdk/test-plugin/src/org/apache/hive A pdk/test-plugin/src/org/apache/hive/pdktest A pdk/test-plugin/src/org/apache/hive/pdktest/Rot13.java A pdk/test-plugin/build.xml A pdk/build.xml A build-offline.xml U. 
At revision 1461198 no change for http://svn.apache.org/repos/asf/hive/branches/branch-0.9 since the previous build [hive] $ /home/hudson/tools/ant/apache-ant-1.8.1/bin/ant -Dversion=0.9.1-SNAPSHOT very-clean tar binary Buildfile: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build.xml ivy-init-dirs: [echo] Project: hive [mkdir] Created dir: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy [mkdir] Created dir: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy/lib [mkdir] Created dir: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy/report [mkdir] Created dir: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy/maven ivy-download: [echo] Project: hive [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar [get] To:
[jira] [Commented] (HIVE-4179) NonBlockingOpDeDup does not merge SEL operators correctly
[ https://issues.apache.org/jira/browse/HIVE-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13614279#comment-13614279 ] Gunther Hagleitner commented on HIVE-4179: -- [~navis] I think you'd be the best person to take a look. Can you spare a moment? NonBlockingOpDeDup does not merge SEL operators correctly - Key: HIVE-4179 URL: https://issues.apache.org/jira/browse/HIVE-4179 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4179.1.patch, HIVE-4179.2.patch The input columns list for SEL operations isn't merged properly in the optimization. The best way to see this is running union_remove_22.q with -Dhadoop.mr.rev=23. The plan shows lost UDFs and a broken lineage for one column. Note: union_remove tests do not run on hadoop 1 or 0.20.
[jira] [Updated] (HIVE-3381) Result of outer join is not valid
[ https://issues.apache.org/jira/browse/HIVE-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3381: --- Resolution: Fixed Fix Version/s: 0.11.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Navis! Result of outer join is not valid - Key: HIVE-3381 URL: https://issues.apache.org/jira/browse/HIVE-3381 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Priority: Critical Fix For: 0.11.0 Attachments: HIVE-3381.D5565.3.patch, HIVE-3381.D5565.4.patch, HIVE-3381.D5565.5.patch, HIVE-3381.D5565.6.patch, HIVE-3381.D5565.7.patch, mapjoin_testOuter.q Outer joins, especially full outer joins or outer join with filter on 'ON clause' is not showing proper results. For example, query in test join_1to1.q {code} SELECT * FROM join_1to1_1 a full outer join join_1to1_2 b on a.key1 = b.key1 and a.value = 66 and b.value = 66 ORDER BY a.key1 ASC, a.key2 ASC, a.value ASC, b.key1 ASC, b.key2 ASC, b.value ASC; {code} results {code} NULL NULLNULLNULLNULL66 NULL NULLNULLNULL10050 66 NULL NULLNULL10 10010 66 NULL NULLNULL30 10030 88 NULL NULLNULL35 10035 88 NULL NULLNULL40 10040 88 NULL NULLNULL40 10040 88 NULL NULLNULL50 10050 88 NULL NULLNULL50 10050 88 NULL NULLNULL50 10050 88 NULL NULLNULL70 10040 88 NULL NULLNULL70 10040 88 NULL NULLNULL70 10040 88 NULL NULLNULL70 10040 88 NULL NULL66 NULLNULLNULL NULL 10050 66 NULLNULLNULL 5 10005 66 5 10005 66 1510015 66 NULLNULLNULL 2010020 66 20 10020 66 2510025 88 NULLNULLNULL 3010030 66 NULLNULLNULL 3510035 88 NULLNULLNULL 4010040 66 NULLNULLNULL 4010040 66 40 10040 66 4010040 88 NULLNULLNULL 4010040 88 NULLNULLNULL 5010050 66 NULLNULLNULL 5010050 66 50 10050 66 5010050 66 50 10050 66 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 6010040 66 60 10040 66 6010040 66 60 10040 66 6010040 66 60 10040 66 6010040 66 60 
10040 66 7010040 66 NULLNULLNULL 7010040 66 NULLNULLNULL 7010040 66 NULLNULLNULL 7010040 66 NULLNULLNULL 8010040 88 NULLNULLNULL 8010040 88 NULLNULLNULL 8010040 88 NULLNULLNULL 8010040 88 NULLNULLNULL {code} but it seemed not right. This should be {code} NULL NULLNULLNULLNULL66 NULL NULLNULLNULL10050 66 NULL NULLNULL10 10010 66 NULL NULLNULL25 10025 66 NULL NULLNULL30 10030 88 NULL NULLNULL35 10035 88 NULL NULLNULL40 10040 88 NULL NULLNULL50 10050 88 NULL NULLNULL70 10040 88 NULL NULLNULL70 10040 88 NULL NULLNULL80 10040 66 NULL NULLNULL80 10040 66 NULL NULL66 NULLNULLNULL NULL 10050 66 NULLNULLNULL 5 10005 66 5 10005 66 1510015 66 NULLNULLNULL 2010020 66 20 10020 66 2510025 88 NULLNULLNULL 3010030 66 NULLNULLNULL 3510035 88 NULLNULLNULL 4010040 66 40 10040 66 4010040 88 NULLNULLNULL 5010050 66 50 10050 66 5010050 66 50 10050 66 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 6010040 66 60 10040 66 6010040 66 60 10040 66 6010040 66 60 10040 66 6010040 66 60 10040 66 7010040 66 NULLNULLNULL 7010040 66 NULLNULLNULL 8010040 88 NULLNULLNULL 8010040 88 NULLNULLNULL {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA
[jira] [Updated] (HIVE-4007) Create abstract classes for serializer and deserializer
[ https://issues.apache.org/jira/browse/HIVE-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4007: --- Resolution: Fixed Fix Version/s: 0.11.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Namit! Create abstract classes for serializer and deserializer --- Key: HIVE-4007 URL: https://issues.apache.org/jira/browse/HIVE-4007 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Namit Jain Assignee: Namit Jain Fix For: 0.11.0 Attachments: hive.4007.1.patch, hive.4007.2.patch, hive.4007.3.patch, hive.4007.4.patch Currently, it is very difficult to change the Serializer/Deserializer interface, since all the SerDes directly implement the interface. Instead, we should have abstract classes implementing these interfaces. In case of an interface change, only the abstract class and the relevant SerDe need to change.
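The motivation for HIVE-4007 can be sketched in Python as an analogy (Hive's real SerDes are Java, and the class and method names below are illustrative, not Hive's API): when implementations extend an abstract base class rather than implementing the interface directly, a later interface addition can ship with a default in the base, so existing implementations keep working unchanged.

```python
# Hedged analogy (invented names, not Hive's SerDe API): an abstract
# base class insulates subclasses from interface growth.
from abc import ABC, abstractmethod

class AbstractSerDe(ABC):
    @abstractmethod
    def deserialize(self, blob):
        ...

    # Imagine this method was added after CsvSerDe was written: the
    # default here keeps old subclasses working without modification.
    def get_serde_stats(self):
        return None

class CsvSerDe(AbstractSerDe):
    def deserialize(self, blob):
        return blob.decode().split(",")

serde = CsvSerDe()
row = serde.deserialize(b"a,b,c")
```

With direct interface implementation, every SerDe would need an edit for each new method; with the abstract class, only the base and the SerDes that care about the new method change.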
[jira] [Updated] (HIVE-3958) support partial scan for analyze command - RCFile
[ https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-3958: --- Attachment: HIVE-3958.patch.5 support partial scan for analyze command - RCFile - Key: HIVE-3958 URL: https://issues.apache.org/jira/browse/HIVE-3958 Project: Hive Issue Type: Improvement Reporter: Gang Tim Liu Assignee: Gang Tim Liu Attachments: HIVE-3958.patch.1, HIVE-3958.patch.2, HIVE-3958.patch.3, HIVE-3958.patch.4, HIVE-3958.patch.5 The analyze command allows us to collect statistics on existing tables/partitions. It works great but can be slow since it scans all files. There are two ways to speed it up: 1. Collect stats without a file scan. This may not collect all stats, but is good and fast enough for the use case. HIVE-3917 addresses it. 2. Collect stats via a partial file scan, which reads only part of each file to get file metadata. Some examples are https://cwiki.apache.org/Hive/rcfilecat.html for RCFile, ORC (HIVE-3874), and HBase's HFile. This jira targets #2, specifically the RCFile format.
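The "partial file scan" idea above can be sketched in Python with an invented file layout (this is not RCFile's real format, just an illustration of the principle): when row counts live in file metadata, statistics need only a header read, not a full scan.

```python
# Hedged sketch: stats from a small header instead of a full scan.
# The header layout here is made up, not RCFile's actual format.
import os
import struct
import tempfile

def write_data_file(path, rows):
    with open(path, "wb") as f:
        f.write(struct.pack(">I", len(rows)))   # 4-byte header: row count
        for row in rows:
            f.write(row.encode() + b"\n")       # body: the actual data

def partial_scan_stats(path):
    # Read only the 4-byte header, however large the body is.
    with open(path, "rb") as f:
        (num_rows,) = struct.unpack(">I", f.read(4))
    return {"numRows": num_rows, "totalSize": os.path.getsize(path)}

fd, path = tempfile.mkstemp()
os.close(fd)
write_data_file(path, ["a", "bb", "ccc"])
stats = partial_scan_stats(path)
os.remove(path)
```

This mirrors the rcfilecat-style approach the issue cites: the cost of collecting numRows becomes proportional to the metadata, not the data.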
[jira] [Work stopped] (HIVE-3958) support partial scan for analyze command - RCFile
[ https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-3958 stopped by Gang Tim Liu.
[jira] [Updated] (HIVE-3958) support partial scan for analyze command - RCFile
[ https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-3958: --- Status: Patch Available (was: In Progress) Another diff is ready. Thanks.
[jira] [Work started] (HIVE-3958) support partial scan for analyze command - RCFile
[ https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-3958 started by Gang Tim Liu.
[jira] [Created] (HIVE-4234) Add the partitioned column information to the TableScanDesc.
sachin created HIVE-4234: Summary: Add the partitioned column information to the TableScanDesc. Key: HIVE-4234 URL: https://issues.apache.org/jira/browse/HIVE-4234 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: sachin Assignee: sachin Priority: Minor Fix For: 0.10.1 This information will be useful for row processing by various operator hooks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4234) Add the partitioned column information to the TableScanDesc.
[ https://issues.apache.org/jira/browse/HIVE-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sachin updated HIVE-4234: - Fix Version/s: (was: 0.10.1)
[jira] [Updated] (HIVE-4095) Add exchange partition in Hive
[ https://issues.apache.org/jira/browse/HIVE-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dheeraj Kumar Singh updated HIVE-4095: -- Assignee: Dheeraj Kumar Singh (was: Rui Jian) Add exchange partition in Hive -- Key: HIVE-4095 URL: https://issues.apache.org/jira/browse/HIVE-4095 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Dheeraj Kumar Singh It would be very useful to support exchange partition in Hive, something similar to http://www.orafaq.com/node/2570 in Oracle.
[jira] [Commented] (HIVE-4228) Bump up hadoop2 version in trunk
[ https://issues.apache.org/jira/browse/HIVE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614548#comment-13614548 ] Thiruvel Thirumoolan commented on HIVE-4228: Patch on Phabricator - https://reviews.facebook.net/D9723 Bump up hadoop2 version in trunk Key: HIVE-4228 URL: https://issues.apache.org/jira/browse/HIVE-4228 Project: Hive Issue Type: Improvement Components: Build Infrastructure Affects Versions: 0.11.0 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Fix For: 0.11.0 Attachments: HIVE-4228.patch Hive builds with hadoop 2.0.0-alpha now. Bumping up to hadoop-2.0.3-alpha. Have raised JIRAs with hive10-hadoop23.6 unit tests. Most of them should fix any new failures due to this bump. [I am guessing this should also help HCatalog].
[jira] [Updated] (HIVE-4228) Bump up hadoop2 version in trunk
[ https://issues.apache.org/jira/browse/HIVE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-4228: --- Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-3958) support partial scan for analyze command - RCFile
[ https://issues.apache.org/jira/browse/HIVE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-3958: --- Attachment: HIVE-3958.patch.6
Re: Merging HCatalog into Hive
There's an issue with the permissions here. In the authorization file you granted permission to hcatalog committers on a directory /hive/hcatalog. But in Hive you created /hive/trunk/hcatalog, which none of the hcatalog committers can access. In the authorization file you'll need to change hive-hcatalog to have authorization /hive/trunk/hcatalog. There is also a scalability issue. Every time Hive branches you'll have to add a line for that branch as well. Also, this will prohibit any dev branches for hcatalog users, or access to any dev branches done in Hive. I suspect you'll find it much easier to give the hive-hcatalog group access to /hive and then use community mores to enforce that no hcat committers commit outside the hcat directory. Alan. On Mar 15, 2013, at 5:26 PM, Carl Steinbach wrote: Hi Alan, I committed HIVE-4145, created an HCatalog component on JIRA, and updated the asf-authorization-template to give the HCatalog committers karma on the hcatalog subdirectory. At this point I think everything should be ready to go. Let me know if you run into any problems. Thanks. Carl On Wed, Mar 13, 2013 at 11:56 AM, Alan Gates ga...@hortonworks.com wrote: Proposed changes look good to me. And you don't need an infra ticket to grant karma. Since you're Hive VP you can do it. See http://www.apache.org/dev/pmc.html#SVNaccess Alan. On Mar 10, 2013, at 9:29 PM, Carl Steinbach wrote: Hi Alan, I submitted a patch that creates the hcatalog directory and makes some other necessary changes here: https://issues.apache.org/jira/browse/HIVE-4145 Once this is committed I will contact ASFINFRA and ask them to grant the HCatalog committers karma. Thanks. Carl On Sat, Mar 9, 2013 at 12:54 PM, Alan Gates ga...@hortonworks.com wrote: Alright, I've gotten some feedback from Brock around the JIRA stuff and Carl in a live conversation expressed his desire to move hcat into the Hive namespace sooner rather than later. 
So the proposal is that we'd move the code to org.apache.hive.hcatalog, though we would create shell classes and interfaces in org.apache.hcatalog for all public classes and interfaces so that it will be backward compatible. I'm fine with doing this now. So, let's get started. Carl, could you create an hcatalog directory under trunk/hive and grant the listed hcat committers karma on it? Then I'll get started on moving the actual code. Alan. On Feb 24, 2013, at 12:22 PM, Brock Noland wrote: Looks good from my perspective and I'm glad to see this moving forward. Regarding #4 (JIRA) I don't know if there's a way to upload existing JIRAs into Hive's JIRA, but I think it would be better to leave them where they are. JIRA has a bulk move feature, but I am curious as to why we would leave them under the old project? There might be good reason to orphan them, but my first thought is that it would be nice to have them under the HIVE project simply for search purposes. Brock On Fri, Feb 22, 2013 at 7:12 PM, Alan Gates ga...@hortonworks.com wrote: Alright, our vote has passed, it's time to get on with merging HCatalog into Hive. Here are the things I can think of we need to deal with. Please add additional issues I've missed: 1) Moving the code 2) Dealing with domain names in the code 3) The mailing lists 4) The JIRA 5) The website 6) Committer rights 7) Make a proposal for how HCat is released going forward 8) Publish an FAQ Proposals for how we handle these: Below I propose an approach for how to handle each of these. Feedback welcome. 1) Moving the code I propose that HCat move into a subdirectory of Hive. This fits nicely into Hive's structure since it already has metastore, ql, etc. We'd just add 'hcatalog' as a new directory. This directory would contain hcatalog as it is today. It does not follow Hive's standard build model so we'd need to do some work to make it so that building Hive also builds HCat, but this should be minimal. 
2) Dealing with domain names HCat code currently is under org.apache.hcatalog. Do we want to change it? In time we probably should change it to match the rest of Hive (org.apache.hadoop.hive.hcatalog). We need to do this in a backward compatible way. I propose we leave it as is for now and if we decide to in the future we can move the actual code to org.apache.hadoop.hive.hcatalog and create shell classes under org.apache.hcatalog. 3) The mailing lists Given that our goal is to merge the projects and not create a subproject we should merge the mailing lists rather than keep hcat specific lists. We can ask infra to remove hcatalog-*@incubator.apache.org and forward any new mail to the appropriate Hive lists. We need to find out if they can auto-subscribe people from the
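The "shell class" plan described in this thread can be sketched in Python as an analogy. The real work is in Java (org.apache.hcatalog delegating to org.apache.hive.hcatalog); the module and class names below are invented purely for illustration: the implementation lives under the new name, and the old name exposes a thin subclass, so existing client imports keep working.

```python
# Hedged Python analogy of the Java shell-class scheme; names invented.
import sys
import types

# New home of the implementation.
new_mod = types.ModuleType("org_apache_hive_hcatalog")

class HCatLoader:
    def load(self):
        return "loaded"

new_mod.HCatLoader = HCatLoader
sys.modules["org_apache_hive_hcatalog"] = new_mod

# Backward-compatible shell under the old name.
old_mod = types.ModuleType("org_apache_hcatalog")

class _ShellHCatLoader(HCatLoader):
    # No body: all behavior is inherited from the relocated class.
    pass

old_mod.HCatLoader = _ShellHCatLoader
sys.modules["org_apache_hcatalog"] = old_mod

# Old client code is untouched and still gets the new implementation.
from org_apache_hcatalog import HCatLoader as LegacyLoader
legacy = LegacyLoader()
```

The design point is the same as in the Java case: callers compiled against the old namespace transparently run the relocated code.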
Re: Merging HCatalog into Hive
Hi Alan, I agree that it will probably be too painful to enforce the rules with SVN, so I went ahead and gave all of the HCatalog committers RW access to /hive. Please follow the rules. If I receive any complaints about this I'll revert back to the old scheme. Thanks. Carl
Re: Merging HCatalog into Hive
Cool, it works now. Thanks for the fast response. Alan.
[jira] [Created] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
Gang Tim Liu created HIVE-4235: -- Summary: CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists Key: HIVE-4235 URL: https://issues.apache.org/jira/browse/HIVE-4235 Project: Hive Issue Type: Bug Components: JDBC, Query Processor, SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu CREATE TABLE IF NOT EXISTS uses an inefficient way to check whether the table exists. It uses Hive.java's getTablesByPattern(...), which involves a regular expression and eventually a database join. This is very inefficient, and may increase database lock time and hurt db performance if a lot of such commands hit the database. The suggested approach is to use getTable(...) since we know the table name already.
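The cost difference HIVE-4235 describes can be illustrated with a small Python sketch (this is not Hive's metastore code; the in-memory "metastore" and function names are invented to mirror the two call shapes): a pattern lookup must examine every table name, while a direct lookup is one keyed fetch.

```python
# Hedged illustration (not Hive's code) of pattern lookup vs. direct
# lookup for a CREATE TABLE IF NOT EXISTS existence check.
import re

tables = {"t%d" % i: {"name": "t%d" % i} for i in range(10000)}

def get_tables_by_pattern(pattern):
    rx = re.compile(pattern)
    return [n for n in tables if rx.fullmatch(n)]   # scans all 10000 names

def get_table(name):
    return tables.get(name)                         # single keyed lookup

# Both answer "does t42 exist?", but the direct lookup avoids the scan
# (and, in the real metastore, the regular expression and database join).
exists_slow = bool(get_tables_by_pattern("t42"))
exists_fast = get_table("t42") is not None
```

Since CREATE TABLE IF NOT EXISTS already knows the exact table name, the keyed lookup suffices.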
Re: Moving code to Hive NOW
I've moved the code. I'll be moving a lot of other code around over the next few days as I do what we discussed in https://issues.apache.org/jira/browse/HIVE-4198 so don't rebase your patches just yet. Alan. On Mar 26, 2013, at 3:14 PM, Alan Gates wrote: I am going to move the HCatalog code to Hive in the next few minutes. Please don't check anything into HCatalog until this is done. All patches will be invalidated by this move. I'll send an all clear when this is done. Alan.
[jira] [Updated] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
[ https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-4235: --- Description: CREATE TABLE IF NOT EXISTS uses an inefficient way to check whether the table exists. It uses Hive.java's getTablesByPattern(...), which involves a regular expression and eventually a database join. This is very inefficient. It can cause database lock time to increase and hurt db performance if a lot of such commands hit the database. The suggested approach is to use getTable(...) since we know the table name already was: CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists. It uses Hive.java's getTablesByPattern(...) to check if table exists. It involves regular expression and eventually database join. Very efficient. May cause database lock time increases and hurt db performance if a lot of such commands hit database. The suggested approach is to use getTable(...) since we know tablename already
[jira] [Work started] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
[ https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-4235 started by Gang Tim Liu.
[jira] [Updated] (HIVE-4179) NonBlockingOpDeDup does not merge SEL operators correctly
[ https://issues.apache.org/jira/browse/HIVE-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4179: - Fix Version/s: 0.11.0 NonBlockingOpDeDup does not merge SEL operators correctly - Key: HIVE-4179 URL: https://issues.apache.org/jira/browse/HIVE-4179 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.11.0 Attachments: HIVE-4179.1.patch, HIVE-4179.2.patch The input columns list for SEL operations isn't merged properly in the optimization. The best way to see this is running union_remove_22.q with -Dhadoop.mr.rev=23. The plan shows lost UDFs and a broken lineage for one column. Note: union_remove tests do not run on hadoop 1 or 0.20. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4179) NonBlockingOpDeDup does not merge SEL operators correctly
[ https://issues.apache.org/jira/browse/HIVE-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-4179: - Priority: Critical (was: Major) NonBlockingOpDeDup does not merge SEL operators correctly - Key: HIVE-4179 URL: https://issues.apache.org/jira/browse/HIVE-4179 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Critical Fix For: 0.11.0 Attachments: HIVE-4179.1.patch, HIVE-4179.2.patch The input columns list for SEL operations isn't merged properly in the optimization. The best way to see this is running union_remove_22.q with -Dhadoop.mr.rev=23. The plan shows lost UDFs and a broken lineage for one column. Note: union_remove tests do not run on hadoop 1 or 0.20. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
[ https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614649#comment-13614649 ] Gang Tim Liu commented on HIVE-4235: https://reviews.facebook.net/D9729 CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists Key: HIVE-4235 URL: https://issues.apache.org/jira/browse/HIVE-4235 Project: Hive Issue Type: Bug Components: JDBC, Query Processor, SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu Attachments: HIVE-4235.patch.1 CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists. It uses Hive.java's getTablesByPattern(...) to check if the table exists. It involves a regular expression and eventually a database join. Very inefficient. It can cause database lock time to increase and hurt db performance if a lot of such commands hit the database. The suggested approach is to use getTable(...) since we already know the table name -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
[ https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-4235: --- Attachment: HIVE-4235.patch.1 CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists Key: HIVE-4235 URL: https://issues.apache.org/jira/browse/HIVE-4235 Project: Hive Issue Type: Bug Components: JDBC, Query Processor, SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu Attachments: HIVE-4235.patch.1 CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists. It uses Hive.java's getTablesByPattern(...) to check if the table exists. It involves a regular expression and eventually a database join. Very inefficient. It can cause database lock time to increase and hurt db performance if a lot of such commands hit the database. The suggested approach is to use getTable(...) since we already know the table name -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
[ https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-4235: --- Status: Patch Available (was: In Progress) diff ready CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists Key: HIVE-4235 URL: https://issues.apache.org/jira/browse/HIVE-4235 Project: Hive Issue Type: Bug Components: JDBC, Query Processor, SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu Attachments: HIVE-4235.patch.1 CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists. It uses Hive.java's getTablesByPattern(...) to check if the table exists. It involves a regular expression and eventually a database join. Very inefficient. It can cause database lock time to increase and hurt db performance if a lot of such commands hit the database. The suggested approach is to use getTable(...) since we already know the table name -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4114) hive-metastore.jar depends on jdo2-api:jar:2.3-ec, which is missing in maven central
[ https://issues.apache.org/jira/browse/HIVE-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614667#comment-13614667 ] Konstantin Boudnik commented on HIVE-4114: -- Can we wrap it into a pom file and deploy to, perhaps, apache maven? hive-metastore.jar depends on jdo2-api:jar:2.3-ec, which is missing in maven central Key: HIVE-4114 URL: https://issues.apache.org/jira/browse/HIVE-4114 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Gopal V Priority: Trivial Adding hive-exec-0.10.0 to an independent pom.xml results in the following error {code} Failed to retrieve javax.jdo:jdo2-api-2.3-ec Caused by: Could not find artifact javax.jdo:jdo2-api:jar:2.3-ec in central (http://repo1.maven.org/maven2) ... Path to dependency: 1) org.notmysock.hive:plan-viewer:jar:1.0-SNAPSHOT 2) org.apache.hive:hive-exec:jar:0.10.0 3) org.apache.hive:hive-metastore:jar:0.10.0 4) javax.jdo:jdo2-api:jar:2.3-ec {code} From what I could tell, in the Hive build ant+ivy pulls this file from the DataNucleus repo http://www.datanucleus.org/downloads/maven2/javax/jdo/jdo2-api/2.3-ec/ For completeness' sake, the dependency needs to be pulled to maven central. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
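Until the artifact lands in central, one possible local workaround (an assumption on my part, not an official fix) is to declare the DataNucleus repository mentioned above in the consuming pom.xml so Maven can resolve javax.jdo:jdo2-api:2.3-ec:

```xml
<!-- Hypothetical consumer-side workaround: add the DataNucleus repository
     that the Hive ant+ivy build already pulls jdo2-api 2.3-ec from. -->
<repositories>
  <repository>
    <id>datanucleus</id>
    <url>http://www.datanucleus.org/downloads/maven2/</url>
  </repository>
</repositories>
```

This only unblocks individual builds; publishing the dependency to central, as the issue requests, is still the proper fix.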
[jira] [Commented] (HIVE-3381) Result of outer join is not valid
[ https://issues.apache.org/jira/browse/HIVE-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614737#comment-13614737 ] Navis commented on HIVE-3381: - Finally! Thanks to all. Result of outer join is not valid - Key: HIVE-3381 URL: https://issues.apache.org/jira/browse/HIVE-3381 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Priority: Critical Fix For: 0.11.0 Attachments: HIVE-3381.D5565.3.patch, HIVE-3381.D5565.4.patch, HIVE-3381.D5565.5.patch, HIVE-3381.D5565.6.patch, HIVE-3381.D5565.7.patch, mapjoin_testOuter.q Outer joins, especially full outer joins or outer joins with a filter in the ON clause, are not showing proper results. For example, the query in test join_1to1.q {code} SELECT * FROM join_1to1_1 a full outer join join_1to1_2 b on a.key1 = b.key1 and a.value = 66 and b.value = 66 ORDER BY a.key1 ASC, a.key2 ASC, a.value ASC, b.key1 ASC, b.key2 ASC, b.value ASC; {code} results {code} NULL NULLNULLNULLNULL66 NULL NULLNULLNULL10050 66 NULL NULLNULL10 10010 66 NULL NULLNULL30 10030 88 NULL NULLNULL35 10035 88 NULL NULLNULL40 10040 88 NULL NULLNULL40 10040 88 NULL NULLNULL50 10050 88 NULL NULLNULL50 10050 88 NULL NULLNULL50 10050 88 NULL NULLNULL70 10040 88 NULL NULLNULL70 10040 88 NULL NULLNULL70 10040 88 NULL NULLNULL70 10040 88 NULL NULL66 NULLNULLNULL NULL 10050 66 NULLNULLNULL 5 10005 66 5 10005 66 1510015 66 NULLNULLNULL 2010020 66 20 10020 66 2510025 88 NULLNULLNULL 3010030 66 NULLNULLNULL 3510035 88 NULLNULLNULL 4010040 66 NULLNULLNULL 4010040 66 40 10040 66 4010040 88 NULLNULLNULL 4010040 88 NULLNULLNULL 5010050 66 NULLNULLNULL 5010050 66 50 10050 66 5010050 66 50 10050 66 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 6010040 66 60 10040 66 6010040 66 60 10040 66 6010040 66 60 10040 66 6010040 66 60 10040 66 7010040 66 NULLNULLNULL 7010040 66 NULLNULLNULL 
7010040 66 NULLNULLNULL 7010040 66 NULLNULLNULL 8010040 88 NULLNULLNULL 8010040 88 NULLNULLNULL 8010040 88 NULLNULLNULL 8010040 88 NULLNULLNULL {code} but it seemed not right. This should be {code} NULL NULLNULLNULLNULL66 NULL NULLNULLNULL10050 66 NULL NULLNULL10 10010 66 NULL NULLNULL25 10025 66 NULL NULLNULL30 10030 88 NULL NULLNULL35 10035 88 NULL NULLNULL40 10040 88 NULL NULLNULL50 10050 88 NULL NULLNULL70 10040 88 NULL NULLNULL70 10040 88 NULL NULLNULL80 10040 66 NULL NULLNULL80 10040 66 NULL NULL66 NULLNULLNULL NULL 10050 66 NULLNULLNULL 5 10005 66 5 10005 66 1510015 66 NULLNULLNULL 2010020 66 20 10020 66 2510025 88 NULLNULLNULL 3010030 66 NULLNULLNULL 3510035 88 NULLNULLNULL 4010040 66 40 10040 66 4010040 88 NULLNULLNULL 5010050 66 50 10050 66 5010050 66 50 10050 66 5010050 88 NULLNULLNULL 5010050 88 NULLNULLNULL 6010040 66 60 10040 66 6010040 66 60 10040 66 6010040 66 60 10040 66 6010040 66 60 10040 66 7010040 66 NULLNULLNULL 7010040 66 NULLNULLNULL 8010040 88 NULLNULLNULL 8010040 88 NULLNULLNULL {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
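For reference, the expected full-outer-join semantics with filters in the ON clause can be sketched outside Hive: a row from either side that finds no qualifying match on the other side must still appear exactly once, NULL-padded. A minimal Python sketch (illustrative only, not Hive code), with rows as (key1, key2, value) tuples:

```python
# Conceptual full outer join with an arbitrary ON predicate.
# Unmatched rows survive on both sides, padded with None (SQL NULL).
def full_outer_join(left, right, on, lwidth=3, rwidth=3):
    out, matched_right = [], set()
    for l in left:
        hit = False
        for j, r in enumerate(right):
            if on(l, r):
                out.append(l + r)
                matched_right.add(j)
                hit = True
        if not hit:
            out.append(l + (None,) * rwidth)   # left row, NULL-padded
    for j, r in enumerate(right):
        if j not in matched_right:
            out.append((None,) * lwidth + r)   # unmatched right row
    return out

# Mirrors the ON clause of the query above:
# a.key1 = b.key1 and a.value = 66 and b.value = 66
on = lambda a, b: a[0] == b[0] and a[2] == 66 and b[2] == 66
```

A row like (30, 10030, 66) with no qualifying partner must come out as (30, 10030, 66, NULL, NULL, NULL) rather than being duplicated or dropped, which is the discrepancy the two result listings above are about.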
[jira] [Commented] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
[ https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614742#comment-13614742 ] Kevin Wilfong commented on HIVE-4235: - +1 CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists Key: HIVE-4235 URL: https://issues.apache.org/jira/browse/HIVE-4235 Project: Hive Issue Type: Bug Components: JDBC, Query Processor, SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu Attachments: HIVE-4235.patch.1 CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists. It uses Hive.java's getTablesByPattern(...) to check if the table exists. It involves a regular expression and eventually a database join. Very inefficient. It can cause database lock time to increase and hurt db performance if a lot of such commands hit the database. The suggested approach is to use getTable(...) since we already know the table name -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4179) NonBlockingOpDeDup does not merge SEL operators correctly
[ https://issues.apache.org/jira/browse/HIVE-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614786#comment-13614786 ] Navis commented on HIVE-4179: - I've taken a look at this. The root cause is in UnionProcessor, which does not copy the colExprMap of the parent SEL operator. After applying that fix, I've confirmed the result is valid. [~hagleitn] The patch you've provided is valid, but the missing colExprMap info can cause problems at any time. So I would prefer to revise it as suggested above. Could you do that? NonBlockingOpDeDup does not merge SEL operators correctly - Key: HIVE-4179 URL: https://issues.apache.org/jira/browse/HIVE-4179 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Critical Fix For: 0.11.0 Attachments: HIVE-4179.1.patch, HIVE-4179.2.patch The input columns list for SEL operations isn't merged properly in the optimization. The best way to see this is running union_remove_22.q with -Dhadoop.mr.rev=23. The plan shows lost UDFs and a broken lineage for one column. Note: union_remove tests do not run on hadoop 1 or 0.20. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4209) Cache evaluation result of deterministic expression and reuse it
[ https://issues.apache.org/jira/browse/HIVE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4209: -- Attachment: HIVE-4209.D9585.2.patch navis updated the revision HIVE-4209 [jira] Cache evaluation result of deterministic expression and reuse it. Fix NPE, running test Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D9585 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D9585?vs=30201&id=30531#toc AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeColumnEvaluator.java ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeConstantEvaluator.java ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluator.java ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorFactory.java ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluatorRef.java ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeNullEvaluator.java ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeFieldEvaluator.java ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java To: JIRA, navis Cache evaluation result of deterministic expression and reuse it Key: HIVE-4209 URL: https://issues.apache.org/jira/browse/HIVE-4209 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-4209.D9585.1.patch, HIVE-4209.D9585.2.patch For example, {noformat} select key from src where key + 1 > 100 AND key + 1 < 200 limit 3; {noformat} key + 1 need not be evaluated twice. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
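The idea in this issue can be sketched in a few lines (a conceptual Python sketch, not Hive's ExprNodeEvaluator API): when an expression is deterministic, its result for a given input row can be computed once and reused by every predicate that references it.

```python
# Conceptual sketch: cache the result of a deterministic expression per row.
class CachingEvaluator:
    def __init__(self, expr, deterministic=True):
        self.expr = expr
        self.deterministic = deterministic
        self.calls = 0            # counts actual evaluations, for illustration
        self._row = object()      # sentinel: no row evaluated yet
        self._value = None

    def evaluate(self, row):
        # Reuse the cached result when re-evaluating the same row.
        if self.deterministic and row is self._row:
            return self._value
        self.calls += 1
        self._value = self.expr(row)
        self._row = row
        return self._value
```

With this, both predicates in the example query share one evaluation: `key_plus_1 = CachingEvaluator(lambda row: row["key"] + 1)`, and checking `key_plus_1.evaluate(row) > 100` and `key_plus_1.evaluate(row) < 200` computes `key + 1` only once per row.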
[jira] [Updated] (HIVE-4233) The TGT gotten from class 'CLIService' should be renewed on time
[ https://issues.apache.org/jira/browse/HIVE-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dong updated HIVE-4233: --- Summary: The TGT gotten from class 'CLIService' should be renewed on time (was: The TGT gotten from class 'CLIService' should be renewed on time? ) The TGT gotten from class 'CLIService' should be renewed on time - Key: HIVE-4233 URL: https://issues.apache.org/jira/browse/HIVE-4233 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.10.0 Environment: CentOS release 6.3 (Final) jdk1.6.0_31 HiveServer2 0.10.0-cdh4.2.0 Kerberos Security Reporter: dong Priority: Critical When HiveServer2 has been running for more than 7 days and I use the beeline shell to connect to it, all operations fail. The HiveServer2 log shows this was caused by a Kerberos auth failure; the exception stack trace is: 2013-03-26 11:55:20,932 ERROR hive.ql.metadata.Hive: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1084) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.init(RetryingMetaStoreClient.java:51) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:61) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2140) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2151) at org.apache.hadoop.hive.ql.metadata.Hive.getDelegationToken(Hive.java:2275) at org.apache.hive.service.cli.CLIService.getDelegationTokenFromMetaStore(CLIService.java:358) at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:127) at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1073) at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1058) at 
org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:565) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.GeneratedConstructorAccessor52.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1082) ... 16 more Caused by: java.lang.IllegalStateException: This ticket is no longer valid at javax.security.auth.kerberos.KerberosTicket.toString(KerberosTicket.java:601) at java.lang.String.valueOf(String.java:2826) at java.lang.StringBuilder.append(StringBuilder.java:115) at sun.security.jgss.krb5.SubjectComber.findAux(SubjectComber.java:120) at sun.security.jgss.krb5.SubjectComber.find(SubjectComber.java:41) at sun.security.jgss.krb5.Krb5Util.getTicket(Krb5Util.java:130) at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:328) at java.security.AccessController.doPrivileged(Native Method) at sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:325) at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:128) at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:106) at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:172) at 
sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:209) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:195) at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:162) at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:175) at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) at
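The report suggests the server obtains its TGT once at startup and never refreshes it, so the ticket expires after the Kerberos lifetime. A conceptual sketch of the expected behavior follows (illustrative Python; the names Ticket and ensure_fresh are mine, not the Hive/Hadoop API — in Hadoop the analogous call would be UserGroupInformation.checkTGTAndReloginFromKeytab()):

```python
# Conceptual sketch: a long-lived server must refresh its Kerberos TGT
# before expiry instead of logging in only once at startup.
import time

class Ticket:
    def __init__(self, lifetime_s):
        self.expiry = time.time() + lifetime_s
    def remaining(self):
        return self.expiry - time.time()

def ensure_fresh(ticket, login, refresh_window_s=60):
    """Re-login (obtain a new TGT) once the ticket enters the refresh window."""
    if ticket.remaining() < refresh_window_s:
        return login()   # fetch a fresh TGT from the KDC
    return ticket        # current ticket is still good
```

A server would call something like ensure_fresh before each metastore connection (or from a background renewal thread), which is exactly what is missing in the CLIService path described above.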
Where to put hcatalog branches and site code
Right after I moved the hcat code to hive/trunk/hcatalog, Owen pointed out that the problem with this is that now everyone who checks out Hive pulls _all_ of the hcat code. This isn't what we want. For the site code, I propose we integrate it with Hive's site code; I'll put up a patch for this shortly. The branches we could either move into Hive's branches directory (renaming them to hcatalog-branch-0.x), or we could create /hive/hcatalog-historical and put them there. I'm fine with either. Thoughts? Alan.
[jira] [Updated] (HIVE-4171) Current database in metastore.Hive is not consistent with SessionState
[ https://issues.apache.org/jira/browse/HIVE-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4171: -- Attachment: HIVE-4171.D9399.2.patch navis updated the revision HIVE-4171 [jira] Current database in metastore.Hive is not consistent with SessionState. Should change context loader, too Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D9399 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D9399?vs=29805&id=30543#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java To: JIRA, navis Cc: prasadm Current database in metastore.Hive is not consistent with SessionState -- Key: HIVE-4171 URL: https://issues.apache.org/jira/browse/HIVE-4171 Project: Hive Issue Type: Bug Components: CLI Reporter: Navis Assignee: Navis Labels: HiveServer2 Attachments: HIVE-4171.D9399.1.patch, HIVE-4171.D9399.2.patch metastore.Hive is a thread-local instance, which can have a different status from SessionState. Currently the only status in metastore.Hive is the database name in use. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3464) Merging join tree may reorder joins which could be invalid
[ https://issues.apache.org/jira/browse/HIVE-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-3464: -- Attachment: HIVE-3464.D5409.4.patch navis updated the revision HIVE-3464 [jira] Merging join tree may reorder joins which could be invalid. Rebased to trunk Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D5409 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D5409?vs=23079&id=30549#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/test/queries/clientpositive/mergejoins_mixed.q ql/src/test/results/clientpositive/join_filters_overlap.q.out ql/src/test/results/clientpositive/mergejoins_mixed.q.out To: JIRA, navis Cc: njain Merging join tree may reorder joins which could be invalid -- Key: HIVE-3464 URL: https://issues.apache.org/jira/browse/HIVE-3464 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Attachments: HIVE-3464.D5409.2.patch, HIVE-3464.D5409.3.patch, HIVE-3464.D5409.4.patch Currently, Hive merges the join tree from right to left regardless of join types, which may reorder joins. For example, select * from a join a b on a.key=b.key join a c on b.key=c.key join a d on a.key=d.key; Hive tries to merge the join tree in a-d=b-d, a-d=a-b, b-c=a-b order, and a-d=a-b and b-c=a-b will be merged. The final join tree is a-(bdc). With this, the ab-d join will be executed prior to ab-c. But if the join types of -c and -d are different, this is not valid. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira