[jira] [Commented] (HIVE-3652) Join optimization for star schema
[ https://issues.apache.org/jira/browse/HIVE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490506#comment-13490506 ]

Amareshwari Sriramadasu commented on HIVE-3652:

If we have a series of MapJoinOperators, and the operator tree has a MapJoin followed by a MapJoin, then we can run all the map joins in a single query. Some of this is already solved by HIVE-1246. I think we need to make some more changes to also accept queries of the following forms:

    select /*+ MAPJOIN(b,c) */ from FACT a join DIM1 b on a.k1=b.k1 JOIN DIM2 c on a.k2=c.k2

or

    select /*+ MAPJOIN(b,c) */ from FACT a join DIM1 b on a.k1=b.k1 JOIN DIM2 c on b.k2=c.k2

Join optimization for star schema

Key: HIVE-3652
URL: https://issues.apache.org/jira/browse/HIVE-3652
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu

Currently, if we join one fact table with multiple dimension tables, the query results in a separate mapreduce job for each join with a dimension table, because the join is on a different key for each dimension. Usually all the dimension tables are small and fit into memory, so a map-side join can be used to join them with the fact table. In this issue I want to look at optimizing such a query to generate a single mapreduce job, so that the mapper loads the dimension tables into memory and joins them with the fact table on different keys as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
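The idea in the issue description (one mapper loads every dimension table into memory and joins each fact row on a different key) can be illustrated with a minimal Python sketch. This is not Hive code; the function names, the dictionary-based rows, and the assumption that each dimension key is unique are all hypothetical and only serve the illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]


def build_hash_table(rows: Iterable[Row], key: str) -> Dict[object, Row]:
    """Index a small dimension table by its join key (assumes unique keys)."""
    return {row[key]: row for row in rows}


def star_map_join(
    fact_rows: Iterable[Row], dim1: List[Row], dim2: List[Row]
) -> Iterator[Tuple[Row, Row, Row]]:
    """Join fact with dim1 on k1 and with dim2 on k2 in one pass over fact."""
    dim1_by_k1 = build_hash_table(dim1, "k1")  # loaded once per mapper
    dim2_by_k2 = build_hash_table(dim2, "k2")
    for fact in fact_rows:  # single scan of the large table
        b = dim1_by_k1.get(fact["k1"])
        c = dim2_by_k2.get(fact["k2"])
        if b is not None and c is not None:  # inner-join semantics
            yield fact, b, c
```

Because both lookups happen while the fact row is in hand, no shuffle is needed between the two joins, which is the reason a single map-only pass is enough in this simplified model.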
[jira] [Commented] (HIVE-3652) Join optimization for star schema
[ https://issues.apache.org/jira/browse/HIVE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490513#comment-13490513 ]

Amareshwari Sriramadasu commented on HIVE-3652:

bq. If we have a series of MapJoinOperators, and the operator tree has a MapJoin followed by a MapJoin, then we can run all the map joins in a single query. Some of this is already solved by HIVE-1246.

Sorry, I was too quick here. HIVE-1246 solves the case where all join keys are the same. If we have a MapJoin followed by a MapJoin, can we make the second operator a child of the first instead of putting a sink in between? I'm thinking that should just work for the cases of joining on different keys. Let me know if I'm wrong.
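To make the "second operator as a child of the first" idea concrete, here is a small Python sketch of an operator tree in which each map join forwards its output rows directly to its child. The class names mirror the terms used in the comment but are hypothetical; this is not Hive's operator API, and it assumes unique join keys in the small tables.

```python
class Operator:
    """Minimal operator: processes a row and forwards results to children."""

    def __init__(self):
        self.children = []

    def add_child(self, op):
        self.children.append(op)
        return op

    def forward(self, row):
        for child in self.children:
            child.process(row)


class MapJoinOperator(Operator):
    """Joins incoming rows against an in-memory table on a single key."""

    def __init__(self, small_table, key):
        super().__init__()
        self.key = key
        self.lookup = {r[key]: r for r in small_table}  # assumes unique keys

    def process(self, row):
        match = self.lookup.get(row.get(self.key))
        if match is not None:
            self.forward({**row, **match})  # no sink between the two joins


class CollectOperator(Operator):
    """Terminal operator that records the rows it receives."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def process(self, row):
        self.rows.append(row)


def run_chain(fact, dim1, dim2):
    """fact JOIN dim1 ON k1, then JOIN dim2 ON k2, in one operator chain."""
    first = MapJoinOperator(dim1, "k1")
    second = first.add_child(MapJoinOperator(dim2, "k2"))
    sink = second.add_child(CollectOperator())
    for row in fact:
        first.process(row)
    return sink.rows
```

In this sketch the second join key may come either from the fact row or from the row produced by the first join, so both query forms quoted in the thread (a.k2=c.k2 and b.k2=c.k2) go through the same chain.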
[jira] [Commented] (HIVE-3652) Join optimization for star schema
[ https://issues.apache.org/jira/browse/HIVE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490521#comment-13490521 ]

Amareshwari Sriramadasu commented on HIVE-3652:

bq. select /*+ MAPJOIN(b,c) */ from FACT a join DIM1 b on a.k1=b.k1 JOIN DIM2 c on a.k2=c.k2

I modified the above query to be the following (with a subquery):

    SELECT /*+ MAPJOIN(dim2) */ subq.m1, subq.m2
    FROM (
      SELECT /*+ MAPJOIN(dim1) */ m1, m2, k2
      FROM fact JOIN dim1 ON (fact.k1 = dim1.k1)
    ) subq
    JOIN dim2 ON (subq.k2 = dim2.k2);

And it is already launching a single map reduce job for both the joins.
[jira] [Commented] (HIVE-3652) Join optimization for star schema
[ https://issues.apache.org/jira/browse/HIVE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490583#comment-13490583 ]

Namit Jain commented on HIVE-3652:

I think this will launch 2 map-only jobs in the presence of auto join conversion. We do want to support auto join conversion.
[jira] [Commented] (HIVE-3652) Join optimization for star schema
[ https://issues.apache.org/jira/browse/HIVE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490587#comment-13490587 ]

Amareshwari Sriramadasu commented on HIVE-3652:

Namit, are you saying that enhancing MapJoinProcessor to use a single map for a MapJoin followed by a MapJoin (as in HIVE-1246) may not work with auto join conversion? I thought it does not matter to the MapJoinProcessor optimizer whether the map join comes from a hint or from auto join conversion. Correct me if I'm wrong.
Re: [DISCUSS] HCatalog becoming a subproject of Hive
+1 for that. I like it.

2012/11/5 Alexander Lorenz wget.n...@gmail.com:

Like it too. +1

Thanks,
Alex

On Nov 5, 2012, at 5:35 AM, Namit Jain nj...@fb.com wrote:

I like the idea of HCatalog becoming a Hive sub-project. The enhancements/bugs in the serde/metastore areas can indirectly benefit the Hive community, and it will be easier for the fix to be in one place. Having said that, I don't see serde/metastore moving out of Hive into a separate component. Things are tied too closely together. I am assuming that no new committers would be automatically added to Hive as part of this, and both Hive and HCatalog will continue to have their own committers.

Thanks,
-namit

On 11/3/12 2:22 AM, Alan Gates ga...@hortonworks.com wrote:

Hello Hive community. It is time for HCatalog to graduate from the Apache Incubator. Given the heavy dependence of HCatalog on Hive, the HCatalog community agreed it made sense to explore graduating from the Incubator to become a subproject of Hive (see http://mail-archives.apache.org/mod_mbox/incubator-hcatalog-user/201209.mbox/%3C08C40723-8D4D-48EB-942B-8EE4327DD84A%40hortonworks.com%3E and http://mail-archives.apache.org/mod_mbox/incubator-hcatalog-user/201210.mbox/%3CCABN7xTCRM5wXGgJKEko0PmqDXhuAYpK%2BD-H57T29zcSGhkwGQw%40mail.gmail.com%3E ).

To help both communities understand what HCatalog is and hopes to become, we also developed a roadmap that summarizes HCatalog's current features, planned features, and other possible features under discussion: https://cwiki.apache.org/confluence/display/HCATALOG/HCatalog+Roadmap

So we are now approaching you to see if there is agreement in the Hive community that HCatalog graduating into Hive would make sense.

Alan.

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF
[jira] [Created] (HIVE-3667) Umbrella jira for Correlation Optimizer
Yin Huai created HIVE-3667:

Summary: Umbrella jira for Correlation Optimizer
Key: HIVE-3667
URL: https://issues.apache.org/jira/browse/HIVE-3667
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Yin Huai
Assignee: Yin Huai

This is an umbrella jira to track work related to Correlation Optimizer (originally proposed in HIVE-2206).
[jira] [Created] (HIVE-3668) Merge MapReduce jobs which share input tables and the same partitioning keys into a single MapReduce job
Yin Huai created HIVE-3668:

Summary: Merge MapReduce jobs which share input tables and the same partitioning keys into a single MapReduce job
Key: HIVE-3668
URL: https://issues.apache.org/jira/browse/HIVE-3668
Project: Hive
Issue Type: Sub-task
Components: Query Processor
Reporter: Yin Huai
Assignee: Yin Huai

In the plan of a query, if there exist MapReduce jobs created for aggregation operators or join operators which share one or more input tables and use the same partitioning keys to shuffle the data, these jobs can be merged into a single MapReduce job. Shared tables will be scanned only once.
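As a rough illustration of the merge described in HIVE-3668, the Python sketch below runs two aggregations that read the same table and shuffle on the same key as one scan and one grouping step. The function name and the "amount" column are hypothetical, and the in-memory dictionary only stands in for the MapReduce shuffle.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Row = Dict[str, object]


def merged_job(rows: Iterable[Row], key: str) -> Tuple[dict, dict]:
    """Compute a per-key count and a per-key sum from one scan of the input."""
    groups = defaultdict(list)
    for row in rows:  # the shared table is scanned only once
        groups[row[key]].append(row)  # one "shuffle" on the shared key
    # Work of the first original job: row count per key.
    counts = {k: len(v) for k, v in groups.items()}
    # Work of the second original job: sum of a (hypothetical) column per key.
    totals = {k: sum(r["amount"] for r in v) for k, v in groups.items()}
    return counts, totals
```

Run as two separate jobs, the same work would read the table twice and shuffle it twice; sharing the partitioning key is what lets both reducers' logic consume the same groups.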
[jira] [Created] (HIVE-3669) Support queries in which input tables of correlated MR jobs involves intermediate tables
Yin Huai created HIVE-3669:

Summary: Support queries in which input tables of correlated MR jobs involves intermediate tables
Key: HIVE-3669
URL: https://issues.apache.org/jira/browse/HIVE-3669
Project: Hive
Issue Type: Sub-task
Reporter: Yin Huai

The correlation optimizer implemented in HIVE-2206 does not optimize correlated MapReduce jobs which have intermediate tables as input. Here is an example originally posted in HIVE-3430:

{code:sql}
select *
from (
  select c.value, count(1) as cnt
  from (
    select b.key, b.value
    from (
      select key, length(value) from T1 where ds = '1'
    ) a
    join T2 b on b.ds = '1' and a.key = b.key
  ) c
  group by c.value
) d
join (
  select value, count(1) as cnt
  from T2 c
  where c.ds = '1'
  group by value
) e
on d.value = e.value;
{code}

Since the correlated MapReduce jobs (those that use value as the partitioning key) involve an intermediate table c, the implementation of HIVE-2206 does not optimize this query.
[jira] [Updated] (HIVE-3669) Support queries in which input tables of correlated MR jobs involves intermediate tables
[ https://issues.apache.org/jira/browse/HIVE-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai updated HIVE-3669:

Assignee: Yin Huai

Support queries in which input tables of correlated MR jobs involves intermediate tables

Key: HIVE-3669
URL: https://issues.apache.org/jira/browse/HIVE-3669
Project: Hive
Issue Type: Sub-task
Components: Query Processor
Reporter: Yin Huai
Assignee: Yin Huai
[jira] [Updated] (HIVE-3670) Optimize queries involving self join
[ https://issues.apache.org/jira/browse/HIVE-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai updated HIVE-3670:

Assignee: Yin Huai

Optimize queries involving self join

Key: HIVE-3670
URL: https://issues.apache.org/jira/browse/HIVE-3670
Project: Hive
Issue Type: Sub-task
Components: Query Processor
Reporter: Yin Huai
Assignee: Yin Huai
[jira] [Created] (HIVE-3670) Optimize queries involving self join
Yin Huai created HIVE-3670:

Summary: Optimize queries involving self join
Key: HIVE-3670
URL: https://issues.apache.org/jira/browse/HIVE-3670
Project: Hive
Issue Type: Sub-task
Components: Query Processor
Reporter: Yin Huai
[jira] [Created] (HIVE-3671) If a query has been optimized by correlation optimizer, join auto convert cannot optimize it
Yin Huai created HIVE-3671:

Summary: If a query has been optimized by correlation optimizer, join auto convert cannot optimize it
Key: HIVE-3671
URL: https://issues.apache.org/jira/browse/HIVE-3671
Project: Hive
Issue Type: Sub-task
Components: Query Processor
Reporter: Yin Huai
Assignee: Yin Huai

Currently, if a query has been optimized by the correlation optimizer, the first operator on the reduce side will be CorrelationReducerDispatchOperator. Since CommonJoinResolver only works if the first operator on the reduce side is a join operator, join auto convert cannot optimize this query. This jira is used to discuss how to make the correlation optimizer compatible with join auto convert.
[jira] [Commented] (HIVE-2206) add a new optimizer for query correlation discovery and optimization
[ https://issues.apache.org/jira/browse/HIVE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490681#comment-13490681 ]

Yin Huai commented on HIVE-2206:

[~namit] Sure. I created the umbrella jira (HIVE-3667) for all work related to the correlation optimizer and also created several follow-up jiras as sub-tasks. You can also add other sub-tasks to that jira.

add a new optimizer for query correlation discovery and optimization

Key: HIVE-2206
URL: https://issues.apache.org/jira/browse/HIVE-2206
Project: Hive
Issue Type: New Feature
Components: Query Processor
Affects Versions: 0.10.0
Reporter: He Yongqiang
Assignee: Yin Huai
Attachments: HIVE-2206.10-r1384442.patch.txt, HIVE-2206.11-r1385084.patch.txt, HIVE-2206.12-r1386996.patch.txt, HIVE-2206.13-r1389072.patch.txt, HIVE-2206.14-r1389704.patch.txt, HIVE-2206.15-r1392491.patch.txt, HIVE-2206.16-r1399936.patch.txt, HIVE-2206.17-r1404933.patch.txt, HIVE-2206.1.patch.txt, HIVE-2206.2.patch.txt, HIVE-2206.3.patch.txt, HIVE-2206.4.patch.txt, HIVE-2206.5-1.patch.txt, HIVE-2206.5.patch.txt, HIVE-2206.6.patch.txt, HIVE-2206.7.patch.txt, HIVE-2206.8.r1224646.patch.txt, HIVE-2206.8-r1237253.patch.txt, testQueries.2.q, YSmartPatchForHive.patch

This issue proposes a new logical optimizer called Correlation Optimizer, which is used to merge correlated MapReduce jobs (MR jobs) into a single MR job. The idea is based on YSmart (http://ysmart.cse.ohio-state.edu/). The paper and slides of YSmart are linked at the bottom.

Since Hive translates queries in a sentence-by-sentence fashion, it generates a MapReduce job for every operation which may need to shuffle the data (e.g. join and aggregation operations). However, such operations may involve the correlations explained below and can then be executed in a single MR job.

# Input Correlation: Multiple MR jobs have input correlation (IC) if their input relation sets are not disjoint;
# Transit Correlation: Multiple MR jobs have transit correlation (TC) if they have not only input correlation, but also the same partition key;
# Job Flow Correlation: An MR job has job flow correlation (JFC) with one of its child nodes if it has the same partition key as that child node.

The current implementation of the correlation optimizer only detects correlations among MR jobs for reduce-side join operators and reduce-side aggregation operators (not map-only aggregation). A query will be optimized if it satisfies the following conditions.

# There exists an MR job for a reduce-side join operator or reduce-side aggregation operator which has JFC with all of its parent MR jobs (TCs will also be exploited if JFC exists);
# All input tables of those correlated MR jobs are original input tables (not intermediate tables generated by sub-queries); and
# No self join is involved in those correlated MR jobs.

Correlation optimizer is implemented as a logical optimizer. The main reasons are that it only needs to manipulate the query plan tree and that it can leverage the existing component for generating MR jobs. The current implementation can serve as a framework for correlation-related optimizations. I think that it is better than adding individual optimizers.

Several pieces of work can be done in the future to improve this optimizer. Here are three examples.

# Support queries that only involve TC;
# Support queries in which input tables of correlated MR jobs involve intermediate tables; and
# Optimize queries involving self join.

References: Paper and presentation of YSmart.
Paper: http://www.cse.ohio-state.edu/hpcs/WWW/HTML/publications/papers/TR-11-7.pdf
Slides: http://sdrv.ms/UpwJJc
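The three correlation definitions in the HIVE-2206 description can be written down as simple predicates. The Python sketch below is only an illustration under stated assumptions: the Job class and function names are hypothetical, a job's inputs hold either table names or the names of the jobs whose output it reads, and the description's "child node" is read as such an upstream job. The sample jobs loosely follow the query quoted in HIVE-3669.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Job:
    """Hypothetical stand-in for one MapReduce job in a query plan."""
    name: str
    inputs: FrozenSet[str]          # table names or names of upstream jobs
    partition_key: Tuple[str, ...]  # columns used to shuffle the data


def has_input_correlation(a: Job, b: Job) -> bool:
    """IC: the input relation sets of the two jobs are not disjoint."""
    return not a.inputs.isdisjoint(b.inputs)


def has_transit_correlation(a: Job, b: Job) -> bool:
    """TC: input correlation plus the same partition key."""
    return has_input_correlation(a, b) and a.partition_key == b.partition_key


def has_job_flow_correlation(job: Job, upstream: Job) -> bool:
    """JFC: job reads upstream's output and both use the same partition key."""
    return upstream.name in job.inputs and job.partition_key == upstream.partition_key


# Jobs loosely modelled on the HIVE-3669 example query.
join_t1_t2 = Job("join_t1_t2", frozenset({"T1", "T2"}), ("key",))
agg_c = Job("agg_c", frozenset({"join_t1_t2"}), ("value",))
agg_t2 = Job("agg_t2", frozenset({"T2"}), ("value",))
final_join = Job("final_join", frozenset({"agg_c", "agg_t2"}), ("value",))
```

In this model the final join has JFC with both aggregation jobs because all three shuffle on value, while the first join and the aggregation over T2 share an input table but not a partition key, so they have IC without TC.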
Re: [DISCUSS] HCatalog becoming a subproject of Hive
That is a good idea. I would really like to see HCatalog become a sub-project and then have Hive remove its metastore code in favour of HCatalog. How should we coordinate this?

Edward

On Mon, Nov 5, 2012 at 6:41 AM, Clark Yang (杨卓荦) clarkyzl-h...@yahoo.com.cn wrote:

+1 for that. I like it.
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #189
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/

[...truncated 5328 lines...]
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/shims
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/shims/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/shims/test
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/shims/test/src
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/shims/test/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/shims/test/resources
    [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/shims/src/test/resources does not exist.

init:
    [echo] Project: shims

create-dirs:
    [echo] Project: common
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/common
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/common/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/common/test
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/common/test/src
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/common/test/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/common/test/resources
    [copy] Copying 2 files to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/common/test/resources

init:
    [echo] Project: common

create-dirs:
    [echo] Project: serde
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/serde
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/serde/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/serde/test
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/serde/test/src
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/serde/test/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/serde/test/resources
    [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/serde/src/test/resources does not exist.

init:
    [echo] Project: serde

create-dirs:
    [echo] Project: metastore
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/metastore
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/metastore/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/metastore/test
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/metastore/test/src
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/metastore/test/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/metastore/test/resources
    [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/metastore/src/test/resources does not exist.

init:
    [echo] Project: metastore

create-dirs:
    [echo] Project: ql
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/ql
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/ql/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/ql/test
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/ql/test/src
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/ql/test/classes
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/ql/test/resources
    [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/ql/src/test/resources does not exist.

init:
    [echo] Project: ql

create-dirs:
    [echo] Project: contrib
    [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/189/artifact/hive/build/contrib
    [mkdir] Created dir:
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #189
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/189/

[...truncated 10093 lines...]
    [echo] Project: odbc
    [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/odbc/src/conf does not exist.

ivy-resolve-test:
    [echo] Project: odbc

ivy-retrieve-test:
    [echo] Project: odbc

compile-test:
    [echo] Project: odbc

create-dirs:
    [echo] Project: serde
    [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/serde/src/test/resources does not exist.

init:
    [echo] Project: serde

ivy-init-settings:
    [echo] Project: serde

ivy-resolve:
    [echo] Project: serde
    [ivy:resolve] :: loading settings :: file = https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml
    [ivy:report] Processing https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/189/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-serde-default.xml to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/189/artifact/hive/build/ivy/report/org.apache.hive-hive-serde-default.html

ivy-retrieve:
    [echo] Project: serde

dynamic-serde:

compile:
    [echo] Project: serde

ivy-resolve-test:
    [echo] Project: serde

ivy-retrieve-test:
    [echo] Project: serde

compile-test:
    [echo] Project: serde
    [javac] Compiling 26 source files to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/189/artifact/hive/build/serde/test/classes
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.

create-dirs:
    [echo] Project: service
    [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/service/src/test/resources does not exist.

init:
    [echo] Project: service

ivy-init-settings:
    [echo] Project: service

ivy-resolve:
    [echo] Project: service
    [ivy:resolve] :: loading settings :: file = https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml
    [ivy:report] Processing https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/189/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-service-default.xml to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/189/artifact/hive/build/ivy/report/org.apache.hive-hive-service-default.html

ivy-retrieve:
    [echo] Project: service

compile:
    [echo] Project: service

ivy-resolve-test:
    [echo] Project: service

ivy-retrieve-test:
    [echo] Project: service

compile-test:
    [echo] Project: service
    [javac] Compiling 2 source files to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/189/artifact/hive/build/service/test/classes

test:
    [echo] Project: hive

test-shims:
    [echo] Project: hive

test-conditions:
    [echo] Project: shims

gen-test:
    [echo] Project: shims

create-dirs:
    [echo] Project: shims
    [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/test/resources does not exist.

init:
    [echo] Project: shims

ivy-init-settings:
    [echo] Project: shims

ivy-resolve:
    [echo] Project: shims
    [ivy:resolve] :: loading settings :: file = https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml
    [ivy:report] Processing https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/189/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-shims-default.xml to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/189/artifact/hive/build/ivy/report/org.apache.hive-hive-shims-default.html

ivy-retrieve:
    [echo] Project: shims

compile:
    [echo] Project: shims
    [echo] Building shims 0.20

build_shims:
    [echo] Project: shims
    [echo] Compiling https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/common/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.20/java against hadoop 0.20.2 (https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/189/artifact/hive/build/hadoopcore/hadoop-0.20.2)

ivy-init-settings:
    [echo] Project: shims

ivy-resolve-hadoop-shim:
    [echo] Project: shims
    [ivy:resolve] :: loading settings :: file = https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml

ivy-retrieve-hadoop-shim:
    [echo] Project: shims
    [echo] Building shims 0.20S

build_shims:
    [echo] Project: shims
    [echo] Compiling
[jira] [Commented] (HIVE-3652) Join optimization for star schema
[ https://issues.apache.org/jira/browse/HIVE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490750#comment-13490750 ]

Namit Jain commented on HIVE-3652:

Yes, it will not work automatically. You need to change the auto join conversion logic, where you will create a task tree as a backup task for multiple mapjoins. It is doable, and maybe the cleanest way of fitting it into the current architecture.
[jira] [Commented] (HIVE-1362) column level statistics
[ https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490814#comment-13490814 ] Shreepadma Venugopalan commented on HIVE-1362: -- @Namit: I responded to your comment about Exceptions on phabricator. Thanks. column level statistics --- Key: HIVE-1362 URL: https://issues.apache.org/jira/browse/HIVE-1362 Project: Hive Issue Type: Sub-task Components: Statistics Reporter: Ning Zhang Assignee: Shreepadma Venugopalan Attachments: HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, HIVE-1362.9.patch.txt, HIVE-1362.D6339.1.patch, HIVE-1362-gen_thrift.1.patch.txt, HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, HIVE-1362_gen-thrift.8.patch.txt, HIVE-1362_gen-thrift.9.patch.txt
[jira] [Commented] (HIVE-3621) Make prompt in Hive CLI configurable
[ https://issues.apache.org/jira/browse/HIVE-3621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490852#comment-13490852 ] Jingwei Lu commented on HIVE-3621: -- I do explicitly state that hiveconf can be used in the new configuration. If it is not clear, your suggestions are welcome. Make prompt in Hive CLI configurable Key: HIVE-3621 URL: https://issues.apache.org/jira/browse/HIVE-3621 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.9.0 Reporter: Jingwei Lu Assignee: Jingwei Lu Priority: Minor Labels: newbie, patch Fix For: 0.10.0 Attachments: HIVE-3621.patch.1.txt Original Estimate: 48h Remaining Estimate: 48h Right now the Hive CLI prompt just says hive. For users (primarily power users) who run in different clusters, it can be easy to forget which cluster your Hive CLI is pointing to. If we change the Hive CLI prompt to be something like hive(silver), it would be much clearer. We could potentially extend this to namespaces as well.
[jira] [Commented] (HIVE-3621) Make prompt in Hive CLI configurable
[ https://issues.apache.org/jira/browse/HIVE-3621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490856#comment-13490856 ] Mark Grover commented on HIVE-3621: --- Never mind, I missed it. Looks good to me. Thanks! Make prompt in Hive CLI configurable Key: HIVE-3621 URL: https://issues.apache.org/jira/browse/HIVE-3621 Project: Hive Issue Type: Improvement Components: CLI Affects Versions: 0.9.0 Reporter: Jingwei Lu Assignee: Jingwei Lu Priority: Minor Labels: newbie, patch Fix For: 0.10.0 Attachments: HIVE-3621.patch.1.txt Original Estimate: 48h Remaining Estimate: 48h Right now the Hive CLI prompt just says hive. For users (primarily power users) who run in different clusters, it can be easy to forget which cluster your Hive CLI is pointing to. If we change the Hive CLI prompt to be something like hive(silver), it would be much clearer. We could potentially extend this to namespaces as well.
[jira] [Commented] (HIVE-1868) ALTER TABLE ADD PARTITION can't handle partition key names that conflict with HQL keywords
[ https://issues.apache.org/jira/browse/HIVE-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490901#comment-13490901 ] Tamir Duberstein commented on HIVE-1868: 2 years - what's going on with this? ALTER TABLE ADD PARTITION can't handle partition key names that conflict with HQL keywords -- Key: HIVE-1868 URL: https://issues.apache.org/jira/browse/HIVE-1868 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Carl Steinbach
{code}
hive> CREATE TABLE foo (a INT) PARTITIONED BY (`date` STRING);
OK
hive> DESCRIBE EXTENDED foo;
OK
a int
date string
Detailed Table Information Table(tableName:foo, dbName:default, owner:carl, createTime:1293508187, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols: [FieldSchema(name:a, type:int, comment:null), FieldSchema(name:date, type:string, comment:null)], location:file:/user/hive/warehouse/foo, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[FieldSchema(name:date, type:string, comment:null)], parameters:{transient_lastDdlTime=1293508187}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)
Time taken: 0.107 seconds
hive> ALTER TABLE foo ADD PARTITION (date='Feb 25 1980');
FAILED: Parse Error: line 1:31 mismatched input 'date' expecting Identifier in add partition statement
10/12/27 19:50:13 ERROR ql.Driver: FAILED: Parse Error: line 1:31 mismatched input 'date' expecting Identifier in add partition statement
org.apache.hadoop.hive.ql.parse.ParseException: line 1:31 mismatched input 'date' expecting Identifier in add partition statement
 at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:406)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:320)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:686)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:161)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:235)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:450)
hive> ALTER TABLE foo ADD PARTITION (`date`='Feb 25 1980');
ALTER TABLE foo ADD PARTITION (`date`='Feb 25 1980');
10/12/27 19:50:22 INFO parse.ParseDriver: Parsing command: ALTER TABLE foo ADD PARTITION (`date`='Feb 25 1980')
10/12/27 19:50:22 INFO parse.ParseDriver: Parse Completed
10/12/27 19:50:22 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=foo
10/12/27 19:50:22 INFO hive.log: DDL: struct foo { i32 a}
10/12/27 19:50:22 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=foo
10/12/27 19:50:22 INFO hive.log: DDL: struct foo { i32 a}
10/12/27 19:50:22 INFO ql.Driver: Semantic Analysis Completed
10/12/27 19:50:22 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
10/12/27 19:50:22 INFO ql.Driver: Starting command: ALTER TABLE foo ADD PARTITION (`date`='Feb 25 1980')
10/12/27 19:50:22 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=foo
10/12/27 19:50:22 INFO hive.log: DDL: struct foo { i32 a}
FAILED: Error in metadata: date not found in table's partition spec: {`date`=Feb 25 1980}
10/12/27 19:50:22 ERROR exec.DDLTask: FAILED: Error in metadata: date not found in table's partition spec: {`date`=Feb 25 1980}
org.apache.hadoop.hive.ql.metadata.HiveException: date not found in table's partition spec: {`date`=Feb 25 1980}
 at org.apache.hadoop.hive.ql.metadata.Table.isValidSpec(Table.java:348)
 at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1165)
 at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1147)
 at org.apache.hadoop.hive.ql.exec.DDLTask.addPartition(DDLTask.java:395)
 at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:230)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:988)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:825)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:698)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:161)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:235)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:450)
FAILED: Execution Error,
[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data
[ https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490912#comment-13490912 ] Mithun Radhakrishnan commented on HIVE-895: --- Jakob, a quick question: The JIRA indicates that the AvroSerde is available in 0.9.1 and in 0.10.0. (And that's in sync with https://cwiki.apache.org/confluence/display/Hive/AvroSerDe+-+working+with+Avro+from+Hive). But it looks like the branch-0.9/ doesn't have this code. Are there plans to port this over? (Thanks for writing this, by the way. :]) Add SerDe for Avro serialized data -- Key: HIVE-895 URL: https://issues.apache.org/jira/browse/HIVE-895 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.9.0 Reporter: Jeff Hammerbacher Assignee: Jakob Homan Fix For: 0.10.0, 0.9.1 Attachments: doctors.avro, episodes.avro, HIVE-895-draft.patch, HIVE-895.patch, hive-895.patch.1.txt As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro data seems like a solid win.
Hive-trunk-h0.21 - Build # 1774 - Failure
Changes for Build #1764
[kevinwilfong] HIVE-3610. Add a command Explain dependency ... (Sambavi Muthukrishnan via kevinwilfong)

Changes for Build #1765

Changes for Build #1766
[hashutosh] HIVE-3441 : testcases escape1,escape2 fail on windows (Thejas Nair via Ashutosh Chauhan)
[kevinwilfong] HIVE-3499. add tests to use bucketing metadata for partitions. (njain via kevinwilfong)

Changes for Build #1767
[kevinwilfong] HIVE-3276. optimize union sub-queries. (njain via kevinwilfong)

Changes for Build #1768

Changes for Build #1769

Changes for Build #1770
[namit] HIVE-3570 Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr (Satadru Pan via namit)
[namit] HIVE-3554 Hive List Bucketing - Query logic (Gang Tim Liu via namit)
[cws] HIVE-3563. Drop database cascade fails when there are indexes on any tables (Prasad Mujumdar via cws)

Changes for Build #1771
[kevinwilfong] HIVE-3640. Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks are both set. (Vighnesh Avadhani via kevinwilfong)

Changes for Build #1772

Changes for Build #1773

Changes for Build #1774

7 tests failed.

REGRESSION: org.apache.hadoop.hive.metastore.TestMetaStoreEventListener.testListener
Error Message:
java.net.SocketException: Broken pipe
Stack Trace:
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe
 at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147)
 at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:163)
 at org.apache.thrift.protocol.TBinaryProtocol.writeString(TBinaryProtocol.java:186)
 at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:92)
 at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_database(ThriftHiveMetastore.java:372)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:364)
 at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:713)
 at org.apache.hadoop.hive.metastore.TestMetaStoreEventListener.testListener(TestMetaStoreEventListener.java:191)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at junit.framework.TestCase.runTest(TestCase.java:168)
 at junit.framework.TestCase.runBare(TestCase.java:134)
 at junit.framework.TestResult$1.protect(TestResult.java:110)
 at junit.framework.TestResult.runProtected(TestResult.java:128)
 at junit.framework.TestResult.run(TestResult.java:113)
 at junit.framework.TestCase.run(TestCase.java:124)
 at junit.framework.TestSuite.runTest(TestSuite.java:232)
 at junit.framework.TestSuite.run(TestSuite.java:227)
 at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
 at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
 at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)
Caused by: java.net.SocketException: Broken pipe
 at java.net.SocketOutputStream.socketWrite0(Native Method)
 at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
 at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
 at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:145)
 ... 23 more

FAILED: org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1
Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
Stack Trace:
junit.framework.AssertionFailedError: Unexpected exception See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs.
 at junit.framework.Assert.fail(Assert.java:47)
 at org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1(TestParse.java:227)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at junit.framework.TestCase.runTest(TestCase.java:168)
 at junit.framework.TestCase.runBare(TestCase.java:134)
 at junit.framework.TestResult$1.protect(TestResult.java:110)
 at junit.framework.TestResult.runProtected(TestResult.java:128)
 at
[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data
[ https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490998#comment-13490998 ] Jakob Homan commented on HIVE-895: -- The fixed version is 0.9.1, but at least in the JIRA, I don't see it having been committed anywhere else. You can check that branch for the 895 commit, but I don't think it was. I don't plan to do any porting. You're welcome to try it or just go ahead and use haivvreo until 0.10 comes out. Add SerDe for Avro serialized data -- Key: HIVE-895 URL: https://issues.apache.org/jira/browse/HIVE-895 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.9.0 Reporter: Jeff Hammerbacher Assignee: Jakob Homan Fix For: 0.10.0, 0.9.1 Attachments: doctors.avro, episodes.avro, HIVE-895-draft.patch, HIVE-895.patch, hive-895.patch.1.txt As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro data seems like a solid win.
[jira] [Created] (HIVE-3672) Support altering partition column type in Hive
Jingwei Lu created HIVE-3672: Summary: Support altering partition column type in Hive Key: HIVE-3672 URL: https://issues.apache.org/jira/browse/HIVE-3672 Project: Hive Issue Type: Improvement Components: CLI, SQL Affects Versions: 0.10.0 Reporter: Jingwei Lu Assignee: Jingwei Lu Fix For: 0.10.0 Currently, Hive does not allow altering partition column types. As we've discouraged users from using non-string partition column types, this presents a problem for users who want to change their partition columns to be strings: they have to rename their table, create a new table, and copy all the data over. To support this via the CLI, add a command like ALTER TABLE table_name PARTITION COLUMN (column_name new type);
[jira] [Created] (HIVE-3673) Sort merge join not used when join columns have different names
Kevin Wilfong created HIVE-3673: --- Summary: Sort merge join not used when join columns have different names Key: HIVE-3673 URL: https://issues.apache.org/jira/browse/HIVE-3673 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong If two tables are joined on columns with different names, the sort merge join optimization is not applied. E.g. SELECT /*+ MAPJOIN(b) */ * FROM t1 a JOIN t2 b ON a.key = b.value; This will not use sort merge join even if t1 and t2 are bucketed and sorted by key, value respectively.
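For readers unfamiliar with why sortedness is what matters here rather than the column names, the merge step can be sketched in a few lines of Python. This is an illustrative sketch only, not Hive's sort merge join operator; the function name `sort_merge_join` and the sample rows are hypothetical. It joins two inputs that are each already sorted on their own join column, where the columns are named `key` and `value` as in the example query.

```python
def sort_merge_join(left, right, left_col, right_col):
    """Inner-join two lists of dicts, each already sorted on its join column.

    The join columns may have different names; only the sort order matters.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lv, rv = left[i][left_col], right[j][right_col]
        if lv < rv:
            i += 1
        elif lv > rv:
            j += 1
        else:
            # Find the run of right rows sharing this join value.
            j_end = j
            while j_end < len(right) and right[j_end][right_col] == lv:
                j_end += 1
            # Pair every left row with this value against that run.
            while i < len(left) and left[i][left_col] == lv:
                for r in right[j:j_end]:
                    out.append((left[i], r))
                i += 1
            j = j_end
    return out


# Hypothetical sample data: t1 sorted by "key", t2 sorted by "value".
t1 = [{"key": 1}, {"key": 2}, {"key": 2}, {"key": 4}]
t2 = [
    {"value": 2, "tag": "a"},
    {"value": 2, "tag": "b"},
    {"value": 3, "tag": "c"},
    {"value": 4, "tag": "d"},
]

pairs = sort_merge_join(t1, t2, "key", "value")
```

Each input is scanned once and nothing is hashed or re-sorted, so the merge works equally well whether the two join columns share a name or not.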
[jira] [Commented] (HIVE-3673) Sort merge join not used when join columns have different names
[ https://issues.apache.org/jira/browse/HIVE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491030#comment-13491030 ] Kevin Wilfong commented on HIVE-3673: - https://reviews.facebook.net/D6483 Sort merge join not used when join columns have different names --- Key: HIVE-3673 URL: https://issues.apache.org/jira/browse/HIVE-3673 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-3673.1.patch.txt If two tables are joined on columns with different names, the sort merge join optimization is not applied. E.g. SELECT /*+ MAPJOIN(b) */ * FROM t1 a JOIN t2 b ON a.key = b.value; This will not use sort merge join even if t1 and t2 are bucketed and sorted by key, value respectively.
[jira] [Updated] (HIVE-3673) Sort merge join not used when join columns have different names
[ https://issues.apache.org/jira/browse/HIVE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3673: Attachment: HIVE-3673.1.patch.txt Sort merge join not used when join columns have different names --- Key: HIVE-3673 URL: https://issues.apache.org/jira/browse/HIVE-3673 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-3673.1.patch.txt If two tables are joined on columns with different names, the sort merge join optimization is not applied. E.g. SELECT /*+ MAPJOIN(b) */ * FROM t1 a JOIN t2 b ON a.key = b.value; This will not use sort merge join even if t1 and t2 are bucketed and sorted by key, value respectively.
[jira] [Updated] (HIVE-3673) Sort merge join not used when join columns have different names
[ https://issues.apache.org/jira/browse/HIVE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3673: Status: Patch Available (was: Open) Sort merge join not used when join columns have different names --- Key: HIVE-3673 URL: https://issues.apache.org/jira/browse/HIVE-3673 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-3673.1.patch.txt If two tables are joined on columns with different names, the sort merge join optimization is not applied. E.g. SELECT /*+ MAPJOIN(b) */ * FROM t1 a JOIN t2 b ON a.key = b.value; This will not use sort merge join even if t1 and t2 are bucketed and sorted by key, value respectively.
[jira] [Created] (HIVE-3674) test case TestParse broken after recent checkin
Sambavi Muthukrishnan created HIVE-3674: --- Summary: test case TestParse broken after recent checkin Key: HIVE-3674 URL: https://issues.apache.org/jira/browse/HIVE-3674 Project: Hive Issue Type: Bug Reporter: Sambavi Muthukrishnan Assignee: Sambavi Muthukrishnan The below test cases fail after running svn up on my clean checkout. org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6 The build on Nov 2 shows this issue as well. https://builds.apache.org/job/Hive-trunk-h0.21/1770/
[jira] [Commented] (HIVE-3674) test case TestParse broken after recent checkin
[ https://issues.apache.org/jira/browse/HIVE-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491063#comment-13491063 ] Sambavi Muthukrishnan commented on HIVE-3674: - Link to diff: https://reviews.facebook.net/D6489 test case TestParse broken after recent checkin --- Key: HIVE-3674 URL: https://issues.apache.org/jira/browse/HIVE-3674 Project: Hive Issue Type: Bug Reporter: Sambavi Muthukrishnan Assignee: Sambavi Muthukrishnan The below test cases fail after running svn up on my clean checkout. org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6 The build on Nov 2 shows this issue as well. https://builds.apache.org/job/Hive-trunk-h0.21/1770/
[jira] [Updated] (HIVE-3674) test case TestParse broken after recent checkin
[ https://issues.apache.org/jira/browse/HIVE-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sambavi Muthukrishnan updated HIVE-3674: Attachment: TestParseFix.1.patch test case TestParse broken after recent checkin --- Key: HIVE-3674 URL: https://issues.apache.org/jira/browse/HIVE-3674 Project: Hive Issue Type: Bug Affects Versions: 0.9.0 Reporter: Sambavi Muthukrishnan Assignee: Sambavi Muthukrishnan Attachments: TestParseFix.1.patch The below test cases fail after running svn up on my clean checkout. org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6 The build on Nov 2 shows this issue as well. https://builds.apache.org/job/Hive-trunk-h0.21/1770/
[jira] [Updated] (HIVE-3674) test case TestParse broken after recent checkin
[ https://issues.apache.org/jira/browse/HIVE-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sambavi Muthukrishnan updated HIVE-3674: Affects Version/s: 0.9.0 Status: Patch Available (was: Open) test case TestParse broken after recent checkin --- Key: HIVE-3674 URL: https://issues.apache.org/jira/browse/HIVE-3674 Project: Hive Issue Type: Bug Affects Versions: 0.9.0 Reporter: Sambavi Muthukrishnan Assignee: Sambavi Muthukrishnan Attachments: TestParseFix.1.patch The below test cases fail after running svn up on my clean checkout. org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6 The build on Nov 2 shows this issue as well. https://builds.apache.org/job/Hive-trunk-h0.21/1770/
[jira] [Commented] (HIVE-3674) test case TestParse broken after recent checkin
[ https://issues.apache.org/jira/browse/HIVE-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491069#comment-13491069 ] Shreepadma Venugopalan commented on HIVE-3674: -- Thanks Sambavi for the patch. I encountered this error as well on the latest version of trunk. test case TestParse broken after recent checkin --- Key: HIVE-3674 URL: https://issues.apache.org/jira/browse/HIVE-3674 Project: Hive Issue Type: Bug Affects Versions: 0.9.0 Reporter: Sambavi Muthukrishnan Assignee: Sambavi Muthukrishnan Attachments: TestParseFix.1.patch The below test cases fail after running svn up on my clean checkout. org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6 The build on Nov 2 shows this issue as well. https://builds.apache.org/job/Hive-trunk-h0.21/1770/
[jira] [Commented] (HIVE-3674) test case TestParse broken after recent checkin
[ https://issues.apache.org/jira/browse/HIVE-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491077#comment-13491077 ] Kevin Wilfong commented on HIVE-3674: - +1 Thanks Sambavi test case TestParse broken after recent checkin --- Key: HIVE-3674 URL: https://issues.apache.org/jira/browse/HIVE-3674 Project: Hive Issue Type: Bug Affects Versions: 0.9.0 Reporter: Sambavi Muthukrishnan Assignee: Sambavi Muthukrishnan Attachments: TestParseFix.1.patch The below test cases fail after running svn up on my clean checkout. org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6 The build on Nov 2 shows this issue as well. https://builds.apache.org/job/Hive-trunk-h0.21/1770/
[jira] [Updated] (HIVE-3649) Hive List Bucketing - enhance DDL to specify list bucketing table
[ https://issues.apache.org/jira/browse/HIVE-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-3649: --- Description: We need to distinguish a normal skewed table from a list bucketing table. We use an optional parameter, stored as DIRECTORIES: create table T (schema) skewed by (keys) on ('c1', 'c2') [stored as DIRECTORIES]; Details are in https://cwiki.apache.org/confluence/display/Hive/ListBucketing 4 main cases: 1. create table name skewed by .. on .. stored as directories; 2. alter table name skewed by .. on .. stored as directories; 3. alter table name not skewed; 4. alter table name not stored as directories; was: We need to distinguish a normal skewed table from a list bucketing table. We use an optional parameter, stored as DIRECTORIES: create table T (schema) skewed by (keys) on ('c1', 'c2') [stored as DIRECTORIES]; Details are in https://cwiki.apache.org/confluence/display/Hive/ListBucketing Hive List Bucketing - enhance DDL to specify list bucketing table - Key: HIVE-3649 URL: https://issues.apache.org/jira/browse/HIVE-3649 Project: Hive Issue Type: New Feature Affects Versions: 0.10.0 Reporter: Gang Tim Liu We need to distinguish a normal skewed table from a list bucketing table. We use an optional parameter, stored as DIRECTORIES: create table T (schema) skewed by (keys) on ('c1', 'c2') [stored as DIRECTORIES]; Details are in https://cwiki.apache.org/confluence/display/Hive/ListBucketing 4 main cases: 1. create table name skewed by .. on .. stored as directories; 2. alter table name skewed by .. on .. stored as directories; 3. alter table name not skewed; 4. alter table name not stored as directories;
[jira] [Updated] (HIVE-3649) Hive List Bucketing - enhance DDL to specify list bucketing table
[ https://issues.apache.org/jira/browse/HIVE-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-3649: --- Description: We need to distinguish a normal skewed table from a list bucketing table. We use an optional parameter, stored as DIRECTORIES. 4 main cases: 1. create table name skewed by .. on .. stored as directories; 2. alter table name skewed by .. on .. stored as directories; 3. alter table name not skewed; 4. alter table name not stored as directories; Details are in https://cwiki.apache.org/confluence/display/Hive/ListBucketing was: We need to distinguish a normal skewed table from a list bucketing table. We use an optional parameter, stored as DIRECTORIES: create table T (schema) skewed by (keys) on ('c1', 'c2') [stored as DIRECTORIES]; Details are in https://cwiki.apache.org/confluence/display/Hive/ListBucketing 4 main cases: 1. create table name skewed by .. on .. stored as directories; 2. alter table name skewed by .. on .. stored as directories; 3. alter table name not skewed; 4. alter table name not stored as directories; Hive List Bucketing - enhance DDL to specify list bucketing table - Key: HIVE-3649 URL: https://issues.apache.org/jira/browse/HIVE-3649 Project: Hive Issue Type: New Feature Affects Versions: 0.10.0 Reporter: Gang Tim Liu We need to distinguish a normal skewed table from a list bucketing table. We use an optional parameter, stored as DIRECTORIES. 4 main cases: 1. create table name skewed by .. on .. stored as directories; 2. alter table name skewed by .. on .. stored as directories; 3. alter table name not skewed; 4. alter table name not stored as directories; Details are in https://cwiki.apache.org/confluence/display/Hive/ListBucketing
[jira] [Commented] (HIVE-3229) null values being loaded as non-null values into Hive
[ https://issues.apache.org/jira/browse/HIVE-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13491160#comment-13491160 ] Shengsheng Huang commented on HIVE-3229: Do you mean you want the empty strings in column C2 to be loaded as nulls? Actually, Hive interprets \N, not the empty string, as the null string. So you should write a literal \N in the column to represent a null string. You could override the default null string value \N with ROW FORMAT null values being loaded as non-null values into Hive - Key: HIVE-3229 URL: https://issues.apache.org/jira/browse/HIVE-3229 Project: Hive Issue Type: Bug Reporter: N Campbell Attachments: CERT.TSET1.txt Various tab-delimited input files contain one or more columns that represent null values in rows. The data appears to load (without an error such as in JIRA 3228); however, the resulting values are now non-null, which is incorrect. create table if not exists CERT.TSET1_E ( RNUM int , C1 int, C2 string) row format delimited fields terminated by '\t' stored as textfile; create table if not exists CERT.TSET1 ( RNUM int , C1 int, C2 string) stored as sequencefile; load data local inpath 'CERT.TSET1.txt' overwrite into table CERT.TSET1_E; insert overwrite table CERT.TSET1 select * from CERT.TSET1_E;
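The null-marker rule described in the comment above can be modelled in a few lines. This is a minimal Python sketch of the semantics only, not Hive's SerDe code: `parse_row`, `parse_string_field`, `NULL_MARKER`, and the sample rows are hypothetical names and data, and every field is kept as a string for simplicity.

```python
from typing import List, Optional

# Default null marker for delimited text, as stated in the comment above.
NULL_MARKER = r"\N"


def parse_string_field(raw: str, null_marker: str = NULL_MARKER) -> Optional[str]:
    """Return None only when the raw field equals the null marker."""
    return None if raw == null_marker else raw


def parse_row(line: str, null_marker: str = NULL_MARKER) -> List[Optional[str]]:
    """Split one tab-delimited line and map null markers to None."""
    fields = line.rstrip("\n").split("\t")
    return [parse_string_field(f, null_marker) for f in fields]


# An empty C2 stays an empty string under the default marker.
print(parse_row("1\t10\t"))
# A literal \N in C2 becomes None.
print(parse_row("2\t20\t\\N"))
# Overriding the marker with the empty string makes empty fields null.
print(parse_row("3\t30\t", null_marker=""))
```

The `null_marker` parameter stands in for the override mentioned at the end of the comment; the sketch does not attempt to reproduce the exact Hive syntax for it.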
[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data
[ https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13491177#comment-13491177 ] Ruslan Al-Fakikh commented on HIVE-895: --- Mithun, what distro do you use? Cloudera patched an earlier version of Hive with this SerDe: https://ccp.cloudera.com/display/CDHDOC/New+Features+in+CDH3#NewFeaturesinCDH3-What%27sNewinCDH3Update5 https://ccp.cloudera.com/display/DOC/CDH+Version+and+Packaging+Information#CDHVersionandPackagingInformation-CDH3Update5Packaging Add SerDe for Avro serialized data -- Key: HIVE-895 URL: https://issues.apache.org/jira/browse/HIVE-895 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.9.0 Reporter: Jeff Hammerbacher Assignee: Jakob Homan Fix For: 0.10.0, 0.9.1 Attachments: doctors.avro, episodes.avro, HIVE-895-draft.patch, HIVE-895.patch, hive-895.patch.1.txt As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro data seems like a solid win.
[jira] [Commented] (HIVE-3674) test case TestParse broken after recent checkin
[ https://issues.apache.org/jira/browse/HIVE-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13491193#comment-13491193 ] Namit Jain commented on HIVE-3674: -- https://issues.apache.org/jira/browse/HIVE-3657 filed for the same issue. test case TestParse broken after recent checkin --- Key: HIVE-3674 URL: https://issues.apache.org/jira/browse/HIVE-3674 Project: Hive Issue Type: Bug Affects Versions: 0.9.0 Reporter: Sambavi Muthukrishnan Assignee: Sambavi Muthukrishnan Attachments: TestParseFix.1.patch The below test cases fail after running svn up on my clean checkout. org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6 The build on Nov 2 shows this issue as well. https://builds.apache.org/job/Hive-trunk-h0.21/1770/
[jira] [Commented] (HIVE-1362) column level statistics
[ https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13491194#comment-13491194 ] Namit Jain commented on HIVE-1362: -- I think you should wrap the error even if it is thrown by a function being called. column level statistics --- Key: HIVE-1362 URL: https://issues.apache.org/jira/browse/HIVE-1362 Project: Hive Issue Type: Sub-task Components: Statistics Reporter: Ning Zhang Assignee: Shreepadma Venugopalan Attachments: HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, HIVE-1362.9.patch.txt, HIVE-1362.D6339.1.patch, HIVE-1362-gen_thrift.1.patch.txt, HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, HIVE-1362_gen-thrift.8.patch.txt, HIVE-1362_gen-thrift.9.patch.txt
[jira] [Updated] (HIVE-3673) Sort merge join not used when join columns have different names
[ https://issues.apache.org/jira/browse/HIVE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3673: - Status: Open (was: Patch Available) comments on phabricator Sort merge join not used when join columns have different names --- Key: HIVE-3673 URL: https://issues.apache.org/jira/browse/HIVE-3673 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-3673.1.patch.txt If two tables are joined on columns with different names, the sort merge join optimization is not applied. E.g. SELECT /*+ MAPJOIN(b) */ * FROM t1 a JOIN t2 b ON a.key = b.value; This will not use sort merge join even if t1 and t2 are bucketed and sorted by key, value respectively.
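The point of this issue can be illustrated with a small sketch: a merge join only needs each input to be sorted on its own join column, so the column names themselves should play no role. The Python below is an illustrative model under that assumption, not Hive's sort merge join implementation; `sort_merge_join` and the sample rows for `t1` and `t2` are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple

Row = Dict[str, int]


def sort_merge_join(left: List[Row], left_col: str,
                    right: List[Row], right_col: str) -> Iterator[Tuple[Row, Row]]:
    """Inner-join two inputs that are each already sorted on their join column."""
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][left_col], right[j][right_col]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][right_col] == lk:
                j_end += 1
            # Pair every left row carrying this key with that run.
            while i < len(left) and left[i][left_col] == lk:
                for r in right[j:j_end]:
                    yield left[i], r
                i += 1
            j = j_end


t1 = [{"key": 1}, {"key": 2}, {"key": 2}, {"key": 4}]          # sorted by key
t2 = [{"value": 2}, {"value": 3}, {"value": 4}, {"value": 4}]  # sorted by value
joined = list(sort_merge_join(t1, "key", t2, "value"))
print(len(joined))
```

Both inputs are scanned once, in order, which is what makes the optimization attractive when the tables are already bucketed and sorted on the join columns.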
[jira] [Updated] (HIVE-1362) column level statistics
[ https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-1362: - Status: Open (was: Patch Available) comments column level statistics --- Key: HIVE-1362 URL: https://issues.apache.org/jira/browse/HIVE-1362 Project: Hive Issue Type: Sub-task Components: Statistics Reporter: Ning Zhang Assignee: Shreepadma Venugopalan Attachments: HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, HIVE-1362.9.patch.txt, HIVE-1362.D6339.1.patch, HIVE-1362-gen_thrift.1.patch.txt, HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, HIVE-1362_gen-thrift.8.patch.txt, HIVE-1362_gen-thrift.9.patch.txt
[jira] [Commented] (HIVE-3588) Get Hive to work with hbase 94
[ https://issues.apache.org/jira/browse/HIVE-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13491203#comment-13491203 ] qipan commented on HIVE-3588: - Hi, I got an error when trying to use Hive 0.9.0 to import into HBase 0.94.0: FATAL ExecMapper: java.lang.IllegalAccessError: tried to access field org.slf4j.impl.StaticLoggerBinder.SINGLETON from class org.slf4j.LoggerFactory. Does this patch fix this problem? Get Hive to work with hbase 94 -- Key: HIVE-3588 URL: https://issues.apache.org/jira/browse/HIVE-3588 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.9.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-3588.patch
[jira] [Commented] (HIVE-3673) Sort merge join not used when join columns have different names
[ https://issues.apache.org/jira/browse/HIVE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13491215#comment-13491215 ] Kevin Wilfong commented on HIVE-3673: - Addressed comments. Sort merge join not used when join columns have different names --- Key: HIVE-3673 URL: https://issues.apache.org/jira/browse/HIVE-3673 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-3673.1.patch.txt, HIVE-3673.2.patch.txt If two tables are joined on columns with different names, the sort merge join optimization is not applied. E.g. SELECT /*+ MAPJOIN(b) */ * FROM t1 a JOIN t2 b ON a.key = b.value; This will not use sort merge join even if t1 and t2 are bucketed and sorted by key, value respectively.
[jira] [Updated] (HIVE-3673) Sort merge join not used when join columns have different names
[ https://issues.apache.org/jira/browse/HIVE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3673: Attachment: HIVE-3673.2.patch.txt Sort merge join not used when join columns have different names --- Key: HIVE-3673 URL: https://issues.apache.org/jira/browse/HIVE-3673 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-3673.1.patch.txt, HIVE-3673.2.patch.txt If two tables are joined on columns with different names, the sort merge join optimization is not applied. E.g. SELECT /*+ MAPJOIN(b) */ * FROM t1 a JOIN t2 b ON a.key = b.value; This will not use sort merge join even if t1 and t2 are bucketed and sorted by key, value respectively.
[jira] [Updated] (HIVE-3673) Sort merge join not used when join columns have different names
[ https://issues.apache.org/jira/browse/HIVE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3673: Status: Patch Available (was: Open) Sort merge join not used when join columns have different names --- Key: HIVE-3673 URL: https://issues.apache.org/jira/browse/HIVE-3673 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-3673.1.patch.txt, HIVE-3673.2.patch.txt If two tables are joined on columns with different names, the sort merge join optimization is not applied. E.g. SELECT /*+ MAPJOIN(b) */ * FROM t1 a JOIN t2 b ON a.key = b.value; This will not use sort merge join even if t1 and t2 are bucketed and sorted by key, value respectively.
[jira] [Created] (HIVE-3675) NaN does not work correctly for round(n)
Namit Jain created HIVE-3675: Summary: NaN does not work correctly for round(n) Key: HIVE-3675 URL: https://issues.apache.org/jira/browse/HIVE-3675 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Fix For: 0.10.0 It works correctly for round(n, d)
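The issue does not say what round(n) returns for NaN or what it should return. The sketch below assumes the intended behavior is "NaN in, NaN out" and shows the usual guard for that in Python; `round_nan_safe` is a hypothetical name and this is not the Hive UDF. Without such a guard, integer-based rounding tends to misbehave on NaN: Python's built-in `round(float("nan"))` raises `ValueError`, and Java's `Math.round(Double.NaN)` returns 0.

```python
import math


def round_nan_safe(x: float) -> float:
    """Round to the nearest integer value, passing NaN and infinities through."""
    if math.isnan(x) or math.isinf(x):
        # Assumption: special values are returned unchanged.
        return x
    # Note: Python rounds exact halves to the nearest even integer.
    return float(round(x))


print(round_nan_safe(2.7))
print(math.isnan(round_nan_safe(float("nan"))))
```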
[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data
[ https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13491238#comment-13491238 ] Mithun Radhakrishnan commented on HIVE-895: --- @[~jghoman]: Thanks for clarifying. I'll see if this can't be merged into branch-0.9/. @[~metaruslan]: Thank you for the heads-up. We at Yahoo are currently using our own builds instead of a commercial distro. :] It'd be great to have this included in 0.9.1. Add SerDe for Avro serialized data -- Key: HIVE-895 URL: https://issues.apache.org/jira/browse/HIVE-895 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.9.0 Reporter: Jeff Hammerbacher Assignee: Jakob Homan Fix For: 0.10.0, 0.9.1 Attachments: doctors.avro, episodes.avro, HIVE-895-draft.patch, HIVE-895.patch, hive-895.patch.1.txt As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro data seems like a solid win.
[jira] [Commented] (HIVE-3673) Sort merge join not used when join columns have different names
[ https://issues.apache.org/jira/browse/HIVE-3673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13491242#comment-13491242 ] Namit Jain commented on HIVE-3673: -- +1 Sort merge join not used when join columns have different names --- Key: HIVE-3673 URL: https://issues.apache.org/jira/browse/HIVE-3673 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-3673.1.patch.txt, HIVE-3673.2.patch.txt If two tables are joined on columns with different names, the sort merge join optimization is not applied. E.g. SELECT /*+ MAPJOIN(b) */ * FROM t1 a JOIN t2 b ON a.key = b.value; This will not use sort merge join even if t1 and t2 are bucketed and sorted by key, value respectively.
[jira] [Updated] (HIVE-3675) NaN does not work correctly for round(n)
[ https://issues.apache.org/jira/browse/HIVE-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3675: - Attachment: hive.3675.1.patch NaN does not work correctly for round(n) Key: HIVE-3675 URL: https://issues.apache.org/jira/browse/HIVE-3675 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Fix For: 0.10.0 Attachments: hive.3675.1.patch It works correctly for round(n, d)
[jira] [Created] (HIVE-3676) INSERT INTO regression caused by HIVE-3465
Carl Steinbach created HIVE-3676: Summary: INSERT INTO regression caused by HIVE-3465 Key: HIVE-3676 URL: https://issues.apache.org/jira/browse/HIVE-3676 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Carl Steinbach
[jira] [Commented] (HIVE-3676) INSERT INTO regression caused by HIVE-3465
[ https://issues.apache.org/jira/browse/HIVE-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13491258#comment-13491258 ] Carl Steinbach commented on HIVE-3676: -- Prior to HIVE-3465 the following set of statements produced the expected result:
{noformat}
CREATE DATABASE db2;
USE db2;
CREATE TABLE result(col1 STRING);
INSERT OVERWRITE TABLE result SELECT 'db2_insert1' FROM default.src LIMIT 1;
INSERT INTO TABLE result SELECT 'db2_insert2' FROM default.src LIMIT 1;
SELECT * FROM result;
db2_insert1
db2_insert2
{noformat}
While the following set of statements produced inaccurate results:
{noformat}
CREATE DATABASE db1;
CREATE TABLE db1.result(col1 STRING);
INSERT OVERWRITE TABLE db1.result SELECT 'db1_insert1' FROM src LIMIT 1;
INSERT INTO TABLE db1.result SELECT 'db1_insert2' FROM src LIMIT 1;
SELECT * FROM db1.result;
db1_insert2
{noformat}
After HIVE-3465 the first set of statements produces inaccurate results, while the second set of statements now behaves as expected. INSERT INTO regression caused by HIVE-3465 -- Key: HIVE-3676 URL: https://issues.apache.org/jira/browse/HIVE-3676 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Carl Steinbach
[jira] [Updated] (HIVE-1362) column level statistics
[ https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-1362: - Status: Patch Available (was: Open) column level statistics --- Key: HIVE-1362 URL: https://issues.apache.org/jira/browse/HIVE-1362 Project: Hive Issue Type: Sub-task Components: Statistics Reporter: Ning Zhang Assignee: Shreepadma Venugopalan Attachments: HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, HIVE-1362.9.patch.txt, HIVE-1362.D6339.1.patch, HIVE-1362-gen_thrift.1.patch.txt, HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, HIVE-1362_gen-thrift.8.patch.txt, HIVE-1362_gen-thrift.9.patch.txt
[jira] [Updated] (HIVE-1362) column level statistics
[ https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-1362: - Attachment: HIVE-1362_gen-thrift.10.patch.txt column level statistics --- Key: HIVE-1362 URL: https://issues.apache.org/jira/browse/HIVE-1362 Project: Hive Issue Type: Sub-task Components: Statistics Reporter: Ning Zhang Assignee: Shreepadma Venugopalan Attachments: HIVE-1362.10.patch.txt, HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, HIVE-1362.9.patch.txt, HIVE-1362.D6339.1.patch, HIVE-1362_gen-thrift.10.patch.txt, HIVE-1362-gen_thrift.1.patch.txt, HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, HIVE-1362_gen-thrift.8.patch.txt, HIVE-1362_gen-thrift.9.patch.txt
[jira] [Updated] (HIVE-1362) column level statistics
[ https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-1362: - Attachment: HIVE-1362.10.patch.txt column level statistics --- Key: HIVE-1362 URL: https://issues.apache.org/jira/browse/HIVE-1362 Project: Hive Issue Type: Sub-task Components: Statistics Reporter: Ning Zhang Assignee: Shreepadma Venugopalan Attachments: HIVE-1362.10.patch.txt, HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, HIVE-1362.9.patch.txt, HIVE-1362.D6339.1.patch, HIVE-1362_gen-thrift.10.patch.txt, HIVE-1362-gen_thrift.1.patch.txt, HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, HIVE-1362_gen-thrift.8.patch.txt, HIVE-1362_gen-thrift.9.patch.txt
[jira] [Commented] (HIVE-1362) column level statistics
[ https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13491274#comment-13491274 ] Shreepadma Venugopalan commented on HIVE-1362: -- @Namit: The latest patch addresses your comments. It's available on both JIRA and phabricator. Thanks. column level statistics --- Key: HIVE-1362 URL: https://issues.apache.org/jira/browse/HIVE-1362 Project: Hive Issue Type: Sub-task Components: Statistics Reporter: Ning Zhang Assignee: Shreepadma Venugopalan Attachments: HIVE-1362.10.patch.txt, HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, HIVE-1362.9.patch.txt, HIVE-1362.D6339.1.patch, HIVE-1362_gen-thrift.10.patch.txt, HIVE-1362-gen_thrift.1.patch.txt, HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, HIVE-1362_gen-thrift.8.patch.txt, HIVE-1362_gen-thrift.9.patch.txt