[jira] [Commented] (HIVE-4014) Hive+RCFile is not doing column pruning and reading much more data than necessary
[ https://issues.apache.org/jira/browse/HIVE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591350#comment-13591350 ]

Tamas Tarjanyi commented on HIVE-4014:
--------------------------------------

Hi Vinod,

As I have stated above

BAD: CDH4.1.3 - which is using hadoop-2.0.0+556 / hive-0.9.0+158
GOOD: hadoop 1.0.3 / hive 0.10.0 (apache download)
GOOD: hadoop 1.0.4 / hive 0.10.0 (apache download)

Meanwhile I have also tried Hortonworks Data Platform 1.2.1

GOOD: HDP1.2.1 Apache Hadoop 1.1.2-rc3 / Apache Hive 0.10.0

So it seems that the issue is in hive-0.9 now.
My real problem is that both Hortonworks and Cloudera bundle hive-0.9 with hadoop-2.x.y and I wanted to use hadoop-2.x.y with hive-0.10.x and not hadoop-1.

Hive+RCFile is not doing column pruning and reading much more data than necessary
---------------------------------------------------------------------------------

Key: HIVE-4014
URL: https://issues.apache.org/jira/browse/HIVE-4014
Project: Hive
Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli

With even simple projection queries, I see that HDFS bytes read counter doesn't show any reduction in the amount of data read.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4053) Add support for phonetic algorithms in Hive
[ https://issues.apache.org/jira/browse/HIVE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Krishna updated HIVE-4053:
--------------------------

Status: Open (was: Patch Available)

I will re-submit the patch

Add support for phonetic algorithms in Hive
-------------------------------------------

Key: HIVE-4053
URL: https://issues.apache.org/jira/browse/HIVE-4053
Project: Hive
Issue Type: New Feature
Components: UDF
Affects Versions: 0.10.0
Reporter: Krishna
Labels: patch
Fix For: 0.10.0
Attachments: FunctionRegistry.java, GenericUDFRefinedSoundex.java, HIVE-4053.1.patch.txt

Following phonetic algorithms should be considered, which are very useful in search:
Soundex: http://en.wikipedia.org/wiki/Soundex
Refined Soundex: Refer to the comment on 22/Feb/13 23:51
Daitch–Mokotoff Soundex: http://en.wikipedia.org/wiki/Daitch%E2%80%93Mokotoff_Soundex
Metaphone and Double Metaphone: http://en.wikipedia.org/wiki/Metaphone
New York State Identification and Intelligence System (NYSIIS): http://en.wikipedia.org/wiki/New_York_State_Identification_and_Intelligence_System
Caverphone: http://en.wikipedia.org/wiki/Caverphone
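For readers unfamiliar with the first algorithm in that list, the following is a minimal Java sketch of classic American Soundex. It is not taken from the attached patch and is not a Hive UDF; the class name `SoundexSketch` and method `encode` are hypothetical, and it assumes plain ASCII input.

```java
import java.util.Locale;

/** Illustrative American Soundex; names here are hypothetical, not from the HIVE-4053 patch. */
final class SoundexSketch {

    /** Maps a letter A..Z to its Soundex digit; '0' = vowel or Y, '-' = H or W. */
    private static char code(char c) {
        switch (c) {
            case 'B': case 'F': case 'P': case 'V':
                return '1';
            case 'C': case 'G': case 'J': case 'K':
            case 'Q': case 'S': case 'X': case 'Z':
                return '2';
            case 'D': case 'T':
                return '3';
            case 'L':
                return '4';
            case 'M': case 'N':
                return '5';
            case 'R':
                return '6';
            case 'H': case 'W':
                return '-'; // skipped: does not separate equal codes
            default:
                return '0'; // vowels and Y: separate equal codes but emit nothing
        }
    }

    /** Returns the 4-character code, or "" when the input contains no ASCII letter. */
    static String encode(String name) {
        StringBuilder out = new StringBuilder(4);
        char prev = '0';
        for (char raw : name.toUpperCase(Locale.ROOT).toCharArray()) {
            if (raw < 'A' || raw > 'Z') {
                continue; // ignore anything that is not an ASCII letter
            }
            char c = code(raw);
            if (out.length() == 0) {
                out.append(raw); // first letter is kept verbatim
                prev = c;        // its code still suppresses an adjacent duplicate
            } else if (c != '-') {
                if (c != '0' && c != prev) {
                    out.append(c);
                }
                prev = c;
            }
            if (out.length() == 4) {
                break;
            }
        }
        while (out.length() > 0 && out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }
}
```

With this sketch, `SoundexSketch.encode("Robert")` and `SoundexSketch.encode("Rupert")` both give `R163`, which is the kind of match that makes the algorithm useful in search.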
[jira] [Created] (HIVE-4108) Allow over() clause to contain an order by with no partition by
Brock Noland created HIVE-4108:
----------------------------------

Summary: Allow over() clause to contain an order by with no partition by
Key: HIVE-4108
URL: https://issues.apache.org/jira/browse/HIVE-4108
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Brock Noland

HIVE-4073 allows over() to be called with no partition by and no order by. We should also allow an over() clause that contains only an order by.

From the review of HIVE-4073:

Ashutosh
{noformat}
Can you also add following test. This should also work.
select p_name, p_retailprice,
avg(p_retailprice) over(order by p_name)
from part
partition by p_name;
{noformat}

Harish
{noformat}
This test will not work (:
The grammar needs to be changed so:

partitioningSpec
@init { msgs.push("partitioningSpec clause"); }
@after { msgs.pop(); }
   :
   partitionByClause orderByClause? -> ^(TOK_PARTITIONINGSPEC partitionByClause orderByClause?) |
   orderByClause -> ^(TOK_PARTITIONINGSPEC orderByClause) |
   distributeByClause sortByClause? -> ^(TOK_PARTITIONINGSPEC distributeByClause sortByClause?) |
   sortByClause? -> ^(TOK_PARTITIONINGSPEC sortByClause) |
   clusterByClause -> ^(TOK_PARTITIONINGSPEC clusterByClause)
   ;

And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of the AST Tree. The PTFTranslator also needs changes.
Do this as another Jira
{noformat}
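To make the requested behaviour concrete, here is a minimal Java sketch of what avg(p_retailprice) over(order by p_name) would be expected to compute. It is an illustration, not Hive code. It assumes the whole input is treated as one partition, that the window runs from the first row through the current row, and that p_name values are distinct; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative running average over an order-by-only window; names are hypothetical. */
final class OrderOnlyWindowSketch {

    /** One row of the part table, reduced to the two columns used here. */
    static final class Part {
        final String name;
        final double retailPrice;

        Part(String name, double retailPrice) {
            this.name = name;
            this.retailPrice = retailPrice;
        }
    }

    /** Returns the running average of retailPrice, in p_name order. */
    static List<Double> runningAvgByName(List<Part> parts) {
        // No partition by: every row belongs to the same single partition.
        List<Part> sorted = new ArrayList<>(parts);
        sorted.sort(Comparator.comparing((Part p) -> p.name));

        List<Double> result = new ArrayList<>();
        double sum = 0.0;
        for (int i = 0; i < sorted.size(); i++) {
            sum += sorted.get(i).retailPrice;
            result.add(sum / (i + 1)); // average of rows 0..i
        }
        return result;
    }
}
```

For rows (b, 20), (a, 10), (c, 60) the sketch yields 10.0, 15.0 and 30.0 for a, b and c respectively.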
[jira] [Updated] (HIVE-4073) Make partition by optional in over clause
[ https://issues.apache.org/jira/browse/HIVE-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4073:
-------------------------------

Attachment: HIVE-4073-3.patch

Latest patch

Make partition by optional in over clause
-----------------------------------------

Key: HIVE-4073
URL: https://issues.apache.org/jira/browse/HIVE-4073
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Brock Noland
Attachments: HIVE-4073-0.patch, HIVE-4073-1.patch, HIVE-4073-2.patch, HIVE-4073-3.patch

select s, sum( i ) over() from tt; should work.
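As a rough illustration of what sum( i ) over() is expected to return, the Java sketch below treats the whole table as one window, so every row receives the same grand total. This is an assumption about the intended semantics, based on how an empty over() behaves in standard SQL; the class and method names are hypothetical and nothing here is taken from the patch.

```java
import java.util.Arrays;

/** Illustrative semantics of an aggregate over an empty over() clause. */
final class EmptyOverSketch {

    /** Returns one value per input row; each is the sum of the whole column. */
    static long[] sumOverAll(long[] column) {
        long total = 0;
        for (long value : column) {
            total += value; // no partition by, no order by: one window for all rows
        }
        long[] perRow = new long[column.length];
        Arrays.fill(perRow, total);
        return perRow;
    }
}
```

For a column holding 1, 2 and 3, each of the three output rows carries 6.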
[jira] [Created] (HIVE-4109) Partition by column does not have to be in order by
Brock Noland created HIVE-4109:
----------------------------------

Summary: Partition by column does not have to be in order by
Key: HIVE-4109
URL: https://issues.apache.org/jira/browse/HIVE-4109
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Brock Noland

Came up in the review of HIVE-4093.

{noformat}
I am not sure if this is illegal query. I tried following two queries in postgres, both of them succeeded.
select p_mfgr, avg(p_retailprice) over(partition by p_mfgr, p_type order by p_mfgr) from part;
select p_mfgr, avg(p_retailprice) over(partition by p_mfgr order by p_type,p_mfgr) from part;
{noformat}
[jira] [Updated] (HIVE-4109) Partition by column does not have to be in order by
[ https://issues.apache.org/jira/browse/HIVE-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4109:
-------------------------------

Description:
Came up in the review of HIVE-4093.

Ashutosh
{noformat}
I am not sure if this is illegal query. I tried following two queries in postgres, both of them succeeded.
select p_mfgr, avg(p_retailprice) over(partition by p_mfgr, p_type order by p_mfgr) from part;
select p_mfgr, avg(p_retailprice) over(partition by p_mfgr order by p_type,p_mfgr) from part;
{noformat}

Harish
{noformat}
The first one doesn't make sense, right? Order on a subset of the partition columns
The second one: Can we do this with the Hive ReduceOp have the orderColumns be in a different order than the key columns?
{noformat}

was:
Came up in the review of HIVE-4093.

{noformat}
I am not sure if this is illegal query. I tried following two queries in postgres, both of them succeeded.
select p_mfgr, avg(p_retailprice) over(partition by p_mfgr, p_type order by p_mfgr) from part;
select p_mfgr, avg(p_retailprice) over(partition by p_mfgr order by p_type,p_mfgr) from part;
{noformat}

Partition by column does not have to be in order by
---------------------------------------------------

Key: HIVE-4109
URL: https://issues.apache.org/jira/browse/HIVE-4109
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Brock Noland
[jira] [Updated] (HIVE-4093) Remove sprintf from PTFTranslator and use String.format()
[ https://issues.apache.org/jira/browse/HIVE-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4093:
-------------------------------

Attachment: HIVE-4093-2.patch

Latest patch which addresses the feedback!

Remove sprintf from PTFTranslator and use String.format()
---------------------------------------------------------

Key: HIVE-4093
URL: https://issues.apache.org/jira/browse/HIVE-4093
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
Attachments: HIVE-4093-0.patch, HIVE-4093-1.patch, HIVE-4093-2.patch
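The change named in the issue title amounts to building messages with the JDK's own String.format() rather than a project-local sprintf helper. A minimal Java sketch of that style follows; the class name, method name and message text are hypothetical and do not come from PTFTranslator.

```java
/** Illustrative use of String.format(); the message and names are hypothetical. */
final class FormatSketch {

    /** Builds an error message from a function name and its expected argument count. */
    static String describe(String functionName, int expectedArgs) {
        // %s formats the function name, %d the integer count.
        return String.format("Function %s expects %d argument(s)", functionName, expectedArgs);
    }
}
```

Relying on String.format() keeps the formatting rules those of java.util.Formatter, so no extra helper needs to be maintained.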
[jira] [Commented] (HIVE-4053) Add support for phonetic algorithms in Hive
[ https://issues.apache.org/jira/browse/HIVE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591449#comment-13591449 ]

Mark Grover commented on HIVE-4053:
-----------------------------------

Krishna, thanks for doing this.

I don't have a whole lot of insight into these particular algorithms but do they always take the same parameters? What's the possibility of a new phonetic algorithm using a different set or number of parameters?

If these functions always take same parameters, it may make sense to do (2). However, if not, (1) would be a good idea. Of course, you can still refactor the code and share amongst all different UDFs even when they are separate.

To post a review on reviewboard, go to reviews.apache.org. Generate a diff file of your changes on top of hive trunk (using svn diff or git diff) and upload that diff (use hive repository when using svn diff output and hive-git repository when using git diff output).

Please let me know if you have any further questions.

Add support for phonetic algorithms in Hive
-------------------------------------------

Key: HIVE-4053
URL: https://issues.apache.org/jira/browse/HIVE-4053
Project: Hive
Issue Type: New Feature
Components: UDF
Affects Versions: 0.10.0
Reporter: Krishna
Labels: patch
Fix For: 0.10.0
Attachments: FunctionRegistry.java, GenericUDFRefinedSoundex.java, HIVE-4053.1.patch.txt

Following phonetic algorithms should be considered, which are very useful in search:
Soundex: http://en.wikipedia.org/wiki/Soundex
Refined Soundex: Refer to the comment on 22/Feb/13 23:51
Daitch–Mokotoff Soundex: http://en.wikipedia.org/wiki/Daitch%E2%80%93Mokotoff_Soundex
Metaphone and Double Metaphone: http://en.wikipedia.org/wiki/Metaphone
New York State Identification and Intelligence System (NYSIIS): http://en.wikipedia.org/wiki/New_York_State_Identification_and_Intelligence_System
Caverphone: http://en.wikipedia.org/wiki/Caverphone
[jira] [Updated] (HIVE-3987) Update PTF invocation and windowing grammar
[ https://issues.apache.org/jira/browse/HIVE-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-3987:
-----------------------------------

Resolution: Fixed
Status: Resolved (was: Patch Available)

Committed to branch. Thanks, Harish for review.

Update PTF invocation and windowing grammar
-------------------------------------------

Key: HIVE-3987
URL: https://issues.apache.org/jira/browse/HIVE-3987
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Ashutosh Chauhan
Attachments: HIVE-3987.patch

Changes to grammar to make it more Standards based:
- support Partition Order style along with Hive specific Distribute/Cluster and Sort in windowing specification.
- PTF args should come after Input details like in Aster.
- tbd: do we need to support named parameters.
[jira] [Updated] (HIVE-4073) Make partition by optional in over clause
[ https://issues.apache.org/jira/browse/HIVE-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4073:
-----------------------------------

Resolution: Fixed
Status: Resolved (was: Patch Available)

Committed to branch. Thanks, Brock!

Make partition by optional in over clause
-----------------------------------------

Key: HIVE-4073
URL: https://issues.apache.org/jira/browse/HIVE-4073
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Brock Noland
Attachments: HIVE-4073-0.patch, HIVE-4073-1.patch, HIVE-4073-2.patch, HIVE-4073-3.patch

select s, sum( i ) over() from tt; should work.
[jira] [Commented] (HIVE-4073) Make partition by optional in over clause
[ https://issues.apache.org/jira/browse/HIVE-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591522#comment-13591522 ]

Brock Noland commented on HIVE-4073:
------------------------------------

Thank you guys for pointing me in the right direction! :)

Make partition by optional in over clause
-----------------------------------------

Key: HIVE-4073
URL: https://issues.apache.org/jira/browse/HIVE-4073
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Brock Noland
Attachments: HIVE-4073-0.patch, HIVE-4073-1.patch, HIVE-4073-2.patch, HIVE-4073-3.patch

select s, sum( i ) over() from tt; should work.
[jira] [Updated] (HIVE-4093) Remove sprintf from PTFTranslator and use String.format()
[ https://issues.apache.org/jira/browse/HIVE-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4093:
-----------------------------------

Resolution: Fixed
Status: Resolved (was: Patch Available)

Committed to branch. Thanks, Brock!

Remove sprintf from PTFTranslator and use String.format()
---------------------------------------------------------

Key: HIVE-4093
URL: https://issues.apache.org/jira/browse/HIVE-4093
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Minor
Attachments: HIVE-4093-0.patch, HIVE-4093-1.patch, HIVE-4093-2.patch
[jira] [Commented] (HIVE-4082) Break up ptf tests in PTF, Windowing and Lead/Lag tests
[ https://issues.apache.org/jira/browse/HIVE-4082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591529#comment-13591529 ]

Ashutosh Chauhan commented on HIVE-4082:
----------------------------------------

Prajakta, This patch has gone stale. Can you regenerate it?

Break up ptf tests in PTF, Windowing and Lead/Lag tests
-------------------------------------------------------

Key: HIVE-4082
URL: https://issues.apache.org/jira/browse/HIVE-4082
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Prajakta Kalmegh
Attachments: HIVE-4082.D9033.1.patch
[jira] [Created] (HIVE-4110) Aggregation functions must have aliases when multiple functions are used
Brock Noland created HIVE-4110:
----------------------------------

Summary: Aggregation functions must have aliases when multiple functions are used
Key: HIVE-4110
URL: https://issues.apache.org/jira/browse/HIVE-4110
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Brock Noland

The following query fails:

{noformat}
select p_mfgr, p_retailprice, p_size,
lead(p_retailprice) over(partition by p_mfgr order by p_size),
lag(p_retailprice) over(partition by p_mfgr order by p_size)
from part;
{noformat}

with the error below:

{noformat}
2013-03-02 16:10:47,126 ERROR ql.Driver (SessionState.java:printError(401)) - FAILED: SemanticException [Error 10011]: Line 2:38 Invalid function 'p_mfgr'
org.apache.hadoop.hive.ql.parse.SemanticException: Line 2:38 Invalid function 'p_mfgr'
	at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:678)
	at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:908)
	at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:87)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:124)
	at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:101)
	at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:166)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:8895)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2634)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2433)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:7234)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:7200)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:7978)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8651)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:259)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:898)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
{noformat}
[jira] [Updated] (HIVE-4110) Aggregation functions must have aliases when multiple functions are used
[ https://issues.apache.org/jira/browse/HIVE-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4110:
-------------------------------

Attachment: HIVE-4110-0.patch

Aggregation functions must have aliases when multiple functions are used
------------------------------------------------------------------------

Key: HIVE-4110
URL: https://issues.apache.org/jira/browse/HIVE-4110
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Brock Noland
Attachments: HIVE-4110-0.patch
[jira] [Updated] (HIVE-4110) Aggregation functions must have aliases when multiple functions are used
[ https://issues.apache.org/jira/browse/HIVE-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brock Noland updated HIVE-4110:
-------------------------------

Status: Patch Available (was: Open)

Aggregation functions must have aliases when multiple functions are used
------------------------------------------------------------------------

Key: HIVE-4110
URL: https://issues.apache.org/jira/browse/HIVE-4110
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Brock Noland
Attachments: HIVE-4110-0.patch
[jira] [Commented] (HIVE-4109) Partition by column does not have to be in order by
[ https://issues.apache.org/jira/browse/HIVE-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591549#comment-13591549 ]

Ashutosh Chauhan commented on HIVE-4109:
----------------------------------------

[~rhbutani] Yeah.. I wasn't sure either about legality of such queries. That's why I tried it on postgres. Amazingly, they succeeded on postgres. Need to dig more into it.

Partition by column does not have to be in order by
---------------------------------------------------

Key: HIVE-4109
URL: https://issues.apache.org/jira/browse/HIVE-4109
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Brock Noland

Came up in the review of HIVE-4093.

Ashutosh
{noformat}
I am not sure if this is illegal query. I tried following two queries in postgres, both of them succeeded.
select p_mfgr, avg(p_retailprice) over(partition by p_mfgr, p_type order by p_mfgr) from part;
select p_mfgr, avg(p_retailprice) over(partition by p_mfgr order by p_type,p_mfgr) from part;
{noformat}

Harish
{noformat}
The first one doesn't make sense, right? Order on a subset of the partition columns
The second one: Can we do this with the Hive ReduceOp have the orderColumns be in a different order than the key columns?
{noformat}
[jira] [Commented] (HIVE-4110) Aggregation functions must have aliases when multiple functions are used
[ https://issues.apache.org/jira/browse/HIVE-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591551#comment-13591551 ]

Harish Butani commented on HIVE-4110:
-------------------------------------

this is going to get fixed with 4081.
We don't need to set the alias based on the AST tree. Just using the internalNames _wcol1, _wcol2,...
The internal names are used as fields in OI, so this is the safest way to set the aliases.
Will post a patch tonight or latest by tomorrow.
Let me post the patch also, and then let's discuss. Sorry if this is duplicate effort.

Aggregation functions must have aliases when multiple functions are used
------------------------------------------------------------------------

Key: HIVE-4110
URL: https://issues.apache.org/jira/browse/HIVE-4110
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Brock Noland
Attachments: HIVE-4110-0.patch
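The aliasing approach described in that comment can be pictured with a small Java sketch: each window expression receives a generated internal name instead of an alias derived from the AST. This is only an illustration of the naming scheme mentioned above (_wcol1, _wcol2, ...). The class and method names are hypothetical, and starting the numbering at 1 simply follows the comment's wording rather than any confirmed Hive behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative generation of internal window-column aliases; names are hypothetical. */
final class WindowAliasSketch {

    private static final String PREFIX = "_wcol";

    /** Returns one generated alias per window expression, numbered from 1. */
    static List<String> internalAliases(int windowExpressionCount) {
        List<String> aliases = new ArrayList<>();
        for (int i = 1; i <= windowExpressionCount; i++) {
            aliases.add(PREFIX + i); // unique by construction, independent of the expression text
        }
        return aliases;
    }
}
```

For the failing query, which has two window expressions (lead and lag), the sketch produces `_wcol1` and `_wcol2`, so neither expression needs a user-supplied alias to be distinguishable.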
[jira] [Commented] (HIVE-4110) Aggregation functions must have aliases when multiple functions are used
[ https://issues.apache.org/jira/browse/HIVE-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591561#comment-13591561 ]

Brock Noland commented on HIVE-4110:

Ok sounds good! Don't worry about duplicating work. Finding this one was a great learning experience!

Aggregation functions must have aliases when multiple functions are used
Key: HIVE-4110
URL: https://issues.apache.org/jira/browse/HIVE-4110
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Brock Noland
Attachments: HIVE-4110-0.patch

The following query fails:
{noformat}
select p_mfgr, p_retailprice, p_size, lead(p_retailprice) over(partition by p_mfgr order by p_size), lag(p_retailprice) over(partition by p_mfgr order by p_size) from part;
{noformat}
with the error below:
{noformat}
2013-03-02 16:10:47,126 ERROR ql.Driver (SessionState.java:printError(401)) - FAILED: SemanticException [Error 10011]: Line 2:38 Invalid function 'p_mfgr'
org.apache.hadoop.hive.ql.parse.SemanticException: Line 2:38 Invalid function 'p_mfgr'
 at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:678)
 at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:908)
 at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:87)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:124)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:101)
 at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:166)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:8895)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2634)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:2433)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:7234)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:7200)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:7978)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8651)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:259)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:898)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
{noformat}
[jira] [Commented] (HIVE-4015) Add ORC file to the grammar as a file format
[ https://issues.apache.org/jira/browse/HIVE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591563#comment-13591563 ]

Gunther Hagleitner commented on HIVE-4015:

~owen: Thanks, I've removed the optional overwrite of the serde when using Orc. Here's the updated diff: https://reviews.facebook.net/D9057

Add ORC file to the grammar as a file format
Key: HIVE-4015
URL: https://issues.apache.org/jira/browse/HIVE-4015
Project: Hive
Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Gunther Hagleitner
Attachments: HIVE-4015.1.patch, HIVE-4015.2.patch, HIVE-4015.3.patch, HIVE-4015.4.patch

It would be much more convenient for users if we enable them to use ORC as a file format in the HQL grammar.
[jira] [Updated] (HIVE-4015) Add ORC file to the grammar as a file format
[ https://issues.apache.org/jira/browse/HIVE-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-4015:

Attachment: HIVE-4015.4.patch

Add ORC file to the grammar as a file format
Key: HIVE-4015
URL: https://issues.apache.org/jira/browse/HIVE-4015
Project: Hive
Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Gunther Hagleitner
Attachments: HIVE-4015.1.patch, HIVE-4015.2.patch, HIVE-4015.3.patch, HIVE-4015.4.patch

It would be much more convenient for users if we enable them to use ORC as a file format in the HQL grammar.
[jira] [Updated] (HIVE-3862) testNegativeCliDriver_cascade_dbdrop fails on hadoop-1
[ https://issues.apache.org/jira/browse/HIVE-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-3862:

Status: Patch Available (was: Open)

testNegativeCliDriver_cascade_dbdrop fails on hadoop-1
Key: HIVE-3862
URL: https://issues.apache.org/jira/browse/HIVE-3862
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
Attachments: HIVE-3862.patch

The functionality itself is working correctly, but an incorrect include/exclude macro causes the wrong query file to be run.
Hive-trunk-h0.21 - Build # 1996 - Still Failing
Changes for Build #1995 Changes for Build #1996 No tests ran. The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1996) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1996/ to view the results.
Build failed in Jenkins: Hive-0.10.0-SNAPSHOT-h0.20.1 #81
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/81/ -- [...truncated 2367 lines...] A ql/src/test/results/clientpositive/udf_negative.q.out A ql/src/test/results/clientpositive/input20.q.out A ql/src/test/results/clientpositive/ppd_outer_join4.q.out A ql/src/test/results/clientpositive/input43.q.out A ql/src/test/results/clientpositive/udf_dayofmonth.q.out A ql/src/test/results/clientpositive/regex_col.q.out A ql/src/test/results/clientpositive/partition_wise_fileformat.q.out A ql/src/test/results/clientpositive/quote2.q.out A ql/src/test/results/clientpositive/groupby_map_ppr_multi_distinct.q.out A ql/src/test/results/clientpositive/merge1.q.out A ql/src/test/results/clientpositive/udf_regexp_extract.q.out A ql/src/test/results/clientpositive/lineage1.q.out A ql/src/test/results/clientpositive/load_dyn_part15.q.out A ql/src/test/results/clientpositive/udf_ascii.q.out A ql/src/test/results/clientpositive/input8.q.out A ql/src/test/results/clientpositive/filter_join_breaktask.q.out A ql/src/test/results/clientpositive/auto_join24.q.out A ql/src/test/results/clientpositive/union16.q.out A ql/src/test/results/clientpositive/udf6.q.out A ql/src/test/results/clientpositive/join17.q.out A ql/src/test/results/clientpositive/sort_merge_join_desc_5.q.out A ql/src/test/results/clientpositive/udf_double.q.out A ql/src/test/results/clientpositive/nullgroup3.q.out A ql/src/test/results/clientpositive/udf_concat_insert2.q.out A ql/src/test/results/clientpositive/input_part9.q.out A ql/src/test/results/clientpositive/create_insert_outputformat.q.out A ql/src/test/results/clientpositive/udf_atan.q.out A ql/src/test/results/clientpositive/bucketmapjoin7.q.out A ql/src/test/results/clientpositive/alter_table_serde2.q.out A ql/src/test/results/clientpositive/udf_rand.q.out A ql/src/test/results/clientpositive/insert_into6.q.out A ql/src/test/results/clientpositive/auto_join10.q.out A ql/src/test/results/clientpositive/ppr_pushdown.q.out A 
ql/src/test/results/clientpositive/leftsemijoin.q.out A ql/src/test/results/clientpositive/udf_logic_java_boolean.q.out A ql/src/test/results/clientpositive/split_sample.q.out A ql/src/test/results/clientpositive/bucketmapjoin11.q.out A ql/src/test/results/clientpositive/avro_evolved_schemas.q.out A ql/src/test/results/clientpositive/union25.q.out A ql/src/test/results/clientpositive/udtf_json_tuple.q.out A ql/src/test/results/clientpositive/groupby_neg_float.q.out A ql/src/test/results/clientpositive/join26.q.out A ql/src/test/results/clientpositive/udf_to_date.q.out A ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out A ql/src/test/results/clientpositive/rcfile_default_format.q.out A ql/src/test/results/clientpositive/load_dyn_part10.q.out A ql/src/test/results/clientpositive/udf_hash.q.out A ql/src/test/results/clientpositive/merge_dynamic_partition5.q.out A ql/src/test/results/clientpositive/escape2.q.out A ql/src/test/results/clientpositive/alter_concatenate_indexed_table.q.out A ql/src/test/results/clientpositive/input3.q.out A ql/src/test/results/clientpositive/udf_date_add.q.out A ql/src/test/results/clientpositive/stats5.q.out A ql/src/test/results/clientpositive/count.q.out A ql/src/test/results/clientpositive/union11.q.out A ql/src/test/results/clientpositive/index_compact_1.q.out A ql/src/test/results/clientpositive/udf1.q.out A ql/src/test/results/clientpositive/alter_table_serde.q.out A ql/src/test/results/clientpositive/join12.q.out A ql/src/test/results/clientpositive/virtual_column.q.out A ql/src/test/results/clientpositive/ppd_join_filter.q.out A ql/src/test/results/clientpositive/join35.q.out A ql/src/test/results/clientpositive/lock1.q.out A ql/src/test/results/clientpositive/input_part4.q.out A ql/src/test/results/clientpositive/ba_table_udfs.q.out A ql/src/test/results/clientpositive/bucketmapjoin2.q.out A ql/src/test/results/clientpositive/index_compact.q.out A ql/src/test/results/clientpositive/udf_concat_ws.q.out A 
ql/src/test/results/clientpositive/mapjoin_subquery2.q.out A ql/src/test/results/clientpositive/mapjoin1.q.out A ql/src/test/results/clientpositive/insert_into1.q.out A ql/src/test/results/clientpositive/groupby_grouping_sets1.q.out A ql/src/test/results/clientpositive/drop_table_removes_partition_dirs.q.out A ql/src/test/results/clientpositive/smb_mapjoin9.q.out A
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #308
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/308/ -- [...truncated 4584 lines...] A ql/src/java/org/apache/hadoop/hive/ql/exec/errors/RegexErrorHeuristic.java A ql/src/java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java A ql/src/java/org/apache/hadoop/hive/ql/exec/errors/package-info.java A ql/src/java/org/apache/hadoop/hive/ql/exec/errors/MapAggrMemErrorHeuristic.java A ql/src/java/org/apache/hadoop/hive/ql/exec/errors/ErrorAndSolution.java A ql/src/java/org/apache/hadoop/hive/ql/exec/errors/ErrorHeuristic.java A ql/src/java/org/apache/hadoop/hive/ql/exec/errors/ScriptErrorHeuristic.java A ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java A ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinMetaData.java A ql/src/java/org/apache/hadoop/hive/ql/exec/AutoProgressor.java A ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java A ql/src/java/org/apache/hadoop/hive/ql/plan A ql/src/java/org/apache/hadoop/hive/ql/plan/ShowIndexesDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/UDTFDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/UnlockTableDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ScriptDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ForwardDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java A ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/MoveWork.java A ql/src/java/org/apache/hadoop/hive/ql/plan/DropFunctionDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/DropTableDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java A ql/src/java/org/apache/hadoop/hive/ql/plan/CopyWork.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ArchiveDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/AddPartitionDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/FilterDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/AggregationDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/UnionDesc.java A 
ql/src/java/org/apache/hadoop/hive/ql/plan/FunctionWork.java A ql/src/java/org/apache/hadoop/hive/ql/plan/RevokeDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/DropIndexDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/FetchWork.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeGenericFuncDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ShowPartitionsDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/Explain.java A ql/src/java/org/apache/hadoop/hive/ql/plan/LateralViewForwardDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ShowTablesDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ShowGrantDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ShowLocksDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/PrivilegeObjectDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/LimitDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeNullDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ExplosionDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/PrincipalDesc.java AUql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionSpec.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ArchiveWork.java A ql/src/java/org/apache/hadoop/hive/ql/plan/SwitchDatabaseDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/DescTableDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolver.java A ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ShowDatabasesDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/CreateDatabaseDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java A ql/src/java/org/apache/hadoop/hive/ql/plan/CreateIndexDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/GrantRevokeRoleDDL.java A 
ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/SchemaDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java A ql/src/java/org/apache/hadoop/hive/ql/plan/LockTableDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/SMBJoinDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/RenamePartitionDesc.java A ql/src/java/org/apache/hadoop/hive/ql/plan/ExplainWork.java
[jira] [Commented] (HIVE-4007) Create abstract classes for serializer and deserializer
[ https://issues.apache.org/jira/browse/HIVE-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591614#comment-13591614 ]

Ashutosh Chauhan commented on HIVE-4007:

Just so we are on the same page, let me say how it ought to work: We need not remove the Serde interface in the medium term. In the Hive codebase we still refer to implementations everywhere by Serde (not by AbstractSerde). This will let existing serde implementations work without needing to change anything on their part. Then, let's say we add new APIs to AbstractSerde. In the Hive codebase, using instanceof, we can determine whether a Serde is of type AbstractSerde and call those new APIs *only if* it implements them. This way we can keep supporting old implementations without breaking them when we add new APIs to AbstractSerde. Of course, we encourage everyone to start switching to AbstractSerde as soon as they can, so that we don't need to do the ugly instanceof forever. Do you agree with this approach?

Create abstract classes for serializer and deserializer
Key: HIVE-4007
URL: https://issues.apache.org/jira/browse/HIVE-4007
Project: Hive
Issue Type: Improvement
Components: Serializers/Deserializers
Reporter: Namit Jain
Assignee: Namit Jain
Attachments: hive.4007.1.patch, hive.4007.2.patch, hive.4007.3.patch, hive.4007.4.patch

Currently, it is very difficult to change the Serializer/Deserializer interface, since all the SerDes directly implement the interface. Instead, we should have abstract classes for implementing these interfaces. In case of an interface change, only the abstract class and the relevant serde need to change.
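The instanceof-based dispatch described in the comment above could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Serde` and `AbstractSerde` types here are small stand-ins defined inside the example, not Hive's real classes, and `describeStats` is a hypothetical "new API" invented for the demonstration.

```java
// Stand-in types for illustration only; these are NOT Hive's real serde classes.
public class SerdeDispatchSketch {

    /** The long-standing interface that existing implementations target. */
    interface Serde {
        String serialize(Object row);
    }

    /** Abstract base class; new APIs are added here rather than to the interface. */
    abstract static class AbstractSerde implements Serde {
        // Hypothetical new API with a default, so subclasses need not override it.
        String describeStats() {
            return "no stats";
        }
    }

    /** An old implementation that only knows the interface and is left unchanged. */
    static class LegacySerde implements Serde {
        public String serialize(Object row) {
            return String.valueOf(row);
        }
    }

    /** A newer implementation that extends the abstract class. */
    static class NewSerde extends AbstractSerde {
        public String serialize(Object row) {
            return "[" + row + "]";
        }

        @Override
        String describeStats() {
            return "rows=1";
        }
    }

    /** Callers keep referring to Serde and use the new API only when it is available. */
    static String statsFor(Serde serde) {
        if (serde instanceof AbstractSerde) {
            return ((AbstractSerde) serde).describeStats();
        }
        return "unsupported";
    }

    public static void main(String[] args) {
        System.out.println(statsFor(new LegacySerde()));
        System.out.println(statsFor(new NewSerde()));
    }
}
```

The trade-off is the one the comment names: old implementations keep compiling and running, at the cost of an instanceof check at each call site until they migrate to the abstract class.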
[jira] [Commented] (HIVE-4042) ignore mapjoin hint
[ https://issues.apache.org/jira/browse/HIVE-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591622#comment-13591622 ]

Ashutosh Chauhan commented on HIVE-4042:

I think we should get HIVE-3891 in asap (I am reviewing it) and then either always ignore the mapjoin hint or never (depending on this new config). I don't like the idea of not ignoring it for the bucketed/sorted join case. I think it will be burdensome for users to reason about whether to enable this config if the hint is ignored in some cases but not all. Then they will ask which cases those are and why. To avoid all this unnecessary explanation, let's either always ignore it or never. Also, you can set the value of this config to false in data/conf/hive-site.xml, which is used for tests, so that the patch need not update test outputs.

ignore mapjoin hint
Key: HIVE-4042
URL: https://issues.apache.org/jira/browse/HIVE-4042
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
Attachments: hive.4042.1.patch, hive.4042.2.patch, hive.4042.3.patch, hive.4042.4.patch, hive.4042.5.patch, hive.4042.6.patch, hive.4042.7.patch, hive.4042.8.patch

After HIVE-3784, in a production environment, it can become difficult to deploy since a lot of production queries can break.
[jira] [Commented] (HIVE-4104) Hive localtask does not buffer disk-writes or reads
[ https://issues.apache.org/jira/browse/HIVE-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591628#comment-13591628 ]

Ashutosh Chauhan commented on HIVE-4104:

+1 Nice catch, Gopal. Any guidance on the size of the buffer? I wonder whether a higher value (8K? 64K?) would result in higher savings.

Hive localtask does not buffer disk-writes or reads
Key: HIVE-4104
URL: https://issues.apache.org/jira/browse/HIVE-4104
Project: Hive
Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
Attachments: HIVE-4104.patch

Hive's HashMapWrapper does not use any buffering in its File I/O, but operates sequentially for writes and reads. The strace logs show clearly that
{code}
9495 write(222, "x", 1) = 1
9495 write(222, "sq\0~\0\5", 6) = 6
9495 write(222, "w\25", 2) = 2
9495 write(222, "\0\0\0\1\0\0\0\1\0\0\0\2\0\0\0\5\3\1M\1S", 21) = 21
9495 write(222, "x", 1) = 1
9495 write(222, "sq\0~\0\2", 6) = 6
9495 write(222, "w\t", 2) = 2
9495 write(222, "\0\0\0\5\1\215\r\325v", 9) = 9
{code}
[jira] [Commented] (HIVE-4020) Swap applying order of CP and PPD
[ https://issues.apache.org/jira/browse/HIVE-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591640#comment-13591640 ]

Ashutosh Chauhan commented on HIVE-4020:

+1. Navis, can you refresh the patch? I suspect the .q.out files need to be updated because of other recent commits. I will test it and get it in quickly before it goes stale again.

Swap applying order of CP and PPD
Key: HIVE-4020
URL: https://issues.apache.org/jira/browse/HIVE-4020
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
Attachments: HIVE-4020.D8571.1.patch

While working on HIVE-2340, I found that CP (column pruning) removed some column mappings needed for backtracking expression descs. Swapping the order of CP and PPD (predicate pushdown) solved the problem. After that I realized that CP at an earlier stage becomes possible once PPD is applied, because some columns in the filter predicate are not selected and can be removed right after the newly pushed-down filter. For example (bucketmapjoin1.q):
{noformat}
select /*+mapjoin(b)*/ a.key, a.value, b.value from srcbucket_mapjoin_part a join srcbucket_mapjoin_part_2 b on a.key=b.key where b.ds="2008-04-08"
{noformat}
the plan for the hashtable sink operator is changed to
{noformat}
HashTable Sink Operator condition expressions: 0 {key} {value} 1 {value}
{noformat}
which was
{noformat}
HashTable Sink Operator condition expressions: 0 {key} {value} 1 {value} {ds}
{noformat}
HIVE-2340 seemed to need more time before commit, so this is booked as a separate issue.
[jira] [Commented] (HIVE-3464) Merging join tree may reorder joins which could be invalid
[ https://issues.apache.org/jira/browse/HIVE-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591644#comment-13591644 ]

Ashutosh Chauhan commented on HIVE-3464:

[~namit] you want to take a relook at this one?

Merging join tree may reorder joins which could be invalid
Key: HIVE-3464
URL: https://issues.apache.org/jira/browse/HIVE-3464
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Attachments: HIVE-3464.D5409.2.patch, HIVE-3464.D5409.3.patch

Currently, hive merges join tree from right to left regardless of join types, which may introduce join reordering. For example,

select * from a join a b on a.key=b.key join a c on b.key=c.key join a d on a.key=d.key;

Hive tries to merge join tree in a-d=b-d, a-d=a-b, b-c=a-b order and a-d=a-b and b-c=a-b will be merged. Final join tree is a-(bdc). With this, ab-d join will be executed prior to ab-c. But if join type of -c and -d is different, this is not valid.
[jira] [Commented] (HIVE-4009) CLI Tests fail randomly due to MapReduce LocalJobRunner race condition
[ https://issues.apache.org/jira/browse/HIVE-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591651#comment-13591651 ]

Ashutosh Chauhan commented on HIVE-4009:

I never hit this in my CLI tests. Brock, can you describe the situation in which you ran into it? Are these HiveServer2 tests?

CLI Tests fail randomly due to MapReduce LocalJobRunner race condition
Key: HIVE-4009
URL: https://issues.apache.org/jira/browse/HIVE-4009
Project: Hive
Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Brock Noland
Attachments: HIVE-4009-0.patch

Hadoop has a race condition, MAPREDUCE-5001, which causes tests to fail randomly when using LocalJobRunner.
[jira] [Updated] (HIVE-4082) Break up ptf tests in PTF, Windowing and Lead/Lag tests
[ https://issues.apache.org/jira/browse/HIVE-4082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4082:

Attachment: HIVE-4082.D9033.2.patch

pkalmegh updated the revision HIVE-4082 [jira] Break up ptf tests in PTF, Windowing and Lead/Lag tests.

- HIVE-4082: Refactor tests
- Resolve merge issues after merge with ptf-windowing
- HIVE-4082 [jira] Break up ptf tests in PTF, Windowing and Lead/Lag tests

Reviewers: JIRA

REVISION DETAIL
https://reviews.facebook.net/D9033

CHANGE SINCE LAST DIFF
https://reviews.facebook.net/D9033?vs=29001&id=29079#toc

AFFECTED FILES
data/files/flights_tiny.txt
data/files/part.rc
data/files/part.seq
ql/src/test/queries/clientpositive/leadlag.q
ql/src/test/queries/clientpositive/ptf.q
ql/src/test/queries/clientpositive/ptf_general_queries.q
ql/src/test/queries/clientpositive/ptf_npath.q
ql/src/test/queries/clientpositive/ptf_over_no_partition_by.q
ql/src/test/queries/clientpositive/ptf_window_boundaries.q
ql/src/test/queries/clientpositive/windowing.q
ql/src/test/results/clientpositive/leadlag.q.out
ql/src/test/results/clientpositive/ptf.q.out
ql/src/test/results/clientpositive/ptf_general_queries.q.out
ql/src/test/results/clientpositive/ptf_npath.q.out
ql/src/test/results/clientpositive/ptf_over_no_partition_by.q.out
ql/src/test/results/clientpositive/ptf_window_boundaries.q.out
ql/src/test/results/clientpositive/windowing.q.out

To: JIRA, pkalmegh

Break up ptf tests in PTF, Windowing and Lead/Lag tests
Key: HIVE-4082
URL: https://issues.apache.org/jira/browse/HIVE-4082
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Prajakta Kalmegh
Attachments: HIVE-4082.D9033.1.patch, HIVE-4082.D9033.2.patch
[jira] [Commented] (HIVE-4082) Break up ptf tests in PTF, Windowing and Lead/Lag tests
[ https://issues.apache.org/jira/browse/HIVE-4082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591657#comment-13591657 ]

Prajakta Kalmegh commented on HIVE-4082:

Ashutosh, please see the patch updated with changes in HIVE-3987. I have merged the queries from ptf_over_no_partition_by.q and ptf_window_boundaries.q in the new tests. Can you please review and commit this patch before others?

Break up ptf tests in PTF, Windowing and Lead/Lag tests
Key: HIVE-4082
URL: https://issues.apache.org/jira/browse/HIVE-4082
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Prajakta Kalmegh
Attachments: HIVE-4082.D9033.1.patch, HIVE-4082.D9033.2.patch
[jira] [Updated] (HIVE-4081) allow expressions with over clause
[ https://issues.apache.org/jira/browse/HIVE-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4081:

Attachment: HIVE-4081.D9063.1.patch

hbutani requested code review of HIVE-4081 [jira] allow expressions with over clause.

Reviewers: JIRA

allow expressions with over clause

remove current restriction where only a UDAF invocation is allowed with a windowing specification

TEST PLAN
EMPTY

REVISION DETAIL
https://reviews.facebook.net/D9063

AFFECTED FILES
ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java
ql/src/test/queries/clientpositive/windowing_expressions.q
ql/src/test/results/clientpositive/windowing_expressions.q.out

To: JIRA, hbutani

allow expressions with over clause
Key: HIVE-4081
URL: https://issues.apache.org/jira/browse/HIVE-4081
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Harish Butani
Attachments: HIVE-4081.D9063.1.patch

remove current restriction where only a UDAF invocation is allowed with a windowing specification
[jira] [Commented] (HIVE-4081) allow expressions with over clause
[ https://issues.apache.org/jira/browse/HIVE-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591664#comment-13591664 ]

Harish Butani commented on HIVE-4081:

This also includes the fix for the alias issue: Jiras 4084, 4110

allow expressions with over clause
Key: HIVE-4081
URL: https://issues.apache.org/jira/browse/HIVE-4081
Project: Hive
Issue Type: Bug
Components: PTF-Windowing
Reporter: Harish Butani
Assignee: Harish Butani
Attachments: HIVE-4081.D9063.1.patch

remove current restriction where only a UDAF invocation is allowed with a windowing specification
[jira] [Commented] (HIVE-4104) Hive localtask does not buffer disk-writes or reads
[ https://issues.apache.org/jira/browse/HIVE-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13591668#comment-13591668 ]

Gopal V commented on HIVE-4104:

Increasing the size up from 4kb to 64kb did not seem to make any appreciable difference. I would assume that 4kb works because it is the disk write block-size.

Hive localtask does not buffer disk-writes or reads
Key: HIVE-4104
URL: https://issues.apache.org/jira/browse/HIVE-4104
Project: Hive
Issue Type: Bug
Reporter: Gopal V
Assignee: Gopal V
Priority: Minor
Attachments: HIVE-4104.patch

Hive's HashMapWrapper does not use any buffering in its File I/O, but operates sequentially for writes and reads. The strace logs show clearly that
{code}
9495 write(222, "x", 1) = 1
9495 write(222, "sq\0~\0\5", 6) = 6
9495 write(222, "w\25", 2) = 2
9495 write(222, "\0\0\0\1\0\0\0\1\0\0\0\2\0\0\0\5\3\1M\1S", 21) = 21
9495 write(222, "x", 1) = 1
9495 write(222, "sq\0~\0\2", 6) = 6
9495 write(222, "w\t", 2) = 2
9495 write(222, "\0\0\0\5\1\215\r\325v", 9) = 9
{code}
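To make the effect of buffering concrete, here is a minimal sketch that is not taken from the HIVE-4104 patch. It counts how many write calls reach an underlying stream with and without a `java.io.BufferedOutputStream` in front of it. The class `BufferedWriteSketch`, the `CountingOutputStream` helper and the 6-byte record (modelled on one of the small writes in the strace output) are assumptions made for illustration. In the real task each call that reaches the file stream would become a `write` syscall.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class BufferedWriteSketch {

    /** Hypothetical sink that counts the write calls and bytes that reach it. */
    static class CountingOutputStream extends OutputStream {
        int calls = 0;
        long bytes = 0;

        @Override
        public void write(int b) {
            calls++;
            bytes++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
            bytes += len;
        }
    }

    /**
     * Writes `records` small records to a counting sink.
     * A bufferSize of 0 means no buffering; otherwise the sink is wrapped
     * in a BufferedOutputStream of that size.
     */
    static CountingOutputStream writeRecords(int records, int bufferSize) {
        CountingOutputStream sink = new CountingOutputStream();
        OutputStream out = bufferSize > 0 ? new BufferedOutputStream(sink, bufferSize) : sink;
        byte[] record = {'s', 'q', 0, '~', 0, 5}; // 6 bytes, like one write in the strace log
        try {
            for (int i = 0; i < records; i++) {
                out.write(record);
            }
            out.flush(); // push any bytes still held in the buffer down to the sink
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink;
    }

    public static void main(String[] args) {
        CountingOutputStream raw = writeRecords(1000, 0);
        CountingOutputStream small = writeRecords(1000, 4 * 1024);
        CountingOutputStream large = writeRecords(1000, 64 * 1024);
        System.out.println("unbuffered calls: " + raw.calls);
        System.out.println("4 KB buffer calls: " + small.calls);
        System.out.println("64 KB buffer calls: " + large.calls);
    }
}
```

With records this small, any buffer collapses most of the per-record calls, which is in line with the observation above that going from 4kb to 64kb made no appreciable difference. The sketch only counts calls and says nothing about actual disk throughput.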