[jira] [Created] (HIVE-7346) Wrong results caused by hive ppd under specific join condition
dima machlin created HIVE-7346: -- Summary: Wrong results caused by hive ppd under specific join condition Key: HIVE-7346 URL: https://issues.apache.org/jira/browse/HIVE-7346 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: dima machlin Assuming two tables : {code:sql} t1(id1 string,id2 string) , t2 (id string,d int) {code} t1 contains 1 row : 'a','a' t2 contains 1 row : 'a',2 The following query : {code:sql} select a.*,b.d d1,c.d d2 from t1 a join t2 b on (a.id1=b.id) join t2 c on (a.id2=b.id) where b.d =1 and c.d=1 {code} Returns 0 rows as expected because t2.d = 2 Wrapping this query, like so : {code:sql} select * from ( select a.*,b.d d1,c.d d2 from t1 a join t2 b on (a.id1=b.id) join t2 c on (a.id2=b.id) where b.d =1 and c.d=1 ) z where d11 or d21 {code} Where another filter was add on the columns causes the plan to lack the filter of the =1 and return a single row - *Wrong Results*. The plan is : {code:sql} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN (TOK_TABREF (TOK_TABNAME t1) a) (TOK_TABREF (TOK_TABNAME t2) b) (= (. (TOK_TABLE_OR_COL a) id1) (. (TOK_TABLE_OR_COL b) id))) (TOK_TABREF (TOK_TABNAME t2) c) (= (. (TOK_TABLE_OR_COL a) id2) (. (TOK_TABLE_OR_COL b) id (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_ALLCOLREF (TOK_TABNAME a))) (TOK_SELEXPR (. (TOK_TABLE_OR_COL b) d) d1) (TOK_SELEXPR (. (TOK_TABLE_OR_COL c) d) d2)) (TOK_WHERE (and (= (. (TOK_TABLE_OR_COL b) d) 1) (= (. 
(TOK_TABLE_OR_COL c) d) 1) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (or ( (TOK_TABLE_OR_COL d1) 1) ( (TOK_TABLE_OR_COL d2) 1) STAGE DEPENDENCIES: Stage-7 is a root stage Stage-5 depends on stages: Stage-7 Stage-0 is a root stage STAGE PLANS: Stage: Stage-7 Map Reduce Local Work Alias - Map Local Tables: z:b Fetch Operator limit: -1 z:c Fetch Operator limit: -1 Alias - Map Local Operator Tree: z:b TableScan alias: b HashTable Sink Operator condition expressions: 0 {id1} {id2} 1 {id} {d} handleSkewJoin: false keys: 0 [Column[id1]] 1 [Column[id]] Position of Big Table: 0 z:c TableScan alias: c HashTable Sink Operator condition expressions: 0 {_col5} {_col0} {_col1} 1 {d} handleSkewJoin: false keys: 0 [] 1 [] Position of Big Table: 0 Stage: Stage-5 Map Reduce Alias - Map Operator Tree: z:a TableScan alias: a Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {id1} {id2} 1 {id} {d} handleSkewJoin: false keys: 0 [Column[id1]] 1 [Column[id]] outputColumnNames: _col0, _col1, _col4, _col5 Position of Big Table: 0 Filter Operator predicate: expr: (_col1 = _col4) type: boolean Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {_col5} {_col0} {_col1} 1 {d} handleSkewJoin: false keys: 0 [] 1 [] outputColumnNames: _col1, _col4, _col5, _col9 Position of Big Table: 0 Filter Operator predicate: expr: ((_col1 1) or (_col9 1)) type: boolean Select Operator expressions: expr: _col4 type: string expr: _col5 type: string expr: _col1 type: int expr: _col9 type: int outputColumnNames: _col0, _col1, _col2, _col3 File Output Operator compressed: false GlobalTableId: 0 table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat Local Work: Map Reduce Local
[jira] [Created] (HIVE-7314) Wrong results of UDF when hive.cache.expr.evaluation is set
dima machlin created HIVE-7314:
----------------------------------

Summary: Wrong results of UDF when hive.cache.expr.evaluation is set
Key: HIVE-7314
URL: https://issues.apache.org/jira/browse/HIVE-7314
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0
Reporter: dima machlin

It seems that expression caching does not work correctly when a UDF is used inside another UDF or a Hive function.

For example, tbl has one row: 'a','b'

The following query:

select concat(custUDF(a),' ', custUDF(b)) from tbl;

returns 'a a'. Hive seems to cache custUDF(a) and use it for custUDF(b).
The same query without the concat works fine.
Replacing the concat with another custom UDF also returns 'a a'.

--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7314) Wrong results of UDF when hive.cache.expr.evaluation is set
[ https://issues.apache.org/jira/browse/HIVE-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dima machlin updated HIVE-7314:
-------------------------------

Description:
It seems that expression caching does not work correctly when a UDF is used inside another UDF or a Hive function.

For example, tbl has one row: 'a','b'

The following query:
{code:sql}
select concat(custUDF(a),' ', custUDF(b)) from tbl;
{code}
returns 'a a'. Hive seems to cache custUDF(a) and use it for custUDF(b).
The same query without the concat works fine.
Replacing the concat with another custom UDF also returns 'a a'.

was:
It seems that the expression caching doesn't work when using UDF inside another UDF or a hive function.

For example : tbl has one row : 'a','b'

The following query :
select concat(custUDF(a),' ', custUDF(b)) from tbl;
returns 'a a'
seems to cache custUDF(a) and use it for custUDF(b).
Same query without the concat works fine.
Replacing the concat with another custom UDF also returns 'a a'
[jira] [Commented] (HIVE-7314) Wrong results of UDF when hive.cache.expr.evaluation is set
[ https://issues.apache.org/jira/browse/HIVE-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047477#comment-14047477 ]

dima machlin commented on HIVE-7314:
------------------------------------

{code:java}
public String getDisplayString(String[] children) {
  return "PurswayCleanNameUDF";
}
{code}
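The getDisplayString in the comment above returns the same string no matter which arguments the UDF receives. One way this could produce the 'a a' result is sketched below. It assumes that cached evaluations are looked up by an expression string built from getDisplayString; the thread does not show Hive's cache code, so this is an assumption. The class, the method name run, and the identity-style custUDF are hypothetical and exist only for this illustration. The sketch also does not explain why the query works without the concat.

```java
import java.util.HashMap;
import java.util.Map;

public class DisplayStringCacheSketch {

    /**
     * Evaluates custUDF(a) and custUDF(b) for the single row ('a','b')
     * through a cache keyed on a display string (hypothetical model,
     * not Hive's actual implementation).
     *
     * @param includeChildren whether the display string contains the argument names
     */
    static String run(boolean includeChildren) {
        Map<String, String> cache = new HashMap<>();
        String[] columns = {"a", "b"}; // column names used as UDF arguments
        String[] row = {"a", "b"};     // tbl has one row: 'a','b'

        StringBuilder out = new StringBuilder();
        for (int i = 0; i < columns.length; i++) {
            // A constant display string gives both calls the same cache key.
            String key = includeChildren ? "custUDF(" + columns[i] + ")" : "custUDF";
            final String value = row[i];
            // custUDF is modelled as the identity function for simplicity.
            String result = cache.computeIfAbsent(key, k -> value);
            if (i > 0) {
                out.append(' ');
            }
            out.append(result);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // constant display string
        System.out.println(run(true));  // display string includes the argument
    }
}
```

Under this assumption, a display string that includes the children gives each call its own key, so the second call is no longer served from the first call's cached value.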
[jira] [Commented] (HIVE-7314) Wrong results of UDF when hive.cache.expr.evaluation is set
[ https://issues.apache.org/jira/browse/HIVE-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047493#comment-14047493 ]

dima machlin commented on HIVE-7314:
------------------------------------

This is the workaround I'm using. Thank you.
But why does it work as separate columns and not inside a second function?

Wrong results of UDF when hive.cache.expr.evaluation is set
-----------------------------------------------------------

Key: HIVE-7314
URL: https://issues.apache.org/jira/browse/HIVE-7314
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0, 0.13.0, 0.13.1
Reporter: dima machlin
Assignee: Navis
Attachments: HIVE-7314.1.patch.txt
[jira] [Created] (HIVE-7205) Wrong results when union all of grouping followed by group by with correlation optimization
dima machlin created HIVE-7205: -- Summary: Wrong results when union all of grouping followed by group by with correlation optimization Key: HIVE-7205 URL: https://issues.apache.org/jira/browse/HIVE-7205 Project: Hive Issue Type: Bug Affects Versions: 0.12.0 Reporter: dima machlin Priority: Critical use case : table TBL (a string,b string) contains single row : 'a','a' the following query : select b, sum(cc) from ( select b,count(1) as cc from TBL group by b union all select a as b,count(1) as cc from TBL group by a ) z group by b returns a 1 a 1 while set hive.optimize.correlation=true; if we change set hive.optimize.correlation=false; it returns correct results : a 2 The plan with correlation optimization : ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL b (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a) b) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL a) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_TABLE_OR_COL cc (TOK_GROUPBY (TOK_TABLE_OR_COL b STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias - Map Operator Tree: null-subquery1:z-subquery1:TBL TableScan alias: TBL Select Operator expressions: expr: b type: string outputColumnNames: b Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: b type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: 
_col1 type: bigint null-subquery2:z-subquery2:TBL TableScan alias: TBL Select Operator expressions: expr: a type: string outputColumnNames: a Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: a type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Demux Operator Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Union Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Group By Operator aggregations: expr: sum(_col1) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1 Select Operator
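As a cross-check of the result expected in HIVE-7205, the query's semantics can be reproduced outside Hive. The sketch below is plain Java, not Hive code. The class and method names are hypothetical. It groups the single row ('a','a') by b and by a, unions the two grouped results, and then sums the counts per key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class UnionAllGroupBySketch {

    /** select <col>, count(1) as cc from TBL group by <col> */
    static Map<String, Long> countBy(List<String[]> rows, int col) {
        Map<String, Long> counts = new TreeMap<>();
        for (String[] row : rows) {
            counts.merge(row[col], 1L, Long::sum);
        }
        return counts;
    }

    /** Outer query: select b, sum(cc) from (<group by b> union all <group by a>) z group by b */
    static Map<String, Long> expected(List<String[]> rows) {
        List<Map.Entry<String, Long>> union = new ArrayList<>();
        union.addAll(countBy(rows, 1).entrySet()); // column b
        union.addAll(countBy(rows, 0).entrySet()); // column a, aliased as b
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, Long> e : union) {
            result.merge(e.getKey(), e.getValue(), Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        // TBL (a string, b string) contains a single row: 'a','a'
        List<String[]> tbl = Collections.singletonList(new String[] {"a", "a"});
        System.out.println(expected(tbl));
    }
}
```

The outer group by merges both branches into one row per key, which matches the result reported with hive.optimize.correlation=false. The two separate rows returned with the optimization enabled correspond to the union output before that final merge.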
[jira] [Updated] (HIVE-7205) Wrong results when union all of grouping followed by group by with correlation optimization
[ https://issues.apache.org/jira/browse/HIVE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dima machlin updated HIVE-7205: --- Description: use case : table TBL (a string,b string) contains single row : 'a','a' the following query : select b, sum(cc) from ( select b,count(1) as cc from TBL group by b union all select a as b,count(1) as cc from TBL group by a ) z group by b returns a 1 a 1 while set hive.optimize.correlation=true; if we change set hive.optimize.correlation=false; it returns correct results : a 2 The plan with correlation optimization : {code:sql} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL b (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a) b) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL a) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_TABLE_OR_COL cc (TOK_GROUPBY (TOK_TABLE_OR_COL b STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias - Map Operator Tree: null-subquery1:z-subquery1:TBL TableScan alias: TBL Select Operator expressions: expr: b type: string outputColumnNames: b Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: b type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col1 type: bigint null-subquery2:z-subquery2:TBL TableScan alias: TBL Select Operator expressions: expr: a type: string 
outputColumnNames: a Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: a type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Demux Operator Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Union Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Group By Operator aggregations: expr: sum(_col1) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint
[jira] [Updated] (HIVE-7205) Wrong results when union all of grouping followed by group by with correlation optimization
[ https://issues.apache.org/jira/browse/HIVE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dima machlin updated HIVE-7205: --- Description: use case : table TBL (a string,b string) contains single row : 'a','a' the following query : {code:sql} select b, sum(cc) from ( select b,count(1) as cc from TBL group by b union all select a as b,count(1) as cc from TBL group by a ) z group by b {code} returns a 1 a 1 while set hive.optimize.correlation=true; if we change set hive.optimize.correlation=false; it returns correct results : a 2 The plan with correlation optimization : {code:sql} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL b (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a) b) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL a) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_TABLE_OR_COL cc (TOK_GROUPBY (TOK_TABLE_OR_COL b STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias - Map Operator Tree: null-subquery1:z-subquery1:TBL TableScan alias: TBL Select Operator expressions: expr: b type: string outputColumnNames: b Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: b type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col1 type: bigint null-subquery2:z-subquery2:TBL TableScan alias: TBL Select Operator expressions: expr: 
a type: string outputColumnNames: a Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: a type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 1 value expressions: expr: _col1 type: bigint Reduce Operator Tree: Demux Operator Group By Operator aggregations: expr: count(VALUE._col0) bucketGroup: false keys: expr: KEY._col0 type: string mode: mergepartial outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Union Select Operator expressions: expr: _col0 type: string expr: _col1 type: bigint outputColumnNames: _col0, _col1 Mux Operator Group By Operator aggregations: expr: sum(_col1) bucketGroup: false keys: expr: _col0 type: string mode: complete outputColumnNames: _col0, _col1 Select Operator expressions: expr: _col0 type: string expr: _col1 type:
[jira] [Updated] (HIVE-2627) NPE on MAP-JOIN with a UDF in an external JAR
[ https://issues.apache.org/jira/browse/HIVE-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dima machlin updated HIVE-2627:
-------------------------------

Affects Version/s: 0.12.0

NPE on MAP-JOIN with a UDF in an external JAR
---------------------------------------------

Key: HIVE-2627
URL: https://issues.apache.org/jira/browse/HIVE-2627
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0
Reporter: Jonathan Chang

When a query is converted into a map join, and it depends on some UDF (ADD JAR...; CREATE TEMPORARY FUNCTION...), then an NPE may happen. Here is an example.

SELECT
  some_udf(dummy1) as dummies
FROM (
  SELECT a.dummy as dummy1, b.dummy as dummy2
  FROM test a
  LEFT OUTER JOIN test b ON a.dummy = b.dummy
) c;

My guess is that the JAR classes are not getting propagated to the hashmapjoin operator.
[jira] [Commented] (HIVE-4317) StackOverflowError when add jar concurrently
[ https://issues.apache.org/jira/browse/HIVE-4317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009499#comment-14009499 ]

dima machlin commented on HIVE-4317:
------------------------------------

I can confirm that this also happens in Hive 0.12 and is fully reproducible.

StackOverflowError when add jar concurrently
--------------------------------------------

Key: HIVE-4317
URL: https://issues.apache.org/jira/browse/HIVE-4317
Project: Hive
Issue Type: Bug
Affects Versions: 0.9.0, 0.10.0
Reporter: wangwenli
Attachments: hive-4317.1.patch

Scenario: multiple threads add a jar and run select operations over JDBC concurrently. When HiveServer calls serializeMapRedWork, it sometimes throws a StackOverflowError from XMLEncoder.
[jira] [Commented] (HIVE-2627) NPE on MAP-JOIN with a UDF in an external JAR
[ https://issues.apache.org/jira/browse/HIVE-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009503#comment-14009503 ]

dima machlin commented on HIVE-2627:
------------------------------------

I can confirm that this still happens in Hive 0.12.

Getting:

java.lang.ClassNotFoundException: com.some.class.used.by.UDF
Continuing ...
java.lang.NullPointerException: target should not be null
java.lang.NullPointerException: target should not be null
Continuing ...

and eventually:

ERROR mr.MapredLocalTask: Hive Runtime Error: Map local work failed
java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isStateful(FunctionRegistry.java:1415)
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isDeterministic(FunctionRegistry.java:1385)
	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.isDeterministic(ExprNodeGenericFuncEvaluator.java:132)
	at org.apache.hadoop.hive.ql.exec.FunctionRegistry.isDeterministic(FunctionRegistry.java:1385)
	at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.isDeterministic(ExprNodeGenericFuncEvaluator.java:132)
	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.iterate(ExprNodeEvaluatorFactory.java:83)
	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.iterate(ExprNodeEvaluatorFactory.java:83)
	at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.toCachedEval(ExprNodeEvaluatorFactory.java:73)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:57)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:57)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:453)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:453)
	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:409)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:188)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:188)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
	at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:419)
	at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:305)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:722)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
[jira] [Created] (HIVE-7045) Wrong results in multi-table insert aggregating without group by clause
dima machlin created HIVE-7045:
----------------------------------

Summary: Wrong results in multi-table insert aggregating without group by clause
Key: HIVE-7045
URL: https://issues.apache.org/jira/browse/HIVE-7045
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0, 0.10.0
Reporter: dima machlin

The scenario:

CREATE TABLE t1 (a int, b int);
CREATE TABLE t2 (cnt int) PARTITIONED BY (var_name string);

insert into table t1 select 1,1 from asd limit 1;
insert into table t1 select 2,2 from asd limit 1;

from t1
insert overwrite table t2 partition(var_name='a') select count(a) cnt
insert overwrite table t2 partition(var_name='b') select count(b) cnt;

select * from t2;

returns:
2 a
2 b
as expected.

Setting the number of reducers higher than 1:

set mapred.reduce.tasks=2;

from t1
insert overwrite table t2 partition(var_name='a') select count(a) cnt
insert overwrite table t2 partition(var_name='b') select count(b) cnt;

select * from t2;

returns:
1 a
1 a
1 b
1 b

Wrong results.
This also happens without setting the value explicitly, whenever t1 is big enough for Hive to automatically use more than one reducer.

Adding group by 1 at the end of each insert solves the problem:

from t1
insert overwrite table t2 partition(var_name='a') select count(a) cnt group by 1
insert overwrite table t2 partition(var_name='b') select count(b) cnt group by 1;

generates:
2 a
2 b

This should work without the group by.
Each partition ends up with as many rows as there are reducers, because each reducer writes its own subtotal of the count.

--
This message was sent by Atlassian JIRA (v6.2#6252)
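The last observation in HIVE-7045, one output row per reducer with each row holding a subtotal, can be illustrated with a small simulation. This is plain Java, not Hive code. The class and method names are hypothetical, and the round-robin distribution of rows to reducers is an assumption made only for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class PartialCountSketch {

    /** Each reducer counts only the rows routed to it and emits one row. */
    static List<Long> perReducerCounts(int rows, int reducers) {
        long[] counts = new long[reducers];
        for (int r = 0; r < rows; r++) {
            counts[r % reducers]++; // round-robin routing (assumption)
        }
        List<Long> out = new ArrayList<>();
        for (long c : counts) {
            out.add(c);
        }
        return out;
    }

    /** The final merge step that a global count needs: sum the partial counts. */
    static long mergedCount(List<Long> partials) {
        long total = 0;
        for (long p : partials) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Long> partials = perReducerCounts(2, 2); // t1 has two rows, two reducers
        System.out.println("rows written without a merge: " + partials);
        System.out.println("row written with a merge: " + mergedCount(partials));
    }
}
```

With two rows and two reducers the unmerged output is two rows of 1, which matches the wrong result in the report. Summing the partial counts gives the single row of 2 that the query should return.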
[jira] [Updated] (HIVE-7045) Wrong results in multi-table insert aggregating without group by clause
[ https://issues.apache.org/jira/browse/HIVE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dima machlin updated HIVE-7045: --- Description: The scenario : CREATE TABLE t1 (a int, b int); CREATE TABLE t2 (cnt int) PARTITIONED BY (var_name string); insert into table t1 select 1,1 from asd limit 1; insert into table t1 select 2,2 from asd limit 1; t1 contains : 1 1 2 2 from t1 insert overwrite table t2 partition(var_name='a') select count(a) cnt insert overwrite table t2 partition(var_name='b') select count(b) cnt ; select * from t2; returns : 2 a 2 b as expected. Setting the number of reducers higher than 1 : set mapred.reduce.tasks=2; from t1 insert overwrite table t2 partition(var_name='a') select count(a) cnt insert overwrite table t2 partition(var_name='b') select count(b) cnt; select * from t2; 1 a 1 a 1 b 1 b Wrong results. This happens when ever t1 is big enough to automatically generate more than 1 reducers and without specifying it directly. adding group by 1 in the end of each insert solves the problem : from t1 insert overwrite table t2 partition(var_name='a') select count(a) cnt group by 1 insert overwrite table t2 partition(var_name='b') select count(b) cnt group by 1; generates : 2 a 2 b This should work without the group by... The number of rows for each partition will be the amount of reducers. Each reducer calculated a sub total of the count. was: The scenario : CREATE TABLE t1 (a int, b int); CREATE TABLE t2 (cnt int) PARTITIONED BY (var_name string); insert into table t1 select 1,1 from asd limit 1; insert into table t1 select 2,2 from asd limit 1; from t1 insert overwrite table t2 partition(var_name='a') select count(a) cnt insert overwrite table t2 partition(var_name='b') select count(b) cnt ; select * from t2; returns : 2 a 2 b as expected. 
Setting the number of reducers higher than 1 : set mapred.reduce.tasks=2; from t1 insert overwrite table t2 partition(var_name='a') select count(a) cnt insert overwrite table t2 partition(var_name='b') select count(b) cnt; select * from t2; 1 a 1 a 1 b 1 b Wrong results. This happens when ever t1 is big enough to automatically generate more than 1 reducers and without specifying it directly. adding group by 1 in the end of each insert solves the problem : from t1 insert overwrite table t2 partition(var_name='a') select count(a) cnt group by 1 insert overwrite table t2 partition(var_name='b') select count(b) cnt group by 1; generates : 2 a 2 b This should work without the group by... The number of rows for each partition will be the amount of reducers. Each reducer calculated a sub total of the count. Wrong results in multi-table insert aggregating without group by clause --- Key: HIVE-7045 URL: https://issues.apache.org/jira/browse/HIVE-7045 Project: Hive Issue Type: Bug Affects Versions: 0.10.0, 0.12.0 Reporter: dima machlin The scenario : CREATE TABLE t1 (a int, b int); CREATE TABLE t2 (cnt int) PARTITIONED BY (var_name string); insert into table t1 select 1,1 from asd limit 1; insert into table t1 select 2,2 from asd limit 1; t1 contains : 1 1 2 2 from t1 insert overwrite table t2 partition(var_name='a') select count(a) cnt insert overwrite table t2 partition(var_name='b') select count(b) cnt ; select * from t2; returns : 2 a 2 b as expected. Setting the number of reducers higher than 1 : set mapred.reduce.tasks=2; from t1 insert overwrite table t2 partition(var_name='a') select count(a) cnt insert overwrite table t2 partition(var_name='b') select count(b) cnt; select * from t2; 1 a 1 a 1 b 1 b Wrong results. This happens when ever t1 is big enough to automatically generate more than 1 reducers and without specifying it directly. 
adding group by 1 in the end of each insert solves the problem : from t1 insert overwrite table t2 partition(var_name='a') select count(a) cnt group by 1 insert overwrite table t2 partition(var_name='b') select count(b) cnt group by 1; generates : 2 a 2 b This should work without the group by... The number of rows for each partition will be the amount of reducers. Each reducer calculated a sub total of the count. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6448) Java heap space pasring a query with many union all
dima machlin created HIVE-6448:
----------------------------------

Summary: Java heap space pasring a query with many union all
Key: HIVE-6448
URL: https://issues.apache.org/jira/browse/HIVE-6448
Project: Hive
Issue Type: Bug
Components: HiveServer2, Query Processor
Affects Versions: 0.11.0
Environment:
Reporter: dima machlin

Too many union all statements in a single query require a massive amount of memory to process.
With the default HiveServer Xmx (256) we can run ~50 union alls.
Memory use seems to grow roughly linearly: after increasing the Xmx to 1024 we can run ~200.

The error is:

java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOf(Arrays.java:2882)
	at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
	at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
	at java.lang.StringBuilder.append(StringBuilder.java:119)
	at org.apache.hadoop.hive.ql.QueryPlan.getJSONKeyValue(QueryPlan.java:453)
	at org.apache.hadoop.hive.ql.QueryPlan.getJSONQuery(QueryPlan.java:598)
	at org.apache.hadoop.hive.ql.QueryPlan.toString(QueryPlan.java:609)
	at org.apache.hadoop.hive.ql.history.HiveHistory.logPlanProgress(HiveHistory.java:499)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:136)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6448) Java heap space pasring a query with many union all
[ https://issues.apache.org/jira/browse/HIVE-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dima machlin updated HIVE-6448:

Description:
Too many union all statements in a single query require a massive amount of memory to process.
Under the default HiveServer Xmx (256) we can run ~50 union alls.
Memory use seems to grow roughly linearly: increasing the Xmx to 1024 lets us run ~200.

The error is:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2882)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
    at java.lang.StringBuilder.append(StringBuilder.java:119)
    at org.apache.hadoop.hive.ql.QueryPlan.getJSONKeyValue(QueryPlan.java:453)
    at org.apache.hadoop.hive.ql.QueryPlan.getJSONQuery(QueryPlan.java:598)
    at org.apache.hadoop.hive.ql.QueryPlan.toString(QueryPlan.java:609)
    at org.apache.hadoop.hive.ql.history.HiveHistory.logPlanProgress(HiveHistory.java:499)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:136)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)

Query example:

    select * from (
    select count(*) from default.dual
    union all
    select count(*) from default.dual
    union all
    ..
    select count(*) from default.dual) z

was:
Too many union all statements in a single query require a massive amount of memory to process.
Under the default HiveServer Xmx (256) we can run ~50 union alls.
Memory use seems to grow roughly linearly: increasing the Xmx to 1024 lets us run ~200.

The error is:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2882)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
    at java.lang.StringBuilder.append(StringBuilder.java:119)
    at org.apache.hadoop.hive.ql.QueryPlan.getJSONKeyValue(QueryPlan.java:453)
    at org.apache.hadoop.hive.ql.QueryPlan.getJSONQuery(QueryPlan.java:598)
    at org.apache.hadoop.hive.ql.QueryPlan.toString(QueryPlan.java:609)
    at org.apache.hadoop.hive.ql.history.HiveHistory.logPlanProgress(HiveHistory.java:499)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:136)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)
[jira] [Updated] (HIVE-6448) Java heap space pasring a query with many union all
[ https://issues.apache.org/jira/browse/HIVE-6448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dima machlin updated HIVE-6448:

Description:
Too many union all statements in a single query require a massive amount of memory to process.
Under the default HiveServer Xmx (256) we can run ~50 union alls.
Memory use seems to grow roughly linearly: increasing the Xmx to 1024 lets us run ~200.

The error is:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2882)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
    at java.lang.StringBuilder.append(StringBuilder.java:119)
    at org.apache.hadoop.hive.ql.QueryPlan.getJSONKeyValue(QueryPlan.java:453)
    at org.apache.hadoop.hive.ql.QueryPlan.getJSONQuery(QueryPlan.java:598)
    at org.apache.hadoop.hive.ql.QueryPlan.toString(QueryPlan.java:609)
    at org.apache.hadoop.hive.ql.history.HiveHistory.logPlanProgress(HiveHistory.java:499)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:136)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)

Query example:

    select * from (
    select count( * ) from default.dual
    union all
    select count ( * ) from default.dual
    union all
    ..
    select count( * ) from default.dual) z

was:
Too many union all statements in a single query require a massive amount of memory to process.
Under the default HiveServer Xmx (256) we can run ~50 union alls.
Memory use seems to grow roughly linearly: increasing the Xmx to 1024 lets us run ~200.

The error is:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2882)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
    at java.lang.StringBuilder.append(StringBuilder.java:119)
    at org.apache.hadoop.hive.ql.QueryPlan.getJSONKeyValue(QueryPlan.java:453)
    at org.apache.hadoop.hive.ql.QueryPlan.getJSONQuery(QueryPlan.java:598)
    at org.apache.hadoop.hive.ql.QueryPlan.toString(QueryPlan.java:609)
    at org.apache.hadoop.hive.ql.history.HiveHistory.logPlanProgress(HiveHistory.java:499)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:136)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47)

Query example:

    select * from (
    select count(*) from default.dual
    union all
    select count(*) from default.dual
    union all
    ..
    select count(*) from default.dual) z
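For anyone trying to reproduce this, the query example in the description can be generated for an arbitrary number of branches instead of being written by hand. The helper below is a hypothetical sketch, not part of Hive; it assumes the default.dual table from the report exists and only builds the query text, leaving submission to whatever client is in use.

```python
def build_union_all_query(num_branches, table="default.dual"):
    """Build the report's query shape with num_branches SELECTs joined by union all."""
    if num_branches < 1:
        raise ValueError("num_branches must be at least 1")
    branch = "select count(*) from {}".format(table)
    # n branches are separated by n - 1 "union all" keywords.
    body = "\nunion all\n".join([branch] * num_branches)
    return "select * from (\n{}) z".format(body)


# Example: 51 branches, i.e. 50 union alls, the approximate limit quoted
# in the report for the smaller heap setting.
query = build_union_all_query(51)
print(query.count("union all"))
```

Increasing num_branches step by step should show where a given heap setting stops coping, under the assumption that nothing else in the session consumes significant memory.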
[jira] [Created] (HIVE-6141) Can't select from views after upgrading metastore from 0.7 to 0.10
dima machlin created HIVE-6141:

Summary: Can't select from views after upgrading metastore from 0.7 to 0.10
Key: HIVE-6141
URL: https://issues.apache.org/jira/browse/HIVE-6141
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.10.0
Reporter: dima machlin
Priority: Minor

Selecting from a view created in 0.7 after upgrading to 0.10 fails with:

    Invalid table alias or column reference

The reason is that running this in v0.7:

    create view a as select b from c.d;

stores this SQL in the metastore:

    select `*c.*d`.b from `c`.`d`;

which fails in v0.10.

Running the same SQL in v0.10 generates different SQL:

    select `d`.b from `c`.`d`;

which succeeds in v0.10.

A workaround is to recreate the view.
[jira] [Updated] (HIVE-6141) Can't select from views after upgrading metastore from 0.7 to 0.10
[ https://issues.apache.org/jira/browse/HIVE-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dima machlin updated HIVE-6141:

Description:
Selecting from a view created in 0.7 after upgrading to 0.10 fails with:

    Invalid table alias or column reference

The reason is that running this in v0.7:

    create view a as select b from c.d;

stores this SQL in the metastore:

    select `*c.*d`.b from `c`.`d`;

which fails in v0.10.

Running the same create SQL in v0.10 generates different SQL:

    select `d`.b from `c`.`d`;

which succeeds in v0.10.

A workaround is to recreate the view.

was:
Selecting from a view created in 0.7 after upgrading to 0.10 fails with:

    Invalid table alias or column reference

The reason is that running this in v0.7:

    create view a as select b from c.d;

stores this SQL in the metastore:

    select `*c.*d`.b from `c`.`d`;

which fails in v0.10.

Running the same SQL in v0.10 generates different SQL:

    select `d`.b from `c`.`d`;

which succeeds in v0.10.

A workaround is to recreate the view.
[jira] [Updated] (HIVE-6141) Can't select from views after upgrading metastore from 0.7 to 0.10
[ https://issues.apache.org/jira/browse/HIVE-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dima machlin updated HIVE-6141:

Description:
Selecting from a view created in 0.7 after upgrading to 0.10 fails with:

    Invalid table alias or column reference

The reason is that running this in v0.7:

    create view a as select b from c.d;

stores this SQL in the metastore:

    select `*c.*d`.b from `c`.`d`;

which fails in v0.10.

Running the same create SQL in v0.10 generates different SQL in the metastore:

    select `d`.b from `c`.`d`;

which succeeds in v0.10.

A workaround is to recreate the view.

was:
Selecting from a view created in 0.7 after upgrading to 0.10 fails with:

    Invalid table alias or column reference

The reason is that running this in v0.7:

    create view a as select b from c.d;

stores this SQL in the metastore:

    select `*c.*d`.b from `c`.`d`;

which fails in v0.10.

Running the same create SQL in v0.10 generates different SQL:

    select `d`.b from `c`.`d`;

which succeeds in v0.10.

A workaround is to recreate the view.
[jira] [Updated] (HIVE-6141) Can't select from views after upgrading metastore from 0.7 to 0.10
[ https://issues.apache.org/jira/browse/HIVE-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dima machlin updated HIVE-6141:

Description:
Selecting from a view created in 0.7 after upgrading to 0.10 fails with:

    Invalid table alias or column reference

The reason is that running this in v0.7:

    create view a as select b from c.d;

stores this SQL in the metastore:

    select `c.d`.b from `c`.`d`;

which fails in v0.10.

Running the same create SQL in v0.10 generates different SQL in the metastore:

    select `d`.b from `c`.`d`;

which succeeds in v0.10 (notice there is no DB prefix in the select part).

*A workaround is to recreate the view.*

was:
Selecting from a view created in 0.7 after upgrading to 0.10 fails with:

    Invalid table alias or column reference

The reason is that running this in v0.7:

    create view a as select b from c.d;

stores this SQL in the metastore:

    select `*c.*d`.b from `c`.`d`;

which fails in v0.10.

Running the same create SQL in v0.10 generates different SQL in the metastore:

    select `d`.b from `c`.`d`;

which succeeds in v0.10.

A workaround is to recreate the view.
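Since the workaround is to recreate affected views, it may help to find them first. The sketch below flags stored view SQL that still uses the 0.7-style alias, i.e. a single backtick-quoted identifier that contains a dot such as `c.d`. It assumes the stored view text has already been fetched as a string by some other means; the function name is hypothetical, and the check is a plain pattern match rather than anything Hive itself provides.

```python
import re


def has_db_qualified_alias(view_sql):
    """Return True if any backtick-quoted identifier in view_sql contains a dot."""
    # findall consumes the text pair by pair, so each match is one quoted identifier.
    identifiers = re.findall(r"`([^`]*)`", view_sql)
    return any("." in name for name in identifiers)


old_style = "select `c.d`.b from `c`.`d`"  # text stored by v0.7, per the report
new_style = "select `d`.b from `c`.`d`"    # text stored by v0.10, per the report
print(has_db_qualified_alias(old_style), has_db_qualified_alias(new_style))
```

A view flagged this way is only a candidate; a legitimately quoted identifier containing a dot would be reported as well.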
[jira] [Created] (HIVE-5964) Hive missing a filter predicate causing wrong results joining tables after sort by
dima machlin created HIVE-5964:

Summary: Hive missing a filter predicate causing wrong results joining tables after sort by
Key: HIVE-5964
URL: https://issues.apache.org/jira/browse/HIVE-5964
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.11.0, 0.10.0
Reporter: dima machlin

It seems that the predicate pushdown optimization fails under certain conditions and produces wrong results: a filter predicate appears to be completely disregarded by the query processor.

Here is the scenario (assuming a dual table exists):

    set hive.optimize.ppd=true;
    drop table if exists test_tbl;
    create table test_tbl (id string,name string);
    insert into table test_tbl select 'a','b' from dual;

test_tbl now contains:

    a b

The following query:

    select t2.* from (select id,name from (select id,name from test_tbl) t1 sort by id) t2 join test_tbl t3 on (t2.id=t3.id ) where t2.name='c' and t3.id='a';

returns:

    a b

The filter t2.name='c' is missing from the execution plan and is not applied.
The filter t3.id='a' does appear in the plan and is applied before the join.

If the query is changed slightly (removing the sort by, removing the t1 sub-query, or disabling hive.optimize.ppd), the predicate appears.

I'm able to reproduce the problem in both Hive 0.10 and Hive 0.11, although it seems to work fine in Hive 0.7.

--
This message was sent by Atlassian JIRA (v6.1#6144)
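To make the expected result concrete, the same data and query can be run against a different SQL engine. The sketch below uses Python's built-in sqlite3 module purely as a stand-in: it is not Hive, it has no "sort by" (an "order by" is substituted in the sub-query), and the dual table is replaced by a direct insert. It only illustrates that the filter t2.name='c' should eliminate the single row.

```python
import sqlite3


def run_repro():
    """Run the report's query shape on an in-memory SQLite database and return the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table test_tbl (id text, name text)")
    conn.execute("insert into test_tbl values ('a', 'b')")
    rows = conn.execute(
        """
        select t2.* from
          (select id, name from (select id, name from test_tbl) t1 order by id) t2
          join test_tbl t3 on (t2.id = t3.id)
        where t2.name = 'c' and t3.id = 'a'
        """
    ).fetchall()
    conn.close()
    return rows


print(run_repro())
```

The only row has name 'b', so no row satisfies t2.name='c' and the result is empty, whereas the report shows Hive 0.10 and 0.11 returning the row "a b".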
[jira] [Created] (HIVE-5607) Hive fails to parse the % (mod) sign after brackets.
dima machlin created HIVE-5607:

Summary: Hive fails to parse the % (mod) sign after brackets.
Key: HIVE-5607
URL: https://issues.apache.org/jira/browse/HIVE-5607
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.10.0
Reporter: dima machlin
Priority: Minor

The scenario:

    create table t(a int);
    select * from t order by (a)%7;

fails with the following exception:

    FAILED: ParseException line 1:28 mismatched input '%' expecting EOF near ')'

Note that this *does* work in 0.7.1 and doesn't work in 0.10.
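The position in the exception can be matched against the query text. The small check below assumes the select statement was entered on a single line exactly as shown in the report and that the column after "line 1:" is a zero-based character offset; under those assumptions the reported position is the '%' sign itself.

```python
query = "select * from t order by (a)%7;"

# Zero-based offset of the first '%' in the statement.
percent_offset = query.index("%")
print(percent_offset, query[percent_offset - 1])
```

The offset comes out as 28 and the preceding character is ')', which lines up with "line 1:28 mismatched input '%' ... near ')'" in the message.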