[jira] [Work logged] (HIVE-25275) OOM during query planning due to HiveJoinPushTransitivePredicatesRule matching infinitely

2022-01-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25275?focusedWorklogId=704694=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-704694
 ]

ASF GitHub Bot logged work on HIVE-25275:
-

Author: ASF GitHub Bot
Created on: 06/Jan/22 18:26
Start Date: 06/Jan/22 18:26
Worklog Time Spent: 10m 
  Work Description: asolimando opened a new pull request #2925:
URL: https://github.com/apache/hive/pull/2925




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 704694)
Time Spent: 40m  (was: 0.5h)

> OOM during query planning due to HiveJoinPushTransitivePredicatesRule 
> matching infinitely
> -
>
> Key: HIVE-25275
> URL: https://issues.apache.org/jira/browse/HIVE-25275
> Project: Hive
> Issue Type: Bug
> Reporter: László Pintér
> Assignee: Alessandro Solimando
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Running the following query raises an OOM during the planning phase:
> {code:sql}
> CREATE TABLE A (`value_date` date) STORED AS ORC;
> CREATE TABLE B (`business_date` date) STORED AS ORC;
> SELECT A.VALUE_DATE
> FROM A, B
> WHERE A.VALUE_DATE = BUSINESS_DATE
>   AND A.VALUE_DATE = TRUNC(BUSINESS_DATE, 'MONTH');
> {code}
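
The issue text does not describe the root cause, so the following is only a minimal sketch of how substitution-based predicate inference can fail to reach a fixpoint on a query of this shape. It is a hypothetical model with expressions as plain strings, not Hive or Calcite code; `naive_closure`, `ordered`, and the `max_rounds` cap are illustrative names.

```python
# Hypothetical model, not Hive or Calcite code: expressions are plain strings.
A = "A.VALUE_DATE"
B = "B.BUSINESS_DATE"


def trunc(expr):
    return f"TRUNC({expr}, 'MONTH')"


# The two join conditions from the query above.
START = {(A, B), (A, trunc(B))}


def ordered(x, y):
    """Store each equality with the shorter expression first."""
    return tuple(sorted((x, y), key=lambda e: (len(e), e)))


def naive_closure(equalities, max_rounds=4):
    """Repeatedly derive new equalities; stop at a fixpoint or after max_rounds."""
    facts = {ordered(l, r) for l, r in equalities}
    sizes = [len(facts)]  # number of known equalities after each round
    for _ in range(max_rounds):
        new = set()
        for l1, r1 in facts:
            for l2, r2 in facts:
                # Transitivity: l1 = r1 and l1 = r2 imply r1 = r2.
                if l1 == l2 and r1 != r2:
                    new.add(ordered(r1, r2))
                # Substitution: since l2 = r2, rewrite l2 inside r1 as r2.
                if l2 != r2 and l2 in r1:
                    new.add(ordered(l1, r1.replace(l2, r2)))
        if new <= facts:
            break  # fixpoint reached, nothing new was derived
        facts |= new
        sizes.append(len(facts))
    return facts, sizes


facts, sizes = naive_closure(START)
print(sizes)
```

In this model the first round derives `B.BUSINESS_DATE = TRUNC(B.BUSINESS_DATE, 'MONTH')`, and every later round can substitute that equality into itself, producing ever deeper `TRUNC(TRUNC(...))` terms. Without the `max_rounds` cap the loop would keep allocating new predicates, which is one plausible way a planner could run out of memory.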



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25849) Disable insert overwrite for bucket partitioned Iceberg tables

2022-01-06 Thread Marton Bod (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Bod updated HIVE-25849:
--
Description: 
Insert overwrite should be disabled where the target Iceberg table is a bucket 
partitioned table, since which existing partitions will be overwritten is very 
hard to predict from a user's POV, as it depends on the bucket hash values 
calculated for the new dataset's rows. It's better to be on the safe side and 
disable this operation to avoid unwanted data loss.

Note: this is the same approach followed by Impala.

  was:Insert overwrite should be disabled where the target Iceberg table is a 
bucket partitioned table, since which existing partitions will be overwritten 
is very hard to predict from a user's POV, as it depends on the bucket hash 
values calculated for the new dataset's rows. It's better to be on the safe 
side and disable this operation to avoid unwanted data loss.


> Disable insert overwrite for bucket partitioned Iceberg tables
> --
>
> Key: HIVE-25849
> URL: https://issues.apache.org/jira/browse/HIVE-25849
> Project: Hive
> Issue Type: Improvement
> Reporter: Marton Bod
> Assignee: Marton Bod
> Priority: Major
>
> Insert overwrite should be disabled where the target Iceberg table is a 
> bucket partitioned table, since which existing partitions will be overwritten 
> is very hard to predict from a user's POV, as it depends on the bucket hash 
> values calculated for the new dataset's rows. It's better to be on the safe 
> side and disable this operation to avoid unwanted data loss.
> Note: this is the same approach followed by Impala.
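
To illustrate why the overwritten partitions are hard to predict, here is a minimal sketch. It assumes 4 buckets and uses CRC32 as a stand-in hash; this is not the hash function Iceberg uses for its bucket transform, and `bucket_of` and `simulate_overwrite` are hypothetical helpers rather than Hive or Iceberg APIs.

```python
import zlib

NUM_BUCKETS = 4  # assumed bucket count, for illustration only


def bucket_of(value, num_buckets=NUM_BUCKETS):
    # Stand-in hash (CRC32), NOT the hash Iceberg uses for bucket partitioning.
    return zlib.crc32(value.encode("utf-8")) % num_buckets


def simulate_overwrite(existing, new_rows):
    """Replace every bucket partition that at least one new row hashes into.

    existing maps bucket number -> list of rows already stored there.
    Buckets that no new row hashes into are left untouched.
    """
    replaced = {}
    for row in new_rows:
        replaced.setdefault(bucket_of(row), []).append(row)
    result = {bucket: list(rows) for bucket, rows in existing.items()}
    result.update(replaced)
    return result


existing = {b: [f"old-row-in-bucket-{b}"] for b in range(NUM_BUCKETS)}
new_rows = ["alice", "bob"]
after = simulate_overwrite(existing, new_rows)
for bucket in sorted(after):
    print(bucket, after[bucket])
```

Which of the old rows survive depends only on where the new values happen to hash, something a user cannot reasonably tell from the data itself.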





[jira] [Commented] (HIVE-25849) Disable insert overwrite for bucket partitioned Iceberg tables

2022-01-06 Thread Marton Bod (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17469972#comment-17469972
 ] 

Marton Bod commented on HIVE-25849:
---

PR: [https://github.com/apache/hive/pull/2856/]

 

> Disable insert overwrite for bucket partitioned Iceberg tables
> --
>
> Key: HIVE-25849
> URL: https://issues.apache.org/jira/browse/HIVE-25849
> Project: Hive
> Issue Type: Improvement
> Reporter: Marton Bod
> Assignee: Marton Bod
> Priority: Major
>
> Insert overwrite should be disabled where the target Iceberg table is a 
> bucket partitioned table, since which existing partitions will be overwritten 
> is very hard to predict from a user's POV, as it depends on the bucket hash 
> values calculated for the new dataset's rows. It's better to be on the safe 
> side and disable this operation to avoid unwanted data loss.





[jira] [Assigned] (HIVE-25849) Disable insert overwrite for bucket partitioned Iceberg tables

2022-01-06 Thread Marton Bod (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Bod reassigned HIVE-25849:
-


> Disable insert overwrite for bucket partitioned Iceberg tables
> --
>
> Key: HIVE-25849
> URL: https://issues.apache.org/jira/browse/HIVE-25849
> Project: Hive
> Issue Type: Improvement
> Reporter: Marton Bod
> Assignee: Marton Bod
> Priority: Major
>
> Insert overwrite should be disabled where the target Iceberg table is a 
> bucket partitioned table, since which existing partitions will be overwritten 
> is very hard to predict from a user's POV, as it depends on the bucket hash 
> values calculated for the new dataset's rows. It's better to be on the safe 
> side and disable this operation to avoid unwanted data loss.





[jira] [Commented] (HIVE-21074) Hive bucketed table query pruning does not work for IS NOT NULL condition

2022-01-06 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-21074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17469926#comment-17469926
 ] 

Ádám Szita commented on HIVE-21074:
---

[~thaibui] - I don't think this is solved by HIVE-19097 and I don't even see 
how that would help this issue. I can take this over unless you have cycles to 
work on this of course.

> Hive bucketed table query pruning does not work for IS NOT NULL condition
> -
>
> Key: HIVE-21074
> URL: https://issues.apache.org/jira/browse/HIVE-21074
> Project: Hive
> Issue Type: Bug
> Components: Logical Optimizer
> Affects Versions: 3.1.0, 3.0.0, 3.1.1
> Reporter: Thai Bui
> Assignee: Thai Bui
> Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-21074.patch
>
>
> The current version of bucket pruning skips all the predicates when it
> detects that one of the predicates is a compound type (e.g. NOT(IS_NULL))
> when evaluating AND logical operators.
> This logic is faulty: as long as one of the AND operands is a predicate on a
> bucketed column (_col_ = *literal*), the *literal* value of that _col_ should
> be considered in the bucket pruning optimization no matter what. For example:
> SELECT * FROM tbl WHERE bucketed_col = 1 AND (some_compound_expr)
> Here the value *1* should be considered for pruning in the query plan.
> This limitation shows up in a simpler case: a table that I am trying to
> optimize with bucketing gets no benefit when IS NOT NULL is used. Since IS
> NOT NULL is parsed into NOT(IS_NULL) (a compound expression), the pruning
> phase is completely skipped, causing unnecessary tasks to be spawned. For
> instance:
> SELECT * FROM tbl WHERE bucketed_col = 1 AND some_other_col IS NOT NULL
> This will not trigger the bucket pruning logic and performs a full table scan.
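
The difference between the behaviour described above and the expected one can be sketched as follows. This is a simplified model, not Hive's optimizer code: predicates are nested tuples, and `prune_all_or_nothing` and `prune_per_operand` are hypothetical names. A return value of `None` means "no pruning possible, scan all buckets".

```python
def eq_literal(pred, col):
    """Return {literal} for a `col = literal` predicate on the bucketed column."""
    if pred[0] == "EQ" and pred[1] == col:
        return {pred[2]}
    return None


def intersect(results):
    """Combine AND operands: ignore those that give no information."""
    usable = [r for r in results if r is not None]
    if not usable:
        return None
    return set.intersection(*usable)


def prune_all_or_nothing(pred, col):
    # Models the reported behaviour: one compound operand disables pruning.
    if pred[0] != "AND":
        return eq_literal(pred, col)
    operands = pred[1:]
    if any(op[0] != "EQ" for op in operands):
        return None
    return intersect(eq_literal(op, col) for op in operands)


def prune_per_operand(pred, col):
    # Models the expected behaviour: an AND can only narrow the result, so
    # operands that cannot be analysed are skipped instead of aborting.
    if pred[0] != "AND":
        return eq_literal(pred, col)
    return intersect(prune_per_operand(op, col) for op in pred[1:])


# bucketed_col = 1 AND some_other_col IS NOT NULL
query = (
    "AND",
    ("EQ", "bucketed_col", 1),
    ("NOT", ("IS_NULL", "some_other_col")),
)
print(prune_all_or_nothing(query, "bucketed_col"))
print(prune_per_operand(query, "bucketed_col"))
```

Skipping an unanalysable operand is safe only under AND, because every row that satisfies the whole conjunction must also satisfy `bucketed_col = 1`. The same shortcut would be wrong for OR.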





[jira] [Work logged] (HIVE-25818) Values query with order by position clause fails

2022-01-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25818?focusedWorklogId=704469=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-704469
 ]

ASF GitHub Bot logged work on HIVE-25818:
-

Author: ASF GitHub Bot
Created on: 06/Jan/22 08:37
Start Date: 06/Jan/22 08:37
Worklog Time Spent: 10m 
  Work Description: kasakrisz merged pull request #2922:
URL: https://github.com/apache/hive/pull/2922


   




Issue Time Tracking
---

Worklog Id: (was: 704469)
Time Spent: 2h  (was: 1h 50m)

> Values query with order by position clause fails
> 
>
> Key: HIVE-25818
> URL: https://issues.apache.org/jira/browse/HIVE-25818
> Project: Hive
> Issue Type: Bug
> Components: CBO, Query Planning
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h
> Remaining Estimate: 0h
>
> {code}
> values(1+1, 2, 5.0, 'a') order by 1 limit 2;
> {code}
> {code}
> java.lang.NullPointerException
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.getFieldIndexFromColumnNumber(CalcitePlanner.java:4146)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.beginGenOBLogicalPlan(CalcitePlanner.java:4028)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genOBLogicalPlan(CalcitePlanner.java:3933)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5148)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1651)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1593)
>   at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
>   at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180)
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1345)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:563)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12565)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:456)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:317)
>   at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223)
>   at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:500)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:453)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:417)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:411)
>   at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125)
>   at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229)
>   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:256)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:353)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:726)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:696)
>   at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:114)
>   at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
>   at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at