[jira] [Work logged] (HIVE-25275) OOM during query planning due to HiveJoinPushTransitivePredicatesRule matching infinitely
[ https://issues.apache.org/jira/browse/HIVE-25275?focusedWorklogId=704694=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-704694 ]

ASF GitHub Bot logged work on HIVE-25275:
-----------------------------------------

    Author: ASF GitHub Bot
    Created on: 06/Jan/22 18:26
    Start Date: 06/Jan/22 18:26
    Worklog Time Spent: 10m
    Work Description: asolimando opened a new pull request #2925:
    URL: https://github.com/apache/hive/pull/2925

    ### What changes were proposed in this pull request?
    ### Why are the changes needed?
    ### Does this PR introduce _any_ user-facing change?
    ### How was this patch tested?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id: (was: 704694)
    Time Spent: 40m (was: 0.5h)

> OOM during query planning due to HiveJoinPushTransitivePredicatesRule matching infinitely
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-25275
>                 URL: https://issues.apache.org/jira/browse/HIVE-25275
>             Project: Hive
>          Issue Type: Bug
>            Reporter: László Pintér
>            Assignee: Alessandro Solimando
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> While running the following query OOM is raised during the planning phase
> {code:sql}
> CREATE TABLE A (`value_date` date) STORED AS ORC;
> CREATE TABLE B (`business_date` date) STORED AS ORC;
> SELECT A.VALUE_DATE
> FROM A, B
> WHERE A.VALUE_DATE = BUSINESS_DATE
> AND A.VALUE_DATE = TRUNC(BUSINESS_DATE, 'MONTH');
> {code}

--
This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HIVE-25849) Disable insert overwrite for bucket partitioned Iceberg tables
[ https://issues.apache.org/jira/browse/HIVE-25849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marton Bod updated HIVE-25849:
------------------------------
    Description:
Insert overwrite should be disabled where the target Iceberg table is a bucket partitioned table, since it is very hard to predict, from a user's POV, which existing partitions will be overwritten, as that depends on the bucket hash values calculated for the new dataset's rows. It's better to be on the safe side and disable this operation to avoid unwanted data loss.

Note: this is the same approach followed by Impala too.

  was:Insert overwrite should be disabled where the target Iceberg table is a bucket partitioned table, since it is very hard to predict, from a user's POV, which existing partitions will be overwritten, as that depends on the bucket hash values calculated for the new dataset's rows. It's better to be on the safe side and disable this operation to avoid unwanted data loss.

> Disable insert overwrite for bucket partitioned Iceberg tables
> ---------------------------------------------------------------
>
>                 Key: HIVE-25849
>                 URL: https://issues.apache.org/jira/browse/HIVE-25849
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Marton Bod
>            Assignee: Marton Bod
>            Priority: Major
>
> Insert overwrite should be disabled where the target Iceberg table is a
> bucket partitioned table, since it is very hard to predict, from a user's
> POV, which existing partitions will be overwritten, as that depends on the
> bucket hash values calculated for the new dataset's rows. It's better to be
> on the safe side and disable this operation to avoid unwanted data loss.
> Note: this is the same approach followed by Impala too.
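To illustrate why the overwritten partitions are hard to predict, here is a minimal Python sketch, not Hive or Iceberg code. It assumes a toy table modelled as a dict from bucket number to rows, and it uses CRC32 as a stand-in hash; the function names `bucket_of` and `insert_overwrite` are hypothetical, and Iceberg's real bucket transform uses a different hash function.

```python
import zlib
from typing import Dict, Iterable, List


def bucket_of(value: str, num_buckets: int) -> int:
    # Stand-in hash for illustration only (CRC32), NOT Iceberg's bucket transform.
    return zlib.crc32(value.encode("utf-8")) % num_buckets


def insert_overwrite(
    table: Dict[int, List[str]], new_rows: Iterable[str], num_buckets: int
) -> Dict[int, List[str]]:
    """Replace only the bucket partitions that the new rows hash into."""
    incoming: Dict[int, List[str]] = {}
    for row in new_rows:
        incoming.setdefault(bucket_of(row, num_buckets), []).append(row)

    # Partitions untouched by the new data keep their old rows;
    # partitions that receive at least one new row are replaced wholesale.
    result = {bucket: list(rows) for bucket, rows in table.items()}
    result.update(incoming)
    return result


if __name__ == "__main__":
    existing = {b: [f"old-{b}"] for b in range(4)}
    after = insert_overwrite(existing, ["x"], num_buckets=4)
    lost = [b for b in existing if after[b] != existing[b]]
    print("partitions whose old rows were dropped:", lost)
```

Which partition loses its old rows is decided purely by the hash of the incoming values, so a user cannot tell in advance without computing the hashes.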
[jira] [Commented] (HIVE-25849) Disable insert overwrite for bucket partitioned Iceberg tables
[ https://issues.apache.org/jira/browse/HIVE-25849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17469972#comment-17469972 ]

Marton Bod commented on HIVE-25849:
-----------------------------------

PR: [https://github.com/apache/hive/pull/2856/]

> Disable insert overwrite for bucket partitioned Iceberg tables
> ---------------------------------------------------------------
>
>                 Key: HIVE-25849
>                 URL: https://issues.apache.org/jira/browse/HIVE-25849
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Marton Bod
>            Assignee: Marton Bod
>            Priority: Major
>
> Insert overwrite should be disabled where the target Iceberg table is a
> bucket partitioned table, since it is very hard to predict, from a user's
> POV, which existing partitions will be overwritten, as that depends on the
> bucket hash values calculated for the new dataset's rows. It's better to be
> on the safe side and disable this operation to avoid unwanted data loss.
[jira] [Assigned] (HIVE-25849) Disable insert overwrite for bucket partitioned Iceberg tables
[ https://issues.apache.org/jira/browse/HIVE-25849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marton Bod reassigned HIVE-25849:
---------------------------------

> Disable insert overwrite for bucket partitioned Iceberg tables
> ---------------------------------------------------------------
>
>                 Key: HIVE-25849
>                 URL: https://issues.apache.org/jira/browse/HIVE-25849
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Marton Bod
>            Assignee: Marton Bod
>            Priority: Major
>
> Insert overwrite should be disabled where the target Iceberg table is a
> bucket partitioned table, since it is very hard to predict, from a user's
> POV, which existing partitions will be overwritten, as that depends on the
> bucket hash values calculated for the new dataset's rows. It's better to be
> on the safe side and disable this operation to avoid unwanted data loss.
[jira] [Commented] (HIVE-21074) Hive bucketed table query pruning does not work for IS NOT NULL condition
[ https://issues.apache.org/jira/browse/HIVE-21074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17469926#comment-17469926 ]

Ádám Szita commented on HIVE-21074:
-----------------------------------

[~thaibui] - I don't think this is solved by HIVE-19097, and I don't even see how that would help with this issue. I can take this over, unless of course you have cycles to work on it.

> Hive bucketed table query pruning does not work for IS NOT NULL condition
> -------------------------------------------------------------------------
>
>                 Key: HIVE-21074
>                 URL: https://issues.apache.org/jira/browse/HIVE-21074
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 3.1.0, 3.0.0, 3.1.1
>            Reporter: Thai Bui
>            Assignee: Thai Bui
>            Priority: Minor
>             Fix For: 4.0.0
>
>         Attachments: HIVE-21074.patch
>
>
> The current version of bucket pruning skips all the predicates when it
> detects that one of the predicates is a compound type (e.g. NOT(IS_NULL))
> while evaluating AND logical operators.
> This logic is faulty: as long as one of the AND operands is an equality on a
> bucketed column (_col_ = *literal*), the *literal* value of that _col_ should
> be considered in the bucket pruning optimization no matter what. For example:
> SELECT * FROM tbl WHERE bucketed_col = 1 AND (some_compound_expr)
> Then the value *1* should be considered for pruning in the query plan.
> This limitation shows up in a simpler case, where a table that I am trying to
> optimize using the bucketing technique gets no benefit when IS NOT NULL is
> used. Since IS NOT NULL is parsed into NOT(IS_NULL) (a compound expression),
> the pruning phase is completely skipped, causing unnecessary tasks to be
> spawned. For instance:
> SELECT * FROM tbl WHERE bucketed_col = 1 AND some_other_col IS NOT NULL
> will not trigger the bucket pruning logic and will perform a full table scan.
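The difference between the reported behaviour and the expected one can be sketched in a few lines of Python. This is only an illustration of the logic described in the issue, not Hive's optimizer code: the predicate classes `Eq`, `Compound` and `And`, both function names, and the `literal % num_buckets` bucket assignment are all assumptions made for the example. A return value of `None` stands for "no pruning, scan every bucket".

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple, Union


@dataclass(frozen=True)
class Eq:
    """col = literal"""
    column: str
    literal: int


@dataclass(frozen=True)
class Compound:
    """Any expression the pruner cannot use, e.g. NOT(IS_NULL(col))."""
    text: str


@dataclass(frozen=True)
class And:
    operands: Tuple[Union[Eq, Compound], ...]


Predicate = Union[Eq, Compound, And]


def _conjuncts(pred: Predicate) -> Tuple[Union[Eq, Compound], ...]:
    # Only a single, non-nested AND is handled in this sketch.
    return pred.operands if isinstance(pred, And) else (pred,)


def buckets_intended(pred: Predicate, bucket_col: str, num_buckets: int) -> Optional[Set[int]]:
    """Use every 'bucket_col = literal' conjunct and ignore the other conjuncts."""
    selected: Optional[Set[int]] = None
    for conjunct in _conjuncts(pred):
        if isinstance(conjunct, Eq) and conjunct.column == bucket_col:
            bucket = {conjunct.literal % num_buckets}  # assumed hash, for illustration
            selected = bucket if selected is None else selected & bucket
    return selected


def buckets_reported(pred: Predicate, bucket_col: str, num_buckets: int) -> Optional[Set[int]]:
    """Behaviour described in the issue: one compound conjunct disables pruning."""
    if any(isinstance(c, Compound) for c in _conjuncts(pred)):
        return None
    return buckets_intended(pred, bucket_col, num_buckets)


if __name__ == "__main__":
    where = And((Eq("bucketed_col", 1), Compound("NOT(IS_NULL(some_other_col))")))
    print(buckets_reported(where, "bucketed_col", 4))
    print(buckets_intended(where, "bucketed_col", 4))
```

Skipping an unusable conjunct is safe under AND, because dropping a conjunct can only widen the set of rows, so the buckets selected from the remaining equality still cover every matching row.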
[jira] [Work logged] (HIVE-25818) Values query with order by position clause fails
[ https://issues.apache.org/jira/browse/HIVE-25818?focusedWorklogId=704469=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-704469 ]

ASF GitHub Bot logged work on HIVE-25818:
-----------------------------------------

    Author: ASF GitHub Bot
    Created on: 06/Jan/22 08:37
    Start Date: 06/Jan/22 08:37
    Worklog Time Spent: 10m
    Work Description: kasakrisz merged pull request #2922:
    URL: https://github.com/apache/hive/pull/2922

Issue Time Tracking
-------------------

    Worklog Id: (was: 704469)
    Time Spent: 2h (was: 1h 50m)

> Values query with order by position clause fails
> ------------------------------------------------
>
>                 Key: HIVE-25818
>                 URL: https://issues.apache.org/jira/browse/HIVE-25818
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO, Query Planning
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> {code}
> values(1+1, 2, 5.0, 'a') order by 1 limit 2;
> {code}
> {code}
> java.lang.NullPointerException
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.getFieldIndexFromColumnNumber(CalcitePlanner.java:4146)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.beginGenOBLogicalPlan(CalcitePlanner.java:4028)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genOBLogicalPlan(CalcitePlanner.java:3933)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:5148)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1651)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1593)
>     at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
>     at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
>     at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180)
>     at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1345)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:563)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12565)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:456)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:317)
>     at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223)
>     at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:500)
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:453)
>     at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:417)
>     at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:411)
>     at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125)
>     at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:256)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:353)
>     at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:726)
>     at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:696)
>     at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:114)
>     at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
>     at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at