[jira] [Work logged] (HIVE-24241) Enable SharedWorkOptimizer to merge downstream operators after an optimization step
[ https://issues.apache.org/jira/browse/HIVE-24241?focusedWorklogId=506040&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506040 ] ASF GitHub Bot logged work on HIVE-24241: - Author: ASF GitHub Bot Created on: 29/Oct/20 04:04 Start Date: 29/Oct/20 04:04 Worklog Time Spent: 10m Work Description: dengzhhu653 commented on a change in pull request #1562: URL: https://github.com/apache/hive/pull/1562#discussion_r513942354 ## File path: ql/src/test/results/clientpositive/llap/sharedwork_semi.q.out ## @@ -541,7 +541,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: s - filterExpr: (ss_sold_date_sk is not null and ((ss_sold_date_sk BETWEEN DynamicValue(RS_7_d_d_date_sk_min) AND DynamicValue(RS_7_d_d_date_sk_max) and in_bloom_filter(ss_sold_date_sk, DynamicValue(RS_7_d_d_date_sk_bloom_filter))) or (ss_sold_date_sk BETWEEN DynamicValue(RS_21_d_d_date_sk_min) AND DynamicValue(RS_21_d_d_date_sk_max) and in_bloom_filter(ss_sold_date_sk, DynamicValue(RS_21_d_d_date_sk_bloom_filter) (type: boolean) + filterExpr: (((ss_sold_date_sk BETWEEN DynamicValue(RS_7_d_d_date_sk_min) AND DynamicValue(RS_7_d_d_date_sk_max) and in_bloom_filter(ss_sold_date_sk, DynamicValue(RS_7_d_d_date_sk_bloom_filter))) or (ss_sold_date_sk BETWEEN DynamicValue(RS_21_d_d_date_sk_min) AND DynamicValue(RS_21_d_d_date_sk_max) and in_bloom_filter(ss_sold_date_sk, DynamicValue(RS_21_d_d_date_sk_bloom_filter and ss_sold_date_sk is not null) (type: boolean) Review comment: We see a case where, when NonBlockingOpDeDupProc merges FIL-FIL, the conditionals may be reordered. https://github.com/apache/hive/pull/1308 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 506040) Time Spent: 1h 40m (was: 1.5h) > Enable SharedWorkOptimizer to merge downstream operators after an > optimization step > --- > > Key: HIVE-24241 > URL: https://issues.apache.org/jira/browse/HIVE-24241 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
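The review comment above notes that NonBlockingOpDeDupProc can reorder conjuncts when it merges two adjacent filters (FIL-FIL), which is why the `ss_sold_date_sk is not null` term moves in the q.out diff. A minimal sketch of the effect, not Hive's actual implementation (the merge policy and the class name below are assumptions for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: merging two stacked filters into one by
// concatenating their conjuncts. If the merged operator emits the child's
// conjuncts before the parent's, the printed filterExpr changes order even
// though the AND is semantically commutative -- the kind of reorder seen in
// the sharedwork_semi.q.out diff above.
public class FilterMergeSketch {
    public static String merge(List<String> parentConjuncts, List<String> childConjuncts) {
        List<String> merged = new ArrayList<>(childConjuncts); // child first: assumed policy
        merged.addAll(parentConjuncts);
        return String.join(" and ", merged);
    }

    public static void main(String[] args) {
        String expr = merge(
            List.of("ss_sold_date_sk is not null"),
            List.of("(bloom-filter probes on ss_sold_date_sk)"));
        // the "is not null" term now trails the bloom-filter probes
        System.out.println(expr);
    }
}
```

The reordered expression is equivalent, so only the golden-file text needs updating, not the plan semantics.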
[jira] [Updated] (HIVE-24325) Cardinality preserving join optimization fails when column is backtracked to a constant
[ https://issues.apache.org/jira/browse/HIVE-24325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-24325: --- Status: Patch Available (was: In Progress) > Cardinality preserving join optimization fails when column is backtracked to > a constant > --- > > Key: HIVE-24325 > URL: https://issues.apache.org/jira/browse/HIVE-24325 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This error happens when one of the columns that is used in the output > backtracks to a constant. We end up without a mapping for the column, which > leads to exception below. > {code} > org.apache.calcite.util.mapping.Mappings$NoElementException: source #8 has no > target in mapping [size=9, sourceCount=23, targetCount=9, elements=[0:0, 1:1, > 2:2, 3:3, 4:4, 9:5, 11:6, 12:7, 13:8]] > at > org.apache.calcite.util.mapping.Mappings$AbstractMapping.getTarget(Mappings.java:879) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveCardinalityPreservingJoinOptimization.trim(HiveCardinalityPreservingJoinOptimization.java:228) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveCardinalityPreservingJoinRule.trim(HiveCardinalityPreservingJoinRule.java:48) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveFieldTrimmerRule.onMatch(HiveFieldTrimmerRule.java:70) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:319) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:560) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > 
org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:419) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:256) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:215) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:202) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2669) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2635) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPostJoinOrderingTransform(CalcitePlanner.java:2547) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1941) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1809) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:130) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:915) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:179) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:125) > 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1570) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:549) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12539) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:443) >
[jira] [Work logged] (HIVE-24325) Cardinality preserving join optimization fails when column is backtracked to a constant
[ https://issues.apache.org/jira/browse/HIVE-24325?focusedWorklogId=506032&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506032 ] ASF GitHub Bot logged work on HIVE-24325: - Author: ASF GitHub Bot Created on: 29/Oct/20 03:33 Start Date: 29/Oct/20 03:33 Worklog Time Spent: 10m Work Description: jcamachor opened a new pull request #1622: URL: https://github.com/apache/hive/pull/1622 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 506032) Remaining Estimate: 0h Time Spent: 10m > Cardinality preserving join optimization fails when column is backtracked to > a constant > --- > > Key: HIVE-24325 > URL: https://issues.apache.org/jira/browse/HIVE-24325 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > This error happens when one of the columns that is used in the output > backtracks to a constant. We end up without a mapping for the column, which > leads to exception below. 
> {code} > org.apache.calcite.util.mapping.Mappings$NoElementException: source #8 has no > target in mapping [size=9, sourceCount=23, targetCount=9, elements=[0:0, 1:1, > 2:2, 3:3, 4:4, 9:5, 11:6, 12:7, 13:8]] > at > org.apache.calcite.util.mapping.Mappings$AbstractMapping.getTarget(Mappings.java:879) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveCardinalityPreservingJoinOptimization.trim(HiveCardinalityPreservingJoinOptimization.java:228) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveCardinalityPreservingJoinRule.trim(HiveCardinalityPreservingJoinRule.java:48) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveFieldTrimmerRule.onMatch(HiveFieldTrimmerRule.java:70) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:319) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:560) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:419) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:256) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:215) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:202) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2669) > 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2635) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPostJoinOrderingTransform(CalcitePlanner.java:2547) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1941) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1809) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:130) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:915) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:179) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at >
[jira] [Work logged] (HIVE-24325) Cardinality preserving join optimization fails when column is backtracked to a constant
[ https://issues.apache.org/jira/browse/HIVE-24325?focusedWorklogId=506033&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506033 ] ASF GitHub Bot logged work on HIVE-24325: - Author: ASF GitHub Bot Created on: 29/Oct/20 03:33 Start Date: 29/Oct/20 03:33 Worklog Time Spent: 10m Work Description: jcamachor commented on pull request #1622: URL: https://github.com/apache/hive/pull/1622#issuecomment-718337682 @kasakrisz, could you review? Thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 506033) Time Spent: 20m (was: 10m) > Cardinality preserving join optimization fails when column is backtracked to > a constant > --- > > Key: HIVE-24325 > URL: https://issues.apache.org/jira/browse/HIVE-24325 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This error happens when one of the columns that is used in the output > backtracks to a constant. We end up without a mapping for the column, which > leads to exception below. 
> {code} > org.apache.calcite.util.mapping.Mappings$NoElementException: source #8 has no > target in mapping [size=9, sourceCount=23, targetCount=9, elements=[0:0, 1:1, > 2:2, 3:3, 4:4, 9:5, 11:6, 12:7, 13:8]] > at > org.apache.calcite.util.mapping.Mappings$AbstractMapping.getTarget(Mappings.java:879) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveCardinalityPreservingJoinOptimization.trim(HiveCardinalityPreservingJoinOptimization.java:228) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveCardinalityPreservingJoinRule.trim(HiveCardinalityPreservingJoinRule.java:48) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveFieldTrimmerRule.onMatch(HiveFieldTrimmerRule.java:70) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:319) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:560) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:419) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:256) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:215) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:202) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2669) > 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2635) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPostJoinOrderingTransform(CalcitePlanner.java:2547) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1941) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1809) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:130) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:915) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:179) >
[jira] [Updated] (HIVE-24325) Cardinality preserving join optimization fails when column is backtracked to a constant
[ https://issues.apache.org/jira/browse/HIVE-24325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24325: -- Labels: pull-request-available (was: ) > Cardinality preserving join optimization fails when column is backtracked to > a constant > --- > > Key: HIVE-24325 > URL: https://issues.apache.org/jira/browse/HIVE-24325 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This error happens when one of the columns that is used in the output > backtracks to a constant. We end up without a mapping for the column, which > leads to exception below. > {code} > org.apache.calcite.util.mapping.Mappings$NoElementException: source #8 has no > target in mapping [size=9, sourceCount=23, targetCount=9, elements=[0:0, 1:1, > 2:2, 3:3, 4:4, 9:5, 11:6, 12:7, 13:8]] > at > org.apache.calcite.util.mapping.Mappings$AbstractMapping.getTarget(Mappings.java:879) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveCardinalityPreservingJoinOptimization.trim(HiveCardinalityPreservingJoinOptimization.java:228) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveCardinalityPreservingJoinRule.trim(HiveCardinalityPreservingJoinRule.java:48) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveFieldTrimmerRule.onMatch(HiveFieldTrimmerRule.java:70) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:319) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:560) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > 
org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:419) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:256) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:215) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:202) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2669) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2635) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPostJoinOrderingTransform(CalcitePlanner.java:2547) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1941) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1809) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:130) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:915) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:179) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:125) > 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1570) > ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:549) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12539) > [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:443) >
[jira] [Updated] (HIVE-24325) Cardinality preserving join optimization fails when column is backtracked to a constant
[ https://issues.apache.org/jira/browse/HIVE-24325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-24325: --- Description: This error happens when one of the columns that is used in the output backtracks to a constant. We end up without a mapping for the column, which leads to exception below. {code} org.apache.calcite.util.mapping.Mappings$NoElementException: source #8 has no target in mapping [size=9, sourceCount=23, targetCount=9, elements=[0:0, 1:1, 2:2, 3:3, 4:4, 9:5, 11:6, 12:7, 13:8]] at org.apache.calcite.util.mapping.Mappings$AbstractMapping.getTarget(Mappings.java:879) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveCardinalityPreservingJoinOptimization.trim(HiveCardinalityPreservingJoinOptimization.java:228) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveCardinalityPreservingJoinRule.trim(HiveCardinalityPreservingJoinRule.java:48) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveFieldTrimmerRule.onMatch(HiveFieldTrimmerRule.java:70) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:319) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:560) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:419) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:256) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:215) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:202) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2669) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2635) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPostJoinOrderingTransform(CalcitePlanner.java:2547) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1941) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1809) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:130) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:915) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:179) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:125) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1570) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:549) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12539) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:443) 
[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at
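The failure mode described in the HIVE-24325 messages above, where a column that backtracks to a constant ends up with no entry in the field trimmer's column mapping, can be sketched with a simplified stand-in for Calcite's partial mapping. This is an illustrative simplification, not Calcite's actual Mappings implementation:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Simplified stand-in for a partial source->target column mapping. Columns
// that were trimmed away -- or, as in this bug, backtracked to a constant --
// get no entry, and looking them up fails, mirroring the reported
// Mappings$NoElementException for "source #8".
public class PartialMappingSketch {
    private final Map<Integer, Integer> targets = new HashMap<>();

    public void set(int source, int target) {
        targets.put(source, target);
    }

    public int getTarget(int source) {
        Integer t = targets.get(source);
        if (t == null) {
            throw new NoSuchElementException("source #" + source + " has no target in mapping");
        }
        return t;
    }

    public static void main(String[] args) {
        PartialMappingSketch m = new PartialMappingSketch();
        // elements from the reported mapping: [0:0, 1:1, 2:2, 3:3, 4:4, 9:5, 11:6, 12:7, 13:8]
        int[][] elements = {{0, 0}, {1, 1}, {2, 2}, {3, 3}, {4, 4}, {9, 5}, {11, 6}, {12, 7}, {13, 8}};
        for (int[] e : elements) {
            m.set(e[0], e[1]);
        }
        try {
            m.getTarget(8); // column 8 backtracked to a constant: no mapping entry
        } catch (NoSuchElementException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

The fix direction implied by the ticket is to ensure every output column gets a mapping entry (or is handled specially) before the trimmed projection is rebuilt, rather than to tolerate the missing lookup.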
[jira] [Work started] (HIVE-24325) Cardinality preserving join optimization fails when column is backtracked to a constant
[ https://issues.apache.org/jira/browse/HIVE-24325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24325 started by Jesus Camacho Rodriguez. -- > Cardinality preserving join optimization fails when column is backtracked to > a constant > --- > > Key: HIVE-24325 > URL: https://issues.apache.org/jira/browse/HIVE-24325 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > > More info to come. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24325) Cardinality preserving join optimization fails when column is backtracked to a constant
[ https://issues.apache.org/jira/browse/HIVE-24325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-24325: --- Summary: Cardinality preserving join optimization fails when column is backtracked to a constant (was: Cardinality preserving join optimization may fail when column is backtracked to a constant) > Cardinality preserving join optimization fails when column is backtracked to > a constant > --- > > Key: HIVE-24325 > URL: https://issues.apache.org/jira/browse/HIVE-24325 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > > More info to come. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24325) Cardinality preserving join optimization may fail when column is backtracked to a constant
[ https://issues.apache.org/jira/browse/HIVE-24325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-24325: --- Summary: Cardinality preserving join optimization may fail when column is backtracked to a constant (was: Cardinality preserving join optimization may fail when column is a constant) > Cardinality preserving join optimization may fail when column is backtracked > to a constant > -- > > Key: HIVE-24325 > URL: https://issues.apache.org/jira/browse/HIVE-24325 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > > More info to come. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24325) Cardinality preserving join optimization may fail when column is a constant
[ https://issues.apache.org/jira/browse/HIVE-24325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez reassigned HIVE-24325: -- > Cardinality preserving join optimization may fail when column is a constant > --- > > Key: HIVE-24325 > URL: https://issues.apache.org/jira/browse/HIVE-24325 > Project: Hive > Issue Type: Bug > Components: CBO >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > > More info to come. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24253) HMS and HS2 needs to support keystore/truststores types besides JKS by config
[ https://issues.apache.org/jira/browse/HIVE-24253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222646#comment-17222646 ] Yongzhi Chen commented on HIVE-24253: - [~kgyrtkirk], we cannot remove the ignore from it: the tests run fine when run by themselves, but when I run the suite, there are failures. My fix in this Jira just adds a new configurable property; it does not fix existing SSL issues in Hive. We may remove the ignore after we find the root cause of the issue and fix it. > HMS and HS2 needs to support keystore/truststores types besides JKS by config > - > > Key: HIVE-24253 > URL: https://issues.apache.org/jira/browse/HIVE-24253 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Standalone Metastore >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > When HiveMetaStoreClient connects to HMS with enabled SSL, HMS should support > the Keystore type configurable and default to keystore type specified for the > JDK and not always use JKS. Same as HIVE-23958 for hive, HMS should support > to set additional keystore/truststore types used for different applications > like for FIPS crypto algorithms. > Also, make hive keystore type and algorithm configurable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
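The HIVE-24253 change discussed above makes the keystore type configurable instead of hard-coding JKS. A hedged sketch of the idea using only the JDK API (the property name `example.keystore.type` is made up for illustration; it is not Hive's actual configuration key):

```java
import java.security.KeyStore;

// Sketch: resolve the keystore type from configuration, falling back to the
// JDK's own default (usually "pkcs12" on modern JDKs) rather than a
// hard-coded "JKS". A FIPS deployment could then configure a type such as
// BCFKS, provided the matching security provider is registered.
public class KeystoreTypeSketch {
    public static String resolveType(String configured) {
        return (configured == null || configured.isEmpty())
                ? KeyStore.getDefaultType()
                : configured;
    }

    public static void main(String[] args) throws Exception {
        String type = resolveType(System.getProperty("example.keystore.type"));
        KeyStore ks = KeyStore.getInstance(type);
        ks.load(null, null); // empty in-memory store, just to show it instantiates
        System.out.println(ks.getType());
    }
}
```

The same resolution would apply to truststores, and analogously for the SSL algorithm the ticket mentions.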
[jira] [Resolved] (HIVE-24253) HMS and HS2 needs to support keystore/truststores types besides JKS by config
[ https://issues.apache.org/jira/browse/HIVE-24253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen resolved HIVE-24253. - Fix Version/s: 4.0.0 Resolution: Fixed > HMS and HS2 needs to support keystore/truststores types besides JKS by config > - > > Key: HIVE-24253 > URL: https://issues.apache.org/jira/browse/HIVE-24253 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Standalone Metastore >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > When HiveMetaStoreClient connects to HMS with enabled SSL, HMS should support > the Keystore type configurable and default to keystore type specified for the > JDK and not always use JKS. Same as HIVE-23958 for hive, HMS should support > to set additional keystore/truststore types used for different applications > like for FIPS crypto algorithms. > Also, make hive keystore type and algorithm configurable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23930) Upgrade to tez 0.10.0
[ https://issues.apache.org/jira/browse/HIVE-23930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222486#comment-17222486 ] László Bodor commented on HIVE-23930: - pushed to master, thanks for the reviews on the blocker tasks: [~rbalamohan], [~ashutoshc], [~harishjp]! > Upgrade to tez 0.10.0 > - > > Key: HIVE-23930 > URL: https://issues.apache.org/jira/browse/HIVE-23930 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Tez 0.10.0 is not yet released, but this ticket is for tracking the effort > and the needed hive changes. > Currently, Hive depends on 0.9.1 > Hadoop dependencies: > Hive/master: *3.1.0* > Tez/master: *3.1.3* > Tez/branch-0.9: *2.7.2* > TODOs: > - check why HIVE-23689 broke some unit tests intermittently (0.9.2 ->0.9.3 > bump), because a 0.10.x upgrade will also contain those tez changes which > could be related > - maintain the needed hive changes (reflecting tez api changes): > HIVE-23190: LLAP: modify IndexCache to pass filesystem object to > TezSpillRecord -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24324) Remove deprecated API usage from Avro
[ https://issues.apache.org/jira/browse/HIVE-24324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun reassigned HIVE-24324: --- > Remove deprecated API usage from Avro > - > > Key: HIVE-24324 > URL: https://issues.apache.org/jira/browse/HIVE-24324 > Project: Hive > Issue Type: Improvement > Components: Avro >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Major > > {{JsonProperties#getJsonProp}} was deprecated in Avro 1.8 and > removed in Avro 1.9. This change replaces its usage with > {{getObjectProp}}, which doesn't leak Jackson's JsonNode type. It will let > downstream apps depend on Hive while using a higher version of Avro, and > also help Hive upgrade its own Avro version. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23930) Upgrade to tez 0.10.0
[ https://issues.apache.org/jira/browse/HIVE-23930?focusedWorklogId=505917=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505917 ] ASF GitHub Bot logged work on HIVE-23930: - Author: ASF GitHub Bot Created on: 28/Oct/20 21:16 Start Date: 28/Oct/20 21:16 Worklog Time Spent: 10m Work Description: abstractdog commented on pull request #1311: URL: https://github.com/apache/hive/pull/1311#issuecomment-718212897 this is pushed to master directly with HIVE-24108, HIVE-23190 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505917) Time Spent: 1.5h (was: 1h 20m) > Upgrade to tez 0.10.0 > - > > Key: HIVE-23930 > URL: https://issues.apache.org/jira/browse/HIVE-23930 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Tez 0.10.0 is not yet released, but this ticket is for tracking the effort > and the needed hive changes. > Currently, Hive depends on 0.9.1 > Hadoop dependencies: > Hive/master: *3.1.0* > Tez/master: *3.1.3* > Tez/branch-0.9: *2.7.2* > TODOs: > - check why HIVE-23689 broke some unit tests intermittently (0.9.2 ->0.9.3 > bump), because a 0.10.x upgrade will also contain those tez changes which > could be related > - maintain the needed hive changes (reflecting tez api changes): > HIVE-23190: LLAP: modify IndexCache to pass filesystem object to > TezSpillRecord -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23190) LLAP: modify IndexCache to pass filesystem object to TezSpillRecord
[ https://issues.apache.org/jira/browse/HIVE-23190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-23190: Fix Version/s: 4.0.0 > LLAP: modify IndexCache to pass filesystem object to TezSpillRecord > --- > > Key: HIVE-23190 > URL: https://issues.apache.org/jira/browse/HIVE-23190 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-23190.01.patch > > > This ticket is about making the changes introduced in TEZ-4145 in Hive's copy > of IndexCache class. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24108) AddToClassPathAction should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-24108: Fix Version/s: 4.0.0 > AddToClassPathAction should use TezClassLoader > -- > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-24108.01.patch, HIVE-24108.02.patch, > hive_log_llap.log > > > TEZ-4228 fixes an issue from tez side, which is about to use TezClassLoader > instead of the system classloader. However, there are some codepaths, e.g. in > [^hive_log_llap.log] which shows that the system class loader is used. As > thread context classloaders are inherited, the easier solution is to > early-initialize TezClassLoader in LlapDaemon, and let all threads use that > as context class loader, so this solution is more like TEZ-4223 for llap > daemons. > {code} > 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:332) > at > org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:427) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:288) > ... 16 more > Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:100) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:95) > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313) > ... 
18 more > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76) > ... 21 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
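The root cause in the trace above — worker threads resolving classes through the system class loader — hinges on the fact that a newly created thread inherits its creator's context class loader. That is why early-initializing TezClassLoader in LlapDaemon is sufficient. A JDK-only sketch of the mechanism (the custom loader here is just a stand-in for TezClassLoader, not Hive/Tez code):

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.atomic.AtomicReference;

public class ContextClassLoaderDemo {
    public static void main(String[] args) throws Exception {
        // A custom loader standing in for TezClassLoader.
        ClassLoader custom = new URLClassLoader(new URL[0],
                ClassLoader.getSystemClassLoader());

        // "Early initialization": set it on the current thread BEFORE any
        // worker threads are spawned.
        Thread.currentThread().setContextClassLoader(custom);

        // Threads created afterwards inherit the creator's context class loader.
        AtomicReference<ClassLoader> seen = new AtomicReference<>();
        Thread worker = new Thread(() ->
                seen.set(Thread.currentThread().getContextClassLoader()));
        worker.start();
        worker.join();

        System.out.println(seen.get() == custom);  // true
    }
}
```

Threads spawned before the swap (or pool threads created earlier) keep the old loader, which matches the failing codepaths the log shows.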
[jira] [Resolved] (HIVE-23190) LLAP: modify IndexCache to pass filesystem object to TezSpillRecord
[ https://issues.apache.org/jira/browse/HIVE-23190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor resolved HIVE-23190. - Resolution: Fixed > LLAP: modify IndexCache to pass filesystem object to TezSpillRecord > --- > > Key: HIVE-23190 > URL: https://issues.apache.org/jira/browse/HIVE-23190 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-23190.01.patch > > > This ticket is about making the changes introduced in TEZ-4145 in Hive's copy > of IndexCache class. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24108) AddToClassPathAction should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222482#comment-17222482 ] László Bodor commented on HIVE-24108: - pushed to master, thanks [~harishjp], [~ashutoshc] for the review! > AddToClassPathAction should use TezClassLoader > -- > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: HIVE-24108.01.patch, HIVE-24108.02.patch, > hive_log_llap.log > > > TEZ-4228 fixes an issue on the Tez side by using TezClassLoader > instead of the system classloader. However, there are some codepaths, e.g. in > [^hive_log_llap.log], which show that the system class loader is still used. As > thread context classloaders are inherited, the easier solution is to > early-initialize TezClassLoader in LlapDaemon and let all threads use it > as their context class loader, so this solution is more like TEZ-4223 for LLAP > daemons.
> {code} > 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:332) > at > org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:427) > at > 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:288) > ... 16 more > Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:100) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:95) > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313) > ... 18 more > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76) > ... 21 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23190) LLAP: modify IndexCache to pass filesystem object to TezSpillRecord
[ https://issues.apache.org/jira/browse/HIVE-23190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222484#comment-17222484 ] László Bodor commented on HIVE-23190: - pushed to master, thanks [~rajesh.balamohan] for the review! > LLAP: modify IndexCache to pass filesystem object to TezSpillRecord > --- > > Key: HIVE-23190 > URL: https://issues.apache.org/jira/browse/HIVE-23190 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-23190.01.patch > > > This ticket is about making the changes introduced in TEZ-4145 in Hive's copy > of the IndexCache class. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-23930) Upgrade to tez 0.10.0
[ https://issues.apache.org/jira/browse/HIVE-23930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor resolved HIVE-23930. - Resolution: Fixed > Upgrade to tez 0.10.0 > - > > Key: HIVE-23930 > URL: https://issues.apache.org/jira/browse/HIVE-23930 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Tez 0.10.0 is not yet released, but this ticket is for tracking the effort > and the needed hive changes. > Currently, Hive depends on 0.9.1 > Hadoop dependencies: > Hive/master: *3.1.0* > Tez/master: *3.1.3* > Tez/branch-0.9: *2.7.2* > TODOs: > - check why HIVE-23689 broke some unit tests intermittently (0.9.2 ->0.9.3 > bump), because a 0.10.x upgrade will also contain those tez changes which > could be related > - maintain the needed hive changes (reflecting tez api changes): > HIVE-23190: LLAP: modify IndexCache to pass filesystem object to > TezSpillRecord -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23930) Upgrade to tez 0.10.0
[ https://issues.apache.org/jira/browse/HIVE-23930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-23930: Fix Version/s: 4.0.0 > Upgrade to tez 0.10.0 > - > > Key: HIVE-23930 > URL: https://issues.apache.org/jira/browse/HIVE-23930 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Tez 0.10.0 is not yet released, but this ticket is for tracking the effort > and the needed hive changes. > Currently, Hive depends on 0.9.1 > Hadoop dependencies: > Hive/master: *3.1.0* > Tez/master: *3.1.3* > Tez/branch-0.9: *2.7.2* > TODOs: > - check why HIVE-23689 broke some unit tests intermittently (0.9.2 ->0.9.3 > bump), because a 0.10.x upgrade will also contain those tez changes which > could be related > - maintain the needed hive changes (reflecting tez api changes): > HIVE-23190: LLAP: modify IndexCache to pass filesystem object to > TezSpillRecord -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24108) AddToClassPathAction should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor resolved HIVE-24108. - Resolution: Fixed > AddToClassPathAction should use TezClassLoader > -- > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: HIVE-24108.01.patch, HIVE-24108.02.patch, > hive_log_llap.log > > > TEZ-4228 fixes an issue from tez side, which is about to use TezClassLoader > instead of the system classloader. However, there are some codepaths, e.g. in > [^hive_log_llap.log] which shows that the system class loader is used. As > thread context classloaders are inherited, the easier solution is to > early-initialize TezClassLoader in LlapDaemon, and let all threads use that > as context class loader, so this solution is more like TEZ-4223 for llap > daemons. > {code} > 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62) > at > 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:332) > at > org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:427) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:288) > ... 16 more > Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:100) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:95) > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313) > ... 
18 more > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76) > ... 21 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-22415) Upgrade to Java 11
[ https://issues.apache.org/jira/browse/HIVE-22415?focusedWorklogId=505919=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505919 ] ASF GitHub Bot logged work on HIVE-22415: - Author: ASF GitHub Bot Created on: 28/Oct/20 21:19 Start Date: 28/Oct/20 21:19 Worklog Time Spent: 10m Work Description: abstractdog commented on pull request #1241: URL: https://github.com/apache/hive/pull/1241#issuecomment-718214905 FYI, HIVE-23930 is resolved, hive now depends on tez 0.10.0 which is JDK11 compliant This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505919) Time Spent: 3h 10m (was: 3h) > Upgrade to Java 11 > -- > > Key: HIVE-22415 > URL: https://issues.apache.org/jira/browse/HIVE-22415 > Project: Hive > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Critical > Labels: pull-request-available > Time Spent: 3h 10m > Remaining Estimate: 0h > > Upgrade Hive to Java JDK 11 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24270) Move scratchdir cleanup to background
[ https://issues.apache.org/jira/browse/HIVE-24270?focusedWorklogId=505862&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505862 ] ASF GitHub Bot logged work on HIVE-24270: - Author: ASF GitHub Bot Created on: 28/Oct/20 18:48 Start Date: 28/Oct/20 18:48 Worklog Time Spent: 10m Work Description: mustafaiman closed pull request #1577: URL: https://github.com/apache/hive/pull/1577 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505862) Time Spent: 2h 20m (was: 2h 10m) > Move scratchdir cleanup to background > - > > Key: HIVE-24270 > URL: https://issues.apache.org/jira/browse/HIVE-24270 > Project: Hive > Issue Type: Improvement >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > In cloud environments, scratchdir cleaning at the end of a query may take a > long time. This causes the client to hang for up to a minute even after the > results were streamed back. During this time the client just waits for > cleanup to finish. Cleanup can take place in the background in HiveServer. -- This message was sent by Atlassian Jira (v8.3.4#803005)
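The pattern this ticket describes can be sketched with plain `java.util.concurrent`: hand the scratch-directory deletion to a single background daemon thread so the query path returns to the client immediately. This is an illustrative sketch only (class and method names are made up for the example), not the actual HiveServer2 implementation:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Stream;

public class BackgroundScratchDirCleaner {
    // One daemon thread; deletion jobs queue up without blocking callers.
    private final ExecutorService cleanupPool = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "scratchdir-cleanup");
        t.setDaemon(true);
        return t;
    });

    /** Schedule recursive deletion and return immediately. */
    public void cleanupAsync(Path scratchDir) {
        cleanupPool.submit(() -> {
            try (Stream<Path> paths = Files.walk(scratchDir)) {
                // Reverse order deletes children before their parent directories.
                paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                    try { Files.deleteIfExists(p); } catch (IOException ignored) { }
                });
            } catch (IOException ignored) { }
        });
    }

    public void shutdown() throws InterruptedException {
        cleanupPool.shutdown();
        cleanupPool.awaitTermination(1, TimeUnit.MINUTES);
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("scratch");
        Files.writeString(dir.resolve("part-00000"), "tmp");
        BackgroundScratchDirCleaner cleaner = new BackgroundScratchDirCleaner();
        cleaner.cleanupAsync(dir);   // returns immediately
        cleaner.shutdown();          // the demo waits only to observe the result
        System.out.println(Files.exists(dir));  // false
    }
}
```

In a real server the trade-off is that failures become invisible to the client, so the background task would need its own logging and retry policy.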
[jira] [Work logged] (HIVE-24241) Enable SharedWorkOptimizer to merge downstream operators after an optimization step
[ https://issues.apache.org/jira/browse/HIVE-24241?focusedWorklogId=505853=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505853 ] ASF GitHub Bot logged work on HIVE-24241: - Author: ASF GitHub Bot Created on: 28/Oct/20 18:38 Start Date: 28/Oct/20 18:38 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #1562: URL: https://github.com/apache/hive/pull/1562#discussion_r513671527 ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/OperatorGraph.java ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.hive.ql.optimizer; + +import java.io.File; +import java.io.PrintWriter; +import java.util.HashMap; +import java.util.HashSet; +import java.util.LinkedHashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; + +import org.apache.hadoop.hive.ql.exec.AppMasterEventOperator; +import org.apache.hadoop.hive.ql.exec.Operator; +import org.apache.hadoop.hive.ql.exec.ReduceSinkOperator; +import org.apache.hadoop.hive.ql.exec.TableScanOperator; +import org.apache.hadoop.hive.ql.optimizer.calcite.rules.HivePointLookupOptimizerRule.DiGraph; +import org.apache.hadoop.hive.ql.parse.ParseContext; +import org.apache.hadoop.hive.ql.parse.SemiJoinBranchInfo; +import org.apache.hadoop.hive.ql.plan.DynamicPruningEventDesc; + +import com.google.common.collect.Sets; + +public class OperatorGraph { Review comment: @jcamachor this is the checker class I was talking about - right now it builds on top of the basic `digraph` class I've introduced some time ago in `PointLookupOptimizer` ## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/OperatorGraph.java ## @@ -0,0 +1,231 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ */ + +package org.apache.hadoop.hive.ql.optimizer; + +import java.io.File; +import java.io.PrintWriter; +import java.util.HashMap; +import java.util.HashSet; +import java.util.LinkedHashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; + +import org.apache.hadoop.hive.ql.exec.AppMasterEventOperator; +import org.apache.hadoop.hive.ql.exec.Operator; +import org.apache.hadoop.hive.ql.exec.ReduceSinkOperator; +import org.apache.hadoop.hive.ql.exec.TableScanOperator; +import org.apache.hadoop.hive.ql.optimizer.calcite.rules.HivePointLookupOptimizerRule.DiGraph; +import org.apache.hadoop.hive.ql.parse.ParseContext; +import org.apache.hadoop.hive.ql.parse.SemiJoinBranchInfo; +import org.apache.hadoop.hive.ql.plan.DynamicPruningEventDesc; + +import com.google.common.collect.Sets; + +public class OperatorGraph { + + /** + * A directed graph extended with support to check dag property. + */ + static class DagGraph extends DiGraph { Review comment: we can definitely roll our own graph representation; however, sometimes it would make things easier to have access to basic graph algorithms (for example to do a topological order walk/etc). There is a small library called [jgrapht](https://jgrapht.org/) (EPL 2.0 license - I think it will be okay) which could be utilized for these kinds of things. @jcamachor what do you think about pulling in the jgrapht lib and removing the makeshift digraph classes? ## File path:
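As a sketch of the kind of utility the review comment above is asking for, here is a self-contained Kahn's-algorithm topological sort that doubles as a DAG check (if the graph has a cycle, no complete order exists). The operator names in the demo are made up, and this is not Hive's `DagGraph` or jgrapht code, just an illustration of the algorithm:

```java
import java.util.*;

public class TopoOrder {
    /** Kahn's algorithm: a topological order, or empty if the graph has a cycle. */
    static <N> Optional<List<N>> topologicalOrder(Map<N, Set<N>> successors) {
        // In-degree of every node (targets are discovered via the edge lists).
        Map<N, Integer> inDegree = new HashMap<>();
        successors.keySet().forEach(n -> inDegree.putIfAbsent(n, 0));
        successors.values().forEach(ts ->
                ts.forEach(t -> inDegree.merge(t, 1, Integer::sum)));

        Deque<N> ready = new ArrayDeque<>();
        inDegree.forEach((n, d) -> { if (d == 0) ready.add(n); });

        List<N> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            N n = ready.poll();
            order.add(n);
            for (N t : successors.getOrDefault(n, Set.of())) {
                if (inDegree.merge(t, -1, Integer::sum) == 0) {
                    ready.add(t);
                }
            }
        }
        // Fewer nodes emitted than known means a cycle blocked completion: not a DAG.
        return order.size() == inDegree.size() ? Optional.of(order) : Optional.empty();
    }

    public static void main(String[] args) {
        // TS -> FIL -> RS and TS -> RS (hypothetical operator names).
        Map<String, Set<String>> g = Map.of(
                "TS", Set.of("FIL", "RS"),
                "FIL", Set.of("RS"));
        System.out.println(topologicalOrder(g).orElse(List.of()));  // [TS, FIL, RS]
    }
}
```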
[jira] [Commented] (HIVE-24108) AddToClassPathAction should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222330#comment-17222330 ] Ashutosh Chauhan commented on HIVE-24108: - +1 > AddToClassPathAction should use TezClassLoader > -- > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: HIVE-24108.01.patch, HIVE-24108.02.patch, > hive_log_llap.log > > > TEZ-4228 fixes an issue from tez side, which is about to use TezClassLoader > instead of the system classloader. However, there are some codepaths, e.g. in > [^hive_log_llap.log] which shows that the system class loader is used. As > thread context classloaders are inherited, the easier solution is to > early-initialize TezClassLoader in LlapDaemon, and let all threads use that > as context class loader, so this solution is more like TEZ-4223 for llap > daemons. > {code} > 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:332) > at > org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:427) > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:288) > ... 16 more > Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:100) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:95) > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313) > ... 
18 more > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76) > ... 21 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24270) Move scratchdir cleanup to background
[ https://issues.apache.org/jira/browse/HIVE-24270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-24270: Fix Version/s: 4.0.0 Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to master. Thanks, Mustafa! > Move scratchdir cleanup to background > - > > Key: HIVE-24270 > URL: https://issues.apache.org/jira/browse/HIVE-24270 > Project: Hive > Issue Type: Improvement >Reporter: Mustafa Iman >Assignee: Mustafa Iman >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > In cloud environments, scratchdir cleaning at the end of a query may take a > long time. This causes the client to hang for up to a minute even after the > results were streamed back. During this time the client just waits for > cleanup to finish. Cleanup can take place in the background in HiveServer. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24323) JDBC driver fails when using Kerberos due to missing dependencies
[ https://issues.apache.org/jira/browse/HIVE-24323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] N Campbell updated HIVE-24323: -- Description: *The Apache Hive web pages historically implied that only 3 JAR files are required* hadoop-auth hadoop-common hive-jdbc *If a connection is attempted using Kerberos authentication, it will fail due to several missing dependencies* hadoop-auth-3.1.1.3.1.5.0-152.jar hadoop-common-3.1.1.3.1.5.0-152.jar hive-jdbc-3.1.0.3.1.5.0-152-standalone.jar *Dependencies* commons-collections-3.2.2.jar commons-configuration2.jar commons-lang-2.6.jar guava-29.0-jre.jar log4j-1.2.17.jar slf4j-api-1.7.25.jar *It is unclear if the intent of the standalone JAR is to include these dependencies or not.* But there does not seem to be any documentation either way. *It also appears that dependencies are not being shaded, which can result in conflicts with guava or wstx jar files in the class path.* This is noted in Oracle {color:#00}Doc ID 2650046.1{color} {color:#00} com.ctc.wstx.io.StreamBootstrapper.getInstance(Ljava/lang/String;Lcom/ctc/wstx/io/SystemId;Ljava/io/InputStream;)Lcom/ctc/wstx/io/StreamBootstrapper; ] java.lang.NoSuchMethodError: com.ctc.wstx.io.StreamBootstrapper.getInstance(Ljava/lang/String;Lcom/ctc/wstx/io/SystemId;Ljava/io/InputStream;)Lcom/ctc/wstx/io/StreamBootstrapper; at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2918) at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2901){color} was: The Apache Hive web pages historically implied that only 3 JAR files are required hadoop-auth hadoop-common hive-jdbc If a connection is attempted using Kerberos authentication, it will fail due to several missing dependencies hadoop-auth-3.1.1.3.1.5.0-152.jar hadoop-common-3.1.1.3.1.5.0-152.jar hive-jdbc-3.1.0.3.1.5.0-152-standalone.jar It is unclear if the intent of the standalone JAR is to include these dependencies or not. But there does not seem to be any documentation either way. 
It also appears that dependencies are not being shaded, which can result in conflicts with guava or wstx jar files in the class path. This is noted in Oracle {color:#00}Doc ID 2650046.1{color} commons-collections-3.2.2.jar commons-configuration2.jar commons-lang-2.6.jar guava-29.0-jre.jar log4j-1.2.17.jar slf4j-api-1.7.25.jar > JDBC driver fails when using Kerberos due to missing dependencies > - > > Key: HIVE-24323 > URL: https://issues.apache.org/jira/browse/HIVE-24323 > Project: Hive > Issue Type: Bug > Components: JDBC >Affects Versions: 3.1.0 >Reporter: N Campbell >Priority: Major > > *The Apache Hive web pages historically implied that only 3 JAR files are > required* > hadoop-auth > hadoop-common > hive-jdbc > *If a connection is attempted using Kerberos authentication, it will fail due > to several missing dependencies* > hadoop-auth-3.1.1.3.1.5.0-152.jar > hadoop-common-3.1.1.3.1.5.0-152.jar > hive-jdbc-3.1.0.3.1.5.0-152-standalone.jar > *Dependencies* > commons-collections-3.2.2.jar > commons-configuration2.jar > commons-lang-2.6.jar > guava-29.0-jre.jar > log4j-1.2.17.jar > slf4j-api-1.7.25.jar > *It is unclear if the intent of the standalone JAR is to include these > dependencies or not.* > But there does not seem to be any documentation either way. 
> *It also appears that dependencies are not being shaded, which can result in > conflicts with guava or wstx jar files in the class path.* > Such as noted by ORACLE {color:#00}Doc ID 2650046.1{color} > {color:#00} > com.ctc.wstx.io.StreamBootstrapper.getInstance(Ljava/lang/String;Lcom/ctc/wstx/io/SystemId;Ljava/io/InputStream;)Lcom/ctc/wstx/io/StreamBootstrapper; > ] > java.lang.NoSuchMethodError: > com.ctc.wstx.io.StreamBootstrapper.getInstance(Ljava/lang/String;Lcom/ctc/wstx/io/SystemId;Ljava/io/InputStream;)Lcom/ctc/wstx/io/StreamBootstrapper; > at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2918) > at > org.apache.hadoop.conf.Configuration.parse(Configuration.java:2901){color} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23410) ACID: Improve the delete and update operations to avoid the move step
[ https://issues.apache.org/jira/browse/HIVE-23410?focusedWorklogId=505832=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505832 ] ASF GitHub Bot logged work on HIVE-23410: - Author: ASF GitHub Bot Created on: 28/Oct/20 17:56 Start Date: 28/Oct/20 17:56 Worklog Time Spent: 10m Work Description: kuczoram opened a new pull request #1620: URL: https://github.com/apache/hive/pull/1620 …he move step ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505832) Time Spent: 0.5h (was: 20m) > ACID: Improve the delete and update operations to avoid the move step > - > > Key: HIVE-23410 > URL: https://issues.apache.org/jira/browse/HIVE-23410 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Marta Kuczora >Assignee: Marta Kuczora >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23410.1.patch > > Time Spent: 0.5h > Remaining Estimate: 0h > > This is a follow-up task for > [HIVE-21164|https://issues.apache.org/jira/browse/HIVE-21164], where the > insert operation has been modified to write directly to the table locations > instead of the staging directory. The same improvement should be done for the > ACID update and delete operations as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23410) ACID: Improve the delete and update operations to avoid the move step
[ https://issues.apache.org/jira/browse/HIVE-23410?focusedWorklogId=505829=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505829 ] ASF GitHub Bot logged work on HIVE-23410: - Author: ASF GitHub Bot Created on: 28/Oct/20 17:44 Start Date: 28/Oct/20 17:44 Worklog Time Spent: 10m Work Description: kuczoram closed pull request #1557: URL: https://github.com/apache/hive/pull/1557 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505829) Time Spent: 20m (was: 10m) > ACID: Improve the delete and update operations to avoid the move step > - > > Key: HIVE-23410 > URL: https://issues.apache.org/jira/browse/HIVE-23410 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0 >Reporter: Marta Kuczora >Assignee: Marta Kuczora >Priority: Major > Labels: pull-request-available > Attachments: HIVE-23410.1.patch > > Time Spent: 20m > Remaining Estimate: 0h > > This is a follow-up task for > [HIVE-21164|https://issues.apache.org/jira/browse/HIVE-21164], where the > insert operation has been modified to write directly to the table locations > instead of the staging directory. The same improvement should be done for the > ACID update and delete operations as well. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24322) In case of direct insert, the attempt ID has to be checked when reading the manifest files
[ https://issues.apache.org/jira/browse/HIVE-24322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marta Kuczora updated HIVE-24322: - Description: In IMPALA-10247 there was an exception from Hive when trying to load the data: {noformat} 2020-10-13T16:50:53,424 ERROR [HiveServer2-Background-Pool: Thread-23832] exec.Task: Job Commit failed with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.io.EOFException)' org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1468) at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:798) at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803) at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803) at org.apache.hadoop.hive.ql.exec.tez.TezTask.close(TezTask.java:627) at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:342) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225) at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:392) at org.apache.hadoop.hive.ql.exec.Utilities.handleDirectInsertTableFinalPath(Utilities.java:4587) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1462) ... 29 more {noformat} The reason for the exception was that Hive was trying to read an empty manifest file. Manifest files are used in case of direct insert to determine which files need to be kept and which ones need to be cleaned up. They are created by the tasks and they use the task attempt Id as postfix. In this particular test what happened is that one of the containers ran out of memory so Tez decided to kill it right after the manifest file got created but before the paths got written into the manifest file. This was the manifest file for the task attempt 0. Then Tez assigned a new container to the task, so a new attempt was made with attemptId=1. This one was successful, and wrote the manifest file correctly. But Hive didn't know about this, since this out of memory issue got handled by Tez under the hood, so there was no exception in Hive, therefore no clean-up in the manifest folder. 
And when Hive is reading the manifest files, it just reads every file from the defined folder, so it tried to read the manifest files for attempt 0 and 1 as well. If there are multiple manifest files with the same name but different attemptId, Hive should only read the one with the biggest attempt Id. was: In [IMPALA-10247|https://issues.apache.org/jira/browse/IMPALA-10247] there was an exception from Hive when trying to load the data: {noformat} 2020-10-13T16:50:53,424 ERROR [HiveServer2-Background-Pool: Thread-23832] exec.Task: Job Commit failed with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.io.EOFException)' org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1468) at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:798) at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803) at
[jira] [Assigned] (HIVE-24322) In case of direct insert, the attempt ID has to be checked when reading the manifest files
[ https://issues.apache.org/jira/browse/HIVE-24322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marta Kuczora reassigned HIVE-24322: > In case of direct insert, the attempt ID has to be checked when reading the > manifest files > -- > > Key: HIVE-24322 > URL: https://issues.apache.org/jira/browse/HIVE-24322 > Project: Hive > Issue Type: Bug >Affects Versions: 4.0.0 >Reporter: Marta Kuczora >Assignee: Marta Kuczora >Priority: Major > Fix For: 4.0.0 > > > In [IMPALA-10247|https://issues.apache.org/jira/browse/IMPALA-10247] there > was an exception from Hive when trying to load the data: > {noformat} > 2020-10-13T16:50:53,424 ERROR [HiveServer2-Background-Pool: Thread-23832] > exec.Task: Job Commit failed with exception > 'org.apache.hadoop.hive.ql.metadata.HiveException(java.io.EOFException)' > org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1468) > at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:798) > at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803) > at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.close(TezTask.java:627) > at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:342) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) > at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) > at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) > at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) > at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) > at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225) > at > org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) > at > org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at > org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:392) > at > org.apache.hadoop.hive.ql.exec.Utilities.handleDirectInsertTableFinalPath(Utilities.java:4587) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1462) > ... 29 more > {noformat} > The reason for the exception was that Hive was trying to read an empty > manifest file. Manifest files are used in case of direct insert to determine > which files need to be kept and which ones need to be cleaned up. They are > created by the tasks and they use the task attempt Id as postfix. In this > particular test what happened is that one of the containers ran out of memory > so Tez decided to kill it right after the manifest file got created but > before the paths got written into the manifest file. 
This was the manifest > file for the task attempt 0. Then Tez assigned a new container to the task, > so a new attempt was made with attemptId=1. This one was successful, and wrote > the manifest file correctly. But Hive didn't know about this, since this out > of memory issue got handled by Tez under the hood, so there was no exception > in Hive, therefore no clean-up in the manifest folder. And when Hive is > reading the manifest files, it just reads every file from the defined folder, > so it tried to read the manifest files for attempt 0 and 1 as well. > If there are multiple manifest files with the same name but different > attemptId, Hive should only read the one with the biggest attempt Id. -- This message was sent by Atlassian Jira (v8.3.4#803005)
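The fix described above — when several manifest files differ only in their attempt-id suffix, read only the one with the highest attempt id — can be sketched as follows. The `<base>.<attemptId>` file-name convention used here is an assumption for illustration, not Hive's actual manifest naming.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch of HIVE-24322's selection rule: group manifest files
// by their base name and keep only the highest attempt id in each group.
// The "<base>.<attemptId>" naming convention here is assumed for the example.
public class ManifestSelector {

    // Returns, for each base name, the file with the largest attempt id.
    public static List<String> selectLatest(List<String> manifestFiles) {
        Map<String, String> best = new TreeMap<>();
        for (String f : manifestFiles) {
            int dot = f.lastIndexOf('.');
            String base = f.substring(0, dot);
            int attempt = Integer.parseInt(f.substring(dot + 1));
            String current = best.get(base);
            if (current == null
                || attempt > Integer.parseInt(current.substring(current.lastIndexOf('.') + 1))) {
                best.put(base, f);
            }
        }
        return new ArrayList<>(best.values());
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList(
            "000000_0.manifest.0",   // killed attempt: possibly empty
            "000000_0.manifest.1",   // successful retry
            "000001_0.manifest.0");
        System.out.println(selectLatest(files));
        // prints [000000_0.manifest.1, 000001_0.manifest.0]
    }
}
```

This mirrors the scenario in the report: the empty manifest from the killed attempt 0 is simply never opened because attempt 1 shadows it.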
[jira] [Updated] (HIVE-24321) Implement Default getSerDeStats in AbstractSerDe
[ https://issues.apache.org/jira/browse/HIVE-24321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24321: -- Labels: pull-request-available (was: ) > Implement Default getSerDeStats in AbstractSerDe > > > Key: HIVE-24321 > URL: https://issues.apache.org/jira/browse/HIVE-24321 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Seems like very few SerDes implement the getSerDeStats feature. Add a > default implementation and remove all of the superfluous overrides in the > implementing classes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24321) Implement Default getSerDeStats in AbstractSerDe
[ https://issues.apache.org/jira/browse/HIVE-24321?focusedWorklogId=505778=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505778 ] ASF GitHub Bot logged work on HIVE-24321: - Author: ASF GitHub Bot Created on: 28/Oct/20 15:55 Start Date: 28/Oct/20 15:55 Worklog Time Spent: 10m Work Description: belugabehr opened a new pull request #1619: URL: https://github.com/apache/hive/pull/1619 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505778) Remaining Estimate: 0h Time Spent: 10m > Implement Default getSerDeStats in AbstractSerDe > > > Key: HIVE-24321 > URL: https://issues.apache.org/jira/browse/HIVE-24321 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Seems like very few SerDes implement the getSerDeStats feature. Add a > default implementation and remove all of the superfluous overrides in the > implementing classes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24321) Implement Default getSerDeStats in AbstractSerDe
[ https://issues.apache.org/jira/browse/HIVE-24321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor reassigned HIVE-24321: - > Implement Default getSerDeStats in AbstractSerDe > > > Key: HIVE-24321 > URL: https://issues.apache.org/jira/browse/HIVE-24321 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > > Seems like very few SerDes implement the getSerDeStats feature. Add a > default implementation and remove all of the superfluous overrides in the > implementing classes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
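The change described in HIVE-24321 above — a single default getSerDeStats in the abstract base class instead of identical overrides in many implementing classes — follows a standard pattern. The simplified types below are illustrative stand-ins, not Hive's real AbstractSerDe API.

```java
// Sketch of the pattern behind HIVE-24321: hoist a shared default into the
// abstract base class so subclasses stop overriding it identically. The
// types here are simplified stand-ins, not Hive's real AbstractSerDe API.
abstract class BaseSerDe {
    // Default implementation: most SerDes collect no statistics, so return
    // null once here instead of in every implementing class.
    public Object getSerDeStats() {
        return null;
    }
}

class CsvLikeSerDe extends BaseSerDe {
    // No getSerDeStats override needed any more.
}

public class SerDeStatsDemo {
    public static Object statsOfDefaultSerDe() {
        return new CsvLikeSerDe().getSerDeStats();
    }

    public static void main(String[] args) {
        System.out.println(statsOfDefaultSerDe()); // prints "null"
    }
}
```

SerDes that do track statistics still override the method; everything else inherits the default, which is exactly the "remove all of the superfluous overrides" cleanup the ticket asks for.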
[jira] [Commented] (HIVE-24253) HMS and HS2 needs to support keystore/truststores types besides JKS by config
[ https://issues.apache.org/jira/browse/HIVE-24253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1730#comment-1730 ] Zoltan Haindrich commented on HIVE-24253: - [~ychena] the PR seems to be merged around a week ago - I think we should close this ticket. but...I was coming here for a different reason: it seems to me that you've made some improvements to the `TestSSL` test ; do you think we can remove the Ignore from it? (some testcases from TestSSL were frequent guests in unrelated testruns - so it was marked as ignored) https://github.com/apache/hive/blob/375433510b73c5a22bde4e13485dfc16eaa24706/itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestSSL.java#L56 > HMS and HS2 needs to support keystore/truststores types besides JKS by config > - > > Key: HIVE-24253 > URL: https://issues.apache.org/jira/browse/HIVE-24253 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Standalone Metastore >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > When HiveMetaStoreClient connects to HMS with enabled SSL, HMS should support > the Keystore type configurable and default to keystore type specified for the > JDK and not always use JKS. Same as HIVE-23958 for hive, HMS should support > to set additional keystore/truststore types used for different applications > like for FIPS crypto algorithms. > Also, make hive keystore type and algorithm configurable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
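The ticket above asks that the keystore type default to whatever the JDK is configured with rather than being hardcoded to JKS. In plain JSSE code that looks roughly like the sketch below; the configuration property name is hypothetical and this is not the actual HMS/HS2 implementation.

```java
import java.security.KeyStore;

// Illustrative: resolve the keystore type from configuration, falling back
// to the JDK default (which a FIPS-configured JDK may set to e.g. PKCS12 or
// BCFKS) instead of hardcoding "JKS". The property name is hypothetical.
public class KeystoreTypeDemo {

    public static String resolveKeystoreType(String configured) {
        // Empty/absent config -> whatever the JDK itself is configured with.
        return (configured == null || configured.isEmpty())
            ? KeyStore.getDefaultType()
            : configured;
    }

    public static void main(String[] args) throws Exception {
        String type = resolveKeystoreType(System.getProperty("hive.ssl.keystore.type"));
        KeyStore ks = KeyStore.getInstance(type); // no longer pinned to JKS
        System.out.println(ks.getType());
    }
}
```

The key point is `KeyStore.getDefaultType()`: it reads the `keystore.type` security property, so a deployment that swaps in a FIPS provider gets the right type without any Hive-side code change.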
[jira] [Work logged] (HIVE-24316) Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
[ https://issues.apache.org/jira/browse/HIVE-24316?focusedWorklogId=505767=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505767 ] ASF GitHub Bot logged work on HIVE-24316: - Author: ASF GitHub Bot Created on: 28/Oct/20 15:36 Start Date: 28/Oct/20 15:36 Worklog Time Spent: 10m Work Description: dongjoon-hyun opened a new pull request #1616: URL: https://github.com/apache/hive/pull/1616 ### What changes were proposed in this pull request? This PR aims to upgrade Apache ORC from 1.5.6 to 1.5.8. ### Why are the changes needed? This will bring eleven bug fixes. - ORC 1.5.7: https://issues.apache.org/jira/projects/ORC/versions/12345702 - ORC 1.5.8: https://issues.apache.org/jira/projects/ORC/versions/12346462 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CI with the existing test cases. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505767) Time Spent: 0.5h (was: 20m) > Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1 > - > > Key: HIVE-24316 > URL: https://issues.apache.org/jira/browse/HIVE-24316 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.1.3 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > This will bring eleven bug fixes. > * ORC 1.5.7: [https://issues.apache.org/jira/projects/ORC/versions/12345702] > * ORC 1.5.8: [https://issues.apache.org/jira/projects/ORC/versions/12346462] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24316) Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
[ https://issues.apache.org/jira/browse/HIVE-24316?focusedWorklogId=505768=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505768 ] ASF GitHub Bot logged work on HIVE-24316: - Author: ASF GitHub Bot Created on: 28/Oct/20 15:36 Start Date: 28/Oct/20 15:36 Worklog Time Spent: 10m Work Description: dongjoon-hyun closed pull request #1616: URL: https://github.com/apache/hive/pull/1616 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505768) Time Spent: 40m (was: 0.5h) > Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1 > - > > Key: HIVE-24316 > URL: https://issues.apache.org/jira/browse/HIVE-24316 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.1.3 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > This will bring eleven bug fixes. > * ORC 1.5.7: [https://issues.apache.org/jira/projects/ORC/versions/12345702] > * ORC 1.5.8: [https://issues.apache.org/jira/projects/ORC/versions/12346462] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24316) Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
[ https://issues.apache.org/jira/browse/HIVE-24316?focusedWorklogId=505769=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505769 ] ASF GitHub Bot logged work on HIVE-24316: - Author: ASF GitHub Bot Created on: 28/Oct/20 15:36 Start Date: 28/Oct/20 15:36 Worklog Time Spent: 10m Work Description: dongjoon-hyun commented on pull request #1616: URL: https://github.com/apache/hive/pull/1616#issuecomment-718017194 Thanks, @pgaref . I closed and reopened this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505769) Time Spent: 50m (was: 40m) > Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1 > - > > Key: HIVE-24316 > URL: https://issues.apache.org/jira/browse/HIVE-24316 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.1.3 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > This will bring eleven bug fixes. > * ORC 1.5.7: [https://issues.apache.org/jira/projects/ORC/versions/12345702] > * ORC 1.5.8: [https://issues.apache.org/jira/projects/ORC/versions/12346462] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24297) LLAP buffer collision causes NPE
[ https://issues.apache.org/jira/browse/HIVE-24297?focusedWorklogId=505743=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505743 ] ASF GitHub Bot logged work on HIVE-24297: - Author: ASF GitHub Bot Created on: 28/Oct/20 14:43 Start Date: 28/Oct/20 14:43 Worklog Time Spent: 10m Work Description: szlta merged pull request #1614: URL: https://github.com/apache/hive/pull/1614 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505743) Time Spent: 20m (was: 10m) > LLAP buffer collision causes NPE > > > Key: HIVE-24297 > URL: https://issues.apache.org/jira/browse/HIVE-24297 > Project: Hive > Issue Type: Bug >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > HIVE-23741 introduced an optimization so that CacheTags are not stored on > buffer level, but rather on file level, as one cache tag can only relate to > one file. With this change a buffer->filecache reference was introduced so > that the buffer's tag can be calculated with an extra indirection i.e. > buffer.filecache.tag. 
> However, during a buffer collision in the putFileData method, we don't set the > filecache reference of the collided (new) buffer: > [https://github.com/apache/hive/commit/2e18a7408a8dd49beecad8d66bfe054b7dc474da#diff-d2ccd7cf3042845a0812a5e118f82db49253d82fc86449ffa408903bf434fb6dR309-R311] > Later this causes an NPE when the new (instantly decRef'ed) buffer is evicted: > {code:java} > Caused by: java.lang.NullPointerException > at > java.util.concurrent.ConcurrentSkipListMap.doGet(ConcurrentSkipListMap.java:778) > at > java.util.concurrent.ConcurrentSkipListMap.get(ConcurrentSkipListMap.java:1546) > at > org.apache.hadoop.hive.llap.cache.CacheContentsTracker.getTagState(CacheContentsTracker.java:129) > at > org.apache.hadoop.hive.llap.cache.CacheContentsTracker.getTagState(CacheContentsTracker.java:125) > at > org.apache.hadoop.hive.llap.cache.CacheContentsTracker.reportRemoved(CacheContentsTracker.java:109) > at > org.apache.hadoop.hive.llap.cache.CacheContentsTracker.notifyEvicted(CacheContentsTracker.java:238) > at > org.apache.hadoop.hive.llap.cache.LowLevelLrfuCachePolicy.evictSomeBlocks(LowLevelLrfuCachePolicy.java:276) > at > org.apache.hadoop.hive.llap.cache.CacheContentsTracker.evictSomeBlocks(CacheContentsTracker.java:177) > at > org.apache.hadoop.hive.llap.cache.LowLevelCacheMemoryManager.reserveMemory(LowLevelCacheMemoryManager.java:98) > at > org.apache.hadoop.hive.llap.cache.LowLevelCacheMemoryManager.reserveMemory(LowLevelCacheMemoryManager.java:65) > at > org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:323) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.allocateMultiple(EncodedReaderImpl.java:1302) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:930) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:506) > ... 16 more {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
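The root cause above is a classic back-reference gap: a freshly created object collides with an existing one and never receives the parent pointer that eviction later dereferences. A minimal illustration of the failure mode and the fix — all types here are simplified stand-ins, not LLAP's real classes:

```java
// Minimal illustration of the HIVE-24297 failure mode: a buffer whose
// "fileCache" back-reference was never set NPEs when eviction asks for its
// tag. All types here are simplified stand-ins, not LLAP's real classes.
public class BufferCollisionDemo {
    public static class FileCache {
        final String tag;
        public FileCache(String t) { tag = t; }
    }

    public static class Buffer {
        FileCache fileCache; // must be set before the buffer can be evicted

        public String tagForEviction() {
            // Pre-fix behavior: NPE here if fileCache was never assigned.
            return fileCache.tag;
        }
    }

    // The fix: on collision, propagate the file-level cache reference to the
    // new buffer before it becomes visible to the eviction policy.
    public static Buffer onCollision(Buffer collided, FileCache fileLevelCache) {
        collided.fileCache = fileLevelCache;
        return collided;
    }

    public static void main(String[] args) {
        Buffer b = onCollision(new Buffer(), new FileCache("db.table/part=1"));
        System.out.println(b.tagForEviction()); // prints "db.table/part=1"
    }
}
```

Without the `onCollision` assignment, `tagForEviction()` would throw the same NullPointerException the stack trace shows — far from the line where the reference was forgotten, which is what made the bug obscure.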
[jira] [Resolved] (HIVE-23829) Compute Stats Incorrect for Binary Columns
[ https://issues.apache.org/jira/browse/HIVE-23829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Mollitor resolved HIVE-23829. --- Fix Version/s: 4.0.0 Resolution: Fixed Pushed to master. Thanks! > Compute Stats Incorrect for Binary Columns > -- > > Key: HIVE-23829 > URL: https://issues.apache.org/jira/browse/HIVE-23829 > Project: Hive > Issue Type: Bug >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > I came across an issue when working on [HIVE-22674]. > The SerDe used for processing binary data tries to auto-detect if the data is > in Base-64. It uses > {{org.apache.commons.codec.binary.Base64#isArrayByteBase64}} which has two > issues: > # It's slow since it will check if the array is compatible,... and then > process the data (examines the array twice) > # More importantly, this method _Tests a given byte array to see if it > contains only valid characters within the Base64 alphabet. Currently the > method treats whitespace as valid._ > https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/binary/Base64.html#isArrayByteBase64-byte:A- > The > [qtest|https://github.com/apache/hive/blob/f98e136bdd5642e3de10d2fd1a4c14d1d6762113/ql/src/test/queries/clientpositive/compute_stats_binary.q] > for this feature uses full sentences (which includes spaces) > [here|https://github.com/apache/hive/blob/f98e136bdd5642e3de10d2fd1a4c14d1d6762113/data/files/binary.txt] > and therefore it thinks this data is Base-64 and returns an incorrect > estimation for size. > This should really not auto-detect Base64 data and instead it should be > enabled with a table property. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23829) Compute Stats Incorrect for Binary Columns
[ https://issues.apache.org/jira/browse/HIVE-23829?focusedWorklogId=505720=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505720 ] ASF GitHub Bot logged work on HIVE-23829: - Author: ASF GitHub Bot Created on: 28/Oct/20 13:31 Start Date: 28/Oct/20 13:31 Worklog Time Spent: 10m Work Description: belugabehr merged pull request #1313: URL: https://github.com/apache/hive/pull/1313 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505720) Time Spent: 1h 40m (was: 1.5h) > Compute Stats Incorrect for Binary Columns > -- > > Key: HIVE-23829 > URL: https://issues.apache.org/jira/browse/HIVE-23829 > Project: Hive > Issue Type: Bug >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > I came across an issue when working on [HIVE-22674]. > The SerDe used for processing binary data tries to auto-detect if the data is > in Base-64. It uses > {{org.apache.commons.codec.binary.Base64#isArrayByteBase64}} which has two > issues: > # It's slow since it will check if the array is compatible,... and then > process the data (examines the array twice) > # More importantly, this method _Tests a given byte array to see if it > contains only valid characters within the Base64 alphabet. 
Currently the > method treats whitespace as valid._ > https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/binary/Base64.html#isArrayByteBase64-byte:A- > The > [qtest|https://github.com/apache/hive/blob/f98e136bdd5642e3de10d2fd1a4c14d1d6762113/ql/src/test/queries/clientpositive/compute_stats_binary.q] > for this feature uses full sentences (which includes spaces) > [here|https://github.com/apache/hive/blob/f98e136bdd5642e3de10d2fd1a4c14d1d6762113/data/files/binary.txt] > and therefore it thinks this data is Base-64 and returns an incorrect > estimation for size. > This should really not auto-detect Base64 data and instead it should be > enabled with a table property. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23829) Compute Stats Incorrect for Binary Columns
[ https://issues.apache.org/jira/browse/HIVE-23829?focusedWorklogId=505721=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505721 ] ASF GitHub Bot logged work on HIVE-23829: - Author: ASF GitHub Bot Created on: 28/Oct/20 13:31 Start Date: 28/Oct/20 13:31 Worklog Time Spent: 10m Work Description: belugabehr commented on pull request #1313: URL: https://github.com/apache/hive/pull/1313#issuecomment-717934374 @HunterL Merged to master. Thanks! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505721) Time Spent: 1h 50m (was: 1h 40m) > Compute Stats Incorrect for Binary Columns > -- > > Key: HIVE-23829 > URL: https://issues.apache.org/jira/browse/HIVE-23829 > Project: Hive > Issue Type: Bug >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Minor > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > I came across an issue when working on [HIVE-22674]. > The SerDe used for processing binary data tries to auto-detect if the data is > in Base-64. It uses > {{org.apache.commons.codec.binary.Base64#isArrayByteBase64}} which has two > issues: > # It's slow since it will check if the array is compatible,... and then > process the data (examines the array twice) > # More importantly, this method _Tests a given byte array to see if it > contains only valid characters within the Base64 alphabet. 
Currently the > method treats whitespace as valid._ > https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/binary/Base64.html#isArrayByteBase64-byte:A- > The > [qtest|https://github.com/apache/hive/blob/f98e136bdd5642e3de10d2fd1a4c14d1d6762113/ql/src/test/queries/clientpositive/compute_stats_binary.q] > for this feature uses full sentences (which includes spaces) > [here|https://github.com/apache/hive/blob/f98e136bdd5642e3de10d2fd1a4c14d1d6762113/data/files/binary.txt] > and therefore it thinks this data is Base-64 and returns an incorrect > estimation for size. > This should really not auto-detect Base64 data and instead it should be > enabled with a table property. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24288) Files created by CompileProcessor have incorrect permissions
[ https://issues.apache.org/jira/browse/HIVE-24288?focusedWorklogId=505716=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505716 ] ASF GitHub Bot logged work on HIVE-24288: - Author: ASF GitHub Bot Created on: 28/Oct/20 13:04 Start Date: 28/Oct/20 13:04 Worklog Time Spent: 10m Work Description: nrg4878 commented on a change in pull request #1590: URL: https://github.com/apache/hive/pull/1590#discussion_r513424742 ## File path: ql/src/java/org/apache/hadoop/hive/ql/processors/CompileProcessor.java ## @@ -254,6 +276,9 @@ CommandProcessorResponse compile(SessionState ss) throws CommandProcessorExcepti if (ss != null){ ss.add_resource(ResourceType.JAR, testArchive.getAbsolutePath()); + try { +testArchive.deleteOnExit(); Review comment: after testing further, somehow deleting the jar after add_resource will result in CNFE when creating the UDF. So the above should be fine given the permissions are rw java.lang.ClassNotFoundException: Pyth at java.net.URLClassLoader.findClass(URLClassLoader.java:382) ~[?:1.8.0_231] at java.lang.ClassLoader.loadClass(ClassLoader.java:418) ~[?:1.8.0_231] at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ~[?:1.8.0_231] at java.lang.Class.forName0(Native Method) ~[?:1.8.0_231] at java.lang.Class.forName(Class.java:348) ~[?:1.8.0_231] at org.apache.hadoop.hive.ql.ddl.function.create.CreateFunctionOperation.getUdfClass(CreateFunctionOperation.java:96) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.ddl.function.create.CreateFunctionOperation.createTemporaryFunction(CreateFunctionOperation.java:73) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.ddl.function.create.CreateFunctionOperation.execute(CreateFunctionOperation.java:57) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:80) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:361) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:334) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:245) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:108) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:326) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:149) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:144) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:164) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:228) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation.access$500(SQLOperation.java:88) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:325) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_231] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_231] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) ~[hadoop-common-3.1.0.jar:?] 
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:343) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_231] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_231] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_231] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_231] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_231] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
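The trade-off discussed in the review comment — clean up a generated artifact without deleting it while a later consumer (here, class loading for the UDF) still needs it — is exactly what `File.deleteOnExit()` provides: deletion is deferred to JVM shutdown. A rough Python analogue of that pattern, using `atexit` (the file names and helper are hypothetical, for illustration only):

```python
import atexit
import os
import tempfile


def create_session_artifact():
    """Create a temporary 'jar-like' artifact and register it for deletion
    at interpreter exit, instead of deleting it eagerly. Deleting right
    after registration would break later consumers (the CNFE in the
    report); deferring keeps the file visible for the session's lifetime."""
    fd, path = tempfile.mkstemp(suffix=".jar")
    os.close(fd)
    atexit.register(lambda p=path: os.path.exists(p) and os.remove(p))
    return path


path = create_session_artifact()
# The artifact is still present for any 'class loading' done later in the
# session; it is only removed when the process exits.
assert os.path.exists(path)
```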
[jira] [Commented] (HIVE-24320) TestMiniLlapLocal sometimes hangs because of some derby issues
[ https://issues.apache.org/jira/browse/HIVE-24320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222124#comment-17222124 ] Zoltan Haindrich commented on HIVE-24320: - attached the last 1000 lines of the hive log and the full jstack trace. There seem to be some Derby issues prior to this state - not sure if those are only caused by the issue or are part of the root cause. The issues seem to start with: {code} 2020-10-28T01:24:33,767 WARN [Heartbeater-3] pool.ProxyConnection: HikariPool-3 - Connection org.apache.derby.impl.jdbc.EmbedConnection@1913174287 (XID = null), (SESSIONID = 68), (DATABASE = /home/jenkins/agent/workspace/internal-hive-precommit_PR-2/itests/qtest/target/tmp/junit_metastore_db), (DRDAID = null) marked as broken because of SQLSTATE(08003), ErrorCode(4) java.sql.SQLNonTransientConnectionException: No current connection. at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source) ~[derby-10.14.2.0.jar:?] at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source) ~[derby-10.14.2.0.jar:?] at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source) ~[derby-10.14.2.0.jar:?] at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source) ~[derby-10.14.2.0.jar:?] at org.apache.derby.impl.jdbc.Util.noCurrentConnection(Unknown Source) ~[derby-10.14.2.0.jar:?] at org.apache.derby.impl.jdbc.EmbedConnection.checkIfClosed(Unknown Source) ~[derby-10.14.2.0.jar:?] at org.apache.derby.impl.jdbc.EmbedConnection.setupContextStack(Unknown Source) ~[derby-10.14.2.0.jar:?] at org.apache.derby.impl.jdbc.EmbedConnection.rollback(Unknown Source) ~[derby-10.14.2.0.jar:?] at com.zaxxer.hikari.pool.ProxyConnection.rollback(ProxyConnection.java:362) ~[HikariCP-2.6.1.jar:?] at com.zaxxer.hikari.pool.HikariProxyConnection.rollback(HikariProxyConnection.java) ~[HikariCP-2.6.1.jar:?] 
at org.apache.hadoop.hive.metastore.txn.TxnHandler.rollbackDBConn(TxnHandler.java:3787) ~[hive-standalone-metastore-3.1.3000.7.2.3.0-212.jar:3.1.3000.7.2.3.0-212] at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2912) ~[hive-standalone-metastore-3.1.3000.7.2.3.0-212.jar:3.1.3000.7.2.3.0-212] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8440) ~[hive-standalone-metastore-3.1.3000.7.2.3.0-212.jar:3.1.3000.7.2.3.0-212] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_262] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_262] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_262] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_262] at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) ~[hive-standalone-metastore-3.1.3000.7.2.3.0-212.jar:3.1.3000.7.2.3.0-212] at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) ~[hive-standalone-metastore-3.1.3000.7.2.3.0-212.jar:3.1.3000.7.2.3.0-212] at com.sun.proxy.$Proxy58.heartbeat(Unknown Source) ~[?:?] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:3250) ~[hive-standalone-metastore-3.1.3000.7.2.3.0-212.jar:3.1.3000.7.2.3.0-212] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_262] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_262] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_262] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_262] at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) ~[hive-standalone-metastore-3.1.3000.7.2.3.0-212.jar:3.1.3000.7.2.3.0-212] at com.sun.proxy.$Proxy59.heartbeat(Unknown Source) ~[?:?] 
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:665) ~[hive-exec-3.1.3000.7.2.3.0-212.jar:3.1.3000.7.2.3.0-212] at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:1085) ~[hive-exec-3.1.3000.7.2.3.0-212.jar:3.1.3000.7.2.3.0-212] at java.security.AccessController.doPrivileged(Native Method) [?:1.8.0_262] at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_262] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) [hadoop-common-3.1.1.7.2.3.0-212.jar:?] at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:1084) [hive-exec-3.1.3000.7.2.3.0-212.jar:3.1.3000.7.2.3.0-212] at
[jira] [Updated] (HIVE-24320) TestMiniLlapLocal sometimes hangs because of some derby issues
[ https://issues.apache.org/jira/browse/HIVE-24320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-24320: Attachment: 3hr.jstack 3hr.hive.log > TestMiniLlapLocal sometimes hangs because of some derby issues > -- > > Key: HIVE-24320 > URL: https://issues.apache.org/jira/browse/HIVE-24320 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Attachments: 3hr.hive.log, 3hr.jstack > > > code in question is a slightly modified version of branch-3 > opening ticket to make notes about the investigation > {code} > "dcce5fec-2365-4697-8a8f-04a4dfa5d9f5 main" #1 prio=5 os_prio=0 > tid=0x7fd7c000a800 nid=0x1de23 waiting on condition [0x7fd7c4b7] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xc61635f0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitUninterruptibly(AbstractQueuedSynchronizer.java:1981) > at > org.apache.derby.impl.services.cache.CacheEntry.waitUntilIdentityIsSet(Unknown > Source) > at > org.apache.derby.impl.services.cache.ConcurrentCache.getEntry(Unknown Source) > at org.apache.derby.impl.services.cache.ConcurrentCache.find(Unknown > Source) > at > org.apache.derby.impl.store.raw.data.BaseDataFileFactory.openContainer(Unknown > Source) > at > org.apache.derby.impl.store.raw.data.BaseDataFileFactory.openContainer(Unknown > Source) > at org.apache.derby.impl.store.raw.xact.Xact.openContainer(Unknown > Source) > at > org.apache.derby.impl.store.access.conglomerate.OpenConglomerate.init(Unknown > Source) > at org.apache.derby.impl.store.access.heap.Heap.open(Unknown Source) > at > org.apache.derby.impl.store.access.RAMTransaction.openConglomerate(Unknown > Source) > at > 
org.apache.derby.impl.store.access.RAMTransaction.openConglomerate(Unknown > Source) > at > org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getDescriptorViaIndexMinion(Unknown > Source) > at > org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getDescriptorViaIndex(Unknown > Source) > at > org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getSubKeyConstraint(Unknown > Source) > at > org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getConstraintDescriptorViaIndex(Unknown > Source) > at > org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getConstraintDescriptorsScan(Unknown > Source) > at > org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getConstraintDescriptors(Unknown > Source) > - locked <0xc615c9a8> (a > org.apache.derby.iapi.sql.dictionary.ConstraintDescriptorList) > at > org.apache.derby.iapi.sql.dictionary.TableDescriptor.getAllRelevantConstraints(Unknown > Source) > at > org.apache.derby.impl.sql.compile.DMLModStatementNode.getAllRelevantConstraints(Unknown > Source) > at > org.apache.derby.impl.sql.compile.DMLModStatementNode.bindConstraints(Unknown > Source) > at org.apache.derby.impl.sql.compile.DeleteNode.bindStatement(Unknown > Source) > at org.apache.derby.impl.sql.GenericStatement.prepMinion(Unknown > Source) > at org.apache.derby.impl.sql.GenericStatement.prepare(Unknown Source) > at > org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.prepareInternalStatement(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > - locked <0xc4bb5fd0> (a > org.apache.derby.impl.jdbc.EmbedConnection) > at > org.apache.derby.impl.jdbc.EmbedStatement.executeBatchElement(Unknown Source) > at > org.apache.derby.impl.jdbc.EmbedStatement.executeLargeBatch(Unknown Source) > - locked <0xc4bb5fd0> (a > org.apache.derby.impl.jdbc.EmbedConnection) > at org.apache.derby.impl.jdbc.EmbedStatement.executeBatch(Unknown > Source) > at > com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:125) > 
at > com.zaxxer.hikari.pool.HikariProxyStatement.executeBatch(HikariProxyStatement.java) > at > org.apache.hadoop.hive.metastore.txn.TxnDbUtil.executeQueriesInBatch(TxnDbUtil.java:658) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.updateCommitIdAndCleanUpMetadata(TxnHandler.java:1338) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.commitTxn(TxnHandler.java:1236) > at >
[jira] [Assigned] (HIVE-24320) TestMiniLlapLocal sometimes hangs because of some derby issues
[ https://issues.apache.org/jira/browse/HIVE-24320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich reassigned HIVE-24320: --- > TestMiniLlapLocal sometimes hangs because of some derby issues > -- > > Key: HIVE-24320 > URL: https://issues.apache.org/jira/browse/HIVE-24320 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > > code in question is a slightly modified version of branch-3 > opening ticket to make notes about the investigation > {code} > "dcce5fec-2365-4697-8a8f-04a4dfa5d9f5 main" #1 prio=5 os_prio=0 > tid=0x7fd7c000a800 nid=0x1de23 waiting on condition [0x7fd7c4b7] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xc61635f0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitUninterruptibly(AbstractQueuedSynchronizer.java:1981) > at > org.apache.derby.impl.services.cache.CacheEntry.waitUntilIdentityIsSet(Unknown > Source) > at > org.apache.derby.impl.services.cache.ConcurrentCache.getEntry(Unknown Source) > at org.apache.derby.impl.services.cache.ConcurrentCache.find(Unknown > Source) > at > org.apache.derby.impl.store.raw.data.BaseDataFileFactory.openContainer(Unknown > Source) > at > org.apache.derby.impl.store.raw.data.BaseDataFileFactory.openContainer(Unknown > Source) > at org.apache.derby.impl.store.raw.xact.Xact.openContainer(Unknown > Source) > at > org.apache.derby.impl.store.access.conglomerate.OpenConglomerate.init(Unknown > Source) > at org.apache.derby.impl.store.access.heap.Heap.open(Unknown Source) > at > org.apache.derby.impl.store.access.RAMTransaction.openConglomerate(Unknown > Source) > at > org.apache.derby.impl.store.access.RAMTransaction.openConglomerate(Unknown > Source) > at > 
org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getDescriptorViaIndexMinion(Unknown > Source) > at > org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getDescriptorViaIndex(Unknown > Source) > at > org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getSubKeyConstraint(Unknown > Source) > at > org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getConstraintDescriptorViaIndex(Unknown > Source) > at > org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getConstraintDescriptorsScan(Unknown > Source) > at > org.apache.derby.impl.sql.catalog.DataDictionaryImpl.getConstraintDescriptors(Unknown > Source) > - locked <0xc615c9a8> (a > org.apache.derby.iapi.sql.dictionary.ConstraintDescriptorList) > at > org.apache.derby.iapi.sql.dictionary.TableDescriptor.getAllRelevantConstraints(Unknown > Source) > at > org.apache.derby.impl.sql.compile.DMLModStatementNode.getAllRelevantConstraints(Unknown > Source) > at > org.apache.derby.impl.sql.compile.DMLModStatementNode.bindConstraints(Unknown > Source) > at org.apache.derby.impl.sql.compile.DeleteNode.bindStatement(Unknown > Source) > at org.apache.derby.impl.sql.GenericStatement.prepMinion(Unknown > Source) > at org.apache.derby.impl.sql.GenericStatement.prepare(Unknown Source) > at > org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.prepareInternalStatement(Unknown > Source) > at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) > - locked <0xc4bb5fd0> (a > org.apache.derby.impl.jdbc.EmbedConnection) > at > org.apache.derby.impl.jdbc.EmbedStatement.executeBatchElement(Unknown Source) > at > org.apache.derby.impl.jdbc.EmbedStatement.executeLargeBatch(Unknown Source) > - locked <0xc4bb5fd0> (a > org.apache.derby.impl.jdbc.EmbedConnection) > at org.apache.derby.impl.jdbc.EmbedStatement.executeBatch(Unknown > Source) > at > com.zaxxer.hikari.pool.ProxyStatement.executeBatch(ProxyStatement.java:125) > at > com.zaxxer.hikari.pool.HikariProxyStatement.executeBatch(HikariProxyStatement.java) > at 
> org.apache.hadoop.hive.metastore.txn.TxnDbUtil.executeQueriesInBatch(TxnDbUtil.java:658) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.updateCommitIdAndCleanUpMetadata(TxnHandler.java:1338) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.commitTxn(TxnHandler.java:1236) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.commit_txn(HiveMetaStore.java:8315) > at
[jira] [Updated] (HIVE-24318) When GlobalLimit is efficient, query will run twice with "Retry query with a different approach..."
[ https://issues.apache.org/jira/browse/HIVE-24318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] libo updated HIVE-24318: Attachment: HIVE-24318.patch Assignee: libo Status: Patch Available (was: Open) > When GlobalLimit is efficient, query will run twice with "Retry query with a > different approach..." > --- > > Key: HIVE-24318 > URL: https://issues.apache.org/jira/browse/HIVE-24318 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.0.1 > Environment: Hadoop 2.6.0 > Hive-2.0.1 >Reporter: libo >Assignee: libo >Priority: Minor > Attachments: HIVE-24318.patch > > > hive.limit.optimize.enable=true > hive.limit.row.max.size=1000 > hive.limit.optimize.fetch.max=1000 > hive.fetch.task.conversion.threshold=256 > hive.fetch.task.conversion=more > > *sql eg:* > select db_name,concat(tb_name,'test') from (select * from test1.t3 where > dt='0909' limit 10)t1; > (only partitioned table) > *console information:* > Retry query with a different approach... 
> > *exception stack:* > org.apache.hadoop.hive.ql.CommandNeedRetryException > at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:147) > at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2022) > at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:317) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:232) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:475) > at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:855) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:794) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:721) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at org.apache.hadoop.util.RunJar.run(RunJar.java:323) > at org.apache.hadoop.util.RunJar.main(RunJar.java:236) -- This message was sent by Atlassian Jira (v8.3.4#803005)
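The "Retry query with a different approach..." behaviour boils down to a simple control flow: when the GlobalLimit optimization has the fetch task read only a pruned subset of the input, and that subset cannot satisfy the requested limit, the query is re-run without the optimization. The sketch below is a hugely simplified Python illustration of that retry pattern, not Hive's actual FetchTask/Driver code.

```python
def fetch_with_global_limit(rows, limit, sample_size):
    """Try a cheap pass over a pruned sample first; fall back to a full
    pass (the 'retry with a different approach') if the sample yields
    fewer rows than the requested limit."""
    sample = rows[:sample_size]          # GlobalLimit: read only a subset
    result = sample[:limit]
    if len(result) < limit and len(rows) > sample_size:
        # Sample was insufficient: retry over the full input.
        result = rows[:limit]
    return result


rows = list(range(100))
# Sample large enough: one pass suffices.
assert fetch_with_global_limit(rows, limit=10, sample_size=50) == rows[:10]
# Sample too small: the retry path still returns the correct result.
assert fetch_with_global_limit(rows, limit=10, sample_size=5) == rows[:10]
```

The bug report is about the retry firing even when the first pass *did* produce enough rows, so the query needlessly runs twice.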
[jira] [Commented] (HIVE-24307) Beeline with property-file and -e parameter is failing
[ https://issues.apache.org/jira/browse/HIVE-24307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222093#comment-17222093 ] Aasha Medhi commented on HIVE-24307: +1 > Beeline with property-file and -e parameter is failing > -- > > Key: HIVE-24307 > URL: https://issues.apache.org/jira/browse/HIVE-24307 > Project: Hive > Issue Type: Bug >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Beeline query with property file specified with -e parameter fails with : > {noformat} > Cannot run commands specified using -e. No current connection > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24316) Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
[ https://issues.apache.org/jira/browse/HIVE-24316?focusedWorklogId=505658=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505658 ] ASF GitHub Bot logged work on HIVE-24316: - Author: ASF GitHub Bot Created on: 28/Oct/20 10:00 Start Date: 28/Oct/20 10:00 Worklog Time Spent: 10m Work Description: pgaref commented on pull request #1616: URL: https://github.com/apache/hive/pull/1616#issuecomment-717826458 Thanks for the patch @dongjoon-hyun -- can you please reopen the PR as I dont see the pre-commit test results at all (I guess they were never triggered) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505658) Time Spent: 20m (was: 10m) > Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1 > - > > Key: HIVE-24316 > URL: https://issues.apache.org/jira/browse/HIVE-24316 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.1.3 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > This will bring eleven bug fixes. > * ORC 1.5.7: [https://issues.apache.org/jira/projects/ORC/versions/12345702] > * ORC 1.5.8: [https://issues.apache.org/jira/projects/ORC/versions/12346462] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24316) Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
[ https://issues.apache.org/jira/browse/HIVE-24316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-24316: - Assignee: Dongjoon Hyun > Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1 > - > > Key: HIVE-24316 > URL: https://issues.apache.org/jira/browse/HIVE-24316 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.1.3 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24316) Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1
[ https://issues.apache.org/jira/browse/HIVE-24316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-24316: -- Description: This will bring eleven bug fixes. * ORC 1.5.7: [https://issues.apache.org/jira/projects/ORC/versions/12345702] * ORC 1.5.8: [https://issues.apache.org/jira/projects/ORC/versions/12346462] > Upgrade ORC from 1.5.6 to 1.5.8 in branch-3.1 > - > > Key: HIVE-24316 > URL: https://issues.apache.org/jira/browse/HIVE-24316 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 3.1.3 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > This will bring eleven bug fixes. > * ORC 1.5.7: [https://issues.apache.org/jira/projects/ORC/versions/12345702] > * ORC 1.5.8: [https://issues.apache.org/jira/projects/ORC/versions/12346462] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24302) Cleaner shouldn't run if it can't remove obsolete files
[ https://issues.apache.org/jira/browse/HIVE-24302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222072#comment-17222072 ] Karen Coppage commented on HIVE-24302: -- Duplicates HIVE-24291 > Cleaner shouldn't run if it can't remove obsolete files > --- > > Key: HIVE-24302 > URL: https://issues.apache.org/jira/browse/HIVE-24302 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Example: > # open txn 5, leave it open (maybe it's a long-running compaction) > # insert into table t in txns 6, 7 with writeids 1, 2 > # compactor.Worker runs on table t and compacts writeids 1, 2 > # compactor.Cleaner picks up the compaction queue entry, but doesn't delete > any files because the min global open txnid is 5, which cannot see writeIds > 1, 2. > # Cleaner marks the compactor queue entry as cleaned and removes the entry > from the queue. > delta_1 and delta_2 will remain in the file system until another compaction > is run on table t. > Step 5 should not happen, we should skip calling markCleaned() and leave it > in the queue in "ready to clean" state. MarkCleaned() should be called only > after txn 5 is closed and, following that, the cleaner runs successfully. > This will potentially slow down the cleaner, but on the other hand it won't > silently "fail" i.e. not do its job. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24291) Compaction Cleaner prematurely cleans up deltas
[ https://issues.apache.org/jira/browse/HIVE-24291?focusedWorklogId=505651=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505651 ] ASF GitHub Bot logged work on HIVE-24291: - Author: ASF GitHub Bot Created on: 28/Oct/20 09:44 Start Date: 28/Oct/20 09:44 Worklog Time Spent: 10m Work Description: pvargacl commented on pull request #1592: URL: https://github.com/apache/hive/pull/1592#issuecomment-717817745 @deniskuzZ Can I ask you for a review? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505651) Time Spent: 0.5h (was: 20m) > Compaction Cleaner prematurely cleans up deltas > --- > > Key: HIVE-24291 > URL: https://issues.apache.org/jira/browse/HIVE-24291 > Project: Hive > Issue Type: Bug >Reporter: Peter Varga >Assignee: Peter Varga >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Since HIVE-23107 the cleaner can clean up deltas that are still used by > running queries. > Example: > * TxnIds 1-5 write to a partition, all commit > * Compactor starts with txnId=6 > * Long running query starts with txnId=7, it sees txnId=6 as open in its > snapshot > * Compaction commits > * Cleaner runs > Previously the min_history_level table would have prevented the Cleaner from > deleting deltas 1-5 while txnId=7 is open, but now they will be deleted and the > long running query may fail if it tries to access the files. > A solution could be to not run the cleaner while any txn is open that was > opened before the compaction was committed (CQ_NEXT_TXN_ID) -- This message was sent by Atlassian Jira (v8.3.4#803005)
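The proposed fix direction — don't let the cleaner remove pre-compaction deltas while any transaction opened before the compaction committed is still open — reduces to a small predicate over the set of open transaction ids. A hedged Python sketch (names are illustrative; this is not Hive's metastore schema or code):

```python
def can_clean(compaction_next_txn_id, open_txn_ids):
    """The cleaner may remove deltas compacted away only once every
    transaction that was opened before the compaction committed
    (i.e. with id below the compaction's CQ_NEXT_TXN_ID analogue)
    has ended. Otherwise a reader snapshot may still need them."""
    min_open = min(open_txn_ids, default=None)
    return min_open is None or min_open >= compaction_next_txn_id


# Timeline from the description: txns 1-5 write, compactor commits as
# txn 6 (next txn id at commit is 8 in this toy numbering), and the
# long-running query txn 7 still has txn 6 open in its snapshot.
assert not can_clean(compaction_next_txn_id=8, open_txn_ids={7})
# Once the long-running query (txn 7) ends, cleaning becomes safe.
assert can_clean(compaction_next_txn_id=8, open_txn_ids=set())
# Transactions opened after the compaction commit don't block cleaning.
assert can_clean(compaction_next_txn_id=8, open_txn_ids={9, 12})
```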
[jira] [Commented] (HIVE-24108) AddToClassPathAction should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222066#comment-17222066 ] László Bodor commented on HIVE-24108: - thanks [~harishjp]! could you please review [~rajesh.balamohan], [~ashutoshc]? this is probably the last tez 0.10 to hive blocker > AddToClassPathAction should use TezClassLoader > -- > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: HIVE-24108.01.patch, HIVE-24108.02.patch, > hive_log_llap.log > > > TEZ-4228 fixes an issue from tez side, which is about to use TezClassLoader > instead of the system classloader. However, there are some codepaths, e.g. in > [^hive_log_llap.log] which shows that the system class loader is used. As > thread context classloaders are inherited, the easier solution is to > early-initialize TezClassLoader in LlapDaemon, and let all threads use that > as context class loader, so this solution is more like TEZ-4223 for llap > daemons. 
> {code} > 2020-09-02T00:18:20,242 ERROR [TezTR-93696_1_1_1_0_0] tez.TezProcessor: > java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75) > at > org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62) > at > org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38) > at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) > at > org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:332) > at > org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:427) > at > 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:288) > ... 16 more > Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:100) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializer(TableDesc.java:95) > at > org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313) > ... 18 more > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hive.serde2.TestSerDe > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76) > ... 21 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
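The fix direction described above for HIVE-24108, early-initializing the Tez class loader and letting threads inherit it as their context class loader, rests on the fact that a thread copies its parent's context class loader at creation time. A self-contained sketch (the `URLClassLoader` here is a stand-in for Tez's actual `TezClassLoader`):

```java
import java.net.URL;
import java.net.URLClassLoader;

public class EarlyClassLoaderInit {
    static boolean childInheritsContextLoader() {
        // Stand-in for TezClassLoader; in LLAP this would be installed very
        // early in daemon startup, before any worker threads are spawned.
        ClassLoader tezLoader = new URLClassLoader(new URL[0],
                EarlyClassLoaderInit.class.getClassLoader());
        Thread.currentThread().setContextClassLoader(tezLoader);

        // Threads created afterwards inherit the context class loader, so
        // Class.forName via the context loader can see session-added jars
        // instead of falling back to the system/app class loader.
        final ClassLoader[] seen = new ClassLoader[1];
        Thread worker = new Thread(
                () -> seen[0] = Thread.currentThread().getContextClassLoader());
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return seen[0] == tezLoader;
    }

    public static void main(String[] args) {
        System.out.println(childInheritsContextLoader()); // prints "true"
    }
}
```

This is why the `ClassNotFoundException` in the log above bottoms out in `sun.misc.Launcher$AppClassLoader`: the failing code path never saw the custom loader.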
[jira] [Resolved] (HIVE-24302) Cleaner shouldn't run if it can't remove obsolete files
[ https://issues.apache.org/jira/browse/HIVE-24302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage resolved HIVE-24302. -- Resolution: Duplicate > Cleaner shouldn't run if it can't remove obsolete files > --- > > Key: HIVE-24302 > URL: https://issues.apache.org/jira/browse/HIVE-24302 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Example: > # open txn 5, leave it open (maybe it's a long-running compaction) > # insert into table t in txns 6, 7 with writeids 1, 2 > # compactor.Worker runs on table t and compacts writeids 1, 2 > # compactor.Cleaner picks up the compaction queue entry, but doesn't delete > any files because the min global open txnid is 5, which cannot see writeIds > 1, 2. > # Cleaner marks the compactor queue entry as cleaned and removes the entry > from the queue. > delta_1 and delta_2 will remain in the file system until another compaction > is run on table t. > Step 5 should not happen, we should skip calling markCleaned() and leave it > in the queue in "ready to clean" state. MarkCleaned() should be called only > after txn 5 is closed and, following that, the cleaner runs successfully. > This will potentially slow down the cleaner, but on the other hand it won't > silently "fail" i.e. not do its job. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24108) AddToClassPathAction should use TezClassLoader
[ https://issues.apache.org/jira/browse/HIVE-24108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222055#comment-17222055 ] Harish JP commented on HIVE-24108: -- Thanks [~abstractdog]. LGTM +1, non binding. > AddToClassPathAction should use TezClassLoader > -- > > Key: HIVE-24108 > URL: https://issues.apache.org/jira/browse/HIVE-24108 > Project: Hive > Issue Type: Improvement >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: HIVE-24108.01.patch, HIVE-24108.02.patch, > hive_log_llap.log > > > TEZ-4228 fixes an issue from tez side, which is about to use TezClassLoader > instead of the system classloader. However, there are some codepaths, e.g. in > [^hive_log_llap.log] which shows that the system class loader is used. As > thread context classloaders are inherited, the easier solution is to > early-initialize TezClassLoader in LlapDaemon, and let all threads use that > as context class loader, so this solution is more like TEZ-4223 for llap > daemons. 
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24173) notification cleanup interval value changes depending upon replication enabled or not.
[ https://issues.apache.org/jira/browse/HIVE-24173?focusedWorklogId=505634=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505634 ] ASF GitHub Bot logged work on HIVE-24173: - Author: ASF GitHub Bot Created on: 28/Oct/20 08:43 Start Date: 28/Oct/20 08:43 Worklog Time Spent: 10m Work Description: aasha commented on a change in pull request #1593: URL: https://github.com/apache/hive/pull/1593#discussion_r513266229 ## File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java ## @@ -522,6 +522,9 @@ private static void populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal REPLCMINTERVAL("hive.repl.cm.interval","3600s", new TimeValidator(TimeUnit.SECONDS), "Inteval for cmroot cleanup thread."), +REPL_EVENT_DB_LISTENER_TTL("hive.repl.event.db.listener.timetolive", 10, new TimeValidator(TimeUnit.DAYS), Review comment: Set the hive.repl.cm.retain to 7 days in Metastore conf also This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505634) Time Spent: 40m (was: 0.5h) > notification cleanup interval value changes depending upon replication > enabled or not. > -- > > Key: HIVE-24173 > URL: https://issues.apache.org/jira/browse/HIVE-24173 > Project: Hive > Issue Type: Improvement >Reporter: Arko Sharma >Assignee: Arko Sharma >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Currently we use hive.metastore.event.db.listener.timetolive to determine how > long the events are stored in rdbms backing hms. We should have another > configuration for the same purpose in context of replication so that we have > longer time configured for that otherwise we can default to a 1 day. > hive.repl.cm.enabled can be used to identify if replication is enabled or > not. 
if enabled, use the new configuration property to determine the TTL for > events in the RDBMS; otherwise use hive.metastore.event.db.listener.timetolive for the TTL. -- This message was sent by Atlassian Jira (v8.3.4#803005)
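The fallback described for HIVE-24173 can be sketched as follows. The property names come from the discussion above (the patch adds `hive.repl.event.db.listener.timetolive` with a 10-day default); the selection helper itself and the use of plain `Properties` are illustrative simplifications of `HiveConf`:

```java
public class EventTtlChooser {
    static String effectiveEventTtl(java.util.Properties conf) {
        boolean replEnabled = Boolean.parseBoolean(
                conf.getProperty("hive.repl.cm.enabled", "false"));
        if (replEnabled) {
            // Replication needs events retained longer so the repl dump can
            // still consume them; 10 days is the default from the patch.
            return conf.getProperty("hive.repl.event.db.listener.timetolive", "10d");
        }
        // Without replication, fall back to the existing metastore TTL
        // (defaulting to 1 day, per the issue description).
        return conf.getProperty("hive.metastore.event.db.listener.timetolive", "1d");
    }

    public static void main(String[] args) {
        java.util.Properties conf = new java.util.Properties();
        System.out.println(effectiveEventTtl(conf));   // "1d"
        conf.setProperty("hive.repl.cm.enabled", "true");
        System.out.println(effectiveEventTtl(conf));   // "10d"
    }
}
```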
[jira] [Work logged] (HIVE-24294) TezSessionPool sessions can throw AssertionError
[ https://issues.apache.org/jira/browse/HIVE-24294?focusedWorklogId=505604=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505604 ] ASF GitHub Bot logged work on HIVE-24294: - Author: ASF GitHub Bot Created on: 28/Oct/20 07:44 Start Date: 28/Oct/20 07:44 Worklog Time Spent: 10m Work Description: lcspinter commented on pull request #1596: URL: https://github.com/apache/hive/pull/1596#issuecomment-717759487 @nareshpr Thanks for the patch. Merged it to master. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505604) Time Spent: 40m (was: 0.5h) > TezSessionPool sessions can throw AssertionError > > > Key: HIVE-24294 > URL: https://issues.apache.org/jira/browse/HIVE-24294 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Whenever default TezSessionPool sessions are reopened for some reason, we are > setting dagResources to null before close & setting it back in openWhenever > default TezSessionPool sessions are reopened for some reason, we are setting > dagResources to null before close & setting it back in open > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L498-L503 > If there is an exception in sessionState.close(), we are not restoring the > dagResource but moving the session back to TezSessionPool.eg., exception > trace when sessionState.close() failed > {code:java} > 2020-10-15T09:20:28,749 INFO [HiveServer2-Background-Pool: Thread-25451]: > client.TezClient (:()) - Failed to shutdown Tez Session via proxy > org.apache.tez.dag.api.SessionNotRunning: Application not running, > 
applicationId=application_1602093123456_12345, yarnApplicationState=FINISHED, > finalApplicationStatus=SUCCEEDED, > trackingUrl=http://localhost:8088/proxy/application_1602093123456_12345/, > diagnostics=Session timed out, lastDAGCompletionTime=1602997683786 ms, > sessionTimeoutInterval=60 ms > Session stats:submittedDAGs=2, successfulDAGs=2, failedDAGs=0, killedDAGs=0 > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1060) > at org.apache.tez.client.TezClient.stop(TezClient.java:743) > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.closeClient(TezSessionState.java:789) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:756) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.close(TezSessionPoolSession.java:111) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopenInternal(TezSessionPoolManager.java:496) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopen(TezSessionPoolManager.java:487) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.reopen(TezSessionPoolSession.java:228) > > at > org.apache.hadoop.hive.ql.exec.tez.TezTask.getNewTezSessionOnError(TezTask.java:531) > > at > org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:546) > at > org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:221){code} > Because of this, all new queries using this corrupted sessions are failing > with below exception > {code:java} > Caused by: java.lang.AssertionError: Ensure called on an unitialized (or > closed) session 41774265-b7da-4d58-84a8-1bedfd597aecCaused by: > java.lang.AssertionError: Ensure called on an unitialized (or closed) session > 41774265-b7da-4d58-84a8-1bedfd597aec at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.ensureLocalResources(TezSessionState.java:685){code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
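The reopen hazard described in HIVE-24294 above, dagResources detached before close() and only restored on the happy path, is the kind of bug a finally block prevents. A minimal sketch with illustrative names (not Hive's actual `TezSessionPoolManager` fields or API):

```java
public class SessionReopenSketch {
    static class Session {
        Object dagResources = new Object();
        boolean failOnClose;
        void close() { if (failOnClose) throw new RuntimeException("close failed"); }
        void open(Object resources) { this.dagResources = resources; }
    }

    static void reopen(Session s) {
        Object saved = s.dagResources;
        s.dagResources = null;  // detached for the close, as in reopenInternal
        try {
            s.close();
        } finally {
            // Restore even when close() throws, so a session returned to the
            // pool never has null dagResources (the AssertionError trigger
            // in ensureLocalResources).
            s.open(saved);
        }
    }

    public static void main(String[] args) {
        Session s = new Session();
        s.failOnClose = true;
        try {
            reopen(s);
        } catch (RuntimeException expected) {
            // close() failed, but the session state was still restored
        }
        System.out.println(s.dagResources != null);  // true
    }
}
```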
[jira] [Work logged] (HIVE-24294) TezSessionPool sessions can throw AssertionError
[ https://issues.apache.org/jira/browse/HIVE-24294?focusedWorklogId=505603=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505603 ] ASF GitHub Bot logged work on HIVE-24294: - Author: ASF GitHub Bot Created on: 28/Oct/20 07:44 Start Date: 28/Oct/20 07:44 Worklog Time Spent: 10m Work Description: lcspinter merged pull request #1596: URL: https://github.com/apache/hive/pull/1596 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505603) Time Spent: 0.5h (was: 20m) > TezSessionPool sessions can throw AssertionError > > > Key: HIVE-24294 > URL: https://issues.apache.org/jira/browse/HIVE-24294 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Whenever default TezSessionPool sessions are reopened for some reason, we are > setting dagResources to null before close & setting it back in openWhenever > default TezSessionPool sessions are reopened for some reason, we are setting > dagResources to null before close & setting it back in open > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L498-L503 > If there is an exception in sessionState.close(), we are not restoring the > dagResource but moving the session back to TezSessionPool.eg., exception > trace when sessionState.close() failed > {code:java} > 2020-10-15T09:20:28,749 INFO [HiveServer2-Background-Pool: Thread-25451]: > client.TezClient (:()) - Failed to shutdown Tez Session via proxy > org.apache.tez.dag.api.SessionNotRunning: Application not running, > applicationId=application_1602093123456_12345, yarnApplicationState=FINISHED, > 
finalApplicationStatus=SUCCEEDED, > trackingUrl=http://localhost:8088/proxy/application_1602093123456_12345/, > diagnostics=Session timed out, lastDAGCompletionTime=1602997683786 ms, > sessionTimeoutInterval=60 ms > Session stats:submittedDAGs=2, successfulDAGs=2, failedDAGs=0, killedDAGs=0 > at > org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910) > at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1060) > at org.apache.tez.client.TezClient.stop(TezClient.java:743) > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.closeClient(TezSessionState.java:789) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:756) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.close(TezSessionPoolSession.java:111) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopenInternal(TezSessionPoolManager.java:496) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopen(TezSessionPoolManager.java:487) > > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.reopen(TezSessionPoolSession.java:228) > > at > org.apache.hadoop.hive.ql.exec.tez.TezTask.getNewTezSessionOnError(TezTask.java:531) > > at > org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:546) > at > org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:221){code} > Because of this, all new queries using this corrupted sessions are failing > with below exception > {code:java} > Caused by: java.lang.AssertionError: Ensure called on an unitialized (or > closed) session 41774265-b7da-4d58-84a8-1bedfd597aecCaused by: > java.lang.AssertionError: Ensure called on an unitialized (or closed) session > 41774265-b7da-4d58-84a8-1bedfd597aec at > org.apache.hadoop.hive.ql.exec.tez.TezSessionState.ensureLocalResources(TezSessionState.java:685){code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24305) avro decimal schema is not properly populating scale/precision if value is enclosed in quote
[ https://issues.apache.org/jira/browse/HIVE-24305?focusedWorklogId=505602=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505602 ] ASF GitHub Bot logged work on HIVE-24305: - Author: ASF GitHub Bot Created on: 28/Oct/20 07:39 Start Date: 28/Oct/20 07:39 Worklog Time Spent: 10m Work Description: lcspinter commented on a change in pull request #1601: URL: https://github.com/apache/hive/pull/1601#discussion_r513232453 ## File path: ql/src/test/queries/clientnegative/avro_decimal.q ## @@ -12,6 +12,6 @@ OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' TBLPROPERTIES ( 'numFiles'='1', - 'avro.schema.literal'='{\"namespace\":\"com.howdy\",\"name\":\"some_schema\",\"type\":\"record\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"value\",\"type\":{\"type\":\"bytes\",\"logicalType\":\"decimal\",\"precision\":"5",\"scale\":"2"}}]}' + 'avro.schema.literal'='{\"namespace\":\"com.howdy\",\"name\":\"some_schema\",\"type\":\"record\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"value\",\"type\":{\"type\":\"bytes\",\"logicalType\":\"decimal\",\"precision\":"a",\"scale\":"b"}}]}' Review comment: What happens if the precision or scale numbers are negative? Could you please add some q tests to cover that scenario as well? ## File path: serde/src/java/org/apache/hadoop/hive/serde2/avro/SchemaToTypeInfo.java ## @@ -186,6 +188,20 @@ public static TypeInfo generateTypeInfo(Schema schema, return typeInfoCache.retrieve(schema, seenSchemas); } + private static int getIntValue(JsonNode jsonNode) { +int value = 0; +if (jsonNode instanceof TextNode) { + try { Review comment: Nit: Instead of using a try-catch block, you could use StringUtils.isNumeric() to determine if a string is a positive decimal number. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505602) Time Spent: 20m (was: 10m) > avro decimal schema is not properly populating scale/precision if value is > enclosed in quote > > > Key: HIVE-24305 > URL: https://issues.apache.org/jira/browse/HIVE-24305 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > {code:java} > CREATE TABLE test_quoted_scale_precision STORED AS AVRO TBLPROPERTIES > ('avro.schema.literal'='{"type":"record","name":"DecimalTest","namespace":"com.example.test","fields":[{"name":"Decimal24_6","type":["null",{"type":"bytes","logicalType":"decimal","precision":24,"scale":"6"}]}]}'); > > desc test_quoted_scale_precision; > // current output > decimal24_6 decimal(24,0) > // expected output > decimal24_6 decimal(24,6){code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
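The HIVE-24305 bug above comes down to precision/scale appearing in the Avro schema literal either as JSON numbers (`24`) or as quoted strings (`"6"`), with the quoted form being dropped (yielding scale 0). A tolerant reader can be sketched without Jackson for self-containment; the helper name and fallback behavior are illustrative, not the patch's actual `SchemaToTypeInfo.getIntValue`:

```java
public class AvroScaleSketch {
    /** Parses a JSON scalar that may or may not be wrapped in quotes. */
    static int intFromJsonScalar(String raw, int defaultValue) {
        String s = raw.trim();
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            s = s.substring(1, s.length() - 1);  // strip quotes: "6" -> 6
        }
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            // Non-numeric literal (the negative-test "a"/"b" case from the
            // review): fall back rather than fail.
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        System.out.println(intFromJsonScalar("24", 0));     // 24
        System.out.println(intFromJsonScalar("\"6\"", 0));  // 6  (the bug case)
        System.out.println(intFromJsonScalar("\"b\"", 0));  // 0  (invalid literal)
    }
}
```

Note that `Integer.parseInt` still accepts negatives, which is exactly the gap the reviewer's question about negative precision/scale probes; a real implementation would also range-check the parsed value.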
[jira] [Assigned] (HIVE-24317) External Table is not replicated for Cloud store (e.g. Microsoft ADLS Gen2)
[ https://issues.apache.org/jira/browse/HIVE-24317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikhil Gupta reassigned HIVE-24317: --- > External Table is not replicated for Cloud store (e.g. Microsoft ADLS Gen2) > --- > > Key: HIVE-24317 > URL: https://issues.apache.org/jira/browse/HIVE-24317 > Project: Hive > Issue Type: Bug > Components: repl >Affects Versions: 4.0.0 >Reporter: Nikhil Gupta >Assignee: Nikhil Gupta >Priority: Minor > > External Table is not replicated properly because of distcp options. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24306) Launch single copy task for single batch of partitions in repl load for managed table
[ https://issues.apache.org/jira/browse/HIVE-24306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24306: --- Description: For data dumped in staging location, we will run a single distcp at the table level for all partitions as the data is already present in the staging location. For the _files case where data is on the source cluster and staging just has the file list, distcp is executed at each file level. This is to take care of the cm case where we need the full path and encoded path (for cm). If the table is dropped, table level distcp will fail. This patch takes care of single copy for staging data. However to run a single distcp at the table level, file listing in distcp might lead to OOM if the number of files is too high. So it needs to be fixed at the distcp level before committing this patch. > Launch single copy task for single batch of partitions in repl load for > managed table > - > > Key: HIVE-24306 > URL: https://issues.apache.org/jira/browse/HIVE-24306 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24306.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > > For data dumped in staging location, we will run a single distcp at the table > level for all partitions as the data is already present in the staging > location. > For the _files case where data is on the source cluster and staging just has the > file list, distcp is executed at each file level. This is to take care of the > cm case where we need the full path and encoded path (for cm). If the table is > dropped, table level distcp will fail. > This patch takes care of single copy for staging data. > However to run a single distcp at the table level, file listing in distcp might > lead to OOM if the number of files is too high. So it needs to be fixed at > the distcp level before committing this patch. 
-- This message was sent by Atlassian Jira (v8.3.4#803005)
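The copy-planning split described in HIVE-24306 above, one table-level distcp when the data already sits under staging versus per-location copies for the _files case, can be sketched as below. The `planCopies` helper and its signature are hypothetical stand-ins for Hive's repl-load internals:

```java
import java.util.ArrayList;
import java.util.List;

public class ReplCopyBatcher {
    static List<String> planCopies(String tableStagingDir,
                                   List<String> partitionDirs,
                                   boolean dataInStaging) {
        List<String> copySources = new ArrayList<>();
        if (dataInStaging) {
            // Data already dumped under staging: a single table-level copy
            // covers every partition in the batch.
            copySources.add(tableStagingDir);
        } else {
            // _files case: staging only holds the file list, and source-cluster
            // paths (with CM-encoded fallbacks) still need per-location copies.
            copySources.addAll(partitionDirs);
        }
        return copySources;
    }

    public static void main(String[] args) {
        List<String> parts = List.of("/stage/t/p=1", "/stage/t/p=2");
        System.out.println(planCopies("/stage/t", parts, true));          // [/stage/t]
        System.out.println(planCopies("/stage/t", parts, false).size());  // 2
    }
}
```

The trade-off flagged in the description shows up here too: the single table-level copy pushes the per-file enumeration down into distcp, which is where the OOM risk for very large file counts has to be addressed.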
[jira] [Updated] (HIVE-24306) Launch single copy task for single batch of partitions in repl load for managed table
[ https://issues.apache.org/jira/browse/HIVE-24306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aasha Medhi updated HIVE-24306: --- Attachment: HIVE-24306.01.patch Status: Patch Available (was: In Progress) > Launch single copy task for single batch of partitions in repl load for > managed table > - > > Key: HIVE-24306 > URL: https://issues.apache.org/jira/browse/HIVE-24306 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24306.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-24306) Launch single copy task for single batch of partitions in repl load for managed table
[ https://issues.apache.org/jira/browse/HIVE-24306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24306 started by Aasha Medhi. -- > Launch single copy task for single batch of partitions in repl load for > managed table > - > > Key: HIVE-24306 > URL: https://issues.apache.org/jira/browse/HIVE-24306 > Project: Hive > Issue Type: Task >Reporter: Aasha Medhi >Assignee: Aasha Medhi >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24306.01.patch > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24302) Cleaner shouldn't run if it can't remove obsolete files
[ https://issues.apache.org/jira/browse/HIVE-24302?focusedWorklogId=505581=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505581 ] ASF GitHub Bot logged work on HIVE-24302: - Author: ASF GitHub Bot Created on: 28/Oct/20 06:53 Start Date: 28/Oct/20 06:53 Worklog Time Spent: 10m Work Description: pvargacl commented on pull request #1612: URL: https://github.com/apache/hive/pull/1612#issuecomment-717738838 This is solved here: https://github.com/apache/hive/pull/1592 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 505581) Time Spent: 0.5h (was: 20m) > Cleaner shouldn't run if it can't remove obsolete files > --- > > Key: HIVE-24302 > URL: https://issues.apache.org/jira/browse/HIVE-24302 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Example: > # open txn 5, leave it open (maybe it's a long-running compaction) > # insert into table t in txns 6, 7 with writeids 1, 2 > # compactor.Worker runs on table t and compacts writeids 1, 2 > # compactor.Cleaner picks up the compaction queue entry, but doesn't delete > any files because the min global open txnid is 5, which cannot see writeIds > 1, 2. > # Cleaner marks the compactor queue entry as cleaned and removes the entry > from the queue. > delta_1 and delta_2 will remain in the file system until another compaction > is run on table t. > Step 5 should not happen, we should skip calling markCleaned() and leave it > in the queue in "ready to clean" state. MarkCleaned() should be called only > after txn 5 is closed and, following that, the cleaner runs successfully. 
> This will potentially slow down the cleaner, but on the other hand it won't > silently "fail", i.e. not do its job. -- This message was sent by Atlassian Jira (v8.3.4#803005)