[jira] [Updated] (HIVE-17978) TPCDS queries 58 and 83 generate exceptions in the vectorization.
[ https://issues.apache.org/jira/browse/HIVE-17978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-17978:
-------------------------------------------
    Attachment: HIVE-17978.patch

> TPCDS queries 58 and 83 generate exceptions in the vectorization.
> -----------------------------------------------------------------
>
> Key: HIVE-17978
> URL: https://issues.apache.org/jira/browse/HIVE-17978
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Nita Dembla
> Assignee: Jesus Camacho Rodriguez
> Priority: Major
> Attachments: HIVE-17978.patch
>
>
> Failed with the following Exception
> {code}
> ERROR [6c707c4e-2849-4ff2-809d-946581e6b83a HiveServer2-Handler-Pool: Thread-78] ql.Driver: FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: The column KEY._col0 is not in the vectorization context column map {_col0=0}.
> org.apache.hadoop.hive.ql.parse.SemanticException: org.apache.hadoop.hive.ql.metadata.HiveException: The column KEY._col0 is not in the vectorization context column map {_col0=0}.
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:1665)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:1725)
>   at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:1245)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:671)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:616)
>   at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:1902)
>   at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(TezCompiler.java:674)
>   at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:271)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11621)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:296)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:268)
>   at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:169)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:268)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:599)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1463)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1420)
>   at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:201)
>   at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:288)
>   at org.apache.hive.service.cli.operation.Operation.run(Operation.java:249)
>   at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:532)
>   at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:518)
>   at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:311)
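The exception text in the HIVE-17978 report describes a lookup by column name that missed: the name requested was KEY._col0, while the map held only _col0. The following is a minimal sketch of that failure mode, assuming an exact-match lookup in a name-to-index map. The class `ColumnMapLookupSketch` and its method are hypothetical and are not Hive's VectorizationContext; the sketch says nothing about why the plan carried the prefixed name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a name -> column index map; not Hive code.
class ColumnMapLookupSketch {

    // Looks up a column index by exact name and fails loudly on a miss.
    static int columnIndex(Map<String, Integer> columnMap, String name) {
        Integer index = columnMap.get(name);
        if (index == null) {
            throw new IllegalStateException(
                "The column " + name + " is not in the column map " + columnMap);
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, Integer> columnMap = new LinkedHashMap<>();
        columnMap.put("_col0", 0);

        // The unprefixed name resolves.
        System.out.println(columnIndex(columnMap, "_col0"));

        // The prefixed name does not, because the lookup is an exact match.
        try {
            columnIndex(columnMap, "KEY._col0");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this assumption, the two names refer to the same logical column but do not compare equal as strings, so the second lookup fails.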
[jira] [Work started] (HIVE-17978) TPCDS queries 58 and 83 generate exceptions in the vectorization.
[ https://issues.apache.org/jira/browse/HIVE-17978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-17978 started by Jesus Camacho Rodriguez.
------------------------------------------------------
> TPCDS queries 58 and 83 generate exceptions in the vectorization.
> -----------------------------------------------------------------
>
> Key: HIVE-17978
> URL: https://issues.apache.org/jira/browse/HIVE-17978
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Nita Dembla
> Assignee: Jesus Camacho Rodriguez
> Priority: Major
> Attachments: HIVE-17978.patch
[jira] [Updated] (HIVE-17978) TPCDS queries 58 and 83 generate exceptions in the vectorization.
[ https://issues.apache.org/jira/browse/HIVE-17978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-17978:
-------------------------------------------
    Status: Patch Available (was: In Progress)

> TPCDS queries 58 and 83 generate exceptions in the vectorization.
> -----------------------------------------------------------------
>
> Key: HIVE-17978
> URL: https://issues.apache.org/jira/browse/HIVE-17978
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Nita Dembla
> Assignee: Jesus Camacho Rodriguez
> Priority: Major
> Attachments: HIVE-17978.patch
[jira] [Resolved] (HIVE-19419) SharedScanOptimizer may leave unnecessary operators in the plan
[ https://issues.apache.org/jira/browse/HIVE-19419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez resolved HIVE-19419.
--------------------------------------------
    Resolution: Duplicate

> SharedScanOptimizer may leave unnecessary operators in the plan
> ---------------------------------------------------------------
>
> Key: HIVE-19419
> URL: https://issues.apache.org/jira/browse/HIVE-19419
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 3.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Major
>
> Due to the interaction with branches created by semijoin reduction. In turn, this can lead to errors such as:
> {noformat}
> 2018-05-03T21:19:41,277 INFO [8d6a552a-b62f-44a4-bdb4-afc2e810ae56 HiveServer2-Handler-Pool: Thread-139]: physical.Vectorizer (Vectorizer.java:walkStackToFindVectorizationContext(1260)) - walkStackToFindVectorizationContext RS has new vectorization context Context name GBY, level 0, sorted projectionColumnMap {0=_col0}, scratchColumnTypeNames []
> 2018-05-03T21:19:41,278 ERROR [8d6a552a-b62f-44a4-bdb4-afc2e810ae56 HiveServer2-Handler-Pool: Thread-139]: ql.Driver (SessionState.java:printError(1220)) - FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
> org.apache.hadoop.hive.ql.parse.SemanticException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:1285)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:1346)
>   at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:955)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:514)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:485)
>   at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:1495)
>   at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(TezCompiler.java:644)
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Assigned] (HIVE-17978) TPCDS queries 58 and 83 generate exceptions in the vectorization.
[ https://issues.apache.org/jira/browse/HIVE-17978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez reassigned HIVE-17978:
----------------------------------------------
    Assignee: Jesus Camacho Rodriguez

> TPCDS queries 58 and 83 generate exceptions in the vectorization.
> -----------------------------------------------------------------
>
> Key: HIVE-17978
> URL: https://issues.apache.org/jira/browse/HIVE-17978
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Nita Dembla
> Assignee: Jesus Camacho Rodriguez
> Priority: Major
[jira] [Commented] (HIVE-19400) Adjust Hive 1.0 to 2.0 conversion utility to the upgrade
[ https://issues.apache.org/jira/browse/HIVE-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463389#comment-16463389 ]

Hive QA commented on HIVE-19400:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12921715/HIVE-19400.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 14316 tests executed

*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez2] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_4] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_5] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_rebuild_dummy] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_time_window] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_stats] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=105)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=105)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=105)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez-tag] (batchId=105)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] (batchId=96)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=228)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232)
org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate (batchId=231)
org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys (batchId=231)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=235)
org.apache.hive.jdbc.TestRestrictedList.org.apache.hive.jdbc.TestRestrictedList (batchId=241)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveConflictKill (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerVertexRawInputSplitsNoKill (batchId=241)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/10666/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10666/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-10666/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing
[jira] [Assigned] (HIVE-19419) SharedScanOptimizer may leave unnecessary operators in the plan
[ https://issues.apache.org/jira/browse/HIVE-19419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez reassigned HIVE-19419:
----------------------------------------------

> SharedScanOptimizer may leave unnecessary operators in the plan
> ---------------------------------------------------------------
>
> Key: HIVE-19419
> URL: https://issues.apache.org/jira/browse/HIVE-19419
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 3.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Major
[jira] [Work started] (HIVE-19419) SharedScanOptimizer may leave unnecessary operators in the plan
[ https://issues.apache.org/jira/browse/HIVE-19419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-19419 started by Jesus Camacho Rodriguez.
------------------------------------------------------
> SharedScanOptimizer may leave unnecessary operators in the plan
> ---------------------------------------------------------------
>
> Key: HIVE-19419
> URL: https://issues.apache.org/jira/browse/HIVE-19419
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer
> Affects Versions: 3.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Priority: Major
[jira] [Commented] (HIVE-18864) ValidWriteIdList snapshot seems incorrect if obtained after allocating writeId by current transaction.
[ https://issues.apache.org/jira/browse/HIVE-18864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463376#comment-16463376 ] ASF GitHub Bot commented on HIVE-18864: --- Github user sankarh closed the pull request at: https://github.com/apache/hive/pull/316 > ValidWriteIdList snapshot seems incorrect if obtained after allocating > writeId by current transaction. > -- > > Key: HIVE-18864 > URL: https://issues.apache.org/jira/browse/HIVE-18864 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: ACID, pull-request-available > Fix For: 3.0.0 > > Attachments: HIVE-18864.01.patch, HIVE-18864.02.patch > > > For multi-statement txns, it is possible that a write on a table happens after > a read. Consider the scenario below. > # Committed txn=9 writes on table T1 with writeId=5. > # Open txn=10. ValidTxnList(open:null, txn_HWM=10). > # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5). > # Open txn=11, writes on table T1 with writeId=6. > # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5). > # Write table T1 from txn=10 with writeId=7. > # Read table T1 from txn=10. {color:#d04437}*ValidWriteIdList(open:null, > write_HWM=7)*. This read will be able to see rows added by txn=11, which is > still open.{color} > {color:#d04437}So, the open/aborted list of > ValidWriteIdList needs to be rebuilt based on txn_HWM. Any writeId allocated by txnId > txn_HWM > should be marked as open. In this example, *ValidWriteIdList(open:6, > write_HWM=7)* should be generated.{color} > {color:#33}cc{color} [~ekoifman], [~thejas] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
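The rule proposed in the HIVE-18864 description above (any writeId allocated by a txnId greater than txn_HWM is treated as open) can be sketched roughly as follows. This is a minimal illustration in Python, not Hive's implementation: the function name `rebuild_valid_write_ids` and the plain dict of writeId-to-txnId allocations are assumptions made for the example, and aborted-txn handling is left out.

```python
from typing import Dict, List, Tuple


def rebuild_valid_write_ids(
    write_id_to_txn: Dict[int, int], txn_hwm: int
) -> Tuple[List[int], int]:
    """Return (open_write_ids, write_hwm) for a single table.

    write_id_to_txn is a hypothetical map of writeId -> allocating txnId.
    A writeId counts as open when its txn was opened after the reader's
    snapshot, i.e. when txnId > txn_hwm.
    """
    write_hwm = max(write_id_to_txn, default=0)
    open_ids = sorted(w for w, t in write_id_to_txn.items() if t > txn_hwm)
    return open_ids, write_hwm


# Allocations from the scenario: txn 9 -> writeId 5, txn 11 -> 6, txn 10 -> 7.
allocations = {5: 9, 6: 11, 7: 10}
open_ids, write_hwm = rebuild_valid_write_ids(allocations, txn_hwm=10)
print(f"ValidWriteIdList(open:{open_ids}, write_HWM={write_hwm})")
```

With the scenario's allocations, writeId 6 (from txn=11) ends up in the open list while write_HWM stays at 7, which matches the snapshot the description says should be generated.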
[jira] [Commented] (HIVE-19130) NPE is thrown when REPL LOAD applied drop partition event.
[ https://issues.apache.org/jira/browse/HIVE-19130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463378#comment-16463378 ] ASF GitHub Bot commented on HIVE-19130: --- Github user sankarh closed the pull request at: https://github.com/apache/hive/pull/332 > NPE is thrown when REPL LOAD applied drop partition event. > -- > > Key: HIVE-19130 > URL: https://issues.apache.org/jira/browse/HIVE-19130 > Project: Hive > Issue Type: Bug > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: DR, Replication, pull-request-available > Fix For: 3.0.0 > > Attachments: HIVE-19130.01.patch > > > During incremental replication, if we split the events batch as follows, then > the REPL LOAD on second batch throws NPE. > Batch-1: CREATE_TABLE(t1) -> ADD_PARTITION(t1.p1) -> DROP_PARTITION (t1.p1) > Batch-2: DROP_TABLE(t1) -> CREATE_TABLE(t1) -> ADD_PARTITION(t1.p1) -> > DROP_PARTITION (t1.p1) > {code} > 2018-04-05 16:20:36,531 ERROR [HiveServer2-Background-Pool: Thread-107044]: > metadata.Hive (Hive.java:getTable(1219)) - Table catalog_sales_new not found: > new5_tpcds_real_bin_partitioned_orc_1000.catalog_sales_new table not found > 2018-04-05 16:20:36,538 ERROR [HiveServer2-Background-Pool: Thread-107044]: > exec.DDLTask (DDLTask.java:failed(540)) - > org.apache.hadoop.hive.ql.metadata.HiveException > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4016) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3983) > at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:341) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1765) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1506) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1303) 
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1170) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1165) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) > at > org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) > at > org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) > at > org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByExpr(Hive.java:2613) > at > org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4008) > ... 23 more > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18988) Support bootstrap replication of ACID tables
[ https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463377#comment-16463377 ] ASF GitHub Bot commented on HIVE-18988: --- Github user sankarh closed the pull request at: https://github.com/apache/hive/pull/331 > Support bootstrap replication of ACID tables > > > Key: HIVE-18988 > URL: https://issues.apache.org/jira/browse/HIVE-18988 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, repl >Affects Versions: 3.0.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan >Priority: Major > Labels: ACID, DR, pull-request-available, replication > Fix For: 3.0.0, 3.1.0 > > Attachments: HIVE-18988.01-branch-3.patch, HIVE-18988.01.patch, > HIVE-18988.02.patch, HIVE-18988.03.patch, HIVE-18988.04.patch, > HIVE-18988.05.patch, HIVE-18988.06.patch, HIVE-18988.07.patch > > > Bootstrapping of ACID tables needs special handling to replicate a stable > state of data. > - If the ACID feature is enabled, perform the bootstrap dump for ACID tables > within a read txn. > -> Dump table/partition metadata. > -> Get the list of valid data files for a table using the same logic as a read txn > does. > -> Dump the latest ValidWriteIdList as per the current read txn. > - Set the valid last replication state such that it doesn't miss any open > txn started after triggering the bootstrap dump. > - If any txns are ongoing that were opened before triggering the bootstrap dump, > then it is not guaranteed that the open_txn event was captured for these txns. > Also, if these txns were opened for a streaming ingest case, then the dumped ACID > table data may include data of open txns, which impacts snapshot isolation at > the target. To avoid that, the bootstrap dump should wait for a timeout (new > configuration: hive.repl.bootstrap.dump.open.txn.timeout). After the timeout, > just force abort those txns and continue. > - If any force-aborted txn belongs to a streaming ingest case, then the dumped > ACID table data may have aborted data too. 
So, it is necessary to replicate > the aborted write ids to the target to mark that data invalid for any readers. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
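The wait-then-abort step described in HIVE-18988 above (wait up to hive.repl.bootstrap.dump.open.txn.timeout for previously opened txns, then force abort whatever is still open) might look roughly like the sketch below. It is illustrative Python rather than Hive code: `wait_then_abort`, `get_open_txns`, and `abort_txns` are hypothetical names, and the clock and sleep functions are injectable only so the loop can be exercised without real waiting.

```python
import time
from typing import Callable, Iterable, List


def wait_then_abort(
    get_open_txns: Callable[[], Iterable[int]],
    abort_txns: Callable[[List[int]], None],
    timeout_s: float,
    poll_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Poll until no pre-existing txns remain open or the timeout expires.

    Returns the txn ids that had to be force aborted (empty if all of them
    finished on their own before the deadline).
    """
    deadline = clock() + timeout_s
    while True:
        open_txns = set(get_open_txns())
        if not open_txns:
            return []  # everything committed or aborted by itself
        if clock() >= deadline:
            break  # timeout reached with txns still open
        sleep(poll_s)
    aborted = sorted(open_txns)
    abort_txns(aborted)  # force abort, then the dump can continue
    return aborted
```

As the description notes, the write ids of any txns aborted this way would still have to be replicated to the target so that readers there treat the corresponding data as invalid; the sketch only returns the aborted txn ids and leaves that step to the caller.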
[jira] [Commented] (HIVE-19400) Adjust Hive 1.0 to 2.0 conversion utility to the upgrade
[ https://issues.apache.org/jira/browse/HIVE-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463358#comment-16463358 ] Hive QA commented on HIVE-19400: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 28s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} standalone-metastore: The patch generated 0 new + 60 unchanged - 1 fixed = 60 total (was 61) {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. 
Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 14m 16s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10666/dev-support/hive-personality.sh | | git revision | master / 1c3b82f | | Default Java | 1.8.0_111 | | whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-10666/yetus/whitespace-eol.txt | | modules | C: standalone-metastore U: standalone-metastore | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10666/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Adjust Hive 1.0 to 2.0 conversion utility to the upgrade > > > Key: HIVE-19400 > URL: https://issues.apache.org/jira/browse/HIVE-19400 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 3.0.0 >Reporter: Miklos Gergely >Assignee: Miklos Gergely >Priority: Major > Fix For: 3.0.0 > > Attachments: HIVE-19400.01.patch > > > Conversion utility should allow specification of the output dir, and create > files only if there is actually something to do. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-14388) Add number of rows inserted message after insert command in Beeline
[ https://issues.apache.org/jira/browse/HIVE-14388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharathkrishna Guruvayoor Murali updated HIVE-14388: Attachment: HIVE-14388.06.patch > Add number of rows inserted message after insert command in Beeline > --- > > Key: HIVE-14388 > URL: https://issues.apache.org/jira/browse/HIVE-14388 > Project: Hive > Issue Type: Improvement > Components: Beeline >Reporter: Vihang Karajgaonkar >Assignee: Bharathkrishna Guruvayoor Murali >Priority: Minor > Attachments: HIVE-14388-WIP.patch, HIVE-14388.02.patch, > HIVE-14388.03.patch, HIVE-14388.05.patch, HIVE-14388.06.patch > > > Currently, when you run an insert command in Beeline, it returns a message > saying "No rows affected .." > A better and more intuitive message would be "xxx rows inserted (26.068 seconds)" -- This message was sent by Atlassian JIRA (v7.6.3#76005)
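The message format requested in HIVE-14388 above can be illustrated with a small formatting helper. This is only a sketch in Python and says nothing about how Beeline builds its output: the function name `format_insert_message` and the convention that a negative count means "row count unknown" are assumptions for the example.

```python
def format_insert_message(rows: int, elapsed_seconds: float) -> str:
    """Build a status line like the one proposed in the issue.

    A negative row count is treated (by assumption) as "unknown", in which
    case the existing "No rows affected" wording is kept.
    """
    timing = f"({elapsed_seconds:.3f} seconds)"
    if rows < 0:
        return f"No rows affected {timing}"
    noun = "row" if rows == 1 else "rows"
    return f"{rows} {noun} inserted {timing}"


print(format_insert_message(100, 26.068))
```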
[jira] [Commented] (HIVE-19108) Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q causes Wrong Query Results
[ https://issues.apache.org/jira/browse/HIVE-19108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463338#comment-16463338 ] Hive QA commented on HIVE-19108: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12921699/HIVE-19108.04.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 14318 tests executed *Failed tests:* {noformat} TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed out) (batchId=247) TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed out) (batchId=247) TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217) TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed out) (batchId=247) TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[fouter_join_ppr] (batchId=33) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=175) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez2] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4] (batchId=164) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_4] (batchId=157) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_5] (batchId=154) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_rebuild_dummy] (batchId=161) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_time_window] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=163) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_dynpart_hashjoin_1] (batchId=174) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1] (batchId=172) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=167) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=105) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=105) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=105) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez-tag] (batchId=105) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] (batchId=96) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] (batchId=96) org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228) org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=228) org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232) org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate (batchId=231) org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys (batchId=231) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=235) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=235) org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239) org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead 
(batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerVertexRawInputSplitsNoKill (batchId=241) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/10665/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10665/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-10665/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.YetusPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing
[jira] [Commented] (HIVE-19108) Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q causes Wrong Query Results
[ https://issues.apache.org/jira/browse/HIVE-19108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463330#comment-16463330 ] Hive QA commented on HIVE-19108: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 48s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 44s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 43s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 46s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | 
{color:green} 2m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 33s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 52m 23s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10665/dev-support/hive-personality.sh | | git revision | master / 1c3b82f | | Default Java | 1.8.0_111 | | modules | C: . ql U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10665/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q > causes Wrong Query Results > --- > > Key: HIVE-19108 > URL: https://issues.apache.org/jira/browse/HIVE-19108 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Haifeng Chen >Priority: Critical > Attachments: HIVE-19108.01.patch, HIVE-19108.02.patch, > HIVE-19108.03.patch, HIVE-19108.04.patch > > > Found in vectorization enable by default experiment. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19110) Vectorization: Enabling vectorization causes TestContribCliDriver udf_example_arraymapstruct.q to produce Wrong Results
[ https://issues.apache.org/jira/browse/HIVE-19110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463302#comment-16463302 ] Hive QA commented on HIVE-19110: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12921698/HIVE-19110.04.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 47 failed/errored test(s), 14319 tests executed *Failed tests:* {noformat} TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed out) (batchId=247) TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed out) (batchId=247) TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217) TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed out) (batchId=247) TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] (batchId=253) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_1] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_2] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative3] (batchId=29) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=81) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13] (batchId=32) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_7] (batchId=27) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4] (batchId=164) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=163) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=160) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1] (batchId=172) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=167) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=105) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insertsel_fail] (batchId=95) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_data_parquet_empty] (batchId=96) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] (batchId=96) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] (batchId=96) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_1] (batchId=137) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_2] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin_negative3] (batchId=120) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_1_23] (batchId=143) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=111) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_13] (batchId=122) org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228) org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=228) org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232) org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate (batchId=231) org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys (batchId=231) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=235) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=235) org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239) 
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerVertexRawInputSplitsNoKill (batchId=241) {noformat} Test results:
[jira] [Commented] (HIVE-19110) Vectorization: Enabling vectorization causes TestContribCliDriver udf_example_arraymapstruct.q to produce Wrong Results
[ https://issues.apache.org/jira/browse/HIVE-19110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463286#comment-16463286 ] Hive QA commented on HIVE-19110: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 2s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 25s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 18s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 48s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 47s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | 
{color:green} 2m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 34s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 55m 41s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10664/dev-support/hive-personality.sh | | git revision | master / bf8e696 | | Default Java | 1.8.0_111 | | modules | C: . contrib ql U: . | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10664/yetus.txt | | Powered by | Apache Yetushttp://yetus.apache.org | This message was automatically generated. > Vectorization: Enabling vectorization causes TestContribCliDriver > udf_example_arraymapstruct.q to produce Wrong Results > --- > > Key: HIVE-19110 > URL: https://issues.apache.org/jira/browse/HIVE-19110 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Haifeng Chen >Priority: Critical > Attachments: HIVE-19110.01.patch, HIVE-19110.02.patch, > HIVE-19110.03.patch, HIVE-19110.04.patch > > > Found in vectorization enable by default experiment. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19396) HiveOperation is incorrectly set for analyze statement
[ https://issues.apache.org/jira/browse/HIVE-19396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-19396: Resolution: Fixed Fix Version/s: 3.1.0 Status: Resolved (was: Patch Available) Pushed to master. > HiveOperation is incorrectly set for analyze statement > -- > > Key: HIVE-19396 > URL: https://issues.apache.org/jira/browse/HIVE-19396 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19396.patch > > > Because we rewrite analyze to select compute_stats() operation enum gets set > to Query. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463269#comment-16463269 ] Sahil Takiar commented on HIVE-19041: - Think you missed {{bucketCols}} in {{StorageDescriptor}}, otherwise +1 pending Hive QA. We can do this in a follow up JIRA, but we probably want to intern {{StorageDescriptor#sortCols}} too. I don't think different partitions can have different sort cols, so these would be duplicated across all partitions for a table. We will have to define a customized intern method for it because {{sortCols}} is a list of {{Order}} objects, but {{Order}} objects are just a String + int > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch, HIVE-19041.02.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
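The "customized intern method" for {{sortCols}} mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: the {{Order}} class here is a hypothetical stand-in for the Thrift-generated one (a column name plus an int), the class and method names are invented, and the pool holds strong references, so a real version would probably need weak references or a size bound.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class OrderInternSketch {
    /** Hypothetical stand-in for the Thrift-generated Order: a String plus an int. */
    public static final class Order {
        public final String col;
        public final int order;

        public Order(String col, int order) {
            this.col = col;
            this.order = order;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Order)) {
                return false;
            }
            Order other = (Order) o;
            return order == other.order && Objects.equals(col, other.col);
        }

        @Override
        public int hashCode() {
            return Objects.hash(col, order);
        }
    }

    // Assumption: a simple strong-reference pool keyed by value equality.
    private static final ConcurrentMap<Order, Order> POOL = new ConcurrentHashMap<>();

    /** Returns one canonical instance per distinct (col, order) pair. */
    public static Order intern(Order o) {
        if (o == null) {
            return null;
        }
        Order candidate = new Order(o.col == null ? null : o.col.intern(), o.order);
        Order existing = POOL.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    /** Interns every element of a sort-column list. */
    public static List<Order> internList(List<Order> in) {
        if (in == null) {
            return null;
        }
        List<Order> out = new ArrayList<>(in.size());
        for (Order o : in) {
            out.add(intern(o));
        }
        return out;
    }
}
```

With something like this, partitions of the same table would share the same {{Order}} instances instead of each holding its own copies.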
[jira] [Updated] (HIVE-19399) Down cast from int to tinyint generating incorrect value for vectorization
[ https://issues.apache.org/jira/browse/HIVE-19399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haifeng Chen updated HIVE-19399: Description: The following SQL script generates different results with vectorization disabled and enabled (for both ORC and Parquet). drop table test_schema; create table test_schema (f int) stored as parquet; insert into test_schema values ('9'); select cast(f as tinyint) + 1 from test_schema; With vectorization disabled, the result is -96, while with vectorization enabled, it is 10 was: The following SQL script generates different results with vectorization disabled and enabled. drop table test_schema; create table test_schema (f int) stored as parquet; insert into test_schema values ('9'); select cast(f as tinyint) + 1 from test_schema; With vectorization disabled, the result is -96, while with vectorization enabled, it is 10 > Down cast from int to tinyint generating incorrect value for vectorization > -- > > Key: HIVE-19399 > URL: https://issues.apache.org/jira/browse/HIVE-19399 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 3.1.0 >Reporter: Haifeng Chen >Priority: Major > > The following SQL script generates different results with vectorization > disabled and enabled (for both ORC and Parquet). > drop table test_schema; > create table test_schema (f int) stored as parquet; > insert into test_schema values ('9'); > select cast(f as tinyint) + 1 from test_schema; > With vectorization disabled, the result is -96, while with vectorization enabled, it is > 10 > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
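For readers unfamiliar with how an int-to-tinyint downcast can change a value, here is a minimal sketch of a narrowing cast in plain Java. It assumes TINYINT behaves like Java's signed 8-bit {{byte}}; it is not Hive's cast implementation, and the input 159 is an illustrative out-of-range value chosen here, not one taken from the issue.

```java
final class TinyintCastSketch {
    /** Narrowing int -> byte keeps only the low 8 bits (two's complement). */
    public static byte toTinyint(int value) {
        return (byte) value;
    }

    /** Mirrors the shape of cast(f as tinyint) + 1: narrow first, then add as int. */
    public static int castPlusOne(int value) {
        return toTinyint(value) + 1;
    }

    public static void main(String[] args) {
        System.out.println(castPlusOne(9));   // 10: 9 fits in a byte, so nothing wraps
        System.out.println(castPlusOne(159)); // -96: 159 wraps to -97 before the addition
    }
}
```

A code path that narrows before adding and one that adds without narrowing would disagree only for inputs outside the range -128..127.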
[jira] [Commented] (HIVE-19384) Vectorization: IfExprTimestamp* do not handle NULLs correctly
[ https://issues.apache.org/jira/browse/HIVE-19384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463258#comment-16463258 ]

Hive QA commented on HIVE-19384:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12921697/HIVE-19384.02.patch

SUCCESS: +1 due to 3 test(s) being added or modified.

ERROR: -1 due to 49 failed/errored test(s), 14319 tests executed

*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestJdbcWithMiniKdc - did not produce a TEST-*.xml file (likely timed out) (batchId=254)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] (batchId=253)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_1] (batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_2] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative3] (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13] (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_7] (batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp_funcs] (batchId=31)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=105)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insertsel_fail] (batchId=95)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_data_parquet_empty] (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] (batchId=96)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_1] (batchId=137)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_2] (batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin_negative3] (batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_1_23] (batchId=143)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=111)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_13] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_timestamp_funcs] (batchId=121)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=228)
org.apache.hadoop.hive.ql.TestAutoPurgeTables.testExternalNoAutoPurge (batchId=233)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232)
org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate (batchId=231)
org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys (batchId=231)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=235)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime
[jira] [Commented] (HIVE-19418) add background stats updater similar to compactor
[ https://issues.apache.org/jira/browse/HIVE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463257#comment-16463257 ] Gopal V commented on HIVE-19418: bq. some exceptions like numRows, cannot be aggregated (i.e. you cannot combine ndvs from two inserts) For pure insert queries all stats can be merged, because NDVs are actually stored as HyperLogLog bitsets which have a merge() op. bq. Therefore we will add background logic to metastore (similar to, and partially inside, the ACID compactor) With standalone-metastore, adding more background logic to the metastore is going to become a big problem. I'd argue that even the compactor needs to be moved out, and the metastore can only keep the book-keeping for pending tasks (a generic task queue + priorities), because it will no longer have a yarn-site.xml in its configurations. > add background stats updater similar to compactor > - > > Key: HIVE-19418 > URL: https://issues.apache.org/jira/browse/HIVE-19418 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > There's a JIRA HIVE-19416 to add snapshot version to stats for MM/ACID tables > to make them usable in a transaction without breaking ACID (for metadata-only > optimization). However, stats for ACID tables can still become unusable if > e.g. two parallel inserts run - neither sees the data written by the other, > so after both finish, the snapshots on either set of stats won't match the > current snapshot and the stats will be unusable. > Additionally, for ACID and non-ACID tables alike, a lot of the stats, with > some exceptions like numRows, cannot be aggregated (i.e. you cannot combine > ndvs from two inserts), and for ACID even less can be aggregated (you cannot > derive min/max if some rows are deleted but you don't scan the rest of the > dataset). 
> Therefore we will add background logic to metastore (similar to, and > partially inside, the ACID compactor) to update stats. > It will have 3 modes of operation. > 1) Off. > 2) Update only the stats that exist but are out of date (generating stats can > be expensive, so if the user is only analyzing a subset of tables it should > be able to only update that subset). We can simply look at existing stats and > only analyze for the relevant partitions and columns. > 3) On: 2 + create stats for all tables and columns missing stats. > There will also be a table parameter to skip stats update. > In phase 1, the process will operate outside of compactor, and run analyze > command on the table. The analyze command will automatically save the stats > with ACID snapshot information if needed, based on HIVE-19416, so we don't > need to do any special state management and this will work for all table > types. However it's also more expensive. > In phase 2, we can explore adding stats collection during MM compaction that > uses a temp table. If we don't have open writers during major compaction (so > we overwrite all of the data), the temp table stats can simply be copied over > to the main table with correct snapshot information, saving us a table scan. > In phase 3, we can add custom stats collection logic to full ACID compactor > that is not query based, the same way as we'd do for (2). Alternatively we > can wait for ACID compactor to become query based and just reuse (2). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
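The point in the comment above, that NDV sketches from two inserts can be combined, can be illustrated with a toy HyperLogLog-style register array. This is a minimal sketch under stated assumptions, not Hive's HyperLogLog class: the class name, the register count, and the SplitMix64-style mixer are all choices made here for illustration. The key property is that merging is an element-wise maximum, so the merged registers equal those of a sketch built over the union of both inputs.

```java
final class HllMergeSketch {
    static final int P = 10;       // assumption: 2^10 = 1024 registers
    static final int M = 1 << P;

    public final byte[] registers = new byte[M];

    /** SplitMix64-style finalizer, used here only as a convenient 64-bit mixer. */
    static long mix(long z) {
        z += 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    /** Records one value: the top P hash bits pick a register, the rest give the rank. */
    public void add(long value) {
        long h = mix(value);
        int idx = (int) (h >>> (64 - P));
        long rest = h << P;
        int rank = Math.min(Long.numberOfLeadingZeros(rest), 64 - P) + 1;
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    /** Merges another sketch into this one by taking the per-register maximum. */
    public void merge(HllMergeSketch other) {
        for (int i = 0; i < M; i++) {
            if (other.registers[i] > registers[i]) {
                registers[i] = other.registers[i];
            }
        }
    }
}
```

Because each register only ever stores a maximum, the order in which inserts are merged does not matter, which is what makes this kind of statistic mergeable for insert-only workloads. Deletes are a different matter, as the issue description notes.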
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463256#comment-16463256 ] Vihang Karajgaonkar commented on HIVE-19041: Attaching second patch which addresses [~stakiar]'s comment above. Specifically, interns {{catName}} in the setter, {{FieldSchema.comment}} field and handles setters of deserializerClass, serializerClass and serializationLib Strings in the setters and read method. > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch, HIVE-19041.02.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-19041: --- Attachment: HIVE-19041.02.patch > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch, HIVE-19041.02.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463254#comment-16463254 ] Vihang Karajgaonkar commented on HIVE-19041: {{FieldSchema.comment}} is already being interned as per https://github.com/apache/hive/blob/master/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/FieldSchema.java#L144 Hence, adding an intern call around comment in the {{read}} method isn't really doing anything new; rather, it fixes another important code path that wastes a lot of memory. > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19418) add background stats updater similar to compactor
[ https://issues.apache.org/jira/browse/HIVE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-19418: -- Component/s: Transactions > add background stats updater similar to compactor > - > > Key: HIVE-19418 > URL: https://issues.apache.org/jira/browse/HIVE-19418 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19418) add background stats updater similar to compactor
[ https://issues.apache.org/jira/browse/HIVE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463241#comment-16463241 ] Sergey Shelukhin commented on HIVE-19418: - cc [~steveyeom2017] [~ekoifman] [~ashutoshc] > add background stats updater similar to compactor > - > > Key: HIVE-19418 > URL: https://issues.apache.org/jira/browse/HIVE-19418 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-19418) add background stats updater similar to compactor
[ https://issues.apache.org/jira/browse/HIVE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-19418: --- > add background stats updater similar to compactor > - > > Key: HIVE-19418 > URL: https://issues.apache.org/jira/browse/HIVE-19418 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463236#comment-16463236 ] Misha Dmitriev commented on HIVE-19041: --- [~gopalv] yes, since JDK 1.7 built-in string interning is far superior to one based on WeakHashMap (which was used by Hadoop weak interner). Check the above article on interning for further details. > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463235#comment-16463235 ] Misha Dmitriev commented on HIVE-19041: --- Yes, all interned strings are kept in the JVM-internal equivalent of a concurrent WeakHashMap. Since it's highly specialized, it's very fast, and has no extra overhead when more strings are added to it (because it's quite large and preallocated, so actually every running JVM already bears this memory overhead of a few MB). If you are really interested, check this article: [http://java-performance.info/string-intern-in-java-6-7-8/] Basically, the only thing that you may be concerned with when using String.intern() is the CPU overhead. But in my experience, unless interning is used, mistakenly, for strings that are very short-lived anyway, the benefit of reduced GC outweighs the cost of the extra CPU cycles consumed by the intern() call. > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
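As a small illustration of the String.intern() behaviour discussed above, the sketch below interns a field in its setter so that equal values held by different objects share one String instance. The {{Descriptor}} class and its field are hypothetical stand-ins invented for this example; they are not the Thrift-generated metastore classes or the actual patch.

```java
final class InternSetterSketch {
    /** Hypothetical descriptor with one String field; not the real StorageDescriptor. */
    public static final class Descriptor {
        private String inputFormat;

        public void setInputFormat(String inputFormat) {
            // Intern on the way in so equal values share a single String instance.
            this.inputFormat = inputFormat == null ? null : inputFormat.intern();
        }

        public String getInputFormat() {
            return inputFormat;
        }
    }

    public static void main(String[] args) {
        Descriptor a = new Descriptor();
        Descriptor b = new Descriptor();
        // new String(...) forces two distinct instances, much as deserialization would.
        a.setInputFormat(new String("org.apache.hadoop.mapred.TextInputFormat"));
        b.setInputFormat(new String("org.apache.hadoop.mapred.TextInputFormat"));
        System.out.println(a.getInputFormat() == b.getInputFormat()); // true
    }
}
```

The same call would also be needed wherever a field is assigned directly, bypassing the setter, which is the situation the thread describes for the generated {{read}} method.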
[jira] [Comment Edited] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463233#comment-16463233 ] Gopal V edited comment on HIVE-19041 at 5/4/18 12:21 AM: - Hadoop comes with a weak-interner, which is used by Tez. StringInterner.weakIntern() Looking at the code, it was explicitly removed by HIVE-17237 ?? was (Author: gopalv): Hadoop comes with a weak-interner, which is used by Tez. StringInterner.weakIntern() > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463233#comment-16463233 ] Gopal V commented on HIVE-19041: Hadoop comes with a weak-interner, which is used by Tez. StringInterner.weakIntern() > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19417) Modify metastore to have/access persistent tables for stats
[ https://issues.apache.org/jira/browse/HIVE-19417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463226#comment-16463226 ] Steve Yeom commented on HIVE-19417: --- I will add a snapshot table and the reference to it on the existing stats tables. > Modify metastore to have/access persistent tables for stats > --- > > Key: HIVE-19417 > URL: https://issues.apache.org/jira/browse/HIVE-19417 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19417) Modify metastore to have/access persistent tables for stats
[ https://issues.apache.org/jira/browse/HIVE-19417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19417: -- Summary: Modify metastore to have/access persistent tables for stats (was: Modify metastore to have persistent tables/objects) > Modify metastore to have/access persistent tables for stats > --- > > Key: HIVE-19417 > URL: https://issues.apache.org/jira/browse/HIVE-19417 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463221#comment-16463221 ] Sahil Takiar commented on HIVE-19041: - I guess if you have a comment for the partition column you will have tons of duplicate comments. I don't think we can selectively intern just the partition column comments, though. I'm not sure how much overhead it would introduce if we just blindly interned all comments. This would include table, database, and column level comments. [~mi...@cloudera.com] do interned strings ever get purged from the Java heap? > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch > > > When a client is creating large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, outputformat which cause large number > of duplicate Strings in the HMS memory. We should intern these objects while > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-19417) Modify metastore to have persistent tables/objects
[ https://issues.apache.org/jira/browse/HIVE-19417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-19417: - > Modify metastore to have persistent tables/objects > -- > > Key: HIVE-19417 > URL: https://issues.apache.org/jira/browse/HIVE-19417 > Project: Hive > Issue Type: Sub-task > Components: Hive >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19344) Change default value of msck.repair.batch.size
[ https://issues.apache.org/jira/browse/HIVE-19344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463214#comment-16463214 ] Sahil Takiar commented on HIVE-19344: - +1 > Change default value of msck.repair.batch.size > --- > > Key: HIVE-19344 > URL: https://issues.apache.org/jira/browse/HIVE-19344 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-19344.01.patch > > > {{msck.repair.batch.size}} default to 0 which means msck will try to add all > the partitions in one API call to HMS. This can potentially add huge memory > pressure on HMS. The default value should be changed to a reasonable number > so that in case of large number of partitions we can batch the addition of > partitions. Same goes for {{msck.repair.batch.max.retries}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
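The batching behaviour described in this issue can be sketched as below. This is an illustration under stated assumptions, not Hive's msck code: `PartitionBatcherSketch` and `toBatches` are hypothetical names, partitions are represented as plain strings, and a batch size of 0 (or less) is taken to mean "everything in one call", matching the default described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper showing how a batch size splits one large request into several. */
final class PartitionBatcherSketch {

  /**
   * Splits items into consecutive batches of at most batchSize elements.
   * A batchSize of 0 or less yields a single batch holding every item.
   */
  static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
    List<List<T>> batches = new ArrayList<>();
    if (items.isEmpty()) {
      return batches;
    }
    int size = (batchSize <= 0) ? items.size() : batchSize;
    for (int start = 0; start < items.size(); start += size) {
      int end = Math.min(start + size, items.size());
      // Copy the slice so each batch is independent of the source list.
      batches.add(new ArrayList<>(items.subList(start, end)));
    }
    return batches;
  }

  public static void main(String[] args) {
    List<String> partitions = new ArrayList<>();
    for (int i = 0; i < 7; i++) {
      partitions.add("ds=2018-05-0" + (i + 1));
    }
    System.out.println(toBatches(partitions, 0).size()); // prints 1: one call with all 7
    System.out.println(toBatches(partitions, 3).size()); // prints 3: batches of 3, 3 and 1
  }
}
```

With a non-zero default, each batch would become one metastore call, which bounds the memory any single request can demand.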
[jira] [Assigned] (HIVE-19416) Create single version transactional table metastore statistics for aggregation queries
[ https://issues.apache.org/jira/browse/HIVE-19416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-19416: - > Create single version transactional table metastore statistics for > aggregation queries > -- > > Key: HIVE-19416 > URL: https://issues.apache.org/jira/browse/HIVE-19416 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.0.0 > > > The system should use only statistics for aggregation queries like count on > transactional tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18570) ACID IOW implemented using base may delete too much data
[ https://issues.apache.org/jira/browse/HIVE-18570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463211#comment-16463211 ] Eugene Koifman commented on HIVE-18570: --- HIVE-18570.03.patch updates some golden files > ACID IOW implemented using base may delete too much data > > > Key: HIVE-18570 > URL: https://issues.apache.org/jira/browse/HIVE-18570 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Sergey Shelukhin >Assignee: Eugene Koifman >Priority: Blocker > Attachments: HIVE-18570.01-branch-3.patch, HIVE-18570.01.patch, > HIVE-18570.02-branch-3.patch, HIVE-18570.02.patch, > HIVE-18570.03-branch-3.patch, HIVE-18570.03.patch, > HIVE-18570.04-branch-3.patch > > > Suppose we have a table with delta_0 insert data. > Txn 1 starts an insert into delta_1. > Txn 2 starts an IOW into base_2. > Txn 2 commits. > Txn 1 commits after txn 2 but its results would be invisible. > Txn 2 deletes rows committed by txn 1 that according to standard ACID > semantics it could have never observed and affected; this sequence of events > is only possible under read-uncommitted isolation level (so, 2 deletes rows > written by 1 before 1 commits them). > This is if we look at IOW as transactional delete+insert. Otherwise we are > just saying IOW performs "semi"-transactional delete. > If 1 ran an update on rows instead of an insert, and 2 still ran an > IOW/delete, row lock conflict (or equivalent) should cause one of them to > fail. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
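The sequence in the description can be illustrated with a toy model of directory selection. This is a simplified sketch, not Hive's ACID reader: it assumes directory names of the form `base_N` and `delta_N` as used in the description, and assumes a reader that takes the newest base and then only deltas with a higher id. `AcidVisibilitySketch` and `visibleDirs` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of which directories a reader would pick; not Hive's actual logic. */
final class AcidVisibilitySketch {

  /** Returns the newest base (if any) followed by every delta whose id is above that base. */
  static List<String> visibleDirs(List<String> dirs) {
    long bestBase = -1;
    for (String d : dirs) {
      if (d.startsWith("base_")) {
        bestBase = Math.max(bestBase, Long.parseLong(d.substring("base_".length())));
      }
    }
    List<String> visible = new ArrayList<>();
    if (bestBase >= 0) {
      visible.add("base_" + bestBase);
    }
    for (String d : dirs) {
      if (d.startsWith("delta_")
          && Long.parseLong(d.substring("delta_".length())) > bestBase) {
        visible.add(d);
      }
    }
    return visible;
  }

  public static void main(String[] args) {
    // Before the IOW: both inserts are readable.
    System.out.println(visibleDirs(Arrays.asList("delta_0", "delta_1"))); // prints [delta_0, delta_1]
    // After txn 2 writes base_2 and txn 1 commits delta_1: delta_1 is shadowed by the base.
    System.out.println(visibleDirs(Arrays.asList("delta_0", "delta_1", "base_2"))); // prints [base_2]
  }
}
```

Under this model, the rows txn 1 committed into delta_1 disappear even though txn 2 could never have observed them, which is the anomaly the issue describes.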
[jira] [Updated] (HIVE-18570) ACID IOW implemented using base may delete too much data
[ https://issues.apache.org/jira/browse/HIVE-18570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-18570: -- Attachment: HIVE-18570.03.patch > ACID IOW implemented using base may delete too much data > > > Key: HIVE-18570 > URL: https://issues.apache.org/jira/browse/HIVE-18570 > Project: Hive > Issue Type: Bug > Components: Transactions >Reporter: Sergey Shelukhin >Assignee: Eugene Koifman >Priority: Blocker > Attachments: HIVE-18570.01-branch-3.patch, HIVE-18570.01.patch, > HIVE-18570.02-branch-3.patch, HIVE-18570.02.patch, > HIVE-18570.03-branch-3.patch, HIVE-18570.03.patch, > HIVE-18570.04-branch-3.patch > > > Suppose we have a table with delta_0 insert data. > Txn 1 starts an insert into delta_1. > Txn 2 starts an IOW into base_2. > Txn 2 commits. > Txn 1 commits after txn 2 but its results would be invisible. > Txn 2 deletes rows committed by txn 1 that according to standard ACID > semantics it could have never observed and affected; this sequence of events > is only possible under read-uncommitted isolation level (so, 2 deletes rows > written by 1 before 1 commits them). > This is if we look at IOW as transactional delete+insert. Otherwise we are > just saying IOW performs "semi"-transactional delete. > If 1 ran an update on rows instead of an insert, and 2 still ran an > IOW/delete, row lock conflict (or equivalent) should cause one of them to > fail. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19384) Vectorization: IfExprTimestamp* do not handle NULLs correctly
[ https://issues.apache.org/jira/browse/HIVE-19384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463207#comment-16463207 ] Hive QA commented on HIVE-19384:
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 48s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 4s | master passed |
| +1 | compile | 1m 27s | master passed |
| +1 | checkstyle | 1m 0s | master passed |
| +1 | javadoc | 1m 17s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 52s | the patch passed |
| +1 | compile | 1m 29s | the patch passed |
| +1 | javac | 1m 29s | the patch passed |
| -1 | checkstyle | 0m 12s | storage-api: The patch generated 2 new + 12 unchanged - 2 fixed = 14 total (was 14) |
| -1 | checkstyle | 0m 47s | ql: The patch generated 20 new + 232 unchanged - 14 fixed = 252 total (was 246) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 1m 15s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 19m 11s | |
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10663/dev-support/hive-personality.sh |
| git revision | master / bf8e696 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-10663/yetus/diff-checkstyle-storage-api.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-10663/yetus/diff-checkstyle-ql.txt |
| modules | C: storage-api ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10663/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |
This message was automatically generated.
> Vectorization: IfExprTimestamp* do not handle NULLs correctly > - > > Key: HIVE-19384 > URL: https://issues.apache.org/jira/browse/HIVE-19384 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-19384.01.patch, HIVE-19384.02.patch > > > HIVE-18622: "Vectorization: IF Statements, Comparisons, and more do not > handle NULLs correctly" didn't quite fix the IfExprTimestamp* classes > right > {noformat} > // Carefully handle NULLs... > outputColVector.noNulls = false;{noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
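The issue text does not spell out what the corrected NULL handling looks like, so the following is only a sketch of one careful way to propagate NULLs through a vectorized IF. It is not the HIVE-19384 patch and does not use Hive's column vector classes: `IfExprNullSketch` and its `Col` type are hypothetical stand-ins with `long` values, and the sketch assumes the condition itself is never NULL. It also assumes the convention that `isNull[i]` is only meaningful when `noNulls` is false.

```java
/** Hypothetical, simplified model of a vectorized IF(cond, thenCol, elseCol). */
final class IfExprNullSketch {

  /** Minimal stand-in for a column vector: values plus per-row null flags. */
  static final class Col {
    final long[] vector;
    final boolean[] isNull;
    boolean noNulls = true; // when true, isNull[] is not consulted

    Col(int n) {
      vector = new long[n];
      isNull = new boolean[n];
    }
  }

  /** Writes cond[i] ? thenCol[i] : elseCol[i] into out, carrying NULL from the chosen branch only. */
  static void ifExpr(boolean[] cond, Col thenCol, Col elseCol, Col out) {
    out.noNulls = true;
    for (int i = 0; i < cond.length; i++) {
      Col src = cond[i] ? thenCol : elseCol;
      boolean srcNull = !src.noNulls && src.isNull[i];
      // Always write the flag so a stale value from a reused output vector cannot leak through.
      out.isNull[i] = srcNull;
      if (srcNull) {
        out.noNulls = false; // only flip once a NULL is actually produced
      } else {
        out.vector[i] = src.vector[i];
      }
    }
  }

  public static void main(String[] args) {
    Col thenCol = new Col(3);
    Col elseCol = new Col(3);
    Col out = new Col(3);
    thenCol.vector[0] = 1;
    thenCol.noNulls = false;
    thenCol.isNull[1] = true; // row 1 of the THEN branch is NULL
    elseCol.vector[2] = 30;
    ifExpr(new boolean[] {true, true, false}, thenCol, elseCol, out);
    System.out.println(out.vector[0] + " " + out.isNull[1] + " " + out.vector[2]); // prints 1 true 30
  }
}
```

The point of the sketch is that the output's `noNulls` flag and `isNull` entries are derived row by row from the branch that was selected, rather than set unconditionally.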
[jira] [Commented] (HIVE-19344) Change default value of msck.repair.batch.size
[ https://issues.apache.org/jira/browse/HIVE-19344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463203#comment-16463203 ] Vihang Karajgaonkar commented on HIVE-19344: [~stakiar] Can you take a look? > Change default value of msck.repair.batch.size > --- > > Key: HIVE-19344 > URL: https://issues.apache.org/jira/browse/HIVE-19344 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-19344.01.patch > > > {{msck.repair.batch.size}} default to 0 which means msck will try to add all > the partitions in one API call to HMS. This can potentially add huge memory > pressure on HMS. The default value should be changed to a reasonable number > so that in case of large number of partitions we can batch the addition of > partitions. Same goes for {{msck.repair.batch.max.retries}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19306) Arrow batch serializer
[ https://issues.apache.org/jira/browse/HIVE-19306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463200#comment-16463200 ] Teddy Choi edited comment on HIVE-19306 at 5/3/18 11:45 PM: [~mmccline], [~ewohlstadter]. HIVE-19306.3.patch fixes null handling. ArrowColumnarBatchSerDe#writeNull is the strangest part: Apache Arrow's UnionListWriter should implement AbstractFieldWriter#writeNull properly, and FieldWriter should have a super method of AbstractFieldWriter#writeNull to expose it, but they don't. I used reflection and a concrete class check to handle it. I'll fix the FieldWriter interface in Apache Arrow. was (Author: teddy.choi): [~mmccline], [~ewohlstadter]. [^HIVE-19306.3.patch] fixed null handling. ArrowColumnarBatchSerDe#writeNull is the strangest part. Because Apache Arrow's UnionListWriter should implement AbstractFieldWriter#writeNull properly and FieldWriter should have a super method of AbstractFieldWriter#writeNull to expose it. I used reflection and concrete class check to handle it. I'll fix FieldWriter interface in Apache Arrow. > Arrow batch serializer > -- > > Key: HIVE-19306 > URL: https://issues.apache.org/jira/browse/HIVE-19306 > Project: Hive > Issue Type: Task > Components: Serializers/Deserializers >Reporter: Eric Wohlstadter >Assignee: Teddy Choi >Priority: Major > Attachments: HIVE-19306.2.patch > > > Leverage the ThriftJDBCBinarySerDe code path that already exists in > SemanticAnalyzer/FileSinkOperator to create a serializer that batches rows > into Arrow vector batches. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
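The workaround described in the comment (a concrete class check plus reflection to reach a method the interface does not expose) follows a general pattern that can be sketched as below. None of these types are Arrow's: `Writer`, `ListWriterImpl`, and `OtherWriterImpl` are hypothetical stand-ins, and the sketch makes no claim about how the actual patch is written.

```java
import java.lang.reflect.Method;

/** Hypothetical illustration of calling a method that an interface fails to expose. */
final class ReflectiveWriteNullSketch {

  /** Stand-in interface that does not declare writeNull(). */
  interface Writer {
    void writeInt(int v);
  }

  /** Stand-in concrete writer that is known at compile time. */
  public static final class ListWriterImpl implements Writer {
    final StringBuilder log = new StringBuilder();

    @Override
    public void writeInt(int v) {
      log.append(v).append(',');
    }

    public void writeNull() {
      log.append("null,");
    }
  }

  /** Stand-in writer that also has writeNull() but is only reached via reflection here. */
  public static final class OtherWriterImpl implements Writer {
    final StringBuilder log = new StringBuilder();

    @Override
    public void writeInt(int v) {
      log.append(v).append(',');
    }

    public void writeNull() {
      log.append("null,");
    }
  }

  /** Returns true if a null could be written, false if the writer offers no writeNull(). */
  static boolean writeNull(Writer w) {
    // Concrete class check: cheap and type-safe when the implementation is known.
    if (w instanceof ListWriterImpl) {
      ((ListWriterImpl) w).writeNull();
      return true;
    }
    // Reflection fallback: look the method up by name on the runtime class.
    try {
      Method m = w.getClass().getMethod("writeNull");
      m.invoke(w);
      return true;
    } catch (ReflectiveOperationException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    ListWriterImpl list = new ListWriterImpl();
    list.writeInt(7);
    System.out.println(writeNull(list) + " " + list.log); // prints true 7,null,
  }
}
```

Both paths are workarounds; declaring the method on the interface, as the comment proposes for Arrow, would remove the need for either.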
[jira] [Commented] (HIVE-19306) Arrow batch serializer
[ https://issues.apache.org/jira/browse/HIVE-19306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463200#comment-16463200 ] Teddy Choi commented on HIVE-19306: --- [~mmccline], [~ewohlstadter]. [^HIVE-19306.3.patch] fixed null handling. ArrowColumnarBatchSerDe#writeNull is the strangest part. Because Apache Arrow's UnionListWriter should implement AbstractFieldWriter#writeNull properly and FieldWriter should have a super method of AbstractFieldWriter#writeNull to expose it. I used reflection and concrete class check to handle it. I'll fix FieldWriter interface in Apache Arrow. > Arrow batch serializer > -- > > Key: HIVE-19306 > URL: https://issues.apache.org/jira/browse/HIVE-19306 > Project: Hive > Issue Type: Task > Components: Serializers/Deserializers >Reporter: Eric Wohlstadter >Assignee: Teddy Choi >Priority: Major > Attachments: HIVE-19306.2.patch > > > Leverage the ThriftJDBCBinarySerDe code path that already exists in > SemanticAnalyzer/FileSinkOperator to create a serializer that batches rows > into Arrow vector batches. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19118) Vectorization: Turning on vectorization in escape_crlf produces wrong results
[ https://issues.apache.org/jira/browse/HIVE-19118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463191#comment-16463191 ] Hive QA commented on HIVE-19118: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12921696/HIVE-19118.05.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 48 failed/errored test(s), 14318 tests executed *Failed tests:* {noformat} TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed out) (batchId=247) TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed out) (batchId=247) TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217) TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed out) (batchId=247) TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] (batchId=253) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_1] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_2] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative3] (batchId=29) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=81) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13] (batchId=32) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_7] (batchId=27) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=175) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=163) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_dynpart_hashjoin_1] (batchId=174) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=160) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=167) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=105) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insertsel_fail] (batchId=95) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_data_parquet_empty] (batchId=96) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] (batchId=96) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] (batchId=96) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_1] (batchId=137) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_2] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin_negative3] (batchId=120) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_1_23] (batchId=143) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=111) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_13] (batchId=122) org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228) org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=228) org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232) org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate (batchId=231) org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys (batchId=231) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=235) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=235) org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239) 
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveAndKill (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerVertexRawInputSplitsNoKill
[jira] [Commented] (HIVE-18288) merge/concat not supported on Acid table
[ https://issues.apache.org/jira/browse/HIVE-18288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463187#comment-16463187 ] Eugene Koifman commented on HIVE-18288: --- [~sershe] could you review please > merge/concat not supported on Acid table > > > Key: HIVE-18288 > URL: https://issues.apache.org/jira/browse/HIVE-18288 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Major > Attachments: HIVE-18288.01.patch, HIVE-18288.02.patch > > > For example, mvn test -Dtest=TestCliDriver -Dqfile=orc_merge10.q > now ends up with > {noformat} > 2017-12-15T15:12:30,753 ERROR [7c3ff5b2-285c-44f2-8b13-5c3ccbd41b13 main] > ql.Driver: FAILED: SemanticException > org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/M\ > erge can not be performed on transactional tables > org.apache.hadoop.hive.ql.parse.SemanticException: > org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/Merge can not > be performed on transactional tables > at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTablePartMergeFiles(DDLSemanticAnalyzer.java:2172) > at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:343) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19298) Fix operator tree of CTAS for Druid Storage Handler
[ https://issues.apache.org/jira/browse/HIVE-19298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] slim bouguerra updated HIVE-19298: -- Attachment: HIVE-19298.2.patch > Fix operator tree of CTAS for Druid Storage Handler > --- > > Key: HIVE-19298 > URL: https://issues.apache.org/jira/browse/HIVE-19298 > Project: Hive > Issue Type: Bug > Components: Druid integration >Reporter: slim bouguerra >Assignee: slim bouguerra >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-19298.2.patch, HIVE-19298.patch, HIVE-19298.patch > > > The current operator plan of CTAS for the Druid storage handler is broken when the user > sets the property {{hive.exec.parallel}} to {{true}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19415) Support CORS for all HS2 web endpoints
[ https://issues.apache.org/jira/browse/HIVE-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463179#comment-16463179 ] Sergey Shelukhin commented on HIVE-19415: - +1 > Support CORS for all HS2 web endpoints > -- > > Key: HIVE-19415 > URL: https://issues.apache.org/jira/browse/HIVE-19415 > Project: Hive > Issue Type: Improvement > Components: Web UI >Affects Versions: 3.0.0, 3.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Minor > Attachments: HIVE-19415.1.patch > > > HIVE-19277 changes alone are not sufficient to support CORS. > CrossOriginFilter has to be added to jetty which will serve appropriate > response for OPTIONS pre-flight request. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
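What an "appropriate response" to an OPTIONS pre-flight request generally contains can be sketched with plain Java. This is not Jetty's CrossOriginFilter and not the HIVE-19415 patch: `CorsPreflightSketch` and `preflightHeaders` are hypothetical names, and the allow-lists are assumptions chosen for the example. Only the header names (`Origin`, `Access-Control-Request-Method`, `Access-Control-Allow-Origin`, `Access-Control-Allow-Methods`) are the standard CORS ones.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the decision a CORS filter makes for a pre-flight request. */
final class CorsPreflightSketch {

  /**
   * Builds the response headers for an OPTIONS pre-flight request.
   * An empty map means the request is not approved and no CORS headers are sent.
   *
   * @param origin          value of the request's Origin header, or null
   * @param requestedMethod value of Access-Control-Request-Method, or null
   */
  static Map<String, String> preflightHeaders(
      String origin, String requestedMethod,
      Set<String> allowedOrigins, Set<String> allowedMethods) {
    Map<String, String> headers = new LinkedHashMap<>();
    if (origin == null || requestedMethod == null) {
      return headers; // not a CORS pre-flight request
    }
    boolean originOk = allowedOrigins.contains("*") || allowedOrigins.contains(origin);
    if (!originOk || !allowedMethods.contains(requestedMethod)) {
      return headers;
    }
    headers.put("Access-Control-Allow-Origin", origin);
    headers.put("Access-Control-Allow-Methods", String.join(",", allowedMethods));
    return headers;
  }

  public static void main(String[] args) {
    Set<String> origins = new LinkedHashSet<>(Arrays.asList("http://ui.example.com"));
    Set<String> methods = new LinkedHashSet<>(Arrays.asList("GET", "POST"));
    System.out.println(preflightHeaders("http://ui.example.com", "GET", origins, methods));
    System.out.println(preflightHeaders("http://other.example.com", "GET", origins, methods).isEmpty());
  }
}
```

A servlet filter registered on the web server would apply this kind of check to every endpoint, which is why the change has to be made at the Jetty level rather than per endpoint.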
[jira] [Commented] (HIVE-19415) Support CORS for all HS2 web endpoints
[ https://issues.apache.org/jira/browse/HIVE-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463178#comment-16463178 ] Prasanth Jayachandran commented on HIVE-19415: -- [~sershe]/[~gopalv] can someone please review this patch? It is a small patch. RB is acting weird and doesn't respond to "rbt post" or manual upload. > Support CORS for all HS2 web endpoints > -- > > Key: HIVE-19415 > URL: https://issues.apache.org/jira/browse/HIVE-19415 > Project: Hive > Issue Type: Improvement > Components: Web UI >Affects Versions: 3.0.0, 3.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Minor > Attachments: HIVE-19415.1.patch > > > HIVE-19277 changes alone are not sufficient to support CORS. > CrossOriginFilter has to be added to jetty which will serve appropriate > response for OPTIONS pre-flight request. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19415) Support CORS for all HS2 web endpoints
[ https://issues.apache.org/jira/browse/HIVE-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-19415: - Priority: Minor (was: Major) > Support CORS for all HS2 web endpoints > -- > > Key: HIVE-19415 > URL: https://issues.apache.org/jira/browse/HIVE-19415 > Project: Hive > Issue Type: Improvement > Components: Web UI >Affects Versions: 3.0.0, 3.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Minor > Attachments: HIVE-19415.1.patch > > > HIVE-19277 changes alone are not sufficient to support CORS. > CrossOriginFilter has to be added to jetty which will serve appropriate > response for OPTIONS pre-flight request. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19415) Support CORS for all HS2 web endpoints
[ https://issues.apache.org/jira/browse/HIVE-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-19415: - Status: Patch Available (was: Open) > Support CORS for all HS2 web endpoints > -- > > Key: HIVE-19415 > URL: https://issues.apache.org/jira/browse/HIVE-19415 > Project: Hive > Issue Type: Improvement > Components: Web UI >Affects Versions: 3.0.0, 3.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Major > Attachments: HIVE-19415.1.patch > > > HIVE-19277 changes alone are not sufficient to support CORS. > CrossOriginFilter has to be added to jetty which will serve appropriate > response for OPTIONS pre-flight request. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19415) Support CORS for all HS2 web endpoints
[ https://issues.apache.org/jira/browse/HIVE-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-19415: - Attachment: HIVE-19415.1.patch > Support CORS for all HS2 web endpoints > -- > > Key: HIVE-19415 > URL: https://issues.apache.org/jira/browse/HIVE-19415 > Project: Hive > Issue Type: Improvement > Components: Web UI >Affects Versions: 3.0.0, 3.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Major > Attachments: HIVE-19415.1.patch > > > HIVE-19277 changes alone are not sufficient to support CORS. > CrossOriginFilter has to be added to jetty which will serve appropriate > response for OPTIONS pre-flight request. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default
[ https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463171#comment-16463171 ] Daniel Dai commented on HIVE-19389: --- +1 > Schematool: For Hive's Information Schema, use embedded HS2 as default > -- > > Key: HIVE-19389 > URL: https://issues.apache.org/jira/browse/HIVE-19389 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0, 3.1.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta >Priority: Major > Attachments: HIVE-19389.1.patch, HIVE-19389.2.patch > > > Currently, for initializing/upgrading Hive's information schema, we require a > full jdbc url (for HS2). It will be good to have it connect using embedded > HS2 by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19118) Vectorization: Turning on vectorization in escape_crlf produces wrong results
[ https://issues.apache.org/jira/browse/HIVE-19118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463174#comment-16463174 ] Hive QA commented on HIVE-19118:
| (/) *+1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 54s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 28s | master passed |
| +1 | compile | 7m 1s | master passed |
| +1 | checkstyle | 3m 7s | master passed |
| +1 | javadoc | 8m 5s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 | mvninstall | 9m 1s | the patch passed |
| +1 | compile | 7m 20s | the patch passed |
| +1 | javac | 7m 20s | the patch passed |
| +1 | checkstyle | 3m 5s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 7m 56s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 55m 56s | |
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10662/dev-support/hive-personality.sh |
| git revision | master / 70d835b |
| Default Java | 1.8.0_111 |
| modules | C: serde . ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10662/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |
This message was automatically generated.
> Vectorization: Turning on vectorization in escape_crlf produces wrong results > - > > Key: HIVE-19118 > URL: https://issues.apache.org/jira/browse/HIVE-19118 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Haifeng Chen >Priority: Critical > Attachments: HIVE-19118.01.patch, HIVE-19118.02.patch, > HIVE-19118.03.patch, HIVE-19118.04.patch, HIVE-19118.05.patch > > > Found in vectorization enable by default experiment. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default
[ https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463170#comment-16463170 ] Vaibhav Gumashta commented on HIVE-19389: - [~daijy] Updated > Schematool: For Hive's Information Schema, use embedded HS2 as default > -- > > Key: HIVE-19389 > URL: https://issues.apache.org/jira/browse/HIVE-19389 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0, 3.1.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta >Priority: Major > Attachments: HIVE-19389.1.patch, HIVE-19389.2.patch > > > Currently, for initializing/upgrading Hive's information schema, we require a > full jdbc url (for HS2). It will be good to have it connect using embedded > HS2 by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-19415) Support CORS for all HS2 web endpoints
[ https://issues.apache.org/jira/browse/HIVE-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-19415: > Support CORS for all HS2 web endpoints > -- > > Key: HIVE-19415 > URL: https://issues.apache.org/jira/browse/HIVE-19415 > Project: Hive > Issue Type: Improvement > Components: Web UI >Affects Versions: 3.0.0, 3.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Major > > HIVE-19277 changes alone are not sufficient to support CORS. > CrossOriginFilter has to be added to jetty which will serve appropriate > response for OPTIONS pre-flight request. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
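The pre-flight exchange mentioned in HIVE-19415 can be illustrated with a small, framework-free sketch. This is not Jetty's CrossOriginFilter and not the HS2 implementation; the function name `preflight_headers`, the allow-list, and the method list are hypothetical, and echoing back the requested headers is a simplification. It only shows which response headers a server would typically return for an OPTIONS pre-flight request.

```python
# Hypothetical sketch of CORS pre-flight handling; not Jetty's CrossOriginFilter.
# The allow-list and method list below are assumptions made for illustration.

ALLOWED_ORIGINS = {"http://example.com"}
ALLOWED_METHODS = ("GET", "POST", "HEAD")


def preflight_headers(method, request_headers):
    """Return the CORS response headers for a request, or {} if none apply."""
    origin = request_headers.get("Origin")
    if origin not in ALLOWED_ORIGINS:
        return {}  # unknown origin: emit no CORS headers

    headers = {"Access-Control-Allow-Origin": origin}

    # A pre-flight is an OPTIONS request carrying Access-Control-Request-Method.
    requested = request_headers.get("Access-Control-Request-Method")
    if method == "OPTIONS" and requested is not None:
        if requested not in ALLOWED_METHODS:
            return {}  # requested method is not permitted
        headers["Access-Control-Allow-Methods"] = ", ".join(ALLOWED_METHODS)
        asked = request_headers.get("Access-Control-Request-Headers")
        if asked:
            # Simplification: echo back whatever headers the client asked for.
            headers["Access-Control-Allow-Headers"] = asked
    return headers
```

Without the pre-flight branch a browser would reject a cross-origin request before sending it, which appears to be the gap the issue describes.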
[jira] [Updated] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default
[ https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-19389: Attachment: HIVE-19389.2.patch > Schematool: For Hive's Information Schema, use embedded HS2 as default > -- > > Key: HIVE-19389 > URL: https://issues.apache.org/jira/browse/HIVE-19389 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0, 3.1.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta >Priority: Major > Attachments: HIVE-19389.1.patch, HIVE-19389.2.patch > > > Currently, for initializing/upgrading Hive's information schema, we require a > full jdbc url (for HS2). It will be good to have it connect using embedded > HS2 by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18288) merge/concat not supported on Acid table
[ https://issues.apache.org/jira/browse/HIVE-18288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-18288: -- Attachment: HIVE-18288.02.patch > merge/concat not supported on Acid table > > > Key: HIVE-18288 > URL: https://issues.apache.org/jira/browse/HIVE-18288 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Major > Attachments: HIVE-18288.01.patch, HIVE-18288.02.patch > > > For example, mvn test -Dtest=TestCliDriver -Dqfile=orc_merge10.q > now ends up with > {noformat} > 2017-12-15T15:12:30,753 ERROR [7c3ff5b2-285c-44f2-8b13-5c3ccbd41b13 main] > ql.Driver: FAILED: SemanticException > org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/M\ > erge can not be performed on transactional tables > org.apache.hadoop.hive.ql.parse.SemanticException: > org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/Merge can not > be performed on transactional tables > at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTablePartMergeFiles(DDLSemanticAnalyzer.java:2172) > at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:343) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18288) merge/concat not supported on Acid table
[ https://issues.apache.org/jira/browse/HIVE-18288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-18288: -- Attachment: (was: HIVE-18288.02.patch) > merge/concat not supported on Acid table > > > Key: HIVE-18288 > URL: https://issues.apache.org/jira/browse/HIVE-18288 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Major > Attachments: HIVE-18288.01.patch, HIVE-18288.02.patch > > > For example, mvn test -Dtest=TestCliDriver -Dqfile=orc_merge10.q > now ends up with > {noformat} > 2017-12-15T15:12:30,753 ERROR [7c3ff5b2-285c-44f2-8b13-5c3ccbd41b13 main] > ql.Driver: FAILED: SemanticException > org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/M\ > erge can not be performed on transactional tables > org.apache.hadoop.hive.ql.parse.SemanticException: > org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/Merge can not > be performed on transactional tables > at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTablePartMergeFiles(DDLSemanticAnalyzer.java:2172) > at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:343) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18288) merge/concat not supported on Acid table
[ https://issues.apache.org/jira/browse/HIVE-18288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-18288: -- Attachment: HIVE-18288.02.patch > merge/concat not supported on Acid table > > > Key: HIVE-18288 > URL: https://issues.apache.org/jira/browse/HIVE-18288 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Major > Attachments: HIVE-18288.01.patch, HIVE-18288.02.patch > > > For example, mvn test -Dtest=TestCliDriver -Dqfile=orc_merge10.q > now ends up with > {noformat} > 2017-12-15T15:12:30,753 ERROR [7c3ff5b2-285c-44f2-8b13-5c3ccbd41b13 main] > ql.Driver: FAILED: SemanticException > org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/M\ > erge can not be performed on transactional tables > org.apache.hadoop.hive.ql.parse.SemanticException: > org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/Merge can not > be performed on transactional tables > at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTablePartMergeFiles(DDLSemanticAnalyzer.java:2172) > at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:343) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18288) merge/concat not supported on Acid table
[ https://issues.apache.org/jira/browse/HIVE-18288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-18288: -- Status: Patch Available (was: Open) > merge/concat not supported on Acid table > > > Key: HIVE-18288 > URL: https://issues.apache.org/jira/browse/HIVE-18288 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Major > Attachments: HIVE-18288.01.patch, HIVE-18288.02.patch > > > For example, mvn test -Dtest=TestCliDriver -Dqfile=orc_merge10.q > now ends up with > {noformat} > 2017-12-15T15:12:30,753 ERROR [7c3ff5b2-285c-44f2-8b13-5c3ccbd41b13 main] > ql.Driver: FAILED: SemanticException > org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/M\ > erge can not be performed on transactional tables > org.apache.hadoop.hive.ql.parse.SemanticException: > org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/Merge can not > be performed on transactional tables > at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTablePartMergeFiles(DDLSemanticAnalyzer.java:2172) > at > org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:343) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19394) WM_TRIGGER trigger creation failed with type cast from Integer to Boolean
[ https://issues.apache.org/jira/browse/HIVE-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463123#comment-16463123 ] Thai Bui commented on HIVE-19394: - (y) Awesome! > WM_TRIGGER trigger creation failed with type cast from Integer to Boolean > -- > > Key: HIVE-19394 > URL: https://issues.apache.org/jira/browse/HIVE-19394 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Thai Bui >Assignee: Thai Bui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-19394.patch, HIVE-19394.patch > > > During testing of the new WM feature and the Hive metastore is created using > Postgresql, I've discovered a bug when creating a new trigger. For example > {noformat} > CREATE RESOURCE PLAN plan_1 WITH QUERY_PARALLELISM=4; > CREATE POOL plan_1.slow WITH >ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fair'; > ALTER POOL plan_1.default SET >ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fifo'; > CREATE TRIGGER plan_1.trigger_1 WHEN S3A_BYTES_READ > 268435456 DO MOVE TO > slow; > {noformat} > Right at the CREATE TRIGGER statement, an error will occur > {noformat} > Error while processing statement: FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Insert of > object "org.apache.hadoop.hive.metastore.model.MWMTrigger@5c5ae5d8" using > statement "INSERT INTO "WM_TRIGGER" > ("TRIGGER_ID","ACTION_EXPRESSION","IS_IN_UNMANAGED","NAME","RP_ID","TRIGGER_EXPRESSION") > VALUES (?,?,?,?,?,?)" failed : ERROR: column "IS_IN_UNMANAGED" is of type > boolean but expression is of type integer Hint: You will need to rewrite or > cast the expression. Position: 129) > at > org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543) > ~[datanucleus-api-jdo-4.2.4.jar:?] > at > org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:729) > ~[datanucleus-api-jdo-4.2.4.jar:?] 
> at > org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749) > ~[datanucleus-api-jdo-4.2.4.jar:?] > at > org.apache.hadoop.hive.metastore.ObjectStore.createWMTrigger(ObjectStore.java:11218) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at com.sun.proxy.$Proxy37.createWMTrigger(Unknown Source) ~[?:?] 
> at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_wm_trigger(HiveMetaStore.java:7846) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at com.sun.proxy.$Proxy39.create_wm_trigger(Unknown Source) ~[?:?] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createWMTrigger(HiveMetaStoreClient.java:3062) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at >
[jira] [Resolved] (HIVE-19394) WM_TRIGGER trigger creation failed with type cast from Integer to Boolean
[ https://issues.apache.org/jira/browse/HIVE-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-19394. - Resolution: Fixed > WM_TRIGGER trigger creation failed with type cast from Integer to Boolean > -- > > Key: HIVE-19394 > URL: https://issues.apache.org/jira/browse/HIVE-19394 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Thai Bui >Assignee: Thai Bui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-19394.patch, HIVE-19394.patch > > > During testing of the new WM feature and the Hive metastore is created using > Postgresql, I've discovered a bug when creating a new trigger. For example > {noformat} > CREATE RESOURCE PLAN plan_1 WITH QUERY_PARALLELISM=4; > CREATE POOL plan_1.slow WITH >ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fair'; > ALTER POOL plan_1.default SET >ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fifo'; > CREATE TRIGGER plan_1.trigger_1 WHEN S3A_BYTES_READ > 268435456 DO MOVE TO > slow; > {noformat} > Right at the CREATE TRIGGER statement, an error will occur > {noformat} > Error while processing statement: FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Insert of > object "org.apache.hadoop.hive.metastore.model.MWMTrigger@5c5ae5d8" using > statement "INSERT INTO "WM_TRIGGER" > ("TRIGGER_ID","ACTION_EXPRESSION","IS_IN_UNMANAGED","NAME","RP_ID","TRIGGER_EXPRESSION") > VALUES (?,?,?,?,?,?)" failed : ERROR: column "IS_IN_UNMANAGED" is of type > boolean but expression is of type integer Hint: You will need to rewrite or > cast the expression. Position: 129) > at > org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543) > ~[datanucleus-api-jdo-4.2.4.jar:?] > at > org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:729) > ~[datanucleus-api-jdo-4.2.4.jar:?] 
> at > org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749) > ~[datanucleus-api-jdo-4.2.4.jar:?] > at > org.apache.hadoop.hive.metastore.ObjectStore.createWMTrigger(ObjectStore.java:11218) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at com.sun.proxy.$Proxy37.createWMTrigger(Unknown Source) ~[?:?] 
> at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_wm_trigger(HiveMetaStore.java:7846) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at com.sun.proxy.$Proxy39.create_wm_trigger(Unknown Source) ~[?:?] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createWMTrigger(HiveMetaStoreClient.java:3062) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at >
[jira] [Commented] (HIVE-19394) WM_TRIGGER trigger creation failed with type cast from Integer to Boolean
[ https://issues.apache.org/jira/browse/HIVE-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463119#comment-16463119 ] Sergey Shelukhin commented on HIVE-19394: - Missed that. Let me update in place. > WM_TRIGGER trigger creation failed with type cast from Integer to Boolean > -- > > Key: HIVE-19394 > URL: https://issues.apache.org/jira/browse/HIVE-19394 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Thai Bui >Assignee: Thai Bui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-19394.patch, HIVE-19394.patch > > > During testing of the new WM feature and the Hive metastore is created using > Postgresql, I've discovered a bug when creating a new trigger. For example > {noformat} > CREATE RESOURCE PLAN plan_1 WITH QUERY_PARALLELISM=4; > CREATE POOL plan_1.slow WITH >ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fair'; > ALTER POOL plan_1.default SET >ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fifo'; > CREATE TRIGGER plan_1.trigger_1 WHEN S3A_BYTES_READ > 268435456 DO MOVE TO > slow; > {noformat} > Right at the CREATE TRIGGER statement, an error will occur > {noformat} > Error while processing statement: FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Insert of > object "org.apache.hadoop.hive.metastore.model.MWMTrigger@5c5ae5d8" using > statement "INSERT INTO "WM_TRIGGER" > ("TRIGGER_ID","ACTION_EXPRESSION","IS_IN_UNMANAGED","NAME","RP_ID","TRIGGER_EXPRESSION") > VALUES (?,?,?,?,?,?)" failed : ERROR: column "IS_IN_UNMANAGED" is of type > boolean but expression is of type integer Hint: You will need to rewrite or > cast the expression. Position: 129) > at > org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543) > ~[datanucleus-api-jdo-4.2.4.jar:?] 
> at > org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:729) > ~[datanucleus-api-jdo-4.2.4.jar:?] > at > org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749) > ~[datanucleus-api-jdo-4.2.4.jar:?] > at > org.apache.hadoop.hive.metastore.ObjectStore.createWMTrigger(ObjectStore.java:11218) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at com.sun.proxy.$Proxy37.createWMTrigger(Unknown Source) ~[?:?] 
> at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_wm_trigger(HiveMetaStore.java:7846) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at com.sun.proxy.$Proxy39.create_wm_trigger(Unknown Source) ~[?:?] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createWMTrigger(HiveMetaStoreClient.java:3062) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at >
[jira] [Reopened] (HIVE-19394) WM_TRIGGER trigger creation failed with type cast from Integer to Boolean
[ https://issues.apache.org/jira/browse/HIVE-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reopened HIVE-19394: - > WM_TRIGGER trigger creation failed with type cast from Integer to Boolean > -- > > Key: HIVE-19394 > URL: https://issues.apache.org/jira/browse/HIVE-19394 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Thai Bui >Assignee: Thai Bui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-19394.patch, HIVE-19394.patch > > > During testing of the new WM feature and the Hive metastore is created using > Postgresql, I've discovered a bug when creating a new trigger. For example > {noformat} > CREATE RESOURCE PLAN plan_1 WITH QUERY_PARALLELISM=4; > CREATE POOL plan_1.slow WITH >ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fair'; > ALTER POOL plan_1.default SET >ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fifo'; > CREATE TRIGGER plan_1.trigger_1 WHEN S3A_BYTES_READ > 268435456 DO MOVE TO > slow; > {noformat} > Right at the CREATE TRIGGER statement, an error will occur > {noformat} > Error while processing statement: FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Insert of > object "org.apache.hadoop.hive.metastore.model.MWMTrigger@5c5ae5d8" using > statement "INSERT INTO "WM_TRIGGER" > ("TRIGGER_ID","ACTION_EXPRESSION","IS_IN_UNMANAGED","NAME","RP_ID","TRIGGER_EXPRESSION") > VALUES (?,?,?,?,?,?)" failed : ERROR: column "IS_IN_UNMANAGED" is of type > boolean but expression is of type integer Hint: You will need to rewrite or > cast the expression. Position: 129) > at > org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543) > ~[datanucleus-api-jdo-4.2.4.jar:?] > at > org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:729) > ~[datanucleus-api-jdo-4.2.4.jar:?] 
> at > org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749) > ~[datanucleus-api-jdo-4.2.4.jar:?] > at > org.apache.hadoop.hive.metastore.ObjectStore.createWMTrigger(ObjectStore.java:11218) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at com.sun.proxy.$Proxy37.createWMTrigger(Unknown Source) ~[?:?] 
> at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_wm_trigger(HiveMetaStore.java:7846) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at com.sun.proxy.$Proxy39.create_wm_trigger(Unknown Source) ~[?:?] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createWMTrigger(HiveMetaStoreClient.java:3062) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >
[jira] [Assigned] (HIVE-19414) Cover partitioned table stats update/retrieve cases
[ https://issues.apache.org/jira/browse/HIVE-19414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-19414: - Assignee: Steve Yeom > Cover partitioned table stats update/retrieve cases > > > Key: HIVE-19414 > URL: https://issues.apache.org/jira/browse/HIVE-19414 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19394) WM_TRIGGER trigger creation failed with type cast from Integer to Boolean
[ https://issues.apache.org/jira/browse/HIVE-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463117#comment-16463117 ] Thai Bui commented on HIVE-19394: - Thanks [~sershe]. I noticed that the upgrade script has the type defined as {noformat} "IS_IN_UNMANAGED" smallint NOT NULL DEFAULT false{noformat} I just did a quick test in Postgres and that will cause an error because of `DEFAULT false`. It needs to be `DEFAULT 0`. {noformat} create table test (active smallint not null default false); ERROR: column "active" is of type smallint but default expression is of type boolean HINT: You will need to rewrite or cast the expression.{noformat} Can we just push a follow-up commit to fix it, or do we have to create a new Jira and everything? :) > WM_TRIGGER trigger creation failed with type cast from Integer to Boolean > -- > > Key: HIVE-19394 > URL: https://issues.apache.org/jira/browse/HIVE-19394 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Thai Bui >Assignee: Thai Bui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-19394.patch, HIVE-19394.patch > > > During testing of the new WM feature and the Hive metastore is created using > Postgresql, I've discovered a bug when creating a new trigger. For example > {noformat} > CREATE RESOURCE PLAN plan_1 WITH QUERY_PARALLELISM=4; > CREATE POOL plan_1.slow WITH >ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fair'; > ALTER POOL plan_1.default SET >ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fifo'; > CREATE TRIGGER plan_1.trigger_1 WHEN S3A_BYTES_READ > 268435456 DO MOVE TO > slow; > {noformat} > Right at the CREATE TRIGGER statement, an error will occur > {noformat} > Error while processing statement: FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. 
MetaException(message:Insert of > object "org.apache.hadoop.hive.metastore.model.MWMTrigger@5c5ae5d8" using > statement "INSERT INTO "WM_TRIGGER" > ("TRIGGER_ID","ACTION_EXPRESSION","IS_IN_UNMANAGED","NAME","RP_ID","TRIGGER_EXPRESSION") > VALUES (?,?,?,?,?,?)" failed : ERROR: column "IS_IN_UNMANAGED" is of type > boolean but expression is of type integer Hint: You will need to rewrite or > cast the expression. Position: 129) > at > org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543) > ~[datanucleus-api-jdo-4.2.4.jar:?] > at > org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:729) > ~[datanucleus-api-jdo-4.2.4.jar:?] > at > org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749) > ~[datanucleus-api-jdo-4.2.4.jar:?] > at > org.apache.hadoop.hive.metastore.ObjectStore.createWMTrigger(ObjectStore.java:11218) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at 
com.sun.proxy.$Proxy37.createWMTrigger(Unknown Source) ~[?:?] > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_wm_trigger(HiveMetaStore.java:7846) > ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT] > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > ~[?:1.8.0_151] > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > ~[?:1.8.0_151] > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:1.8.0_151] > at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151] > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) >
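The `DEFAULT false` versus `DEFAULT 0` point raised in the comment above can be sketched as a small text rewrite. This is only an illustration under stated assumptions: `fix_boolean_default` is a hypothetical helper, not part of Hive or its schema scripts, and it handles a single column definition rather than a full DDL statement. It replaces a boolean default literal with 0 or 1 when the column type is an integer type, which is the change the comment asks for.

```python
# Hypothetical helper (not part of Hive): rewrite a boolean DEFAULT literal
# to 0/1 when the column is declared with an integer type.
# Assumption: the input is ONE column definition, not a whole CREATE TABLE.
import re

_BOOL_TO_INT = {"false": "0", "true": "1"}

# group 1: integer type, group 2: everything up to and including DEFAULT,
# group 3: the boolean literal to replace
_PATTERN = re.compile(
    r"\b(smallint|integer|bigint)\b(.*?\bDEFAULT\s+)(true|false)\b",
    re.IGNORECASE,
)


def fix_boolean_default(column_ddl):
    """Return the column definition with a type-compatible integer default."""
    def repl(match):
        literal = _BOOL_TO_INT[match.group(3).lower()]
        return match.group(1) + match.group(2) + literal

    return _PATTERN.sub(repl, column_ddl)


print(fix_boolean_default('"IS_IN_UNMANAGED" smallint NOT NULL DEFAULT false'))
```

Columns declared as boolean are left untouched, since a boolean default is valid for them.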
[jira] [Commented] (HIVE-19358) CBO decorrelation logic should generate Hive operators
[ https://issues.apache.org/jira/browse/HIVE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463116#comment-16463116 ] Hive QA commented on HIVE-19358: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12921694/fix.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 46 failed/errored test(s), 14316 tests executed *Failed tests:* {noformat} TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed out) (batchId=247) TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed out) (batchId=247) TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217) TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed out) (batchId=247) TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] (batchId=253) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_1] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_2] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative3] (batchId=29) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=81) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13] (batchId=32) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_7] (batchId=27) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4] (batchId=164) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_smb] (batchId=176) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=163) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=160) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=167) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=105) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insertsel_fail] (batchId=95) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_data_parquet_empty] (batchId=96) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] (batchId=96) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] (batchId=96) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_1] (batchId=137) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_2] (batchId=133) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin_negative3] (batchId=120) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_1_23] (batchId=143) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=111) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_13] (batchId=122) org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228) org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 (batchId=228) org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232) org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate (batchId=231) org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys (batchId=231) org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=235) org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239) org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 (batchId=241) 
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime (batchId=241) org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerVertexRawInputSplitsNoKill (batchId=241) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/10661/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10661/console Test logs:
[jira] [Commented] (HIVE-17824) msck repair table should drop the missing partitions from metastore
[ https://issues.apache.org/jira/browse/HIVE-17824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463113#comment-16463113 ] Vihang Karajgaonkar commented on HIVE-17824: Test failures are unrelated. Patch merged in branch-2 as well. Thank you for your contribution [~janulatha] and patience with the ptest issues :) > msck repair table should drop the missing partitions from metastore > --- > > Key: HIVE-17824 > URL: https://issues.apache.org/jira/browse/HIVE-17824 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Janaki Lahorani >Priority: Major > Fix For: 3.0.0, 2.4.0, 3.1.0 > > Attachments: HIVE-17824-branch-2.01.patch, > HIVE-17824.01-branch-2.patch, HIVE-17824.01-branch-2.patch, > HIVE-17824.01-branch-2.patch, HIVE-17824.1.patch, HIVE-17824.2.patch, > HIVE-17824.3.patch, HIVE-17824.4.patch > > > {{msck repair table }} is often used in environments where the new > partitions are loaded as directories on HDFS or S3 and users want to create > the missing partitions in bulk. However, currently it only supports addition > of missing partitions. If there are any partitions which are present in > metastore but not on the FileSystem, it should also delete them so that it > truly repairs the table metadata. > We should be careful not to break backwards compatibility so we should either > introduce a new config or keyword to add support to delete unnecessary > partitions from the metastore. This way users who want the old behavior can > easily turn it off. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-17824) msck repair table should drop the missing partitions from metastore
[ https://issues.apache.org/jira/browse/HIVE-17824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17824: --- Resolution: Fixed Fix Version/s: 2.4.0 Status: Resolved (was: Patch Available) > msck repair table should drop the missing partitions from metastore > --- > > Key: HIVE-17824 > URL: https://issues.apache.org/jira/browse/HIVE-17824 > Project: Hive > Issue Type: Improvement >Reporter: Vihang Karajgaonkar >Assignee: Janaki Lahorani >Priority: Major > Fix For: 3.0.0, 2.4.0, 3.1.0 > > Attachments: HIVE-17824-branch-2.01.patch, > HIVE-17824.01-branch-2.patch, HIVE-17824.01-branch-2.patch, > HIVE-17824.01-branch-2.patch, HIVE-17824.1.patch, HIVE-17824.2.patch, > HIVE-17824.3.patch, HIVE-17824.4.patch > > > {{msck repair table }} is often used in environments where the new > partitions are loaded as directories on HDFS or S3 and users want to create > the missing partitions in bulk. However, currently it only supports addition > of missing partitions. If there are any partitions which are present in > metastore but not on the FileSystem, it should also delete them so that it > truly repairs the table metadata. > We should be careful not to break backwards compatibility so we should either > introduce a new config or keyword to add support to delete unnecessary > partitions from the metastore. This way users who want the old behavior can > easily turn it off. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
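The repair described in HIVE-17824 amounts to two set differences between the partitions registered in the metastore and the partition directories present on the file system. The following is a minimal sketch of that idea only, not the code from the attached patches; the class name, method names, and partition strings are hypothetical and chosen for illustration.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of the add/drop decision; not Hive's implementation.
public class MsckRepairSketch {

    /** Partitions found on the file system but missing from the metastore. */
    public static Set<String> toAdd(Set<String> inMetastore, Set<String> onFileSystem) {
        Set<String> result = new TreeSet<>(onFileSystem);
        result.removeAll(inMetastore);
        return result;
    }

    /** Partitions registered in the metastore whose directories no longer exist. */
    public static Set<String> toDrop(Set<String> inMetastore, Set<String> onFileSystem) {
        Set<String> result = new TreeSet<>(inMetastore);
        result.removeAll(onFileSystem);
        return result;
    }

    public static void main(String[] args) {
        Set<String> metastore = new TreeSet<>(Arrays.asList("dt=2018-05-01", "dt=2018-05-02"));
        Set<String> fileSystem = new TreeSet<>(Arrays.asList("dt=2018-05-02", "dt=2018-05-03"));
        System.out.println("add:  " + toAdd(metastore, fileSystem));
        System.out.println("drop: " + toDrop(metastore, fileSystem));
    }
}
```

Keeping the two differences separate mirrors the backwards-compatibility concern in the description: the old behavior corresponds to applying only the "add" set, while the "drop" set would be applied only when the user opts in.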
[jira] [Assigned] (HIVE-19413) Add/use writeId and validWriteIdList during update and for retrieve.
[ https://issues.apache.org/jira/browse/HIVE-19413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-19413: - > Add/use writeId and validWriteIdList during update and for retrieve. > > > Key: HIVE-19413 > URL: https://issues.apache.org/jira/browse/HIVE-19413 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 3.0.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default
[ https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463111#comment-16463111 ] Daniel Dai commented on HIVE-19389: --- Actually the best way to check if it is initializing information schema is dbType='hive'. > Schematool: For Hive's Information Schema, use embedded HS2 as default > -- > > Key: HIVE-19389 > URL: https://issues.apache.org/jira/browse/HIVE-19389 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0, 3.1.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta >Priority: Major > Attachments: HIVE-19389.1.patch > > > Currently, for initializing/upgrading Hive's information schema, we require a > full jdbc url (for HS2). It will be good to have it connect using embedded > HS2 by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Issue Comment Deleted] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default
[ https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-19389: -- Comment: was deleted (was: +1) > Schematool: For Hive's Information Schema, use embedded HS2 as default > -- > > Key: HIVE-19389 > URL: https://issues.apache.org/jira/browse/HIVE-19389 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0, 3.1.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta >Priority: Major > Attachments: HIVE-19389.1.patch > > > Currently, for initializing/upgrading Hive's information schema, we require a > full jdbc url (for HS2). It will be good to have it connect using embedded > HS2 by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields
[ https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463109#comment-16463109 ] Vihang Karajgaonkar commented on HIVE-19041: The test did not show comments as wasting memory, but in my testing we do not add comments to the column fields. We would need actual dumps from users to confirm that theory, and we don't have any for this issue in particular. Even so, I think we can intern the comments in the partition objects, since they will be the same for all the partitions. > Thrift deserialization of Partition objects should intern fields > > > Key: HIVE-19041 > URL: https://issues.apache.org/jira/browse/HIVE-19041 > Project: Hive > Issue Type: Improvement > Components: Metastore >Affects Versions: 3.0.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Major > Attachments: HIVE-19041.01.patch > > > When a client is creating a large number of partitions, the thrift objects are > deserialized into Partition objects. The read method of these objects does > not intern the inputformat, location, and outputformat, which causes a large number > of duplicate Strings in the HMS memory. We should intern these objects during > deserialization to reduce memory pressure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default
[ https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463108#comment-16463108 ] Daniel Dai commented on HIVE-19389: --- +1 > Schematool: For Hive's Information Schema, use embedded HS2 as default > -- > > Key: HIVE-19389 > URL: https://issues.apache.org/jira/browse/HIVE-19389 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0, 3.1.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta >Priority: Major > Attachments: HIVE-19389.1.patch > > > Currently, for initializing/upgrading Hive's information schema, we require a > full jdbc url (for HS2). It will be good to have it connect using embedded > HS2 by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default
[ https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-19389: Status: Patch Available (was: Open) > Schematool: For Hive's Information Schema, use embedded HS2 as default > -- > > Key: HIVE-19389 > URL: https://issues.apache.org/jira/browse/HIVE-19389 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0, 3.1.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta >Priority: Major > Attachments: HIVE-19389.1.patch > > > Currently, for initializing/upgrading Hive's information schema, we require a > full jdbc url (for HS2). It will be good to have it connect using embedded > HS2 by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default
[ https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-19389: Attachment: HIVE-19389.1.patch > Schematool: For Hive's Information Schema, use embedded HS2 as default > -- > > Key: HIVE-19389 > URL: https://issues.apache.org/jira/browse/HIVE-19389 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0, 3.1.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta >Priority: Major > Attachments: HIVE-19389.1.patch > > > Currently, for initializing/upgrading Hive's information schema, we require a > full jdbc url (for HS2). It will be good to have it connect using embedded > HS2 by default. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18395) Using stats for aggregate query on Acid/MM is off even with "hive.compute.query.using.stats" is true.
[ https://issues.apache.org/jira/browse/HIVE-18395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463077#comment-16463077 ] Steve Yeom commented on HIVE-18395: --- In summary, the comments from Sergey will be applied to the Design google doc soon if not yet added. Also I have created a jira about the issue Sergey mentioned at 4). > Using stats for aggregate query on Acid/MM is off even with > "hive.compute.query.using.stats" is true. > - > > Key: HIVE-18395 > URL: https://issues.apache.org/jira/browse/HIVE-18395 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Attachments: HIVE-18395.01.preview > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19411) Full-ACID table stats may not be valid
[ https://issues.apache.org/jira/browse/HIVE-19411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19411: -- Description: One case is that, per Sergey, updating a row can end up counted as +2 rows instead of +0, since the update is translated to a delete and an insert and the physical writer may just add up the number of operations. was: E.g., per Sergey, updating a row can end up counted as +2 rows instead of +0, since the update is translated to a delete and an insert and the physical writer may just add up the number of operations. > Full-ACID table stats may not be valid > -- > > Key: HIVE-19411 > URL: https://issues.apache.org/jira/browse/HIVE-19411 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Priority: Major > Fix For: 0.10.1 > > > One case is that, per Sergey, updating a row can end up counted as +2 rows instead of > +0, > since the update is translated to a delete and an insert and the physical writer > may just add up the number of operations. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HIVE-19411) Full-ACID table stats may not be valid
[ https://issues.apache.org/jira/browse/HIVE-19411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom reassigned HIVE-19411: - Assignee: Steve Yeom > Full-ACID table stats may not be valid > -- > > Key: HIVE-19411 > URL: https://issues.apache.org/jira/browse/HIVE-19411 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Fix For: 0.10.1 > > > One case is that, per Sergey, updating a row can end up counted as +2 rows instead of > +0, > since the update is translated to a delete and an insert and the physical writer > may just add up the number of operations. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19411) Full-ACID table stats may not be valid
[ https://issues.apache.org/jira/browse/HIVE-19411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Yeom updated HIVE-19411: -- Summary: Full-ACID table stats may not be valid (was: Full-ACID table stats may not be valid. E.g., delete/insert) > Full-ACID table stats may not be valid > -- > > Key: HIVE-19411 > URL: https://issues.apache.org/jira/browse/HIVE-19411 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Priority: Major > Fix For: 0.10.1 > > > E.g., per Sergey, updating a row can end up counted as +2 rows instead of +0, > since the update is translated to a delete and an insert and the physical writer > may just add up the number of operations. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
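The row-count drift described in HIVE-19411 can be illustrated with a small sketch. It assumes, as the description suggests, that an update is written as one delete plus one insert; the class, enum, and method names are hypothetical and do not correspond to Hive's stats code.

```java
// Hypothetical illustration of the stats drift; not Hive's writer or stats code.
public class AcidRowCountSketch {

    /** Physical operations that an ACID writer might record. */
    public enum Op { INSERT, DELETE }

    /** Naive counter: every physical operation is counted as one added row. */
    public static long naiveDelta(Op... ops) {
        return ops.length;
    }

    /** Net counter: an insert adds a row, a delete removes one. */
    public static long netDelta(Op... ops) {
        long delta = 0;
        for (Op op : ops) {
            delta += (op == Op.INSERT) ? 1 : -1;
        }
        return delta;
    }

    public static void main(String[] args) {
        // A single-row UPDATE, assumed here to be translated to DELETE + INSERT.
        Op[] update = {Op.DELETE, Op.INSERT};
        System.out.println("naive: +" + naiveDelta(update)); // prints "naive: +2"
        System.out.println("net:   +" + netDelta(update));   // prints "net:   +0"
    }
}
```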
[jira] [Commented] (HIVE-6589) Automatically add partitions for external tables
[ https://issues.apache.org/jira/browse/HIVE-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463058#comment-16463058 ] Steve Hoffman commented on HIVE-6589: - I agree that `MSCK REPAIR TABLE your_table_name;` will fix up the table and even remove expired partitions, but it still takes time to scan and if you have a new partition every hour, you still run a job every hour to add said partition. For example, in AWS using athena this repair table command takes 800 seconds to run which will only get longer as the partitions grow and eventually take more than an hour which means the hourly job will start a new one before the old one finishes – madness. The point of this ticket isn't to come up with simpler ways to insert records into an internal metadata store, but to NOT insert individual records because the paths follow a pattern. > Automatically add partitions for external tables > > > Key: HIVE-6589 > URL: https://issues.apache.org/jira/browse/HIVE-6589 > Project: Hive > Issue Type: New Feature >Affects Versions: 0.14.0 >Reporter: Ken Dallmeyer >Assignee: Dharmendra Pratap Singh >Priority: Major > > I have a data stream being loaded into Hadoop via Flume. It loads into a date > partition folder in HDFS. The path looks like this: > {code}/flume/my_data//MM/DD/HH > /flume/my_data/2014/03/02/01 > /flume/my_data/2014/03/02/02 > /flume/my_data/2014/03/02/03{code} > On top of it I create an EXTERNAL hive table to do querying. As of now, I > have to manually add partitions. What I want is for EXTERNAL tables, Hive > should "discover" those partitions. Additionally I would like to specify a > partition pattern so that when I query Hive will know to use the partition > pattern to find the HDFS folder. 
> So something like this: > {code}CREATE EXTERNAL TABLE my_data ( > col1 STRING, > col2 INT > ) > PARTITIONED BY ( > dt STRING, > hour STRING > ) > LOCATION > '/flume/my_data' > TBLPROPERTIES ( > 'hive.partition.spec' = 'dt=$Y-$M-$D, hour=$H', > 'hive.partition.spec.location' = '$Y/$M/$D/$H' > ); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
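To make the proposal in HIVE-6589 concrete, the sketch below expands the placeholders from the ticket ($Y, $M, $D, $H) for a given timestamp. Note that 'hive.partition.spec' and 'hive.partition.spec.location' are properties proposed in the ticket, not existing Hive settings, and the class and method here are hypothetical.

```java
import java.time.LocalDateTime;

// Hypothetical illustration of the pattern proposed in the ticket; not a Hive API.
public class PartitionPatternSketch {

    /** Replaces $Y, $M, $D and $H with zero-padded fields of the timestamp. */
    public static String expand(String pattern, LocalDateTime t) {
        return pattern
            .replace("$Y", String.format("%04d", t.getYear()))
            .replace("$M", String.format("%02d", t.getMonthValue()))
            .replace("$D", String.format("%02d", t.getDayOfMonth()))
            .replace("$H", String.format("%02d", t.getHour()));
    }

    public static void main(String[] args) {
        LocalDateTime t = LocalDateTime.of(2014, 3, 2, 1, 0);
        // Partition values, as in 'dt=$Y-$M-$D, hour=$H'
        System.out.println(expand("dt=$Y-$M-$D, hour=$H", t));
        // Directory under the table location, as in '$Y/$M/$D/$H'
        System.out.println(expand("/flume/my_data/$Y/$M/$D/$H", t)); // prints /flume/my_data/2014/03/02/01
    }
}
```

With such a mapping, a query on a given dt and hour could compute the directory directly instead of looking up a stored partition record, which is the point Steve Hoffman makes in the comment above.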
[jira] [Commented] (HIVE-19410) don't create serde reader in LLAP if there's no cache
[ https://issues.apache.org/jira/browse/HIVE-19410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463036#comment-16463036 ] Prasanth Jayachandran commented on HIVE-19410: -- +1 > don't create serde reader in LLAP if there's no cache > - > > Key: HIVE-19410 > URL: https://issues.apache.org/jira/browse/HIVE-19410 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19410.patch > > > Seems to crop up in some tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19358) CBO decorrelation logic should generate Hive operators
[ https://issues.apache.org/jira/browse/HIVE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463030#comment-16463030 ] Hive QA commented on HIVE-19358: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 54s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 17m 42s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Optional Tests | asflicense javac javadoc findbugs checkstyle compile | | uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux | | Build tool | maven | | Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10661/dev-support/hive-personality.sh | | git revision | master / db26f34 | | Default Java | 1.8.0_111 | | modules | C: ql U: ql | | Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10661/yetus.txt | | Powered by | Apache Yetus http://yetus.apache.org | This message was automatically generated. > CBO decorrelation logic should generate Hive operators > -- > > Key: HIVE-19358 > URL: https://issues.apache.org/jira/browse/HIVE-19358 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-19358.01.patch, HIVE-19358.02.patch, > HIVE-19358.patch, fix.patch > > > Decorrelation logic may generate logical instances of the operators in the > plan (e.g., LogicalFilter instead of HiveFilter). This leads to errors while > costing the tree in the Volcano planner (used in MV rewriting), since logical > operators do not have a cost associated with them. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18395) Using stats for aggregate query on Acid/MM is off even with "hive.compute.query.using.stats" is true.
[ https://issues.apache.org/jira/browse/HIVE-18395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463022#comment-16463022 ] Steve Yeom commented on HIVE-18395: --- Again, the patch is not yet ready for review. Items 2 and 3 are already on my to-do list. [~sershe], do you think you can explain item #4 a little further? > Using stats for aggregate query on Acid/MM is off even with > "hive.compute.query.using.stats" is true. > - > > Key: HIVE-18395 > URL: https://issues.apache.org/jira/browse/HIVE-18395 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Attachments: HIVE-18395.01.preview > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19393) NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted
[ https://issues.apache.org/jira/browse/HIVE-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Doe updated HIVE-19393: Affects Version/s: 2.3.2 > NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted > -- > > Key: HIVE-19393 > URL: https://issues.apache.org/jira/browse/HIVE-19393 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.0.0, 2.3.2 >Reporter: John Doe >Priority: Minor > > When an InputStream is corrupted, InputStream.skip can return -1, causing > the while loop in NonSyncDataInputBuffer.skipBytes to become infinite. > {code:java} > public final int skipBytes(int count) throws IOException { > int skipped = 0; > long skip; > while (skipped < count && (skip = in.skip(count - skipped)) != 0) { > skipped += skip; > } > if (skipped < 0) { > throw new EOFException(); > } > return skipped; > } > {code} > Similar bugs are HADOOP-8614, YARN-2905, YARN-163, and MAPREDUCE-6990 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18219) When InputStream is corrupted, the skip() returns -1, causing infinite loop
[ https://issues.apache.org/jira/browse/HIVE-18219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Doe updated HIVE-18219: Affects Version/s: 1.0.0 > When InputStream is corrupted, the skip() returns -1, causing infinite loop > --- > > Key: HIVE-18219 > URL: https://issues.apache.org/jira/browse/HIVE-18219 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0, 2.3.2 >Reporter: John Doe >Priority: Major > > Similar to > [CASSANDRA-7330|https://issues.apache.org/jira/browse/CASSANDRA-7330], when > an InputStream is corrupted, skip() can return -1, causing the following loop to be > infinite. > {code:java} > public final int skipBytes(int count) throws IOException { > int skipped = 0; > long skip; > while (skipped < count && (skip = in.skip(count - skipped)) != 0) { > skipped += skip; > } > if (skipped < 0) { > throw new EOFException(); > } > return skipped; > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
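One possible way to guard the loop quoted in HIVE-18219 is to stop as soon as skip() reports no progress, treating a negative return value the same as 0. The sketch below is an assumption about how such a guard could look, not the fix that was committed to Hive; the class name, the static method signature, and the misbehaving test stream are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical standalone sketch; not the NonSyncDataInputBuffer code or its patch.
public class SafeSkipSketch {

    /** Skips up to count bytes and stops as soon as skip() makes no progress. */
    public static int skipBytes(InputStream in, int count) throws IOException {
        int skipped = 0;
        while (skipped < count) {
            long n = in.skip(count - skipped);
            if (n <= 0) {
                break; // 0 or a negative value: no progress, so give up instead of looping
            }
            skipped += (int) n;
        }
        return skipped;
    }

    /** Runs skipBytes against two healthy streams and one misbehaving stream. */
    public static int[] demo() {
        InputStream broken = new InputStream() {
            @Override public int read() { return -1; }
            @Override public long skip(long n) { return -1; } // simulates the reported behavior
        };
        try {
            return new int[] {
                skipBytes(new ByteArrayInputStream(new byte[10]), 4),
                skipBytes(new ByteArrayInputStream(new byte[3]), 10),
                skipBytes(broken, 5)
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [4, 3, 0]
    }
}
```

Unlike the quoted method, this sketch never throws EOFException; whether a short skip should be reported as an error is a design decision left to the caller.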
[jira] [Commented] (HIVE-19393) NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted
[ https://issues.apache.org/jira/browse/HIVE-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463011#comment-16463011 ] John Doe commented on HIVE-19393: - [~sershe] Actually, it is a duplicate issue, the same as [HIVE-18219|https://issues.apache.org/jira/browse/HIVE-18219] > NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted > -- > > Key: HIVE-19393 > URL: https://issues.apache.org/jira/browse/HIVE-19393 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.0.0 >Reporter: John Doe >Priority: Minor > > When an InputStream is corrupted, InputStream.skip can return -1, causing > the while loop in NonSyncDataInputBuffer.skipBytes to become infinite. > {code:java} > public final int skipBytes(int count) throws IOException { > int skipped = 0; > long skip; > while (skipped < count && (skip = in.skip(count - skipped)) != 0) { > skipped += skip; > } > if (skipped < 0) { > throw new EOFException(); > } > return skipped; > } > {code} > Similar bugs are HADOOP-8614, YARN-2905, YARN-163, and MAPREDUCE-6990 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HIVE-19393) NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted
[ https://issues.apache.org/jira/browse/HIVE-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Doe resolved HIVE-19393. - Resolution: Duplicate > NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted > -- > > Key: HIVE-19393 > URL: https://issues.apache.org/jira/browse/HIVE-19393 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.0.0 >Reporter: John Doe >Priority: Minor > > When an InputStream is corrupted, InputStream.skip can return -1, causing > the while loop in NonSyncDataInputBuffer.skipBytes to become infinite. > {code:java} > public final int skipBytes(int count) throws IOException { > int skipped = 0; > long skip; > while (skipped < count && (skip = in.skip(count - skipped)) != 0) { > skipped += skip; > } > if (skipped < 0) { > throw new EOFException(); > } > return skipped; > } > {code} > Similar bugs are HADOOP-8614, YARN-2905, YARN-163, and MAPREDUCE-6990 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-18395) Using stats for aggregate query on Acid/MM is off even with "hive.compute.query.using.stats" is true.
[ https://issues.apache.org/jira/browse/HIVE-18395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463006#comment-16463006 ] Sergey Shelukhin edited comment on HIVE-18395 at 5/3/18 8:33 PM: - Some small comments: 1) Seems to include some generated config in the patch. 2) TABLE_PARAMS_TXN - can we normalize this and not rely on JSON object? Since this will only be used for stats I assume, and stats only have one parameter (JSON string), we can store it as proper SQL data [~ekoifman] you might want to look at the new tables schema since it's based on what ACID state is sufficient to map stats to a snapshot. 3) {noformat} + if (SessionState.get().getTxnMgr() != null && isTransactional) { +request.setTxnid(SessionState.get().getTxnMgr().getCurrentTxnId()); + } {noformat} Is it enough to store transaction ID? Transaction ID will invalidate stats for every other table with every transaction, right? At least it should be writeId. Also the doc calls for storing entire state information about when the stats were written - otherwise you cannot tell based on read-time snapshot if the stats are valid, for example in case of two inserts, where both can write stats, but both stats are invalid because the inserts don't see each other. 4) {noformat} +/* if (isFullAcid && !work.isTargetRewritten()) { // Don't bother with aggregation in this case, it will probably be invalid. parameters.remove(statType); continue; } +*/ {noformat} Aggregation for full ACID will still be invalid. 5) This doesn't seem to affect basic stats in all the places they are updated. However, basic stats are also used for count(*) type queries (numRows). I sent a patch separately that has some changes for basic stats... Might make sense to also modify these places in this patch. Doesn't have to be the same change I made as long as it is sufficient to add txn info to basic stats. It might help to exclude the generated code for review. 
I sometimes use the script like this, where $1 is base branch and $2 is file name: {noformat} rm -f ~/patches/$2.nogen.patch for f in `git diff $1 --name-only | grep -v "gen-" | grep -v "\/gen\/"` do git diff $1 -- $f >> ~/patches/$2.nogen.patch {noformat} was (Author: sershe): Some small comments: 1) Seems to include some generated config in the patch. 2) TABLE_PARAMS_TXN - can we normalize this and not rely on JSON object? Since this will only be used for stats I assume, and stats only have one parameter (JSON string), we can store it as proper SQL data [~ekoifman] you might want to look at the new tables schema since it's based on what ACID state is sufficient to map stats to a snapshot. 3) {noformat} + if (SessionState.get().getTxnMgr() != null && isTransactional) { +request.setTxnid(SessionState.get().getTxnMgr().getCurrentTxnId()); + } {noformat} Is it enough to store transaction ID? Transaction ID will invalidate stats for every other table with every transaction, right? Also the doc calls for storing entire state information about when the stats were written - otherwise you cannot tell based on read-time snapshot if the stats are valid, for example in case of two inserts, where both can write stats, but both stats are invalid because the inserts don't see each other. 4) {noformat} +/* if (isFullAcid && !work.isTargetRewritten()) { // Don't bother with aggregation in this case, it will probably be invalid. parameters.remove(statType); continue; } +*/ {noformat} Aggregation for full ACID will still be invalid. 5) This doesn't seem to affect basic stats in all the places they are updated. However, basic stats are also used for count(*) type queries (numRows). I sent a patch separately that has some changes for basic stats... Might make sense to also modify these places in this patch. Doesn't have to be the same change I made as long as it is sufficient to add txn info to basic stats. It might help to exclude the generated code for review. 
I sometimes use a script like this, where $1 is the base branch and $2 is the file name:
{noformat}
rm -f ~/patches/$2.nogen.patch
for f in `git diff $1 --name-only | grep -v "gen-" | grep -v "\/gen\/"`
do
  git diff $1 -- $f >> ~/patches/$2.nogen.patch
done
{noformat}
> Using stats for aggregate query on Acid/MM is off even with > "hive.compute.query.using.stats" is true. > - > > Key: HIVE-18395 > URL: https://issues.apache.org/jira/browse/HIVE-18395 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Attachments:
[jira] [Comment Edited] (HIVE-18395) Using stats for aggregate query on Acid/MM is off even with "hive.compute.query.using.stats" is true.
[ https://issues.apache.org/jira/browse/HIVE-18395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463006#comment-16463006 ] Sergey Shelukhin edited comment on HIVE-18395 at 5/3/18 8:32 PM: - Some small comments: 1) Seems to include some generated config in the patch. 2) TABLE_PARAMS_TXN - can we normalize this and not rely on JSON object? Since this will only be used for stats I assume, and stats only have one parameter (JSON string), we can store it as proper SQL data [~ekoifman] you might want to look at the new tables schema since it's based on what ACID state is sufficient to map stats to a snapshot. 3) {noformat} + if (SessionState.get().getTxnMgr() != null && isTransactional) { +request.setTxnid(SessionState.get().getTxnMgr().getCurrentTxnId()); + } {noformat} Is it enough to store transaction ID? Transaction ID will invalidate stats for every other table with every transaction, right? Also the doc calls for storing entire state information about when the stats were written - otherwise you cannot tell based on read-time snapshot if the stats are valid, for example in case of two inserts, where both can write stats, but both stats are invalid because the inserts don't see each other. 4) {noformat} +/* if (isFullAcid && !work.isTargetRewritten()) { // Don't bother with aggregation in this case, it will probably be invalid. parameters.remove(statType); continue; } +*/ {noformat} Aggregation for full ACID will still be invalid. 5) This doesn't seem to affect basic stats in all the places they are updated. However, basic stats are also used for count(*) type queries (numRows). I sent a patch separately that has some changes for basic stats... Might make sense to also modify these places in this patch. Doesn't have to be the same change I made as long as it is sufficient to add txn info to basic stats. It might help to exclude the generated code for review. 
I sometimes use a script like this, where $1 is the base branch and $2 is the file name:
{noformat}
rm -f ~/patches/$2.nogen.patch
for f in `git diff $1 --name-only | grep -v "gen-" | grep -v "\/gen\/"`
do
  git diff $1 -- $f >> ~/patches/$2.nogen.patch
done
{noformat}
was (Author: sershe): Some small comments: 1) Seems to include some generated config in the patch. 2) TABLE_PARAMS_TXN - can we normalize this and not rely on JSON object? Since this will only be used for stats I assume, and stats only have one parameter. [~ekoifman] you might want to look at the new tables schema since it's based on what ACID state is sufficient to map stats to a snapshot. 3) {noformat} + if (SessionState.get().getTxnMgr() != null && isTransactional) { +request.setTxnid(SessionState.get().getTxnMgr().getCurrentTxnId()); + } {noformat} Is it enough to store transaction ID? Transaction ID will invalidate stats for every other table with every transaction, right? Also the doc calls for storing entire state information about when the stats were written - otherwise you cannot tell based on read-time snapshot if the stats are valid, for example in case of two inserts, where both can write stats, but both stats are invalid because the inserts don't see each other. 4) {noformat} +/* if (isFullAcid && !work.isTargetRewritten()) { // Don't bother with aggregation in this case, it will probably be invalid. parameters.remove(statType); continue; } +*/ {noformat} Aggregation for full ACID will still be invalid. 5) This doesn't seem to affect basic stats in all the places they are updated. However, basic stats are also used for count(*) type queries (numRows). I sent a patch separately that has some changes for basic stats... Might make sense to also modify these places in this patch. Doesn't have to be the same change I made as long as it is sufficient to add txn info to basic stats. It might help to exclude the generated code for review.
I sometimes use a script like this, where $1 is the base branch and $2 is the file name:
{noformat}
rm -f ~/patches/$2.nogen.patch
for f in `git diff $1 --name-only | grep -v "gen-" | grep -v "\/gen\/"`
do
  git diff $1 -- $f >> ~/patches/$2.nogen.patch
done
{noformat}
> Using stats for aggregate query on Acid/MM is off even with > "hive.compute.query.using.stats" is true. > - > > Key: HIVE-18395 > URL: https://issues.apache.org/jira/browse/HIVE-18395 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Attachments: HIVE-18395.01.preview > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18395) Using stats for aggregate query on Acid/MM is off even with "hive.compute.query.using.stats" is true.
[ https://issues.apache.org/jira/browse/HIVE-18395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463006#comment-16463006 ] Sergey Shelukhin commented on HIVE-18395: - Some small comments: 1) Seems to include some generated config in the patch. 2) TABLE_PARAMS_TXN - can we normalize this and not rely on JSON object? Since this will only be used for stats I assume, and stats only have one parameter. [~ekoifman] you might want to look at the new tables schema since it's based on what ACID state is sufficient to map stats to a snapshot. 3) {noformat} + if (SessionState.get().getTxnMgr() != null && isTransactional) { +request.setTxnid(SessionState.get().getTxnMgr().getCurrentTxnId()); + } {noformat} Is it enough to store transaction ID? Transaction ID will invalidate stats for every other table with every transaction, right? Also the doc calls for storing entire state information about when the stats were written - otherwise you cannot tell based on read-time snapshot if the stats are valid, for example in case of two inserts, where both can write stats, but both stats are invalid because the inserts don't see each other. 4) {noformat} +/* if (isFullAcid && !work.isTargetRewritten()) { // Don't bother with aggregation in this case, it will probably be invalid. parameters.remove(statType); continue; } +*/ {noformat} Aggregation for full ACID will still be invalid. 5) This doesn't seem to affect basic stats in all the places they are updated. However, basic stats are also used for count(*) type queries (numRows). I sent a patch separately that has some changes for basic stats... Might make sense to also modify these places in this patch. Doesn't have to be the same change I made as long as it is sufficient to add txn info to basic stats. It might help to exclude the generated code for review. 
I sometimes use a script like this, where $1 is the base branch and $2 is the file name:
{noformat}
rm -f ~/patches/$2.nogen.patch
for f in `git diff $1 --name-only | grep -v "gen-" | grep -v "\/gen\/"`
do
  git diff $1 -- $f >> ~/patches/$2.nogen.patch
done
{noformat}
> Using stats for aggregate query on Acid/MM is off even with > "hive.compute.query.using.stats" is true. > - > > Key: HIVE-18395 > URL: https://issues.apache.org/jira/browse/HIVE-18395 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Steve Yeom >Assignee: Steve Yeom >Priority: Major > Attachments: HIVE-18395.01.preview > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19358) CBO decorrelation logic should generate Hive operators
[ https://issues.apache.org/jira/browse/HIVE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463005#comment-16463005 ] Jesus Camacho Rodriguez commented on HIVE-19358: [~vgarg], awesome, thanks. That seems to fix the issue that I had encountered. I have uploaded a new patch that includes your changes to trigger a new ptest run. > CBO decorrelation logic should generate Hive operators > -- > > Key: HIVE-19358 > URL: https://issues.apache.org/jira/browse/HIVE-19358 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-19358.01.patch, HIVE-19358.02.patch, > HIVE-19358.patch, fix.patch > > > Decorrelation logic may generate logical instances of the operators in the > plan (e.g., LogicalFilter instead of HiveFilter). This leads to errors while > costing the tree in the Volcano planner (used in MV rewriting), since logical > operators do not have a cost associated to them. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HIVE-19393) NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted
[ https://issues.apache.org/jira/browse/HIVE-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462983#comment-16462983 ] John Doe edited comment on HIVE-19393 at 5/3/18 8:30 PM: - [~sershe] Sorry about the confusion. Actually, when the InputStream is corrupted, the skip function returns -1 instead of 0, causing the infinite loop. was (Author: dustinday): [~sershe] Yes, you are right. Sorry about it. > NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted > -- > > Key: HIVE-19393 > URL: https://issues.apache.org/jira/browse/HIVE-19393 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.0.0 >Reporter: John Doe >Priority: Minor > > When an InputStream is corrupted, InputStream.skip can return -1, causing > the while loop in NonSyncDataInputBuffer.skipBytes to become infinite.
> {code:java}
> public final int skipBytes(int count) throws IOException {
>   int skipped = 0;
>   long skip;
>   while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
>     skipped += skip;
>   }
>   if (skipped < 0) {
>     throw new EOFException();
>   }
>   return skipped;
> }
> {code}
> Similar bugs are Hadoop-8614, Yarn-2905, Yarn-163, Mapreduce-6990 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
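The hang described above comes from the loop condition testing only for != 0: a return value of -1 makes skipped shrink, so skipped < count stays true. The following is a minimal sketch of one possible guard, not the fix that was committed to Hive. The class name SkipBytesSketch and the explicit InputStream parameter are assumptions made so the example is self-contained; in NonSyncDataInputBuffer the stream is a field named in. Returning the partial count instead of throwing is also an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-alone class; not part of Hive.
class SkipBytesSketch {

    // Skips up to `count` bytes and returns how many were actually skipped.
    // Any non-positive result from skip() ends the loop, so a stream that
    // keeps returning -1 (or 0) can no longer spin forever.
    static int skipBytes(InputStream in, int count) throws IOException {
        int skipped = 0;
        while (skipped < count) {
            long skip = in.skip(count - skipped);
            if (skip <= 0) {
                // 0: no progress was made; negative: treat the stream as unusable.
                break;
            }
            skipped += (int) skip;
        }
        return skipped;
    }

    public static void main(String[] args) throws IOException {
        // Simulates the corrupted stream from the report: skip() always returns -1.
        InputStream corrupted = new InputStream() {
            @Override public int read() { return -1; }
            @Override public long skip(long n) { return -1; }
        };
        System.out.println(skipBytes(corrupted, 10));
        // A healthy but short stream: only 5 bytes are available.
        System.out.println(skipBytes(new ByteArrayInputStream(new byte[5]), 10));
    }
}
```

Whether a negative return should end the loop quietly, as here, or raise an EOFException is a design choice for the patch author. The sketch only shows that the loop terminates once the comparison is changed from != 0 to a check for positive progress.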
[jira] [Reopened] (HIVE-19393) NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted
[ https://issues.apache.org/jira/browse/HIVE-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Doe reopened HIVE-19393: - > NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted > -- > > Key: HIVE-19393 > URL: https://issues.apache.org/jira/browse/HIVE-19393 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.0.0 >Reporter: John Doe >Priority: Minor > > When an InputStream is corrupted, InputStream.skip can return -1, causing > the while loop in NonSyncDataInputBuffer.skipBytes to become infinite.
> {code:java}
> public final int skipBytes(int count) throws IOException {
>   int skipped = 0;
>   long skip;
>   while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
>     skipped += skip;
>   }
>   if (skipped < 0) {
>     throw new EOFException();
>   }
>   return skipped;
> }
> {code}
> Similar bugs are Hadoop-8614, Yarn-2905, Yarn-163, Mapreduce-6990 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19358) CBO decorrelation logic should generate Hive operators
[ https://issues.apache.org/jira/browse/HIVE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-19358: --- Attachment: HIVE-19358.02.patch > CBO decorrelation logic should generate Hive operators > -- > > Key: HIVE-19358 > URL: https://issues.apache.org/jira/browse/HIVE-19358 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Major > Attachments: HIVE-19358.01.patch, HIVE-19358.02.patch, > HIVE-19358.patch, fix.patch > > > Decorrelation logic may generate logical instances of the operators in the > plan (e.g., LogicalFilter instead of HiveFilter). This leads to errors while > costing the tree in the Volcano planner (used in MV rewriting), since logical > operators do not have a cost associated to them. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-19393) NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted
[ https://issues.apache.org/jira/browse/HIVE-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Doe updated HIVE-19393: Description: When an InputStream is corrupted, InputStream.skip can return -1, causing the while loop in NonSyncDataInputBuffer.skipBytes to become infinite.
{code:java}
public final int skipBytes(int count) throws IOException {
  int skipped = 0;
  long skip;
  while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
    skipped += skip;
  }
  if (skipped < 0) {
    throw new EOFException();
  }
  return skipped;
}
{code}
Similar bugs are Hadoop-8614, Yarn-2905, Yarn-163, Mapreduce-6990 was: When an InputStream is corrupted, InputStream.skip can return 0, causing the while loop in NonSyncDataInputBuffer.skipBytes to become infinite.
{code:java}
public final int skipBytes(int count) throws IOException {
  int skipped = 0;
  long skip;
  while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
    skipped += skip;
  }
  if (skipped < 0) {
    throw new EOFException();
  }
  return skipped;
}
{code}
Similar bugs are [Hadoop-8614|https://issues.apache.org/jira/browse/HADOOP-8614], [Yarn-2905|https://issues.apache.org/jira/browse/YARN-2905], [Yarn-163|https://issues.apache.org/jira/browse/YARN-163], [Mapreduce-6990|https://issues.apache.org/jira/browse/MAPREDUCE-6990] > NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted > -- > > Key: HIVE-19393 > URL: https://issues.apache.org/jira/browse/HIVE-19393 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.0.0 >Reporter: John Doe >Priority: Minor > > When an InputStream is corrupted, InputStream.skip can return -1, causing > the while loop in NonSyncDataInputBuffer.skipBytes to become infinite.
> {code:java}
> public final int skipBytes(int count) throws IOException {
>   int skipped = 0;
>   long skip;
>   while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
>     skipped += skip;
>   }
>   if (skipped < 0) {
>     throw new EOFException();
>   }
>   return skipped;
> }
> {code}
> Similar bugs are Hadoop-8614, Yarn-2905, Yarn-163, Mapreduce-6990 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19376) Statistics: switch to 10bit HLL by default for Hive
[ https://issues.apache.org/jira/browse/HIVE-19376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463000#comment-16463000 ] Hive QA commented on HIVE-19376:
Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12921424/HIVE-19376.1.patch
{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 186 failed/errored test(s), 14316 tests executed
*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] (batchId=253)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_2] (batchId=86)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_9] (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bitvector] (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_1] (batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_2] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative3] (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compute_stats_date] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[confirm_initial_tbl_stats] (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cross_join_merge] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_table] (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[hll] (batchId=90)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_mapjoin] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13] (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_7] (batchId=27)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=178)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llapdecider] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_2] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join1] (batchId=173)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join21] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join29] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join30] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_6] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_groupby] (batchId=176)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez2] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[check_constraint] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer2] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer3] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer6] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cross_join] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_sw] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainanalyze_2] (batchId=171)
[jira] [Updated] (HIVE-19410) don't create serde reader in LLAP if there's no cache
[ https://issues.apache.org/jira/browse/HIVE-19410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-19410: Status: Patch Available (was: Open) > don't create serde reader in LLAP if there's no cache > - > > Key: HIVE-19410 > URL: https://issues.apache.org/jira/browse/HIVE-19410 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19410.patch > > > Seems to crop up in some tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19410) don't create serde reader in LLAP if there's no cache
[ https://issues.apache.org/jira/browse/HIVE-19410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16462993#comment-16462993 ] Sergey Shelukhin commented on HIVE-19410: - [~prasanth_j] can you take a look? Tiny patch https://builds.apache.org/job/PreCommit-HIVE-Build/10649/testReport/org.apache.hive.jdbc/TestTriggersWorkloadManager/testTriggerSlowQueryExecutionTime/ > don't create serde reader in LLAP if there's no cache > - > > Key: HIVE-19410 > URL: https://issues.apache.org/jira/browse/HIVE-19410 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HIVE-19410.patch > > > Seems to crop up in some tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)