[jira] [Updated] (HIVE-17978) TPCDS queries 58 and 83 generate exceptions in the vectorization.

2018-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17978:
---
Attachment: HIVE-17978.patch

> TPCDS queries 58 and 83 generate exceptions in the vectorization.
> -
>
> Key: HIVE-17978
> URL: https://issues.apache.org/jira/browse/HIVE-17978
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Nita Dembla
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-17978.patch
>
>
> Failed with the following Exception:
> {code}
> ERROR [6c707c4e-2849-4ff2-809d-946581e6b83a HiveServer2-Handler-Pool: Thread-78] ql.Driver: FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: The column KEY._col0 is not in the vectorization context column map {_col0=0}.
> org.apache.hadoop.hive.ql.parse.SemanticException: org.apache.hadoop.hive.ql.metadata.HiveException: The column KEY._col0 is not in the vectorization context column map {_col0=0}.
> at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:1665)
> at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:1725)
> at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:1245)
> at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:671)
> at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:616)
> at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
> at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
> at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
> at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:1902)
> at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(TezCompiler.java:674)
> at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:271)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11621)
> at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:296)
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:268)
> at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:169)
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:268)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:599)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1463)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1420)
> at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:201)
> at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:288)
> at org.apache.hive.service.cli.operation.Operation.run(Operation.java:249)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:532)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:518)
> at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:311)
> {code}
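
For context, the vectorization context maps column names to vector indices, and the log shows a lookup under one naming scheme (KEY._col0) against a map keyed under another (_col0). A self-contained Java sketch of that lookup shape, illustrative only and not the actual Vectorizer code:

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a column map keyed by one naming scheme
// ("_col0") fails a lookup made under another ("KEY._col0"), which is
// the shape of the mismatch reported in the exception above.
public class ColumnMapLookup {
  private final Map<String, Integer> projectionColumnMap = new HashMap<>();

  public ColumnMapLookup() {
    projectionColumnMap.put("_col0", 0); // what the context actually contains
  }

  public int getInputColumnIndex(String name) {
    Integer index = projectionColumnMap.get(name);
    if (index == null) {
      throw new IllegalStateException("The column " + name
          + " is not in the vectorization context column map " + projectionColumnMap);
    }
    return index;
  }

  public static void main(String[] args) {
    ColumnMapLookup ctx = new ColumnMapLookup();
    System.out.println(ctx.getInputColumnIndex("_col0"));     // prints 0
    System.out.println(ctx.getInputColumnIndex("KEY._col0")); // throws, as in the log
  }
}
{code}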

[jira] [Work started] (HIVE-17978) TPCDS queries 58 and 83 generate exceptions in the vectorization.

2018-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-17978 started by Jesus Camacho Rodriguez.
--

[jira] [Updated] (HIVE-17978) TPCDS queries 58 and 83 generate exceptions in the vectorization.

2018-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17978:
---
Status: Patch Available  (was: In Progress)


[jira] [Resolved] (HIVE-19419) SharedScanOptimizer may leave unnecessary operators in the plan

2018-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez resolved HIVE-19419.

Resolution: Duplicate

> SharedScanOptimizer may leave unnecessary operators in the plan
> ---
>
> Key: HIVE-19419
> URL: https://issues.apache.org/jira/browse/HIVE-19419
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> This is due to the interaction with branches created by semijoin reduction. In turn, it can lead to errors such as:
> {noformat}
> 2018-05-03T21:19:41,277 INFO  [8d6a552a-b62f-44a4-bdb4-afc2e810ae56 HiveServer2-Handler-Pool: Thread-139]: physical.Vectorizer (Vectorizer.java:walkStackToFindVectorizationContext(1260)) - walkStackToFindVectorizationContext RS has new vectorization context Context name GBY, level 0, sorted projectionColumnMap {0=_col0}, scratchColumnTypeNames []
> 2018-05-03T21:19:41,278 ERROR [8d6a552a-b62f-44a4-bdb4-afc2e810ae56 HiveServer2-Handler-Pool: Thread-139]: ql.Driver (SessionState.java:printError(1220)) - FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
> org.apache.hadoop.hive.ql.parse.SemanticException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
> at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:1285)
> at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:1346)
> at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:43)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.PreOrderOnceWalker.walk(PreOrderOnceWalker.java:54)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:955)
> at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:514)
> at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:485)
> at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
> at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
> at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
> at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:1495)
> at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(TezCompiler.java:644)
> {noformat}
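
As an illustration of the cleanup involved, here is a generic Java sketch with a hypothetical Op type, not Hive's actual Operator API: after plans are merged, any operator branch that no longer terminates in a sink should be detached from its parent, otherwise it lingers in the plan for later phases such as the Vectorizer to trip over.

{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical operator-tree sketch (not Hive's Operator API): prune any
// branch that no longer terminates in a sink so no dangling operators remain.
class Op {
  final String name;
  final boolean isSink; // e.g. a ReduceSink/FileSink that still produces output
  final List<Op> children = new ArrayList<>();

  Op(String name, boolean isSink) { this.name = name; this.isSink = isSink; }

  // Returns true if this subtree still leads to at least one sink;
  // subtrees for which this is false are detached from their parent.
  static boolean pruneDeadBranches(Op op) {
    op.children.removeIf(child -> !pruneDeadBranches(child));
    return op.isSink || !op.children.isEmpty();
  }

  public static void main(String[] args) {
    Op scan = new Op("TS", false);
    Op group = new Op("GBY", false);
    Op sink = new Op("RS", true);
    Op dangling = new Op("FIL", false); // branch left over after a merge
    group.children.add(sink);
    scan.children.add(group);
    scan.children.add(dangling);
    pruneDeadBranches(scan);
    scan.children.forEach(c -> System.out.println(c.name)); // prints GBY only
  }
}
{code}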



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-17978) TPCDS queries 58 and 83 generate exceptions in the vectorization.

2018-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-17978:
--

Assignee: Jesus Camacho Rodriguez


[jira] [Commented] (HIVE-19400) Adjust Hive 1.0 to 2.0 conversion utility to the upgrade

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463389#comment-16463389
 ] 

Hive QA commented on HIVE-19400:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12921715/HIVE-19400.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 14316 tests 
executed
*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed 
out) (batchId=217)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_4]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_5]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_rebuild_dummy]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_time_window]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] 
(batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_stats]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] 
(batchId=105)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=105)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=105)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez-tag] 
(batchId=105)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] 
(batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] 
(batchId=96)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=228)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232)
org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate
 (batchId=231)
org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys
 (batchId=231)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=235)
org.apache.hive.jdbc.TestRestrictedList.org.apache.hive.jdbc.TestRestrictedList 
(batchId=241)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveConflictKill
 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime
 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime
 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerVertexRawInputSplitsNoKill
 (batchId=241)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/10666/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10666/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-10666/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing 

[jira] [Assigned] (HIVE-19419) SharedScanOptimizer may leave unnecessary operators in the plan

2018-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-19419:
--




[jira] [Work started] (HIVE-19419) SharedScanOptimizer may leave unnecessary operators in the plan

2018-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-19419 started by Jesus Camacho Rodriguez.
--


[jira] [Commented] (HIVE-18864) ValidWriteIdList snapshot seems incorrect if obtained after allocating writeId by current transaction.

2018-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463376#comment-16463376
 ] 

ASF GitHub Bot commented on HIVE-18864:
---

Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/316


> ValidWriteIdList snapshot seems incorrect if obtained after allocating 
> writeId by current transaction.
> --
>
> Key: HIVE-18864
> URL: https://issues.apache.org/jira/browse/HIVE-18864
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-18864.01.patch, HIVE-18864.02.patch
>
>
> For multi-statement txns, it is possible that a write on a table happens after a read. Consider the scenario below:
>  # Committed txn=9 writes on table T1 with writeId=5.
>  # Open txn=10. ValidTxnList(open:null, txn_HWM=10).
>  # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5).
>  # Open txn=11, which writes on table T1 with writeId=6.
>  # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5).
>  # Write table T1 from txn=10 with writeId=7.
>  # Read table T1 from txn=10. {color:#d04437}*ValidWriteIdList(open:null, write_HWM=7)*: this read will be able to see rows added by txn=11, which is still open.{color}
> {color:#d04437}So, the open/aborted list of ValidWriteIdList needs to be rebuilt based on txn_HWM. Any writeId allocated by a txnId > txn_HWM should be marked as open. In this example, *ValidWriteIdList(open:6, write_HWM=7)* should be generated.{color}
> {color:#33}cc{color} [~ekoifman], [~thejas]
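
A minimal Java sketch of the rebuild described above, using hypothetical types rather than Hive's actual ValidWriteIdList API: every writeId allocated by a txnId above the reader's txn_HWM is reported as open.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rebuild the open-writeId list of a snapshot so that
// writeIds allocated by transactions above the reader's txn high-water mark
// are treated as open, even if write_HWM has already advanced past them.
public class WriteIdSnapshot {
  public static List<Long> openWriteIds(Map<Long, Long> writeIdToTxnId,
                                        long txnHighWaterMark) {
    List<Long> open = new ArrayList<>();
    for (Map.Entry<Long, Long> e : writeIdToTxnId.entrySet()) {
      if (e.getValue() > txnHighWaterMark) {
        open.add(e.getKey()); // allocated by a txn the reader must not see
      }
    }
    return open;
  }

  public static void main(String[] args) {
    // Scenario from the description: reader is txn=10 (txn_HWM=10);
    // writeId=6 was allocated by txn=11, writeId=7 by txn=10 itself.
    Map<Long, Long> writeIdToTxnId = Map.of(5L, 9L, 6L, 11L, 7L, 10L);
    System.out.println(openWriteIds(writeIdToTxnId, 10)); // [6]
  }
}
{code}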





[jira] [Commented] (HIVE-19130) NPE is thrown when REPL LOAD applied drop partition event.

2018-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463378#comment-16463378
 ] 

ASF GitHub Bot commented on HIVE-19130:
---

Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/332


> NPE is thrown when REPL LOAD applied drop partition event.
> --
>
> Key: HIVE-19130
> URL: https://issues.apache.org/jira/browse/HIVE-19130
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-19130.01.patch
>
>
> During incremental replication, if we split the events batch as follows, then REPL LOAD on the second batch throws an NPE.
> Batch-1: CREATE_TABLE(t1) -> ADD_PARTITION(t1.p1) -> DROP_PARTITION(t1.p1)
> Batch-2: DROP_TABLE(t1) -> CREATE_TABLE(t1) -> ADD_PARTITION(t1.p1) -> DROP_PARTITION(t1.p1)
> {code}
> 2018-04-05 16:20:36,531 ERROR [HiveServer2-Background-Pool: Thread-107044]: metadata.Hive (Hive.java:getTable(1219)) - Table catalog_sales_new not found: new5_tpcds_real_bin_partitioned_orc_1000.catalog_sales_new table not found
> 2018-04-05 16:20:36,538 ERROR [HiveServer2-Background-Pool: Thread-107044]: exec.DDLTask (DDLTask.java:failed(540)) - org.apache.hadoop.hive.ql.metadata.HiveException
> at org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4016)
> at org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3983)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:341)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1765)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1506)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1303)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1170)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1165)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
> at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByExpr(Hive.java:2613)
> at org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4008)
> ... 23 more
> {code}
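
The trace suggests getPartitionsByExpr dereferencing a table that the earlier DROP_TABLE event in the same batch already removed. A hedged Java sketch of the defensive shape of such a fix (hypothetical types, not the actual DDLTask change):

{code}
// Sketch of the defensive pattern implied by the trace above (hypothetical
// types): resolve the table first and succeed idempotently when the
// drop-partition event refers to a table that a prior event already dropped.
public final class DropPartitionGuard {
  interface Catalog { Table getTable(String db, String name); }
  static final class Table { /* metadata omitted in this sketch */ }

  static void dropPartitionIfPresent(Catalog catalog, String db, String table,
                                     String partSpec) {
    Table t = catalog.getTable(db, table);
    if (t == null) {
      // Table was dropped by an earlier event in the batch; the partition is
      // already gone, so a replayed DROP_PARTITION is a no-op, not an NPE.
      return;
    }
    // ... look up partitions matching partSpec and drop them ...
  }
}
{code}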





[jira] [Commented] (HIVE-18988) Support bootstrap replication of ACID tables

2018-05-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463377#comment-16463377
 ] 

ASF GitHub Bot commented on HIVE-18988:
---

Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/331


> Support bootstrap replication of ACID tables
> 
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-18988.01-branch-3.patch, HIVE-18988.01.patch, 
> HIVE-18988.02.patch, HIVE-18988.03.patch, HIVE-18988.04.patch, 
> HIVE-18988.05.patch, HIVE-18988.06.patch, HIVE-18988.07.patch
>
>
> Bootstrapping of ACID tables needs special handling to replicate a stable state of the data.
>  - If the ACID feature is enabled, then perform the bootstrap dump for ACID tables within a read txn.
>  -> Dump table/partition metadata.
>  -> Get the list of valid data files for a table using the same logic as a read txn does.
>  -> Dump the latest ValidWriteIdList as per the current read txn.
>  - Set the valid last replication state such that it doesn't miss any open txn started after triggering the bootstrap dump.
>  - If any txns are on-going that were opened before triggering the bootstrap dump, then it is not guaranteed that an open_txn event was captured for them. Also, if these txns were opened for the streaming ingest case, then the dumped ACID table data may include data of open txns, which would break snapshot isolation at the target. To avoid that, the bootstrap dump should wait for a timeout (new configuration: hive.repl.bootstrap.dump.open.txn.timeout). After the timeout, force abort those txns and continue (see the sketch below).
>  - If any force-aborted txn belongs to a streaming ingest case, then the dumped ACID table data may contain aborted data too. So, it is necessary to replicate the aborted write ids to the target to mark that data invalid for any readers.
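
A rough Java sketch of the wait-then-abort step, using a hypothetical TxnStore interface; only the configuration name hive.repl.bootstrap.dump.open.txn.timeout comes from the description above.

{code}
import java.util.List;

// Rough sketch (hypothetical TxnStore interface, not Hive's actual one):
// wait up to the configured timeout for txns opened before the bootstrap
// dump to finish, then force-abort whatever is still open.
public class BootstrapOpenTxnPolicy {
  interface TxnStore {
    List<Long> openTxnsBefore(long bootstrapStartTime);
    void abortTxns(List<Long> txnIds);
  }

  // timeoutMs would be read from hive.repl.bootstrap.dump.open.txn.timeout.
  static void settleOpenTxns(TxnStore store, long bootstrapStartTime,
                             long timeoutMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    List<Long> open = store.openTxnsBefore(bootstrapStartTime);
    while (!open.isEmpty() && System.currentTimeMillis() < deadline) {
      Thread.sleep(1000); // poll until the txns commit/abort or we time out
      open = store.openTxnsBefore(bootstrapStartTime);
    }
    if (!open.isEmpty()) {
      // Force abort; the aborted writeIds must also be replicated so that
      // readers at the target treat the corresponding data as invalid.
      store.abortTxns(open);
    }
  }
}
{code}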





[jira] [Commented] (HIVE-19400) Adjust Hive 1.0 to 2.0 conversion utility to the upgrade

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463358#comment-16463358
 ] 

Hive QA commented on HIVE-19400:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s{color} | {color:green} standalone-metastore: The patch generated 0 new + 60 unchanged - 1 fixed = 60 total (was 61) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 16s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10666/dev-support/hive-personality.sh |
| git revision | master / 1c3b82f |
| Default Java | 1.8.0_111 |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-10666/yetus/whitespace-eol.txt |
| modules | C: standalone-metastore U: standalone-metastore |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10666/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Adjust Hive 1.0 to 2.0 conversion utility to the upgrade
> 
>
> Key: HIVE-19400
> URL: https://issues.apache.org/jira/browse/HIVE-19400
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-19400.01.patch
>
>
> The conversion utility should allow specifying the output directory, and it should create files only if there is actually something to do (see the sketch below).
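
A minimal Java sketch of the requested behavior with a hypothetical helper (the real utility's API may differ): write into a caller-specified output directory and skip file creation when there is nothing to convert.

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: only touch the filesystem when there is output.
public class ConversionOutput {
  static void writeIfNeeded(Path outputDir, String fileName,
                            List<String> statements) throws IOException {
    if (statements.isEmpty()) {
      return; // nothing to convert: do not create an empty file
    }
    Files.createDirectories(outputDir);
    Files.write(outputDir.resolve(fileName), statements);
  }
}
{code}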





[jira] [Updated] (HIVE-14388) Add number of rows inserted message after insert command in Beeline

2018-05-03 Thread Bharathkrishna Guruvayoor Murali (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharathkrishna Guruvayoor Murali updated HIVE-14388:

Attachment: HIVE-14388.06.patch

> Add number of rows inserted message after insert command in Beeline
> ---
>
> Key: HIVE-14388
> URL: https://issues.apache.org/jira/browse/HIVE-14388
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Reporter: Vihang Karajgaonkar
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Minor
> Attachments: HIVE-14388-WIP.patch, HIVE-14388.02.patch, 
> HIVE-14388.03.patch, HIVE-14388.05.patch, HIVE-14388.06.patch
>
>
> Currently, when you run an insert command in Beeline, it returns a message saying "No rows affected .."
> A better and more intuitive message would be "xxx rows inserted (26.068 seconds)", as sketched below.
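
A minimal Java sketch of the message formatting this asks for (assumed shape, mirroring the example in the description; the actual Beeline change may differ):

{code}
// Report affected-row count and elapsed time instead of "No rows affected".
public class InsertMessage {
  static String format(long rowsInserted, long elapsedMillis) {
    return String.format("%d rows inserted (%.3f seconds)",
        rowsInserted, elapsedMillis / 1000.0);
  }

  public static void main(String[] args) {
    System.out.println(format(42, 26068)); // 42 rows inserted (26.068 seconds)
  }
}
{code}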





[jira] [Commented] (HIVE-19108) Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q causes Wrong Query Results

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463338#comment-16463338
 ] 

Hive QA commented on HIVE-19108:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12921699/HIVE-19108.04.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 14318 tests 
executed
*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed 
out) (batchId=217)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[fouter_join_ppr] 
(batchId=33)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez2]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_4]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_5]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_rebuild_dummy]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[materialized_view_create_rewrite_time_window]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_dynpart_hashjoin_1]
 (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] 
(batchId=105)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=105)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=105)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez-tag] 
(batchId=105)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] 
(batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] 
(batchId=96)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=228)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232)
org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate
 (batchId=231)
org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys
 (batchId=231)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=235)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=235)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime
 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime
 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerVertexRawInputSplitsNoKill
 (batchId=241)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/10665/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10665/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-10665/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing 

[jira] [Commented] (HIVE-19108) Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q causes Wrong Query Results

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463330#comment-16463330
 ] 

Hive QA commented on HIVE-19108:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 48s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 46s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 7m 33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 52m 23s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10665/dev-support/hive-personality.sh |
| git revision | master / 1c3b82f |
| Default Java | 1.8.0_111 |
| modules | C: . ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10665/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorization and Parquet: Turning on vectorization in parquet_ppd_decimal.q 
> causes Wrong Query Results
> ---
>
> Key: HIVE-19108
> URL: https://issues.apache.org/jira/browse/HIVE-19108
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Haifeng Chen
>Priority: Critical
> Attachments: HIVE-19108.01.patch, HIVE-19108.02.patch, 
> HIVE-19108.03.patch, HIVE-19108.04.patch
>
>
> Found in the vectorization enabled-by-default experiment.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19110) Vectorization: Enabling vectorization causes TestContribCliDriver udf_example_arraymapstruct.q to produce Wrong Results

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463302#comment-16463302
 ] 

Hive QA commented on HIVE-19110:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12921698/HIVE-19110.04.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 47 failed/errored test(s), 14319 tests 
executed
*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed 
out) (batchId=217)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] 
(batchId=253)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_1] 
(batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_2] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative3] 
(batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_7] 
(batchId=27)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=105)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insertsel_fail] 
(batchId=95)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_data_parquet_empty]
 (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] 
(batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] 
(batchId=96)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_1] 
(batchId=137)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin_negative3]
 (batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=143)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_skew_1_23]
 (batchId=111)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_13] 
(batchId=122)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=228)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232)
org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate
 (batchId=231)
org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys
 (batchId=231)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=235)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=235)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime
 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime
 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerVertexRawInputSplitsNoKill
 (batchId=241)
{noformat}

Test results: 

[jira] [Commented] (HIVE-19110) Vectorization: Enabling vectorization causes TestContribCliDriver udf_example_arraymapstruct.q to produce Wrong Results

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463286#comment-16463286
 ] 

Hive QA commented on HIVE-19110:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
2s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
47s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-10664/dev-support/hive-personality.sh
 |
| git revision | master / bf8e696 |
| Default Java | 1.8.0_111 |
| modules | C: . contrib ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10664/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorization: Enabling vectorization causes TestContribCliDriver 
> udf_example_arraymapstruct.q to produce Wrong Results
> ---
>
> Key: HIVE-19110
> URL: https://issues.apache.org/jira/browse/HIVE-19110
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Haifeng Chen
>Priority: Critical
> Attachments: HIVE-19110.01.patch, HIVE-19110.02.patch, 
> HIVE-19110.03.patch, HIVE-19110.04.patch
>
>
> Found in the vectorization enabled-by-default experiment.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19396) HiveOperation is incorrectly set for analyze statement

2018-05-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-19396:

   Resolution: Fixed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> HiveOperation is incorrectly set for analyze statement
> --
>
> Key: HIVE-19396
> URL: https://issues.apache.org/jira/browse/HIVE-19396
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19396.patch
>
>
> Because we rewrite analyze into a select over compute_stats(), the operation 
> enum gets set to Query.
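
(For illustration, the rewrite in question looks roughly like the following; 
this is illustrative SQL, not the exact internal form Hive generates.)

{code}
-- User statement:
ANALYZE TABLE t COMPUTE STATISTICS FOR COLUMNS;
-- is internally rewritten into something like:
SELECT compute_stats(c1), compute_stats(c2) FROM t;
{code}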



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463269#comment-16463269
 ] 

Sahil Takiar commented on HIVE-19041:
-

Think you missed {{bucketCols}} in {{StorageDescriptor}}, otherwise +1 pending 
Hive QA.

We can do this in a follow-up JIRA, but we probably want to intern 
{{StorageDescriptor#sortCols}} too. I don't think different partitions can have 
different sort cols, so these would be duplicated across all partitions for a 
table. We will have to define a customized intern method for it, because 
{{sortCols}} is a list of {{Order}} objects, and an {{Order}} object is just a 
String + an int.
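
(For illustration, such a customized intern helper could look roughly like 
this: a minimal sketch assuming the metastore {{Order}} type with its String 
column name and int direction; the helper name is invented, this is not the 
patch.)

{code}
import java.util.List;
import org.apache.hadoop.hive.metastore.api.Order;

public class OrderInterner {
  // Intern the String component of each Order so identical sort columns
  // are shared across all Partition objects in memory; the int direction
  // needs no interning.
  public static List<Order> internSortCols(List<Order> sortCols) {
    if (sortCols == null) {
      return null;
    }
    for (Order o : sortCols) {
      if (o.getCol() != null) {
        o.setCol(o.getCol().intern());
      }
    }
    return sortCols;
  }
}
{code}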

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch, HIVE-19041.02.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat fields, which 
> causes a large number of duplicate Strings in the HMS memory. We should 
> intern these Strings during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19399) Down cast from int to tinyint generating incorrect value for vectorization

2018-05-03 Thread Haifeng Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haifeng Chen updated HIVE-19399:

Description: 
 The following SQL script generates different results with vectorization 
disabled and enabled (both for ORC and for Parquet).
   drop table test_schema;
   create table test_schema (f int) stored as parquet;
   insert into test_schema values ('9');
   select cast(f as tinyint) + 1 from test_schema;

With vectorization disabled, the result is -96, while with vectorization 
enabled it is 10.

 

  was:
 

The following SQL script generates different results with vectorization 
disabled and enabled.
  drop table test_schema;
  create table test_schema (f int) stored as parquet;
  insert into test_schema values ('9');
  select cast(f as tinyint) + 1 from test_schema;

With vectorization disabled, the result is -96, while with vectorization 
enabled it is 10.
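
(For reference: Java's narrowing int-to-byte conversion keeps only the low 8 
bits in two's complement, so values outside [-128, 127] wrap around rather 
than saturate. A minimal self-contained illustration, with the value 159 
chosen purely to show how a -96 can arise from this wraparound:)

{code}
public class DowncastDemo {
  public static void main(String[] args) {
    int f = 159;
    byte b = (byte) f;          // 159 -> -97 (i.e., 159 - 256)
    System.out.println(b + 1);  // prints -96
  }
}
{code}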

 


> Down cast from int to tinyint generating incorrect value for vectorization
> --
>
> Key: HIVE-19399
> URL: https://issues.apache.org/jira/browse/HIVE-19399
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 3.1.0
>Reporter: Haifeng Chen
>Priority: Major
>
>  The following SQL script generates different results with vectorization 
> disabled and enabled (both for ORC and for Parquet).
>    drop table test_schema;
>    create table test_schema (f int) stored as parquet;
>    insert into test_schema values ('9');
>    select cast(f as tinyint) + 1 from test_schema;
> With vectorization disabled, the result is -96, while with vectorization 
> enabled it is 10.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19384) Vectorization: IfExprTimestamp* do not handle NULLs correctly

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463258#comment-16463258
 ] 

Hive QA commented on HIVE-19384:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12921697/HIVE-19384.02.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 49 failed/errored test(s), 14319 tests 
executed
*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestJdbcWithMiniKdc - did not produce a TEST-*.xml file (likely timed out) 
(batchId=254)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed 
out) (batchId=217)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] 
(batchId=253)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_1] 
(batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_2] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative3] 
(batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_7] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_timestamp_funcs]
 (batchId=31)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_vector_dynpart_hashjoin_1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=105)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insertsel_fail] 
(batchId=95)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_data_parquet_empty]
 (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] 
(batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] 
(batchId=96)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_1] 
(batchId=137)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin_negative3]
 (batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=143)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_skew_1_23]
 (batchId=111)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_13] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_timestamp_funcs]
 (batchId=121)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=228)
org.apache.hadoop.hive.ql.TestAutoPurgeTables.testExternalNoAutoPurge 
(batchId=233)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232)
org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate
 (batchId=231)
org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys
 (batchId=231)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=235)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime
 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime
 

[jira] [Commented] (HIVE-19418) add background stats updater similar to compactor

2018-05-03 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463257#comment-16463257
 ] 

Gopal V commented on HIVE-19418:


bq. a lot of the stats, with some exceptions like numRows, cannot be 
aggregated (i.e. you cannot combine ndvs from two inserts)

For pure insert queries all stats can be merged - because nDVs are actually 
stored as HyperLogLog bitsets which have a merge() op.
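
(A minimal sketch of that merge, assuming plain dense register arrays of equal 
size; Hive's actual HyperLogLog class has its own merge() implementation.)

{code}
public class HllMergeSketch {
  // The union of two HyperLogLog sketches built with the same parameters
  // is simply the element-wise max of their registers.
  static byte[] merge(byte[] a, byte[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("sketches must use the same p");
    }
    byte[] merged = new byte[a.length];
    for (int i = 0; i < a.length; i++) {
      merged[i] = (byte) Math.max(a[i], b[i]);
    }
    return merged;
  }
}
{code}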

bq. Therefore we will add background logic to metastore (similar to, and 
partially inside, the ACID compactor)

With standalone-metastore, adding more background logic to the metastore is 
going to become a big problem - I'd argue that even the compactor needs to be 
moved out & the metastore should only keep the book-keeping for pending tasks 
(a generic task queue + priorities), because it will no longer have a 
yarn-site.xml in its configuration.


> add background stats updater similar to compactor
> -
>
> Key: HIVE-19418
> URL: https://issues.apache.org/jira/browse/HIVE-19418
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> There's a JIRA HIVE-19416 to add snapshot version to stats for MM/ACID tables 
> to make them usable in a transaction without breaking ACID (for metadata-only 
> optimization). However, stats for ACID tables can still become unusable if 
> e.g. two parallel inserts run - neither sees the data written by the other, 
> so after both finish, the snapshots on either set of stats won't match the 
> current snapshot and the stats will be unusable.
> Additionally, for ACID and non-ACID tables alike, a lot of the stats, with 
> some exceptions like numRows, cannot be aggregated (i.e. you cannot combine 
> ndvs from two inserts), and for ACID even less can be aggregated (you cannot 
> derive min/max if some rows are deleted but you don't scan the rest of the 
> dataset).
> Therefore we will add background logic to metastore (similar to, and 
> partially inside, the ACID compactor) to update stats.
> It will have 3 modes of operation.
> 1) Off.
> 2) Update only the stats that exist but are out of date (generating stats can 
> be expensive, so if the user is only analyzing a subset of tables it should 
> be able to only update that subset). We can simply look at existing stats and 
> only analyze for the relevant partitions and columns.
> 3) On: 2 + create stats for all tables and columns missing stats.
> There will also be a table parameter to skip stats update. 
> In phase 1, the process will operate outside of compactor, and run analyze 
> command on the table. The analyze command will automatically save the stats 
> with ACID snapshot information if needed, based on HIVE-19416, so we don't 
> need to do any special state management and this will work for all table 
> types. However it's also more expensive.
> In phase 2, we can explore adding stats collection during MM compaction that 
> uses a temp table. If we don't have open writers during major compaction (so 
> we overwrite all of the data), the temp table stats can simply be copied over 
> to the main table with correct snapshot information, saving us a table scan.
> In phase 3, we can add custom stats collection logic to full ACID compactor 
> that is not query based, the same way as we'd do for (2). Alternatively we 
> can wait for ACID compactor to become query based and just reuse (2).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463256#comment-16463256
 ] 

Vihang Karajgaonkar commented on HIVE-19041:


Attaching a second patch which addresses [~stakiar]'s comment above. 
Specifically, it interns {{catName}} in its setter, interns the 
{{FieldSchema.comment}} field, and interns the deserializerClass, 
serializerClass and serializationLib Strings in both their setters and the 
read method.

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch, HIVE-19041.02.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat fields, which 
> causes a large number of duplicate Strings in the HMS memory. We should 
> intern these Strings during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-19041:
---
Attachment: HIVE-19041.02.patch

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch, HIVE-19041.02.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat fields, which 
> causes a large number of duplicate Strings in the HMS memory. We should 
> intern these Strings during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463254#comment-16463254
 ] 

Vihang Karajgaonkar commented on HIVE-19041:


{{FieldSchema.comment}} is already being interned as per 
https://github.com/apache/hive/blob/master/standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/FieldSchema.java#L144

Hence adding an intern call around comment in the {{read}} method isn't really 
doing anything new, but rather fixes another important code path which wastes 
a lot of memory.
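
(A minimal sketch of the read-path fix being discussed; a helper like this is 
illustrative, not the Thrift-generated code itself.)

{code}
import org.apache.thrift.TException;
import org.apache.thrift.protocol.TProtocol;

public final class InternedRead {
  // Intern String fields as they come off the wire so repeated metadata
  // values collapse to a single instance in HMS memory.
  static String readInterned(TProtocol iprot) throws TException {
    String s = iprot.readString();
    return s == null ? null : s.intern();
  }
}
{code}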

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat fields, which 
> causes a large number of duplicate Strings in the HMS memory. We should 
> intern these Strings during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19418) add background stats updater similar to compactor

2018-05-03 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-19418:
--
Component/s: Transactions

> add background stats updater similar to compactor
> -
>
> Key: HIVE-19418
> URL: https://issues.apache.org/jira/browse/HIVE-19418
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> There's a JIRA HIVE-19416 to add snapshot version to stats for MM/ACID tables 
> to make them usable in a transaction without breaking ACID (for metadata-only 
> optimization). However, stats for ACID tables can still become unusable if 
> e.g. two parallel inserts run - neither sees the data written by the other, 
> so after both finish, the snapshots on either set of stats won't match the 
> current snapshot and the stats will be unusable.
> Additionally, for ACID and non-ACID tables alike, a lot of the stats, with 
> some exceptions like numRows, cannot be aggregated (i.e. you cannot combine 
> ndvs from two inserts), and for ACID even less can be aggregated (you cannot 
> derive min/max if some rows are deleted but you don't scan the rest of the 
> dataset).
> Therefore we will add background logic to metastore (similar to, and 
> partially inside, the ACID compactor) to update stats.
> It will have 3 modes of operation.
> 1) Off.
> 2) Update only the stats that exist but are out of date (generating stats can 
> be expensive, so if the user is only analyzing a subset of tables it should 
> be able to only update that subset). We can simply look at existing stats and 
> only analyze for the relevant partitions and columns.
> 3) On: 2 + create stats for all tables and columns missing stats.
> There will also be a table parameter to skip stats update. 
> In phase 1, the process will operate outside of compactor, and run analyze 
> command on the table. The analyze command will automatically save the stats 
> with ACID snapshot information if needed, based on HIVE-19416, so we don't 
> need to do any special state management and this will work for all table 
> types. However it's also more expensive.
> In phase 2, we can explore adding stats collection during MM compaction that 
> uses a temp table. If we don't have open writers during major compaction (so 
> we overwrite all of the data), the temp table stats can simply be copied over 
> to the main table with correct snapshot information, saving us a table scan.
> In phase 3, we can add custom stats collection logic to full ACID compactor 
> that is not query based, the same way as we'd do for (2). Alternatively we 
> can wait for ACID compactor to become query based and just reuse (2).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19418) add background stats updater similar to compactor

2018-05-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463241#comment-16463241
 ] 

Sergey Shelukhin commented on HIVE-19418:
-

cc [~steveyeom2017] [~ekoifman] [~ashutoshc]

> add background stats updater similar to compactor
> -
>
> Key: HIVE-19418
> URL: https://issues.apache.org/jira/browse/HIVE-19418
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> There's a JIRA HIVE-19416 to add snapshot version to stats for MM/ACID tables 
> to make them usable in a transaction without breaking ACID (for metadata-only 
> optimization). However, stats for ACID tables can still become unusable if 
> e.g. two parallel inserts run - neither sees the data written by the other, 
> so after both finish, the snapshots on either set of stats won't match the 
> current snapshot and the stats will be unusable.
> Additionally, for ACID and non-ACID tables alike, a lot of the stats, with 
> some exceptions like numRows, cannot be aggregated (i.e. you cannot combine 
> ndvs from two inserts), and for ACID even less can be aggregated (you cannot 
> derive min/max if some rows are deleted but you don't scan the rest of the 
> dataset).
> Therefore we will add background logic to metastore (similar to, and 
> partially inside, the ACID compactor) to update stats.
> It will have 3 modes of operation.
> 1) Off.
> 2) Update only the stats that exist but are out of date (generating stats can 
> be expensive, so if the user is only analyzing a subset of tables it should 
> be able to only update that subset). We can simply look at existing stats and 
> only analyze for the relevant partitions and columns.
> 3) On: 2 + create stats for all tables and columns missing stats.
> There will also be a table parameter to skip stats update. 
> In phase 1, the process will operate outside of compactor, and run analyze 
> command on the table. The analyze command will automatically save the stats 
> with ACID snapshot information if needed, based on HIVE-19416, so we don't 
> need to do any special state management and this will work for all table 
> types. However it's also more expensive.
> In phase 2, we can explore adding stats collection during MM compaction that 
> uses a temp table. If we don't have open writers during major compaction (so 
> we overwrite all of the data), the temp table stats can simply be copied over 
> to the main table with correct snapshot information, saving us a table scan.
> In phase 3, we can add custom stats collection logic to full ACID compactor 
> that is not query based, the same way as we'd do for (2). Alternatively we 
> can wait for ACID compactor to become query based and just reuse (2).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19418) add background stats updater similar to compactor

2018-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19418:
---


> add background stats updater similar to compactor
> -
>
> Key: HIVE-19418
> URL: https://issues.apache.org/jira/browse/HIVE-19418
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> There's a JIRA HIVE-19416 to add snapshot version to stats for MM/ACID tables 
> to make them usable in a transaction without breaking ACID (for metadata-only 
> optimization). However, stats for ACID tables can still become unusable if 
> e.g. two parallel inserts run - neither sees the data written by the other, 
> so after both finish, the snapshots on either set of stats won't match the 
> current snapshot and the stats will be unusable.
> Additionally, for ACID and non-ACID tables alike, a lot of the stats, with 
> some exceptions like numRows, cannot be aggregated (i.e. you cannot combine 
> ndvs from two inserts), and for ACID even less can be aggregated (you cannot 
> derive min/max if some rows are deleted but you don't scan the rest of the 
> dataset).
> Therefore we will add background logic to metastore (similar to, and 
> partially inside, the ACID compactor) to update stats.
> It will have 3 modes of operation.
> 1) Off.
> 2) Update only the stats that exist but are out of date (generating stats can 
> be expensive, so if the user is only analyzing a subset of tables it should 
> be able to only update that subset). We can simply look at existing stats and 
> only analyze for the relevant partitions and columns.
> 3) On: 2 + create stats for all tables and columns missing stats.
> There will also be a table parameter to skip stats update. 
> In phase 1, the process will operate outside of compactor, and run analyze 
> command on the table. The analyze command will automatically save the stats 
> with ACID snapshot information if needed, based on HIVE-19416, so we don't 
> need to do any special state management and this will work for all table 
> types. However it's also more expensive.
> In phase 2, we can explore adding stats collection during MM compaction that 
> uses a temp table. If we don't have open writers during major compaction (so 
> we overwrite all of the data), the temp table stats can simply be copied over 
> to the main table with correct snapshot information, saving us a table scan.
> In phase 3, we can add custom stats collection logic to full ACID compactor 
> that is not query based, the same way as we'd do for (2). Alternatively we 
> can wait for ACID compactor to become query based and just reuse (2).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463236#comment-16463236
 ] 

Misha Dmitriev commented on HIVE-19041:
---

[~gopalv] yes, since JDK 1.7 built-in string interning is far superior to the 
WeakHashMap-based approach (which the Hadoop weak interner used). Check the 
above article on interning for further details.

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat fields, which 
> causes a large number of duplicate Strings in the HMS memory. We should 
> intern these Strings during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463235#comment-16463235
 ] 

Misha Dmitriev commented on HIVE-19041:
---

Yes, all interned strings are kept in the JVM-internal equivalent of a 
concurrent WeakHashMap. Since it's highly specialized, it's very fast, and has 
no extra overhead when more strings are added to it (because it's quite large 
and preallocated, so actually every running JVM already bears this memory 
overhead of a few MB). If you are really interested, check this article: 
[http://java-performance.info/string-intern-in-java-6-7-8/] 

Basically, the only thing that you may be concerned with when using 
String.intern() is the CPU overhead. But in my experience, unless interning is 
mistakenly used for strings that are very short-lived anyway, the impact of 
reduced GC outweighs the impact of the extra CPU cycles consumed by the 
intern() call.
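
(A tiny self-contained illustration of the memory effect: two equal Strings 
collapse to one shared instance after intern().)

{code}
public class InternDemo {
  public static void main(String[] args) {
    String a = new String("hdfs://nn:8020/warehouse/t/part=1");
    String b = new String("hdfs://nn:8020/warehouse/t/part=1");
    System.out.println(a == b);                   // false: two distinct copies
    System.out.println(a.intern() == b.intern()); // true: one shared instance
  }
}
{code}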

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat fields, which 
> causes a large number of duplicate Strings in the HMS memory. We should 
> intern these Strings during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463233#comment-16463233
 ] 

Gopal V edited comment on HIVE-19041 at 5/4/18 12:21 AM:
-

Hadoop comes with a weak-interner, which is used by Tez.

StringInterner.weakIntern()

Looking at the code, it was explicitly removed by HIVE-17237 ??


was (Author: gopalv):
Hadoop comes with a weak-interner, which is used by Tez.

StringInterner.weakIntern()


> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat fields, which 
> causes a large number of duplicate Strings in the HMS memory. We should 
> intern these Strings during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463233#comment-16463233
 ] 

Gopal V commented on HIVE-19041:


Hadoop comes with a weak-interner, which is used by Tez.

StringInterner.weakIntern()


> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat fields, which 
> causes a large number of duplicate Strings in the HMS memory. We should 
> intern these Strings during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19417) Modify metastore to have/access persistent tables for stats

2018-05-03 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463226#comment-16463226
 ] 

Steve Yeom commented on HIVE-19417:
---

I will add a snapshot table and a reference to it in the existing stats 
tables. 

> Modify metastore to have/access persistent tables for stats
> ---
>
> Key: HIVE-19417
> URL: https://issues.apache.org/jira/browse/HIVE-19417
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19417) Modify metastore to have/access persistent tables for stats

2018-05-03 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-19417:
--
Summary: Modify metastore to have/access persistent tables for stats  (was: 
Modify metastore to have persistent tables/objects)

> Modify metastore to have/access persistent tables for stats
> ---
>
> Key: HIVE-19417
> URL: https://issues.apache.org/jira/browse/HIVE-19417
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463221#comment-16463221
 ] 

Sahil Takiar commented on HIVE-19041:
-

I guess if you have a comment for the partition column you will have tons of 
duplicate comments. I don't think we can selectively intern just for the 
partition column comments though.

I'm not sure how much overhead it would introduce if we just blindly interned 
all comments. This would include table-, database-, and column-level comments.

[~mi...@cloudera.com] do interned strings ever get purged from the Java heap? 

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch
>
>
> When a client is creating a large number of partitions, the thrift objects 
> are deserialized into Partition objects. The read method of these objects 
> does not intern the inputformat, location, and outputformat fields, which 
> causes a large number of duplicate Strings in the HMS memory. We should 
> intern these Strings during deserialization to reduce memory pressure. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19417) Modify metastore to have persistent tables/objects

2018-05-03 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom reassigned HIVE-19417:
-


> Modify metastore to have persistent tables/objects
> --
>
> Key: HIVE-19417
> URL: https://issues.apache.org/jira/browse/HIVE-19417
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19344) Change default value of msck.repair.batch.size

2018-05-03 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463214#comment-16463214
 ] 

Sahil Takiar commented on HIVE-19344:
-

+1

> Change default value of msck.repair.batch.size 
> ---
>
> Key: HIVE-19344
> URL: https://issues.apache.org/jira/browse/HIVE-19344
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-19344.01.patch
>
>
> {{msck.repair.batch.size}} defaults to 0, which means msck will try to add all 
> the partitions in one API call to HMS. This can potentially put huge memory 
> pressure on HMS. The default value should be changed to a reasonable number 
> so that, in case of a large number of partitions, we can batch the addition of 
> partitions. The same goes for {{msck.repair.batch.max.retries}}.
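
(For illustration, batching could then be controlled as below. The value is an 
example rather than the proposed default, the table name is hypothetical, and 
the {{hive.msck.repair.batch.size}} spelling is assumed to be the HiveConf 
name this property maps to.)

{code}
-- Add recovered partitions in batches of 3000 instead of one metastore call.
SET hive.msck.repair.batch.size=3000;
MSCK REPAIR TABLE web_logs;
{code}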



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19416) Create single version transactional table metastore statistics for aggregation queries

2018-05-03 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom reassigned HIVE-19416:
-


> Create single version transactional table metastore statistics for 
> aggregation queries
> --
>
> Key: HIVE-19416
> URL: https://issues.apache.org/jira/browse/HIVE-19416
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
>
> The system should use only statistics for aggregation queries like count on 
> transactional tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18570) ACID IOW implemented using base may delete too much data

2018-05-03 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463211#comment-16463211
 ] 

Eugene Koifman commented on HIVE-18570:
---

HIVE-18570.03.patch updates some golden files.

> ACID IOW implemented using base may delete too much data
> 
>
> Key: HIVE-18570
> URL: https://issues.apache.org/jira/browse/HIVE-18570
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-18570.01-branch-3.patch, HIVE-18570.01.patch, 
> HIVE-18570.02-branch-3.patch, HIVE-18570.02.patch, 
> HIVE-18570.03-branch-3.patch, HIVE-18570.03.patch, 
> HIVE-18570.04-branch-3.patch
>
>
> Suppose we have a table with delta_0 insert data.
> Txn 1 starts an insert into delta_1.
> Txn 2 starts an IOW into base_2.
> Txn 2 commits.
> Txn 1 commits after txn 2 but its results would be invisible.
> Txn 2 deletes rows committed by txn 1 that, according to standard ACID 
> semantics, it could never have observed or affected; this sequence of events 
> is only possible under the read-uncommitted isolation level (so, 2 deletes 
> rows written by 1 before 1 commits them). 
> This is if we look at IOW as transactional delete+insert. Otherwise we are 
> just saying IOW performs "semi"-transactional delete.
> If 1 ran an update on rows instead of an insert, and 2 still ran an 
> IOW/delete, row lock conflict (or equivalent) should cause one of them to 
> fail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18570) ACID IOW implemented using base may delete too much data

2018-05-03 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18570:
--
Attachment: HIVE-18570.03.patch

> ACID IOW implemented using base may delete too much data
> 
>
> Key: HIVE-18570
> URL: https://issues.apache.org/jira/browse/HIVE-18570
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Eugene Koifman
>Priority: Blocker
> Attachments: HIVE-18570.01-branch-3.patch, HIVE-18570.01.patch, 
> HIVE-18570.02-branch-3.patch, HIVE-18570.02.patch, 
> HIVE-18570.03-branch-3.patch, HIVE-18570.03.patch, 
> HIVE-18570.04-branch-3.patch
>
>
> Suppose we have a table with delta_0 insert data.
> Txn 1 starts an insert into delta_1.
> Txn 2 starts an IOW into base_2.
> Txn 2 commits.
> Txn 1 commits after txn 2 but its results would be invisible.
> Txn 2 deletes rows committed by txn 1 that, according to standard ACID 
> semantics, it could never have observed or affected; this sequence of events 
> is only possible under the read-uncommitted isolation level (so, 2 deletes 
> rows written by 1 before 1 commits them). 
> This is if we look at IOW as transactional delete+insert. Otherwise we are 
> just saying IOW performs "semi"-transactional delete.
> If 1 ran an update on rows instead of an insert, and 2 still ran an 
> IOW/delete, row lock conflict (or equivalent) should cause one of them to 
> fail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19384) Vectorization: IfExprTimestamp* do not handle NULLs correctly

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463207#comment-16463207
 ] 

Hive QA commented on HIVE-19384:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
48s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} storage-api: The patch generated 2 new + 12 unchanged 
- 2 fixed = 14 total (was 14) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
47s{color} | {color:red} ql: The patch generated 20 new + 232 unchanged - 14 
fixed = 252 total (was 246) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-10663/dev-support/hive-personality.sh
 |
| git revision | master / bf8e696 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10663/yetus/diff-checkstyle-storage-api.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10663/yetus/diff-checkstyle-ql.txt
 |
| modules | C: storage-api ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10663/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorization: IfExprTimestamp* do not handle NULLs correctly
> -
>
> Key: HIVE-19384
> URL: https://issues.apache.org/jira/browse/HIVE-19384
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-19384.01.patch, HIVE-19384.02.patch
>
>
> HIVE-18622: "Vectorization: IF Statements, Comparisons, and more do not 
> handle NULLs correctly" didn't quite fix the IfExprTimestamp* classes 
> right
> {noformat}
> // Carefully handle NULLs...
> outputColVector.noNulls = false;{noformat}
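
(For context, a minimal self-contained sketch of the "carefully handle NULLs" 
pattern referenced above; plain arrays stand in for the ColumnVector 
noNulls/isNull fields, so this is illustrative only.)

{code}
public class NullHandlingSketch {
  public static void main(String[] args) {
    boolean[] inputIsNull  = {false, true, false};
    long[]    inputValues  = {1L, 0L, 3L};
    boolean[] outputIsNull = new boolean[3];
    long[]    outputValues = new long[3];
    boolean noNulls = true;  // stands in for outputColVector.noNulls

    for (int i = 0; i < 3; i++) {
      if (inputIsNull[i]) {
        noNulls = false;         // the batch now contains NULLs...
        outputIsNull[i] = true;  // ...and each NULL row is marked individually
      } else {
        outputIsNull[i] = false;
        outputValues[i] = inputValues[i] + 1;
      }
    }
    System.out.println(noNulls);  // false: a NULL was propagated
  }
}
{code}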



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19344) Change default value of msck.repair.batch.size

2018-05-03 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463203#comment-16463203
 ] 

Vihang Karajgaonkar commented on HIVE-19344:


[~stakiar] Can you take a look?

> Change default value of msck.repair.batch.size 
> ---
>
> Key: HIVE-19344
> URL: https://issues.apache.org/jira/browse/HIVE-19344
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-19344.01.patch
>
>
> {{msck.repair.batch.size}} default to 0 which means msck will try to add all 
> the partitions in one API call to HMS. This can potentially add huge memory 
> pressure on HMS. The default value should be changed to a reasonable number 
> so that in case of large number of partitions we can batch the addition of 
> partitions. Same goes for {{msck.repair.batch.max.retries}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-19306) Arrow batch serializer

2018-05-03 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463200#comment-16463200
 ] 

Teddy Choi edited comment on HIVE-19306 at 5/3/18 11:45 PM:


[~mmccline], [~ewohlstadter]. HIVE-19306.3.patch fixes null handling. 
ArrowColumnarBatchSerDe#writeNull is the strangest part: Apache Arrow's 
UnionListWriter should implement AbstractFieldWriter#writeNull properly, and 
FieldWriter should have a super method of AbstractFieldWriter#writeNull to 
expose it, but they don't. I used reflection and a concrete class check to 
handle it. I'll fix the FieldWriter interface in Apache Arrow.


was (Author: teddy.choi):
[~mmccline], [~ewohlstadter]. [^HIVE-19306.3.patch] fixed null handling. 
ArrowColumnarBatchSerDe#writeNull is the strangest part. Because Apache Arrow's 
UnionListWriter should implement AbstractFieldWriter#writeNull properly and 
FieldWriter should have a super method of AbstractFieldWriter#writeNull to 
expose it. I used reflection and concrete class check to handle it. I'll fix 
FieldWriter interface in Apache Arrow.
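
(A minimal reflection sketch of the workaround described above; the writer 
object here is a hypothetical stand-in, not Arrow's UnionListWriter.)

{code}
import java.lang.reflect.Method;

public class WriteNullViaReflection {
  public static void main(String[] args) throws Exception {
    Object writer = new Object() {
      @SuppressWarnings("unused")
      public void writeNull() { System.out.println("null written"); }
    };
    // Look up a writeNull() the declared interface does not expose,
    // directly on the concrete class, and invoke it.
    Method writeNull = writer.getClass().getMethod("writeNull");
    writeNull.setAccessible(true);
    writeNull.invoke(writer);
  }
}
{code}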

> Arrow batch serializer
> --
>
> Key: HIVE-19306
> URL: https://issues.apache.org/jira/browse/HIVE-19306
> Project: Hive
>  Issue Type: Task
>  Components: Serializers/Deserializers
>Reporter: Eric Wohlstadter
>Assignee: Teddy Choi
>Priority: Major
> Attachments: HIVE-19306.2.patch
>
>
> Leverage the ThriftJDBCBinarySerDe code path that already exists in 
> SemanticAnalyzer/FileSinkOperator to create a serializer that batches rows 
> into Arrow vector batches.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19306) Arrow batch serializer

2018-05-03 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463200#comment-16463200
 ] 

Teddy Choi commented on HIVE-19306:
---

[~mmccline], [~ewohlstadter]. [^HIVE-19306.3.patch] fixed null handling. 
ArrowColumnarBatchSerDe#writeNull is the strangest part: Apache Arrow's 
UnionListWriter should implement AbstractFieldWriter#writeNull properly, and 
FieldWriter should expose AbstractFieldWriter#writeNull as a super method. I 
used reflection and a concrete-class check to handle it. I'll fix the 
FieldWriter interface in Apache Arrow.

> Arrow batch serializer
> --
>
> Key: HIVE-19306
> URL: https://issues.apache.org/jira/browse/HIVE-19306
> Project: Hive
>  Issue Type: Task
>  Components: Serializers/Deserializers
>Reporter: Eric Wohlstadter
>Assignee: Teddy Choi
>Priority: Major
> Attachments: HIVE-19306.2.patch
>
>
> Leverage the ThriftJDBCBinarySerDe code path that already exists in 
> SemanticAnalyzer/FileSinkOperator to create a serializer that batches rows 
> into Arrow vector batches.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19118) Vectorization: Turning on vectorization in escape_crlf produces wrong results

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463191#comment-16463191
 ] 

Hive QA commented on HIVE-19118:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12921696/HIVE-19118.05.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 48 failed/errored test(s), 14318 tests 
executed
*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed 
out) (batchId=217)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] 
(batchId=253)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_1] 
(batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_2] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative3] 
(batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_7] 
(batchId=27)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_dynpart_hashjoin_1]
 (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=105)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insertsel_fail] 
(batchId=95)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_data_parquet_empty]
 (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] 
(batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] 
(batchId=96)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_1] 
(batchId=137)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin_negative3]
 (batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=143)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_skew_1_23]
 (batchId=111)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_13] 
(batchId=122)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=228)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232)
org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate
 (batchId=231)
org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys
 (batchId=231)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=235)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel 
(batchId=235)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveAndKill 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime
 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime
 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerVertexRawInputSplitsNoKill
 

[jira] [Commented] (HIVE-18288) merge/concat not supported on Acid table

2018-05-03 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463187#comment-16463187
 ] 

Eugene Koifman commented on HIVE-18288:
---

[~sershe] could you review please

> merge/concat not supported on Acid table
> 
>
> Key: HIVE-18288
> URL: https://issues.apache.org/jira/browse/HIVE-18288
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18288.01.patch, HIVE-18288.02.patch
>
>
> For example, mvn test -Dtest=TestCliDriver -Dqfile=orc_merge10.q
> now ends up with 
> {noformat}
> 2017-12-15T15:12:30,753 ERROR [7c3ff5b2-285c-44f2-8b13-5c3ccbd41b13 main] 
> ql.Driver: FAILED: SemanticException 
> org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/M\
> erge can not be performed on transactional tables
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/Merge can not 
> be performed on transactional tables
> at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTablePartMergeFiles(DDLSemanticAnalyzer.java:2172)
> at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:343)
> {noformat}
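> An illustrative way to hit this path, against a hypothetical transactional ORC 
> table:
> {noformat}
> ALTER TABLE orc_acid_tbl CONCATENATE;
> -- fails with: Concatenate/Merge can not be performed on transactional tables
> {noformat}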



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19298) Fix operator tree of CTAS for Druid Storage Handler

2018-05-03 Thread slim bouguerra (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-19298:
--
Attachment: HIVE-19298.2.patch

> Fix operator tree of CTAS for Druid Storage Handler
> ---
>
> Key: HIVE-19298
> URL: https://issues.apache.org/jira/browse/HIVE-19298
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-19298.2.patch, HIVE-19298.patch, HIVE-19298.patch
>
>
> The current operator plan of CTAS for the Druid storage handler is broken when 
> the user sets the property {code}hive.exec.parallel{code} to {code}true{code}.
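> An illustrative reproduction, with hypothetical table and source names:
> {noformat}
> SET hive.exec.parallel=true;
> CREATE TABLE druid_wiki
> STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
> AS SELECT `__time`, page, added FROM wiki_src;
> {noformat}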



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19415) Support CORS for all HS2 web endpoints

2018-05-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463179#comment-16463179
 ] 

Sergey Shelukhin commented on HIVE-19415:
-

+1

> Support CORS for all HS2 web endpoints
> --
>
> Key: HIVE-19415
> URL: https://issues.apache.org/jira/browse/HIVE-19415
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Minor
> Attachments: HIVE-19415.1.patch
>
>
> HIVE-19277 changes alone are not sufficient to support CORS. 
> CrossOriginFilter has to be added to Jetty, which will serve an appropriate 
> response for the OPTIONS pre-flight request. 
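> A minimal sketch, assuming HS2's existing Jetty ServletContextHandler, of 
> wiring the filter; the init parameters shown are illustrative defaults:
> {noformat}
> import java.util.EnumSet;
> import javax.servlet.DispatcherType;
> import org.eclipse.jetty.servlet.FilterHolder;
> import org.eclipse.jetty.servlets.CrossOriginFilter;
>
> FilterHolder holder = new FilterHolder(CrossOriginFilter.class);
> holder.setInitParameter(CrossOriginFilter.ALLOWED_ORIGINS_PARAM, "*");
> holder.setInitParameter(CrossOriginFilter.ALLOWED_METHODS_PARAM, "GET,POST,OPTIONS");
> // context is the HS2 web app's ServletContextHandler
> context.addFilter(holder, "/*", EnumSet.of(DispatcherType.REQUEST));
> {noformat}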



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19415) Support CORS for all HS2 web endpoints

2018-05-03 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463178#comment-16463178
 ] 

Prasanth Jayachandran commented on HIVE-19415:
--

[~sershe]/[~gopalv] can someone please review this patch? It's a small patch. RB 
is acting weird; it doesn't respond to "rbt post" or manual upload. 

> Support CORS for all HS2 web endpoints
> --
>
> Key: HIVE-19415
> URL: https://issues.apache.org/jira/browse/HIVE-19415
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Minor
> Attachments: HIVE-19415.1.patch
>
>
> HIVE-19277 changes alone are not sufficient to support CORS. 
> CrossOriginFilter has to be added to Jetty, which will serve an appropriate 
> response for the OPTIONS pre-flight request. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19415) Support CORS for all HS2 web endpoints

2018-05-03 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-19415:
-
Priority: Minor  (was: Major)

> Support CORS for all HS2 web endpoints
> --
>
> Key: HIVE-19415
> URL: https://issues.apache.org/jira/browse/HIVE-19415
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Minor
> Attachments: HIVE-19415.1.patch
>
>
> HIVE-19277 changes alone are not sufficient to support CORS. 
> CrossOriginFilter has to be added to Jetty, which will serve an appropriate 
> response for the OPTIONS pre-flight request. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19415) Support CORS for all HS2 web endpoints

2018-05-03 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-19415:
-
Status: Patch Available  (was: Open)

> Support CORS for all HS2 web endpoints
> --
>
> Key: HIVE-19415
> URL: https://issues.apache.org/jira/browse/HIVE-19415
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-19415.1.patch
>
>
> HIVE-19277 changes alone are not sufficient to support CORS. 
> CrossOriginFilter has to be added to Jetty, which will serve an appropriate 
> response for the OPTIONS pre-flight request. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19415) Support CORS for all HS2 web endpoints

2018-05-03 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-19415:
-
Attachment: HIVE-19415.1.patch

> Support CORS for all HS2 web endpoints
> --
>
> Key: HIVE-19415
> URL: https://issues.apache.org/jira/browse/HIVE-19415
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-19415.1.patch
>
>
> HIVE-19277 changes alone are not sufficient to support CORS. 
> CrossOriginFilter has to be added to Jetty, which will serve an appropriate 
> response for the OPTIONS pre-flight request. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default

2018-05-03 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463171#comment-16463171
 ] 

Daniel Dai commented on HIVE-19389:
---

+1

> Schematool: For Hive's Information Schema, use embedded HS2 as default
> --
>
> Key: HIVE-19389
> URL: https://issues.apache.org/jira/browse/HIVE-19389
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-19389.1.patch, HIVE-19389.2.patch
>
>
> Currently, for initializing/upgrading Hive's information schema, we require a 
> full JDBC URL (for HS2). It would be good to have it connect using embedded 
> HS2 by default.
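> For illustration only (flags and host are hypothetical): today an explicit HS2 
> URL must be supplied, whereas the embedded form would need none:
> {noformat}
> # current: point schematool at a running HS2 explicitly
> schematool -dbType hive -metaDbType mysql -initSchema \
>     -url jdbc:hive2://hs2host:10000/default
> # proposed default: the bare embedded-mode URL, jdbc:hive2://
> schematool -dbType hive -metaDbType mysql -initSchema
> {noformat}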



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19118) Vectorization: Turning on vectorization in escape_crlf produces wrong results

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463174#comment-16463174
 ] 

Hive QA commented on HIVE-19118:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
54s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m  
5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-10662/dev-support/hive-personality.sh
 |
| git revision | master / 70d835b |
| Default Java | 1.8.0_111 |
| modules | C: serde . ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-10662/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorization: Turning on vectorization in escape_crlf produces wrong results
> -
>
> Key: HIVE-19118
> URL: https://issues.apache.org/jira/browse/HIVE-19118
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Haifeng Chen
>Priority: Critical
> Attachments: HIVE-19118.01.patch, HIVE-19118.02.patch, 
> HIVE-19118.03.patch, HIVE-19118.04.patch, HIVE-19118.05.patch
>
>
> Found in the vectorization-enabled-by-default experiment.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default

2018-05-03 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463170#comment-16463170
 ] 

Vaibhav Gumashta commented on HIVE-19389:
-

[~daijy] Updated

> Schematool: For Hive's Information Schema, use embedded HS2 as default
> --
>
> Key: HIVE-19389
> URL: https://issues.apache.org/jira/browse/HIVE-19389
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-19389.1.patch, HIVE-19389.2.patch
>
>
> Currently, for initializing/upgrading Hive's information schema, we require a 
> full JDBC URL (for HS2). It would be good to have it connect using embedded 
> HS2 by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19415) Support CORS for all HS2 web endpoints

2018-05-03 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-19415:



> Support CORS for all HS2 web endpoints
> --
>
> Key: HIVE-19415
> URL: https://issues.apache.org/jira/browse/HIVE-19415
> Project: Hive
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>
> HIVE-19277 changes alone are not sufficient to support CORS. 
> CrossOriginFilter has to be added to Jetty, which will serve an appropriate 
> response for the OPTIONS pre-flight request. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default

2018-05-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-19389:

Attachment: HIVE-19389.2.patch

> Schematool: For Hive's Information Schema, use embedded HS2 as default
> --
>
> Key: HIVE-19389
> URL: https://issues.apache.org/jira/browse/HIVE-19389
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-19389.1.patch, HIVE-19389.2.patch
>
>
> Currently, for initializing/upgrading Hive's information schema, we require a 
> full JDBC URL (for HS2). It would be good to have it connect using embedded 
> HS2 by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18288) merge/concat not supported on Acid table

2018-05-03 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18288:
--
Attachment: HIVE-18288.02.patch

> merge/concat not supported on Acid table
> 
>
> Key: HIVE-18288
> URL: https://issues.apache.org/jira/browse/HIVE-18288
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18288.01.patch, HIVE-18288.02.patch
>
>
> For example, mvn test -Dtest=TestCliDriver -Dqfile=orc_merge10.q
> now ends up with 
> {noformat}
> 2017-12-15T15:12:30,753 ERROR [7c3ff5b2-285c-44f2-8b13-5c3ccbd41b13 main] 
> ql.Driver: FAILED: SemanticException 
> org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/M\
> erge can not be performed on transactional tables
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/Merge can not 
> be performed on transactional tables
> at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTablePartMergeFiles(DDLSemanticAnalyzer.java:2172)
> at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:343)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18288) merge/concat not supported on Acid table

2018-05-03 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18288:
--
Attachment: (was: HIVE-18288.02.patch)

> merge/concat not supported on Acid table
> 
>
> Key: HIVE-18288
> URL: https://issues.apache.org/jira/browse/HIVE-18288
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18288.01.patch, HIVE-18288.02.patch
>
>
> For example, mvn test -Dtest=TestCliDriver -Dqfile=orc_merge10.q
> now ends up with 
> {noformat}
> 2017-12-15T15:12:30,753 ERROR [7c3ff5b2-285c-44f2-8b13-5c3ccbd41b13 main] 
> ql.Driver: FAILED: SemanticException 
> org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/M\
> erge can not be performed on transactional tables
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/Merge can not 
> be performed on transactional tables
> at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTablePartMergeFiles(DDLSemanticAnalyzer.java:2172)
> at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:343)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18288) merge/concat not supported on Acid table

2018-05-03 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18288:
--
Attachment: HIVE-18288.02.patch

> merge/concat not supported on Acid table
> 
>
> Key: HIVE-18288
> URL: https://issues.apache.org/jira/browse/HIVE-18288
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18288.01.patch, HIVE-18288.02.patch
>
>
> For example, mvn test -Dtest=TestCliDriver -Dqfile=orc_merge10.q
> now ends up with 
> {noformat}
> 2017-12-15T15:12:30,753 ERROR [7c3ff5b2-285c-44f2-8b13-5c3ccbd41b13 main] 
> ql.Driver: FAILED: SemanticException 
> org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/M\
> erge can not be performed on transactional tables
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/Merge can not 
> be performed on transactional tables
> at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTablePartMergeFiles(DDLSemanticAnalyzer.java:2172)
> at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:343)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18288) merge/concat not supported on Acid table

2018-05-03 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18288:
--
Status: Patch Available  (was: Open)

> merge/concat not supported on Acid table
> 
>
> Key: HIVE-18288
> URL: https://issues.apache.org/jira/browse/HIVE-18288
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-18288.01.patch, HIVE-18288.02.patch
>
>
> For example, mvn test -Dtest=TestCliDriver -Dqfile=orc_merge10.q
> now ends up with 
> {noformat}
> 2017-12-15T15:12:30,753 ERROR [7c3ff5b2-285c-44f2-8b13-5c3ccbd41b13 main] 
> ql.Driver: FAILED: SemanticException 
> org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/M\
> erge can not be performed on transactional tables
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.parse.SemanticException: Concatenate/Merge can not 
> be performed on transactional tables
> at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTablePartMergeFiles(DDLSemanticAnalyzer.java:2172)
> at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:343)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19394) WM_TRIGGER trigger creation failed with type cast from Integer to Boolean

2018-05-03 Thread Thai Bui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463123#comment-16463123
 ] 

Thai Bui commented on HIVE-19394:
-

(y) Awesome!

> WM_TRIGGER trigger creation failed with type cast from Integer to Boolean 
> --
>
> Key: HIVE-19394
> URL: https://issues.apache.org/jira/browse/HIVE-19394
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Thai Bui
>Assignee: Thai Bui
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-19394.patch, HIVE-19394.patch
>
>
> While testing the new WM feature with a Hive metastore created using 
> PostgreSQL, I discovered a bug when creating a new trigger. For example:
> {noformat}
> CREATE RESOURCE PLAN plan_1 WITH QUERY_PARALLELISM=4;
> CREATE POOL plan_1.slow WITH
>ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fair';
> ALTER POOL plan_1.default SET
>ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fifo';
> CREATE TRIGGER plan_1.trigger_1 WHEN S3A_BYTES_READ > 268435456 DO MOVE TO 
> slow;
> {noformat}
> Right at the CREATE TRIGGER statement, an error will occur
> {noformat}
> Error while processing statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Insert of 
> object "org.apache.hadoop.hive.metastore.model.MWMTrigger@5c5ae5d8" using 
> statement "INSERT INTO "WM_TRIGGER" 
> ("TRIGGER_ID","ACTION_EXPRESSION","IS_IN_UNMANAGED","NAME","RP_ID","TRIGGER_EXPRESSION")
>  VALUES (?,?,?,?,?,?)" failed : ERROR: column "IS_IN_UNMANAGED" is of type 
> boolean but expression is of type integer Hint: You will need to rewrite or 
> cast the expression. Position: 129)
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:729)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.createWMTrigger(ObjectStore.java:11218)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at com.sun.proxy.$Proxy37.createWMTrigger(Unknown Source) ~[?:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_wm_trigger(HiveMetaStore.java:7846)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at com.sun.proxy.$Proxy39.create_wm_trigger(Unknown Source) ~[?:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createWMTrigger(HiveMetaStoreClient.java:3062)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> 

[jira] [Resolved] (HIVE-19394) WM_TRIGGER trigger creation failed with type cast from Integer to Boolean

2018-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-19394.
-
Resolution: Fixed

> WM_TRIGGER trigger creation failed with type cast from Integer to Boolean 
> --
>
> Key: HIVE-19394
> URL: https://issues.apache.org/jira/browse/HIVE-19394
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Thai Bui
>Assignee: Thai Bui
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-19394.patch, HIVE-19394.patch
>
>
> While testing the new WM feature with a Hive metastore created using 
> PostgreSQL, I discovered a bug when creating a new trigger. For example:
> {noformat}
> CREATE RESOURCE PLAN plan_1 WITH QUERY_PARALLELISM=4;
> CREATE POOL plan_1.slow WITH
>ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fair';
> ALTER POOL plan_1.default SET
>ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fifo';
> CREATE TRIGGER plan_1.trigger_1 WHEN S3A_BYTES_READ > 268435456 DO MOVE TO 
> slow;
> {noformat}
> Right at the CREATE TRIGGER statement, an error will occur
> {noformat}
> Error while processing statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Insert of 
> object "org.apache.hadoop.hive.metastore.model.MWMTrigger@5c5ae5d8" using 
> statement "INSERT INTO "WM_TRIGGER" 
> ("TRIGGER_ID","ACTION_EXPRESSION","IS_IN_UNMANAGED","NAME","RP_ID","TRIGGER_EXPRESSION")
>  VALUES (?,?,?,?,?,?)" failed : ERROR: column "IS_IN_UNMANAGED" is of type 
> boolean but expression is of type integer Hint: You will need to rewrite or 
> cast the expression. Position: 129)
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:729)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.createWMTrigger(ObjectStore.java:11218)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at com.sun.proxy.$Proxy37.createWMTrigger(Unknown Source) ~[?:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_wm_trigger(HiveMetaStore.java:7846)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at com.sun.proxy.$Proxy39.create_wm_trigger(Unknown Source) ~[?:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createWMTrigger(HiveMetaStoreClient.java:3062)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> 

[jira] [Commented] (HIVE-19394) WM_TRIGGER trigger creation failed with type cast from Integer to Boolean

2018-05-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463119#comment-16463119
 ] 

Sergey Shelukhin commented on HIVE-19394:
-

Missed that. Let me update in place.

> WM_TRIGGER trigger creation failed with type cast from Integer to Boolean 
> --
>
> Key: HIVE-19394
> URL: https://issues.apache.org/jira/browse/HIVE-19394
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Thai Bui
>Assignee: Thai Bui
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-19394.patch, HIVE-19394.patch
>
>
> While testing the new WM feature with a Hive metastore created using 
> PostgreSQL, I discovered a bug when creating a new trigger. For example:
> {noformat}
> CREATE RESOURCE PLAN plan_1 WITH QUERY_PARALLELISM=4;
> CREATE POOL plan_1.slow WITH
>ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fair';
> ALTER POOL plan_1.default SET
>ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fifo';
> CREATE TRIGGER plan_1.trigger_1 WHEN S3A_BYTES_READ > 268435456 DO MOVE TO 
> slow;
> {noformat}
> Right at the CREATE TRIGGER statement, an error will occur
> {noformat}
> Error while processing statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Insert of 
> object "org.apache.hadoop.hive.metastore.model.MWMTrigger@5c5ae5d8" using 
> statement "INSERT INTO "WM_TRIGGER" 
> ("TRIGGER_ID","ACTION_EXPRESSION","IS_IN_UNMANAGED","NAME","RP_ID","TRIGGER_EXPRESSION")
>  VALUES (?,?,?,?,?,?)" failed : ERROR: column "IS_IN_UNMANAGED" is of type 
> boolean but expression is of type integer Hint: You will need to rewrite or 
> cast the expression. Position: 129)
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:729)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.createWMTrigger(ObjectStore.java:11218)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at com.sun.proxy.$Proxy37.createWMTrigger(Unknown Source) ~[?:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_wm_trigger(HiveMetaStore.java:7846)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at com.sun.proxy.$Proxy39.create_wm_trigger(Unknown Source) ~[?:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createWMTrigger(HiveMetaStoreClient.java:3062)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> 

[jira] [Reopened] (HIVE-19394) WM_TRIGGER trigger creation failed with type cast from Integer to Boolean

2018-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reopened HIVE-19394:
-

> WM_TRIGGER trigger creation failed with type cast from Integer to Boolean 
> --
>
> Key: HIVE-19394
> URL: https://issues.apache.org/jira/browse/HIVE-19394
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Thai Bui
>Assignee: Thai Bui
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-19394.patch, HIVE-19394.patch
>
>
> While testing the new WM feature with a Hive metastore created using 
> PostgreSQL, I discovered a bug when creating a new trigger. For example:
> {noformat}
> CREATE RESOURCE PLAN plan_1 WITH QUERY_PARALLELISM=4;
> CREATE POOL plan_1.slow WITH
>ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fair';
> ALTER POOL plan_1.default SET
>ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fifo';
> CREATE TRIGGER plan_1.trigger_1 WHEN S3A_BYTES_READ > 268435456 DO MOVE TO 
> slow;
> {noformat}
> Right at the CREATE TRIGGER statement, an error will occur
> {noformat}
> Error while processing statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Insert of 
> object "org.apache.hadoop.hive.metastore.model.MWMTrigger@5c5ae5d8" using 
> statement "INSERT INTO "WM_TRIGGER" 
> ("TRIGGER_ID","ACTION_EXPRESSION","IS_IN_UNMANAGED","NAME","RP_ID","TRIGGER_EXPRESSION")
>  VALUES (?,?,?,?,?,?)" failed : ERROR: column "IS_IN_UNMANAGED" is of type 
> boolean but expression is of type integer Hint: You will need to rewrite or 
> cast the expression. Position: 129)
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:729)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.createWMTrigger(ObjectStore.java:11218)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at com.sun.proxy.$Proxy37.createWMTrigger(Unknown Source) ~[?:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_wm_trigger(HiveMetaStore.java:7846)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at com.sun.proxy.$Proxy39.create_wm_trigger(Unknown Source) ~[?:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createWMTrigger(HiveMetaStoreClient.java:3062)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> 

[jira] [Assigned] (HIVE-19414) Cover partitioned table stats update/retrieve cases

2018-05-03 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom reassigned HIVE-19414:
-

Assignee: Steve Yeom

> Cover partitioned table stats update/retrieve cases 
> 
>
> Key: HIVE-19414
> URL: https://issues.apache.org/jira/browse/HIVE-19414
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19394) WM_TRIGGER trigger creation failed with type cast from Integer to Boolean

2018-05-03 Thread Thai Bui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463117#comment-16463117
 ] 

Thai Bui commented on HIVE-19394:
-

Thanks [~sershe]. I noticed that the upgrade script has the type defined as
{noformat}
"IS_IN_UNMANAGED" smallint NOT NULL DEFAULT false{noformat}
I just did a quick test in Postgres and that will cause an error because of 
`DEFAULT false`. It needs to be `DEFAULT 0`.
{noformat}
create table test (active smallint not null default false);
ERROR: column "active" is of type smallint but default expression is of type 
boolean
HINT: You will need to rewrite or cast the expression.{noformat}
Can we just push a follow-up commit to fix it, or do we have to create a new 
Jira and everything? :)
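
For reference, the corrected form that passes the same quick test, with the 
default rewritten as an integer:
{noformat}
create table test (active smallint not null default 0);
-- and correspondingly in the upgrade script:
-- "IS_IN_UNMANAGED" smallint NOT NULL DEFAULT 0
{noformat}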

> WM_TRIGGER trigger creation failed with type cast from Integer to Boolean 
> --
>
> Key: HIVE-19394
> URL: https://issues.apache.org/jira/browse/HIVE-19394
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Thai Bui
>Assignee: Thai Bui
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-19394.patch, HIVE-19394.patch
>
>
> While testing the new WM feature with a Hive metastore created using 
> PostgreSQL, I discovered a bug when creating a new trigger. For example:
> {noformat}
> CREATE RESOURCE PLAN plan_1 WITH QUERY_PARALLELISM=4;
> CREATE POOL plan_1.slow WITH
>ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fair';
> ALTER POOL plan_1.default SET
>ALLOC_FRACTION=0.5, QUERY_PARALLELISM=2, SCHEDULING_POLICY='fifo';
> CREATE TRIGGER plan_1.trigger_1 WHEN S3A_BYTES_READ > 268435456 DO MOVE TO 
> slow;
> {noformat}
> Right at the CREATE TRIGGER statement, an error will occur
> {noformat}
> Error while processing statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Insert of 
> object "org.apache.hadoop.hive.metastore.model.MWMTrigger@5c5ae5d8" using 
> statement "INSERT INTO "WM_TRIGGER" 
> ("TRIGGER_ID","ACTION_EXPRESSION","IS_IN_UNMANAGED","NAME","RP_ID","TRIGGER_EXPRESSION")
>  VALUES (?,?,?,?,?,?)" failed : ERROR: column "IS_IN_UNMANAGED" is of type 
> boolean but expression is of type integer Hint: You will need to rewrite or 
> cast the expression. Position: 129)
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:729)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.createWMTrigger(ObjectStore.java:11218)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
> ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at com.sun.proxy.$Proxy37.createWMTrigger(Unknown Source) ~[?:?]
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_wm_trigger(HiveMetaStore.java:7846)
>  ~[hive-exec-3.1.0-SNAPSHOT.jar:3.1.0-SNAPSHOT]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_151]
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_151]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  

[jira] [Commented] (HIVE-19358) CBO decorrelation logic should generate Hive operators

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463116#comment-16463116
 ] 

Hive QA commented on HIVE-19358:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12921694/fix.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 46 failed/errored test(s), 14316 tests 
executed
*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed 
out) (batchId=217)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed 
out) (batchId=247)
TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] 
(batchId=253)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_1] 
(batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_2] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative3] 
(batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_7] 
(batchId=27)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_4]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_smb] 
(batchId=176)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] 
(batchId=105)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[insertsel_fail] 
(batchId=95)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[load_data_parquet_empty]
 (batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_reflect_neg] 
(batchId=96)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_test_error] 
(batchId=96)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_1] 
(batchId=137)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucketmapjoin_negative3]
 (batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_1_23] 
(batchId=143)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_skew_1_23]
 (batchId=111)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_13] 
(batchId=122)
org.apache.hadoop.hive.ql.TestAcidOnTez.testCtasTezUnion (batchId=228)
org.apache.hadoop.hive.ql.TestAcidOnTez.testNonStandardConversion01 
(batchId=228)
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=232)
org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate
 (batchId=231)
org.apache.hadoop.hive.ql.parse.TestReplicationOnHDFSEncryptedZones.targetAndSourceHaveDifferentEncryptionZoneKeys
 (batchId=231)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=235)
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp (batchId=239)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testMultipleTriggers2 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomCreatedFiles 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomNonExistent 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerCustomReadOps 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesRead 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighBytesWrite 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryElapsedTime
 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerSlowQueryExecutionTime
 (batchId=241)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerVertexRawInputSplitsNoKill
 (batchId=241)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/10661/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10661/console
Test logs: 

[jira] [Commented] (HIVE-17824) msck repair table should drop the missing partitions from metastore

2018-05-03 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463113#comment-16463113
 ] 

Vihang Karajgaonkar commented on HIVE-17824:


Test failures are unrelated. Patch merged in branch-2 as well. Thank you for 
your contribution [~janulatha] and patience with the ptest issues :)

> msck repair table should drop the missing partitions from metastore
> ---
>
> Key: HIVE-17824
> URL: https://issues.apache.org/jira/browse/HIVE-17824
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Janaki Lahorani
>Priority: Major
> Fix For: 3.0.0, 2.4.0, 3.1.0
>
> Attachments: HIVE-17824-branch-2.01.patch, 
> HIVE-17824.01-branch-2.patch, HIVE-17824.01-branch-2.patch, 
> HIVE-17824.01-branch-2.patch, HIVE-17824.1.patch, HIVE-17824.2.patch, 
> HIVE-17824.3.patch, HIVE-17824.4.patch
>
>
> {{msck repair table }} is often used in environments where new partitions are 
> loaded as directories on HDFS or S3 and users want to create the missing 
> partitions in bulk. However, it currently only supports adding missing 
> partitions. If any partitions are present in the metastore but not on the 
> FileSystem, it should also delete them so that it truly repairs the table 
> metadata.
> We should be careful not to break backwards compatibility, so we should 
> introduce either a new config or a keyword to add support for deleting 
> unnecessary partitions from the metastore. This way, users who want the old 
> behavior can easily turn it off. 
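
For illustration, the keyword-based form proposed above could look roughly like 
this (a sketch only; "my_table" is a placeholder and the exact keywords are 
defined by the committed patch):
{code}
-- Add missing partitions only (the old, default behavior):
MSCK REPAIR TABLE my_table ADD PARTITIONS;

-- Drop partitions present in the metastore but missing on the FileSystem:
MSCK REPAIR TABLE my_table DROP PARTITIONS;

-- Do both, so the metadata truly matches the FileSystem:
MSCK REPAIR TABLE my_table SYNC PARTITIONS;
{code}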



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-17824) msck repair table should drop the missing partitions from metastore

2018-05-03 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17824:
---
   Resolution: Fixed
Fix Version/s: 2.4.0
   Status: Resolved  (was: Patch Available)

> msck repair table should drop the missing partitions from metastore
> ---
>
> Key: HIVE-17824
> URL: https://issues.apache.org/jira/browse/HIVE-17824
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Janaki Lahorani
>Priority: Major
> Fix For: 3.0.0, 2.4.0, 3.1.0
>
> Attachments: HIVE-17824-branch-2.01.patch, 
> HIVE-17824.01-branch-2.patch, HIVE-17824.01-branch-2.patch, 
> HIVE-17824.01-branch-2.patch, HIVE-17824.1.patch, HIVE-17824.2.patch, 
> HIVE-17824.3.patch, HIVE-17824.4.patch
>
>
> {{msck repair table }} is often used in environments where new partitions are 
> loaded as directories on HDFS or S3 and users want to create the missing 
> partitions in bulk. However, it currently only supports adding missing 
> partitions. If any partitions are present in the metastore but not on the 
> FileSystem, it should also delete them so that it truly repairs the table 
> metadata.
> We should be careful not to break backwards compatibility, so we should 
> introduce either a new config or a keyword to add support for deleting 
> unnecessary partitions from the metastore. This way, users who want the old 
> behavior can easily turn it off. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19413) Add/use writeId and validWriteIdList during update and for retrieval.

2018-05-03 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom reassigned HIVE-19413:
-


> Add/use writeId and validWriteIdList during update and for retrieval. 
> 
>
> Key: HIVE-19413
> URL: https://issues.apache.org/jira/browse/HIVE-19413
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default

2018-05-03 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463111#comment-16463111
 ] 

Daniel Dai commented on HIVE-19389:
---

Actually, the best way to check whether it is initializing the information 
schema is dbType='hive'.
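
In code, that check might look like the following (a minimal sketch; the 
variable name is illustrative, not the actual schematool patch):
{code:java}
// true only when schematool is creating/upgrading the information schema
boolean initializingInfoSchema = "hive".equalsIgnoreCase(dbType);
{code}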

> Schematool: For Hive's Information Schema, use embedded HS2 as default
> --
>
> Key: HIVE-19389
> URL: https://issues.apache.org/jira/browse/HIVE-19389
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-19389.1.patch
>
>
> Currently, for initializing/upgrading Hive's information schema, we require a 
> full jdbc url (for HS2). It will be good to have it connect using embedded 
> HS2 by default.
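
For illustration, the difference would be roughly the following (a sketch; the 
host, port, metastore DB type, and exact flags are placeholders rather than the 
final syntax from the patch):
{noformat}
# Today: a full HS2 JDBC url has to be supplied explicitly.
schematool -dbType hive -metaDbType mysql -initSchema \
  -url jdbc:hive2://hs2host:10000/default -driver org.apache.hive.jdbc.HiveDriver

# Proposed default: omit -url/-driver and let schematool connect through
# embedded HS2 (jdbc:hive2://) to create the information schema.
schematool -dbType hive -metaDbType mysql -initSchema
{noformat}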



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default

2018-05-03 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-19389:
--
Comment: was deleted

(was: +1)

> Schematool: For Hive's Information Schema, use embedded HS2 as default
> --
>
> Key: HIVE-19389
> URL: https://issues.apache.org/jira/browse/HIVE-19389
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-19389.1.patch
>
>
> Currently, for initializing/upgrading Hive's information schema, we require a 
> full jdbc url (for HS2). It will be good to have it connect using embedded 
> HS2 by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19041) Thrift deserialization of Partition objects should intern fields

2018-05-03 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463109#comment-16463109
 ] 

Vihang Karajgaonkar commented on HIVE-19041:


The test did not show comments as wasteful, but then in my testing we do not 
add comments to the column fields. We would need actual dumps from users to 
confirm that theory, which we don't have, particularly for this issue. Even 
otherwise, I think we can intern the comment in the partition objects since 
they will be the same for all the partitions. 

> Thrift deserialization of Partition objects should intern fields
> 
>
> Key: HIVE-19041
> URL: https://issues.apache.org/jira/browse/HIVE-19041
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19041.01.patch
>
>
> When a client is creating large number of partitions, the thrift objects are 
> deserialized into Partition objects. The read method of these objects does 
> not intern the inputformat, location, outputformat which cause large number 
> of duplicate Strings in the HMS memory. We should intern these objects while 
> deserialization to reduce memory pressure. 
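
As a self-contained illustration of the proposed change (a sketch, not the 
actual generated Thrift code; the class and values are made up), interning 
makes all identical field values alias a single String instance:
{code:java}
import java.util.ArrayList;
import java.util.List;

public class InternDemo {
  public static void main(String[] args) {
    List<String> inputFormats = new ArrayList<>();
    for (int i = 0; i < 100_000; i++) {
      // new String(...) stands in for a value just read off the Thrift wire:
      // every partition of a table repeats the same inputformat class name.
      String fromWire = new String("org.apache.hadoop.mapred.TextInputFormat");
      inputFormats.add(fromWire.intern()); // all 100k entries share one String
    }
    System.out.println(inputFormats.get(0) == inputFormats.get(99_999)); // true
  }
}
{code}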



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default

2018-05-03 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463108#comment-16463108
 ] 

Daniel Dai commented on HIVE-19389:
---

+1

> Schematool: For Hive's Information Schema, use embedded HS2 as default
> --
>
> Key: HIVE-19389
> URL: https://issues.apache.org/jira/browse/HIVE-19389
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-19389.1.patch
>
>
> Currently, for initializing/upgrading Hive's information schema, we require a 
> full jdbc url (for HS2). It will be good to have it connect using embedded 
> HS2 by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default

2018-05-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-19389:

Status: Patch Available  (was: Open)

> Schematool: For Hive's Information Schema, use embedded HS2 as default
> --
>
> Key: HIVE-19389
> URL: https://issues.apache.org/jira/browse/HIVE-19389
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-19389.1.patch
>
>
> Currently, for initializing/upgrading Hive's information schema, we require a 
> full jdbc url (for HS2). It will be good to have it connect using embedded 
> HS2 by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default

2018-05-03 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-19389:

Attachment: HIVE-19389.1.patch

> Schematool: For Hive's Information Schema, use embedded HS2 as default
> --
>
> Key: HIVE-19389
> URL: https://issues.apache.org/jira/browse/HIVE-19389
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-19389.1.patch
>
>
> Currently, for initializing/upgrading Hive's information schema, we require a 
> full jdbc url (for HS2). It will be good to have it connect using embedded 
> HS2 by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18395) Using stats for aggregate query on Acid/MM is off even with "hive.compute.query.using.stats" is true.

2018-05-03 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463077#comment-16463077
 ] 

Steve Yeom commented on HIVE-18395:
---

In summary, 
the comments from Sergey will be applied to the design Google doc soon, 
if not already added.

Also, I have created a JIRA about the issue Sergey mentioned in 4). 

> Using stats for aggregate query on Acid/MM is off even with 
> "hive.compute.query.using.stats" is true.
> -
>
> Key: HIVE-18395
> URL: https://issues.apache.org/jira/browse/HIVE-18395
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Attachments: HIVE-18395.01.preview
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19411) Full-ACID table stats may not be valid

2018-05-03 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-19411:
--
Description: 
One case is that, per Sergey, updating a row can end up as +2 rows instead of 
+0, since it is translated to a delete and an insert and the physical writer 
may just add up the # of operations. 


  was:
E.g., per Sergey, updating a row can end up as +2 rows instead of +0, 
since it is translated to a delete and an insert and the physical writer 
may just add up the # of operations. 



> Full-ACID table stats may not be valid
> --
>
> Key: HIVE-19411
> URL: https://issues.apache.org/jira/browse/HIVE-19411
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Priority: Major
> Fix For: 0.10.1
>
>
> One case is that, per Sergey, updating a row can end up as +2 rows instead of 
> +0, since it is translated to a delete and an insert and the physical writer 
> may just add up the # of operations. 
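
To make that failure mode concrete (a sketch; the table and column names are 
placeholders):
{code}
-- Logically this changes the row count by 0:
UPDATE acid_tbl SET val = val + 1 WHERE id = 42;
-- Physically it is written as one delete event plus one insert event, so a
-- writer that simply sums operation counts would record numRows += 2.
{code}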



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-19411) Full-ACID table stats may not be valid

2018-05-03 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom reassigned HIVE-19411:
-

Assignee: Steve Yeom

> Full-ACID table stats may not be valid
> --
>
> Key: HIVE-19411
> URL: https://issues.apache.org/jira/browse/HIVE-19411
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Fix For: 0.10.1
>
>
> One case is that, per Sergey, updating a row can end up as +2 rows instead of 
> +0, since it is translated to a delete and an insert and the physical writer 
> may just add up the # of operations. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19411) Full-ACID table stats may not be valid

2018-05-03 Thread Steve Yeom (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Yeom updated HIVE-19411:
--
Summary: Full-ACID table stats may not be valid  (was: Full-ACID table 
stats may not be valid. E.g., delete/insert)

> Full-ACID table stats may not be valid
> --
>
> Key: HIVE-19411
> URL: https://issues.apache.org/jira/browse/HIVE-19411
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Priority: Major
> Fix For: 0.10.1
>
>
> E.g., per Sergey, updating a row can end up as +2 rows instead of +0, 
> since it is translated to a delete and an insert and the physical writer 
> may just add up the # of operations. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-6589) Automatically add partitions for external tables

2018-05-03 Thread Steve Hoffman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463058#comment-16463058
 ] 

Steve Hoffman commented on HIVE-6589:
-

I agree that `MSCK REPAIR TABLE your_table_name;` will fix up the table and 
even remove expired partitions, but it still takes time to scan, and if you have 
a new partition every hour, you still have to run a job every hour to add said 
partition.

For example, in AWS using Athena this repair table command takes 800 seconds to 
run, which will only get longer as the partitions grow; eventually it will take 
more than an hour, which means the hourly job will start a new run before the 
old one finishes – madness.

The point of this ticket isn't to come up with simpler ways to insert records 
into an internal metadata store, but to NOT insert individual records because 
the paths follow a pattern.

> Automatically add partitions for external tables
> 
>
> Key: HIVE-6589
> URL: https://issues.apache.org/jira/browse/HIVE-6589
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.14.0
>Reporter: Ken Dallmeyer
>Assignee: Dharmendra Pratap Singh
>Priority: Major
>
> I have a data stream being loaded into Hadoop via Flume. It loads into a date 
> partition folder in HDFS.  The path looks like this:
> {code}/flume/my_data/YYYY/MM/DD/HH
> /flume/my_data/2014/03/02/01
> /flume/my_data/2014/03/02/02
> /flume/my_data/2014/03/02/03{code}
> On top of it I create an EXTERNAL hive table to do querying.  As of now, I 
> have to manually add partitions.  What I want is for EXTERNAL tables, Hive 
> should "discover" those partitions.  Additionally I would like to specify a 
> partition pattern so that when I query Hive will know to use the partition 
> pattern to find the HDFS folder.
> So something like this:
> {code}CREATE EXTERNAL TABLE my_data (
>   col1 STRING,
>   col2 INT
> )
> PARTITIONED BY (
>   dt STRING,
>   hour STRING
> )
> LOCATION 
>   '/flume/mydata'
> TBLPROPERTIES (
>   'hive.partition.spec' = 'dt=$Y-$M-$D, hour=$H',
>   'hive.partition.spec.location' = '$Y/$M/$D/$H'
> );
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19410) don't create serde reader in LLAP if there's no cache

2018-05-03 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463036#comment-16463036
 ] 

Prasanth Jayachandran commented on HIVE-19410:
--

+1

> don't create serde reader in LLAP if there's no cache
> -
>
> Key: HIVE-19410
> URL: https://issues.apache.org/jira/browse/HIVE-19410
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19410.patch
>
>
> Seems to crop up in some tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19358) CBO decorrelation logic should generate Hive operators

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463030#comment-16463030
 ] 

Hive QA commented on HIVE-19358:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 16s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 42s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-10661/dev-support/hive-personality.sh |
| git revision | master / db26f34 |
| Default Java | 1.8.0_111 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-10661/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> CBO decorrelation logic should generate Hive operators
> --
>
> Key: HIVE-19358
> URL: https://issues.apache.org/jira/browse/HIVE-19358
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-19358.01.patch, HIVE-19358.02.patch, 
> HIVE-19358.patch, fix.patch
>
>
> Decorrelation logic may generate logical instances of the operators in the 
> plan (e.g., LogicalFilter instead of HiveFilter). This leads to errors while 
> costing the tree in the Volcano planner (used in MV rewriting), since logical 
> operators do not have a cost associated to them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18395) Using stats for aggregate query on Acid/MM is off even with "hive.compute.query.using.stats" is true.

2018-05-03 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463022#comment-16463022
 ] 

Steve Yeom commented on HIVE-18395:
---

Again, the patch is not yet ready to be reviewed. 

Items 2 and 3 are already on my to-do list. 
[~sershe], do you think you can explain a little further on item #4?

> Using stats for aggregate query on Acid/MM is off even with 
> "hive.compute.query.using.stats" is true.
> -
>
> Key: HIVE-18395
> URL: https://issues.apache.org/jira/browse/HIVE-18395
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Attachments: HIVE-18395.01.preview
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19393) NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted

2018-05-03 Thread John Doe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Doe updated HIVE-19393:

Affects Version/s: 2.3.2

> NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted 
> --
>
> Key: HIVE-19393
> URL: https://issues.apache.org/jira/browse/HIVE-19393
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.0.0, 2.3.2
>Reporter: John Doe
>Priority: Minor
>
> When an InputStream is corrupted, the InputStream.skip can return -1, causing 
> the while loop in NonSyncDataInputBuffer.skipBytes become infinite.
> {code:java}
>   public final int skipBytes(int count) throws IOException {
> int skipped = 0;
> long skip;
> while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
>   skipped += skip;
> }
> if (skipped < 0) {
>   throw new EOFException();
> }
> return skipped;
>   }
> {code}
> Similar bugs are Hadoop-8614, Yarn-2905, Yarn-163, Mapreduce-6990
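
One way the loop could be hardened (a sketch of a possible fix, not a committed 
patch) is to treat any non-positive return from skip() as "no progress" and 
stop, rather than looping until it returns exactly 0:
{code:java}
  public final int skipBytes(int count) throws IOException {
    int skipped = 0;
    long skip;
    // skip() contracts vary; some streams return -1 or 0 on a corrupted or
    // exhausted input, so only continue while real progress is being made.
    while (skipped < count && (skip = in.skip(count - skipped)) > 0) {
      skipped += skip;
    }
    return skipped;
  }
{code}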



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-18219) When InputStream is corrupted, the skip() returns -1, causing infinite loop

2018-05-03 Thread John Doe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Doe updated HIVE-18219:

Affects Version/s: 1.0.0

> When InputStream is corrupted, the skip() returns -1, causing infinite loop
> ---
>
> Key: HIVE-18219
> URL: https://issues.apache.org/jira/browse/HIVE-18219
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0, 2.3.2
>Reporter: John Doe
>Priority: Major
>
> Similar like 
> [CASSANDRA-7330|https://issues.apache.org/jira/browse/CASSANDRA-7330], when 
> InputStream is corrupted, skip() returns -1, causing the following loop be 
> infinite.
> {code:java}
>   public final int skipBytes(int count) throws IOException {
> int skipped = 0;
> long skip;
> while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
>   skipped += skip;
> }
> if (skipped < 0) {
>   throw new EOFException();
> }
> return skipped;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19393) NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted

2018-05-03 Thread John Doe (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463011#comment-16463011
 ] 

John Doe commented on HIVE-19393:
-

[~sershe] Actually it is a duplicate issue, the same as 
[HIVE-18219|https://issues.apache.org/jira/browse/HIVE-18219].

> NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted 
> --
>
> Key: HIVE-19393
> URL: https://issues.apache.org/jira/browse/HIVE-19393
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.0.0
>Reporter: John Doe
>Priority: Minor
>
> When an InputStream is corrupted, the InputStream.skip can return -1, causing 
> the while loop in NonSyncDataInputBuffer.skipBytes become infinite.
> {code:java}
>   public final int skipBytes(int count) throws IOException {
> int skipped = 0;
> long skip;
> while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
>   skipped += skip;
> }
> if (skipped < 0) {
>   throw new EOFException();
> }
> return skipped;
>   }
> {code}
> Similar bugs are Hadoop-8614, Yarn-2905, Yarn-163, Mapreduce-6990



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-19393) NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted

2018-05-03 Thread John Doe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Doe resolved HIVE-19393.
-
Resolution: Duplicate

> NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted 
> --
>
> Key: HIVE-19393
> URL: https://issues.apache.org/jira/browse/HIVE-19393
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.0.0
>Reporter: John Doe
>Priority: Minor
>
> When an InputStream is corrupted, the InputStream.skip can return -1, causing 
> the while loop in NonSyncDataInputBuffer.skipBytes become infinite.
> {code:java}
>   public final int skipBytes(int count) throws IOException {
> int skipped = 0;
> long skip;
> while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
>   skipped += skip;
> }
> if (skipped < 0) {
>   throw new EOFException();
> }
> return skipped;
>   }
> {code}
> Similar bugs are Hadoop-8614, Yarn-2905, Yarn-163, Mapreduce-6990



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-18395) Using stats for aggregate query on Acid/MM is off even with "hive.compute.query.using.stats" is true.

2018-05-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463006#comment-16463006
 ] 

Sergey Shelukhin edited comment on HIVE-18395 at 5/3/18 8:33 PM:
-

Some small comments:
1) Seems to include some generated config in the patch.
2) TABLE_PARAMS_TXN - can we normalize this and not rely on a JSON object? Since 
this will only be used for stats, I assume, and stats only have one parameter (a 
JSON string), we can store it as proper SQL data (see the sketch below).
[~ekoifman] you might want to look at the new tables schema since it's based on 
what ACID state is sufficient to map stats to a snapshot.
3) {noformat}
+  if (SessionState.get().getTxnMgr() != null && isTransactional) {
+request.setTxnid(SessionState.get().getTxnMgr().getCurrentTxnId());
+  }
{noformat}
Is it enough to store transaction ID? Transaction ID will invalidate stats for 
every other table with every transaction, right? At least it should be writeId.
Also the doc calls for storing entire state information about when the stats 
were written - otherwise you cannot tell based on read-time snapshot if the 
stats are valid, for example in case of two inserts, where both can write 
stats, but both stats are invalid because the inserts don't see each other. 
4) {noformat}
+/*
 if (isFullAcid && !work.isTargetRewritten()) {
   // Don't bother with aggregation in this case, it will probably be 
invalid.
   parameters.remove(statType);
   continue;
 }
+*/
{noformat}
Aggregation for full ACID will still be invalid.
5) This doesn't seem to affect basic stats in all the places they are updated. 
However, basic stats are also used for count(*) type queries (numRows). 
I sent a patch separately that has some changes for basic stats... Might make 
sense to also modify these places in this patch. Doesn't have to be the same 
change I made as long as it is sufficient to add txn info to basic stats.

It might help to exclude the generated code for review. I sometimes use the 
script like this, where $1 is base branch and $2 is file name:
{noformat}
rm -f  ~/patches/$2.nogen.patch
for f in `git diff $1 --name-only | grep -v "gen-" | grep -v "\/gen\/"`
do
  git diff $1 -- $f >> ~/patches/$2.nogen.patch
done
{noformat}
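
A normalized layout for point 2 might look like this (a hypothetical sketch; 
the table and column names are illustrative, not the actual metastore schema):
{code}
CREATE TABLE TABLE_PARAMS_TXN (
  TBL_ID      BIGINT       NOT NULL,  -- the table the stats belong to
  PARAM_KEY   VARCHAR(256) NOT NULL,  -- a single stats parameter name
  PARAM_VALUE CLOB,                   -- the stats value, as plain SQL data
  TXN_ID      BIGINT       NOT NULL,  -- writer's ACID snapshot information
  PRIMARY KEY (TBL_ID, PARAM_KEY)
);
{code}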


was (Author: sershe):
Some small comments:
1) Seems to include some generated config in the patch.
2) TABLE_PARAMS_TXN - can we normalize this and not rely on JSON object? Since 
this will only be used for stats I assume, and stats only have one parameter 
(JSON string), we can store it as proper SQL data
[~ekoifman] you might want to look at the new tables schema since it's based on 
what ACID state is sufficient to map stats to a snapshot.
3) {noformat}
+  if (SessionState.get().getTxnMgr() != null && isTransactional) {
+request.setTxnid(SessionState.get().getTxnMgr().getCurrentTxnId());
+  }
{noformat}
Is it enough to store transaction ID? Transaction ID will invalidate stats for 
every other table with every transaction, right?
Also the doc calls for storing entire state information about when the stats 
were written - otherwise you cannot tell based on read-time snapshot if the 
stats are valid, for example in case of two inserts, where both can write 
stats, but both stats are invalid because the inserts don't see each other. 
4) {noformat}
+/*
 if (isFullAcid && !work.isTargetRewritten()) {
   // Don't bother with aggregation in this case, it will probably be 
invalid.
   parameters.remove(statType);
   continue;
 }
+*/
{noformat}
Aggregation for full ACID will still be invalid.
5) This doesn't seem to affect basic stats in all the places they are updated. 
However, basic stats are also used for count(*) type queries (numRows). 
I sent a patch separately that has some changes for basic stats... Might make 
sense to also modify these places in this patch. Doesn't have to be the same 
change I made as long as it is sufficient to add txn info to basic stats.

It might help to exclude the generated code for review. I sometimes use the 
script like this, where $1 is base branch and $2 is file name:
{noformat}
rm -f  ~/patches/$2.nogen.patch
for f in `git diff $1 --name-only | grep -v "gen-" | grep -v "\/gen\/"`
do
  git diff $1 -- $f >> ~/patches/$2.nogen.patch
done
{noformat}

> Using stats for aggregate query on Acid/MM is off even with 
> "hive.compute.query.using.stats" is true.
> -
>
> Key: HIVE-18395
> URL: https://issues.apache.org/jira/browse/HIVE-18395
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Attachments: 

[jira] [Comment Edited] (HIVE-18395) Using stats for aggregate query on Acid/MM is off even with "hive.compute.query.using.stats" is true.

2018-05-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463006#comment-16463006
 ] 

Sergey Shelukhin edited comment on HIVE-18395 at 5/3/18 8:32 PM:
-

Some small comments:
1) Seems to include some generated config in the patch.
2) TABLE_PARAMS_TXN - can we normalize this and not rely on JSON object? Since 
this will only be used for stats I assume, and stats only have one parameter 
(JSON string), we can store it as proper SQL data
[~ekoifman] you might want to look at the new tables schema since it's based on 
what ACID state is sufficient to map stats to a snapshot.
3) {noformat}
+  if (SessionState.get().getTxnMgr() != null && isTransactional) {
+request.setTxnid(SessionState.get().getTxnMgr().getCurrentTxnId());
+  }
{noformat}
Is it enough to store transaction ID? Transaction ID will invalidate stats for 
every other table with every transaction, right?
Also the doc calls for storing entire state information about when the stats 
were written - otherwise you cannot tell based on read-time snapshot if the 
stats are valid, for example in case of two inserts, where both can write 
stats, but both stats are invalid because the inserts don't see each other. 
4) {noformat}
+/*
 if (isFullAcid && !work.isTargetRewritten()) {
   // Don't bother with aggregation in this case, it will probably be 
invalid.
   parameters.remove(statType);
   continue;
 }
+*/
{noformat}
Aggregation for full ACID will still be invalid.
5) This doesn't seem to affect basic stats in all the places they are updated. 
However, basic stats are also used for count(*) type queries (numRows). 
I sent a patch separately that has some changes for basic stats... Might make 
sense to also modify these places in this patch. Doesn't have to be the same 
change I made as long as it is sufficient to add txn info to basic stats.

It might help to exclude the generated code for review. I sometimes use the 
script like this, where $1 is base branch and $2 is file name:
{noformat}
rm -f  ~/patches/$2.nogen.patch
for f in `git diff $1 --name-only | grep -v "gen-" | grep -v "\/gen\/"`
do
  git diff $1 -- $f >> ~/patches/$2.nogen.patch
done
{noformat}


was (Author: sershe):
Some small comments:
1) Seems to include some generated config in the patch.
2) TABLE_PARAMS_TXN - can we normalize this and not rely on JSON object? Since 
this will only be used for stats I assume, and stats only have one parameter.
[~ekoifman] you might want to look at the new tables schema since it's based on 
what ACID state is sufficient to map stats to a snapshot.
3) {noformat}
+  if (SessionState.get().getTxnMgr() != null && isTransactional) {
+request.setTxnid(SessionState.get().getTxnMgr().getCurrentTxnId());
+  }
{noformat}
Is it enough to store transaction ID? Transaction ID will invalidate stats for 
every other table with every transaction, right?
Also the doc calls for storing entire state information about when the stats 
were written - otherwise you cannot tell based on read-time snapshot if the 
stats are valid, for example in case of two inserts, where both can write 
stats, but both stats are invalid because the inserts don't see each other. 
4) {noformat}
+/*
 if (isFullAcid && !work.isTargetRewritten()) {
   // Don't bother with aggregation in this case, it will probably be 
invalid.
   parameters.remove(statType);
   continue;
 }
+*/
{noformat}
Aggregation for full ACID will still be invalid.
5) This doesn't seem to affect basic stats in all the places they are updated. 
However, basic stats are also used for count(*) type queries (numRows). 
I sent a patch separately that has some changes for basic stats... Might make 
sense to also modify these places in this patch. Doesn't have to be the same 
change I made as long as it is sufficient to add txn info to basic stats.

It might help to exclude the generated code for review. I sometimes use the 
script like this, where $1 is base branch and $2 is file name:
{noformat}
rm -f  ~/patches/$2.nogen.patch
for f in `git diff $1 --name-only | grep -v "gen-" | grep -v "\/gen\/"`
do
  git diff $1 -- $f >> ~/patches/$2.nogen.patch
done
{noformat}

> Using stats for aggregate query on Acid/MM is off even with 
> "hive.compute.query.using.stats" is true.
> -
>
> Key: HIVE-18395
> URL: https://issues.apache.org/jira/browse/HIVE-18395
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Attachments: HIVE-18395.01.preview
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18395) Using stats for aggregate query on Acid/MM is off even with "hive.compute.query.using.stats" is true.

2018-05-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463006#comment-16463006
 ] 

Sergey Shelukhin commented on HIVE-18395:
-

Some small comments:
1) Seems to include some generated config in the patch.
2) TABLE_PARAMS_TXN - can we normalize this and not rely on JSON object? Since 
this will only be used for stats I assume, and stats only have one parameter.
[~ekoifman] you might want to look at the new tables schema since it's based on 
what ACID state is sufficient to map stats to a snapshot.
3) {noformat}
+  if (SessionState.get().getTxnMgr() != null && isTransactional) {
+request.setTxnid(SessionState.get().getTxnMgr().getCurrentTxnId());
+  }
{noformat}
Is it enough to store transaction ID? Transaction ID will invalidate stats for 
every other table with every transaction, right?
Also the doc calls for storing entire state information about when the stats 
were written - otherwise you cannot tell based on read-time snapshot if the 
stats are valid, for example in case of two inserts, where both can write 
stats, but both stats are invalid because the inserts don't see each other. 
4) {noformat}
+/*
 if (isFullAcid && !work.isTargetRewritten()) {
   // Don't bother with aggregation in this case, it will probably be 
invalid.
   parameters.remove(statType);
   continue;
 }
+*/
{noformat}
Aggregation for full ACID will still be invalid.
5) This doesn't seem to affect basic stats in all the places they are updated. 
However, basic stats are also used for count(*) type queries (numRows). 
I sent a patch separately that has some changes for basic stats... Might make 
sense to also modify these places in this patch. Doesn't have to be the same 
change I made as long as it is sufficient to add txn info to basic stats.

It might help to exclude the generated code for review. I sometimes use the 
script like this, where $1 is base branch and $2 is file name:
{noformat}
rm -f  ~/patches/$2.nogen.patch
for f in `git diff $1 --name-only | grep -v "gen-" | grep -v "\/gen\/"`
do
  git diff $1 -- $f >> ~/patches/$2.nogen.patch
done
{noformat}

> Using stats for aggregate query on Acid/MM is off even with 
> "hive.compute.query.using.stats" is true.
> -
>
> Key: HIVE-18395
> URL: https://issues.apache.org/jira/browse/HIVE-18395
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Major
> Attachments: HIVE-18395.01.preview
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19358) CBO decorrelation logic should generate Hive operators

2018-05-03 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463005#comment-16463005
 ] 

Jesus Camacho Rodriguez commented on HIVE-19358:


[~vgarg], awesome, thanks. That seems to fix the issue that I had encountered. 
I have uploaded a new patch that includes your changes to trigger a new ptest 
run.

> CBO decorrelation logic should generate Hive operators
> --
>
> Key: HIVE-19358
> URL: https://issues.apache.org/jira/browse/HIVE-19358
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-19358.01.patch, HIVE-19358.02.patch, 
> HIVE-19358.patch, fix.patch
>
>
> Decorrelation logic may generate logical instances of the operators in the 
> plan (e.g., LogicalFilter instead of HiveFilter). This leads to errors while 
> costing the tree in the Volcano planner (used in MV rewriting), since logical 
> operators do not have a cost associated to them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-19393) NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted

2018-05-03 Thread John Doe (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462983#comment-16462983
 ] 

John Doe edited comment on HIVE-19393 at 5/3/18 8:30 PM:
-

[~sershe] Sorry about the confusion. Actually, when the InputStream is 
corrupted, the skip function returns -1 instead of 0, causing the infinite loop.


was (Author: dustinday):
[~sershe] Yes, you are right. Sorry about it.

> NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted 
> --
>
> Key: HIVE-19393
> URL: https://issues.apache.org/jira/browse/HIVE-19393
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.0.0
>Reporter: John Doe
>Priority: Minor
>
> When an InputStream is corrupted, the InputStream.skip can return -1, causing 
> the while loop in NonSyncDataInputBuffer.skipBytes become infinite.
> {code:java}
>   public final int skipBytes(int count) throws IOException {
> int skipped = 0;
> long skip;
> while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
>   skipped += skip;
> }
> if (skipped < 0) {
>   throw new EOFException();
> }
> return skipped;
>   }
> {code}
> Similar bugs are Hadoop-8614, Yarn-2905, Yarn-163, Mapreduce-6990



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HIVE-19393) NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted

2018-05-03 Thread John Doe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Doe reopened HIVE-19393:
-

> NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted 
> --
>
> Key: HIVE-19393
> URL: https://issues.apache.org/jira/browse/HIVE-19393
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.0.0
>Reporter: John Doe
>Priority: Minor
>
> When an InputStream is corrupted, the InputStream.skip can return -1, causing 
> the while loop in NonSyncDataInputBuffer.skipBytes become infinite.
> {code:java}
>   public final int skipBytes(int count) throws IOException {
> int skipped = 0;
> long skip;
> while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
>   skipped += skip;
> }
> if (skipped < 0) {
>   throw new EOFException();
> }
> return skipped;
>   }
> {code}
> Similar bugs are Hadoop-8614, Yarn-2905, Yarn-163, Mapreduce-6990



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19358) CBO decorrelation logic should generate Hive operators

2018-05-03 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-19358:
---
Attachment: HIVE-19358.02.patch

> CBO decorrelation logic should generate Hive operators
> --
>
> Key: HIVE-19358
> URL: https://issues.apache.org/jira/browse/HIVE-19358
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-19358.01.patch, HIVE-19358.02.patch, 
> HIVE-19358.patch, fix.patch
>
>
> Decorrelation logic may generate logical instances of the operators in the 
> plan (e.g., LogicalFilter instead of HiveFilter). This leads to errors while 
> costing the tree in the Volcano planner (used in MV rewriting), since logical 
> operators do not have a cost associated to them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19393) NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted

2018-05-03 Thread John Doe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Doe updated HIVE-19393:

Description: 
When an InputStream is corrupted, the InputStream.skip can return -1, causing 
the while loop in NonSyncDataInputBuffer.skipBytes become infinite.
{code:java}
  public final int skipBytes(int count) throws IOException {
int skipped = 0;
long skip;
while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
  skipped += skip;
}
if (skipped < 0) {
  throw new EOFException();
}
return skipped;
  }
{code}
Similar bugs are Hadoop-8614, Yarn-2905, Yarn-163, Mapreduce-6990

  was:
When an InputStream is corrupted, the InputStream.skip can return 0, causing 
the while loop in NonSyncDataInputBuffer.skipBytes become infinite.

{code:java}
  public final int skipBytes(int count) throws IOException {
int skipped = 0;
long skip;
while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
  skipped += skip;
}
if (skipped < 0) {
  throw new EOFException();
}
return skipped;
  }
{code}

Similar bugs are 
[Hadoop-8614|https://issues.apache.org/jira/browse/HADOOP-8614], 
[Yarn-2905|https://issues.apache.org/jira/browse/YARN-2905], 
[Yarn-163|https://issues.apache.org/jira/browse/YARN-163], 
[Mapreduce-6990|https://issues.apache.org/jira/browse/MAPREDUCE-6990]


> NonSyncDataInputBuffer.skipBytes hangs when the file is corrupted 
> --
>
> Key: HIVE-19393
> URL: https://issues.apache.org/jira/browse/HIVE-19393
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.0.0
>Reporter: John Doe
>Priority: Minor
>
> When an InputStream is corrupted, the InputStream.skip can return -1, causing 
> the while loop in NonSyncDataInputBuffer.skipBytes become infinite.
> {code:java}
>   public final int skipBytes(int count) throws IOException {
> int skipped = 0;
> long skip;
> while (skipped < count && (skip = in.skip(count - skipped)) != 0) {
>   skipped += skip;
> }
> if (skipped < 0) {
>   throw new EOFException();
> }
> return skipped;
>   }
> {code}
> Similar bugs are Hadoop-8614, Yarn-2905, Yarn-163, Mapreduce-6990



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19376) Statistics: switch to 10bit HLL by default for Hive

2018-05-03 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463000#comment-16463000
 ] 

Hive QA commented on HIVE-19376:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12921424/HIVE-19376.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 186 failed/errored test(s), 14316 tests 
executed
*Failed tests:*
{noformat}
TestDbNotificationListener - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestHCatHiveCompatibility - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestNonCatCallsWithCatalog - did not produce a TEST-*.xml file (likely timed out) (batchId=217)
TestSequenceFileReadWrite - did not produce a TEST-*.xml file (likely timed out) (batchId=247)
TestTxnExIm - did not produce a TEST-*.xml file (likely timed out) (batchId=286)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_13] (batchId=253)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_2] (batchId=86)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_9] (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bitvector] (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_1] (batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_2] (batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin_negative3] (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compute_stats_date] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[confirm_initial_tbl_stats] (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cross_join_merge] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_table] (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_1_23] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[hll] (batchId=90)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_mapjoin] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_13] (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[sort_merge_join_desc_7] (batchId=27)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=178)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llapdecider] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[autoColumnStats_2] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join1] (batchId=173)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join21] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join29] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_join30] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_6] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_groupby] (batchId=176)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez2] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[check_constraint] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer2] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer3] (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer6] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cross_join] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_sw] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainanalyze_2] (batchId=171)

[jira] [Updated] (HIVE-19410) don't create serde reader in LLAP if there's no cache

2018-05-03 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-19410:

Status: Patch Available  (was: Open)

> don't create serde reader in LLAP if there's no cache
> -
>
> Key: HIVE-19410
> URL: https://issues.apache.org/jira/browse/HIVE-19410
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19410.patch
>
>
> Seems to crop up in some tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19410) don't create serde reader in LLAP if there's no cache

2018-05-03 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462993#comment-16462993
 ] 

Sergey Shelukhin commented on HIVE-19410:
-

[~prasanth_j] can you take a look? Tiny patch
https://builds.apache.org/job/PreCommit-HIVE-Build/10649/testReport/org.apache.hive.jdbc/TestTriggersWorkloadManager/testTriggerSlowQueryExecutionTime/



> don't create serde reader in LLAP if there's no cache
> -
>
> Key: HIVE-19410
> URL: https://issues.apache.org/jira/browse/HIVE-19410
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-19410.patch
>
>
> Seems to crop up in some tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

