from:"Ashutosh Chauhan \(JIRA\)"

[jira] [Commented] (HIVE-22637) Avoid cost based rules during generating expressions from AST

2019-12-13 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16996085#comment-16996085
 ] 

Ashutosh Chauhan commented on HIVE-22637:
-

+1

> Avoid cost based rules during generating expressions from AST
> -
>
> Key: HIVE-22637
> URL: https://issues.apache.org/jira/browse/HIVE-22637
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22637.1.patch, HIVE-22637.2.patch
>
>
> genExprNode uses the default dispatcher, which fires rules based on cost; 
> computing the cost is expensive and likely unnecessary.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (HIVE-22647) enable session pool by default

2019-12-13 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16996078#comment-16996078
 ] 

Ashutosh Chauhan commented on HIVE-22647:
-

+1

> enable session pool by default
> --
>
> Key: HIVE-22647
> URL: https://issues.apache.org/jira/browse/HIVE-22647
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22647.1.patch
>
>
> Non-pooled sessions may leak when the client doesn't close the connection.




[jira] [Commented] (HIVE-20974) TezTask should set task exception on failures

2019-12-13 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-20974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995904#comment-16995904
 ] 

Ashutosh Chauhan commented on HIVE-20974:
-

+1

> TezTask should set task exception on failures
> -
>
> Key: HIVE-20974
> URL: https://issues.apache.org/jira/browse/HIVE-20974
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Attachments: HIVE-20974.1.patch
>
>
> TezTask logs the error as "Failed to execute tez graph" and proceeds further. 
> "TaskRunner.runSequentail()" code would not be able to get these exceptions 
> for TezTask. If there are any failure hooks configured, these exceptions 
> wouldn't show up.




[jira] [Commented] (HIVE-22609) Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots

2019-12-13 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995766#comment-16995766
 ] 

Ashutosh Chauhan commented on HIVE-22609:
-

[~rajesh.balamohan] Failures look related?

> Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots
> -
>
> Key: HIVE-22609
> URL: https://issues.apache.org/jira/browse/HIVE-22609
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22609.1.patch, HIVE-22609.2.patch
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1380]
> ACID delta folder contains {{_orc_acid_version}} and {{bucket_0}} files. 
> Both files have the same parent directory, so the number of getFileStatus 
> calls in such cases can be cut in half.
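
The saving described above can be pictured as a small per-directory cache: the expensive lookup runs once per parent directory, and sibling files reuse the result. The sketch below is only an illustration under assumed names (`DirStatusCache`, `statusOfParent` and the lookup function are hypothetical and are not taken from AcidUtils or the attached patches); it uses plain strings rather than Hadoop `Path` and `FileStatus` objects.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: memoize an expensive per-directory lookup so that
// sibling files in the same delta folder trigger it only once.
class DirStatusCache {
    private final Map<String, String> statusByDir = new HashMap<>();
    private final Function<String, String> expensiveLookup;
    int lookups = 0; // number of times the expensive lookup actually ran

    DirStatusCache(Function<String, String> expensiveLookup) {
        this.expensiveLookup = expensiveLookup;
    }

    // Returns the (cached) status of the parent directory of filePath.
    String statusOfParent(String filePath) {
        int slash = filePath.lastIndexOf('/');
        String parent = slash <= 0 ? "/" : filePath.substring(0, slash);
        return statusByDir.computeIfAbsent(parent, dir -> {
            lookups++;
            return expensiveLookup.apply(dir);
        });
    }

    public static void main(String[] args) {
        DirStatusCache cache = new DirStatusCache(dir -> "status:" + dir);
        cache.statusOfParent("/warehouse/t/delta_1_1/_orc_acid_version");
        cache.statusOfParent("/warehouse/t/delta_1_1/bucket_0");
        System.out.println("lookups = " + cache.lookups);
    }
}
```

With two files per delta folder, the lookup count tracks the number of folders rather than the number of files, which is where the factor of two comes from.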




[jira] [Commented] (HIVE-22485) Cross product should set the conf in UnorderedPartitionedKVEdgeConfig

2019-12-13 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995764#comment-16995764
 ] 

Ashutosh Chauhan commented on HIVE-22485:
-

+1

> Cross product should set the conf in UnorderedPartitionedKVEdgeConfig
> -
>
> Key: HIVE-22485
> URL: https://issues.apache.org/jira/browse/HIVE-22485
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Attachments: HIVE-22485.1.patch
>
>
> SSL and other options would not be sent correctly, if this is not set up.
>  
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java#L545




[jira] [Commented] (HIVE-21971) HS2 leaks classloader due to `ReflectionUtils::CONSTRUCTOR_CACHE` with temporary functions + GenericUDF

2019-12-13 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-21971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995762#comment-16995762
 ] 

Ashutosh Chauhan commented on HIVE-21971:
-

+1

> HS2 leaks classloader due to `ReflectionUtils::CONSTRUCTOR_CACHE` with 
> temporary functions + GenericUDF
> ---
>
> Key: HIVE-21971
> URL: https://issues.apache.org/jira/browse/HIVE-21971
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.4
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Critical
> Attachments: HIVE-21971.1.patch
>
>
> https://issues.apache.org/jira/browse/HIVE-10329 helped in moving away from 
> hadoop's ReflectionUtils constructor cache issue 
> (https://issues.apache.org/jira/browse/HADOOP-10513).
> However, there are corner cases where hadoop's {{ReflectionUtils}} is in use 
> and this causes gradual build up of memory in HS2.
> I have observed this in Hive 2.3. But the codepath in master for this has not 
> changed much.
> Easiest way to repro would be to add a temp function which extends 
> {{GenericUDF}}. In {{FunctionRegistry::cloneGenericUDF}}, this would 
> end up using {{org.apache.hadoop.util.ReflectionUtils.newInstance}} which in 
> turn lands up in CONSTRUCTOR_CACHE of ReflectionUtils. 
> {noformat}
> CREATE TEMPORARY FUNCTION dummy AS 'com.hive.test.DummyGenericUDF' USING JAR 
> 'file:///home/test/udf/dummy.jar';
> select dummy();
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.cloneGenericUDF(FunctionRegistry.java:1353)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionInfo.getGenericUDF(FunctionInfo.java:122)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:983)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>   at 
> org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> {noformat}
> Note: Reflection based invocation of hadoop's {{ReflectionUtils::clear}} was 
> removed in 2.x. 




[jira] [Commented] (HIVE-22558) Metastore: Passwords jceks should be read lazily, in case of connection pools

2019-12-12 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994861#comment-16994861
 ] 

Ashutosh Chauhan commented on HIVE-22558:
-

+1

> Metastore: Passwords jceks should be read lazily, in case of connection pools
> -
>
> Key: HIVE-22558
> URL: https://issues.apache.org/jira/browse/HIVE-22558
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Standalone Metastore
>Reporter: Gopal Vijayaraghavan
>Assignee: Slim Bouguerra
>Priority: Major
> Attachments: HIVE-22558.1.patch, getDatabase-password-md5-hotpath.png
>
>
> The jceks file is parsed for every instance of the metastore conf to populate 
> the password in plain-text, which is irrelevant for the scenario where the DB 
> connection pool is already active.
>   !getDatabase-password-md5-hotpath.png|width=640!
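
Reading the password lazily, as the issue title suggests, can be sketched as a holder that defers the expensive read until the value is first requested and then remembers it. This is a minimal illustration only: `LazyPassword` and its loader are hypothetical names, not the metastore's actual classes, and the real patch may be structured differently.

```java
import java.util.function.Supplier;

// Hypothetical sketch: defer an expensive credential read until first use
// and remember the result afterwards.
class LazyPassword {
    private final Supplier<String> loader; // stands in for parsing a keystore file
    private String value;
    private boolean loaded = false;

    LazyPassword(Supplier<String> loader) {
        this.loader = loader;
    }

    // Runs the loader at most once, on the first call.
    synchronized String get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    public static void main(String[] args) {
        int[] reads = {0};
        LazyPassword pw = new LazyPassword(() -> { reads[0]++; return "secret"; });
        System.out.println("reads before use = " + reads[0]);
        pw.get();
        pw.get();
        System.out.println("reads after two uses = " + reads[0]);
    }
}
```

If the connection pool is already active and nothing ever calls `get()`, the loader never runs, which is the scenario the description calls out.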




[jira] [Commented] (HIVE-22577) StringIndexOutOfBoundsException when getting sessionId from worker node name

2019-12-12 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994857#comment-16994857
 ] 

Ashutosh Chauhan commented on HIVE-22577:
-

+1

> StringIndexOutOfBoundsException when getting sessionId from worker node name
> 
>
> Key: HIVE-22577
> URL: https://issues.apache.org/jira/browse/HIVE-22577
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22577.1.patch, HIVE-22577.2.patch, 
> HIVE-22577.3.patch
>
>
> When the node name is "worker-" the following exception is thrown
>  
> {code:java}
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
>   at java.lang.String.substring(String.java:1931)
>   at org.apache.hadoop.hive.registry.impl.ZkRegistryBase.extractSeqNum(ZkRegistryBase.java:781)
>   at org.apache.hadoop.hive.registry.impl.ZkRegistryBase.populateCache(ZkRegistryBase.java:507)
>   at org.apache.hadoop.hive.llap.registry.impl.LlapZookeeperRegistryImpl.access$000(LlapZookeeperRegistryImpl.java:65)
>   at org.apache.hadoop.hive.llap.registry.impl.LlapZookeeperRegistryImpl$DynamicServiceInstanceSet.<init>(LlapZookeeperRegistryImpl.java:313)
>   at org.apache.hadoop.hive.llap.registry.impl.LlapZookeeperRegistryImpl.getInstances(LlapZookeeperRegistryImpl.java:462)
>   at org.apache.hadoop.hive.llap.registry.impl.LlapZookeeperRegistryImpl.getApplicationId(LlapZookeeperRegistryImpl.java:469)
>   at org.apache.hadoop.hive.llap.registry.impl.LlapRegistryService.getApplicationId(LlapRegistryService.java:212)
>   at org.apache.hadoop.hive.ql.exec.tez.Utils.getCustomSplitLocationProvider(Utils.java:77)
>   at org.apache.hadoop.hive.ql.exec.tez.Utils.getSplitLocationProvider(Utils.java:53)
>   at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.<init>(HiveSplitGenerator.java:140)
>   {code}
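
The failure mode above is a `substring` call on a name that has nothing after its prefix. One defensive way to parse such names is sketched below. It is an assumption-laden illustration: `SeqNumParser`, its signature, and the choice to return `null` are hypothetical and do not reproduce `ZkRegistryBase.extractSeqNum` or the attached patches.

```java
// Hypothetical sketch: parse the numeric suffix of a node name such as
// "worker-0000000012", tolerating names that carry no suffix at all.
class SeqNumParser {
    // Returns the sequence number after the prefix, or null if the name
    // does not start with the prefix or has no numeric suffix.
    static Long extractSeqNum(String nodeName, String prefix) {
        if (nodeName == null || !nodeName.startsWith(prefix)) {
            return null;
        }
        // Safe: startsWith guarantees prefix.length() <= nodeName.length().
        String suffix = nodeName.substring(prefix.length());
        if (suffix.isEmpty()) {
            return null; // for example the bare name "worker-"
        }
        try {
            return Long.parseLong(suffix);
        } catch (NumberFormatException e) {
            return null; // suffix present but not numeric
        }
    }

    public static void main(String[] args) {
        System.out.println(extractSeqNum("worker-0000000012", "worker-"));
        System.out.println(extractSeqNum("worker-", "worker-"));
    }
}
```

Returning `null` leaves it to the caller to skip or log malformed entries instead of failing the whole cache population.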




[jira] [Commented] (HIVE-16690) Configure Tez cartesian product edge based on LLAP cluster size

2019-12-10 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-16690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992714#comment-16992714
 ] 

Ashutosh Chauhan commented on HIVE-16690:
-

+1

> Configure Tez cartesian product edge based on LLAP cluster size
> ---
>
> Key: HIVE-16690
> URL: https://issues.apache.org/jira/browse/HIVE-16690
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: László Bodor
>Priority: Major
> Attachments: HIVE-16690.03.patch, HIVE-16690.1.patch, 
> HIVE-16690.2.patch, HIVE-16690.2.patch, HIVE-16690.2.patch, 
> HIVE-16690.2.patch, HIVE-16690.2.patch, HIVE-16690.2.patch, 
> HIVE-16690.2.patch, HIVE-16690.2.patch, HIVE-16690.addendum.patch
>
>
> In HIVE-14731 we are using default value for target parallelism of fair 
> cartesian product edge. Ideally this should be set according to cluster size. 
> In case of LLAP it's pretty easy to get cluster size, i.e., number of 
> executors.




[jira] [Updated] (HIVE-22587) hive.stats.ndv.error parameter documentation issue in HiveConf.java

2019-12-07 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22587:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Pablo!

> hive.stats.ndv.error parameter documentation issue in HiveConf.java
> ---
>
> Key: HIVE-22587
> URL: https://issues.apache.org/jira/browse/HIVE-22587
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Pablo Junge
>Assignee: Pablo Junge
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22587.patch
>
>
> hive.stats.ndv.error parameter documentation should specify that it only 
> affects the FM-Sketch algorithm




[jira] [Updated] (HIVE-22588) Flush the remaining rows for the rest of the grouping sets when switching the vector groupby mode

2019-12-07 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22588:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Ramesh!

> Flush the remaining rows for the rest of the grouping sets when switching the 
> vector groupby mode
> -
>
> Key: HIVE-22588
> URL: https://issues.apache.org/jira/browse/HIVE-22588
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22588.1.patch, HIVE-22588.2.patch, 
> HIVE-22588.3.patch, HIVE-22588.4.patch
>
>
> Flush the remaining rows for the rest of the grouping sets when switching the 
> vector groupby mode




[jira] [Commented] (HIVE-22588) Flush the remaining rows for the rest of the grouping sets when switching the vector groupby mode

2019-12-07 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990551#comment-16990551
 ] 

Ashutosh Chauhan commented on HIVE-22588:
-

+1

> Flush the remaining rows for the rest of the grouping sets when switching the 
> vector groupby mode
> -
>
> Key: HIVE-22588
> URL: https://issues.apache.org/jira/browse/HIVE-22588
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22588.1.patch, HIVE-22588.2.patch, 
> HIVE-22588.3.patch, HIVE-22588.4.patch
>
>
> Flush the remaining rows for the rest of the grouping sets when switching the 
> vector groupby mode




[jira] [Commented] (HIVE-22587) hive.stats.ndv.error parameter documentation issue in HiveConf.java

2019-12-06 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989916#comment-16989916
 ] 

Ashutosh Chauhan commented on HIVE-22587:
-

+1

> hive.stats.ndv.error parameter documentation issue in HiveConf.java
> ---
>
> Key: HIVE-22587
> URL: https://issues.apache.org/jira/browse/HIVE-22587
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Pablo Junge
>Assignee: Pablo Junge
>Priority: Minor
> Attachments: HIVE-22587.patch
>
>
> hive.stats.ndv.error parameter documentation should specify that it only 
> affects the FM-Sketch algorithm




[jira] [Updated] (HIVE-22499) LLAP: Add an EncodedReaderOptions to extend ORC impl for options

2019-12-04 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22499:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Mustafa!

> LLAP: Add an EncodedReaderOptions to extend ORC impl for options
> 
>
> Key: HIVE-22499
> URL: https://issues.apache.org/jira/browse/HIVE-22499
> Project: Hive
>  Issue Type: Bug
>  Components: llap, ORC
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22499.2.patch, HIVE-22499.2.patch, 
> HIVE-22499.3.patch, HIVE-22499.4.patch, HIVE-22499.5.patch, 
> HIVE-22499.6.patch, HIVE-22499.7.patch, HIVE-22499.WIP.patch, HIVE-22499.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ORC-570 is an ABI change to getFileSystem() that adds another exception to 
> the implementation.
> Accepting and using that change requires waiting for an ORC release, whereas 
> this patch serves the same purpose but falls back to retrying 
> FileSystem.get() in case the supplier fails at runtime.
> Also as a side-note, the FS.get() call is always used in the cases where the 
> file is not being read from a cache such as EncodedOrcFile (so the upstream 
> API change might be overkill).




[jira] [Commented] (HIVE-22395) Add ability to read Druid metastore password from jceks

2019-12-04 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987998#comment-16987998
 ] 

Ashutosh Chauhan commented on HIVE-22395:
-

+1

> Add ability to read Druid metastore password from jceks
> ---
>
> Key: HIVE-22395
> URL: https://issues.apache.org/jira/browse/HIVE-22395
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22395.1.patch, HIVE-22395.2.patch, 
> HIVE-22395.2.patch, HIVE-22395.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Updated] (HIVE-22485) Cross product should set the conf in UnorderedPartitionedKVEdgeConfig

2019-11-12 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22485:

Status: Patch Available  (was: Open)

> Cross product should set the conf in UnorderedPartitionedKVEdgeConfig
> -
>
> Key: HIVE-22485
> URL: https://issues.apache.org/jira/browse/HIVE-22485
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Trivial
> Attachments: HIVE-22485.1.patch
>
>
> SSL and other options would not be sent correctly, if this is not set up.
>  
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java#L545




[jira] [Assigned] (HIVE-22485) Cross product should set the conf in UnorderedPartitionedKVEdgeConfig

2019-11-12 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-22485:
---

Assignee: Rajesh Balamohan

> Cross product should set the conf in UnorderedPartitionedKVEdgeConfig
> -
>
> Key: HIVE-22485
> URL: https://issues.apache.org/jira/browse/HIVE-22485
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Attachments: HIVE-22485.1.patch
>
>
> SSL and other options would not be sent correctly, if this is not set up.
>  
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java#L545




[jira] [Commented] (HIVE-22411) Performance degradation on single row inserts

2019-11-11 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971777#comment-16971777
 ] 

Ashutosh Chauhan commented on HIVE-22411:
-

+1

> Performance degradation on single row inserts
> -
>
> Key: HIVE-22411
> URL: https://issues.apache.org/jira/browse/HIVE-22411
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22411.2.patch, HIVE-22411.3.patch, Screen Shot 
> 2019-10-17 at 8.40.50 PM.png
>
>
> Executing single insert statements on a transactional table degrades write 
> performance on an S3 file system. Each insert creates a new delta directory. 
> After each insert, Hive calculates statistics such as the number of files in 
> the table and the total size of the table. To calculate these, it traverses 
> the directory recursively. During the recursion, a separate listStatus call 
> is executed for each path. In the end, the more delta directories you have, 
> the more time it takes to calculate the statistics.
> Therefore insertion time goes up linearly:
> !Screen Shot 2019-10-17 at 8.40.50 PM.png|width=601,height=436!
> The fix is to use fs.listFiles(path, /**recursive**/ true) instead of the 
> handcrafted recursive method.
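
To make the cost difference concrete, the sketch below counts listing calls in a toy in-memory file store. It is not the Hadoop `FileSystem` API: `ListingCostSketch`, `listStatus`, `listFilesRecursive` and `countFilesByWalking` are hypothetical stand-ins, and the model assumes that each listing call costs one round trip regardless of how many entries it returns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of a file store, used only to count how many
// listing calls each strategy issues. It is not the Hadoop FileSystem API.
class ListingCostSketch {
    private final TreeSet<String> files = new TreeSet<>(); // absolute file paths
    int listCalls = 0;

    void addFile(String path) {
        files.add(path);
    }

    // One call: immediate children (files and sub-directories) of dir.
    Set<String> listStatus(String dir) {
        listCalls++;
        Set<String> children = new TreeSet<>();
        String prefix = dir + "/";
        for (String f : files) {
            if (f.startsWith(prefix)) {
                int next = f.indexOf('/', prefix.length());
                children.add(next < 0 ? f : f.substring(0, next));
            }
        }
        return children;
    }

    // One call: every file below dir, however deep.
    List<String> listFilesRecursive(String dir) {
        listCalls++;
        List<String> result = new ArrayList<>();
        String prefix = dir + "/";
        for (String f : files) {
            if (f.startsWith(prefix)) {
                result.add(f);
            }
        }
        return result;
    }

    // Hand-written traversal: one listStatus call per directory visited.
    int countFilesByWalking(String dir) {
        int count = 0;
        for (String child : listStatus(dir)) {
            count += files.contains(child) ? 1 : countFilesByWalking(child);
        }
        return count;
    }

    public static void main(String[] args) {
        ListingCostSketch fs = new ListingCostSketch();
        for (int d = 1; d <= 3; d++) {
            fs.addFile("/t/delta_" + d + "/bucket_0");
            fs.addFile("/t/delta_" + d + "/_orc_acid_version");
        }
        int walked = fs.countFilesByWalking("/t");
        System.out.println(walked + " files, " + fs.listCalls + " list calls (walking)");
        fs.listCalls = 0;
        int flat = fs.listFilesRecursive("/t").size();
        System.out.println(flat + " files, " + fs.listCalls + " list call (recursive listing)");
    }
}
```

In this model the walking strategy issues one call for the table directory plus one per delta directory, so its call count grows with every insert, while the single recursive listing stays at one call.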




[jira] [Updated] (HIVE-22382) Support Decimal64 column division with decimal64 Column

2019-11-05 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22382:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Ramesh!

> Support Decimal64 column division with decimal64 Column
> ---
>
> Key: HIVE-22382
> URL: https://issues.apache.org/jira/browse/HIVE-22382
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22382.2.patch, HIVE-22382.3.patch, 
> HIVE-22382.4.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Support Decimal64 column division with decimal64 Column




[jira] [Commented] (HIVE-22436) Add more logging to the test.

2019-10-31 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964187#comment-16964187
 ] 

Ashutosh Chauhan commented on HIVE-22436:
-

+1

> Add more logging to the test.
> -
>
> Key: HIVE-22436
> URL: https://issues.apache.org/jira/browse/HIVE-22436
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Slim Bouguerra
>Assignee: Slim Bouguerra
>Priority: Major
> Attachments: HIVE-22436.patch
>
>





[jira] [Updated] (HIVE-22396) CMV creating a Full ACID partitioned table fails because of no writeId

2019-10-23 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22396:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master.

> CMV creating a Full ACID partitioned table fails because of no writeId
> --
>
> Key: HIVE-22396
> URL: https://issues.apache.org/jira/browse/HIVE-22396
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22396.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> create materialized view acid_cmv_part disable rewrite partitioned on (k)
>   stored as orc TBLPROPERTIES ('transactional'='true')
>   as select key k, value from src order by k limit 5;
> ERROR : FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask. MoveTask : Write id is not set in 
> the config by open txn task for migration
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.MoveTask. MoveTask : Write id is not 
> set in the config by open txn task for migration (state=08S01,code=1)




[jira] [Commented] (HIVE-22396) CMV creating a Full ACID partitioned table fails because of no writeId

2019-10-23 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958549#comment-16958549
 ] 

Ashutosh Chauhan commented on HIVE-22396:
-

+1

> CMV creating a Full ACID partitioned table fails because of no writeId
> --
>
> Key: HIVE-22396
> URL: https://issues.apache.org/jira/browse/HIVE-22396
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22396.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> create materialized view acid_cmv_part disable rewrite partitioned on (k)
>   stored as orc TBLPROPERTIES ('transactional'='true')
>   as select key k, value from src order by k limit 5;
> ERROR : FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask. MoveTask : Write id is not set in 
> the config by open txn task for migration
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 1 from org.apache.hadoop.hive.ql.exec.MoveTask. MoveTask : Write id is not 
> set in the config by open txn task for migration (state=08S01,code=1)




[jira] [Commented] (HIVE-22398) Remove Yarn queue management via ShimLoader.

2019-10-23 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958544#comment-16958544
 ] 

Ashutosh Chauhan commented on HIVE-22398:
-

+1 pending tests.

> Remove Yarn queue management via ShimLoader.
> 
>
> Key: HIVE-22398
> URL: https://issues.apache.org/jira/browse/HIVE-22398
> Project: Hive
>  Issue Type: Task
>Reporter: Slim Bouguerra
>Assignee: Slim Bouguerra
>Priority: Major
> Attachments: HIVE-22398.patch
>
>
> Legacy MR Hive used this shim loader to do fair scheduling using non-public 
> Yarn queue APIs.
> This patch removes this code since it is no longer used and the new 
> [YARN-8967|https://issues.apache.org/jira/browse/YARN-8967] changes will 
> break future version upgrades.




[jira] [Updated] (HIVE-22373) File Merge tasks fail when containers are reused

2019-10-23 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22373:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Toshihiko!

> File Merge tasks fail when containers are reused
> 
>
> Key: HIVE-22373
> URL: https://issues.apache.org/jira/browse/HIVE-22373
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: Toshihiko Uchida
>Assignee: Toshihiko Uchida
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22373.patch
>
>
> h1. Problems
> Setting tez.am.container.reuse.enabled=true allows for containers to be 
> reused across multiple tasks.
> When two File Merge tasks run on the same container, the last task fails in 
> renaming the output path.
> Below is an error log of the task 01_0 on the container 
> container_e87_1570604853053_11564_01_03, where the task 04_0 ran 
> before the task 01_0.
> It shows that the task 01_0's output file name is taken from the previous 
> task id 04_0 mistakenly.
> {code}
> 2019-10-15 13:00:31,438 [ERROR] [TezChild] |tez.TezProcessor|: 
> java.lang.RuntimeException: Hive Runtime Error while closing operators
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.close(MergeFileRecordProcessor.java:188)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:284)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
>   at 
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
>   at 
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to close 
> AbstractFileMergeOperator
>   at 
> org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.closeOp(AbstractFileMergeOperator.java:315)
>   at 
> org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:265)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:733)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.close(MergeFileRecordProcessor.java:180)
>   ... 17 more
> Caused by: java.io.IOException: Unable to rename 
> viewfs:///user//.hive-staging_hive_2019-10-15_12-59-32_916_2461818728035733124-15476/_task_tmp.-ext-1/_tmp.04_0
>  to 
> viewfs:///user//.hive-staging_hive_2019-10-15_12-59-32_916_2461818728035733124-15476/_tmp.-ext-1/04_0
>   at 
> org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.closeOp(AbstractFileMergeOperator.java:254)
>   ... 20 more
> {code}
> h1. Causes
> When AbstractFileMergeOperator is initialized, taskId is updated only for the 
> first time.
> - AbstractFileMergeOperator.java
> {code}
> private void updatePaths(Path tp, Path ttp) {
>   if (taskId == null) {
>     taskId = Utilities.getTaskId(jc);
>   }
> {code}
> It leads to the above conflict of the output file names.
> h1. Solutions
> Remove the null-checking conditional, which was introduced in HIVE-14640, and 
> update taskId from JobConf whenever the operator is initialized.
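
The effect of the null check can be reproduced with a small model of an operator whose state survives from one task to the next in a reused container. The class below is a hypothetical sketch, not Hive's `AbstractFileMergeOperator`: the flag `refreshEveryInit`, the method names, and the `_tmp.` file name format are illustrative assumptions.

```java
// Hypothetical sketch of operator state that survives across tasks in a
// reused container. Names are illustrative; this is not Hive's code.
class MergeOperatorSketch {
    private final boolean refreshEveryInit;
    private String taskId;

    MergeOperatorSketch(boolean refreshEveryInit) {
        this.refreshEveryInit = refreshEveryInit;
    }

    // Called once per task with that task's id from its job configuration.
    void initialize(String taskIdFromConf) {
        // With refreshEveryInit == false this mirrors the "only if null" guard:
        // the id of the first task sticks for every later task.
        if (refreshEveryInit || taskId == null) {
            taskId = taskIdFromConf;
        }
    }

    String outputFileName() {
        return "_tmp." + taskId;
    }

    public static void main(String[] args) {
        MergeOperatorSketch guarded = new MergeOperatorSketch(false);
        guarded.initialize("04_0");
        guarded.initialize("01_0"); // second task in the same container
        System.out.println("guarded:   " + guarded.outputFileName());

        MergeOperatorSketch refreshed = new MergeOperatorSketch(true);
        refreshed.initialize("04_0");
        refreshed.initialize("01_0");
        System.out.println("refreshed: " + refreshed.outputFileName());
    }
}
```

The guarded variant keeps naming output after the first task, matching the rename conflict in the log, while the refreshed variant follows the proposed solution of updating the id on every initialization.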




[jira] [Commented] (HIVE-22373) File Merge tasks fail when containers are reused

2019-10-23 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958011#comment-16958011
 ] 

Ashutosh Chauhan commented on HIVE-22373:
-

+1

> File Merge tasks fail when containers are reused
> 
>
> Key: HIVE-22373
> URL: https://issues.apache.org/jira/browse/HIVE-22373
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: Toshihiko Uchida
>Assignee: Toshihiko Uchida
>Priority: Major
> Attachments: HIVE-22373.patch
>
>
> h1. Problems
> Setting tez.am.container.reuse.enabled=true allows containers to be reused 
> across multiple tasks.
> When two File Merge tasks run on the same container, the later task fails to 
> rename its output path.
> Below is the error log of task 01_0 on container 
> container_e87_1570604853053_11564_01_03, where task 04_0 ran before task 
> 01_0.
> It shows that the output file name of task 01_0 is mistakenly taken from the 
> previous task ID, 04_0.
> {code}
> 2019-10-15 13:00:31,438 [ERROR] [TezChild] |tez.TezProcessor|: java.lang.RuntimeException: Hive Runtime Error while closing operators
>   at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.close(MergeFileRecordProcessor.java:188)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:284)
>   at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
>   at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
>   at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
>   at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to close AbstractFileMergeOperator
>   at org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.closeOp(AbstractFileMergeOperator.java:315)
>   at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:265)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:733)
>   at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.close(MergeFileRecordProcessor.java:180)
>   ... 17 more
> Caused by: java.io.IOException: Unable to rename viewfs:///user//.hive-staging_hive_2019-10-15_12-59-32_916_2461818728035733124-15476/_task_tmp.-ext-1/_tmp.04_0 to viewfs:///user//.hive-staging_hive_2019-10-15_12-59-32_916_2461818728035733124-15476/_tmp.-ext-1/04_0
>   at org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.closeOp(AbstractFileMergeOperator.java:254)
>   ... 20 more
> {code}
> h1. Causes
> When AbstractFileMergeOperator is initialized, taskId is set only the first 
> time.
> - AbstractFileMergeOperator.java
> {code}
> private void updatePaths(Path tp, Path ttp) {
>   if (taskId == null) {
>     taskId = Utilities.getTaskId(jc);
>   }
> {code}
> It leads to the above conflict of the output file names.
> h1. Solutions
> Remove the null-checking conditional, which was introduced in HIVE-14640, and 
> update taskId from JobConf whenever the operator is initialized.
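
The failure mode described above can be reproduced in miniature. The following is only a sketch, not Hive code: MergeOperatorSketch, initialize and outputNames are hypothetical names, and it assumes (as the description implies) that the operator's taskId field survives from one task to the next when a container is reused. The task IDs are the ones shown in the log.

```java
import java.util.ArrayList;
import java.util.List;

public class TaskIdReuseSketch {

    /** Hypothetical stand-in for operator state that outlives a single task. */
    static class MergeOperatorSketch {
        private final boolean refreshOnEveryInit;
        private String taskId;

        MergeOperatorSketch(boolean refreshOnEveryInit) {
            this.refreshOnEveryInit = refreshOnEveryInit;
        }

        /** Returns the temporary output name chosen for the task being initialized. */
        String initialize(String taskIdFromConf) {
            // With refreshOnEveryInit == false this mirrors the null check:
            // an id cached by an earlier task is kept.
            // With refreshOnEveryInit == true the id of the current task always wins.
            if (refreshOnEveryInit || taskId == null) {
                taskId = taskIdFromConf;
            }
            return "_tmp." + taskId;
        }
    }

    /** Runs the given task ids one after another on a single reused operator. */
    static List<String> outputNames(boolean refreshOnEveryInit, String... taskIds) {
        MergeOperatorSketch op = new MergeOperatorSketch(refreshOnEveryInit);
        List<String> names = new ArrayList<>();
        for (String id : taskIds) {
            names.add(op.initialize(id));
        }
        return names;
    }

    public static void main(String[] args) {
        // Null-check variant: both tasks end up with the first task's name.
        System.out.println(outputNames(false, "04_0", "01_0"));
        // Refresh-on-init variant: each task gets its own name.
        System.out.println(outputNames(true, "04_0", "01_0"));
    }
}
```

With the null check, the second task computes the same temporary name as the first, which matches the rename conflict in the log; refreshing on every initialization gives each task its own name.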




[jira] [Commented] (HIVE-22339) Change default time for MVs refresh in registry

2019-10-15 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952048#comment-16952048
 ] 

Ashutosh Chauhan commented on HIVE-22339:
-

+1

> Change default time for MVs refresh in registry
> ---
>
> Key: HIVE-22339
> URL: https://issues.apache.org/jira/browse/HIVE-22339
> Project: Hive
>  Issue Type: Improvement
>  Components: Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-22339.patch
>
>
> The default was set to 60 seconds in HIVE-21344. That may be too aggressive; 
> the suggestion is to change the default to 1500 seconds.




[jira] [Commented] (HIVE-22332) Hive should ensure valid schema evolution settings since ORC-540

2019-10-14 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951073#comment-16951073
 ] 

Ashutosh Chauhan commented on HIVE-22332:
-

+1

> Hive should ensure valid schema evolution settings since ORC-540
> 
>
> Key: HIVE-22332
> URL: https://issues.apache.org/jira/browse/HIVE-22332
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Critical
> Fix For: 4.0.0
>
> Attachments: HIVE-22332.01.patch, HIVE-22332.02.patch
>
>
> For details please see: https://issues.apache.org/jira/browse/ORC-558




[jira] [Commented] (HIVE-21344) CBO: Reduce compilation time in presence of materialized views

2019-10-13 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-21344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950704#comment-16950704
 ] 

Ashutosh Chauhan commented on HIVE-21344:
-

Sorry to be late on this, but I think a 60s default for refreshing MVs is too 
aggressive and will put load on both HS2 and HMS. A better default is more like 
900s.

> CBO: Reduce compilation time in presence of materialized views
> --
>
> Key: HIVE-21344
> URL: https://issues.apache.org/jira/browse/HIVE-21344
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views
>Affects Versions: 4.0.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21344.01.patch, HIVE-21344.02.patch, 
> HIVE-21344.03.patch, HIVE-21344.04.patch, HIVE-21344.05.patch, 
> HIVE-21344.patch, calcite-planner-after-fix.svg.zip, mv-get-from-remote.png
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> For every query, {{getAllValidMaterializedViews}} still requires a call to 
> metastore to verify that the materializations exist, whether they are 
> outdated or not, etc. Since this is only useful for active-active HS2 
> deployments, we could take a less aggressive approach and check this 
> information only after rewriting has been triggered. In addition, we could 
> refresh the information in the HS2 registry periodically in a background 
> thread.
> {code}
> // This is not a rebuild, we retrieve all the materializations. In turn, we do not need
> // to force the materialization contents to be up-to-date, as this is not a rebuild, and
> // we apply the user parameters (HIVE_MATERIALIZED_VIEW_REWRITING_TIME_WINDOW) instead.
> materializations = db.getAllValidMaterializedViews(getTablesUsed(basePlan), false, getTxnMgr());
> !mv-get-from-remote.png!
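
The periodic background refresh proposed above could look roughly like the sketch below. It is not the HIVE-21344 patch: RegistryRefreshSketch, refresh and currentViews are hypothetical names, the loader stands in for the metastore call, and the 900-second period in main is only the value suggested in the comment on this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class RegistryRefreshSketch implements AutoCloseable {
    private final Supplier<List<String>> loader;      // stands in for the metastore lookup
    private final ScheduledExecutorService scheduler;
    private volatile List<String> snapshot = Collections.emptyList();

    /** periodSeconds must be positive. */
    RegistryRefreshSketch(Supplier<List<String>> loader, long periodSeconds) {
        this.loader = loader;
        this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "mv-registry-refresh");
            t.setDaemon(true); // do not keep the JVM alive for refreshes
            return t;
        });
        scheduler.scheduleWithFixedDelay(this::refresh, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    /** Reloads the cached list; queries read the snapshot instead of calling the loader. */
    void refresh() {
        try {
            snapshot = Collections.unmodifiableList(new ArrayList<>(loader.get()));
        } catch (RuntimeException e) {
            // Keep the previous snapshot. An exception escaping a scheduled task
            // would cancel all later runs.
        }
    }

    List<String> currentViews() {
        return snapshot;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) {
        try (RegistryRefreshSketch registry =
                 new RegistryRefreshSketch(() -> Arrays.asList("mv1", "mv2"), 900)) {
            registry.refresh(); // force one load instead of waiting for the timer
            System.out.println(registry.currentViews());
        }
    }
}
```

The trade-off is the one discussed in the comment: a shorter period keeps the registry fresher but adds load on HS2 and HMS, while a longer period means queries may plan against a staler list.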




[jira] [Updated] (HIVE-22163) CBO: Enabling CBO turns on stats estimation, even when the estimation is disabled

2019-09-27 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22163:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks, Krisztian!

> CBO: Enabling CBO turns on stats estimation, even when the estimation is 
> disabled
> -
>
> Key: HIVE-22163
> URL: https://issues.apache.org/jira/browse/HIVE-22163
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Gopal Vijayaraghavan
>Assignee: Krisztian Kasa
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22163.1.patch, HIVE-22163.1.patch, 
> HIVE-22163.1.patch, HIVE-22163.2.patch, HIVE-22163.3.patch, 
> HIVE-22163.4.patch, HIVE-22163.4.patch, HIVE-22163.5.patch, 
> HIVE-22163.5.patch, HIVE-22163.5.patch, HIVE-22163.5.patch, 
> HIVE-22163.5.patch, HIVE-22163.5.patch, HIVE-22163.5.patch
>
>
> {code}
> create table claims(claim_rec_id bigint, claim_invoice_num string, typ_c int);
> alter table claims update statistics set 
> ('numRows'='1154941534','rawDataSize'='1135307527922');
> set hive.stats.estimate=false;
> explain extended select count(1) from claims where typ_c=3;
> set hive.stats.ndv.estimate.percent=5e-7;
> explain extended select count(1) from claims where typ_c=3;
> {code}
> Expecting the standard /2 for the single filter, but we instead get 5 rows.
> {code}
> 'Map Operator Tree:'
> 'TableScan'
> '  alias: claims'
> '  filterExpr: (typ_c = 3) (type: boolean)'
> '  Statistics: Num rows: 1154941534 Data size: 4388777832 Basic stats: COMPLETE Column stats: NONE'
> '  GatherStats: false'
> '  Filter Operator'
> 'isSamplingPred: false'
> 'predicate: (typ_c = 3) (type: boolean)'
> 'Statistics: Num rows: 5 Data size: 19 Basic stats: COMPLETE Column stats: NONE'
> {code}
> The estimation is in effect, as changing the estimate.percent changes this.
> {code}
> '  filterExpr: (typ_c = 3) (type: boolean)'
> '  Statistics: Num rows: 1154941534 Data size: 4388777832 Basic stats: COMPLETE Column stats: NONE'
> '  GatherStats: false'
> '  Filter Operator'
> 'isSamplingPred: false'
> 'predicate: (typ_c = 3) (type: boolean)'
> 'Statistics: Num rows: 230988307 Data size: 877755567 Basic stats: COMPLETE Column stats: NONE'
> {code}




[jira] [Commented] (HIVE-22145) Avoid optimizations for analyze compute statistics

2019-09-21 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16935213#comment-16935213
 ] 

Ashutosh Chauhan commented on HIVE-22145:
-

+1

> Avoid optimizations for analyze compute statistics
> --
>
> Key: HIVE-22145
> URL: https://issues.apache.org/jira/browse/HIVE-22145
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22145.1.patch, HIVE-22145.2.patch, 
> HIVE-22145.3.patch
>
>





[jira] [Updated] (HIVE-22201) ConvertJoinMapJoin#checkShuffleSizeForLargeTable throws ArrayIndexOutOfBoundsException if no big table is selected

2019-09-19 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22201:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Himanshu!

> ConvertJoinMapJoin#checkShuffleSizeForLargeTable throws 
> ArrayIndexOutOfBoundsException if no big table is selected
> --
>
> Key: HIVE-22201
> URL: https://issues.apache.org/jira/browse/HIVE-22201
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Himanshu Mishra
>Assignee: Himanshu Mishra
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22201.01.patch, HIVE-22201.02.patch, 
> HIVE-22201.03.patch, HIVE-22201.04.patch, HIVE-22201.05.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When {{bigTableCandidateSet}} is empty (e.g., for a full outer join), we end 
> up calling {{checkShuffleSizeForLargeTable}} with {{bigTablePosition}} set to 
> -1, resulting in an {{ArrayIndexOutOfBoundsException}}.
> Also, should we return as soon as we see that {{bigTableCandidateSet}} is empty?
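
The early-return idea raised in the description can be illustrated with a small guard. This is a sketch only: shuffleSizeWithinLimit, NO_BIG_TABLE and the parameters are hypothetical and do not reflect the real signature of checkShuffleSizeForLargeTable or the committed fix.

```java
public class BigTableGuardSketch {
    /** Hypothetical marker for "no big table candidate was selected". */
    static final int NO_BIG_TABLE = -1;

    /**
     * Returns true only when a big table was selected and its size is within the limit.
     * How the caller treats the "no big table" case is a design choice; here it is
     * reported as false so the caller can return early.
     */
    static boolean shuffleSizeWithinLimit(long[] inputSizes, int bigTablePosition, long maxSize) {
        if (bigTablePosition == NO_BIG_TABLE) {
            return false; // guard: never index the array with -1
        }
        return inputSizes[bigTablePosition] <= maxSize;
    }

    public static void main(String[] args) {
        long[] sizes = {100L, 5_000L};
        System.out.println(shuffleSizeWithinLimit(sizes, 1, 10_000L));            // true
        System.out.println(shuffleSizeWithinLimit(sizes, NO_BIG_TABLE, 10_000L)); // false, no exception
    }
}
```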




[jira] [Commented] (HIVE-17131) Add InterfaceAudience and InterfaceStability annotations for SerDe APIs

2019-09-19 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-17131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933856#comment-16933856
 ] 

Ashutosh Chauhan commented on HIVE-17131:
-

yes for HIVE-16374 for master.

> Add InterfaceAudience and InterfaceStability annotations for SerDe APIs
> ---
>
> Key: HIVE-17131
> URL: https://issues.apache.org/jira/browse/HIVE-17131
> Project: Hive
>  Issue Type: Sub-task
>  Components: Serializers/Deserializers
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: 2.4.0
>
> Attachments: HIVE-17131.1.branch-2.patch, HIVE-17131.1.patch
>
>
> Adding InterfaceAudience and InterfaceStability annotations for the core 
> SerDe APIs.




[jira] [Updated] (HIVE-20983) Vectorization: Scale up small hashtables, when collisions are detected

2019-09-18 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-20983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-20983:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Gopal and Mustafa!

> Vectorization: Scale up small hashtables, when collisions are detected
> --
>
> Key: HIVE-20983
> URL: https://issues.apache.org/jira/browse/HIVE-20983
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20983.1.patch, HIVE-20983.2.patch, 
> HIVE-20983.3.patch, HIVE-20983.4.patch, HIVE-20983.5.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive's hashtable estimates are getting better with HyperLogLog stats in 
> place, but an accurate estimate does not always result in a low number of 
> collisions.
> Hashtables that contain a very small number of items tend to lose their 
> O(1) lookup performance when there are collisions. Since collisions are easy 
> to detect within the fast hashtable implementation, rehashing to a larger 
> size will help these small hashtables avoid collisions and return to O(1) 
> performance.
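
To make the idea concrete, here is a minimal sketch of a small open-addressing set of long keys that doubles its capacity as soon as an insert has to probe past its home slot. It is not Hive's fast hashtable implementation: the class name, the linear probing scheme, the hash function and the SMALL_TABLE_LIMIT cutoff are all assumptions chosen for illustration.

```java
public class SmallTableRehashSketch {
    // Hypothetical cutoff: only tables below this capacity are scaled up on collision.
    private static final int SMALL_TABLE_LIMIT = 1024;

    private long[] keys;
    private boolean[] used;
    private int size;

    /** initialCapacity must be a power of two and at least 2. */
    SmallTableRehashSketch(int initialCapacity) {
        keys = new long[initialCapacity];
        used = new boolean[initialCapacity];
    }

    int capacity() {
        return keys.length;
    }

    void add(long key) {
        if (size + 1 >= keys.length) {
            resize(keys.length * 2); // always keep one empty slot so probing terminates
        }
        int probes = insert(keys, used, key);
        if (probes < 0) {
            return; // key was already present
        }
        size++;
        if (probes > 0 && keys.length < SMALL_TABLE_LIMIT) {
            resize(keys.length * 2); // collision detected on a small table: scale up
        }
    }

    boolean contains(long key) {
        int mask = keys.length - 1;
        int idx = hash(key) & mask;
        while (used[idx]) {
            if (keys[idx] == key) {
                return true;
            }
            idx = (idx + 1) & mask;
        }
        return false;
    }

    private static int hash(long key) {
        return (int) (key ^ (key >>> 32));
    }

    /** Linear-probing insert; returns the number of extra probes, or -1 if the key exists. */
    private static int insert(long[] keys, boolean[] used, long key) {
        int mask = keys.length - 1;
        int idx = hash(key) & mask;
        int probes = 0;
        while (used[idx]) {
            if (keys[idx] == key) {
                return -1;
            }
            idx = (idx + 1) & mask;
            probes++;
        }
        used[idx] = true;
        keys[idx] = key;
        return probes;
    }

    private void resize(int newCapacity) {
        long[] newKeys = new long[newCapacity];
        boolean[] newUsed = new boolean[newCapacity];
        for (int i = 0; i < keys.length; i++) {
            if (used[i]) {
                insert(newKeys, newUsed, keys[i]);
            }
        }
        keys = newKeys;
        used = newUsed;
    }

    public static void main(String[] args) {
        SmallTableRehashSketch table = new SmallTableRehashSketch(4);
        table.add(1L);
        table.add(5L); // 1 and 5 share a home slot at capacity 4
        System.out.println(table.capacity());
    }
}
```

In this sketch the keys 1 and 5 collide at capacity 4, so the second insert triggers a resize, after which each key sits in its own home slot. A single resize does not guarantee a collision-free table in general; it only trades a little memory for shorter probe sequences on small tables.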




[jira] [Commented] (HIVE-20983) Vectorization: Scale up small hashtables, when collisions are detected

2019-09-18 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-20983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932477#comment-16932477
 ] 

Ashutosh Chauhan commented on HIVE-20983:
-

+1

> Vectorization: Scale up small hashtables, when collisions are detected
> --
>
> Key: HIVE-20983
> URL: https://issues.apache.org/jira/browse/HIVE-20983
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20983.1.patch, HIVE-20983.2.patch, 
> HIVE-20983.3.patch, HIVE-20983.4.patch, HIVE-20983.5.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive's hashtable estimates are getting better with HyperLogLog stats in 
> place, but an accurate estimate does not always result in a low number of 
> collisions.
> Hashtables that contain a very small number of items tend to lose their 
> O(1) lookup performance when there are collisions. Since collisions are easy 
> to detect within the fast hashtable implementation, rehashing to a larger 
> size will help these small hashtables avoid collisions and return to O(1) 
> performance.




[jira] [Commented] (HIVE-22201) ConvertJoinMapJoin#checkShuffleSizeForLargeTable throws ArrayIndexOutOfBoundsException if no big table is selected

2019-09-18 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932258#comment-16932258
 ] 

Ashutosh Chauhan commented on HIVE-22201:
-

+1
 [~himanshum] we require a green run. Can you please re-upload your patch one 
more time to get a new run?

> ConvertJoinMapJoin#checkShuffleSizeForLargeTable throws 
> ArrayIndexOutOfBoundsException if no big table is selected
> --
>
> Key: HIVE-22201
> URL: https://issues.apache.org/jira/browse/HIVE-22201
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Himanshu Mishra
>Assignee: Himanshu Mishra
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22201.01.patch, HIVE-22201.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When {{bigTableCandidateSet}} is empty (e.g., for a full outer join), we end 
> up calling {{checkShuffleSizeForLargeTable}} with {{bigTablePosition}} set to 
> -1, resulting in an {{ArrayIndexOutOfBoundsException}}.
> Also, should we return as soon as we see that {{bigTableCandidateSet}} is empty?




[jira] [Commented] (HIVE-22213) TxnHander cleanupRecords should only clean records belonging to default catalog

2019-09-17 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932072#comment-16932072
 ] 

Ashutosh Chauhan commented on HIVE-22213:
-

+1

> TxnHander cleanupRecords should only clean records belonging to default 
> catalog
> ---
>
> Key: HIVE-22213
> URL: https://issues.apache.org/jira/browse/HIVE-22213
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22213.1.patch
>
>
> Currently it removes records for the given database and table without 
> checking the catalog; as a result, it can end up removing records when it 
> shouldn't. 




[jira] [Updated] (HIVE-22169) Tez: SplitGenerator tries to look for plan files which won't exist for Tez

2019-09-16 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22169:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Gopal!

> Tez: SplitGenerator tries to look for plan files which won't exist for Tez
> --
>
> Key: HIVE-22169
> URL: https://issues.apache.org/jira/browse/HIVE-22169
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22169.1.patch, HIVE-22169.1.patch, 
> HIVE-22169.1.patch
>
>
> {code}
>   at org.apache.hadoop.hive.ql.exec.Utilities.clearWork(Utilities.java:310)
>   at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:318)
>   at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
>   at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
>   at java.security.AccessController.doPrivileged(Native Method)
> {code}
> The split generator tries to clear out the work items from HDFS, which will 
> never exist for Tez plans.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

[jira] [Commented] (HIVE-22200) Hash collision may cause column resolution to fail

2019-09-14 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929814#comment-16929814
 ] 

Ashutosh Chauhan commented on HIVE-22200:
-

+1

> Hash collision may cause column resolution to fail
> --
>
> Key: HIVE-22200
> URL: https://issues.apache.org/jira/browse/HIVE-22200
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>  Labels: pull-request-available
> Attachments: HIVE-22200.01.patch, HIVE-22200.patch, HIVE-22200.patch, 
> HIVE-22200.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{ExprNodeDescUtils.getExprNodeColumnDesc}} extracts the 
> {{ExprNodeColumnDesc}} objects (column descriptors) from an expression. 
> Internally, it creates a map from hash to the object itself. If the same hash 
> value is generated for two different objects, they clash in the map and some 
> expressions are missing from its values.
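
The clash is easy to reproduce outside Hive. The sketch below uses plain strings as stand-ins for column descriptors (an assumption made for illustration; the method names are hypothetical) and relies on the well-known fact that "Aa" and "BB" have the same String.hashCode().

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class HashKeyCollisionSketch {

    /** Keys the map by hash code only: two different objects with equal hashes overwrite each other. */
    static int distinctByHash(List<String> columns) {
        Map<Integer, String> byHash = new HashMap<>();
        for (String column : columns) {
            byHash.put(column.hashCode(), column);
        }
        return byHash.size();
    }

    /** Keys by the object itself, so equals() disambiguates entries that share a hash. */
    static int distinctByEquality(List<String> columns) {
        Set<String> distinct = new LinkedHashSet<>(columns);
        return distinct.size();
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("Aa", "BB"); // both hash to 2112
        System.out.println(distinctByHash(columns));     // 1: one column was silently dropped
        System.out.println(distinctByEquality(columns)); // 2: both columns are kept
    }
}
```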




[jira] [Commented] (HIVE-22204) Beeline option to show/not show execution report

2019-09-13 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929602#comment-16929602
 ] 

Ashutosh Chauhan commented on HIVE-22204:
-

+1 pending tests

> Beeline option to show/not show execution report
> 
>
> Key: HIVE-22204
> URL: https://issues.apache.org/jira/browse/HIVE-22204
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Attachments: HIVE-22204.patch
>
>
> Currently, {{\-\-silent=true}} will also remove the short report about 
> execution (includes number of rows returned by a query and execution time). 
> It would be interesting to control whether we want to show that report even 
> if {{\-\-silent=true}}, e.g., using an option {{\-\-report=true}}. Default 
> (existing) behavior should not change.




[jira] [Commented] (HIVE-22079) Post order walker for iterating over expression tree

2019-09-11 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928219#comment-16928219
 ] 

Ashutosh Chauhan commented on HIVE-22079:
-

It would be great to see the perf impact of this. Any measurements on any query 
before and after?
Also consider adding JMH benchmarks in tests.

> Post order walker for iterating over expression tree
> 
>
> Key: HIVE-22079
> URL: https://issues.apache.org/jira/browse/HIVE-22079
> Project: Hive
>  Issue Type: Improvement
>  Components: Logical Optimizer, Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22079.1.patch, HIVE-22079.2.patch, 
> HIVE-22079.3.patch, HIVE-22079.4.patch
>
>
> Currently, {{DefaultGraphWalker}} is used to iterate over an expression tree. 
> This walker uses a hash map to keep track of visited/processed nodes. If an 
> expression tree is large, this adds significant overhead due to map lookups.
> For expression trees we can instead use post-order traversal and avoid 
> using a map.
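
A post-order walk needs no visited set because, in a tree, every node is reached through exactly one parent. The sketch below shows the traversal on a toy expression tree; Node and postOrder are hypothetical names, not the walker added by the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PostOrderWalkSketch {

    /** Minimal expression node: a label plus child expressions. */
    static class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    /** Returns the labels in post-order: all children first, then the node itself. */
    static List<String> postOrder(Node root) {
        List<String> out = new ArrayList<>();
        visit(root, out);
        return out;
    }

    // Recursive for brevity; a very deep tree would call for an explicit stack instead.
    private static void visit(Node node, List<String> out) {
        for (Node child : node.children) {
            visit(child, out);
        }
        out.add(node.label); // each node is processed once, with no map lookup
    }

    public static void main(String[] args) {
        // (a + b) * c
        Node expr = new Node("*", new Node("+", new Node("a"), new Node("b")), new Node("c"));
        System.out.println(postOrder(expr)); // [a, b, +, c, *]
    }
}
```

This only holds for trees. If sub-expressions were shared between parents (a DAG), a plain post-order walk would process the shared nodes more than once.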




[jira] [Updated] (HIVE-22076) JDK11: Remove ParallelGC in debug.sh

2019-09-09 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22076:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Gopal!

> JDK11: Remove ParallelGC in debug.sh
> 
>
> Key: HIVE-22076
> URL: https://issues.apache.org/jira/browse/HIVE-22076
> Project: Hive
>  Issue Type: Bug
>  Components: Diagnosability
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22076.1.patch
>
>
> The JDK debug mode no longer depends on ParallelGC. 
> This was a workaround for a JDK6 bug - 
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6862295




[jira] [Commented] (HIVE-22076) JDK11: Remove ParallelGC in debug.sh

2019-09-09 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16926181#comment-16926181
 ] 

Ashutosh Chauhan commented on HIVE-22076:
-

+1

> JDK11: Remove ParallelGC in debug.sh
> 
>
> Key: HIVE-22076
> URL: https://issues.apache.org/jira/browse/HIVE-22076
> Project: Hive
>  Issue Type: Bug
>  Components: Diagnosability
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Minor
> Attachments: HIVE-22076.1.patch
>
>
> The JDK debug mode no longer depends on ParallelGC. 
> This was a workaround for a JDK6 bug - 
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6862295




[jira] [Commented] (HIVE-20683) Add the Ability to push Dynamic Between and Bloom filters to Druid

2019-09-07 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-20683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925046#comment-16925046
 ] 

Ashutosh Chauhan commented on HIVE-20683:
-

[~bslim] can you please review this?

> Add the Ability to push Dynamic Between and Bloom filters to Druid
> --
>
> Key: HIVE-20683
> URL: https://issues.apache.org/jira/browse/HIVE-20683
> Project: Hive
>  Issue Type: New Feature
>  Components: Druid integration
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20683.1.patch, HIVE-20683.2.patch, 
> HIVE-20683.3.patch, HIVE-20683.4.patch, HIVE-20683.5.patch, 
> HIVE-20683.6.patch, HIVE-20683.8.patch, HIVE-20683.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> To optimize joins, Hive generates a BETWEEN filter with min-max values and a 
> BLOOM filter for filtering one side of a semi-join.
> Druid 0.13.0 will have support for Bloom filters (added via 
> https://github.com/apache/incubator-druid/pull/6222).
> Implementation details - 
> # Hive generates and passes the filters as part of 'filterExpr' in TableScan. 
> # DruidQueryBasedRecordReader gets this filter passed as part of the conf. 
> # During execution phase, before sending the query to druid in 
> DruidQueryBasedRecordReader we will deserialize this filter, translate it 
> into a DruidDimFilter and add it to existing DruidQuery.  Tez executor 
> already ensures that when we start reading results from the record reader, 
> all the dynamic values are initialized. 
> # Explaining a druid query also prints the query sent to druid as 
> {{druid.json.query}}. We also need to make sure to update the druid query 
> with the filters. During explain we do not have the actual values for the 
> dynamic values, so instead of values we will print the dynamic expression 
> itself as part of druid query. 
> Note: this work requires Druid to be updated to version 0.13.0.




[jira] [Commented] (HIVE-22161) UDF: FunctionRegistry synchronizes on org.apache.hadoop.hive.ql.udf.UDFType class

2019-08-30 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919814#comment-16919814
 ] 

Ashutosh Chauhan commented on HIVE-22161:
-

+1 pending tests.

> UDF: FunctionRegistry synchronizes on org.apache.hadoop.hive.ql.udf.UDFType 
> class
> -
>
> Key: HIVE-22161
> URL: https://issues.apache.org/jira/browse/HIVE-22161
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-22161.1.patch
>
>
> There's a hidden synchronization across threads when looking up isStateful 
> and isDeterministic.
> https://github.com/apache/hive/blob/master/common/src/java/org/apache/hive/common/util/AnnotationUtils.java#L27
> {code}
>   // to avoid https://bugs.openjdk.java.net/browse/JDK-7122142
>   public static <T extends Annotation> T getAnnotation(Class<?> clazz, Class<T> annotationClass) {
>     synchronized (annotationClass) {
>       return clazz.getAnnotation(annotationClass);
>     }
>   }
> {code}
> This serializes multiple threads initializing UDFs (or checking them during 
> compilation); in this specific scenario, the lock is also contended across 
> threads for each instance of GenericUDFOpEqual.
> https://bugs.openjdk.java.net/browse/JDK-7122142 is fixed in JDK 8+.
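
Since the JDK bug is reported fixed in JDK 8+, one option is to drop the synchronized block and call Class.getAnnotation directly. The sketch below assumes that is acceptable on the supported JDKs; it is not the attached patch. Deterministic, RandomLikeUdf and PlainUdf are hypothetical stand-ins for UDFType and real UDF classes.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class AnnotationLookupSketch {

    /** Hypothetical stand-in for an annotation such as UDFType. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Deterministic {
        boolean value() default true;
    }

    @Deterministic(false)
    static class RandomLikeUdf { }

    static class PlainUdf { }

    /** Same lookup as the quoted helper, but without locking on the annotation class. */
    static <T extends Annotation> T getAnnotation(Class<?> clazz, Class<T> annotationClass) {
        return clazz.getAnnotation(annotationClass);
    }

    /** Classes without the annotation are treated as deterministic in this sketch. */
    static boolean isDeterministic(Class<?> udfClass) {
        Deterministic annotation = getAnnotation(udfClass, Deterministic.class);
        return annotation == null || annotation.value();
    }

    public static void main(String[] args) {
        System.out.println(isDeterministic(RandomLikeUdf.class)); // false
        System.out.println(isDeterministic(PlainUdf.class));      // true
    }
}
```

Without the shared lock, threads looking up the same annotation type no longer queue behind each other.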




[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-29 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master.

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22151.2.patch, HIVE-22151.4.patch, HIVE-22151.patch
>
>





[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-29 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Status: Patch Available  (was: Open)

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.2.patch, HIVE-22151.4.patch, HIVE-22151.patch
>
>





[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-29 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Attachment: HIVE-22151.4.patch

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.2.patch, HIVE-22151.4.patch, HIVE-22151.patch
>
>





[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-29 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Status: Open  (was: Patch Available)

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.2.patch, HIVE-22151.patch
>
>





[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-29 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Attachment: (was: HIVE-22151.3.patch)

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.2.patch, HIVE-22151.patch
>
>





[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-29 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Attachment: HIVE-22151.3.patch

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.2.patch, HIVE-22151.3.patch, HIVE-22151.patch
>
>





[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-29 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Status: Patch Available  (was: Open)

[~vgarg] Does my explanation make sense? Can you please review the patch?

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.2.patch, HIVE-22151.3.patch, HIVE-22151.patch
>
>





[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-29 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Status: Open  (was: Patch Available)

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.2.patch, HIVE-22151.3.patch, HIVE-22151.patch
>
>





[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-29 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Attachment: HIVE-22151.2.patch

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.2.patch, HIVE-22151.patch
>
>





[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-29 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Status: Patch Available  (was: Open)

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.2.patch, HIVE-22151.patch
>
>





[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-29 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Status: Open  (was: Patch Available)

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.patch
>
>





[jira] [Commented] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-28 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918309#comment-16918309
 ] 

Ashutosh Chauhan commented on HIVE-22151:
-

This is a perf config, but we never got it to perform reliably. It was 
introduced to prevent MapJoin from going OOM by spilling data to disk. What we 
found in practice is that if the hashtable fits in memory, it performs worse 
than MapJoin, and if it doesn't fit and spills, perf still suffers quite a 
bit. Determining when to spill is also not easy, so you either spill 
unnecessarily or spill too late. The biggest issue was the first one, i.e., 
this impl is slow compared to MapJoin when there is no spilling.
As a result, this is turned off at most sites. The most recent instance was with 
[~rameshkumar], who discovered that this doesn't work well with vectorization 
and throws an exception. So, my suggestion is to have this turned off by default.
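
For sites that want the same behaviour before the default changes, a minimal 
session-level sketch follows. It assumes the config in question is 
hive.mapjoin.hybridgrace.hashtable; the property name is not stated in this 
thread, so verify it against the HiveConf of your Hive version before relying 
on it.

```sql
-- Assumption: hive.mapjoin.hybridgrace.hashtable is the hybrid grace hash join switch.
-- Print the current value for this session.
set hive.mapjoin.hybridgrace.hashtable;
-- Turn hybrid grace hash join off; regular MapJoin conversion is left untouched.
set hive.mapjoin.hybridgrace.hashtable=false;
```

This only affects the current session; a site-wide change would go in 
hive-site.xml instead.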

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.patch
>
>





[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-27 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Status: Patch Available  (was: Open)

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.patch
>
>





[jira] [Updated] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-27 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22151:

Attachment: HIVE-22151.patch

> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22151.patch
>
>





[jira] [Assigned] (HIVE-22151) Turn off hybrid grace hash join by default

2019-08-27 Thread Ashutosh Chauhan (Jira)



 [ 
https://issues.apache.org/jira/browse/HIVE-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-22151:
---


> Turn off hybrid grace hash join by default
> --
>
> Key: HIVE-22151
> URL: https://issues.apache.org/jira/browse/HIVE-22151
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
>





[jira] [Commented] (HIVE-22107) Correlated subquery producing wrong schema

2019-08-21 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912703#comment-16912703
 ] 

Ashutosh Chauhan commented on HIVE-22107:
-

+1

> Correlated subquery producing wrong schema
> --
>
> Key: HIVE-22107
> URL: https://issues.apache.org/jira/browse/HIVE-22107
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22107.1.patch, HIVE-22107.2.patch, 
> HIVE-22107.3.patch, HIVE-22107.4.patch, HIVE-22107.5.patch
>
>
> *Repro*
> {code:sql}
> create table test(id int, name string,dept string);
> insert into test values(1,'a','it'),(2,'b','eee'),(NULL, 'c', 'cse');
> select distinct 'empno' as eid, a.id from test a where NOT EXISTS (select 
> c.id from test c where a.id=c.id);
> {code}
> {code}
> +---++
> |  eid  |  a.id  |
> +---++
> | NULL  | empno  |
> +---++
> {code}




[jira] [Commented] (HIVE-22125) Move to Kafka 2.3 Clients

2019-08-21 Thread Ashutosh Chauhan (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912557#comment-16912557
 ] 

Ashutosh Chauhan commented on HIVE-22125:
-

+1


> Move to Kafka 2.3 Clients
> -
>
> Key: HIVE-22125
> URL: https://issues.apache.org/jira/browse/HIVE-22125
> Project: Hive
>  Issue Type: Improvement
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-22125.patch
>
>





[jira] [Updated] (HIVE-22114) insert query for partitioned insert only table failing when all buckets are empty

2019-08-14 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22114:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> insert query for partitioned insert only table failing when all buckets are 
> empty
> -
>
> Key: HIVE-22114
> URL: https://issues.apache.org/jira/browse/HIVE-22114
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22114.1.patch
>
>
> The following insert query fails when all buckets are empty:
> {code:sql}
> create table src_emptybucket_partitioned_1 (name string, age int, gpa 
> decimal(3,2))
>partitioned by(year int)
>clustered by (age)
>sorted by (age)
>into 100 buckets
>stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> create table src1(name string, age int, gpa decimal(3,2));
> insert into src1 values("name", 56, 4);
> insert into table src_emptybucket_partitioned_1
>partition(year=2015)
>select * from src1 limit 0;
> {code}
> Error:
> {noformat}
> ERROR : Job Commit failed with exception 
> 'org.apache.hadoop.hive.ql.metadata.HiveException(java.io.FileNotFoundException:
>  No such file or directory: 
> s3a://warehouse/tablespace/managed/hive/src_emptybucket_partitioned/year=2015)'
> # org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.FileNotFoundException: No such file or directory: 
> s3a:///warehouse/tablespace/managed/hive/src_emptybucket_partitioned/year=2015
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1403)
>   at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:798)
>   at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803)
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.close(TezTask.java:590)
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:327)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2335)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2002)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1674)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1372)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1366)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: No such file or directory: 
> s3a:///warehouse/tablespace/managed/hive/src_emptybucket_partitioned/year=2015
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2805)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2694)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2587)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:2388)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$10(S3AFileSystem.java:2367)
>   at org.apac

[jira] [Commented] (HIVE-22114) insert query for partitioned insert only table failing when all buckets are empty

2019-08-14 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-22114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907797#comment-16907797
 ] 

Ashutosh Chauhan commented on HIVE-22114:
-

+1

> insert query for partitioned insert only table failing when all buckets are 
> empty
> -
>
> Key: HIVE-22114
> URL: https://issues.apache.org/jira/browse/HIVE-22114
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-22114.1.patch
>
>
> The following insert query fails when all buckets are empty:
> {code:sql}
> create table src_emptybucket_partitioned_1 (name string, age int, gpa 
> decimal(3,2))
>partitioned by(year int)
>clustered by (age)
>sorted by (age)
>into 100 buckets
>stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> create table src1(name string, age int, gpa decimal(3,2));
> insert into src1 values("name", 56, 4);
> insert into table src_emptybucket_partitioned_1
>partition(year=2015)
>select * from src1 limit 0;
> {code}
> Error:
> {noformat}
> ERROR : Job Commit failed with exception 
> 'org.apache.hadoop.hive.ql.metadata.HiveException(java.io.FileNotFoundException:
>  No such file or directory: 
> s3a://warehouse/tablespace/managed/hive/src_emptybucket_partitioned/year=2015)'
> # org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.FileNotFoundException: No such file or directory: 
> s3a:///warehouse/tablespace/managed/hive/src_emptybucket_partitioned/year=2015
>   at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1403)
>   at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:798)
>   at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:803)
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.close(TezTask.java:590)
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:327)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2335)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2002)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1674)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1372)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1366)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: No such file or directory: 
> s3a:///warehouse/tablespace/managed/hive/src_emptybucket_partitioned/year=2015
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2805)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2694)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2587)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:2388)
>   at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$10(S3AFileSystem.java:2367)
>   at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
>   at 
> org.apache.hadoop.fs.s3a.S3AF

[jira] [Updated] (HIVE-22112) update jackson version in disconnected poms

2019-08-14 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22112:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> update jackson version in disconnected poms 
> 
>
> Key: HIVE-22112
> URL: https://issues.apache.org/jira/browse/HIVE-22112
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22112.patch
>
>
> Jackson was updated in the top-level pom via HIVE-22089.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

[jira] [Updated] (HIVE-22112) update jackson version in disconnected poms

2019-08-14 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22112:

Status: Patch Available  (was: Open)

> update jackson version in disconnected poms 
> 
>
> Key: HIVE-22112
> URL: https://issues.apache.org/jira/browse/HIVE-22112
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22112.patch
>
>
> Jackson was updated in the top-level pom via HIVE-22089.




[jira] [Updated] (HIVE-22112) update jackson version in disconnected poms

2019-08-14 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22112:

Attachment: HIVE-22112.patch

> update jackson version in disconnected poms 
> 
>
> Key: HIVE-22112
> URL: https://issues.apache.org/jira/browse/HIVE-22112
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22112.patch
>
>
> Jackson was updated in the top-level pom via HIVE-22089.




[jira] [Assigned] (HIVE-22112) update jackson version in disconnected poms

2019-08-14 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-22112:
---


> update jackson version in disconnected poms 
> 
>
> Key: HIVE-22112
> URL: https://issues.apache.org/jira/browse/HIVE-22112
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
>
> Jackson was updated in the top-level pom via HIVE-22089.




[jira] [Updated] (HIVE-22094) queries failing with ClassCastException: hive.ql.exec.vector.DecimalColumnVector cannot be cast to hive.ql.exec.vector.Decimal64ColumnVector

2019-08-10 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22094:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Attila!

> queries failing with ClassCastException: 
> hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> hive.ql.exec.vector.Decimal64ColumnVector
> 
>
> Key: HIVE-22094
> URL: https://issues.apache.org/jira/browse/HIVE-22094
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22094.1.patch, HIVE-22094.2.patch
>
>
> When running a query like this
> select sum(salary.salary_paid) from salary, employee_closure where 
> salary.employee_id = employee_closure.employee_id;
> with hive.auto.convert.join=true and hive.vectorized.execution.enabled=true 
> the following exception occurs
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.VectorUDAFSumDecimal64ToDecimal.aggregateInput(VectorUDAFSumDecimal64ToDecimal.java:320)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeBase.processAggregators(VectorGroupByOperator.java:217)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.doProcessBatch(VectorGroupByOperator.java:414)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeBase.processBatch(VectorGroupByOperator.java:182)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.process(VectorGroupByOperator.java:1124)
> at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:706)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.generateHashMultiSetResultMultiValue(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:268)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.finishInnerBigOnly(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:180)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyLongOperator.processBatch(VectorMapJoinInnerBigOnlyLongOperator.java:379)
> ... 28 more{code}




[jira] [Commented] (HIVE-22094) queries failing with ClassCastException: hive.ql.exec.vector.DecimalColumnVector cannot be cast to hive.ql.exec.vector.Decimal64ColumnVector

2019-08-10 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-22094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904524#comment-16904524
 ] 

Ashutosh Chauhan commented on HIVE-22094:
-

+1

> queries failing with ClassCastException: 
> hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> hive.ql.exec.vector.Decimal64ColumnVector
> 
>
> Key: HIVE-22094
> URL: https://issues.apache.org/jira/browse/HIVE-22094
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22094.1.patch, HIVE-22094.2.patch
>
>
> When running a query like this
> select sum(salary.salary_paid) from salary, employee_closure where 
> salary.employee_id = employee_closure.employee_id;
> with hive.auto.convert.join=true and hive.vectorized.execution.enabled=true 
> the following exception occurs
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.VectorUDAFSumDecimal64ToDecimal.aggregateInput(VectorUDAFSumDecimal64ToDecimal.java:320)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeBase.processAggregators(VectorGroupByOperator.java:217)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.doProcessBatch(VectorGroupByOperator.java:414)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeBase.processBatch(VectorGroupByOperator.java:182)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.process(VectorGroupByOperator.java:1124)
> at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:706)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.generateHashMultiSetResultMultiValue(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:268)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.finishInnerBigOnly(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:180)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyLongOperator.processBatch(VectorMapJoinInnerBigOnlyLongOperator.java:379)
> ... 28 more{code}




[jira] [Updated] (HIVE-22094) queries failing with ClassCastException: hive.ql.exec.vector.DecimalColumnVector cannot be cast to hive.ql.exec.vector.Decimal64ColumnVector

2019-08-10 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22094:

Summary: queries failing with ClassCastException: 
hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
hive.ql.exec.vector.Decimal64ColumnVector  (was: Mondrian queries failing with 
ClassCastException: hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
hive.ql.exec.vector.Decimal64ColumnVector)

> queries failing with ClassCastException: 
> hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> hive.ql.exec.vector.Decimal64ColumnVector
> 
>
> Key: HIVE-22094
> URL: https://issues.apache.org/jira/browse/HIVE-22094
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22094.1.patch, HIVE-22094.2.patch
>
>
> When running a query like this
> select sum(salary.salary_paid) from salary, employee_closure where 
> salary.employee_id = employee_closure.employee_id;
> with hive.auto.convert.join=true and hive.vectorized.execution.enabled=true 
> the following exception occurs
> {code:java}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.VectorUDAFSumDecimal64ToDecimal.aggregateInput(VectorUDAFSumDecimal64ToDecimal.java:320)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeBase.processAggregators(VectorGroupByOperator.java:217)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.doProcessBatch(VectorGroupByOperator.java:414)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeBase.processBatch(VectorGroupByOperator.java:182)
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.process(VectorGroupByOperator.java:1124)
> at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardOverflow(VectorMapJoinGenerateResultOperator.java:706)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.generateHashMultiSetResultMultiValue(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:268)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.finishInnerBigOnly(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:180)
> at 
> org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyLongOperator.processBatch(VectorMapJoinInnerBigOnlyLongOperator.java:379)
> ... 28 more{code}




[jira] [Updated] (HIVE-22089) Upgrade jackson to 2.9.9

2019-08-08 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22089:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> Upgrade jackson to 2.9.9
> 
>
> Key: HIVE-22089
> URL: https://issues.apache.org/jira/browse/HIVE-22089
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 3.1.0, 3.1.1
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22089.patch
>
>





[jira] [Updated] (HIVE-22090) Upgrade jetty to 9.3.27

2019-08-08 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22090:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> Upgrade jetty to 9.3.27
> ---
>
> Key: HIVE-22090
> URL: https://issues.apache.org/jira/browse/HIVE-22090
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22090.patch
>
>





[jira] [Updated] (HIVE-22090) Upgrade jetty to 9.3.27

2019-08-07 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22090:

Status: Patch Available  (was: Open)

> Upgrade jetty to 9.3.27
> ---
>
> Key: HIVE-22090
> URL: https://issues.apache.org/jira/browse/HIVE-22090
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22090.patch
>
>





[jira] [Updated] (HIVE-22090) Upgrade jetty to 9.3.27

2019-08-07 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22090:

Attachment: HIVE-22090.patch

> Upgrade jetty to 9.3.27
> ---
>
> Key: HIVE-22090
> URL: https://issues.apache.org/jira/browse/HIVE-22090
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22090.patch
>
>





[jira] [Assigned] (HIVE-22090) Upgrade jetty to 9.3.27

2019-08-07 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-22090:
---


> Upgrade jetty to 9.3.27
> ---
>
> Key: HIVE-22090
> URL: https://issues.apache.org/jira/browse/HIVE-22090
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
>





[jira] [Updated] (HIVE-22089) Upgrade jackson to 2.9.9

2019-08-07 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22089:

Attachment: HIVE-22089.patch

> Upgrade jackson to 2.9.9
> 
>
> Key: HIVE-22089
> URL: https://issues.apache.org/jira/browse/HIVE-22089
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 3.1.0, 3.1.1
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22089.patch
>
>





[jira] [Updated] (HIVE-22089) Upgrade jackson to 2.9.9

2019-08-07 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-22089:

Status: Patch Available  (was: Open)

> Upgrade jackson to 2.9.9
> 
>
> Key: HIVE-22089
> URL: https://issues.apache.org/jira/browse/HIVE-22089
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.1, 3.1.0, 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-22089.patch
>
>





[jira] [Assigned] (HIVE-22089) Upgrade jackson to 2.9.9

2019-08-07 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-22089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-22089:
---


> Upgrade jackson to 2.9.9
> 
>
> Key: HIVE-22089
> URL: https://issues.apache.org/jira/browse/HIVE-22089
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.1, 3.1.0, 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
>





[jira] [Updated] (HIVE-21828) Tez: Use a pre-parsed TezConfiguration from DagUtils

2019-08-05 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-21828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21828:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Attila!

> Tez: Use a pre-parsed TezConfiguration from DagUtils
> 
>
> Key: HIVE-21828
> URL: https://issues.apache.org/jira/browse/HIVE-21828
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21828.1.patch, HIVE-21828.2.patch, 
> HIVE-21828.5.patch, HIVE-21828.6.patch, HIVE-21828.7.patch, HIVE-21828.8.patch
>
>
> The HS2 tez-site.xml does not change dynamically - the XML parsed components 
> of the config can be obtained statically and kept across sessions.
> This allows for the replacing of "new TezConfiguration()" with a HS2 local 
> version instead.
> The configuration object however has to reference the right resource file 
> (i.e location of tez-site.xml) without reparsing it for each query.
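
The idea in the description above (parse the site file once, then reuse it for every session) can be sketched as follows. This is only an illustration under stated assumptions, not the attached patch: `CachedSiteConfig`, `loadOnce` and `newSessionConfig` are hypothetical names, and `java.util.Properties` stands in for `TezConfiguration` so the example runs without Hadoop or Tez on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in: parses a config resource once and hands out copies. */
final class CachedSiteConfig {
    // Counts how often the expensive parse step ran (for illustration only).
    static final AtomicInteger PARSE_COUNT = new AtomicInteger();

    private final String resourceText; // stands in for the contents of tez-site.xml
    private Properties parsed;         // parsed once, reused across sessions

    CachedSiteConfig(String resourceText) {
        this.resourceText = resourceText;
    }

    /** Parses lazily on first use; later calls reuse the cached result. */
    private synchronized Properties loadOnce() {
        if (parsed == null) {
            Properties p = new Properties();
            try {
                p.load(new StringReader(resourceText));
            } catch (IOException e) {
                throw new IllegalStateException("cannot parse config resource", e);
            }
            PARSE_COUNT.incrementAndGet();
            parsed = p;
        }
        return parsed;
    }

    /** Returns a per-session copy so callers can mutate it without touching the cache. */
    Properties newSessionConfig() {
        Properties copy = new Properties();
        copy.putAll(loadOnce());
        return copy;
    }
}
```

Handing out copies instead of the cached object keeps per-query overrides from leaking between sessions, at the cost of one map copy per session.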




[jira] [Commented] (HIVE-22001) AcidUtils.getAcidState() can fail if Cleaner is removing files at the same time

2019-08-05 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-22001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900611#comment-16900611
 ] 

Ashutosh Chauhan commented on HIVE-22001:
-

+1. During commit, can you please also add a comment on when exactly files may 
get deleted and why this is OK for correctness?

> AcidUtils.getAcidState() can fail if Cleaner is removing files at the same 
> time
> ---
>
> Key: HIVE-22001
> URL: https://issues.apache.org/jira/browse/HIVE-22001
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-22001.1.patch
>
>
> Had one user hit the following error during getSplits
> {noformat}
> 2019-07-06T14:33:03,067 ERROR [4640181a-3eb7-4b3e-9a40-d7a8de9a570c 
> HiveServer2-HttpHandler-Pool: Thread-415519]: SessionState 
> (SessionState.java:printError(1247)) - Vertex failed, vertexName=Map 1, 
> vertexId=vertex_1560947172646_2452_6199_00, diagnostics=[Vertex 
> vertex_1560947172646_2452_6199_00 [Map 1] killed/failed due 
> to:ROOT_INPUT_INIT_FAILURE, Vertex Input: hive_table initializer failed, 
> vertex=vertex_1560947172646_2452_6199_00 [Map 1], java.lang.RuntimeException: 
> ORC split generation failed with exception: java.io.FileNotFoundException: 
> File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 
> does not exist.
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1870)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1958)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524)
> at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:779)
> at 
> org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
> at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
> at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
> at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
> at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
> at 
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
> at 
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.util.concurrent.ExecutionException: 
> java.io.FileNotFoundException: File 
> hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does 
> not exist.
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1809)
> ... 17 more
> Caused by: java.io.FileNotFoundException: File 
> hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does 
> not exist.
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1953)
> at 
> org.apache.

[jira] [Commented] (HIVE-20442) Hive stale lock when the hiveserver2 background thread died with NPE

2019-07-26 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893997#comment-16893997
 ] 

Ashutosh Chauhan commented on HIVE-20442:
-

+1

> Hive stale lock when the hiveserver2 background thread died with NPE
> 
>
> Key: HIVE-20442
> URL: https://issues.apache.org/jira/browse/HIVE-20442
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 2.1.1
> Environment: Hive-2.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-20442.01.branch-2.patch
>
>
> This looks like a race condition where the background thread is not able to 
> release the lock it acquired.
> 1. The hiveserver2 background thread requests the lock
> {code}
> 2018-08-20T14:13:38,813 INFO  [HiveServer2-Background-Pool: Thread-X]: 
> lockmgr.DbLockManager (DbLockManager.java:lock(100)) - Requesting: 
> queryId=hive_xxx LockRequest(component:[LockComponent(type:SHARED_READ, 
> level:TABLE, dbname:testdb, tablename:test_table, operationType:SELECT)], 
> txnid:0, user:hive, hostname:HOSTNAME, agentInfo:hive_xxx)
> {code}
> 2. It acquires the lock and starts heartbeating
> {code}
> 2018-08-20T14:36:30,233 INFO  [HiveServer2-Background-Pool: Thread-X]: 
> lockmgr.DbTxnManager (DbTxnManager.java:startHeartbeat(517)) - Started 
> heartbeat with delay/interval = 15/15 MILLISECONDS for 
> query: agentInfo:hive_xxx
> {code}
> 3. During the time between events #1 and #2, the client disconnected and 
> deleteContext cleaned up the session dir
> {code}
> 2018-08-21T15:39:57,820 INFO  [HiveServer2-Handler-Pool: Thread-XXX]: 
> thrift.ThriftCLIService (ThriftBinaryCLIService.java:deleteContext(136)) - 
> Session disconnected without closing properly.
> 2018-08-21T15:39:57,820 INFO  [HiveServer2-Handler-Pool: Thread-]: 
> thrift.ThriftCLIService (ThriftBinaryCLIService.java:deleteContext(140)) - 
> Closing the session: SessionHandle [3be07faf-5544-4178-8b50-8173002b171a]
> 2018-08-21T15:39:57,820 INFO  [HiveServer2-Handler-Pool: Thread-]: 
> service.CompositeService (SessionManager.java:closeSession(363)) - Session 
> closed, SessionHandle [xxx], current sessions:2
> {code}
> 4. The background thread died with an NPE while trying to get the queryid 
> {code}
> java.lang.NullPointerException: null
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1568) 
> ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414) 
> ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1211) 
> ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1204) 
> ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242)
>  [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
>  [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at 
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:336)
>  [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at java.security.AccessController.doPrivileged(Native Method) 
> [?:1.8.0_77]
> at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_77]
> {code}
> It did not get a chance to release the lock, and the heartbeater thread 
> continues heartbeating indefinitely.
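
The failure mode described above (an exception between acquiring the lock and releasing it leaves the lock held and the heartbeat running) is commonly avoided with a `try`/`finally` around the query body. The sketch below is a generic illustration, not the attached patch; `LockCleanupSketch` and its methods are hypothetical names, and two flags stand in for the real lock manager and heartbeater.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch: release the lock and stop heartbeating even if execution throws. */
final class LockCleanupSketch {
    final AtomicBoolean lockHeld = new AtomicBoolean();
    final AtomicBoolean heartbeating = new AtomicBoolean();

    void acquireLockAndStartHeartbeat() {
        lockHeld.set(true);
        heartbeating.set(true);
    }

    void releaseLockAndStopHeartbeat() {
        heartbeating.set(false);
        lockHeld.set(false);
    }

    /** Runs the query body; cleanup happens on both normal and exceptional exit. */
    void runQuery(Runnable body) {
        acquireLockAndStartHeartbeat();
        try {
            body.run();
        } finally {
            releaseLockAndStopHeartbeat();
        }
    }
}
```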




[jira] [Commented] (HIVE-22043) Make LLAP's Yarn package dir on HDFS configurable

2019-07-24 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-22043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892061#comment-16892061
 ] 

Ashutosh Chauhan commented on HIVE-22043:
-

+1

> Make LLAP's Yarn package dir on HDFS configurable
> -
>
> Key: HIVE-22043
> URL: https://issues.apache.org/jira/browse/HIVE-22043
> Project: Hive
>  Issue Type: New Feature
>Reporter: Adam Szita
>Assignee: Adam Szita
>Priority: Major
> Attachments: HIVE-22043.0.patch
>
>
> Currently at LLAP launch we're using a hardwired HDFS directory to upload 
> libs and configs that are required for LLAP daemons.  This is hive user home 
> directory/.yarn
> I propose to have this configurable instead.




[jira] [Commented] (HIVE-22034) HiveStrictManagedMigration updates DB location even with --dryRun setting on

2019-07-23 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-22034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891313#comment-16891313
 ] 

Ashutosh Chauhan commented on HIVE-22034:
-

+1

> HiveStrictManagedMigration updates DB location even with --dryRun setting on
> 
>
> Key: HIVE-22034
> URL: https://issues.apache.org/jira/browse/HIVE-22034
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-22034.1.patch
>
>
> The logic at the end of processDatabase() to update the DB location in the 
> Metastore should only run if runOptions.dryRun == false.
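
A minimal sketch of the guard described above, assuming nothing about the real method body: `DryRunSketch`, the nested `RunOptions` class and the `metastoreWrites` list are hypothetical stand-ins, with the list recording what would otherwise be sent to the Metastore.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of guarding a metastore write behind a dry-run flag. */
final class DryRunSketch {
    static final class RunOptions {
        final boolean dryRun;
        RunOptions(boolean dryRun) { this.dryRun = dryRun; }
    }

    // Records the writes that would reach the metastore; stands in for a real client.
    final List<String> metastoreWrites = new ArrayList<>();

    void processDatabase(String dbName, String newLocation, RunOptions runOptions) {
        // ... per-table processing would happen here ...
        if (!runOptions.dryRun) {
            // Only mutate the metastore when this is not a dry run.
            metastoreWrites.add(dbName + " -> " + newLocation);
        }
    }
}
```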




[jira] [Commented] (HIVE-22035) HiveStrictManagedMigration settings do not always get set with --hiveconf arguments

2019-07-23 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-22035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891312#comment-16891312
 ] 

Ashutosh Chauhan commented on HIVE-22035:
-

+1

> HiveStrictManagedMigration settings do not always get set with --hiveconf 
> arguments
> ---
>
> Key: HIVE-22035
> URL: https://issues.apache.org/jira/browse/HIVE-22035
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-22035.1.patch
>
>
> Currently the --hiveconf arguments get added to the System properties. While 
> this allows official HiveConf variables to be set in the conf that is loaded 
> by the HiveStrictManagedMigration utility, there are utility-specific 
> configuration settings which we would want to be set from the command line. 
> For example since Ambari knows what the Hive system user name is it would 
> make sense to be able to set strict.managed.tables.migration.owner on the 
> command line when running this utility.
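
One way to make utility-specific settings reachable from the command line is to collect the `--hiveconf key=value` pairs explicitly and apply them to the utility's own configuration, rather than relying only on System properties. The sketch below illustrates that approach and is not the attached patch; `HiveConfArgsSketch`, `parseHiveConfArgs` and `applyTo` are hypothetical names, and a plain `Map` stands in for the configuration object.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: collect "--hiveconf key=value" pairs from the command line. */
final class HiveConfArgsSketch {
    static Map<String, String> parseHiveConfArgs(String[] args) {
        Map<String, String> overrides = new LinkedHashMap<>();
        for (int i = 0; i + 1 < args.length; i++) {
            if (!"--hiveconf".equals(args[i])) {
                continue;
            }
            String pair = args[++i]; // the argument following --hiveconf
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected key=value but got: " + pair);
            }
            overrides.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return overrides;
    }

    /** Applies the overrides to the utility's own settings, not only to System properties. */
    static void applyTo(Map<String, String> conf, Map<String, String> overrides) {
        conf.putAll(overrides);
    }
}
```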




[jira] [Commented] (HIVE-20908) Avoid multiple getTableMeta calls during GetTablesOperation.

2019-07-17 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887468#comment-16887468
 ] 

Ashutosh Chauhan commented on HIVE-20908:
-

I don't see how this patch will improve perf; if anything, it might make it 
worse. You are still making a call per DB, so nothing changes there, but now 
instead of passing a name you are passing a pattern, so you will get more table 
objects than necessary. 
The suggestion on HIVE-19432 was to create a pattern on dbNames and then make a 
*single* metastoreClient.getTableMeta() call to get all table objects in one 
shot, instead of one call per DB.
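
The suggestion above can be sketched as follows. The example only builds the combined database pattern; it assumes that the metastore accepts `|` as an alternation separator in database patterns, and `DbPatternSketch` and `toDbPattern` are hypothetical names. The resulting string would then be passed to one `getTableMeta()` call instead of looping over the databases.

```java
import java.util.List;

/** Hypothetical sketch: one pattern for all authorized databases. */
final class DbPatternSketch {
    /** Joins database names into one alternation pattern, e.g. "sales|hr". */
    static String toDbPattern(List<String> dbNames) {
        if (dbNames.isEmpty()) {
            throw new IllegalArgumentException("no authorized databases");
        }
        return String.join("|", dbNames);
    }
}
```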

> Avoid multiple getTableMeta calls during GetTablesOperation.
> 
>
> Key: HIVE-20908
> URL: https://issues.apache.org/jira/browse/HIVE-20908
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Minor
>  Labels: performance
> Attachments: HIVE-20908.patch
>
>
> Following HIVE-19432, we are doing getTableMeta for each authorized db; 
> instead of that, we can pass a pattern to the metastore.




[jira] [Updated] (HIVE-21728) WorkloadManager logging fix

2019-07-17 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-21728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21728:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Rajkumar!

> WorkloadManager logging fix 
> 
>
> Key: HIVE-21728
> URL: https://issues.apache.org/jira/browse/HIVE-21728
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.2.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21728.patch
>
>
> The logger skips the following message if HS2 is running at INFO level.
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManager.java#L705




[jira] [Commented] (HIVE-21728) WorkloadManager logging fix

2019-07-17 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-21728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887461#comment-16887461
 ] 

Ashutosh Chauhan commented on HIVE-21728:
-

A new logger category is probably a good idea, but this one-line fix is needed 
regardless.
+1

> WorkloadManager logging fix 
> 
>
> Key: HIVE-21728
> URL: https://issues.apache.org/jira/browse/HIVE-21728
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.2.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21728.patch
>
>
> The logger skips the following message if HS2 is running at INFO level.
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManager.java#L705




[jira] [Updated] (HIVE-21828) Tez: Use a pre-parsed TezConfiguration from DagUtils

2019-07-15 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-21828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21828:

Status: Patch Available  (was: Open)

> Tez: Use a pre-parsed TezConfiguration from DagUtils
> 
>
> Key: HIVE-21828
> URL: https://issues.apache.org/jira/browse/HIVE-21828
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Attila Magyar
>Priority: Major
> Attachments: HIVE-21828.1.patch, HIVE-21828.2.patch, 
> HIVE-21828.3.patch
>
>
> The HS2 tez-site.xml does not change dynamically - the XML parsed components 
> of the config can be obtained statically and kept across sessions.
> This allows for the replacing of "new TezConfiguration()" with a HS2 local 
> version instead.
> The configuration object however has to reference the right resource file 
> (i.e location of tez-site.xml) without reparsing it for each query.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

[jira] [Updated] (HIVE-19661) switch Hive UDFs to use Re2J regex engine

2019-06-19 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-19661:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Raj!

> switch Hive UDFs to use Re2J regex engine
> -
>
> Key: HIVE-19661
> URL: https://issues.apache.org/jira/browse/HIVE-19661
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-19661.01.patch, HIVE-19661.02.patch, 
> HIVE-19661.03.patch, HIVE-19661.patch
>
>
> Java regex engine can be very slow in some cases e.g. 
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8203458



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-19661) switch Hive UDFs to use Re2J regex engine

2019-06-13 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863559#comment-16863559
 ] 

Ashutosh Chauhan commented on HIVE-19661:
-

[~alangates] Can you please help with the license? I am not sure whether the 
license permits it to be included.
[~Rajkumar Singh] Why not turn it on by default? Perf gains are impressive.

> switch Hive UDFs to use Re2J regex engine
> -
>
> Key: HIVE-19661
> URL: https://issues.apache.org/jira/browse/HIVE-19661
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19661.01.patch, HIVE-19661.02.patch, 
> HIVE-19661.03.patch, HIVE-19661.patch
>
>
> Java regex engine can be very slow in some cases e.g. 
> https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8203458



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20415) Hive1: Tez Session failed to return if background thread is interrupted

2019-06-10 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-20415:

   Resolution: Fixed
Fix Version/s: 1.3.0
   Status: Resolved  (was: Patch Available)

Pushed to branch-1. Thanks, Rajkumar!

> Hive1: Tez Session failed to return if background thread is interrupted
> ---
>
> Key: HIVE-20415
> URL: https://issues.apache.org/jira/browse/HIVE-20415
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Fix For: 1.3.0
>
> Attachments: HIVE-20415.1-branch-1.patch, 
> HIVE-20415.2-branch-1.patch, HIVE-20415.3-branch-1.2.patch, 
> HIVE-20415.4-branch-1.2.patch
>
>
> The user canceled the query, which interrupts the background thread; because 
> of this interrupt, the background thread fails to put the session back into 
> the pool.
> {code}
> 2018-08-14 15:55:27,581 ERROR exec.Task (TezTask.java:execute(226)) - Failed 
> to execute tez graph.
> java.lang.InterruptedException
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
> at 
> java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
> at 
> java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:350)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.returnSession(TezSessionPoolManager.java:176)
> {code}
> We need a fix similar to HIVE-15731.
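
The stack trace above shows `ArrayBlockingQueue.put` throwing `InterruptedException` because the calling thread already had its interrupt flag set. A common generic remedy is to retry the `put` and restore the interrupt flag afterwards. The sketch below shows that pattern only; it is not taken from HIVE-15731 or the attached patches, and `ReturnSessionSketch` and `putUninterruptibly` are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch: return an item to a pool even if the thread was interrupted. */
final class ReturnSessionSketch {
    /** Puts the session back regardless of interrupts, then restores the interrupt flag. */
    static <T> void putUninterruptibly(BlockingQueue<T> pool, T session) {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    pool.put(session);
                    return;
                } catch (InterruptedException e) {
                    // put() clears the flag when it throws; remember it and retry.
                    interrupted = true;
                }
            }
        } finally {
            if (interrupted) {
                // Re-assert the interrupt so callers further up can still react to it.
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

Restoring the flag matters here: the query was cancelled on purpose, so the code after the session is returned should still see the cancellation.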



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-21742) Vectorization: CASE result type casting

2019-06-07 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21742:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Vineet!
Is there already a follow-up jira for non-cbo path?

> Vectorization: CASE result type casting
> ---
>
> Key: HIVE-21742
> URL: https://issues.apache.org/jira/browse/HIVE-21742
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer, Vectorization
>Affects Versions: 3.1.1
>Reporter: Gopal V
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21742.1.patch, HIVE-21742.2.patch, 
> HIVE-21742.3.patch, HIVE-21742.4.patch, HIVE-21742.5.patch, 
> HIVE-21742.6.patch, HIVE-21799.4.patch
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> {code}
> create temporary table foo(q548284 int);
> insert into foo values(1),(2),(3),(4),(5),(6);
> select q548284, CASE WHEN ((q548284 = 1)) THEN (0.2) WHEN ((q548284 = 2)) 
> THEN (0.4) WHEN ((q548284 = 3)) THEN (0.6) WHEN ((q548284 = 4)) THEN (0.8) 
> WHEN ((q548284 = 5)) THEN (1) ELSE (null) END from foo order by q548284 limit 
> 1;
> {code}
> Fails with 
> {code}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector
> at 
> org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector.setElement(DecimalColumnVector.java:130)
> at 
> org.apache.hadoop.hive.ql.exec.vector.expressions.IfExprColumnNull.evaluate(IfExprColumnNull.java:101)
> {code}
> This gets fixed if the case return of (1) is turned into a (1.0).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-21805) HiveServer2: Use the fast ShutdownHookManager APIs

2019-06-07 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-21805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21805:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Gopal!

> HiveServer2: Use the fast ShutdownHookManager APIs
> --
>
> Key: HIVE-21805
> URL: https://issues.apache.org/jira/browse/HIVE-21805
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
>  Labels: regression
> Fix For: 4.0.0
>
> Attachments: HIVE-21805.1.patch, 
> shutdownhookmanager-configuration.png, txnmanager-shutdownhook-2x.png
>
>
> Hadoop ShutDownHookManager calls "new Configuration()" internally if the 
> parameters are not provided as args.
> This unzips jars & looks for the .xml files in the entire classpath.
>  !shutdownhookmanager-configuration.png! 
> +
>  !txnmanager-shutdownhook-2x.png! 
> Hive has its own impl of ShutDownHookManager, which also has history from the 
> hadoop one (added to code instead of shims).




[jira] [Resolved] (HIVE-21513) ACID: Running merge concurrently with minor compaction causes a later select * to throw exception

2019-06-04 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-21513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-21513.
-
   Resolution: Duplicate
Fix Version/s: 4.0.0

> ACID: Running merge concurrently with minor compaction causes a later select 
> * to throw exception 
> --
>
> Key: HIVE-21513
> URL: https://issues.apache.org/jira/browse/HIVE-21513
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0, 3.1.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Fix For: 4.0.0
>
>
> Repro steps:
> - Create table 
> - Load some data 
> - Run merge so records get updated and delete_delta dirs are created
> - Manually initiate minor compaction: ALTER TABLE ... COMPACT 'minor';
> - While the compaction is running keep executing the merge statement
> - After some time try to do simple select *;



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-21800) Tez: Cartesian product reparses HiveConf XML

2019-06-04 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-21800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856141#comment-16856141
 ] 

Ashutosh Chauhan commented on HIVE-21800:
-

Will HIVE-21828 fix this implicitly by not calling new TezConfiguration() ?

> Tez: Cartesian product reparses HiveConf XML 
> -
>
> Key: HIVE-21800
> URL: https://issues.apache.org/jira/browse/HIVE-21800
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Gopal V
>Priority: Minor
> Attachments: Tez-CartesianProductSlowness.png, hive3-with-cbo.svg
>
>
> {code}
>   CartesianProductConfig cpConfig = new 
> CartesianProductConfig(crossProductSources);
>   edgeManagerDescriptor.setUserPayload(cpConfig.toUserPayload(new 
> TezConfiguration(conf)));
> {code}
>  !Tez-CartesianProductSlowness.png! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-21815) Stats in ORC file are parsed twice

2019-06-04 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-21815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856137#comment-16856137
 ] 

Ashutosh Chauhan commented on HIVE-21815:
-

[~gopalv] If you have references handy to where we are parsing stats twice in 
Hive, can you point them out? Or is this in ORC and we need ORC-500 for it?

> Stats in ORC file are parsed twice
> --
>
> Key: HIVE-21815
> URL: https://issues.apache.org/jira/browse/HIVE-21815
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Reporter: Gopal V
>Priority: Major
>
> ORC record reader unnecessarily parses stats twice



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20854) Sensible Defaults: Hive's Zookeeper heartbeat interval is 20 minutes, change to 2

2019-06-04 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16856134#comment-16856134
 ] 

Ashutosh Chauhan commented on HIVE-20854:
-

This one needs to be pushed in.

> Sensible Defaults: Hive's Zookeeper heartbeat interval is 20 minutes, change 
> to 2
> -
>
> Key: HIVE-20854
> URL: https://issues.apache.org/jira/browse/HIVE-20854
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-20854.1.patch, HIVE-20854.2.patch
>
>
> {code}
> HIVE_ZOOKEEPER_SESSION_TIMEOUT("hive.zookeeper.session.timeout", 
> "1200000ms",
> new TimeValidator(TimeUnit.MILLISECONDS),
> "ZooKeeper client's session timeout (in milliseconds). The client is 
> disconnected, and as a result, all locks released, \n" +
> "if a heartbeat is not sent in the timeout."),
> {code}
> That's 1,200,000 ms, which is too long for all practical purposes: a 
> 20-minute outage when a node fails is unacceptable.
> It is also too long for JDBC load balancing, LLAP failure tolerance and 
> lock manager expiry.
> Change to 2 minutes as a sensible default.
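
The arithmetic behind the two values can be checked with a few lines of Java. This is only a sketch of the unit conversion: the class name is hypothetical, and the exact value the attached patches set is not shown in this thread.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of HiveConf.
class ZkTimeoutArithmetic {

    // Converts a timeout given in milliseconds to whole minutes.
    static long toMinutes(long millis) {
        return TimeUnit.MILLISECONDS.toMinutes(millis);
    }

    public static void main(String[] args) {
        long currentDefaultMs = 1_200_000L;                  // value quoted in the description
        long proposedDefaultMs = TimeUnit.MINUTES.toMillis(2); // "change to 2 minutes"

        System.out.println(currentDefaultMs + " ms = " + toMinutes(currentDefaultMs) + " min");
        System.out.println(proposedDefaultMs + " ms = " + toMinutes(proposedDefaultMs) + " min");
    }
}
```

Under these assumptions the current default works out to 20 minutes and the proposed one to 120,000 ms.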




[jira] [Commented] (HIVE-20415) Hive1: Tez Session failed to return if background thread is interrupted

2019-06-03 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855170#comment-16855170
 ] 

Ashutosh Chauhan commented on HIVE-20415:
-

+1

> Hive1: Tez Session failed to return if background thread is interrupted
> ---
>
> Key: HIVE-20415
> URL: https://issues.apache.org/jira/browse/HIVE-20415
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-20415.1-branch-1.patch, HIVE-20415.2-branch-1.patch
>
>
> When a user cancels the query, the background thread is interrupted. Because 
> of this interrupt, the background thread fails to put the session back into 
> the pool.
> {code}
> 2018-08-14 15:55:27,581 ERROR exec.Task (TezTask.java:execute(226)) - Failed to execute tez graph.
> java.lang.InterruptedException
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
>     at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
>     at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:350)
>     at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.returnSession(TezSessionPoolManager.java:176)
> {code}
> We need a fix similar to HIVE-15731.
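
The stack trace shows ArrayBlockingQueue.put() throwing InterruptedException while the session is being returned. One common way to make such a hand-back survive an interrupt is sketched below. This is not the HIVE-15731 change or the attached patch, neither of which is shown in this thread; the class and method names are hypothetical and the pool is a plain BlockingQueue of strings standing in for the Tez session pool.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical helper for illustration only; not Hive code.
class InterruptSafeReturn {

    // Puts the item back even if the calling thread is interrupted, then
    // restores the interrupt flag so callers can still observe the cancel.
    static <T> void returnToPool(BlockingQueue<T> pool, T item) {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    pool.put(item);
                    return;
                } catch (InterruptedException e) {
                    // put() cleared the flag when it threw; remember it and retry.
                    interrupted = true;
                }
            }
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> pool = new ArrayBlockingQueue<>(1);
        Thread.currentThread().interrupt(); // simulate a cancelled query
        returnToPool(pool, "session-1");
        System.out.println("pool size: " + pool.size());
        System.out.println("interrupt flag restored: " + Thread.interrupted());
    }
}
```

The trade-off in this sketch is that the retry loop blocks for as long as the queue is full, so it only suits pools where a returned item always has a free slot.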




[jira] [Commented] (HIVE-21805) HiveServer2: Use the fast ShutdownHookManager APIs

2019-06-03 Thread Ashutosh Chauhan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-21805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16855168#comment-16855168
 ] 

Ashutosh Chauhan commented on HIVE-21805:
-

+1

> HiveServer2: Use the fast ShutdownHookManager APIs
> --
>
> Key: HIVE-21805
> URL: https://issues.apache.org/jira/browse/HIVE-21805
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
>  Labels: regression
> Attachments: HIVE-21805.1.patch, 
> shutdownhookmanager-configuration.png, txnmanager-shutdownhook-2x.png
>
>
> Hadoop ShutDownHookManager calls "new Configuration()" internally if the 
> parameters are not provided as args.
> This unzips jars & looks for the .xml files in the entire classpath.
>  !shutdownhookmanager-configuration.png! 
> +
>  !txnmanager-shutdownhook-2x.png! 
> Hive has its own impl of ShutDownHookManager, which also has history from the 
> Hadoop one (added to code instead of shims).




[jira] [Updated] (HIVE-16690) Configure Tez cartesian product edge based on LLAP cluster size

2019-06-03 Thread Ashutosh Chauhan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-16690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16690:

Status: Patch Available  (was: Open)

Rebased [~aplusplus]'s patch.

> Configure Tez cartesian product edge based on LLAP cluster size
> ---
>
> Key: HIVE-16690
> URL: https://issues.apache.org/jira/browse/HIVE-16690
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
>Priority: Major
> Attachments: HIVE-16690.1.patch, HIVE-16690.2.patch, 
> HIVE-16690.addendum.patch
>
>
> In HIVE-14731 we are using the default value for the target parallelism of 
> the fair cartesian product edge. Ideally this should be set according to 
> cluster size. In the case of LLAP it's pretty easy to get the cluster size, 
> i.e., the number of executors.



