[jira] [Commented] (HIVE-21715) Adding a new partition specified by location (which is empty) leads to Exceptions

2019-05-20 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844228#comment-16844228
 ] 

Ashutosh Chauhan commented on HIVE-21715:
-

+1

> Adding a new partition specified by location (which is empty) leads to 
> Exceptions
> -
>
> Key: HIVE-21715
> URL: https://issues.apache.org/jira/browse/HIVE-21715
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21715.01.patch, HIVE-21715.01.patch, 
> HIVE-21715.02.patch, HIVE-21715.02.patch
>
>
> {code}
> create table supply (id int, part string, quantity int) partitioned by (day int)
>   stored as orc
>   location 'hdfs:///tmp/a1'
>   TBLPROPERTIES ('transactional'='true');
> alter table supply add partition (day=20110103) location 'hdfs:///tmp/a3';
> {code}
> check exception:
> {code}
> org.apache.hadoop.hive.ql.metadata.HiveException: Wrong file format. Please 
> check the file's format.
>   at 
> org.apache.hadoop.hive.ql.exec.MoveTask.checkFileFormats(MoveTask.java:696)
>   at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:370)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:210)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> {code}
> If the format check is disabled, an exception is thrown from AcidUtils instead,
> because it does not expect the partition location to be empty while checking it.
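For illustration, a minimal sketch of the kind of guard that avoids both failure
modes: treat a partition location with no files as empty and skip the file-format
and ACID directory checks for it. The Hadoop FileSystem calls are real; the class
and method names are hypothetical and this is not the actual patch.

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class EmptyLocationCheck {
  // Hypothetical helper: true when the new partition's location has no files,
  // so callers (format check, AcidUtils-style directory parsing) can skip it.
  static boolean isEmptyLocation(FileSystem fs, Path location) throws IOException {
    if (!fs.exists(location)) {
      return true;
    }
    FileStatus[] children = fs.listStatus(location);
    return children == null || children.length == 0;
  }
}
{code}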



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21646) Tez: Prevent TezTasks from escaping thread logging context

2019-05-20 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844081#comment-16844081
 ] 

Ashutosh Chauhan commented on HIVE-21646:
-

+1

> Tez: Prevent TezTasks from escaping thread logging context
> --
>
> Key: HIVE-21646
> URL: https://issues.apache.org/jira/browse/HIVE-21646
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-21646.1.patch, HIVE-21646.1.patch
>
>
> If hive.exec.parallel is set to true to parallelize MoveTasks or StatsTasks,
> the Tez task does not benefit from a new thread and loses all the thread
> context of the current query.
> Even if multiple threads are spawned, they lock on SyncDagClient and make
> progress sequentially.
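As background, the usual way to keep a per-query logging context when handing work
to another thread is to capture Log4j 2's ThreadContext in the submitting thread and
restore it in the worker. A sketch of that pattern (not the actual Hive change),
assuming Log4j 2.7+ for ThreadContext.putAll:

{code:java}
import java.util.Map;

import org.apache.logging.log4j.ThreadContext;

public final class ContextPreservingRunnable implements Runnable {
  private final Map<String, String> callerContext;
  private final Runnable delegate;

  public ContextPreservingRunnable(Runnable delegate) {
    // Capture the caller's MDC (query id, session id, ...) at construction time.
    this.callerContext = ThreadContext.getContext();
    this.delegate = delegate;
  }

  @Override
  public void run() {
    // Restore the caller's context in the worker thread, then clean up.
    ThreadContext.putAll(callerContext);
    try {
      delegate.run();
    } finally {
      ThreadContext.clearMap();
    }
  }
}
{code}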



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21686) Brute Force eviction can lead to a random uncontrolled eviction pattern.

2019-05-15 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840622#comment-16840622
 ] 

Ashutosh Chauhan commented on HIVE-21686:
-

+1

> Brute Force eviction can lead to a random uncontrolled eviction pattern.
> 
>
> Key: HIVE-21686
> URL: https://issues.apache.org/jira/browse/HIVE-21686
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
>  Labels: pull-request-available
> Attachments: Cache_hitrate_improvement.csv, HIVE-21686.2.patch, 
> HIVE-21686.3.patch, HIVE-21686.4.patch, HIVE-21686.5.patch, HIVE-21686.patch
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> The current logic used by brute-force eviction can lead to a perpetual random
> eviction pattern.
> For instance, if the cache builds a small pocket of free memory whose total
> size is greater than the incoming allocation request, the allocator will
> randomly evict blocks that fit a particular size.
> This can happen over and over, so all evictions end up being random.
> In addition, this random eviction leads to a leak in the linked list
> maintained by the policy, since the policy no longer knows what has been
> evicted and what has not.
> The improvement from this patch is very substantial on the TPC-DS benchmark. I
> have tested it at 10TB scale with 9 LLAP nodes and a 32GB cache per node.
> The patch shows a very noticeable difference in hit rate; for the raw
> numbers see [^Cache_hitrate_improvement.csv]
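To illustrate the "leak in the linked list" point generically: whatever force-evicts
a buffer has to notify the eviction policy so it can unlink the entry from its list.
All type and method names below are hypothetical; LLAP's real interfaces differ.

{code:java}
// Hypothetical types, for illustration only.
interface EvictionPolicy {
  void notifyEvicted(CacheBuffer buffer);   // unlink the buffer from the policy's list
}

interface CacheBuffer {
  long size();
}

final class BruteForceEvictor {
  private final EvictionPolicy policy;

  BruteForceEvictor(EvictionPolicy policy) {
    this.policy = policy;
  }

  long evict(Iterable<CacheBuffer> candidates, long bytesNeeded) {
    long freed = 0;
    for (CacheBuffer b : candidates) {
      if (freed >= bytesNeeded) {
        break;
      }
      freed += b.size();
      // Without this callback the policy keeps a dangling reference to the
      // evicted buffer, which is the leak described above.
      policy.notifyEvicted(b);
    }
    return freed;
  }
}
{code}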



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21531) Vectorization: all NULL hashcodes are not computed using Murmur3

2019-04-23 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824551#comment-16824551
 ] 

Ashutosh Chauhan commented on HIVE-21531:
-

+1

> Vectorization: all NULL hashcodes are not computed using Murmur3
> 
>
> Key: HIVE-21531
> URL: https://issues.apache.org/jira/browse/HIVE-21531
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.1.1
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Critical
> Attachments: HIVE-21531.1.patch, HIVE-21531.1.patch, 
> HIVE-21531.WIP.patch
>
>
> The comments in Vectorized hash computation call out the MurmurHash 
> implementation (the one using 0x5bd1e995), while the non-vectorized codepath 
> calls out the Murmur3 one (using 0xcc9e2d51).
> The comments here are wrong
> {code}
>  /**
>* Batch compute the hash codes for all the serialized keys.
>*
>* NOTE: MAJOR MAJOR ASSUMPTION:
>* We assume that HashCodeUtil.murmurHash produces the same result
>* as MurmurHash.hash with seed = 0 (the method used by 
> ReduceSinkOperator for
>* UNIFORM distribution).
>*/
>   protected void computeSerializedHashCodes() {
> int offset = 0;
> int keyLength;
> byte[] bytes = output.getData();
> for (int i = 0; i < nonNullKeyCount; i++) {
>   keyLength = serializedKeyLengths[i];
>   hashCodes[i] = Murmur3.hash32(bytes, offset, keyLength, 0);
>   offset += keyLength;
> }
>   }
> {code}
> but the Vector RS operator follows that wrong comment 
> {code}
>   System.arraycopy(nullKeyOutput.getData(), 0, nullBytes, 0, 
> nullBytesLength);
>   nullKeyHashCode = HashCodeUtil.calculateBytesHashCode(nullBytes, 0, 
> nullBytesLength);
> {code}
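Presumably the fix is to hash the all-NULL key with the same Murmur3 call used by
computeSerializedHashCodes() above. A sketch, not the committed patch; the import
path is assumed and may differ by Hive version:

{code:java}
import org.apache.hive.common.util.Murmur3;

final class NullKeyHash {
  // Sketch: hash the serialized all-NULL key with Murmur3 (seed 0), matching the
  // non-null key path, so NULL keys hash consistently with the row-mode operator.
  static int hashNullKey(byte[] nullBytes, int nullBytesLength) {
    return Murmur3.hash32(nullBytes, 0, nullBytesLength, 0);
  }
}
{code}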



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21538) Beeline: password source though the console reader did not pass to connection param

2019-04-17 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21538:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Raj!

> Beeline: password source though the console reader did not pass to connection 
> param
> ---
>
> Key: HIVE-21538
> URL: https://issues.apache.org/jira/browse/HIVE-21538
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
> Environment: Hive-3.1 auth set to LDAP
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21538.01.patch, HIVE-21538.02.patch, 
> HIVE-21538.patch
>
>
> Beeline: a password sourced through the console reader is not passed to the
> connection parameters; this results in an authentication failure when LDAP
> authentication is used.
> {code}
> beeline -n USER -u 
> "jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
>  -p
> Connecting to 
> jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;user=USER
> Enter password for jdbc:hive2://host:2181/: 
> 19/03/26 19:49:44 [main]: WARN jdbc.HiveConnection: Failed to connect to 
> host:1
> 19/03/26 19:49:44 [main]: ERROR jdbc.Utils: Unable to read HiveServer2 
> configs from ZooKeeper
> Unknown HS2 problem when communicating with Thrift server.
> Error: Could not open client transport for any of the Server URI's in 
> ZooKeeper: Peer indicated failure: PLAIN auth failed: 
> javax.security.sasl.AuthenticationException: Error validating LDAP user 
> [Caused by javax.naming.AuthenticationException: [LDAP: error code 49 - 
> 80090308: LdapErr: DSID-0C0903C8, comment: AcceptSecurityContext error, data 
> 52e, v2580]] (state=08S01,code=0)
> {code}
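For reference, the expected behaviour is simply that the interactively read password
ends up in the JDBC connection properties. A minimal sketch with plain JDBC, not
Beeline's actual code path; the class and parameter names are illustrative:

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

public final class HiveJdbcLogin {
  // Sketch: whatever the console reader returns must be forwarded to the driver,
  // otherwise HiveServer2's LDAP/PLAIN authentication sees an empty password.
  static Connection connect(String url, String user, char[] promptedPassword)
      throws SQLException {
    Properties props = new Properties();
    props.setProperty("user", user);
    props.setProperty("password", new String(promptedPassword));
    return DriverManager.getConnection(url, props);
  }
}
{code}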



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21538) Beeline: password source though the console reader did not pass to connection param

2019-04-17 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820379#comment-16820379
 ] 

Ashutosh Chauhan commented on HIVE-21538:
-

+1

> Beeline: password source though the console reader did not pass to connection 
> param
> ---
>
> Key: HIVE-21538
> URL: https://issues.apache.org/jira/browse/HIVE-21538
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.0
> Environment: Hive-3.1 auth set to LDAP
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21538.01.patch, HIVE-21538.02.patch, 
> HIVE-21538.patch
>
>
> Beeline: a password sourced through the console reader is not passed to the
> connection parameters; this results in an authentication failure when LDAP
> authentication is used.
> {code}
> beeline -n USER -u 
> "jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
>  -p
> Connecting to 
> jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;user=USER
> Enter password for jdbc:hive2://host:2181/: 
> 19/03/26 19:49:44 [main]: WARN jdbc.HiveConnection: Failed to connect to 
> host:1
> 19/03/26 19:49:44 [main]: ERROR jdbc.Utils: Unable to read HiveServer2 
> configs from ZooKeeper
> Unknown HS2 problem when communicating with Thrift server.
> Error: Could not open client transport for any of the Server URI's in 
> ZooKeeper: Peer indicated failure: PLAIN auth failed: 
> javax.security.sasl.AuthenticationException: Error validating LDAP user 
> [Caused by javax.naming.AuthenticationException: [LDAP: error code 49 - 
> 80090308: LdapErr: DSID-0C0903C8, comment: AcceptSecurityContext error, data 
> 52e, v2580]] (state=08S01,code=0)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21372) Use Apache Commons IO To Read Stream To String

2019-04-07 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21372:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, David!

> Use Apache Commons IO To Read Stream To String
> --
>
> Key: HIVE-21372
> URL: https://issues.apache.org/jira/browse/HIVE-21372
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Trivial
> Fix For: 4.0.0
>
> Attachments: HIVE-21372.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21377) Using Oracle as HMS DB with DirectSQL

2019-04-02 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21377:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Rajkumar!

> Using Oracle as HMS DB with DirectSQL
> -
>
> Key: HIVE-21377
> URL: https://issues.apache.org/jira/browse/HIVE-21377
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Bo 
>Assignee: Rajkumar Singh
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21377.01.patch, HIVE-21377.patch, 
> blob-cast-error.jpg
>
>
> When we use Oracle as the HMS DB, we see entries like this in the HMS log:
> {code:java}
> 2019-02-02 T08:23:57,102 WARN [Thread-12]: metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(3741)) - Falling back to ORM path due 
> to direct SQL failure (this is not an error): Cannot extract boolean from 
> column value 0 at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlBoolean(MetaStoreDirectSql.java:1031)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:728)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$300(MetaStoreDirectSql.java:109)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql$1.run(MetaStoreDirectSql.java:471)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:462)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:3392)
> {code}
> In Hive, extractSqlBoolean handles Postgres, MySQL and Derby.
> But Oracle returns 0 or 1 for booleans, so we need to modify
> MetastoreDirectSqlUtils.java [1].
> Could we add a snippet like this to that code?
> {code:java}
>   static Boolean extractSqlBoolean(Object value) throws MetaException {
>     if (value == null) {
>       return null;
>     }
>     if (value instanceof Boolean) {
>       return (Boolean) value;
>     }
>     if (value instanceof Number) { // add
>       try {
>         return BooleanUtils.toBooleanObject(((Number) value).intValue(), 1, 0, null);
>       } catch (IllegalArgumentException iae) {
>         // NOOP
>       }
>     }
>     if (value instanceof String) {
>       try {
>         return BooleanUtils.toBooleanObject((String) value, "Y", "N", null);
>       } catch (IllegalArgumentException iae) {
>         // NOOP
>       }
>     }
>     throw new MetaException("Cannot extract boolean from column value " + value);
>   }
> {code}
>  [1] -
> https://github.com/apache/hive/blob/f51f108b761f0c88647f48f30447dae12b308f31/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDirectSqlUtils.java#L501-L527
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21377) Using Oracle as HMS DB with DirectSQL

2019-04-02 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807959#comment-16807959
 ] 

Ashutosh Chauhan commented on HIVE-21377:
-

+1

> Using Oracle as HMS DB with DirectSQL
> -
>
> Key: HIVE-21377
> URL: https://issues.apache.org/jira/browse/HIVE-21377
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Bo 
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21377.01.patch, HIVE-21377.patch, 
> blob-cast-error.jpg
>
>
> When we use Oracle as the HMS DB, we see entries like this in the HMS log:
> {code:java}
> 2019-02-02 T08:23:57,102 WARN [Thread-12]: metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(3741)) - Falling back to ORM path due 
> to direct SQL failure (this is not an error): Cannot extract boolean from 
> column value 0 at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlBoolean(MetaStoreDirectSql.java:1031)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:728)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$300(MetaStoreDirectSql.java:109)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql$1.run(MetaStoreDirectSql.java:471)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:462)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:3392)
> {code}
> In Hive, extractSqlBoolean handles Postgres, MySQL and Derby.
> But Oracle returns 0 or 1 for booleans, so we need to modify
> MetastoreDirectSqlUtils.java [1].
> Could we add a snippet like this to that code?
> {code:java}
>   static Boolean extractSqlBoolean(Object value) throws MetaException {
>     if (value == null) {
>       return null;
>     }
>     if (value instanceof Boolean) {
>       return (Boolean) value;
>     }
>     if (value instanceof Number) { // add
>       try {
>         return BooleanUtils.toBooleanObject(((Number) value).intValue(), 1, 0, null);
>       } catch (IllegalArgumentException iae) {
>         // NOOP
>       }
>     }
>     if (value instanceof String) {
>       try {
>         return BooleanUtils.toBooleanObject((String) value, "Y", "N", null);
>       } catch (IllegalArgumentException iae) {
>         // NOOP
>       }
>     }
>     throw new MetaException("Cannot extract boolean from column value " + value);
>   }
> {code}
>  [1] -
> https://github.com/apache/hive/blob/f51f108b761f0c88647f48f30447dae12b308f31/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDirectSqlUtils.java#L501-L527
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21499) should not remove the function from registry if create command failed with AlreadyExistsException

2019-04-01 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21499:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Rajkumar!

> should not remove the function from registry if create command failed with 
> AlreadyExistsException
> -
>
> Key: HIVE-21499
> URL: https://issues.apache.org/jira/browse/HIVE-21499
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
> Environment: Hive-3.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21499.01.patch, HIVE-21499.02.patch, 
> HIVE-21499.patch
>
>
> As part of HIVE-20953 we remove the function from the registry whenever its
> creation fails for any reason, which leads to the following situation:
> 1. create function fails because the function already exists
> 2. on that failure Hive clears the pre-existing permanent function from the registry
> 3. the function is unusable until HiveServer2 is restarted.
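A minimal sketch of the intended behaviour, using a hypothetical registry API rather
than Hive's actual classes: only roll the function back out of the registry when it
was actually added by this command.

{code:java}
// Hypothetical names for illustration; the real fix lives in Hive's function
// registration code path.
final class CreateFunctionHelper {
  interface Registry {
    boolean contains(String name);
    void register(String name, String className);
    void unregister(String name);
  }

  static void createFunction(Registry registry, String name, String className) {
    boolean addedHere = false;
    try {
      if (registry.contains(name)) {
        throw new IllegalStateException("Function " + name + " already exists");
      }
      registry.register(name, className);
      addedHere = true;
      // ... persist to the metastore here; this step may still fail ...
    } catch (RuntimeException e) {
      // Only undo what this command did; an "already exists" failure must not
      // wipe the pre-existing, perfectly valid registration.
      if (addedHere) {
        registry.unregister(name);
      }
      throw e;
    }
  }
}
{code}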



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21557) Query based compaction fails with NullPointerException: Non-local session path expected to be non-null

2019-04-01 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806917#comment-16806917
 ] 

Ashutosh Chauhan commented on HIVE-21557:
-

+1

> Query based compaction fails with NullPointerException: Non-local session 
> path expected to be non-null
> --
>
> Key: HIVE-21557
> URL: https://issues.apache.org/jira/browse/HIVE-21557
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-21557.02.patch, HIVE-21557.patch
>
>
> {code:java}
> 2019-03-29T13:04:19.282Z hiveserver2-65d5bb4bd8-xx24r hiveserver2 1 
> db896a5e-5215-11e9-87ec-020c4712c37c [mdc@18060 class="compactor.CompactorMR" 
> level="ERROR" thread="hiveserver2-65d5bb4bd8-xx24r-28"] 
> org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run drop table if 
> exists default_tmp_compactor_asd_1553864659196
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:57)
> at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:34)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runCrudCompaction(CompactorMR.java:408)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:250)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:194)
> Caused by: java.lang.NullPointerException: Non-local session path expected to 
> be non-null
> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:838)
> at org.apache.hadoop.hive.ql.Context.(Context.java:319)
> at org.apache.hadoop.hive.ql.Context.(Context.java:305)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:603)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1881)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2004)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1764)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1753)
> at 
> org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:54){code}
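The NPE comes from building a query Context without a started SessionState, hence no
non-local HDFS session path. A sketch of the kind of guard the compaction worker needs
before running its statements through the Driver; where exactly it would go in
CompactorMR is an assumption, and the class name is illustrative.

{code:java}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.session.SessionState;

public final class CompactionSession {
  // Sketch: make sure the worker thread has a SessionState (and therefore a
  // non-local HDFS session path) before the compaction statements hit the Driver.
  static SessionState ensureSession(HiveConf conf, String user) {
    SessionState ss = SessionState.get();
    if (ss == null) {
      ss = new SessionState(conf, user);
      SessionState.start(ss);
    }
    return ss;
  }
}
{code}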



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21516) Fix spark downloading for q tests

2019-03-29 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21516:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Miklos!

> Fix spark downloading for q tests
> -
>
> Key: HIVE-21516
> URL: https://issues.apache.org/jira/browse/HIVE-21516
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21516.01.patch, HIVE-21516.02.patch, 
> HIVE-21516.03.patch, HIVE-21516.04.patch, HIVE-21516.05.patch, 
> HIVE-21516.06.patch
>
>
> Currently itests/pom.xml declares a command to generate the download script
> for Spark, so it is re-generated every time any Maven command is executed for
> any sub-project of itests. As a side effect it leaves download.sh files
> everywhere. The download.sh file is almost entirely static and does not need to
> be recreated every time; it only requires $spark.version as a parameter.
> It also only works properly under Linux, as it relies on the md5sum program,
> which is not present on OS X. This means that if the Spark tarball is
> partially downloaded on OS X it will never be re-downloaded. This should be
> fixed by falling back to md5 on OS X.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21517) Fix AggregateStatsCache

2019-03-29 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21517:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Miklos!

> Fix AggregateStatsCache
> ---
>
> Key: HIVE-21517
> URL: https://issues.apache.org/jira/browse/HIVE-21517
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21517.01.patch
>
>
> Due to a bug, AggregateStatsCache does not return the best matching result.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21517) Fix AggregateStatsCache

2019-03-26 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802467#comment-16802467
 ] 

Ashutosh Chauhan commented on HIVE-21517:
-

+1

> Fix AggregateStatsCache
> ---
>
> Key: HIVE-21517
> URL: https://issues.apache.org/jira/browse/HIVE-21517
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21517.01.patch
>
>
> Due to a bug, AggregateStatsCache does not return the best matching result.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21034) Add option to schematool to drop Hive databases

2019-03-25 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800426#comment-16800426
 ] 

Ashutosh Chauhan commented on HIVE-21034:
-

[~dvoros] would you like to reattach the patch to get a clean run?

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch, HIVE-21034.3.patch, HIVE-21034.4.patch, 
> HIVE-21034.5.patch, HIVE-21034.5.patch, HIVE-21034.5.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21034) Add option to schematool to drop Hive databases

2019-03-21 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798194#comment-16798194
 ] 

Ashutosh Chauhan commented on HIVE-21034:
-

+1

> Add option to schematool to drop Hive databases
> ---
>
> Key: HIVE-21034
> URL: https://issues.apache.org/jira/browse/HIVE-21034
> Project: Hive
>  Issue Type: Improvement
>Reporter: Daniel Voros
>Assignee: Daniel Voros
>Priority: Major
> Attachments: HIVE-21034.1.patch, HIVE-21034.2.patch, 
> HIVE-21034.2.patch, HIVE-21034.3.patch, HIVE-21034.4.patch, 
> HIVE-21034.5.patch, HIVE-21034.5.patch, HIVE-21034.5.patch
>
>
> An option to remove all Hive managed data could be a useful addition to 
> {{schematool}}.
> I propose to introduce a new flag {{-dropAllDatabases}} that would *drop all 
> databases with CASCADE* to remove all data of managed tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20580) OrcInputFormat.isOriginal() should not rely on hive.acid.key.index

2019-03-21 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798181#comment-16798181
 ] 

Ashutosh Chauhan commented on HIVE-20580:
-

[~pvary] I will get rid of isOriginal(Footer). I don't see it being part of a 
public interface and I would rather not leave a public method which is unused 
in code.
LGTM otherwise.

> OrcInputFormat.isOriginal() should not rely on hive.acid.key.index
> --
>
> Key: HIVE-20580
> URL: https://issues.apache.org/jira/browse/HIVE-20580
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.1.0
>Reporter: Eugene Koifman
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20580.2.patch, HIVE-20580.3.patch, 
> HIVE-20580.4.patch, HIVE-20580.5.patch, HIVE-20580.6.patch, HIVE-20580.patch
>
>
> {{org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.isOriginal()}} checks for the
> presence of {{hive.acid.key.index}} in the footer.  This is only created
> when the file is written by {{OrcRecordUpdater}}.  It should instead check
> for the presence of the ACID metadata columns, so that a file can be produced by 
> something other than {{OrcRecordUpdater}}.
> Also, {{hive.acid.key.index}} counts the number of each type of event, which 
> is not really useful for ACID V2 (as of Hive 3) since each file only has one 
> type of event.
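A sketch of checking for the ACID metadata columns in the ORC schema instead of the
hive.acid.key.index footer key. The column names are the well-known ACID struct
fields; the ORC Reader/TypeDescription API is assumed and this is not the committed
change.

{code:java}
import java.util.List;

import org.apache.orc.Reader;
import org.apache.orc.TypeDescription;

public final class AcidSchemaCheck {
  // Sketch: a file is an "original" (non-ACID-layout) file if its top-level
  // schema does not start with the ACID metadata columns.
  static boolean isOriginal(Reader reader) {
    TypeDescription schema = reader.getSchema();
    List<String> fields = schema.getFieldNames();
    return !(fields.size() >= 6
        && fields.get(0).equals("operation")
        && fields.get(1).equals("originalTransaction")
        && fields.get(2).equals("bucket")
        && fields.get(3).equals("rowId")
        && fields.get(4).equals("currentTransaction")
        && fields.get(5).equals("row"));
  }
}
{code}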



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16924) Support distinct in presence of Group By

2019-03-19 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16796214#comment-16796214
 ] 

Ashutosh Chauhan commented on HIVE-16924:
-

Please remove ql/src/test/queries/clientnegative/udaf_invalid_place.q too; it's 
not needed anymore.

> Support distinct in presence of Group By 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch, 
> HIVE-16924.06.patch, HIVE-16924.07.patch, HIVE-16924.08.patch, 
> HIVE-16924.09.patch, HIVE-16924.10.patch, HIVE-16924.11.patch, 
> HIVE-16924.12.patch, HIVE-16924.13.patch, HIVE-16924.14.patch, 
> HIVE-16924.15.patch, HIVE-16924.16.patch, HIVE-16924.17.patch, 
> HIVE-16924.18.patch, HIVE-16924.19.patch, HIVE-16924.20.patch, 
> HIVE-16924.21.patch, HIVE-16924.22.patch, HIVE-16924.23.patch, 
> HIVE-16924.24.patch, HIVE-16924.25.patch, HIVE-16924.26.patch, 
> HIVE-16924.27.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21372) Use Apache Commons IO To Read Stream To String

2019-03-18 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16795040#comment-16795040
 ] 

Ashutosh Chauhan commented on HIVE-21372:
-

+1

> Use Apache Commons IO To Read Stream To String
> --
>
> Key: HIVE-21372
> URL: https://issues.apache.org/jira/browse/HIVE-21372
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Trivial
> Fix For: 4.0.0
>
> Attachments: HIVE-21372.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16924) Support distinct in presence of Group By

2019-03-16 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-16924:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Miklos!

> Support distinct in presence of Group By 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch, 
> HIVE-16924.06.patch, HIVE-16924.07.patch, HIVE-16924.08.patch, 
> HIVE-16924.09.patch, HIVE-16924.10.patch, HIVE-16924.11.patch, 
> HIVE-16924.12.patch, HIVE-16924.13.patch, HIVE-16924.14.patch, 
> HIVE-16924.15.patch, HIVE-16924.16.patch, HIVE-16924.17.patch, 
> HIVE-16924.18.patch, HIVE-16924.19.patch, HIVE-16924.20.patch, 
> HIVE-16924.21.patch, HIVE-16924.22.patch, HIVE-16924.23.patch, 
> HIVE-16924.24.patch, HIVE-16924.25.patch, HIVE-16924.26.patch, 
> HIVE-16924.27.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21444) Additional tests for materialized view rewriting

2019-03-15 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794099#comment-16794099
 ] 

Ashutosh Chauhan commented on HIVE-21444:
-

+1

> Additional tests for materialized view rewriting
> 
>
> Key: HIVE-21444
> URL: https://issues.apache.org/jira/browse/HIVE-21444
> Project: Hive
>  Issue Type: Test
>  Components: CBO, Materialized views
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21444.patch, HIVE-21444.patch, HIVE-21444.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21385) Allow disabling pushdown of non-splittable computation to JDBC sources

2019-03-15 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794026#comment-16794026
 ] 

Ashutosh Chauhan commented on HIVE-21385:
-

+1

> Allow disabling pushdown of non-splittable computation to JDBC sources
> --
>
> Key: HIVE-21385
> URL: https://issues.apache.org/jira/browse/HIVE-21385
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21385.01.patch, HIVE-21385.01.patch, 
> HIVE-21385.02.patch, HIVE-21385.02.patch, HIVE-21385.patch
>
>
> Until pushdown is a cost-based decision, this lets us enable / disable 
> pushdown of operators that prevent reading results from the JDBC connection 
> in parallel.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21445) Support range check for DECIMAL type in stats annotation

2019-03-14 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793314#comment-16793314
 ] 

Ashutosh Chauhan commented on HIVE-21445:
-

+1

> Support range check for DECIMAL type in stats annotation
> 
>
> Key: HIVE-21445
> URL: https://issues.apache.org/jira/browse/HIVE-21445
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Statistics
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21445.01.patch, HIVE-21445.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21430) INSERT into a dynamically partitioned table with hive.stats.autogather = false throws a MetaException

2019-03-14 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793075#comment-16793075
 ] 

Ashutosh Chauhan commented on HIVE-21430:
-

If you already have a fix for this specific issue, please go ahead with it. 
There is no jira for removal of the config currently.

> INSERT into a dynamically partitioned table with hive.stats.autogather = 
> false throws a MetaException
> -
>
> Key: HIVE-21430
> URL: https://issues.apache.org/jira/browse/HIVE-21430
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
> Attachments: metaexception_repro.patch, 
> org.apache.hadoop.hive.ql.stats.TestStatsUpdaterThread-output.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When the test TestStatsUpdaterThread#testTxnDynamicPartitions added in the 
> attached patch is run it throws exception (full logs attached.)
> org.apache.hadoop.hive.metastore.api.MetaException: Cannot change stats state 
> for a transactional table default.simple_stats without providing the 
> transactional write state for verification (new write ID 5, valid write IDs 
> null; current state \{"BASIC_STATS":"true","COLUMN_STATS":{"s":"true"}}; new 
> state null
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore.alterPartitionNoTxn(ObjectStore.java:4328)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21369) LLAP: Logging is expensive in encoded reader path

2019-03-14 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16793038#comment-16793038
 ] 

Ashutosh Chauhan commented on HIVE-21369:
-

+1

> LLAP: Logging is expensive in encoded reader path
> -
>
> Key: HIVE-21369
> URL: https://issues.apache.org/jira/browse/HIVE-21369
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Nita Dembla
>Priority: Major
> Attachments: HIVE-21369.patch
>
>
> There should be no INFO logging in EncodedReaderImpl. Stringifying of disk 
> ranges is expensive in core read path.
> {code:java}
> 2019-03-01T17:55:56.322852142Z 2019-03-01T17:55:56,306 INFO  
> [IO-Elevator-Thread-3 
> (hive_20190301175546_a279f33c-4f2b-4cd5-8695-57bc8b042a61)] 
> encoded.EncodedReaderImpl: Disk ranges after cache (found everything true; 
> file [-3693547618692831801, 1551190876000, 1047660824], base offset 
> 792920167): [{start: 887940 end: 1003508 cache buffer: 0x5165f83d(1)}, 
> {start: 1003508 end: 1119078 cache buffer: 0xb63cac3(1)}, {start: 1119078 
> end: 1234745 cache buffer: 0x41a724fa(1)}, {start: 1234745 end: 1350261 cache 
> buffer: 0x2f71bc38(1)}, {start: 1350261 end: 1465752 cache buffer: 
> 0x2c38e1bb(1)}, {start: 1465752 end: 1581231 cache buffer: 0x5827982(1)}, 
> {start: 1581231 end: 1696885 cache buffer: 0x75a6773c(1)}, {start: 1696885 
> end: 1812492 cache buffer: 0x2ed060f9(1)},{start: 1812492 end: 1928086 cache 
> buffer: 0x20b2c8aa(1)}, {start: 1928086 end: 2043588 cache buffer: 
> 0x6559aacb(1)}, {start: 2043588 end: 2159089 cache buffer: 0x569c85e1(1)}, 
> {start: 2159089 end: 2274725 cache buffer: 0x25a88dd0(1)}, {start: 2274725 
> end: 2390228 cache buffer: 0x738b7e87(1)}, {start: 2390228 end: 2505715 cache 
> buffer: 0x26edafa0(1)}, {start: 2505715 end: 2621322 cache buffer: 
> 0x69db7752(1)}, {start: 2621322 end: 2736844 cache b{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-17061) Add Support for Column List in Insert Clause

2019-03-12 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-17061.
-
   Resolution: Fixed
Fix Version/s: 4.0.0

Resolved via HIVE-20590

> Add Support for Column List in Insert Clause
> 
>
> Key: HIVE-17061
> URL: https://issues.apache.org/jira/browse/HIVE-17061
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Shawn Weeks
>Priority: Minor
> Fix For: 4.0.0
>
>
> Include support for a list of columns in the insert clause of the merge 
> statement. This helps when you may not know or care about the order of 
> columns in the target table or if you don't want to have to insert values 
> into all of the columns.
> {code}
> MERGE INTO target 
> USING source ON b = y
> WHEN MATCHED AND c + 1 + z > 0
> THEN UPDATE SET a = 1, c = z
> WHEN NOT MATCHED AND z IS NULL
> THEN INSERT(a,b) VALUES(z, 7)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21424) Disable AggregateStatsCache by default

2019-03-12 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791010#comment-16791010
 ] 

Ashutosh Chauhan commented on HIVE-21424:
-

+1 pending tests.

> Disable AggregateStatsCache by default
> --
>
> Key: HIVE-21424
> URL: https://issues.apache.org/jira/browse/HIVE-21424
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21424.01.patch, HIVE-21424.02.patch, 
> HIVE-21424.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21430) INSERT into a dynamically partitioned table with hive.stats.autogather = false throws a MetaException

2019-03-12 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791005#comment-16791005
 ] 

Ashutosh Chauhan commented on HIVE-21430:
-

IMHO, we should remove the config {{hive.stats.autogather}}, since a) the overhead of 
collecting stats is negligible, so it is always on, and b) the StatsTask always needs 
to be present since we need to invalidate stats when we are not collecting them. So it 
is better to remove this config and simplify the codebase (and avoid issues like this 
one). 

> INSERT into a dynamically partitioned table with hive.stats.autogather = 
> false throws a MetaException
> -
>
> Key: HIVE-21430
> URL: https://issues.apache.org/jira/browse/HIVE-21430
> Project: Hive
>  Issue Type: Bug
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
> Attachments: metaexception_repro.patch, 
> org.apache.hadoop.hive.ql.stats.TestStatsUpdaterThread-output.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When the test TestStatsUpdaterThread#testTxnDynamicPartitions added in the 
> attached patch is run it throws exception (full logs attached.)
> org.apache.hadoop.hive.metastore.api.MetaException: Cannot change stats state 
> for a transactional table default.simple_stats without providing the 
> transactional write state for verification (new write ID 5, valid write IDs 
> null; current state \{"BASIC_STATS":"true","COLUMN_STATS":{"s":"true"}}; new 
> state null
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore.alterPartitionNoTxn(ObjectStore.java:4328)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21316) Comparision of varchar column and string literal should happen in varchar

2019-03-11 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789775#comment-16789775
 ] 

Ashutosh Chauhan commented on HIVE-21316:
-

Can you please create RB (or GH pull-request) for this?

> Comparision of varchar column and string literal should happen in varchar
> -
>
> Key: HIVE-21316
> URL: https://issues.apache.org/jira/browse/HIVE-21316
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21316.01.patch, HIVE-21316.02.patch, 
> HIVE-21316.03.patch
>
>
> this is most probably the root cause behind HIVE-21310 as well



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21412) PostExecOrcFileDump doesn't work with ACID tables

2019-03-11 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21412:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Denys!

> PostExecOrcFileDump doesn't work with ACID tables
> -
>
> Key: HIVE-21412
> URL: https://issues.apache.org/jira/browse/HIVE-21412
> Project: Hive
>  Issue Type: Bug
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21412.1.patch, HIVE-21412.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails

2019-03-11 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789614#comment-16789614
 ] 

Ashutosh Chauhan commented on HIVE-21402:
-

I am unsure how to deal with unchecked exceptions. IMHO, it is not useful to 
catch Throwable, since in the case of an unchecked exception it is very likely that 
compaction will fail in the next iteration too, and the error will probably be 
encountered every time (e.g., the missing jar in this case). In such cases, 
it is better to let the Throwable escape (or raise an InterruptedException) so that it 
is dealt with in the caller, which should then fail the process. For the end user it is 
not useful that HS2 keeps running while every compaction fails.
On the other hand, there is already a catch(Throwable) in the outer loop: 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java#L238
 

> Compaction state remains 'working' when major compaction fails
> --
>
> Key: HIVE-21402
> URL: https://issues.apache.org/jira/browse/HIVE-21402
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-21402.patch
>
>
> When calcite is not on the HMS classpath, and query based compaction is 
> enabled then the compaction fails with NoClassDefFound error. Since the catch 
> block only catches Exceptions the following code block is not executed:
> {code:java}
> } catch (Exception e) {
>   LOG.error("Caught exception while trying to compact " + ci +
>   ".  Marking failed to avoid repeated failures, " + 
> StringUtils.stringifyException(e));
>   msc.markFailed(CompactionInfo.compactionInfoToStruct(ci));
>   msc.abortTxns(Collections.singletonList(compactorTxnId));
> }
> {code}
> So the compaction is not set to failed.
> Would be better to catch Throwable instead of Exception
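Sketching the suggestion from the description: widen the catch so unchecked errors also
mark the compaction failed before propagating. This is a sketch only; msc, ci,
compactorTxnId and LOG are the same variables as in the block quoted above, and the
surrounding method is omitted.

{code:java}
try {
  // ... run the (query-based) compaction ...
} catch (Throwable t) {
  LOG.error("Caught " + t + " while trying to compact " + ci
      + ".  Marking failed to avoid repeated failures, "
      + StringUtils.stringifyException(t));
  msc.markFailed(CompactionInfo.compactionInfoToStruct(ci));
  msc.abortTxns(Collections.singletonList(compactorTxnId));
  if (t instanceof Error) {
    throw (Error) t; // let truly fatal errors still propagate to the caller
  }
}
{code}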



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21412) PostExecOrcFileDump doesn't work with ACID tables

2019-03-11 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16789580#comment-16789580
 ] 

Ashutosh Chauhan commented on HIVE-21412:
-

+1

> PostExecOrcFileDump doesn't work with ACID tables
> -
>
> Key: HIVE-21412
> URL: https://issues.apache.org/jira/browse/HIVE-21412
> Project: Hive
>  Issue Type: Bug
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-21412.1.patch, HIVE-21412.2.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20585) Fix column stat flucutuation in list_bucket_dml_4.q

2019-03-09 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788761#comment-16788761
 ] 

Ashutosh Chauhan commented on HIVE-20585:
-

Given that we want to turn CachedStore on by default in the near future, we should 
turn off AggrStatsCache by default. (That will also get rid of this 
fluctuation.)

> Fix column stat flucutuation in list_bucket_dml_4.q
> ---
>
> Key: HIVE-20585
> URL: https://issues.apache.org/jira/browse/HIVE-20585
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-20585.01.patch, HIVE-20585.02.patch
>
>
> If column stats are fetched (HIVE-17084) then list_bucket_dml_4.q's output 
> fluctuates: sometimes it has column stats COMPLETE, sometimes just partial.
> Running it locally produces COMPLETE.
> Running it together with join33.q causes list_bucket_dml_4.q to degrade to 
> PARTIAL.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21390) BI split strategy does not work for blob stores

2019-03-09 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21390:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Prasanth!

> BI split strategy does not work for blob stores
> ---
>
> Key: HIVE-21390
> URL: https://issues.apache.org/jira/browse/HIVE-21390
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21390.1.patch, HIVE-21390.2.patch, 
> HIVE-21390.3.patch, HIVE-21390.4.patch
>
>
> The BI split strategy cuts splits at block boundaries; however, there are no 
> block boundaries in blob storage, so we end up with a single split under the 
> BI split strategy. 
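A generic sketch of the alternative: when the filesystem reports no real block
boundaries, cut synthetic fixed-size splits instead of returning the whole file as one
split. The names and the target-size parameter are assumptions, not the actual patch.

{code:java}
import java.util.ArrayList;
import java.util.List;

public final class SyntheticSplits {
  // Sketch: blob stores report one "block" covering the whole file, so derive
  // split boundaries from a configured target size instead.
  static List<long[]> splitOffsets(long fileLength, long targetSplitSize) {
    List<long[]> splits = new ArrayList<>();
    for (long start = 0; start < fileLength; start += targetSplitSize) {
      long length = Math.min(targetSplitSize, fileLength - start);
      splits.add(new long[] {start, length});   // {offset, length} pairs
    }
    return splits;
  }
}
{code}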



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21339) LLAP: Cache hit also initializes an FS object

2019-03-09 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21339:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Prasanth!

> LLAP: Cache hit also initializes an FS object 
> --
>
> Key: HIVE-21339
> URL: https://issues.apache.org/jira/browse/HIVE-21339
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21339.1.patch, HIVE-21339.2.patch, 
> HIVE-21339.3.patch, HIVE-21339.4.patch, HIVE-21339.5.patch, 
> llap-cache-fs-get.png, llap-query7-cached.svg
>
>
> https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L214
> {code}
> // 1. Get file metadata from cache, or create the reader and read it.
> // Don't cache the filesystem object for now; Tez closes it and FS cache will fix all that
> fs = split.getPath().getFileSystem(jobConf);
> fileKey = determineFileId(fs, split,
>     HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_CACHE_ALLOW_SYNTHETIC_FILEID),
>     HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_CACHE_DEFAULT_FS_FILE_ID),
>     !HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_IO_USE_FILEID_PATH)
> );
> {code}
>  !llap-cache-fs-get.png! 
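The gist of the fix is to make the FileSystem lookup lazy so that a cache hit never
pays for it. A sketch using a memoizing supplier, not the actual OrcEncodedDataReader
change; the class and method names are illustrative.

{code:java}
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class LazyFs {
  // Sketch: defer Path.getFileSystem() until something actually needs to read
  // from storage; metadata served from the LLAP cache never triggers it.
  static Supplier<FileSystem> lazyFileSystem(Path path, Configuration conf) {
    return new Supplier<FileSystem>() {
      private FileSystem fs;

      @Override
      public synchronized FileSystem get() {
        if (fs == null) {
          try {
            fs = path.getFileSystem(conf);
          } catch (IOException e) {
            throw new UncheckedIOException(e);
          }
        }
        return fs;
      }
    };
  }
}
{code}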



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21416) Log git apply tries with p0, p1, and p2

2019-03-08 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788472#comment-16788472
 ] 

Ashutosh Chauhan commented on HIVE-21416:
-

+1

> Log git apply tries with p0, p1, and p2
> ---
>
> Key: HIVE-21416
> URL: https://issues.apache.org/jira/browse/HIVE-21416
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21416.01.patch
>
>
> Currently, when the PreCommit-HIVE-Build Jenkins job tries to apply the 
> patch, it first tries with -p0, then, if that was not successful, with -p1, and 
> finally, if that still was not successful, with -p2. The three tries are not 
> separated by anything, so the error messages of the potential failures are mixed 
> together. There should be a log message before each try.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails

2019-03-08 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788227#comment-16788227
 ] 

Ashutosh Chauhan commented on HIVE-21402:
-

Actually, looking more deeply, compaction has now moved to ql, so compactions 
are run in HS2. HS2 should have Calcite on its classpath, so this is a deployment 
issue. cc: [~vgumashta]

> Compaction state remains 'working' when major compaction fails
> --
>
> Key: HIVE-21402
> URL: https://issues.apache.org/jira/browse/HIVE-21402
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-21402.patch
>
>
> When calcite is not on the HMS classpath, and query based compaction is 
> enabled then the compaction fails with NoClassDefFound error. Since the catch 
> block only catches Exceptions the following code block is not executed:
> {code:java}
> } catch (Exception e) {
>   LOG.error("Caught exception while trying to compact " + ci +
>   ".  Marking failed to avoid repeated failures, " + 
> StringUtils.stringifyException(e));
>   msc.markFailed(CompactionInfo.compactionInfoToStruct(ci));
>   msc.abortTxns(Collections.singletonList(compactorTxnId));
> }
> {code}
> So the compaction is not set to failed.
> Would be better to catch Throwable instead of Exception



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21280) Null pointer exception on running compaction against a MM table.

2019-03-08 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788019#comment-16788019
 ] 

Ashutosh Chauhan commented on HIVE-21280:
-

+1
Both the query id and the session path are needed after the switch to query-based 
compaction; neither was needed earlier.

> Null pointer exception on running compaction against a MM table.
> 
>
> Key: HIVE-21280
> URL: https://issues.apache.org/jira/browse/HIVE-21280
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Aditya Shah
>Assignee: Aditya Shah
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21280.patch
>
>
> On running compaction on an MM table, I got a null pointer exception while 
> getting the HDFS session path. The error suggested that the session state was 
> not started for these queries. Even after making it start, it then fails while 
> running a TezTask for the insert overwrite on the temp table with the contents 
> of the original table. The cause is that the Tez session state cannot 
> initialize: an IllegalArgumentException is thrown while setting up the caller 
> context in the Tez task, because the caller id, which uses the query id, is an 
> empty string. 
> I do think the session state needs to be started, and each of the queries run 
> for compaction (I am also doubtful about the stats updater thread's queries) 
> should have a query id. Some details are as follows:
> Steps to reproduce:
> 1) Using beeline with HS2 and HMS
> 2) create an MM table
> 3) Insert a few values in the table
> 4) alter table mm_table compact 'major'; 
> Stack trace on HMS:
> {code:java}
> compactor.Worker: Caught exception while trying to compact 
> id:8,dbname:default,tableName:acid_mm_orc,partName:null,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestWriteId:0.
>  Marking failed to avoid repeated failures, java.io.IOException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run create 
> temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` int, `b` 
> string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH 
> SERDEPROPERTIES (
> 'serialization.format'='1')STORED AS INPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 
> 'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base'
>  TBLPROPERTIES ('transactional'='false')
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:373)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:241)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run 
> create temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` 
> int, `b` string) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH SERDEPROPERTIES (
> 'serialization.format'='1')STORED AS INPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 
> 'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base'
>  TBLPROPERTIES ('transactional'='false')
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:525)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:365)
> ... 2 more
> Caused by: java.lang.NullPointerException: Non-local session path expected to 
> be non-null
> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:815)
> at org.apache.hadoop.hive.ql.Context.(Context.java:309)
> at org.apache.hadoop.hive.ql.Context.(Context.java:295)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:591)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1684)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1807)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1567)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1556)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:522)
> ... 3 more
> {code}
> cc: [~ekoifman] [~vgumashta] [~sershe]
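
Following up on the reporter's point that the session state must be started and every compactor-issued query needs a query id, here is a minimal, hedged sketch of that idea. The property name {{hive.query.id}} and the {{SessionState}} calls are real, but the helper itself and where exactly it would be invoked are assumptions for illustration, not the committed patch.

{code:java}
import java.util.UUID;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.ql.session.SessionState;

// Hedged sketch only: give the compactor-issued query an id and start a SessionState
// before any Driver-based statement (e.g. the temp-table insert) is compiled, so that
// the HDFS session path and the Tez caller context both have what they need.
public class CompactorSessionSetupSketch {
  public static void prepare(HiveConf conf) {
    conf.set("hive.query.id", "compactor-" + UUID.randomUUID());  // non-empty query id
    if (SessionState.get() == null) {
      SessionState.start(conf);   // ensures a usable session path for ql.Context
    }
  }
}
{code}

With both in place, the quoted NPE from getHDFSSessionPath() and the empty caller id for the Tez task should no longer occur.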



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21280) Null pointer exception on running compaction against a MM table.

2019-03-08 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21280:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Aditya!

> Null pointer exception on running compaction against a MM table.
> 
>
> Key: HIVE-21280
> URL: https://issues.apache.org/jira/browse/HIVE-21280
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Aditya Shah
>Assignee: Aditya Shah
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21280.patch
>
>
> On running compaction on an MM table, I got a null pointer exception while getting 
> the HDFS session path. The error suggests that the session state was not 
> started for these queries. Even after making it start, compaction further fails while 
> running a TezTask for the insert overwrite of a temp table with the contents of the 
> original table. The cause is that the Tez session state is unable to 
> initialize: an IllegalArgumentException is thrown while setting up the caller 
> context in the Tez task, because the caller id is built from the query id, 
> which is an empty string. 
> I do think the session state needs to be started, and each of the queries run 
> for compaction (I am also doubtful about the stats updater thread's queries) should 
> have a query id. Some details are as follows:
> Steps to reproduce:
> 1) Using beeline with HS2 and HMS
> 2) create an MM table
> 3) Insert a few values in the table
> 4) alter table mm_table compact 'major'; 
> Stack trace on HMS:
> {code:java}
> compactor.Worker: Caught exception while trying to compact 
> id:8,dbname:default,tableName:acid_mm_orc,partName:null,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestWriteId:0.
>  Marking failed to avoid repeated failures, java.io.IOException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run create 
> temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` int, `b` 
> string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH 
> SERDEPROPERTIES (
> 'serialization.format'='1')STORED AS INPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 
> 'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base'
>  TBLPROPERTIES ('transactional'='false')
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:373)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:241)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run 
> create temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` 
> int, `b` string) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH SERDEPROPERTIES (
> 'serialization.format'='1')STORED AS INPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 
> 'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base'
>  TBLPROPERTIES ('transactional'='false')
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:525)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:365)
> ... 2 more
> Caused by: java.lang.NullPointerException: Non-local session path expected to 
> be non-null
> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
> at 
> org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:815)
> at org.apache.hadoop.hive.ql.Context.<init>(Context.java:309)
> at org.apache.hadoop.hive.ql.Context.<init>(Context.java:295)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:591)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1684)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1807)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1567)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1556)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:522)
> ... 3 more
> {code}
> cc: [~ekoifman] [~vgumashta] [~sershe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21402) Compaction state remains 'working' when major compaction fails

2019-03-08 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788012#comment-16788012
 ] 

Ashutosh Chauhan commented on HIVE-21402:
-

[~pvary] What exception was thrown instead in that case?

> Compaction state remains 'working' when major compaction fails
> --
>
> Key: HIVE-21402
> URL: https://issues.apache.org/jira/browse/HIVE-21402
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-21402.patch
>
>
> When Calcite is not on the HMS classpath and query-based compaction is 
> enabled, the compaction fails with a NoClassDefFoundError. Since the catch 
> block only catches Exceptions, the following code block is not executed:
> {code:java}
> } catch (Exception e) {
>   LOG.error("Caught exception while trying to compact " + ci +
>   ".  Marking failed to avoid repeated failures, " + 
> StringUtils.stringifyException(e));
>   msc.markFailed(CompactionInfo.compactionInfoToStruct(ci));
>   msc.abortTxns(Collections.singletonList(compactorTxnId));
> }
> {code}
> So the compaction is never marked as failed and its state remains 'working'.
> It would be better to catch Throwable instead of Exception.
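
A minimal, self-contained sketch of the proposal; the class below is illustrative only and not the Worker code itself. It just demonstrates that a Throwable-level catch also covers Errors such as NoClassDefFoundError, so the failure bookkeeping still runs.

{code:java}
// Hedged sketch, not the actual Worker: mark the compaction failed on *any* throwable,
// including Errors such as NoClassDefFoundError, instead of only on Exceptions.
public class CompactionFailureHandlingSketch {
  interface CompactionBody { void run() throws Exception; }

  static void runAndMarkFailedOnError(CompactionBody body, Runnable markFailed) {
    try {
      body.run();
    } catch (Throwable t) {      // Throwable also catches NoClassDefFoundError
      markFailed.run();          // stand-in for msc.markFailed(...) and msc.abortTxns(...)
    }
  }

  public static void main(String[] args) {
    runAndMarkFailedOnError(
        () -> { throw new NoClassDefFoundError("org.apache.calcite.plan.RelOptRule"); },
        () -> System.out.println("compaction marked failed"));
  }
}
{code}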



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21410) find out the actual port number when hive.server2.thrift.port=0

2019-03-08 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21410:

Status: Patch Available  (was: Open)

> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: HIVE-21410
> URL: https://issues.apache.org/jira/browse/HIVE-21410
> Project: Hive
>  Issue Type: Improvement
>Reporter: zuotingbing
>Assignee: zuotingbing
>Priority: Minor
> Attachments: 2019-03-08_163705.png, 2019-03-08_163747.png, 
> HIVE-21410.patch
>
>
> before fixed:
> !2019-03-08_163705.png!
> after fixed:
> !2019-03-08_163747.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20580) OrcInputFormat.isOriginal() should not rely on hive.acid.key.index

2019-03-08 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788001#comment-16788001
 ] 

Ashutosh Chauhan commented on HIVE-20580:
-

There are some test util methods in {{TestAcidUtils}} which might be useful 
here. Also, {{TestAcidOnTez}}

> OrcInputFormat.isOriginal() should not rely on hive.acid.key.index
> --
>
> Key: HIVE-20580
> URL: https://issues.apache.org/jira/browse/HIVE-20580
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.1.0
>Reporter: Eugene Koifman
>Assignee: Peter Vary
>Priority: Major
> Attachments: HIVE-20580.patch
>
>
> {{org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.isOriginal()}} checks 
> for the presence of {{hive.acid.key.index}} in the footer.  This entry is only created 
> when the file is written by {{OrcRecordUpdater}}.  It should instead check 
> for the presence of the ACID metadata columns, so that a file can be produced by 
> something other than {{OrcRecordUpdater}}.
> Also, {{hive.acid.key.index}} counts the number of different types of events, which 
> is not really useful for ACID V2 (as of Hive 3) since each file only has one 
> type of event.
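
A hedged sketch of that suggestion, not the actual patch: the helper below only illustrates deciding from the top-level column names, and the ACID column names used here (operation, originalTransaction, bucket, rowId, currentTransaction, row) are the commonly documented ones, listed purely for illustration.

{code:java}
import java.util.Arrays;
import java.util.List;

// Hedged sketch: decide "original" (pre-ACID layout) from the file schema's top-level
// column names rather than from the writer-specific hive.acid.key.index footer entry.
public class IsOriginalSketch {
  private static final List<String> ACID_COLUMNS = Arrays.asList(
      "operation", "originalTransaction", "bucket", "rowId", "currentTransaction", "row");

  static boolean isOriginal(List<String> topLevelFieldNames) {
    return !topLevelFieldNames.containsAll(ACID_COLUMNS);   // no ACID wrapper => original file
  }

  public static void main(String[] args) {
    System.out.println(isOriginal(Arrays.asList("id", "name")));   // true: plain ORC file
    System.out.println(isOriginal(ACID_COLUMNS));                  // false: ACID file layout
  }
}
{code}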



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21397) BloomFilter for hive Managed [ACID] table does not work as expected

2019-03-08 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788004#comment-16788004
 ] 

Ashutosh Chauhan commented on HIVE-21397:
-

Yes, submit it to the ORC project. Once it has landed there, we can upgrade the ORC 
version in Hive to one which contains this fix.

> BloomFilter for hive Managed [ACID] table does not work as expected
> ---
>
> Key: HIVE-21397
> URL: https://issues.apache.org/jira/browse/HIVE-21397
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2, Transactions
>Affects Versions: 3.1.1
>Reporter: vaibhav
>Assignee: Denys Kuzmenko
>Priority: Blocker
> Attachments: OrcUtils.patch, orc_file_dump.out, orc_file_dump.q
>
>
> Steps to Reproduce this issue : 
> - 
> 1. Create a HIveManaged table as below : 
> - 
> {code:java}
> CREATE TABLE `bloomTest`( 
>    `msisdn` string, 
>    `imsi` varchar(20), 
>    `imei` bigint, 
>    `cell_id` bigint) 
>  ROW FORMAT SERDE 
>    'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
>  STORED AS INPUTFORMAT 
>    'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
>  OUTPUTFORMAT 
>    'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' 
>  LOCATION 
>    
> 'hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest;
>  
>  TBLPROPERTIES ( 
>    'bucketing_version'='2', 
>    'orc.bloom.filter.columns'='msisdn,cell_id,imsi', 
>    'orc.bloom.filter.fpp'='0.02', 
>    'transactional'='true', 
>    'transactional_properties'='default', 
>    'transient_lastDdlTime'='1551206683') {code}
> - 
> 2. Insert a few rows. 
> - 
> - 
> 3. Check whether the bloom filters are active: [ it does not show bloom filters for 
> Hive managed tables ] 
> - 
> {code:java}
> [hive@c1162-node2 root]$ hive --orcfiledump 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_001_001_
>  | grep -i bloom 
> SLF4J: Class path contains multiple SLF4J bindings. 
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>  
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>  
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation. 
> SLF4J: Actual binding is of type 
> [org.apache.logging.slf4j.Log4jLoggerFactory] 
> Processing data file 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_001_001_/bucket_0
>  [length: 791] 
> Structure for 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_001_001_/bucket_0
>  {code}
> - 
> On the other hand, for Hive external tables it works: 
> - 
> {code:java}
> CREATE external TABLE `ext_bloomTest`( 
>    `msisdn` string, 
>    `imsi` varchar(20), 
>    `imei` bigint, 
>    `cell_id` bigint) 
>  ROW FORMAT SERDE 
>    'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
>  STORED AS INPUTFORMAT 
>    'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
>  OUTPUTFORMAT 
>    'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' 
>  TBLPROPERTIES ( 
>    'bucketing_version'='2', 
>    'orc.bloom.filter.columns'='msisdn,cell_id,imsi', 
>    'orc.bloom.filter.fpp'='0.02') {code}
> - 
> {code:java}
> [hive@c1162-node2 root]$ hive --orcfiledump 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/00_0
>  | grep -i bloom 
> SLF4J: Class path contains multiple SLF4J bindings. 
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>  
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>  
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation. 
> SLF4J: Actual binding is of type 
> [org.apache.logging.slf4j.Log4jLoggerFactory] 
> Processing data file 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/00_0
>  [length: 755] 
> Structure for 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/00_0
>  
> Stream: column 1 section BLOOM_FILTER_UTF8 start: 41 length 110 
> Stream: column 2 section BLOOM_FILTER_UTF8 start: 178 length 114 
> 

[jira] [Commented] (HIVE-21397) BloomFilter for hive Managed [ACID] table does not work as expected

2019-03-07 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787382#comment-16787382
 ] 

Ashutosh Chauhan commented on HIVE-21397:
-

On Hive 2 it was reported that bloom filters aren't created either; instead the 
insert actually results in an exception:
h4. REPRODUCE STEPS

Install a cluster with ACID enabled
{code}
CREATE TABLE IF NOT EXISTS emp_part_bckt (
 empid int,
 name string,
 designation  string,
 salary int)
 PARTITIONED BY (department String)
 clustered by (empid) into 2 buckets
 stored as orc
TBLPROPERTIES ('transactional'='true', 'orc.create.index'='true', 
'orc.bloom.filter.columns'='empid,name,designation');

hive> INSERT INTO emp_part_bckt PARTITION(department) VALUES (1, 'Hajime', 
'Test', 100, 'Support');
{code}

h4. ERROR
{noformat}
Status: Failed
Vertex failed, vertexName=Reducer 2, vertexId=vertex_1503649523886_0030_1_01, 
diagnostics=[Task failed, taskId=task_1503649523886_0030_1_01_01, 
diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running 
task:java.lang.RuntimeException: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row (tag=0) 
{"key":{},"value":{"_col0":"1","_col1":"Hajime","_col2":"Test","_col3":"100","_col4":"Support"}}
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row (tag=0) 
{"key":{},"value":{"_col0":"1","_col1":"Hajime","_col2":"Test","_col3":"100","_col4":"Support"}}
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:284)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:266)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row (tag=0) 
{"key":{},"value":{"_col0":"1","_col1":"Hajime","_col2":"Test","_col3":"100","_col4":"Support"}}
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ArrayIndexOutOfBoundsException: 4
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:581)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.createNewPaths(FileSinkOperator.java:870)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.getDynOutPaths(FileSinkOperator.java:977)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:720)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:841)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
... 17 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 4
at 
org.apache.hadoop.hive.ql.io.orc.OrcUtils.getColumnSpan(OrcUtils.java:134)
at 
org.apache.hadoop.hive.ql.io.orc.OrcUtils.includeColumnsImpl(OrcUtils.java:92)
at 
org.apache.hadoop.hive.ql.io.orc.OrcUtils.includeColumns(OrcUtils.java:84)
at 
> org.apache.hadoop.hive.ql.io.orc.WriterImpl.<init>(WriterImpl.java:217)
at 

[jira] [Commented] (HIVE-21398) Columns which has estimated statistics should not be considered as unique keys

2019-03-07 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787273#comment-16787273
 ] 

Ashutosh Chauhan commented on HIVE-21398:
-

+1

> Columns which has estimated statistics should not be considered as unique keys
> --
>
> Key: HIVE-21398
> URL: https://issues.apache.org/jira/browse/HIVE-21398
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21398.01.patch
>
>
> Right now, for a column to qualify as a unique column it has to meet the 
> criterion: 
> {code}
> NDV >= numRows
> {code}
> When numRows is 1 this tends to be true; but numRows is also 1 in cases where 
> we are operating blind - we don't know how many rows there are - more 
> generally: with estimated column statistics.
> As a side effect of qualifying all columns as unique, after a few joins all 
> column combinations become unique, so for a join between 3 tables which 
> have (i, j, k) columns respectively, it will allocate {{i*j*k}} "unique 
> column triplets".
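
A tiny, hedged illustration of the effect described above; this is not Hive code, it just plays out the NDV >= numRows test and the i*j*k blow-up with made-up numbers.

{code:java}
// Illustration only: the uniqueness test and the combinatorial blow-up it causes
// when estimated statistics report numRows = 1.
public class UniqueKeySketch {
  static boolean looksUnique(long ndv, long numRows) {
    return ndv >= numRows;                       // the criterion quoted above
  }

  public static void main(String[] args) {
    System.out.println(looksUnique(1, 1));       // true: estimated stats, every column "unique"
    System.out.println(looksUnique(90, 100));    // false: real stats, column is not unique
    long i = 5, j = 4, k = 3;                    // column counts of three joined tables
    System.out.println(i * j * k + " candidate \"unique\" column triplets after the join");
  }
}
{code}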



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20848) After setting UpdateInputAccessTimeHook query fail with Table Not Found.

2019-03-07 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-20848:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Rajkumar!

> After setting UpdateInputAccessTimeHook query fail with Table Not Found.
> 
>
> Key: HIVE-20848
> URL: https://issues.apache.org/jira/browse/HIVE-20848
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20848.01.patch, HIVE-20848.patch
>
>
> {code}
>  select from_unixtime(1540495168); 
>  set 
> hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook,org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec;
>  select from_unixtime(1540495168); 
> {code}
> the second select fails with the following exception
> {code}
> ERROR ql.Driver: FAILED: Hive Internal Error: 
> org.apache.hadoop.hive.ql.metadata.InvalidTableException(Table not found 
> _dummy_table)
> org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found 
> _dummy_table
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1217)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1168)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1155)
> at 
> org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec.run(UpdateInputAccessTimeHook.java:67)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1444)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}
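
A hedged, self-contained sketch of the kind of guard such a hook needs; the class and method names below are illustrative stand-ins, not the real UpdateInputAccessTimeHook API. The point is only that synthetic inputs such as _dummy_table never exist in the metastore and should be skipped before updating access times.

{code:java}
import java.util.Arrays;
import java.util.List;

// Illustration only (not the committed patch): skip synthetic inputs that are never
// registered in the metastore before trying to update their access time.
public class AccessTimeHookSketch {
  private static final String DUMMY_DB = "_dummy_database";
  private static final String DUMMY_TABLE = "_dummy_table";

  static void updateAccessTimes(List<String[]> inputs) {      // each entry: {db, table}
    for (String[] input : inputs) {
      String db = input[0], table = input[1];
      if (DUMMY_DB.equals(db) || DUMMY_TABLE.equals(table)) {
        continue;                                              // nothing to touch in the metastore
      }
      System.out.println("would update access time for " + db + "." + table);
    }
  }

  public static void main(String[] args) {
    updateAccessTimes(Arrays.asList(
        new String[]{"_dummy_database", "_dummy_table"},       // skipped
        new String[]{"default", "orders"}));                   // "updated"
  }
}
{code}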



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20546) Upgrade to Apache Druid 0.13.0-incubating

2019-03-07 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-20546:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Nishant!

> Upgrade to Apache Druid 0.13.0-incubating
> -
>
> Key: HIVE-20546
> URL: https://issues.apache.org/jira/browse/HIVE-20546
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20546.1.patch, HIVE-20546.2.patch, 
> HIVE-20546.3.patch, HIVE-20546.4.patch, HIVE-20546.5.patch, 
> HIVE-20546.6.patch, HIVE-20546.7.patch, HIVE-20546.patch
>
>
> This task is to upgrade to Druid 0.13.0 when it is released. Note that it 
> will hopefully be the first Apache release for Druid. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17668) Push filter clauses through PTF(Windowing) does not work in some cases

2019-03-07 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787250#comment-16787250
 ] 

Ashutosh Chauhan commented on HIVE-17668:
-

A ton of golden files need updating.

> Push filter clauses through PTF(Windowing) does not work in some cases
> --
>
> Key: HIVE-17668
> URL: https://issues.apache.org/jira/browse/HIVE-17668
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.1.0, 2.2.0, 2.3.0, 3.0.0, 2.4.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-17668.01.patch, HIVE-17668.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21392) Misconfigurations of DataNucleus log in log4j.properties

2019-03-07 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21392:

Status: Patch Available  (was: Open)

> Misconfigurations of DataNucleus log in log4j.properties
> 
>
> Key: HIVE-21392
> URL: https://issues.apache.org/jira/browse/HIVE-21392
> Project: Hive
>  Issue Type: Improvement
>  Components: Logging
>Affects Versions: 2.0.0
>Reporter: Chen Zhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21392.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the patch for 
> [HIVE-12020|https://issues.apache.org/jira/browse/HIVE-12020], we changed the 
> DataNucleus-related logging configuration from nine fine-grained loggers to 
> three coarse-grained loggers (DataNucleus, Datastore and JPOX). As Prasanth 
> Jayachandran 
> [explains|https://issues.apache.org/jira/browse/HIVE-12020?focusedCommentId=15025612=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15025612],
>  these three loggers are the top-level loggers in DataNucleus, so that we 
> don't need to specify other loggers for DataNucleus. However, according to 
> the 
> [documentation|http://www.datanucleus.org/products/accessplatform/logging.html] 
> and [source 
> code|https://github.com/datanucleus/datanucleus-core/blob/master/src/main/java/org/datanucleus/util/NucleusLogger.java#L108]
>  of DataNucleus, the only top-level logger in DataNucleus is `DataNucleus`. 
> Therefore, we just need to keep that one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21376) Incompatible change in Hive bucket computation

2019-03-07 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787245#comment-16787245
 ] 

Ashutosh Chauhan commented on HIVE-21376:
-

+1

> Incompatible change in Hive bucket computation
> --
>
> Key: HIVE-21376
> URL: https://issues.apache.org/jira/browse/HIVE-21376
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: David Phillips
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21376.01.patch, HIVE-21376.patch
>
>
> HIVE-20007 seems to have inadvertently changed the bucket hash code 
> computation via {{ObjectInspectorUtils.getBucketHashCodeOld()}} for the 
> {{DATE}} and {{TIMESTAMP}} data types.
> {{DATE}} was previously computed using {{DateWritable}}, which uses 
> {{daysSinceEpoch}} as the hash code. It is now computed using 
> {{DateWritableV2}}, which uses the hash code of {{java.time.LocalDate}} 
> (which is not days since epoch).
> {{TIMESTAMP}} was previously computed using {{TimestampWritable}} and now uses 
> {{TimestampWritableV2}}. They ostensibly use the same hash code computation, 
> but there are two important differences:
>  # {{TimestampWritable}} rounds the number of milliseconds into the seconds 
> portion of the computation, but {{TimestampWritableV2}} does not.
>  # {{TimestampWritable}} gets the epoch time from {{java.sql.Timestamp}}, 
> which returns it relative to the JVM time zone, not UTC. 
> {{TimestampWritableV2}} uses a {{LocalDateTime}} relative to UTC.
> I was unable to get Hive 3.1 running in order to verify if this actually 
> causes data to be read or written incorrectly (there may be code above this 
> library method which makes things work correctly). However, if my 
> understanding is correct, this means Hive 3.1 is both forwards and backwards 
> incompatible with bucketed tables using either of these data types. It also 
> indicates that Hive needs tests to verify that the hash code does not change 
> between releases.
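
A small, hedged demonstration of the {{DATE}} half of the concern; this is plain JDK code, not Hive's, and only shows that days-since-epoch and {{java.time.LocalDate.hashCode()}} are generally different numbers, which is all the bucketing incompatibility needs.

{code:java}
import java.time.LocalDate;

// Illustration only: the value the old DateWritable-style hashing was based on
// (days since epoch) versus what LocalDate-based hashing would see.
public class DateBucketHashSketch {
  public static void main(String[] args) {
    LocalDate d = LocalDate.of(2019, 3, 7);
    long daysSinceEpoch = d.toEpochDay();     // old-style hash input
    int localDateHash = d.hashCode();         // new-style hash input
    System.out.println(daysSinceEpoch + " vs " + localDateHash);  // the two differ,
    // so the same row could land in different buckets before and after the change
  }
}
{code}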



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21182) Skip setting up hive scratch dir during planning

2019-03-05 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785283#comment-16785283
 ] 

Ashutosh Chauhan commented on HIVE-21182:
-

+1

> Skip setting up hive scratch dir during planning
> 
>
> Key: HIVE-21182
> URL: https://issues.apache.org/jira/browse/HIVE-21182
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21182.1.patch, HIVE-21182.2.patch, 
> HIVE-21182.3.patch
>
>
> During the metadata gathering phase Hive creates a staging/scratch dir which is 
> later used by the FS op (the FS op sets up a staging dir within this dir for tasks to 
> write to).
> Since the FS op does mkdirs to set up its staging dir, we can skip creating the scratch dir 
> during the metadata gathering phase. The FS op will take care of setting up all the 
> dirs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20546) Upgrade to Apache Druid 0.13.0-incubating

2019-03-05 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785266#comment-16785266
 ] 

Ashutosh Chauhan commented on HIVE-20546:
-

+1

> Upgrade to Apache Druid 0.13.0-incubating
> -
>
> Key: HIVE-20546
> URL: https://issues.apache.org/jira/browse/HIVE-20546
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20546.1.patch, HIVE-20546.2.patch, 
> HIVE-20546.3.patch, HIVE-20546.4.patch, HIVE-20546.5.patch, 
> HIVE-20546.6.patch, HIVE-20546.patch
>
>
> This task is to upgrade to Druid 0.13.0 when it is released. Note that it 
> will hopefully be the first Apache release for Druid. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20848) After setting UpdateInputAccessTimeHook query fail with Table Not Found.

2019-03-05 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785128#comment-16785128
 ] 

Ashutosh Chauhan commented on HIVE-20848:
-

+1
Can you reattach the patch for QA run?

> After setting UpdateInputAccessTimeHook query fail with Table Not Found.
> 
>
> Key: HIVE-20848
> URL: https://issues.apache.org/jira/browse/HIVE-20848
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-20848.patch
>
>
> {code}
>  select from_unixtime(1540495168); 
>  set 
> hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook,org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec;
>  select from_unixtime(1540495168); 
> {code}
> the second select fails with the following exception
> {code}
> ERROR ql.Driver: FAILED: Hive Internal Error: 
> org.apache.hadoop.hive.ql.metadata.InvalidTableException(Table not found 
> _dummy_table)
> org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found 
> _dummy_table
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1217)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1168)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1155)
> at 
> org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec.run(UpdateInputAccessTimeHook.java:67)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1444)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20616) Dynamic Partition Insert failed if PART_VALUE exceeds 4000 chars

2019-03-05 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785129#comment-16785129
 ] 

Ashutosh Chauhan commented on HIVE-20616:
-

[~daijy] Can you please review this?

> Dynamic Partition Insert failed if PART_VALUE exceeds 4000 chars
> 
>
> Key: HIVE-20616
> URL: https://issues.apache.org/jira/browse/HIVE-20616
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-20616.patch
>
>
> with mysql as metastore db the PARTITION_PARAMS.PARAM_VALUE defined as 
> varchar(4000)
> {code}
> describe PARTITION_PARAMS; 
> +-+---+--+-+-+---+ 
> | Field | Type | Null | Key | Default | Extra | 
> +-+---+--+-+-+---+ 
> | PART_ID | bigint(20) | NO | PRI | NULL | | 
> | PARAM_KEY | varchar(256) | NO | PRI | NULL | | 
> | PARAM_VALUE | varchar(4000) | YES | | NULL | | 
> +-+---+--+-+-+---+ 
> {code}
> which leads to a MoveTask failure if the PART_VALUE exceeds 4000 chars.
> {code}
> org.datanucleus.store.rdbms.exceptions.MappedDatastoreException: INSERT INTO 
> `PARTITION_PARAMS` (`PARAM_VALUE`,`PART_ID`,`PARAM_KEY`) VALUES (?,?,?)
>  at 
> org.datanucleus.store.rdbms.scostore.JoinMapStore.internalPut(JoinMapStore.java:1074)
>  at 
> org.datanucleus.store.rdbms.scostore.JoinMapStore.putAll(JoinMapStore.java:224)
>  at 
> org.datanucleus.store.rdbms.mapping.java.MapMapping.postInsert(MapMapping.java:158)
>  at 
> org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:522)
>  at 
> org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:162)
>  at 
> org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:138)
>  at 
> org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:3363)
>  at 
> org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:3339)
>  at 
> org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2080)
>  at 
> org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1923)
>  at 
> org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1778)
>  at 
> org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)
>  at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:724)
>  at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore.addPartition(ObjectStore.java:2442)
>  at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
>  at com.sun.proxy.$Proxy32.addPartition(Unknown Source)
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partition_core(HiveMetaStore.java:3976)
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partition_with_environment_context(HiveMetaStore.java:4032)
>  at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>  at com.sun.proxy.$Proxy34.add_partition_with_environment_context(Unknown 
> Source)
>  at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_partition_with_environment_context.getResult(ThriftHiveMetastore.java:15528)
>  at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_partition_with_environment_context.getResult(ThriftHiveMetastore.java:15512)
>  at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>  at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>  at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636)
>  at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631)
>  at 

[jira] [Commented] (HIVE-20801) ACID: Allow DbTxnManager to ignore non-ACID table locking

2019-03-04 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783794#comment-16783794
 ] 

Ashutosh Chauhan commented on HIVE-20801:
-

Now we only have ACID tables, insert-only (MM) tables, or external tables. If 
ETL is via Spark or Sqoop, those tables should be external, and then there will 
be no locking on them. 

> ACID: Allow DbTxnManager to ignore non-ACID table locking
> -
>
> Key: HIVE-20801
> URL: https://issues.apache.org/jira/browse/HIVE-20801
> Project: Hive
>  Issue Type: Bug
>  Components: Locking, Transactions
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
>  Labels: Branch3Candidate, TODOC
> Attachments: HIVE-20801.1.patch, HIVE-20801.2.patch, 
> HIVE-20801.2.patch, HIVE-20801.3.patch
>
>
> Enabling ACIDv1 on a cluster produces a central locking bottleneck for all 
> table types, which is not always the intention.
> The Hive locking for non-ACID tables is advisory (i.e. a client can 
> write/read without locking), which means that the implementation does not 
> offer strong consistency despite the lock manager consuming resources 
> centrally.
> Disabling this lock acquisition would improve the performance of non-ACID 
> tables co-existing with a globally configured DbTxnManager implementation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18920) CBO: Initialize the Janino providers ahead of 1st query

2019-03-04 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783693#comment-16783693
 ] 

Ashutosh Chauhan commented on HIVE-18920:
-

+1

> CBO: Initialize the Janino providers ahead of 1st query
> ---
>
> Key: HIVE-18920
> URL: https://issues.apache.org/jira/browse/HIVE-18920
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-18920.01.patch, HIVE-18920.02.patch, 
> HIVE-18920.patch
>
>
> Hive Calcite metadata providers are compiled when the 1st query comes in.
> If a second query arrives before the 1st one has built a metadata provider, 
> it will also try to do the same thing, because the cache is not populated yet.
> With 1024 concurrent users, it takes 6 minutes for the 1st query to finish 
> fighting all the other queries which are trying to load that cache.
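
A hedged, generic sketch of the pattern behind the suggested fix (build the expensive shared object once, ideally at startup, so concurrent first queries cannot stampede on an empty cache); this is plain JDK code and not the Calcite/Janino-specific change.

{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration only: the holder idiom builds an expensive shared object exactly once;
// touching it at startup means concurrent "first queries" reuse it instead of each
// trying to build their own copy.
public class EagerInitSketch {
  private static final AtomicInteger BUILDS = new AtomicInteger();

  private static final class Holder {
    static final Object INSTANCE = buildExpensiveProvider();   // built once, on first access
  }

  static Object buildExpensiveProvider() {
    BUILDS.incrementAndGet();        // stands in for compiling the metadata providers
    return new Object();
  }

  public static void main(String[] args) throws InterruptedException {
    Object warmedUp = Holder.INSTANCE;                // warm up before queries arrive
    CountDownLatch start = new CountDownLatch(1);
    Thread[] users = new Thread[64];
    for (int i = 0; i < users.length; i++) {
      users[i] = new Thread(() -> {
        try { start.await(); } catch (InterruptedException ignored) { }
        if (Holder.INSTANCE != warmedUp) {            // every "query" sees the same instance
          throw new IllegalStateException("provider was rebuilt");
        }
      });
      users[i].start();
    }
    start.countDown();
    for (Thread t : users) { t.join(); }
    System.out.println("provider built " + BUILDS.get() + " time(s)");   // prints 1
  }
}
{code}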



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21341) Sensible defaults : hive.server2.idle.operation.timeout and hive.server2.idle.session.timeout are too high

2019-02-27 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21341:

Attachment: HIVE-21341.patch

> Sensible defaults : hive.server2.idle.operation.timeout and 
> hive.server2.idle.session.timeout are too high
> --
>
> Key: HIVE-21341
> URL: https://issues.apache.org/jira/browse/HIVE-21341
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, HiveServer2
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21341.patch
>
>
> Defaults are too high, which results in extra resources being held too long 
> in HS2 memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21341) Sensible defaults : hive.server2.idle.operation.timeout and hive.server2.idle.session.timeout are too high

2019-02-27 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779946#comment-16779946
 ] 

Ashutosh Chauhan commented on HIVE-21341:
-

[~thejas] Can you please review? Are there any other companion configs we should 
change too?

> Sensible defaults : hive.server2.idle.operation.timeout and 
> hive.server2.idle.session.timeout are too high
> --
>
> Key: HIVE-21341
> URL: https://issues.apache.org/jira/browse/HIVE-21341
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, HiveServer2
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21341.patch
>
>
> Defaults are too high, which results in extra resources being held too long 
> in HS2 memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21341) Sensible defaults : hive.server2.idle.operation.timeout and hive.server2.idle.session.timeout are too high

2019-02-27 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21341:

Status: Patch Available  (was: Open)

> Sensible defaults : hive.server2.idle.operation.timeout and 
> hive.server2.idle.session.timeout are too high
> --
>
> Key: HIVE-21341
> URL: https://issues.apache.org/jira/browse/HIVE-21341
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, HiveServer2
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21341.patch
>
>
> Defaults are too high, which results in extra resources being held too long 
> in HS2 memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21341) Sensible defaults : hive.server2.idle.operation.timeout and hive.server2.idle.session.timeout are too high

2019-02-27 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-21341:
---


> Sensible defaults : hive.server2.idle.operation.timeout and 
> hive.server2.idle.session.timeout are too high
> --
>
> Key: HIVE-21341
> URL: https://issues.apache.org/jira/browse/HIVE-21341
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, HiveServer2
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
>
> Defaults are too high, which results in extra resources being held too long 
> in HS2 memory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-27 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779933#comment-16779933
 ] 

Ashutosh Chauhan commented on HIVE-21279:
-

+1 pending tests.

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.10.patch, 
> HIVE-21279.11.patch, HIVE-21279.2.patch, HIVE-21279.3.patch, 
> HIVE-21279.4.patch, HIVE-21279.5.patch, HIVE-21279.6.patch, 
> HIVE-21279.7.patch, HIVE-21279.8.patch, HIVE-21279.9.patch
>
>
> Currently, at the end of a job the FileSink operator moves/renames the temp directory 
> to another directory from which the FetchTask fetches the result. This is done to 
> avoid fetching potentially partial/invalid files written by failed/runaway tasks. This 
> operation is expensive on cloud storage. It could be avoided if the FetchTask were 
> passed the set of files to read from instead of the whole directory.
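
A hedged sketch of that direction, not the Hive implementation: the writer records which files it committed and the reader fetches exactly those, so no rename of the temp directory is needed and leftovers from failed or speculative tasks are ignored.

{code:java}
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Illustration only: fetch exactly the committed files from a directory listing,
// instead of moving the whole temp directory to a clean location first.
public class FetchCommittedFilesSketch {
  static List<String> filesToFetch(List<String> dirListing, Set<String> committed) {
    return dirListing.stream().filter(committed::contains).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> listing = Arrays.asList("000000_0", "000001_0", "000001_1.tmp");
    Set<String> committed = new HashSet<>(Arrays.asList("000000_0", "000001_0"));
    System.out.println(filesToFetch(listing, committed));  // [000000_0, 000001_0]
  }
}
{code}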



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HIVE-20364) Update default for hive.map.aggr.hash.min.reduction

2019-02-27 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-20364.
-
Resolution: Fixed

Taken care of in HIVE-20656

> Update default for hive.map.aggr.hash.min.reduction
> ---
>
> Key: HIVE-20364
> URL: https://issues.apache.org/jira/browse/HIVE-20364
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Nita Dembla
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20364.patch
>
>
> The default value is 0.5. Let's update it to 0.99.
> In the average case it's a trade-off between CPU and network. Erring on the side of CPU 
> is better since the perf loss caused by network is usually larger.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HIVE-20364) Update default for hive.map.aggr.hash.min.reduction

2019-02-27 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reopened HIVE-20364:
-

> Update default for hive.map.aggr.hash.min.reduction
> ---
>
> Key: HIVE-20364
> URL: https://issues.apache.org/jira/browse/HIVE-20364
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Nita Dembla
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20364.patch
>
>
> The default value is 0.5. Let's update it to 0.99.
> In the average case it's a trade-off between CPU and network. Erring on the side of CPU 
> is better since the perf loss caused by network is usually larger.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20656) Sensible defaults: Map aggregation memory configs are too aggressive

2019-02-27 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779835#comment-16779835
 ] 

Ashutosh Chauhan commented on HIVE-20656:
-

Sorry about that. I will reopen HIVE-20364 and update the correct config there.

> Sensible defaults: Map aggregation memory configs are too aggressive
> 
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses during which memory cannot be 
> reclaimed.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
> "Portion of total memory to be used by map-side group aggregation hash 
> table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", 
> (float) 0.9,
> "The max memory to be used by map-side group aggregation hash table.\n" +
> "If the memory usage is higher than this number, force to flush 
> data"),{code}
>  
> We can be a little bit more conservative with these configs to avoid getting into GC 
> pauses. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21332) Cache Purge command does purge the in-use buffer.

2019-02-27 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779824#comment-16779824
 ] 

Ashutosh Chauhan commented on HIVE-21332:
-

+1

> Cache Purge command does purge the in-use buffer.
> -
>
> Key: HIVE-21332
> URL: https://issues.apache.org/jira/browse/HIVE-21332
> Project: Hive
>  Issue Type: Bug
>Reporter: slim bouguerra
>Assignee: slim bouguerra
>Priority: Major
> Attachments: HIVE-21332.patch
>
>
> The cache purge command is purging buffers it is not supposed to evict.
> This can lead to an unrecoverable state.
> {code} 
> TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1545278897356_0093_27_00_01_3:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>  at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>  at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:110)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>  ... 15 more
> Caused by: java.io.IOException: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>  at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>  at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>  ... 17 more
> Caused by: java.io.IOException: 
> org.apache.hadoop.hive.common.io.Allocator$AllocatorOutOfMemoryException: 
> Failed to allocate 32768; at 0 out of 1 (entire cache is fragmented and 
> locked, or an internal issue)
>  at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:513)
>  at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:407)
>  at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:266)
>  at 
> 

[jira] [Updated] (HIVE-21298) Move Hive Schema Tool classes to their own package to have cleaner structure

2019-02-27 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21298:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Miklos!

> Move Hive Schema Tool classes to their own package to have  cleaner structure
> -
>
> Key: HIVE-21298
> URL: https://issues.apache.org/jira/browse/HIVE-21298
> Project: Hive
>  Issue Type: Improvement
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21298.01.patch, HIVE-21298.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20801) ACID: Allow DbTxnManager to ignore non-ACID table locking

2019-02-26 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778986#comment-16778986
 ] 

Ashutosh Chauhan commented on HIVE-20801:
-

hive.txn.strict.locking.mode=false should be sufficient for that. No?

> ACID: Allow DbTxnManager to ignore non-ACID table locking
> -
>
> Key: HIVE-20801
> URL: https://issues.apache.org/jira/browse/HIVE-20801
> Project: Hive
>  Issue Type: Bug
>  Components: Locking, Transactions
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
>  Labels: Branch3Candidate, TODOC
> Attachments: HIVE-20801.1.patch, HIVE-20801.2.patch, 
> HIVE-20801.2.patch, HIVE-20801.3.patch
>
>
> Enabling ACIDv1 on a cluster produces a central locking bottleneck for all 
> table types, which is not always the intention.
> The Hive locking for non-ACID tables is advisory (i.e. a client can 
> write/read without locking), which means that the implementation does not 
> offer strong consistency despite the lock manager consuming resources 
> centrally.
> Disabling this lock acquisition would improve the performance of non-ACID 
> tables co-existing with a globally configured DbTxnManager implementation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21225) ACID: getAcidState() should cache a recursive dir listing locally

2019-02-26 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778979#comment-16778979
 ] 

Ashutosh Chauhan commented on HIVE-21225:
-

[~vgumashta] Which approach are you thinking of here? [~ekoifman]'s, of encoding an 
identifier in file names, or [~gopalv]'s single recursive call? 
If we change the names of dirs, I am not sure whether that will have any impact on data in 
existing tables.

> ACID: getAcidState() should cache a recursive dir listing locally
> -
>
> Key: HIVE-21225
> URL: https://issues.apache.org/jira/browse/HIVE-21225
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Gopal V
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: async-pid-44-2.svg
>
>
> Currently getAcidState() makes 3 calls into the FS API which could be 
> answered by making a single recursive listDir call and reusing the same data 
> to check for isRawFormat() and isValidBase().
> All delta operations for a single partition can go against a single listed 
> directory snapshot instead of interacting with the NameNode or ObjectStore 
> within the inner loop.
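
A hedged sketch of the single-listing idea using the standard Hadoop FileSystem API; this is not the Hive change itself, only an illustration of taking one recursive listing and answering later checks from the cached snapshot.

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

// Illustration only: one recursive listing of the partition directory, cached locally,
// so later isRawFormat()/isValidBase()-style decisions need no extra NameNode calls.
public class AcidDirSnapshotSketch {
  static List<LocatedFileStatus> snapshot(FileSystem fs, Path partitionDir) throws IOException {
    List<LocatedFileStatus> cached = new ArrayList<>();
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(partitionDir, true); // single recursive call
    while (it.hasNext()) {
      cached.add(it.next());        // every base_/delta_ file ends up in the local snapshot
    }
    return cached;                  // subsequent checks iterate this list, not the file system
  }

  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.getLocal(new Configuration());
    for (LocatedFileStatus status : snapshot(fs, new Path("."))) {
      System.out.println(status.getPath());
    }
  }
}
{code}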



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20801) ACID: Allow DbTxnManager to ignore non-ACID table locking

2019-02-26 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778980#comment-16778980
 ] 

Ashutosh Chauhan commented on HIVE-20801:
-

bq. Just that Hive is slowed down by a magnitude when ACID is enabled for even 
1 tables.
Where is this slowness coming from? Is it that acquiring locks for them is slow?

> ACID: Allow DbTxnManager to ignore non-ACID table locking
> -
>
> Key: HIVE-20801
> URL: https://issues.apache.org/jira/browse/HIVE-20801
> Project: Hive
>  Issue Type: Bug
>  Components: Locking, Transactions
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
>  Labels: Branch3Candidate, TODOC
> Attachments: HIVE-20801.1.patch, HIVE-20801.2.patch, 
> HIVE-20801.2.patch, HIVE-20801.3.patch
>
>
> Enabling ACIDv1 on a cluster produces a central locking bottleneck for all 
> table types, which is not always the intention.
> The Hive locking for non-acid tables is advisory (i.e. a client can 
> write/read without locking), which means that the implementation does not 
> offer strong consistency despite the lock manager consuming resources 
> centrally.
> Disabling this lock acquisition would improve the performance of non-ACID 
> tables co-existing with a globally configured DbTxnManager implementation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18920) CBO: Initialize the Janino providers ahead of 1st query

2019-02-26 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778965#comment-16778965
 ] 

Ashutosh Chauhan commented on HIVE-18920:
-

Then it is no longer an issue? [~jcamachorodriguez], can you please confirm and 
resolve?

> CBO: Initialize the Janino providers ahead of 1st query
> ---
>
> Key: HIVE-18920
> URL: https://issues.apache.org/jira/browse/HIVE-18920
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Hive Calcite metadata providers are compiled when the 1st query comes in.
> If a second query arrives before the 1st one has built a metadata provider, 
> it will also try to do the same thing, because the cache is not populated yet.
> With 1024 concurrent users, it takes 6 minutes for the 1st query to finish 
> fighting all the other queries which are trying to load that cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18920) CBO: Initialize the Janino providers ahead of 1st query

2019-02-26 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778932#comment-16778932
 ] 

Ashutosh Chauhan commented on HIVE-18920:
-

Can this be done at HS2 process startup time?

> CBO: Initialize the Janino providers ahead of 1st query
> ---
>
> Key: HIVE-18920
> URL: https://issues.apache.org/jira/browse/HIVE-18920
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Hive Calcite metadata providers are compiled when the 1st query comes in.
> If a second query arrives before the 1st one has built a metadata provider, 
> it will also try to do the same thing, because the cache is not populated yet.
> With 1024 concurrent users, it takes 6 minutes for the 1st query to finish 
> fighting all the other queries which are trying to load that cache.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21279) Avoid moving/rename operation in FileSink op for SELECT queries

2019-02-26 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778597#comment-16778597
 ] 

Ashutosh Chauhan commented on HIVE-21279:
-

[~vgarg] can you create RB for review?

> Avoid moving/rename operation in FileSink op for SELECT queries
> ---
>
> Key: HIVE-21279
> URL: https://issues.apache.org/jira/browse/HIVE-21279
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21279.1.patch, HIVE-21279.2.patch, 
> HIVE-21279.3.patch, HIVE-21279.4.patch, HIVE-21279.5.patch, 
> HIVE-21279.6.patch, HIVE-21279.7.patch, HIVE-21279.8.patch
>
>
> Currently, at the end of a job, the FileSink operator moves/renames the temp directory 
> to another directory from which FetchTask fetches results. This is done to 
> avoid fetching potentially partial/invalid files left by failed/runaway tasks. This 
> operation is expensive on cloud storage. It could be avoided if FetchTask were 
> passed the set of files to read instead of the whole directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-02-24 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776328#comment-16776328
 ] 

Ashutosh Chauhan commented on HIVE-21293:
-

{{unknown}} needs to be non-reserved for this feature to be included; otherwise, the 
resulting ambiguity in the grammar is not worth it.
Although it is reserved in the standard, making it a reserved keyword in Hive would be 
a backward-incompatible change.
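To make the backward-compatibility point concrete, a hypothetical example (table and column names are made up): if {{unknown}} became a reserved keyword, existing schemas and queries like these would stop parsing unless the identifier were back-quoted.

{code}
-- existing DDL/queries that use "unknown" as an ordinary identifier;
-- keeping the keyword non-reserved means these keep working unchanged
create table audit_log (id int, unknown string);
select unknown, count(*) from audit_log group by unknown;

-- if it were reserved, users would have to quote the identifier instead:
-- select `unknown`, count(*) from audit_log group by `unknown`;
{code}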

> Fix ambiguity in grammar warnings at compilation time (II)
> --
>
> Key: HIVE-21293
> URL: https://issues.apache.org/jira/browse/HIVE-21293
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-21293.01.patch
>
>
> These are the warnings at compilation time:
> {code}
> warning(200): IdentifiersParser.g:424:5:
> Decision can match input such as "KW_UNKNOWN" using multiple alternatives: 1, 
> 10
> As a result, alternative(s) 10 were disabled for that input
> {code}
> This means that multiple parser rules can match certain query text, possibly 
> leading to unexpected errors at parsing time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21301) Show tables statement to include views and materialized views

2019-02-22 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775533#comment-16775533
 ] 

Ashutosh Chauhan commented on HIVE-21301:
-

+1

> Show tables statement to include views and materialized views
> -
>
> Key: HIVE-21301
> URL: https://issues.apache.org/jira/browse/HIVE-21301
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: TODOC3.2, pull-request-available
> Attachments: HIVE-21301.01.patch, HIVE-21301.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-19974 introduced a backwards-incompatible change, with the {{SHOW TABLES}} 
> statement showing only managed/external tables in the system.
> This issue will restore old behavior, with {{SHOW TABLES}} showing all 
> queryable entities, including views and materialized views.
> Instead, to provide information about table types, {{SHOW EXTENDED TABLES}} 
> statement is introduced, which includes an additional column with the table 
> type for each of the tables listed.
> Besides, the possibility to filter the show tables statements with a {{WHERE 
> `table_type` = 'ANY_TYPE'}} clause is introduced.
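A short usage sketch of the statements described above (the database name and the concrete table_type value are assumptions; the filter syntax follows the description):

{code}
use demo_db;

-- lists tables, views and materialized views again (pre-HIVE-19974 behaviour)
show tables;

-- adds a table_type column for each listed entity
show extended tables;

-- filter on the table type (syntax per the description above; the concrete
-- type name is an assumption)
show extended tables where `table_type` = 'MATERIALIZED_VIEW';
{code}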



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21278) Fix ambiguity in grammar warnings at compilation time

2019-02-20 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16773736#comment-16773736
 ] 

Ashutosh Chauhan commented on HIVE-21278:
-

[~jcamachorodriguez] Can you create a follow-up jira for the ambiguity due to 
{{unknown}}? I am wondering if we should revert that feature.

> Fix ambiguity in grammar warnings at compilation time
> -
>
> Key: HIVE-21278
> URL: https://issues.apache.org/jira/browse/HIVE-21278
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 4.0.0, 3.2.0
>
> Attachments: HIVE-21278.01.patch, HIVE-21278.02.patch, 
> HIVE-21278.patch
>
>
> These are the warnings at compilation time:
> {code}
> warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
> Decision can match input such as "KW_CHECK KW_DATETIME" using multiple 
> alternatives: 1, 2
> As a result, alternative(s) 2 were disabled for that input
> warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
> Decision can match input such as "KW_CHECK KW_DATE {LPAREN, StringLiteral}" 
> using multiple alternatives: 1, 2
> As a result, alternative(s) 2 were disabled for that input
> warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
> Decision can match input such as "KW_CHECK KW_UNIONTYPE LESSTHAN" using 
> multiple alternatives: 1, 2
> As a result, alternative(s) 2 were disabled for that input
> warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
> Decision can match input such as "KW_CHECK {KW_EXISTS, KW_TINYINT}" using 
> multiple alternatives: 1, 2
> As a result, alternative(s) 2 were disabled for that input
> warning(200): org/apache/hadoop/hive/ql/parse/HiveParser.g:2439:5:
> Decision can match input such as "KW_CHECK KW_STRUCT LESSTHAN" using multiple 
> alternatives: 1, 2
> As a result, alternative(s) 2 were disabled for that input
> {code}
> This means that multiple parser rules can match certain query text, possibly 
> leading to unexpected errors at parsing time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21038) Fix checkstyle for standalone-metastore

2019-02-13 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21038:

   Resolution: Fixed
Fix Version/s: (was: 3.2.0)
   4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Miklos!

> Fix checkstyle for standalone-metastore
> ---
>
> Key: HIVE-21038
> URL: https://issues.apache.org/jira/browse/HIVE-21038
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-21038.01.patch
>
>
> Since HIVE-17506 checkstyle is not working for standalone-metastore and it's 
> sub projects.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21149) Refactor LlapServiceDriver

2019-02-13 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21149:

   Resolution: Fixed
Fix Version/s: (was: 3.1.2)
   4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Miklos!

> Refactor LlapServiceDriver
> --
>
> Key: HIVE-21149
> URL: https://issues.apache.org/jira/browse/HIVE-21149
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.2
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21149.01.patch, HIVE-21149.02.patch, 
> HIVE-21149.03.patch, HIVE-21149.04.patch
>
>
> LlapServiceDriver is one monolithic class doing several things; it needs to be 
> refactored in order to make it clearer how it works.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21149) Refactor LlapServiceDriver

2019-02-13 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767481#comment-16767481
 ] 

Ashutosh Chauhan commented on HIVE-21149:
-

+1


> Refactor LlapServiceDriver
> --
>
> Key: HIVE-21149
> URL: https://issues.apache.org/jira/browse/HIVE-21149
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.2
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 3.1.2
>
> Attachments: HIVE-21149.01.patch, HIVE-21149.02.patch, 
> HIVE-21149.03.patch, HIVE-21149.04.patch
>
>
> LlapServiceDriver is one monolithic class doing several things; it needs to be 
> refactored in order to make it clearer how it works.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21254) Pre-upgrade tool should handle exceptions and skip db/tables

2019-02-12 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766599#comment-16766599
 ] 

Ashutosh Chauhan commented on HIVE-21254:
-

It should be determined whether the table is ACID or not. The pre-upgrade tool needs to 
compact ACID tables; that's a must, we can't skip it. If it's a non-ACID table 
then no action is needed on it anyway, so the AccessControlException can be ignored, but if 
it's an ACID table then I think the tool should raise an exception and fail. The user 
needs to make sure that the user running the pre-upgrade tool has sufficient privileges.

> Pre-upgrade tool should handle exceptions and skip db/tables
> 
>
> Key: HIVE-21254
> URL: https://issues.apache.org/jira/browse/HIVE-21254
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21254.1.patch
>
>
> When exceptions like AccessControlException are thrown, the pre-upgrade tool 
> fails. If the hive user does not have read access to a database or its tables (some 
> external tables deny read access to hive), the pre-upgrade tool should just 
> assume they are external tables and move on without failing the pre-upgrade 
> process. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21236) SharedWorkOptimizer should check table properties

2019-02-12 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766279#comment-16766279
 ] 

Ashutosh Chauhan commented on HIVE-21236:
-

+1

> SharedWorkOptimizer should check table properties
> -
>
> Key: HIVE-21236
> URL: https://issues.apache.org/jira/browse/HIVE-21236
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21236.01.patch, HIVE-21236.02.patch, 
> HIVE-21236.patch
>
>
> For instance, Calcite may have pushed computation to Druid or a JDBC source, 
> rest of table structures may look the same, but the embedded query is 
> different.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21235) LLAP: make the name of log4j2 properties file configurable

2019-02-11 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16765292#comment-16765292
 ] 

Ashutosh Chauhan commented on HIVE-21235:
-

+1

> LLAP: make the name of log4j2 properties file configurable
> --
>
> Key: HIVE-21235
> URL: https://issues.apache.org/jira/browse/HIVE-21235
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21235.1.patch
>
>
> For llap daemon, the name of llap-daemon-log4j2.properties is fixed. If a 
> conf dir and jar contain the same filename, it will mess up log4j2 
> initialization. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21001) Upgrade to calcite-1.18

2019-02-07 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16763086#comment-16763086
 ] 

Ashutosh Chauhan commented on HIVE-21001:
-

Left some comments on RB.

> Upgrade to calcite-1.18
> ---
>
> Key: HIVE-21001
> URL: https://issues.apache.org/jira/browse/HIVE-21001
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21001.01.patch, HIVE-21001.01.patch, 
> HIVE-21001.02.patch, HIVE-21001.03.patch, HIVE-21001.04.patch, 
> HIVE-21001.05.patch, HIVE-21001.06.patch, HIVE-21001.06.patch, 
> HIVE-21001.07.patch, HIVE-21001.08.patch, HIVE-21001.08.patch, 
> HIVE-21001.08.patch, HIVE-21001.09.patch, HIVE-21001.09.patch, 
> HIVE-21001.09.patch, HIVE-21001.10.patch, HIVE-21001.11.patch, 
> HIVE-21001.12.patch, HIVE-21001.13.patch, HIVE-21001.15.patch, 
> HIVE-21001.16.patch, HIVE-21001.17.patch, HIVE-21001.18.patch, 
> HIVE-21001.18.patch, HIVE-21001.19.patch, HIVE-21001.20.patch, 
> HIVE-21001.21.patch, HIVE-21001.22.patch, HIVE-21001.22.patch, 
> HIVE-21001.22.patch, HIVE-21001.23.patch, HIVE-21001.24.patch
>
>
> XLEAR LIBRARY CACHE 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20523) Improve table statistics for Parquet format

2019-02-06 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16762260#comment-16762260
 ] 

Ashutosh Chauhan commented on HIVE-20523:
-

+1. The patch needs a refresh and a rerun.

> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.10.patch, 
> HIVE-20523.11.patch, HIVE-20523.12.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.5.patch, 
> HIVE-20523.6.patch, HIVE-20523.7.patch, HIVE-20523.8.patch, 
> HIVE-20523.9.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimated value 
> when columns are complex data structures, like arrays.
> Having tables with underestimated raw data size makes Hive assign fewer 
> containers (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin instead of the 
> ShuffleJoin that can fail with OOM errors.
> In this patch, I compute the columns data size better, taking into account 
> complex structures. I followed the Writer implementation for the ORC format.
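One way to observe the statistic being discussed (the table name is hypothetical): after gathering basic statistics, {{rawDataSize}} and {{numRows}} appear in the table parameters, so the estimate for complex-typed columns can be checked directly.

{code}
-- hypothetical Parquet table with a complex (array) column
create table parquet_events (id int, tags array<string>) stored as parquet;

-- gather basic statistics, then inspect numRows / rawDataSize
-- under "Table Parameters" in the output
analyze table parquet_events compute statistics;
describe formatted parquet_events;
{code}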



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21184) Add explain and explain formatted CBO plan with cost information

2019-02-04 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16759931#comment-16759931
 ] 

Ashutosh Chauhan commented on HIVE-21184:
-

+1

> Add explain and explain formatted CBO plan with cost information
> 
>
> Key: HIVE-21184
> URL: https://issues.apache.org/jira/browse/HIVE-21184
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21184.01.patch, HIVE-21184.03.patch, 
> HIVE-21184.04.patch, HIVE-21184.05.patch
>
>
> The plan is more readable than the full DAG. Explain formatted/extended will print 
> the plan.
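A usage sketch, assuming the {{EXPLAIN CBO}} / {{EXPLAIN CBO COST}} syntax this patch is adding (table and column names are made up):

{code}
-- prints the Calcite (CBO) plan instead of the full operator DAG
explain cbo
select d.name, count(*) from emp e join dept d on (e.dept_id = d.id) group by d.name;

-- same plan annotated with cost information (syntax assumed from this patch)
explain cbo cost
select d.name, count(*) from emp e join dept d on (e.dept_id = d.id) group by d.name;
{code}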



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21189) hive.merge.nway.joins should default to false

2019-02-02 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21189:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks Jesus for review!

> hive.merge.nway.joins should default to false
> -
>
> Key: HIVE-21189
> URL: https://issues.apache.org/jira/browse/HIVE-21189
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21189.1.patch, HIVE-21189.2.patch, HIVE-21189.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21189) hive.merge.nway.joins should default to false

2019-02-01 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21189:

Status: Patch Available  (was: Open)

> hive.merge.nway.joins should default to false
> -
>
> Key: HIVE-21189
> URL: https://issues.apache.org/jira/browse/HIVE-21189
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21189.1.patch, HIVE-21189.2.patch, HIVE-21189.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21189) hive.merge.nway.joins should default to false

2019-02-01 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21189:

Attachment: HIVE-21189.2.patch

> hive.merge.nway.joins should default to false
> -
>
> Key: HIVE-21189
> URL: https://issues.apache.org/jira/browse/HIVE-21189
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21189.1.patch, HIVE-21189.2.patch, HIVE-21189.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21189) hive.merge.nway.joins should default to false

2019-02-01 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21189:

Status: Open  (was: Patch Available)

> hive.merge.nway.joins should default to false
> -
>
> Key: HIVE-21189
> URL: https://issues.apache.org/jira/browse/HIVE-21189
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21189.1.patch, HIVE-21189.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21189) hive.merge.nway.joins should default to false

2019-02-01 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21189:

Status: Patch Available  (was: Open)

> hive.merge.nway.joins should default to false
> -
>
> Key: HIVE-21189
> URL: https://issues.apache.org/jira/browse/HIVE-21189
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21189.1.patch, HIVE-21189.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21189) hive.merge.nway.joins should default to false

2019-02-01 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21189:

Status: Open  (was: Patch Available)

> hive.merge.nway.joins should default to false
> -
>
> Key: HIVE-21189
> URL: https://issues.apache.org/jira/browse/HIVE-21189
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21189.1.patch, HIVE-21189.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21189) hive.merge.nway.joins should default to false

2019-02-01 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21189:

Attachment: HIVE-21189.1.patch

> hive.merge.nway.joins should default to false
> -
>
> Key: HIVE-21189
> URL: https://issues.apache.org/jira/browse/HIVE-21189
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21189.1.patch, HIVE-21189.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17938) Enable parallel query compilation in HS2

2019-01-30 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756780#comment-16756780
 ] 

Ashutosh Chauhan commented on HIVE-17938:
-

+1

> Enable parallel query compilation in HS2
> 
>
> Key: HIVE-17938
> URL: https://issues.apache.org/jira/browse/HIVE-17938
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>Priority: Major
> Attachments: HIVE-17938.1.patch, HIVE-17938.2.patch, 
> HIVE-17938.3.patch
>
>
> This (hive.driver.parallel.compilation) has been enabled in many production 
> environments for a while (Hortonworks customers), and it has been stable.
> Just realized that this is not yet enabled in Apache by default. 
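For reference, a minimal sketch of the setting in question; in most deployments this is configured server-side in hive-site.xml rather than per session, so the {{set}} form below is only illustrative.

{code}
-- allow HiveServer2 to compile queries from different sessions in parallel
set hive.driver.parallel.compilation=true;
{code}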



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21188) SemanticException for query on view with masked table

2019-01-30 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756733#comment-16756733
 ] 

Ashutosh Chauhan commented on HIVE-21188:
-

+1 pending tests

> SemanticException for query on view with masked table
> -
>
> Key: HIVE-21188
> URL: https://issues.apache.org/jira/browse/HIVE-21188
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21188.patch
>
>
> When view reference is fully qualified. Following q file can be used to 
> reproduce the issue:
> {code}
> --! qt:dataset:srcpart
> --! qt:dataset:src
> set hive.mapred.mode=nonstrict;
> set 
> hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest;
> create database atlasmask;
> use atlasmask;
> create table masking_test_n8 (key int, value int);
> insert into masking_test_n8 values(1,1), (2,2);
> create view testv(c,d) as select * from masking_test_n8;
> select * from `atlasmask`.`testv`;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21189) hive.merge.nway.joins should default to false

2019-01-30 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21189:

Attachment: HIVE-21189.patch

> hive.merge.nway.joins should default to false
> -
>
> Key: HIVE-21189
> URL: https://issues.apache.org/jira/browse/HIVE-21189
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21189.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21189) hive.merge.nway.joins should default to false

2019-01-30 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-21189:

Status: Patch Available  (was: Open)

> hive.merge.nway.joins should default to false
> -
>
> Key: HIVE-21189
> URL: https://issues.apache.org/jira/browse/HIVE-21189
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
> Attachments: HIVE-21189.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21189) hive.merge.nway.joins should default to false

2019-01-30 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-21189:
---


> hive.merge.nway.joins should default to false
> -
>
> Key: HIVE-21189
> URL: https://issues.apache.org/jira/browse/HIVE-21189
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17020) Aggressive RS dedup can incorrectly remove OP tree branch

2019-01-28 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754283#comment-16754283
 ] 

Ashutosh Chauhan commented on HIVE-17020:
-

+1

> Aggressive RS dedup can incorrectly remove OP tree branch
> -
>
> Key: HIVE-17020
> URL: https://issues.apache.org/jira/browse/HIVE-17020
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-17020.1.patch, HIVE-17020.2.patch, 
> HIVE-17020.3.patch, HIVE-17020.4.patch, HIVE-17020.5.patch, 
> HIVE-17020.6.patch, HIVE-17020.7.patch, HIVE-17020.8.patch, HIVE-17020.9.patch
>
>
> Suppose we have an OP tree like this:
> {noformat}
>  ...
>   |
>  RS[1]
>   |
> SEL[2]
> /\
> SEL[3]   SEL[4]
>   | |
> RS[5] FS[6]
>   |
>  ... 
> {noformat}
> When doing aggressive RS dedup, we'll remove all the operators between RS5 
> and RS1, and thus the branch containing FS6 is lost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21171) Skip creating scratch dirs for tez if RPC is on

2019-01-28 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754074#comment-16754074
 ] 

Ashutosh Chauhan commented on HIVE-21171:
-

+1

> Skip creating scratch dirs for tez if RPC is on
> ---
>
> Key: HIVE-21171
> URL: https://issues.apache.org/jira/browse/HIVE-21171
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21171.1.patch, HIVE-21171.2.patch
>
>
> There are a few places, e.g. during creating the DAG/vertices, where scratch 
> directories are created for each vertex even if the plan is being sent using RPC. 
> This adds unnecessary overhead for cloud file systems, e.g. S3A.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-11708) Logical operators raises ClassCastExceptions with NULL

2019-01-25 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-11708:

   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Ryu!

> Logical operators raises ClassCastExceptions with NULL
> --
>
> Key: HIVE-11708
> URL: https://issues.apache.org/jira/browse/HIVE-11708
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 1.2.1
>Reporter: Satoshi Tagomori
>Assignee: Ryu Kobayashi
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-11708.01.patch, HIVE-11708.02.patch
>
>
> According to the Language Manual UDF page, logical operators return NULL if one of 
> the arguments is NULL.
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-LogicalOperators
> But query below fails with ClassCastException.
> {code}
> SELECT COUNT(*) AS c
> FROM tbl
> WHERE 1=1 AND NULL
> {code}
> Exception (on 0.13):
> {noformat}
> 15/08/27 08:56:23 ERROR ql.Driver: FAILED: ClassCastException 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableVoidObjectInspector
>  cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.BooleanObjectInspector
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableVoidObjectInspector
>  cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.BooleanObjectInspector
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.initialize(GenericUDFOPAnd.java:52)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:116)
>   at 
> org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:231)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:934)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1128)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:184)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:9716)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:9672)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3208)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3005)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8228)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8183)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9015)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9281)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:427)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:323)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:980)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1045)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
> {noformat}
> I confirmed that Hive 1.2.1 of HDP2.3 Sandbox also raises this exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-11708) Logical operators raises ClassCastExceptions with NULL

2019-01-24 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751883#comment-16751883
 ] 

Ashutosh Chauhan commented on HIVE-11708:
-

+1 pending tests

> Logical operators raises ClassCastExceptions with NULL
> --
>
> Key: HIVE-11708
> URL: https://issues.apache.org/jira/browse/HIVE-11708
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 1.2.1
>Reporter: Satoshi Tagomori
>Assignee: Ryu Kobayashi
>Priority: Major
> Attachments: HIVE-11708.01.patch, HIVE-11708.02.patch
>
>
> According to the Language Manual UDF page, logical operators return NULL if one of 
> the arguments is NULL.
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-LogicalOperators
> But query below fails with ClassCastException.
> {code}
> SELECT COUNT(*) AS c
> FROM tbl
> WHERE 1=1 AND NULL
> {code}
> Exception (on 0.13):
> {noformat}
> 15/08/27 08:56:23 ERROR ql.Driver: FAILED: ClassCastException 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableVoidObjectInspector
>  cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.BooleanObjectInspector
> java.lang.ClassCastException: 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableVoidObjectInspector
>  cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.BooleanObjectInspector
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.initialize(GenericUDFOPAnd.java:52)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:116)
>   at 
> org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:231)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:934)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1128)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:184)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:9716)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:9672)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3208)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3005)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8228)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8183)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9015)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9281)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:427)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:323)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:980)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1045)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:916)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
> {noformat}
> I confirmed that Hive 1.2.1 of HDP2.3 Sandbox also raises this exception.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

