[jira] [Created] (HIVE-22078) Upgrade arrow version to 0.14.1

2019-08-02 Thread David Mollitor (JIRA)
David Mollitor created HIVE-22078:
-

 Summary: Upgrade arrow version to 0.14.1
 Key: HIVE-22078
 URL: https://issues.apache.org/jira/browse/HIVE-22078
 Project: Hive
  Issue Type: Task
Affects Versions: 4.0.0
Reporter: David Mollitor






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HIVE-22079) Post order walker for iterating over expression tree

2019-08-02 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-22079:
--

 Summary: Post order walker for iterating over expression tree
 Key: HIVE-22079
 URL: https://issues.apache.org/jira/browse/HIVE-22079
 Project: Hive
  Issue Type: Improvement
  Components: Logical Optimizer, Physical Optimizer
Affects Versions: 4.0.0
Reporter: Vineet Garg
Assignee: Vineet Garg


The current {{DefaultGraphWalker}} is used to iterate over an expression tree. This 
walker uses a hash map to keep track of visited/processed nodes. If an expression 
tree is large, this adds significant overhead due to map lookups.
For expression trees we can instead use a post-order traversal and avoid the 
map.
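
A minimal sketch of the idea, not Hive's walker: ExprNode and walk_post_order are hypothetical names, and Python is used only for brevity. It assumes the expression is a true tree (no shared sub-expressions), so every node is reached exactly once through its parent and no visited map is needed.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ExprNode:
    """Hypothetical expression node; a stand-in, not Hive's node class."""
    label: str
    children: List["ExprNode"] = field(default_factory=list)


def walk_post_order(root: ExprNode, process: Callable[[ExprNode], None]) -> None:
    """Process every child before its parent, without a visited map."""
    # Each stack frame is (node, index of the next child to descend into).
    stack = [(root, 0)]
    while stack:
        node, next_child = stack.pop()
        if next_child < len(node.children):
            # Come back to this node later, then descend into the next child.
            stack.append((node, next_child + 1))
            stack.append((node.children[next_child], 0))
        else:
            # All children are done, so the node itself can be processed.
            process(node)


# Expression (a + b) * c
tree = ExprNode("*", [ExprNode("+", [ExprNode("a"), ExprNode("b")]), ExprNode("c")])
visited: List[str] = []
walk_post_order(tree, lambda n: visited.append(n.label))
print(visited)  # ['a', 'b', '+', 'c', '*']
```

If a sub-expression were shared between two parents, this sketch would process it once per parent, which is the trade-off for dropping the map.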





[jira] [Created] (HIVE-22081) Hivemetastore Performance: Compaction Initiator thread overwhelmed if too many tables/partitions are eligible for compaction

2019-08-02 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-22081:
-

 Summary: Hivemetastore Performance: Compaction Initiator thread 
overwhelmed if too many tables/partitions are eligible for 
compaction
 Key: HIVE-22081
 URL: https://issues.apache.org/jira/browse/HIVE-22081
 Project: Hive
  Issue Type: Improvement
  Components: Transactions
Affects Versions: 3.1.1
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


If automatic compaction is turned on, the Initiator thread looks for 
tables/partitions that are potentially eligible for compaction and runs some 
checks in a for loop before requesting compaction for the eligible ones. Although 
the Initiator thread is configured to run at a 5-minute interval by default, 
with many objects it keeps running, because these checks are IO intensive and 
hog the CPU.
In the proposed changes, I am planning to:
1. Pass fewer objects to the for loop by filtering out objects up front, based on 
the conditions that are currently checked within the loop.
2. Make an async call using a future to determine the compaction type (this is 
where we do FileSystem calls).
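
The two proposed changes could be sketched roughly as follows. This is an illustration only, not the Initiator code: Candidate, is_eligible, determine_compaction_type, plan_compactions, and the delta-count thresholds are all hypothetical, and the function standing in for the FileSystem calls only inspects an in-memory field.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical table/partition that might need compaction."""
    name: str
    already_queued: bool
    no_auto_compact: bool
    delta_count: int  # stand-in for what the file system would report


def is_eligible(c: Candidate) -> bool:
    # Cheap, in-memory conditions that can be checked before the loop.
    return not c.already_queued and not c.no_auto_compact


def determine_compaction_type(c: Candidate) -> Optional[str]:
    # Placeholder for the IO-heavy step; the thresholds are made up.
    if c.delta_count >= 10:
        return "MAJOR"
    if c.delta_count >= 3:
        return "MINOR"
    return None


def plan_compactions(candidates: List[Candidate], max_workers: int = 4) -> Dict[str, str]:
    # Step 1: filter first, so fewer objects reach the expensive step.
    eligible = [c for c in candidates if is_eligible(c)]
    # Step 2: run the expensive step asynchronously, one future per object.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {c.name: pool.submit(determine_compaction_type, c) for c in eligible}
        results = {name: f.result() for name, f in futures.items()}
    return {name: kind for name, kind in results.items() if kind is not None}


plan = plan_compactions([
    Candidate("db.t1/dayno=1", False, False, 12),
    Candidate("db.t1/dayno=2", True, False, 12),   # skipped: already queued
    Candidate("db.t2", False, True, 5),            # skipped: auto compaction disabled
    Candidate("db.t3", False, False, 4),
    Candidate("db.t4", False, False, 1),           # eligible, but nothing to compact
])
print(plan)
```

A bounded pool keeps the number of concurrent file system calls under control; the pool size here is arbitrary.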





[jira] [Created] (HIVE-22080) Prevent implicit conversion from String/char/varchar to double/decimal

2019-08-02 Thread Ramesh Kumar Thangarajan (JIRA)
Ramesh Kumar Thangarajan created HIVE-22080:
---

 Summary: Prevent implicit conversion from String/char/varchar to 
double/decimal
 Key: HIVE-22080
 URL: https://issues.apache.org/jira/browse/HIVE-22080
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 4.0.0
Reporter: Ramesh Kumar Thangarajan
Assignee: Ramesh Kumar Thangarajan
 Fix For: 4.0.0
 Attachments: DWX-684_1.patch

Implicit conversion from String family types to any non-string family type is 
invalid. Users can force the conversion by turning off the setting 
hive.metastore.disallow.incompatible.col.type.changes. If it is not turned off, 
such a conversion should throw an error.
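
As a rough sketch of the intended behaviour, not the attached patch: check_col_type_change and base_type are hypothetical helpers, and the disallow_incompatible parameter merely models hive.metastore.disallow.incompatible.col.type.changes. The String family is assumed to be string/char/varchar, as in the issue title.

```python
STRING_FAMILY = {"string", "char", "varchar"}


def base_type(type_name: str) -> str:
    """Strip parameters such as (10) or (10,2) and normalise case."""
    return type_name.split("(", 1)[0].strip().lower()


def check_col_type_change(old_type: str, new_type: str,
                          disallow_incompatible: bool = True) -> None:
    """Raise ValueError for a String family -> non-string change unless forced."""
    from_string = base_type(old_type) in STRING_FAMILY
    to_string = base_type(new_type) in STRING_FAMILY
    if disallow_incompatible and from_string and not to_string:
        raise ValueError(
            f"Incompatible column type change: {old_type} -> {new_type}"
        )


# Allowed: stays within the String family.
check_col_type_change("string", "varchar(20)")

# Allowed only because the check is switched off.
check_col_type_change("varchar(10)", "double", disallow_incompatible=False)

# Rejected with the default setting.
try:
    check_col_type_change("varchar(10)", "decimal(10,2)")
except ValueError as err:
    print(err)
```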





[jira] [Created] (HIVE-22076) JDK11: Remove ParallelGC in debug.sh

2019-08-02 Thread Gopal V (JIRA)
Gopal V created HIVE-22076:
--

 Summary: JDK11: Remove ParallelGC in debug.sh
 Key: HIVE-22076
 URL: https://issues.apache.org/jira/browse/HIVE-22076
 Project: Hive
  Issue Type: Bug
  Components: Diagnosability
Affects Versions: 4.0.0
Reporter: Gopal V


The JDK debug mode no longer depends on ParallelGC 





[jira] [Created] (HIVE-22077) Inserting overwrite partitions clause does not clean directories while partitions' info is not stored in metadata

2019-08-02 Thread Hui An (JIRA)
Hui An created HIVE-22077:
-

 Summary: Inserting overwrite partitions clause does not clean 
directories while partitions' info is not stored in metadata
 Key: HIVE-22077
 URL: https://issues.apache.org/jira/browse/HIVE-22077
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.3.4, 1.1.1, 4.0.0
Reporter: Hui An
Assignee: Hui An


An INSERT OVERWRITE into a static partition may not clean the related HDFS 
location if the partition's info is not stored in the metadata.
Steps to reproduce this issue:

1. Create a managed table:


{code:sql}
CREATE TABLE `test`(
  `id` string)
PARTITIONED BY (
  `dayno` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1564731656');
{code}

2. Create the partition's directory and put some data under it:


{code:bash}
hdfs dfs -mkdir hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
hdfs dfs -put test.data hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
{code}

3. Insert overwrite partition dayno=20190802:


{code:sql}
INSERT OVERWRITE TABLE test PARTITION(dayno='20190802')
SELECT 1;
{code}

4. Observe that test.data under the partition directory is not deleted.



