[jira] [Comment Edited] (HIVE-15773) HCatRecordObjectInspectorFactory is not thread safe

2019-12-15 Thread bianqi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-15773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996958#comment-16996958
 ] 

bianqi edited comment on HIVE-15773 at 12/16/19 4:32 AM:
-

[~dongjoon] When we run the spark-sql program on a single CPU, no exception 
occurs, but when we run it on multiple CPUs, it reports an error. 
We use Spark 2.3.1.


was (Author: bianqi):
[~dongjoon] When we use a single CPU to run the spark-sql program, there will 
be no exceptions, but if we use multiple CPU to run, it will report an error.

> HCatRecordObjectInspectorFactory is not thread safe
> ---
>
> Key: HIVE-15773
> URL: https://issues.apache.org/jira/browse/HIVE-15773
> Project: Hive
>  Issue Type: Bug
>Reporter: David Phillips
>Priority: Major
> Attachments: HIVE-15773.2.patch, HIVE-15773.3.patch, HIVE-15773.patch
>
>
> {{HashMap}} used without synchronization for the caches, which makes the code 
> unsafe for use in a multi-threaded environment such as Presto (or Spark?). 
> The simple fix is to switch them to {{ConcurrentHashMap}}.
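
The fix proposed in the description can be illustrated with a minimal sketch. This is not the code from the attached patches: the class name InspectorCache, its generic key/value types, and the getOrCreate method are hypothetical stand-ins for the factory's caches. It only shows the general pattern of backing a shared cache with a ConcurrentHashMap instead of an unsynchronized HashMap.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for a factory-level cache; NOT the actual Hive class.
final class InspectorCache<K, V> {
    // ConcurrentHashMap tolerates concurrent readers and writers, unlike HashMap.
    private final Map<K, V> cache = new ConcurrentHashMap<>();

    // Returns the cached value for key, creating it at most once per key.
    // The factory function must not modify this cache itself.
    V getOrCreate(K key, Function<K, V> factory) {
        return cache.computeIfAbsent(key, factory);
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) throws InterruptedException {
        InspectorCache<String, Object> cache = new InspectorCache<>();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < 1000; k++) {
                    // Ten distinct keys, requested from eight threads at once.
                    cache.getOrCreate("type-" + (k % 10), name -> new Object());
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(cache.size()); // one entry per distinct key
    }
}
```

With a plain HashMap in place of the ConcurrentHashMap, the same concurrent access pattern has undefined behavior, which would be consistent with errors that show up only when several threads run at once.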



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-15773) HCatRecordObjectInspectorFactory is not thread safe

2019-12-15 Thread bianqi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-15773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996958#comment-16996958
 ] 

bianqi edited comment on HIVE-15773 at 12/16/19 4:28 AM:
-

[~dongjoon] When we use a single CPU to run the spark-sql program, there will 
be no exceptions, but if we use multiple CPU to run, it will report an error.


was (Author: bianqi):
When we use a single CPU to run the spark-sql program, there will be no 
exceptions, but if we use multiple CPU to run, it will report an error.



[jira] [Commented] (HIVE-15773) HCatRecordObjectInspectorFactory is not thread safe

2019-12-15 Thread bianqi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-15773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996958#comment-16996958
 ] 

bianqi commented on HIVE-15773:
---

When we use a single CPU to run the spark-sql program, there will be no 
exceptions, but if we use multiple CPU to run, it will report an error.



[jira] [Commented] (HIVE-22638) Fix insert statement issue with return path

2019-12-15 Thread Jesus Camacho Rodriguez (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996954#comment-16996954
 ] 

Jesus Camacho Rodriguez commented on HIVE-22638:


+1

> Fix insert statement issue with return path
> ---
>
> Key: HIVE-22638
> URL: https://issues.apache.org/jira/browse/HIVE-22638
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22638.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Insert statements were not handled properly with return path. It was revealed 
> during examining why TestUpgradeTool is not working with return path.





[jira] [Commented] (HIVE-22561) Data loss on map join for bucketed, partitioned table

2019-12-15 Thread Jesus Camacho Rodriguez (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996952#comment-16996952
 ] 

Jesus Camacho Rodriguez commented on HIVE-22561:


[~aditya-shah], I am not sure why it was not triggered... Nevertheless, the 
patch does not apply cleanly on branch-3.1.

> Data loss on map join for bucketed, partitioned table
> -
>
> Key: HIVE-22561
> URL: https://issues.apache.org/jira/browse/HIVE-22561
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: Aditya Shah
>Assignee: Aditya Shah
>Priority: Blocker
> Fix For: 3.0.0, 3.1.0
>
> Attachments: HIVE-22561.branch-3.1.patch, HIVE-22561.patch, 
> Screenshot 2019-11-28 at 8.45.17 PM.png, image-2019-11-28-20-46-25-432.png
>
>
> A map join on a column (which is involved in neither bucketing nor partitioning) 
> causes data loss. 
> Steps to reproduce:
> Env: [hive-dev-box|[https://github.com/kgyrtkirk/hive-dev-box]] hive 3.1.2.
> Create tables:
>  
> {code:java}
> CREATE TABLE `testj2`(
>   `id` int, 
>   `bn` string, 
>   `cn` string, 
>   `ad` map<string,int>, 
>   `mi` array<int>)
> PARTITIONED BY ( 
>   `br` string)
> CLUSTERED BY ( 
>   bn) 
> INTO 2 BUCKETS
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ','
> STORED AS TEXTFILE
> TBLPROPERTIES (
>   'bucketing_version'='2');
> CREATE TABLE `testj1`(
>   `id` int, 
>   `can` string, 
>   `cn` string, 
>   `ad` map<string,int>, 
>   `av` boolean, 
>   `mi` array<int>)
> PARTITIONED BY ( 
>   `brand` string)
> CLUSTERED BY ( 
>   can) 
> INTO 2 BUCKETS
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ','
> STORED AS TEXTFILE
> TBLPROPERTIES (
>   'bucketing_version'='2');
> {code}
> insert some data in both:
> {code:java}
> insert into testj1 values (100, 'mes_1', 'customer_1',  map('city1', 560077), 
> false, array(5, 10), 'brand_1'),
> (101, 'mes_2', 'customer_2',  map('city2', 560078), true, array(10, 20), 
> 'brand_2'),
> (102, 'mes_3', 'customer_3',  map('city3', 560079), false, array(15, 30), 
> 'brand_3'),
> (103, 'mes_4', 'customer_4',  map('city4', 560080), true, array(20, 40), 
> 'brand_4'),
> (104, 'mes_5', 'customer_5',  map('city5', 560081), false, array(25, 50), 
> 'brand_5');
> insert into table testj2 values (100, 'tv_0', 'customer_0', map('city0', 
> 560076),array(0, 0, 0), 'tv'),
> (101, 'tv_1', 'customer_1', map('city1', 560077),array(20, 25, 30), 'tv'),
> (102, 'tv_2', 'customer_2', map('city2', 560078),array(40, 50, 60), 'tv'),
> (103, 'tv_3', 'customer_3', map('city3', 560079),array(60, 75, 90), 'tv'),
> (104, 'tv_4', 'customer_4', map('city4', 560080),array(80, 100, 120), 'tv');
> {code}
> Do a join between them:
> {code:java}
> select t1.id, t1.can, t1.cn, t2.bn,t2.ad, t2.br FROM testj1 t1 JOIN testj2 t2 
> on (t1.id = t2.id) order by t1.id;
> {code}
> Observed results:
> !image-2019-11-28-20-46-25-432.png|width=524,height=100!
> In the plan, I can see a map join. Disabling it gives the correct result.
>  
>  





[jira] [Commented] (HIVE-22608) Reduce the number of public methods in Driver

2019-12-15 Thread Jesus Camacho Rodriguez (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996946#comment-16996946
 ] 

Jesus Camacho Rodriguez commented on HIVE-22608:


Are these failures related?

> Reduce the number of public methods in Driver
> -
>
> Key: HIVE-22608
> URL: https://issues.apache.org/jira/browse/HIVE-22608
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22608.01.patch
>
>
> There are 33 public methods in Driver, some of them either don't belong 
> there, or should not be public.





[jira] [Updated] (HIVE-22609) Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots

2019-12-15 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22609:

Attachment: HIVE-22609.3.patch

> Reduce number of FS getFileStatus calls in AcidUtils::getHdfsDirSnapshots
> -
>
> Key: HIVE-22609
> URL: https://issues.apache.org/jira/browse/HIVE-22609
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Major
> Attachments: HIVE-22609.1.patch, HIVE-22609.2.patch, 
> HIVE-22609.3.patch
>
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L1380]
> ACID delta folder contains {{_orc_acid_version}} and {{bucket_0}} files. 
> For both these files, parent dir is the same. Number of getFileStatus in such 
> cases should be reduced by 1/2.
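
The idea in the description, one status lookup per parent directory instead of one per file, can be sketched as follows. This is an illustration only and not the code in the attached patches: ParentStatusCache, statusByParent, and the lookup function are hypothetical, and java.nio.file.Path is used in place of Hadoop's Path so that the example stays self-contained.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper; stands in for the directory-snapshot logic, not a copy of it.
final class ParentStatusCache {
    // Resolves a status for each file's parent directory, invoking the
    // (assumed expensive) lookup at most once per distinct parent.
    // If lookup returns null, nothing is cached and it may be called again.
    static <S> Map<Path, S> statusByParent(List<Path> files, Function<Path, S> lookup) {
        Map<Path, S> byParent = new HashMap<>();
        for (Path file : files) {
            Path parent = file.getParent();
            if (parent != null) {
                byParent.computeIfAbsent(parent, lookup);
            }
        }
        return byParent;
    }

    public static void main(String[] args) {
        // "delta_1_1" is a made-up directory name used only for this example.
        List<Path> files = Arrays.asList(
                Paths.get("delta_1_1", "_orc_acid_version"),
                Paths.get("delta_1_1", "bucket_0"));
        int[] calls = {0};
        statusByParent(files, dir -> {
            calls[0]++;
            return "status-of-" + dir;
        });
        System.out.println(calls[0]); // both files share one parent, so one lookup
    }
}
```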





[jira] [Commented] (HIVE-7292) Hive on Spark

2019-12-15 Thread xushiwei (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996908#comment-16996908
 ] 

xushiwei commented on HIVE-7292:


Does anybody know if Hive supports Spark on k8s?
 

> Hive on Spark
> -
>
> Key: HIVE-7292
> URL: https://issues.apache.org/jira/browse/HIVE-7292
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
>Priority: Major
>  Labels: Spark-M1, Spark-M2, Spark-M3, Spark-M4, Spark-M5
> Fix For: 1.1.0
>
> Attachments: Hive-on-Spark.pdf
>
>
> Spark, as an open-source data analytics cluster computing framework, has gained 
> significant momentum recently. Many Hive users already have Spark installed 
> as their computing backbone. To take advantage of Hive, they still need to 
> have either MapReduce or Tez on their cluster. This initiative will provide 
> users a new alternative so that they can consolidate their backend. 
> Secondly, providing such an alternative further increases Hive's adoption, as 
> it exposes Spark users to a viable, feature-rich, de facto standard SQL tool 
> on Hadoop.
> Finally, allowing Hive to run on Spark also has performance benefits. Hive 
> queries, especially those involving multiple reducer stages, will run faster, 
> thus improving user experience as Tez does.
> This is an umbrella JIRA which will cover many upcoming subtasks. A design doc 
> will be attached here shortly, and will be on the wiki as well. Feedback from 
> the community is greatly appreciated!





[jira] [Updated] (HIVE-22550) Result of hive.query.string in job xml contains encoded string

2019-12-15 Thread zhangbutao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangbutao updated HIVE-22550:
--
Attachment: HIVE-branch-3.1.patch
Status: Patch Available  (was: Open)

> Result of hive.query.string in job xml contains  encoded string
> ---
>
> Key: HIVE-22550
> URL: https://issues.apache.org/jira/browse/HIVE-22550
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1, 3.1.0
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
> Attachments: HIVE-branch-3.1.patch, job xml.JPG
>
>
> repro:
> Query :  *insert into test values(1)*
> The job xml will display *insert+into+test+values%281%29*
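
The encoded string in the report matches standard form URL encoding of the query. The sketch below uses only java.net.URLEncoder and java.net.URLDecoder from the JDK to show the round trip. Whether Hive produces the value in exactly this way is an assumption, and the class name QueryStringEncoding is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not taken from the Hive code base.
final class QueryStringEncoding {
    // Form-URL-encodes the query: spaces become '+', parentheses become %28 and %29.
    static String encode(String query) {
        try {
            return URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    // Reverses encode(), recovering the readable query text.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode("insert into test values(1)");
        System.out.println(encoded);         // insert+into+test+values%281%29
        System.out.println(decode(encoded)); // insert into test values(1)
    }
}
```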
>  





[jira] [Commented] (HIVE-17063) insert overwrite partition onto a external table fail when drop partition first

2019-12-15 Thread smallx (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996896#comment-16996896
 ] 

smallx commented on HIVE-17063:
---

ping [~ashutoshc]

> insert overwrite partition onto a external table fail when drop partition 
> first
> ---
>
> Key: HIVE-17063
> URL: https://issues.apache.org/jira/browse/HIVE-17063
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 1.2.2, 2.1.1, 2.2.0
>Reporter: Wang Haihua
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-17063.1.patch, HIVE-17063.2.patch, 
> HIVE-17063.3.patch, HIVE-17063.4.patch
>
>
> The default value of {{hive.exec.stagingdir}} is a relative path, and 
> dropping a partition of an external table does not clear the underlying data. 
> As a result, running insert overwrite on the same partition a second time 
> fails because the target file to be moved already exists.
> This happened when we regenerated partition data for an external table. 
> The target data is left uncleared only when the {{immediately generated 
> data}} is a child of {{the target data directory}}, so my proposal is to 
> clear any target file that already exists when renaming the {{immediately 
> generated data}} into {{the target data directory}}.
> Operation reproduced:
> {code}
> create external table insert_after_drop_partition(key string, val string) 
> partitioned by (insertdate string);
> from src insert overwrite table insert_after_drop_partition partition 
> (insertdate='2008-01-01') select *;
> alter table insert_after_drop_partition drop partition 
> (insertdate='2008-01-01');
> from src insert overwrite table insert_after_drop_partition partition 
> (insertdate='2008-01-01') select *;
> {code}
> Stack trace:
> {code}
> 2017-07-09T08:32:05,212 ERROR [f3bc51c8-2441-4689-b1c1-d60aef86c3aa main] 
> exec.Task: Failed with exception java.io.IOException: rename for src path: 
> pfile:/data/haihua/official/hive/itests/qtest/target/warehouse/insert_after_drop_partition/insertdate=2008-01-01/.hive-staging_hive_2017-07-09_08-32-03_840_4046825276907030554-1/-ext-1/00_0
>  to dest 
> path:pfile:/data/haihua/official/hive/itests/qtest/target/warehouse/insert_after_drop_partition/insertdate=2008-01-01/00_0
>  returned false
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: rename 
> for src path: 
> pfile:/data/haihua/official/hive/itests/qtest/target/warehouse/insert_after_drop_partition/insertdate=2008-01-01/.hive-staging_hive_2017-07-09_08-32-03_840_4046825276907030554-1/-ext-1/00_0
>  to dest 
> path:pfile:/data/haihua/official/hive/itests/qtest/target/warehouse/insert_after_drop_partition/insertdate=2008-01-01/00_0
>  returned false
> at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2992)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3248)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1532)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1461)
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:498)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1137)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:120)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_after_drop_partition(TestCliDriver.java:103)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> 

[jira] [Comment Edited] (HIVE-17063) insert overwrite partition onto a external table fail when drop partition first

2019-12-15 Thread smallx (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996891#comment-16996891
 ] 

smallx edited comment on HIVE-17063 at 12/15/19 11:19 PM:
--

[~wanghaihua] [~djaiswal]

When the replace flag is true, we should delete all files in the target path 
except the source directory and hidden files, not only the file with the rename 
conflict; otherwise it may cause data duplication or other unexpected results.

We need to consider this case: the number of files becomes smaller when Hive 
inserts data again.

Or this case: spark-sql inserts data, the partition is dropped, and then Hive 
inserts data. Because the file names are different, the data inserted by 
spark-sql is not replaced, and the data is duplicated.
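
The cleanup rule proposed in this comment can be sketched as follows. This is a minimal illustration on the local file system with java.nio.file, not Hive's implementation. The class ReplaceTargetCleaner and its methods are hypothetical, and treating names that start with "_" or "." as hidden is an assumption made for the example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; illustrates the proposed rule only.
final class ReplaceTargetCleaner {
    // Assumption for this sketch: names starting with "_" or "." count as hidden.
    static boolean isHiddenName(Path path) {
        String name = path.getFileName().toString();
        return name.startsWith("_") || name.startsWith(".");
    }

    // Deletes every entry directly under targetDir, except hidden entries and
    // the entry that is, or contains, sourceDir. sourceDir is expected to be
    // expressed relative to the same root as targetDir.
    static void clearTarget(Path targetDir, Path sourceDir) throws IOException {
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(targetDir)) {
            for (Path entry : stream) {
                entries.add(entry); // collect first, delete after the listing is closed
            }
        }
        for (Path entry : entries) {
            boolean holdsSource = sourceDir.startsWith(entry);
            if (!isHiddenName(entry) && !holdsSource) {
                deleteRecursively(entry);
            }
        }
    }

    // Removes a file, or a directory together with everything below it.
    static void deleteRecursively(Path path) throws IOException {
        if (Files.isDirectory(path, LinkOption.NOFOLLOW_LINKS)) {
            List<Path> children = new ArrayList<>();
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(path)) {
                for (Path child : stream) {
                    children.add(child);
                }
            }
            for (Path child : children) {
                deleteRecursively(child);
            }
        }
        Files.delete(path);
    }
}
```

Under this rule, a leftover file written earlier by another engine under a different name is removed before the new files are moved in, which addresses both cases described above.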


was (Author: smallx):
[~wanghaihua] [~djaiswal]

When the replace flag is true, we should delete all files in the target path 
except the source directory and hidden files, not only the file with rename 
conflict, otherwise it may cause data duplication or unexpected.
We need to consider this case: the number of files becomes smaller when hive 
inserts data again.
Or this case: after spark-sql inserts data, drop partition, and then hive 
inserts data. Because the file names are different, the data inserted by 
spark-sql will not be replaced, and the data will double at this time.

[jira] [Commented] (HIVE-17063) insert overwrite partition onto a external table fail when drop partition first

2019-12-15 Thread smallx (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-17063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996891#comment-16996891
 ] 

smallx commented on HIVE-17063:
---

[~wanghaihua] [~djaiswal]

When the replace flag is true, we should delete all files in the target path 
except the source directory and hidden files, not only the file with rename 
conflict, otherwise it may cause data duplication or unexpected.
We need to consider this case: the number of files becomes smaller when hive 
inserts data again.
Or this case: after spark-sql inserts data, drop partition, and then hive 
inserts data. Because the file names are different, the data inserted by 
spark-sql will not be replaced, and the data will double at this time.


[jira] [Commented] (HIVE-21971) HS2 leaks classloader due to `ReflectionUtils::CONSTRUCTOR_CACHE` with temporary functions + GenericUDF

2019-12-15 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996890#comment-16996890
 ] 

Rajesh Balamohan commented on HIVE-21971:
-

Thanks [~ashutoshc]. Uploading .2 version (same as .1) to trigger tests. Will 
commit it after tests complete in master.

> HS2 leaks classloader due to `ReflectionUtils::CONSTRUCTOR_CACHE` with 
> temporary functions + GenericUDF
> ---
>
> Key: HIVE-21971
> URL: https://issues.apache.org/jira/browse/HIVE-21971
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.4
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Critical
> Attachments: HIVE-21971.1.patch, HIVE-21971.2.patch
>
>
> https://issues.apache.org/jira/browse/HIVE-10329 helped in moving away from 
> hadoop's ReflectionUtils constructor cache issue 
> (https://issues.apache.org/jira/browse/HADOOP-10513).
> However, there are corner cases where hadoop's {{ReflectionUtils}} is in use 
> and this causes gradual build up of memory in HS2.
> I have observed this in Hive 2.3. But the codepath in master for this has not 
> changed much.
> Easiest way to repro would be to add a temp function which extends 
> {{GenericUDF}}. In {{FunctionRegistry::cloneGenericUDF}}, this would 
> end up using {{org.apache.hadoop.util.ReflectionUtils.newInstance}}, which in 
> turn lands in the CONSTRUCTOR_CACHE of ReflectionUtils. 
> {noformat}
> CREATE TEMPORARY FUNCTION dummy AS 'com.hive.test.DummyGenericUDF' USING JAR 
> 'file:///home/test/udf/dummy.jar';
> select dummy();
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:107)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.cloneGenericUDF(FunctionRegistry.java:1353)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionInfo.getGenericUDF(FunctionInfo.java:122)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:983)
>   at 
> org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1359)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>   at 
> org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> {noformat}
> Note: Reflection based invocation of hadoop's {{ReflectionUtils::clear}} was 
> removed in 2.x. 
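
Why a static constructor cache can keep a classloader alive can be shown with a simplified sketch. This is not Hadoop's ReflectionUtils and not the attached patch: ConstructorCacheSketch and both of its methods are hypothetical. The point is only that a static map keyed by Class holds a strong reference to each Class, and every Class strongly references the classloader that defined it.

```java
import java.lang.reflect.Constructor;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified, hypothetical stand-in for a static constructor cache.
final class ConstructorCacheSketch {
    // Static and never cleared: each cached Class stays reachable, and so
    // does the classloader that loaded it (for example a per-session loader).
    static final Map<Class<?>, Constructor<?>> CACHE = new ConcurrentHashMap<>();

    // Cached path: the first call for a class registers its constructor in CACHE.
    static <T> T newInstanceCached(Class<T> cls) throws ReflectiveOperationException {
        Constructor<?> ctor = CACHE.get(cls);
        if (ctor == null) {
            ctor = cls.getDeclaredConstructor();
            CACHE.put(cls, ctor);
        }
        return cls.cast(ctor.newInstance());
    }

    // Uncached path: nothing is retained after the call returns.
    static <T> T newInstanceUncached(Class<T> cls) throws ReflectiveOperationException {
        return cls.getDeclaredConstructor().newInstance();
    }
}
```

In this sketch, instantiating a class from a temporary function's JAR through the uncached path leaves no static reference behind, so the JAR's classloader can be collected once the function is dropped.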





[jira] [Updated] (HIVE-21971) HS2 leaks classloader due to `ReflectionUtils::CONSTRUCTOR_CACHE` with temporary functions + GenericUDF

2019-12-15 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-21971:

Attachment: HIVE-21971.2.patch



[jira] [Updated] (HIVE-22485) Cross product should set the conf in UnorderedPartitionedKVEdgeConfig

2019-12-15 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22485:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Thanks [~ashutoshc]. Committed to master. 

> Cross product should set the conf in UnorderedPartitionedKVEdgeConfig
> -
>
> Key: HIVE-22485
> URL: https://issues.apache.org/jira/browse/HIVE-22485
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Trivial
> Fix For: 4.0.0
>
> Attachments: HIVE-22485.1.patch
>
>
> SSL and other options would not be sent correctly if this is not set up.
>  
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java#L545





[jira] [Updated] (HIVE-22649) Fix TestHiveCli: scratchdir should be writable

2019-12-15 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-22649:
--
Status: Patch Available  (was: Open)

> Fix TestHiveCli: scratchdir should be writable
> --
>
> Key: HIVE-22649
> URL: https://issues.apache.org/jira/browse/HIVE-22649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22649.1.patch
>
>
> Error applying authorization policy on hive configuration: The dir: /tmp/hive 
> on HDFS should be writable. Current permissions are: rwxr-xr-x
> SessionState.java
> {code}
>   private Path createRootHDFSDir(HiveConf conf) throws IOException {
>     Path rootHDFSDirPath = new Path(HiveConf.getVar(conf, HiveConf.ConfVars.SCRATCHDIR));
>     *Utilities.ensurePathIsWritable(rootHDFSDirPath, conf);*
>     return rootHDFSDirPath;
>   }
> {code}





[jira] [Comment Edited] (HIVE-22649) Fix TestHiveCli: scratchdir should be writable

2019-12-15 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996885#comment-16996885
 ] 

Denys Kuzmenko edited comment on HIVE-22649 at 12/15/19 10:31 PM:
--

[~jdere]: could this be caused by 
[HIVE-22599|https://issues.apache.org/jira/browse/HIVE-22599] ?

I made a quick fix for now; however, we should try to find the actual root 
cause. Is it possible that some query results cache test does this? 
{code}
  @BeforeClass
  public static void init() {
    // something changed scratch dir permissions, so test can't execute
    HiveConf hiveConf = new HiveConf();
    String scratchDir = hiveConf.get(HiveConf.ConfVars.SCRATCHDIR.varname);
    File file = new File(scratchDir);
    if (file.exists()) {
      file.setWritable(true, false);
    }
  }
{code}
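
The guard in the quick fix can be sketched with plain JDK calls, assuming the scratch dir is a local directory as in the test setup above. The names ScratchDirCheck and ensureLocalDirWritable are hypothetical, and the sketch deliberately does not touch HDFS or HiveConf; it only reports whether the directory ends up writable, which may help when narrowing down what changes the permissions.

```java
import java.io.File;

// Hypothetical helper, for illustration only; not part of Hive.
final class ScratchDirCheck {
    // Returns true if the directory exists and is writable once the call returns.
    static boolean ensureLocalDirWritable(File dir) {
        if (!dir.isDirectory()) {
            return false;
        }
        if (!dir.canWrite()) {
            // Same call as the quick fix: grant write permission to all users.
            dir.setWritable(true, false);
        }
        return dir.canWrite();
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"),
                "scratch-sketch-" + System.nanoTime());
        dir.mkdirs();
        dir.setWritable(false, false); // simulate "something changed the permissions"
        System.out.println(ensureLocalDirWritable(dir));
        dir.delete();
    }
}
```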


was (Author: dkuzmenko):
[~jdere]: could this be caused by 
[HIVE-22599|https://issues.apache.org/jira/browse/HIVE-22599] ?

I made a quick fix for now; however, we should try to find the actual root cause:
{code}
  @BeforeClass
  public static void init() {
    // something changed scratch dir permissions, so test can't execute
    HiveConf hiveConf = new HiveConf();
    String scratchDir = hiveConf.get(HiveConf.ConfVars.SCRATCHDIR.varname);
    File file = new File(scratchDir);
    if (file.exists()) {
      file.setWritable(true, false);
    }
  }
{code}

> Fix TestHiveCli: scratchdir should be writable
> --
>
> Key: HIVE-22649
> URL: https://issues.apache.org/jira/browse/HIVE-22649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22649.1.patch
>
>
> Error applying authorization policy on hive configuration: The dir: /tmp/hive 
> on HDFS should be writable. Current permissions are: rwxr-xr-x
> SessionState.java
> {code}
>   private Path createRootHDFSDir(HiveConf conf) throws IOException {
>     Path rootHDFSDirPath = new Path(HiveConf.getVar(conf, HiveConf.ConfVars.SCRATCHDIR));
>     *Utilities.ensurePathIsWritable(rootHDFSDirPath, conf);*
>     return rootHDFSDirPath;
>   }
> {code}





[jira] [Comment Edited] (HIVE-22649) Fix TestHiveCli: scratchdir should be writable

2019-12-15 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996885#comment-16996885
 ] 

Denys Kuzmenko edited comment on HIVE-22649 at 12/15/19 10:27 PM:
--

[~jdere]: could this be caused by 
[HIVE-22599|https://issues.apache.org/jira/browse/HIVE-22599] ?

I made a quick fix for now; however, we should try to find the actual root cause:
{code}
  @BeforeClass
  public static void init() {
    // something changed scratch dir permissions, so test can't execute
    HiveConf hiveConf = new HiveConf();
    String scratchDir = hiveConf.get(HiveConf.ConfVars.SCRATCHDIR.varname);
    File file = new File(scratchDir);
    if (file.exists()) {
      file.setWritable(true, false);
    }
  }
{code}


was (Author: dkuzmenko):
[~jdere]: could this be caused by 
[HIVE-22599,https://issues.apache.org/jira/browse/HIVE-22599] ?

I made a quick fix for now; however, we should try to find the actual root cause:
{code}
  @BeforeClass
  public static void init() {
    // something changed scratch dir permissions, so test can't execute
    HiveConf hiveConf = new HiveConf();
    String scratchDir = hiveConf.get(HiveConf.ConfVars.SCRATCHDIR.varname);
    File file = new File(scratchDir);
    if (file.exists()) {
      file.setWritable(true, false);
    }
  }
{code}

> Fix TestHiveCli: scratchdir should be writable
> --
>
> Key: HIVE-22649
> URL: https://issues.apache.org/jira/browse/HIVE-22649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22649.1.patch
>
>
> Error applying authorization policy on hive configuration: The dir: /tmp/hive 
> on HDFS should be writable. Current permissions are: rwxr-xr-x
> SessionState.java
> {code}
>   private Path createRootHDFSDir(HiveConf conf) throws IOException {
>     Path rootHDFSDirPath = new Path(HiveConf.getVar(conf, HiveConf.ConfVars.SCRATCHDIR));
>     *Utilities.ensurePathIsWritable(rootHDFSDirPath, conf);*
>     return rootHDFSDirPath;
>   }
> {code}





[jira] [Updated] (HIVE-22649) Fix TestHiveCli: scratchdir should be writable

2019-12-15 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-22649:
--
Description: 
Error applying authorization policy on hive configuration: The dir: /tmp/hive 
on HDFS should be writable. Current permissions are: rwxr-xr-x

SessionState.java
{code}
  private Path createRootHDFSDir(HiveConf conf) throws IOException {
    Path rootHDFSDirPath = new Path(HiveConf.getVar(conf, HiveConf.ConfVars.SCRATCHDIR));
    *Utilities.ensurePathIsWritable(rootHDFSDirPath, conf);*
    return rootHDFSDirPath;
  }
{code}

  was:Error applying authorization policy on hive configuration: The dir: 
/tmp/hive on HDFS should be writable. Current permissions are: rwxr-xr-x


> Fix TestHiveCli: scratchdir should be writable
> --
>
> Key: HIVE-22649
> URL: https://issues.apache.org/jira/browse/HIVE-22649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22649.1.patch
>
>
> Error applying authorization policy on hive configuration: The dir: /tmp/hive 
> on HDFS should be writable. Current permissions are: rwxr-xr-x
> SessionState.java
> {code}
>   private Path createRootHDFSDir(HiveConf conf) throws IOException {
>     Path rootHDFSDirPath = new Path(HiveConf.getVar(conf, HiveConf.ConfVars.SCRATCHDIR));
>     *Utilities.ensurePathIsWritable(rootHDFSDirPath, conf);*
>     return rootHDFSDirPath;
>   }
> {code}





[jira] [Comment Edited] (HIVE-22649) Fix TestHiveCli: scratchdir should be writable

2019-12-15 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996885#comment-16996885
 ] 

Denys Kuzmenko edited comment on HIVE-22649 at 12/15/19 10:23 PM:
--

[~jdere]: could this be caused by 
[HIVE-22599,https://issues.apache.org/jira/browse/HIVE-22599] ?

I made a quick fix for now; however, we should try to find the actual root cause:
{code}
  @BeforeClass
  public static void init() {
    // something changed scratch dir permissions, so test can't execute
    HiveConf hiveConf = new HiveConf();
    String scratchDir = hiveConf.get(HiveConf.ConfVars.SCRATCHDIR.varname);
    File file = new File(scratchDir);
    if (file.exists()) {
      file.setWritable(true, false);
    }
  }
{code}


was (Author: dkuzmenko):
[~jdere]: could this be caused by 
[HIVE-22599,https://issues.apache.org/jira/browse/HIVE-22599] ?

> Fix TestHiveCli: scratchdir should be writable
> --
>
> Key: HIVE-22649
> URL: https://issues.apache.org/jira/browse/HIVE-22649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22649.1.patch
>
>
> Error applying authorization policy on hive configuration: The dir: /tmp/hive 
> on HDFS should be writable. Current permissions are: rwxr-xr-x





[jira] [Updated] (HIVE-22649) Fix TestHiveCli: scratchdir should be writable

2019-12-15 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-22649:
--
Description: Error applying authorization policy on hive configuration: The 
dir: /tmp/hive on HDFS should be writable. Current permissions are: rwxr-xr-x

> Fix TestHiveCli: scratchdir should be writable
> --
>
> Key: HIVE-22649
> URL: https://issues.apache.org/jira/browse/HIVE-22649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22649.1.patch
>
>
> Error applying authorization policy on hive configuration: The dir: /tmp/hive 
> on HDFS should be writable. Current permissions are: rwxr-xr-x





[jira] [Commented] (HIVE-22649) Fix TestHiveCli: scratchdir should be writable

2019-12-15 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996885#comment-16996885
 ] 

Denys Kuzmenko commented on HIVE-22649:
---

[~jdere]: could this be caused by 
[HIVE-22599,https://issues.apache.org/jira/browse/HIVE-22599] ?

> Fix TestHiveCli: scratchdir should be writable
> --
>
> Key: HIVE-22649
> URL: https://issues.apache.org/jira/browse/HIVE-22649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22649.1.patch
>
>






[jira] [Assigned] (HIVE-22649) Fix TestHiveCli: scratchdir should be writable

2019-12-15 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko reassigned HIVE-22649:
-

Assignee: Denys Kuzmenko

> Fix TestHiveCli: scratchdir should be writable
> --
>
> Key: HIVE-22649
> URL: https://issues.apache.org/jira/browse/HIVE-22649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22649.1.patch
>
>






[jira] [Updated] (HIVE-22649) Fix TestHiveCli: scratchdir should be writable

2019-12-15 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-22649:
--
Attachment: HIVE-22649.1.patch

> Fix TestHiveCli: scratchdir should be writable
> --
>
> Key: HIVE-22649
> URL: https://issues.apache.org/jira/browse/HIVE-22649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22649.1.patch
>
>






[jira] [Updated] (HIVE-22649) Fix TestHiveCli: scratchdir should be writable

2019-12-15 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-22649:
--
Summary: Fix TestHiveCli: scratchdir should be writable  (was: TestHiveCli 
fix: scratchdir should be writable)

> Fix TestHiveCli: scratchdir should be writable
> --
>
> Key: HIVE-22649
> URL: https://issues.apache.org/jira/browse/HIVE-22649
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-22649.1.patch
>
>






[jira] [Updated] (HIVE-20150) TopNKey pushdown

2019-12-15 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-20150:
--
Attachment: HIVE-20150.29.patch

> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.10.patch, 
> HIVE-20150.11.patch, HIVE-20150.11.patch, HIVE-20150.14.patch, 
> HIVE-20150.15.patch, HIVE-20150.16.patch, HIVE-20150.17.patch, 
> HIVE-20150.17.patch, HIVE-20150.18.patch, HIVE-20150.18.patch, 
> HIVE-20150.19.patch, HIVE-20150.2.patch, HIVE-20150.20.patch, 
> HIVE-20150.21.patch, HIVE-20150.22.patch, HIVE-20150.23.patch, 
> HIVE-20150.24.patch, HIVE-20150.25.patch, HIVE-20150.25.patch, 
> HIVE-20150.26.patch, HIVE-20150.27.patch, HIVE-20150.28.patch, 
> HIVE-20150.29.patch, HIVE-20150.29.patch, HIVE-20150.29.patch, 
> HIVE-20150.4.patch, HIVE-20150.5.patch, HIVE-20150.6.patch, 
> HIVE-20150.7.patch, HIVE-20150.8.patch, HIVE-20150.9.patch
>
>
> The TopNKey operator was implemented in HIVE-17896, but its pushdown 
> implementation needs more work. This issue covers the TopNKey pushdown 
> implementation with proper tests.





[jira] [Updated] (HIVE-20150) TopNKey pushdown

2019-12-15 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-20150:
--
Status: Patch Available  (was: Open)

> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.10.patch, 
> HIVE-20150.11.patch, HIVE-20150.11.patch, HIVE-20150.14.patch, 
> HIVE-20150.15.patch, HIVE-20150.16.patch, HIVE-20150.17.patch, 
> HIVE-20150.17.patch, HIVE-20150.18.patch, HIVE-20150.18.patch, 
> HIVE-20150.19.patch, HIVE-20150.2.patch, HIVE-20150.20.patch, 
> HIVE-20150.21.patch, HIVE-20150.22.patch, HIVE-20150.23.patch, 
> HIVE-20150.24.patch, HIVE-20150.25.patch, HIVE-20150.25.patch, 
> HIVE-20150.26.patch, HIVE-20150.27.patch, HIVE-20150.28.patch, 
> HIVE-20150.29.patch, HIVE-20150.29.patch, HIVE-20150.29.patch, 
> HIVE-20150.4.patch, HIVE-20150.5.patch, HIVE-20150.6.patch, 
> HIVE-20150.7.patch, HIVE-20150.8.patch, HIVE-20150.9.patch
>
>
> The TopNKey operator was implemented in HIVE-17896, but its pushdown 
> implementation needs more work. This issue covers the TopNKey pushdown 
> implementation with proper tests.





[jira] [Updated] (HIVE-20150) TopNKey pushdown

2019-12-15 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-20150:
--
Status: Open  (was: Patch Available)

> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.10.patch, 
> HIVE-20150.11.patch, HIVE-20150.11.patch, HIVE-20150.14.patch, 
> HIVE-20150.15.patch, HIVE-20150.16.patch, HIVE-20150.17.patch, 
> HIVE-20150.17.patch, HIVE-20150.18.patch, HIVE-20150.18.patch, 
> HIVE-20150.19.patch, HIVE-20150.2.patch, HIVE-20150.20.patch, 
> HIVE-20150.21.patch, HIVE-20150.22.patch, HIVE-20150.23.patch, 
> HIVE-20150.24.patch, HIVE-20150.25.patch, HIVE-20150.25.patch, 
> HIVE-20150.26.patch, HIVE-20150.27.patch, HIVE-20150.28.patch, 
> HIVE-20150.29.patch, HIVE-20150.29.patch, HIVE-20150.4.patch, 
> HIVE-20150.5.patch, HIVE-20150.6.patch, HIVE-20150.7.patch, 
> HIVE-20150.8.patch, HIVE-20150.9.patch
>
>
> The TopNKey operator was implemented in HIVE-17896, but its pushdown 
> implementation needs more work. This issue covers the TopNKey pushdown 
> implementation with proper tests.





[jira] [Updated] (HIVE-22510) Support decimal64 operations for column operands with different scales

2019-12-15 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22510:

Status: Open  (was: Patch Available)

> Support decimal64 operations for column operands with different scales
> --
>
> Key: HIVE-22510
> URL: https://issues.apache.org/jira/browse/HIVE-22510
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22510.11.patch, HIVE-22510.13.patch, 
> HIVE-22510.14.patch, HIVE-22510.15.patch, HIVE-22510.16.patch, 
> HIVE-22510.2.patch, HIVE-22510.3.patch, HIVE-22510.4.patch, 
> HIVE-22510.5.patch, HIVE-22510.7.patch, HIVE-22510.9.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now, if the operands of a decimal64 operation are columns with 
> different scales, we do not use the decimal64 vectorized version and instead 
> fall back to the HiveDecimal vectorized version of the operator. In this 
> Jira, we will check whether we can use the decimal64 vectorized version even 
> if the scales are different.
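
One way to picture the different-scale case, assuming decimal64 values are held as an unscaled long plus a scale: bring both operands to the larger scale, then operate directly on the longs. This is only an illustration. The class and method names are hypothetical, and it is not necessarily what the patch does. Overflow surfaces as an ArithmeticException, which is the point where a fallback to the HiveDecimal path would still be needed.

```java
import java.math.BigDecimal;

// Hypothetical illustration of scale alignment on long-backed decimals; not Hive code.
final class Decimal64ScaleSketch {
    // Multiplies the unscaled value by 10^(toScale - fromScale); toScale >= fromScale.
    static long rescale(long unscaled, int fromScale, int toScale) {
        long factor = 1L;
        for (int i = fromScale; i < toScale; i++) {
            factor = Math.multiplyExact(factor, 10L); // throws ArithmeticException on overflow
        }
        return Math.multiplyExact(unscaled, factor);
    }

    // Adds two values with possibly different scales; the result has scale max(scaleA, scaleB).
    static long addWithScales(long a, int scaleA, long b, int scaleB) {
        int common = Math.max(scaleA, scaleB);
        return Math.addExact(rescale(a, scaleA, common), rescale(b, scaleB, common));
    }

    public static void main(String[] args) {
        // 1.5 is (15, scale 1) and 2.25 is (225, scale 2).
        long sum = addWithScales(15L, 1, 225L, 2);
        System.out.println(BigDecimal.valueOf(sum, 2)); // prints 3.75
    }
}
```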





[jira] [Updated] (HIVE-22510) Support decimal64 operations for column operands with different scales

2019-12-15 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22510:

Attachment: HIVE-22510.16.patch
Status: Patch Available  (was: Open)

> Support decimal64 operations for column operands with different scales
> --
>
> Key: HIVE-22510
> URL: https://issues.apache.org/jira/browse/HIVE-22510
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22510.11.patch, HIVE-22510.13.patch, 
> HIVE-22510.14.patch, HIVE-22510.15.patch, HIVE-22510.16.patch, 
> HIVE-22510.2.patch, HIVE-22510.3.patch, HIVE-22510.4.patch, 
> HIVE-22510.5.patch, HIVE-22510.7.patch, HIVE-22510.9.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now, if the operands of a decimal64 operation are columns with 
> different scales, we do not use the decimal64 vectorized version and instead 
> fall back to the HiveDecimal vectorized version of the operator. In this 
> Jira, we will check whether we can use the decimal64 vectorized version even 
> if the scales are different.





[jira] [Updated] (HIVE-22510) Support decimal64 operations for column operands with different scales

2019-12-15 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22510:

Attachment: (was: HIVE-22510.16.patch)

> Support decimal64 operations for column operands with different scales
> --
>
> Key: HIVE-22510
> URL: https://issues.apache.org/jira/browse/HIVE-22510
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22510.11.patch, HIVE-22510.13.patch, 
> HIVE-22510.14.patch, HIVE-22510.15.patch, HIVE-22510.16.patch, 
> HIVE-22510.2.patch, HIVE-22510.3.patch, HIVE-22510.4.patch, 
> HIVE-22510.5.patch, HIVE-22510.7.patch, HIVE-22510.9.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now, if the operands of a decimal64 operation are columns with 
> different scales, we do not use the decimal64 vectorized version and instead 
> fall back to the HiveDecimal vectorized version of the operator. In this 
> Jira, we will check whether we can use the decimal64 vectorized version even 
> if the scales are different.





[jira] [Updated] (HIVE-22608) Reduce the number of public methods in Driver

2019-12-15 Thread Miklos Gergely (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-22608:
--
Attachment: (was: HIVE-22608.01.patch)

> Reduce the number of public methods in Driver
> -
>
> Key: HIVE-22608
> URL: https://issues.apache.org/jira/browse/HIVE-22608
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22608.01.patch
>
>
> There are 33 public methods in Driver; some of them either don't belong 
> there or should not be public.





[jira] [Updated] (HIVE-22608) Reduce the number of public methods in Driver

2019-12-15 Thread Miklos Gergely (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-22608:
--
Attachment: HIVE-22608.01.patch

> Reduce the number of public methods in Driver
> -
>
> Key: HIVE-22608
> URL: https://issues.apache.org/jira/browse/HIVE-22608
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22608.01.patch
>
>
> There are 33 public methods in Driver; some of them either don't belong 
> there or should not be public.


