[jira] [Commented] (HIVE-17812) Move remaining classes that HiveMetaStore depends on

2017-11-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235202#comment-16235202
 ] 

Hive QA commented on HIVE-17812:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12895310/HIVE-17812.4.patch

{color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11349 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=223)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7591/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7591/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7591/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12895310 - PreCommit-HIVE-Build

> Move remaining classes that HiveMetaStore depends on 
> -
>
> Key: HIVE-17812
> URL: https://issues.apache.org/jira/browse/HIVE-17812
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-17812.2.patch, HIVE-17812.3.patch, 
> HIVE-17812.4.patch, HIVE-17812.patch
>
>
> There are several remaining pieces that need to be moved before we can move 
> HiveMetaStore itself.  These include NotificationListener and its 
> implementations, Events, AlterHandler, and a few other miscellaneous pieces.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17767) Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN

2017-11-01 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17767:
---
Attachment: HIVE-17767.4.patch

> Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
> ---
>
> Key: HIVE-17767
> URL: https://issues.apache.org/jira/browse/HIVE-17767
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17767.1.patch, HIVE-17767.2.patch, 
> HIVE-17767.3.patch, HIVE-17767.4.patch
>
>
> Currently such queries are rewritten into a group by + inner join with a value 
> generator, which is inefficient. The value generator consists of a join with 
> the outer query to fetch all correlated values. This value generator could be 
> completely eliminated if such queries were instead rewritten into a LEFT SEMI 
> JOIN.
> Note that to do this, Hive first needs to support LEFT SEMI JOIN with a 
> non-equi condition (HIVE-17766).
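
The equivalence behind the proposed rewrite can be illustrated outside Hive. The snippet below is a minimal sketch, not Hive planner code: the tables `outer_rows` and `inner_rows`, the column names, and both helper functions are hypothetical, and only an equality correlation is covered (the non-equi case is what HIVE-17766 is about). It evaluates a correlated EXISTS row by row and compares the result with a left semi join, which emits each outer row at most once when a match exists.

```python
# Hypothetical data; names are illustrative only and do not come from Hive.
outer_rows = [
    {"id": 1, "key": "a"},
    {"id": 2, "key": "b"},
    {"id": 3, "key": "c"},
]
inner_rows = [
    {"key": "a", "val": 10},
    {"key": "a", "val": 20},  # a second match must not duplicate the outer row
    {"key": "c", "val": 30},
]


def correlated_exists(outer, inner):
    """SELECT * FROM outer o WHERE EXISTS (SELECT 1 FROM inner i WHERE i.key = o.key)."""
    return [o for o in outer if any(i["key"] == o["key"] for i in inner)]


def left_semi_join(outer, inner):
    """SELECT o.* FROM outer o LEFT SEMI JOIN inner i ON o.key = i.key."""
    inner_keys = {i["key"] for i in inner}  # build side: distinct join keys
    return [o for o in outer if o["key"] in inner_keys]


exists_result = correlated_exists(outer_rows, inner_rows)
semi_result = left_semi_join(outer_rows, inner_rows)
print(semi_result)
```

Both forms return the same rows here; the semi join only needs the distinct keys of the inner side, with no separate step that first collects correlated values from the outer query.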





[jira] [Updated] (HIVE-17767) Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN

2017-11-01 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17767:
---
Status: Patch Available  (was: Open)

> Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
> ---
>
> Key: HIVE-17767
> URL: https://issues.apache.org/jira/browse/HIVE-17767
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17767.1.patch, HIVE-17767.2.patch, 
> HIVE-17767.3.patch, HIVE-17767.4.patch
>
>
> Currently such queries are rewritten into a group by + inner join with a value 
> generator, which is inefficient. The value generator consists of a join with 
> the outer query to fetch all correlated values. This value generator could be 
> completely eliminated if such queries were instead rewritten into a LEFT SEMI 
> JOIN.
> Note that to do this, Hive first needs to support LEFT SEMI JOIN with a 
> non-equi condition (HIVE-17766).





[jira] [Updated] (HIVE-17767) Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN

2017-11-01 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17767:
---
Status: Open  (was: Patch Available)

> Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
> ---
>
> Key: HIVE-17767
> URL: https://issues.apache.org/jira/browse/HIVE-17767
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17767.1.patch, HIVE-17767.2.patch, 
> HIVE-17767.3.patch
>
>
> Currently such queries are rewritten into a group by + inner join with a value 
> generator, which is inefficient. The value generator consists of a join with 
> the outer query to fetch all correlated values. This value generator could be 
> completely eliminated if such queries were instead rewritten into a LEFT SEMI 
> JOIN.
> Note that to do this, Hive first needs to support LEFT SEMI JOIN with a 
> non-equi condition (HIVE-17766).





[jira] [Commented] (HIVE-17907) enable and apply resource plan commands in HS2

2017-11-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235152#comment-16235152
 ] 

Hive QA commented on HIVE-17907:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12895329/HIVE-17907.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11353 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=223)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=230)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7590/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7590/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7590/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12895329 - PreCommit-HIVE-Build

> enable and apply resource plan commands in HS2
> --
>
> Key: HIVE-17907
> URL: https://issues.apache.org/jira/browse/HIVE-17907
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-17907.only.nogen.patch, HIVE-17907.patch
>
>
> Enabling and applying the RP should only be runnable in HS2 with active WM. 
> Both should validate the full resource plan (or at least enable should; users 
> cannot modify the RP via normal means once enabled, but it might be worth 
> double checking since we have to fetch it anyway to apply).
> Then, apply should propagate the resource plan to the WM instance.





[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2017-11-01 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235137#comment-16235137
 ] 

Xuefu Zhang commented on HIVE-17486:


[~kellyzly] I think M->M->R is possible. It's just that the current planner 
doesn't do this, but in theory it can be done. Currently the assumption is that 
a Map task is always followed by a Reduce task. 

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
>Priority: Major
> Attachments: scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. The optimization is carried out at the 
> physical level.  Hive on Spark caches the result of a Spark work if that work 
> is used by more than one child Spark work. After SharedWorkOptimizer is 
> enabled in the physical plan in HoS, identical table scans are merged into one 
> table scan. The result of this table scan is then used by more than one child 
> Spark work, so the cache mechanism avoids repeating the same computation.





[jira] [Updated] (HIVE-17528) Add more q-tests for Hive-on-Spark with Parquet vectorized reader

2017-11-01 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-17528:

Attachment: HIVE-17528.patch

> Add more q-tests for Hive-on-Spark with Parquet vectorized reader
> -
>
> Key: HIVE-17528
> URL: https://issues.apache.org/jira/browse/HIVE-17528
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-17528.patch
>
>
> Most of the vectorization related q-tests operate on ORC tables using Tez. It 
> would be good to add more coverage on a different combination of engine and 
> file-format. We can model existing q-tests using parquet tables and run them 
> using TestSparkCliDriver.





[jira] [Updated] (HIVE-17528) Add more q-tests for Hive-on-Spark with Parquet vectorized reader

2017-11-01 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-17528:

Status: Patch Available  (was: Open)

> Add more q-tests for Hive-on-Spark with Parquet vectorized reader
> -
>
> Key: HIVE-17528
> URL: https://issues.apache.org/jira/browse/HIVE-17528
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-17528.patch
>
>
> Most of the vectorization related q-tests operate on ORC tables using Tez. It 
> would be good to add more coverage on a different combination of engine and 
> file-format. We can model existing q-tests using parquet tables and run them 
> using TestSparkCliDriver.





[jira] [Updated] (HIVE-17528) Add more q-tests for Hive-on-Spark with Parquet vectorized reader

2017-11-01 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-17528:

Attachment: (was: HIVE-17528.patch)

> Add more q-tests for Hive-on-Spark with Parquet vectorized reader
> -
>
> Key: HIVE-17528
> URL: https://issues.apache.org/jira/browse/HIVE-17528
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
>
> Most of the vectorization related q-tests operate on ORC tables using Tez. It 
> would be good to add more coverage on a different combination of engine and 
> file-format. We can model existing q-tests using parquet tables and run them 
> using TestSparkCliDriver.





[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2017-11-01 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235131#comment-16235131
 ] 

liyunzhang commented on HIVE-17486:
---

[~lirui]: What I want to ask is whether it is possible to change the current 
structure of the SparkTask in HoS from
{code}
M->R 
{code}
to 
{code}
M->M->R
{code}

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
>Priority: Major
> Attachments: scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. The optimization is carried out at the 
> physical level.  Hive on Spark caches the result of a Spark work if that work 
> is used by more than one child Spark work. After SharedWorkOptimizer is 
> enabled in the physical plan in HoS, identical table scans are merged into one 
> table scan. The result of this table scan is then used by more than one child 
> Spark work, so the cache mechanism avoids repeating the same computation.





[jira] [Comment Edited] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2017-11-01 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235125#comment-16235125
 ] 

liyunzhang edited comment on HIVE-17486 at 11/2/17 3:35 AM:


[~lirui]:
{quote}
My understanding is HoS also supports one Map connecting to multiple Reducers 
{quote}
In HoS there is only one RS in a Map. It is true that there are cases where one 
Map is used by two Reducers in HoS. But in HoT, two RS are allowed in one Map, 
and the two RS in that Map can send different data to two different 
Reducers. 
{quote}
The problem here is HoS doesn't merge equivalent works as aggressively as HoT 
does. 
{quote}
yes


was (Author: kellyzly):
[~lirui]:
{quote}
My understanding is HoS also supports one Map connecting to multiple Reducers 
{quote}
In HoS there is only one RS in a Map. It is true that there are cases where one 
Map is used by two Reducers in HoS. But in HoT, two RS are allowed in one Map, 
and the two RS in that Map can send different data to two different 
Reducers. 

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
>Priority: Major
> Attachments: scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. The optimization is carried out at the 
> physical level.  Hive on Spark caches the result of a Spark work if that work 
> is used by more than one child Spark work. After SharedWorkOptimizer is 
> enabled in the physical plan in HoS, identical table scans are merged into one 
> table scan. The result of this table scan is then used by more than one child 
> Spark work, so the cache mechanism avoids repeating the same computation.





[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2017-11-01 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235125#comment-16235125
 ] 

liyunzhang commented on HIVE-17486:
---

[~lirui]:
{quote}
My understanding is HoS also supports one Map connecting to multiple Reducers 
{quote}
In HoS there is only one RS in a Map. It is true that there are cases where one 
Map is used by two Reducers in HoS. But in HoT, two RS are allowed in one Map, 
and the two RS in that Map can send different data to two different 
Reducers. 

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
>Priority: Major
> Attachments: scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. The optimization is carried out at the 
> physical level.  Hive on Spark caches the result of a Spark work if that work 
> is used by more than one child Spark work. After SharedWorkOptimizer is 
> enabled in the physical plan in HoS, identical table scans are merged into one 
> table scan. The result of this table scan is then used by more than one child 
> Spark work, so the cache mechanism avoids repeating the same computation.





[jira] [Commented] (HIVE-9350) Add ability for HiveAuthorizer implementations to filter out results of 'show tables', 'show databases'

2017-11-01 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235117#comment-16235117
 ] 

Alexander Kolbasov commented on HIVE-9350:
--

[~thejas] Thanks for the clarification. It is a pity RawStore doesn't specify 
exceptions on method signatures as well.

> Add ability for HiveAuthorizer implementations to filter out results of 'show 
> tables', 'show databases'
> ---
>
> Key: HIVE-9350
> URL: https://issues.apache.org/jira/browse/HIVE-9350
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Thejas M Nair
>Priority: Major
>  Labels: TODOC1.2
> Fix For: 1.2.0
>
> Attachments: HIVE-9350.1.patch, HIVE-9350.2.patch, HIVE-9350.3.patch, 
> HIVE-9350.4.patch, HIVE-9350.5.patch
>
>
> It should be possible for HiveAuthorizer implementations to control if a user 
> is able to see a table or database in results of 'show tables' and 'show 
> databases' respectively.





[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2017-11-01 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235116#comment-16235116
 ] 

Rui Li commented on HIVE-17486:
---

Hi [~kellyzly],
bq. In tez, Map can be connected 2 Reducers while in spark, we can not do this.
My understanding is HoS also supports one Map connecting to multiple Reducers - 
that's what {{CombineEquivalentWorkResolver}} is intended for and there're some 
example queries in {{dynamic_rdd_cache.q}}. The problem here is HoS doesn't 
merge equivalent works as aggressively as HoT does. Is that right?

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
>Priority: Major
> Attachments: scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. The optimization is carried out at the 
> physical level.  Hive on Spark caches the result of a Spark work if that work 
> is used by more than one child Spark work. After SharedWorkOptimizer is 
> enabled in the physical plan in HoS, identical table scans are merged into one 
> table scan. The result of this table scan is then used by more than one child 
> Spark work, so the cache mechanism avoids repeating the same computation.





[jira] [Commented] (HIVE-17958) spark_dynamic_partition_pruning.q fails when hive.tez.dynamic.semijoin.reduction is false

2017-11-01 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235107#comment-16235107
 ] 

Sahil Takiar commented on HIVE-17958:
-

So looks like {{RedundantDynamicPruningConditionsRemoval}} has some bugs in it. 
It completely disables DPP for the following query:

{code}
EXPLAIN SELECT count(*) FROM partitioned_table1 WHERE 
partitioned_table1.part_col IN (
SELECT regular_table1.col1 FROM regular_table1 JOIN partitioned_table2 ON
regular_table1.col1 = partitioned_table2.part_col AND partitioned_table2.col > 
3 AND regular_table1.col1 > 1)
{code}

> spark_dynamic_partition_pruning.q fails when 
> hive.tez.dynamic.semijoin.reduction is false
> -
>
> Key: HIVE-17958
> URL: https://issues.apache.org/jira/browse/HIVE-17958
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> Looks like {{RedundantDynamicPruningConditionsRemoval}} causes DPP to be 
> disabled in a few cases (not sure why). When 
> {{hive.tez.dynamic.semijoin.reduction}} is {{true}} (the default), then this 
> rule is disabled so the normal tests don't hit this issue.
> But when I disable {{hive.tez.dynamic.semijoin.reduction}} then the following 
> query no longer fully triggers DPP:
> {code}
> EXPLAIN select count(*) from srcpart join srcpart_date on (srcpart.ds = 
> srcpart_date.ds) join srcpart_hour on (srcpart.hr = srcpart_hour.hr)
> where srcpart_date.`date` = '2008-04-08' and srcpart_hour.hour = 11 and 
> srcpart.hr = 11
> {code}
> There should be two DPP sinks, but when the config is set to false, there is 
> only one.





[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2017-11-01 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235094#comment-16235094
 ] 

liyunzhang commented on HIVE-17486:
---

[~xuefuz]:

{quote}
 My gut feeling is that this needs to be combined with Spark RDD caching or 
Hive's materialized view.
{quote}
 About the optimization: I found that Hive on Tez does indeed get an 
improvement (20%+) on TPC-DS query28, 88 and 90 on modest hardware or when a 
table scan reads a huge amount of data, so I want to implement it in Hive on Spark.  
 I agree that we need to combine Spark RDD caching with the optimization to 
reduce table scans. As you described, the multi-insert case benefits from 
Spark RDD caching because map12 = map13, but more complex cases cannot. Take 
DS/query28.sql as an example.
 The physical plan:
 {code}
TS[0]-FIL[52]-SEL[2]-GBY[3]-RS[4]-GBY[5]-RS[42]-JOIN[48]-SEL[49]-LIM[50]-FS[51]
TS[7]-FIL[53]-SEL[9]-GBY[10]-RS[11]-GBY[12]-RS[43]-JOIN[48]
TS[14]-FIL[54]-SEL[16]-GBY[17]-RS[18]-GBY[19]-RS[44]-JOIN[48]
TS[21]-FIL[55]-SEL[23]-GBY[24]-RS[25]-GBY[26]-RS[45]-JOIN[48]
TS[28]-FIL[56]-SEL[30]-GBY[31]-RS[32]-GBY[33]-RS[46]-JOIN[48]
TS[35]-FIL[57]-SEL[37]-GBY[38]-RS[39]-GBY[40]-RS[47]-JOIN[48]
{code}

After the scan share optimization, the physical plan is:
{code}
TS[0]-FIL[52]-SEL[2]-GBY[3]-RS[4]-GBY[5]-RS[42]-JOIN[48]-SEL[49]-LIM[50]-FS[51]
 -FIL[53]-SEL[9]-GBY[10]-RS[11]-GBY[12]-RS[43]-JOIN[48]
 -FIL[54]-SEL[16]-GBY[17]-RS[18]-GBY[19]-RS[44]-JOIN[48]
 -FIL[55]-SEL[23]-GBY[24]-RS[25]-GBY[26]-RS[45]-JOIN[48]
 -FIL[56]-SEL[30]-GBY[31]-RS[32]-GBY[33]-RS[46]-JOIN[48]
 -FIL[57]-SEL[37]-GBY[38]-RS[39]-GBY[40]-RS[47]-JOIN[48]

{code}

HoS splits the operator trees when it encounters {{RS}}:
{code}
Map1: TS[0]-FIL[52]-SEL[2]-GBY[3]-RS[4]
Map2: TS[0]-FIL[53]-SEL[9]-GBY[10]-RS[11]
Map3: TS[0]-FIL[54]-SEL[16]-GBY[17]-RS[18]
Map4: TS[0]-FIL[55]-SEL[23]-GBY[24]-RS[25]
Map5: TS[0]-FIL[56]-SEL[30]-GBY[31]-RS[32]
Map6: TS[0]-FIL[57]-SEL[37]-GBY[38]-RS[39]
{code}

We cannot combine Map1, ..., Map6 because their {{FIL}} operators (FIL\[52\], 
FIL\[53\], ..., FIL\[57\]) are not the same.
So what I am thinking is: can we extract the TS from each MapTask directly and 
put it into a single Map?
{code}
Map0: TS[0]
Map1: FIL[52]-SEL[2]-GBY[3]-RS[4]
Map2: FIL[53]-SEL[9]-GBY[10]-RS[11]
Map3: FIL[54]-SEL[16]-GBY[17]-RS[18]
Map4: FIL[55]-SEL[23]-GBY[24]-RS[25]
Map5: FIL[56]-SEL[30]-GBY[31]-RS[32]
Map6: FIL[57]-SEL[37]-GBY[38]-RS[39]
{code}
Map0 contains only TS\[0\] and is connected to Map1, ..., Map6.  
I would appreciate any suggestions from you!
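
The splitting and the proposed extraction can be mimicked on the operator strings themselves. This is a minimal sketch on plain Python lists, not the HoS planner: `split_at_first_rs` and `extract_shared_scan` are hypothetical helper names, and only three of the six branches from the plan above are used to keep it short.

```python
def split_at_first_rs(chain):
    """Return the map-side prefix of an operator chain, up to and including the first RS."""
    ops = chain.split("-")
    for idx, op in enumerate(ops):
        if op.startswith("RS["):
            return ops[: idx + 1]
    return ops


def extract_shared_scan(map_works):
    """Move a table scan shared by every map work into its own work (Map0)."""
    scans = {ops[0] for ops in map_works.values()}
    if len(scans) != 1:
        return map_works  # no common scan to extract
    shared = scans.pop()
    result = {"Map0": [shared]}
    for name, ops in map_works.items():
        result[name] = ops[1:]  # the remaining operators now read from Map0
    return result


# Branches of the plan after scan sharing: every branch starts at TS[0].
branches = [
    "TS[0]-FIL[52]-SEL[2]-GBY[3]-RS[4]-GBY[5]-RS[42]-JOIN[48]",
    "TS[0]-FIL[53]-SEL[9]-GBY[10]-RS[11]-GBY[12]-RS[43]-JOIN[48]",
    "TS[0]-FIL[54]-SEL[16]-GBY[17]-RS[18]-GBY[19]-RS[44]-JOIN[48]",
]
map_works = {f"Map{n}": split_at_first_rs(b) for n, b in enumerate(branches, start=1)}
proposed = extract_shared_scan(map_works)
for name, ops in proposed.items():
    print(f"{name}: {'-'.join(ops)}")
```

The sketch only shows the shape of the plan; whether Spark can run a Map0 that feeds several other Maps is exactly the open question in this comment.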


> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
>Priority: Major
> Attachments: scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. The optimization is carried out at the 
> physical level.  Hive on Spark caches the result of a Spark work if that work 
> is used by more than one child Spark work. After SharedWorkOptimizer is 
> enabled in the physical plan in HoS, identical table scans are merged into one 
> table scan. The result of this table scan is then used by more than one child 
> Spark work, so the cache mechanism avoids repeating the same computation.





[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2017-11-01 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235085#comment-16235085
 ] 

Xuefu Zhang commented on HIVE-17486:


Hi [~kellyzly], I think your observation is correct. Spark has certain 
limitations. In fact, the edge theory doesn't even apply to Spark, which uses 
the RDD model. Internally, Hive translates the DAG into RDD operations 
(transformations and actions). In the example of (Map1->Reducer3, 
Map1->Reducer2), Hive on Spark actually has a plan like (map12->reduce2, 
map13->reduce3) with map12 = map13. This way, there will be two Spark jobs. In 
the second job, the cached result is used instead of loading the data again. 
BTW, this is a multi-insert example.

Multiple edges between two vertices are even less thinkable. You might be able 
to turn this optimization on for Spark, but Spark might not be able to run it. 
I'm not sure if there is any case where this optimization might help Spark. My 
gut feeling is that this needs to be combined with Spark RDD caching or Hive's 
materialized view.
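
The caching behaviour described for map12 = map13 can be illustrated without Spark. This is a rough sketch in plain Python, not HoS or Spark code: `CachedWork`, `scan_table`, and the `stats` counter are hypothetical names, and the class only stands in for the idea of a cached RDD that is computed by the first job and reused by the second.

```python
stats = {"scans": 0}


def scan_table():
    """Pretend table scan; counts how often the data is actually loaded."""
    stats["scans"] += 1
    return list(range(10))


class CachedWork:
    """Stands in for a cached RDD: computes once, then reuses the result."""

    def __init__(self, compute):
        self._compute = compute
        self._result = None
        self._done = False

    def get(self):
        if not self._done:
            self._result = self._compute()
            self._done = True
        return self._result


shared_map = CachedWork(scan_table)  # map12 and map13 are the same work

# Job 1: map12 -> reduce2, Job 2: map13 -> reduce3
reduce2 = sum(shared_map.get())
reduce3 = max(shared_map.get())
print(reduce2, reduce3, stats["scans"])
```

Two jobs still run, but the table is loaded only once, which is the effect attributed to the cache in the multi-insert example.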

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
>Priority: Major
> Attachments: scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. The optimization is carried out at the 
> physical level.  Hive on Spark caches the result of a Spark work if that work 
> is used by more than one child Spark work. After SharedWorkOptimizer is 
> enabled in the physical plan in HoS, identical table scans are merged into one 
> table scan. The result of this table scan is then used by more than one child 
> Spark work, so the cache mechanism avoids repeating the same computation.





[jira] [Assigned] (HIVE-17963) Fix for HIVE-17113 can be improved for non-blobstore filesystems

2017-11-01 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-17963:
-


> Fix for HIVE-17113 can be improved for non-blobstore filesystems
> 
>
> Key: HIVE-17963
> URL: https://issues.apache.org/jira/browse/HIVE-17963
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
>
> HIVE-17113/HIVE-17813 fix the duplicate file issue by performing file moves 
> on a file-by-file basis. For non-blobstore filesystems this results in many 
> more filesystem/namenode operations compared to the previous 
> Utilities.mvFileToFinalPath() behavior (dedup files in src dir, rename src 
> dir to final dir).
> For non-blobstore filesystems, a better solution would be the one described 
> [here|https://issues.apache.org/jira/browse/HIVE-17113?focusedCommentId=16100564=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16100564]:
> 1) Move the temp directory to a new directory name, to prevent additional 
> files from being added by any runaway processes.
> 2) Run removeTempOrDuplicateFiles() on this renamed temp directory
> 3) Run renameOrMoveFiles() to move the renamed temp directory to the final 
> location.
> This results in only one additional file operation in non-blobstore FSes 
> compared to the original Utilities.mvFileToFinalPath() behavior.
> The proposal is to do away with the config setting 
> hive.exec.move.files.from.source.dir and always have behavior that should 
> take care of the duplicate file issue described in HIVE-17113. For 
> non-blobstore filesystems we will do steps 1-3 described above. For blobstore 
> filesystems we will do the solution done in HIVE-17113/HIVE-17813 which does 
> the file-by-file copy - this should have the same number of file operations 
> as doing a rename directory on blobstore, which effectively results in file 
> moves on a file-by-file basis.
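
Steps 1-3 for non-blobstore filesystems can be sketched on a local filesystem. This is a minimal illustration, not the Hive implementation: `remove_duplicate_files` and `move_to_final_path` are hypothetical stand-ins for removeTempOrDuplicateFiles() and the overall move, the `<taskId>_<attemptId>` file naming and the "keep the largest file per task" rule are assumptions made for the example, and the `.moved` suffix is invented.

```python
import os
import tempfile


def remove_duplicate_files(directory):
    """Keep the largest file per task id; names are assumed to be <taskId>_<attemptId>."""
    best = {}
    for name in sorted(os.listdir(directory)):
        task_id = name.split("_")[0]
        path = os.path.join(directory, name)
        size = os.path.getsize(path)
        if task_id not in best or size > best[task_id][1]:
            best[task_id] = (path, size)
    keep = {path for path, _ in best.values()}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if path not in keep:
            os.remove(path)


def move_to_final_path(tmp_dir, final_dir):
    # Step 1: rename the temp directory so runaway writers cannot add files to it.
    renamed = tmp_dir + ".moved"
    os.rename(tmp_dir, renamed)
    # Step 2: drop duplicate attempt files inside the renamed directory.
    remove_duplicate_files(renamed)
    # Step 3: a single directory rename publishes the result.
    os.rename(renamed, final_dir)


root = tempfile.mkdtemp()
tmp_dir = os.path.join(root, "_tmp.out")
os.mkdir(tmp_dir)
for name, data in [("000000_0", "a"), ("000000_1", "abc"), ("000001_0", "xy")]:
    with open(os.path.join(tmp_dir, name), "w") as fh:
        fh.write(data)

final_dir = os.path.join(root, "final")
move_to_final_path(tmp_dir, final_dir)
final_files = sorted(os.listdir(final_dir))
print(final_files)
```

Compared with a plain dedup-then-rename, the only extra filesystem call is the first rename, which matches the "one additional file operation" noted in the description.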





[jira] [Updated] (HIVE-17937) llap_acid_fast test is flaky

2017-11-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17937:

Summary: llap_acid_fast test is flaky  (was: llap_acid_test is flaky)

> llap_acid_fast test is flaky
> 
>
> Key: HIVE-17937
> URL: https://issues.apache.org/jira/browse/HIVE-17937
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
>Priority: Major
>
> See for example 
> https://builds.apache.org/job/PreCommit-HIVE-Build/7521/testReport/org.apache.hadoop.hive.cli/TestMiniLlapLocalCliDriver/testCliDriver_llap_acid_fast_/history/
>  (the history link is the same from any build number with a test run, just 
> replace 7521 if this one expires).
> Looks like results change, which may not be good.





[jira] [Updated] (HIVE-17907) enable and apply resource plan commands in HS2

2017-11-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17907:

Attachment: (was: HIVE-17907.only.nogen.nogen.patch)

> enable and apply resource plan commands in HS2
> --
>
> Key: HIVE-17907
> URL: https://issues.apache.org/jira/browse/HIVE-17907
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-17907.only.nogen.patch, HIVE-17907.patch
>
>
> Enabling and applying the RP should only be runnable in HS2 with active WM. 
> Both should validate the full resource plan (or at least enable should; users 
> cannot modify the RP via normal means once enabled, but it might be worth 
> double checking since we have to fetch it anyway to apply).
> Then, apply should propagate the resource plan to the WM instance.





[jira] [Updated] (HIVE-17907) enable and apply resource plan commands in HS2

2017-11-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17907:

Attachment: (was: HIVE-17907.patch)

> enable and apply resource plan commands in HS2
> --
>
> Key: HIVE-17907
> URL: https://issues.apache.org/jira/browse/HIVE-17907
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-17907.only.nogen.patch, HIVE-17907.patch
>
>
> Enabling and applying the RP should only be runnable in HS2 with active WM. 
> Both should validate the full resource plan (or at least enable should; users 
> cannot modify the RP via normal means once enabled, but it might be worth 
> double checking since we have to fetch it anyway to apply).
> Then, apply should propagate the resource plan to the WM instance.





[jira] [Updated] (HIVE-17907) enable and apply resource plan commands in HS2

2017-11-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17907:

Attachment: HIVE-17907.patch
HIVE-17907.only.nogen.patch

> enable and apply resource plan commands in HS2
> --
>
> Key: HIVE-17907
> URL: https://issues.apache.org/jira/browse/HIVE-17907
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-17907.only.nogen.patch, HIVE-17907.patch
>
>
> Enabling and applying the RP should only be runnable in HS2 with active WM. 
> Both should validate the full resource plan (or at least enable should; users 
> cannot modify the RP via normal means once enabled, but it might be worth 
> double checking since we have to fetch it anyway to apply).
> Then, apply should propagate the resource plan to the WM instance.





[jira] [Commented] (HIVE-17926) Support triggers for non-pool sessions

2017-11-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235064#comment-16235064
 ] 

Sergey Shelukhin commented on HIVE-17926:
-

One question on RB. Looks good otherwise

> Support triggers for non-pool sessions
> --
>
> Key: HIVE-17926
> URL: https://issues.apache.org/jira/browse/HIVE-17926
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-17926.1.patch, HIVE-17926.1.patch, 
> HIVE-17926.2.patch, HIVE-17926.3.patch
>
>
> Current trigger implementation works only with tez session pools. In case 
> when tez sessions pools are not used, a new session gets created for every 
> query in which case trigger validation does not happen. It will be good to 
> support such one-off session case as well.





[jira] [Updated] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-11-01 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17901:
---
Attachment: (was: HIVE-17901.3.patch)

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.2.patch, HIVE-17901.3.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks
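
A minimal sketch of what "logging parameterization" buys, under stated assumptions: Hive's own code most likely uses SLF4J-style `{}` placeholders, but to stay self-contained this example uses the JDK's `java.util.logging` with `{0}` placeholders instead. The class `ParameterizedLogging` and the helper type `Costly` are hypothetical names for illustration only. The point is that a concatenated message renders its arguments even when the log level is disabled, while a parameterized message does not.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public final class ParameterizedLogging {
    private static final Logger LOG =
        Logger.getLogger(ParameterizedLogging.class.getName());

    /** Hypothetical argument that counts how often it is rendered to a string. */
    static final class Costly {
        int rendered = 0;

        @Override
        public String toString() {
            rendered++;
            return "costly-value";
        }
    }

    /** Concatenation builds the message before the level check happens. */
    public static int renderCountWithConcatenation() {
        LOG.setLevel(Level.INFO); // FINE is disabled
        Costly arg = new Costly();
        LOG.fine("Moving file " + arg);
        return arg.rendered;
    }

    /** The parameter is only formatted if the record is actually published. */
    public static int renderCountWithParameter() {
        LOG.setLevel(Level.INFO); // FINE is disabled
        Costly arg = new Costly();
        LOG.log(Level.FINE, "Moving file {0}", arg);
        return arg.rendered;
    }

    public static void main(String[] args) {
        System.out.println("concatenation renders: " + renderCountWithConcatenation());
        System.out.println("parameterized renders: " + renderCountWithParameter());
    }
}
```

With FINE disabled, the concatenated call still pays for one `toString()`, while the parameterized call pays for none.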





[jira] [Updated] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-11-01 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17901:
---
Attachment: (was: HIVE-17901.1.patch)

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.2.patch, HIVE-17901.3.patch, 
> HIVE-17901.3.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks





[jira] [Updated] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-11-01 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17901:
---
Attachment: HIVE-17901.3.patch

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.2.patch, HIVE-17901.3.patch, 
> HIVE-17901.3.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks





[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2017-11-01 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235054#comment-16235054
 ] 

liyunzhang commented on HIVE-17486:
---

HoS currently does not support multiple edges between two vertices. Here is an 
example to show this:
TPC-DS/[query28.sql|https://github.com/kellyzly/hive-testbench/blob/hive14/sample-queries-tpcds/query28.sql].
Before the shared scan optimization (HIVE-16602), the Tez explain plan is 
[scanshare.before.svg|https://issues.apache.org/jira/secure/attachment/12895148/scanshare.before.svg].
After the shared scan optimization (HIVE-16602), the Tez explain plan is 
[scanshare.after.svg|https://issues.apache.org/jira/secure/attachment/12895149/scanshare.after.svg].
After the optimization there is only 1 map (before there were 6 maps), but that 
single map, Map 1, connects to the 6 reducers through 6 edges. This works 
because Tez supports multiple edges between two vertices (TEZ-1190). I am now 
working on enabling this feature on HoS, but HoS does not support multiple 
edges between two vertices. So even if I change the physical plan the way HoT 
does, it may not reduce the number of maps. [~lirui], [~xuefuz], can you 
help look at this problem?

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
>Priority: Major
> Attachments: scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. Optimization will be carried out at the 
> physical level.  In Hive on Spark, the result of a spark work is cached if the 
> spark work is used by more than 1 child spark work. After SharedWorkOptimizer 
> is enabled in the physical plan in HoS, identical table scans are merged into 1 
> table scan. The result of this table scan will be used by more than 1 child 
> spark work, so the cache mechanism saves us from repeating the same computation.





[jira] [Commented] (HIVE-17962) org.apache.hadoop.hive.metastore.security.MemoryTokenStore - Parameterize Logging

2017-11-01 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235032#comment-16235032
 ] 

Aihua Xu commented on HIVE-17962:
-

Looks good. +1.

> org.apache.hadoop.hive.metastore.security.MemoryTokenStore - Parameterize 
> Logging
> -
>
> Key: HIVE-17962
> URL: https://issues.apache.org/jira/browse/HIVE-17962
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
> Attachments: HIVE-17962.1.patch
>
>
> * Parameterize logging
> * Small simplification





[jira] [Commented] (HIVE-17962) org.apache.hadoop.hive.metastore.security.MemoryTokenStore - Parameterize Logging

2017-11-01 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235028#comment-16235028
 ] 

BELUGA BEHR commented on HIVE-17962:


[~aihuaxu] :)

> org.apache.hadoop.hive.metastore.security.MemoryTokenStore - Parameterize 
> Logging
> -
>
> Key: HIVE-17962
> URL: https://issues.apache.org/jira/browse/HIVE-17962
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
> Attachments: HIVE-17962.1.patch
>
>
> * Parameterize logging
> * Small simplification





[jira] [Updated] (HIVE-17962) org.apache.hadoop.hive.metastore.security.MemoryTokenStore - Parameterize Logging

2017-11-01 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17962:
---
Status: Patch Available  (was: Open)

> org.apache.hadoop.hive.metastore.security.MemoryTokenStore - Parameterize 
> Logging
> -
>
> Key: HIVE-17962
> URL: https://issues.apache.org/jira/browse/HIVE-17962
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
> Attachments: HIVE-17962.1.patch
>
>
> * Parameterize logging
> * Small simplification





[jira] [Updated] (HIVE-17962) org.apache.hadoop.hive.metastore.security.MemoryTokenStore - Parameterize Logging

2017-11-01 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17962:
---
Attachment: HIVE-17962.1.patch

> org.apache.hadoop.hive.metastore.security.MemoryTokenStore - Parameterize 
> Logging
> -
>
> Key: HIVE-17962
> URL: https://issues.apache.org/jira/browse/HIVE-17962
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
> Attachments: HIVE-17962.1.patch
>
>
> * Parameterize logging
> * Small simplification





[jira] [Updated] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-11-01 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17901:
---
Attachment: HIVE-17901.3.patch

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.1.patch, HIVE-17901.2.patch, 
> HIVE-17901.3.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks





[jira] [Updated] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-11-01 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17901:
---
Status: Open  (was: Patch Available)

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.1.patch, HIVE-17901.2.patch, 
> HIVE-17901.3.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks





[jira] [Updated] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-11-01 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17901:
---
Status: Patch Available  (was: Open)

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.1.patch, HIVE-17901.2.patch, 
> HIVE-17901.3.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks





[jira] [Commented] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-11-01 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235026#comment-16235026
 ] 

BELUGA BEHR commented on HIVE-17901:


Adding some more changes...

* Using a global Random object instead of creating it as a local variable for 
each method call.  As of JDK 7, Random is thread-safe.
* When creating a random positive number, there is one edge-case where it can 
still [produce a negative 
number|https://stackoverflow.com/questions/5827023/java-random-giving-negative-numbers].
  This patch addresses that.
* Unify how we check for empty collections and empty strings using the handy 
Apache Commons library
* Remove dead code
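
A minimal sketch of the random-number edge case mentioned above, not the actual patch: `Math.abs(Integer.MIN_VALUE)` overflows and stays negative, so taking the absolute value of `Random.nextInt()` can very rarely produce a negative number. The class `PositiveRandom` and its method names are hypothetical, and masking off the sign bit is just one common way to avoid the problem; the patch may handle it differently.

```java
import java.util.Random;

public final class PositiveRandom {
    // One shared generator instead of a new Random per call.
    private static final Random RANDOM = new Random();

    /** The naive approach: still negative for Integer.MIN_VALUE. */
    public static int naiveAbs(int value) {
        return Math.abs(value);
    }

    /** Clearing the sign bit always yields a value in [0, Integer.MAX_VALUE]. */
    public static int nonNegative(int value) {
        return value & Integer.MAX_VALUE;
    }

    /** Hypothetical helper: a random int that is never negative. */
    public static int nextNonNegativeInt() {
        return nonNegative(RANDOM.nextInt());
    }

    public static void main(String[] args) {
        System.out.println(naiveAbs(Integer.MIN_VALUE) < 0);  // the edge case
        System.out.println(nonNegative(Integer.MIN_VALUE));   // sign bit cleared
        System.out.println(nextNonNegativeInt() >= 0);
    }
}
```

When only a bounded value is needed, `Random.nextInt(bound)` is another option, since it returns values in `[0, bound)`.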

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.1.patch, HIVE-17901.2.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks





[jira] [Commented] (HIVE-17907) enable and apply resource plan commands in HS2

2017-11-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235012#comment-16235012
 ] 

Sergey Shelukhin commented on HIVE-17907:
-

Actually I've never tried resourceplan.q, I bet it's broken with my random 
parser changes. Might update this. The non-parser changes should be ready.

> enable and apply resource plan commands in HS2
> --
>
> Key: HIVE-17907
> URL: https://issues.apache.org/jira/browse/HIVE-17907
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-17907.only.nogen.nogen.patch, HIVE-17907.patch
>
>
> Enabling and applying the RP should only be runnable in HS2 with active WM. 
> Both should validate the full resource plan (or at least enable should; users 
> cannot modify the RP via normal means once enabled, but it might be worth 
> double checking since we have to fetch it anyway to apply).
> Then, apply should propagate the resource plan to the WM instance.





[jira] [Assigned] (HIVE-17961) NPE during initialization of VectorizedParquetRecordReader when input split is null

2017-11-01 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar reassigned HIVE-17961:
--


> NPE during initialization of VectorizedParquetRecordReader when input split 
> is null
> ---
>
> Key: HIVE-17961
> URL: https://issues.apache.org/jira/browse/HIVE-17961
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>
> HIVE-16465 introduced a regression which causes an NPE during initialization 
> of the vectorized reader when the input split is null. This was already fixed 
> in HIVE-15718 but got exposed again when we refactored for HIVE-16465. We 
> should also add a test case to catch such regressions in the future.





[jira] [Resolved] (HIVE-17920) Vectorized reader does push down projection columns for index access schema

2017-11-01 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu resolved HIVE-17920.
-
Resolution: Duplicate

> Vectorized reader does push down projection columns for index access schema
> ---
>
> Key: HIVE-17920
> URL: https://issues.apache.org/jira/browse/HIVE-17920
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ferdinand Xu
>Priority: Major
>






[jira] [Updated] (HIVE-17907) enable and apply resource plan commands in HS2

2017-11-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17907:

Status: Patch Available  (was: Open)

> enable and apply resource plan commands in HS2
> --
>
> Key: HIVE-17907
> URL: https://issues.apache.org/jira/browse/HIVE-17907
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-17907.only.nogen.nogen.patch, HIVE-17907.patch
>
>
> Enabling and applying the RP should only be runnable in HS2 with active WM. 
> Both should validate the full resource plan (or at least enable should; users 
> cannot modify the RP via normal means once enabled, but it might be worth 
> double checking since we have to fetch it anyway to apply).
> Then, apply should propagate the resource plan to the WM instance.





[jira] [Updated] (HIVE-17907) enable and apply resource plan commands in HS2

2017-11-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17907:

Attachment: HIVE-17907.only.nogen.nogen.patch
HIVE-17907.patch

The no-generated-files patch for this jira only, and a combined patch from 
master on top of HIVE-17841

> enable and apply resource plan commands in HS2
> --
>
> Key: HIVE-17907
> URL: https://issues.apache.org/jira/browse/HIVE-17907
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HIVE-17907.only.nogen.nogen.patch, HIVE-17907.patch
>
>
> Enabling and applying the RP should only be runnable in HS2 with active WM. 
> Both should validate the full resource plan (or at least enable should; users 
> cannot modify the RP via normal means once enabled, but it might be worth 
> double checking since we have to fetch it anyway to apply).
> Then, apply should propagate the resource plan to the WM instance.





[jira] [Updated] (HIVE-17812) Move remaining classes that HiveMetaStore depends on

2017-11-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17812:
--
Attachment: HIVE-17812.4.patch

> Move remaining classes that HiveMetaStore depends on 
> -
>
> Key: HIVE-17812
> URL: https://issues.apache.org/jira/browse/HIVE-17812
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-17812.2.patch, HIVE-17812.3.patch, 
> HIVE-17812.4.patch, HIVE-17812.patch
>
>
> There are several remaining pieces that need to be moved before we can move 
> HiveMetaStore itself.  These include NotificationListener and 
> implementations, Events, AlterHandler, and a few other miscellaneous pieces.





[jira] [Commented] (HIVE-16602) Implement shared scans with Tez

2017-11-01 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234980#comment-16234980
 ] 

Jesus Camacho Rodriguez commented on HIVE-16602:


[~kellyzly], the table scan is shared, however the filters are different so 
they are not shared. This is all executed within the same Map 1 task indeed, 
which then outputs to all those Reducers that you mention.

> Implement shared scans with Tez
> ---
>
> Key: HIVE-16602
> URL: https://issues.apache.org/jira/browse/HIVE-16602
> Project: Hive
>  Issue Type: New Feature
>  Components: Physical Optimizer
>Affects Versions: 3.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-16602.01.patch, HIVE-16602.02.patch, 
> HIVE-16602.03.patch, HIVE-16602.04.patch, HIVE-16602.patch
>
>
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. Optimization will be carried out at the 
> physical level.
> In the longer term, identification of equivalent expressions and 
> reutilization of intermediary results should be done at the logical layer via 
> Spool operator.





[jira] [Updated] (HIVE-17867) Exception in windowing functions with TIMESTAMP WITH LOCAL TIME ZONE type

2017-11-01 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-17867:
---
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> Exception in windowing functions with TIMESTAMP WITH LOCAL TIME ZONE type
> -
>
> Key: HIVE-17867
> URL: https://issues.apache.org/jira/browse/HIVE-17867
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-17867.01.patch, HIVE-17867.patch
>
>
> The following query where column {{ts}} is of type {{TIMESTAMP WITH LOCAL 
> TIME ZONE}}:
> {code}
> select ts, i, sum(f) over (partition by i order by ts)
> from over10k_2
> limit 100;
> {code}
> fails with the following stacktrace:
> {code}
> org.apache.hadoop.hive.ql.parse.SemanticException: Failed to breakup 
> Windowing invocations into Groups. At least 1 group must only depend on input 
> columns. Also check for circular dependencies.
> Underlying error: Primitive type TIMESTAMPLOCALTZ not supported in Value 
> Boundary expression
> at 
> org.apache.hadoop.hive.ql.parse.WindowingComponentizer.next(WindowingComponentizer.java:97)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genWindowingPlan(SemanticAnalyzer.java:13508)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:9912)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9871)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10784)
> ...
> {code}
> The list of supported types for boundaries expressions in PTFTranslator needs 
> to be updated.





[jira] [Commented] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null

2017-11-01 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234965#comment-16234965
 ] 

Jesus Camacho Rodriguez commented on HIVE-15157:


It is actually the other way around. The problem we are facing is that the 
partition is written without .0, e.g., '2016-11-02 17:00:00'. Then, when we 
tried to retrieve it with DESCRIBE FORMATTED, we were adding the .0 at the end 
of the value representation for the literal, i.e., '2016-11-02 17:00:00.0', so 
the partition was not found in the list of returned partitions. This particular 
check was done in {{DDLSemanticAnalyzer.getPartitionSpec}}; the exception we 
were hitting is in L2229.
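
A minimal sketch of how such a mismatch can arise, assuming the literal is round-tripped through `java.sql.Timestamp` (an assumption for illustration; the comment above does not say which class produced the suffix). `Timestamp.toString()` always emits at least one fractional digit, so the rendered value no longer equals the stored partition value. The class `TimestampPartitionValue` and its method are hypothetical names.

```java
import java.sql.Timestamp;

public final class TimestampPartitionValue {
    /** Parses a timestamp literal and renders it back via Timestamp.toString(). */
    public static String viaTimestamp(String literal) {
        return Timestamp.valueOf(literal).toString();
    }

    public static void main(String[] args) {
        String stored = "2016-11-02 17:00:00";   // value as written in the partition
        String rendered = viaTimestamp(stored);  // value after the round trip

        System.out.println(rendered);                 // 2016-11-02 17:00:00.0
        System.out.println(stored.equals(rendered));  // false
    }
}
```

A plain string comparison between the two forms fails even though both denote the same instant, which is consistent with the partition lookup not finding a match.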

> Partition Table With timestamp type on S3 storage --> Error in getting fields 
> from serde.Invalid Field null
> ---
>
> Key: HIVE-15157
> URL: https://issues.apache.org/jira/browse/HIVE-15157
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 2.1.0
> Environment: JDK 1.8 101 
>Reporter: thauvin damien
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>  Labels: timestamp
> Attachments: HIVE-15157.01.patch, HIVE-15157.02.patch
>
>
> Hello,
> I get the error above when I try to run:
> hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00');
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field null
> Here is the description of the issue.
> --External Hive table with dynamic partitioning enabled on AWS S3 storage.
> --Table partitioned by a timestamp-type column.
> When I run "show partitions table;" everything is fine:
> hive>  show partitions table;
> OK
> tsbucket=2016-10-01 11%3A00%3A00
> tsbucket=2016-10-28 16%3A00%3A00
> And when I run "describe FORMATTED table;" everything is fine.
> Is this a bug?
> The stacktrace of hive.log :
> 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 
> main([])]: exec.DDLTask (DDLTask.java:failed(574)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields 
> from serde.Invalid Field null
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: MetaException(message:Invalid Field null)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409)
> ... 21 more





[jira] [Commented] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-11-01 Thread BELUGA BEHR (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234955#comment-16234955
 ] 

BELUGA BEHR commented on HIVE-17901:


Updated patch for latest version

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.1.patch, HIVE-17901.2.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks





[jira] [Updated] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-11-01 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17901:
---
Attachment: HIVE-17901.2.patch

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.1.patch, HIVE-17901.2.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks





[jira] [Updated] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-11-01 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17901:
---
Status: Patch Available  (was: Open)

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.1.patch, HIVE-17901.2.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks





[jira] [Updated] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-11-01 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17901:
---
Status: Open  (was: Patch Available)

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.1.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks





[jira] [Commented] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null

2017-11-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234934#comment-16234934
 ] 

Prasanth Jayachandran commented on HIVE-15157:
--

It should be fine, as there is a config for unescaping.

This patch prevents writing .0 at the end of the timestamp. I am assuming this 
behaviour has existed for a long time and was not introduced recently. What 
happens to already existing partitions that were written with .0 nanos at the 
end?

> Partition Table With timestamp type on S3 storage --> Error in getting fields 
> from serde.Invalid Field null
> ---
>
> Key: HIVE-15157
> URL: https://issues.apache.org/jira/browse/HIVE-15157
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 2.1.0
> Environment: JDK 1.8 101 
>Reporter: thauvin damien
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>  Labels: timestamp
> Attachments: HIVE-15157.01.patch, HIVE-15157.02.patch
>
>
> Hello 
> I get the error above when I try to run:
> hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00');
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field null
> Here is the description of the issue.
> --External Hive table with dynamic partitioning enabled on AWS S3 storage.
> --Partitioned table with timestamp type.
> When I run "show partitions table;" everything is fine:
> hive>  show partitions table;
> OK
> tsbucket=2016-10-01 11%3A00%3A00
> tsbucket=2016-10-28 16%3A00%3A00
> And when I run "describe FORMATTED table;" everything is fine.
> Is this a bug?
> The stack trace from hive.log:
> 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 
> main([])]: exec.DDLTask (DDLTask.java:failed(574)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields 
> from serde.Invalid Field null
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: MetaException(message:Invalid Field null)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409)
> ... 21 more





[jira] [Commented] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null

2017-11-01 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234913#comment-16234913
 ] 

Jesus Camacho Rodriguez commented on HIVE-15157:


[~prasanth_j], not surprisingly, we have an additional flag for unescaping show 
partitions output that is set to false by default: hive.decode.partition.name :) We 
could explore in a follow-up why this is the default behavior; tbh I do not think 
it makes much sense (maybe if you want to see the actual name of the 
folder?)... But in any case, this fix is self-contained.

The failures are unrelated, except {{partition_timestamp}}, for which I had to 
regenerate the q file. The patch is ready to be reviewed. Cc [~ashutoshc]
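
For readers unfamiliar with the escaped form, the partition names quoted in this issue (for example tsbucket=2016-10-28 16%3A00%3A00) are percent-encoded, and "unescaping" turns %3A back into a colon. The sketch below only illustrates that decoding using Python's standard library; it is not Hive's implementation, and the function name parse_partition_name is hypothetical.

```python
from datetime import datetime
from urllib.parse import unquote


def parse_partition_name(name):
    """Split 'key=value' and percent-decode the value (illustrative only)."""
    key, _, raw_value = name.partition("=")
    return key, unquote(raw_value)


def partition_timestamp(name):
    """Decode the partition value and parse it as a second-precision timestamp."""
    _, value = parse_partition_name(name)
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
```

Applied to the second partition listed in the report, parse_partition_name yields the key tsbucket and the value 2016-10-28 16:00:00.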

> Partition Table With timestamp type on S3 storage --> Error in getting fields 
> from serde.Invalid Field null
> ---
>
> Key: HIVE-15157
> URL: https://issues.apache.org/jira/browse/HIVE-15157
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 2.1.0
> Environment: JDK 1.8 101 
>Reporter: thauvin damien
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>  Labels: timestamp
> Attachments: HIVE-15157.01.patch, HIVE-15157.02.patch
>
>
> Hello 
> I get the error above when I try to run:
> hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00');
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field null
> Here is the description of the issue.
> --External Hive table with dynamic partitioning enabled on AWS S3 storage.
> --Partitioned table with timestamp type.
> When I run "show partitions table;" everything is fine:
> hive>  show partitions table;
> OK
> tsbucket=2016-10-01 11%3A00%3A00
> tsbucket=2016-10-28 16%3A00%3A00
> And when I run "describe FORMATTED table;" everything is fine.
> Is this a bug?
> The stack trace from hive.log:
> 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 
> main([])]: exec.DDLTask (DDLTask.java:failed(574)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields 
> from serde.Invalid Field null
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: MetaException(message:Invalid Field null)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409)
> ... 21 more





[jira] [Commented] (HIVE-17867) Exception in windowing functions with TIMESTAMP WITH LOCAL TIME ZONE type

2017-11-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234910#comment-16234910
 ] 

Hive QA commented on HIVE-17867:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12895212/HIVE-17867.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11349 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_display_colstats_tbllvl]
 (batchId=76)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7589/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7589/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7589/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12895212 - PreCommit-HIVE-Build

> Exception in windowing functions with TIMESTAMP WITH LOCAL TIME ZONE type
> -
>
> Key: HIVE-17867
> URL: https://issues.apache.org/jira/browse/HIVE-17867
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-17867.01.patch, HIVE-17867.patch
>
>
> The following query where column {{ts}} is of type {{TIMESTAMP WITH LOCAL 
> TIME ZONE}}:
> {code}
> select ts, i, sum(f) over (partition by i order by ts)
> from over10k_2
> limit 100;
> {code}
> fails with the following stacktrace:
> {code}
> org.apache.hadoop.hive.ql.parse.SemanticException: Failed to breakup 
> Windowing invocations into Groups. At least 1 group must only depend on input 
> columns. Also check for circular dependencies.
> Underlying error: Primitive type TIMESTAMPLOCALTZ not supported in Value 
> Boundary expression
> at 
> org.apache.hadoop.hive.ql.parse.WindowingComponentizer.next(WindowingComponentizer.java:97)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genWindowingPlan(SemanticAnalyzer.java:13508)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:9912)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:9871)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:10784)
> ...
> {code}
> The list of supported types for boundary expressions in PTFTranslator needs 
> to be updated.





[jira] [Updated] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null

2017-11-01 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15157:
---
Attachment: HIVE-15157.02.patch

> Partition Table With timestamp type on S3 storage --> Error in getting fields 
> from serde.Invalid Field null
> ---
>
> Key: HIVE-15157
> URL: https://issues.apache.org/jira/browse/HIVE-15157
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 2.1.0
> Environment: JDK 1.8 101 
>Reporter: thauvin damien
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>  Labels: timestamp
> Attachments: HIVE-15157.01.patch, HIVE-15157.02.patch
>
>
> Hello 
> I get the error above when I try to run:
> hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00');
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field null
> Here is the description of the issue.
> --External Hive table with dynamic partitioning enabled on AWS S3 storage.
> --Partitioned table with timestamp type.
> When I run "show partitions table;" everything is fine:
> hive>  show partitions table;
> OK
> tsbucket=2016-10-01 11%3A00%3A00
> tsbucket=2016-10-28 16%3A00%3A00
> And when I run "describe FORMATTED table;" everything is fine.
> Is this a bug?
> The stack trace from hive.log:
> 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 
> main([])]: exec.DDLTask (DDLTask.java:failed(574)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields 
> from serde.Invalid Field null
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: MetaException(message:Invalid Field null)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409)
> ... 21 more





[jira] [Updated] (HIVE-14069) update curator version to 2.10.0

2017-11-01 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14069:
--
Status: Patch Available  (was: Open)

> update curator version to 2.10.0 
> -
>
> Key: HIVE-14069
> URL: https://issues.apache.org/jira/browse/HIVE-14069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Metastore
>Reporter: Thejas M Nair
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-14069.1.patch, HIVE-14069.2.patch
>
>
> curator-2.10.0 has several bug fixes over current version (2.6.0), updating 
> would help improve stability.





[jira] [Updated] (HIVE-14069) update curator version to 2.10.0

2017-11-01 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14069:
--
Attachment: HIVE-14069.2.patch

Initial patch. Moved the declaration of curator dependencies to just the 
modules that use curator, and added shade executions to all of those modules to 
shade curator. It would be nice if the shade execution could be defined in the 
base pom and enabled just for those modules requiring the shading, but it looks 
like the maven shade plugin does not support the skip parameter (MSHADE-251).

> update curator version to 2.10.0 
> -
>
> Key: HIVE-14069
> URL: https://issues.apache.org/jira/browse/HIVE-14069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Metastore
>Reporter: Thejas M Nair
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-14069.1.patch, HIVE-14069.2.patch
>
>
> curator-2.10.0 has several bug fixes over current version (2.6.0), updating 
> would help improve stability.





[jira] [Updated] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-11-01 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15016:

Status: Patch Available  (was: In Progress)

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.4.patch, HIVE-15016.5.patch, HIVE-15016.6.patch, 
> HIVE-15016.7.patch, HIVE-15016.8.patch, HIVE-15016.9.patch, HIVE-15016.patch, 
> Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back on Sep/16 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.





[jira] [Updated] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-11-01 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15016:

Attachment: HIVE-15016.9.patch

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.4.patch, HIVE-15016.5.patch, HIVE-15016.6.patch, 
> HIVE-15016.7.patch, HIVE-15016.8.patch, HIVE-15016.9.patch, HIVE-15016.patch, 
> Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back on Sep/16 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.





[jira] [Updated] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-11-01 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15016:

Attachment: (was: HIVE-15016.9.patch)

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.4.patch, HIVE-15016.5.patch, HIVE-15016.6.patch, 
> HIVE-15016.7.patch, HIVE-15016.8.patch, HIVE-15016.9.patch, HIVE-15016.patch, 
> Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back on Sep/16 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.





[jira] [Updated] (HIVE-15016) Run tests with Hadoop 3.0.0-beta1

2017-11-01 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-15016:

Status: In Progress  (was: Patch Available)

> Run tests with Hadoop 3.0.0-beta1
> -
>
> Key: HIVE-15016
> URL: https://issues.apache.org/jira/browse/HIVE-15016
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Sergio Peña
>Assignee: Aihua Xu
>Priority: Major
> Attachments: HIVE-15016.2.patch, HIVE-15016.3.patch, 
> HIVE-15016.4.patch, HIVE-15016.5.patch, HIVE-15016.6.patch, 
> HIVE-15016.7.patch, HIVE-15016.8.patch, HIVE-15016.9.patch, HIVE-15016.patch, 
> Hadoop3Upstream.patch
>
>
> Hadoop 3.0.0-alpha1 was released back on Sep/16 to allow other components to 
> run tests against this new version before GA.
> We should start running tests with Hive to validate compatibility against 
> Hadoop 3.0.
> NOTE: The patch used to test must not be committed to Hive until Hadoop 3.0 
> GA is released.





[jira] [Updated] (HIVE-15552) unable to coalesce DATE and TIMESTAMP types

2017-11-01 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15552:
---
Attachment: HIVE-15552.02.patch

> unable to coalesce DATE and TIMESTAMP types
> ---
>
> Key: HIVE-15552
> URL: https://issues.apache.org/jira/browse/HIVE-15552
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
>Reporter: N Campbell
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>  Labels: timestamp
> Attachments: HIVE-15552.01.patch, HIVE-15552.02.patch, 
> HIVE-15552.patch
>
>
> COALESCE expression does not expect DATE and TIMESTAMP types 
> select tdt.rnum, coalesce(tdt.cdt, cast(tdt.cdt as timestamp)) from 
> certtext.tdt
> Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 
> Argument type mismatch 'cdt': The expressions after COALESCE should all have 
> the same type: "date" is expected but "timestamp" is found
> SQLState:  42000
> ErrorCode: 4
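
The error above arises because COALESCE requires all arguments to share one type. One plausible way to reconcile DATE and TIMESTAMP arguments is to widen the date to a timestamp at midnight before picking the first non-null value. The sketch below illustrates that idea in Python; it is an assumption for illustration only, not a description of what the attached patches do, and the function name coalesce_widened is hypothetical.

```python
from datetime import date, datetime


def coalesce_widened(*values):
    """Return the first non-None value, widening dates when timestamps are present.

    Hypothetical rule: if any argument is a datetime, plain dates are
    converted to a datetime at midnight so the result type is uniform.
    """
    has_timestamp = any(isinstance(v, datetime) for v in values)
    for v in values:
        if v is None:
            continue
        # datetime is a subclass of date, so check for datetime explicitly.
        if has_timestamp and not isinstance(v, datetime):
            return datetime(v.year, v.month, v.day)
        return v
    return None
```

Under this rule, a call mixing a date and a timestamp always returns a timestamp, which mirrors the single-result-type requirement stated in the error message.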





[jira] [Comment Edited] (HIVE-17949) itests compile is busted on branch-1.2

2017-11-01 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234507#comment-16234507
 ] 

Mithun Radhakrishnan edited comment on HIVE-17949 at 11/1/17 10:02 PM:
---

On the bright side, the compile issue is fixed. On the other hand, the tests on 
{{branch-1.2}} are busted. 

I'll check this compile fix in now.


was (Author: mithun):
On the bright side, the compile issue is fixed. On the other hand, the tests on 
{{branch-1.2}} are busted. 

I'll check this in this compile fix now.

> itests compile is busted on branch-1.2
> --
>
> Key: HIVE-17949
> URL: https://issues.apache.org/jira/browse/HIVE-17949
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: 1.2.3
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>Priority: Major
> Attachments: HIVE-17949.01-branch-1.2.patch
>
>
> {{commit 18ddf46e0a8f092358725fc102235cbe6ba3e24d}} on {{branch-1.2}} was for 
> {{Preparing for 1.2.3 development}}. This should have also included 
> corresponding changes to all the pom-files under {{itests}}. As it stands 
> now, the build fails with the following:
> {noformat}
> [ERROR]   location: class org.apache.hadoop.hive.metastore.api.Role
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java:[512,19]
>  no suitable method found for 
> updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.partition.spec.PartitionSpecProxy.PartitionIterator,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[181,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[190,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java:[53,26]
>  cannot find symbol
> [ERROR]   symbol:   class MiniZooKeeperCluster
> [ERROR]   location: class 
> org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :hive-it-unit
> {noformat}





[jira] [Updated] (HIVE-17361) Support LOAD DATA for transactional tables

2017-11-01 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17361:
--
Description: 
LOAD DATA was not supported since ACID was introduced. Need to fill this gap 
between ACID table and regular hive table.

Current Documentation is under [DML 
Operations|https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-DMLOperations]
 and [Loading files into 
tables|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables]:

\\
* Load Data performs very limited validations of the data, in particular it 
uses the input file name which may not be in 0_0 which can break some read 
logic.  (Certainly will for Acid).
* It does not check the schema of the file.  This may be a non issue for Acid 
which requires ORC which is self describing so Schema Evolution may handle this 
seamlessly.  (Assuming Schema is not too different).
* It does check that _InputFormat_S are compatible. 
* Bucketed (and thus sorted) tables don't support Load Data (but only if 
hive.strict.checks.bucketing=true (default)).  Will keep this restriction for 
Acid.
* Load Data supports OVERWRITE clause



  was:
LOAD DATA was not supported since ACID was introduced. Need to fill this gap 
between ACID table and regular hive table.

Current Documentation is under [DML 
Operations|https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-DMLOperations]
 and [Loading files into 
tables|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables]:

\\
* Load Data performs very limited validations of the data, in particular it 
uses the input file name which may not be in 0_0 which can break some read 
logic.  (Certainly will for Acid).
* It does not check the schema of the file.  This may be a non issue for Acid 
which requires ORC which is self describing so Schema Evolution may handle this 
seamlessly.  (Assuming Schema is not too different).
* It does check that _InputFormat_S are compatible. 
* Bucketed (and thus sorted) tables don't support Load Data.  Will keep this 
restriction for Acid.
* Load Data supports OVERWRITE clause




> Support LOAD DATA for transactional tables
> --
>
> Key: HIVE-17361
> URL: https://issues.apache.org/jira/browse/HIVE-17361
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17361.1.patch, HIVE-17361.2.patch, 
> HIVE-17361.3.patch, HIVE-17361.4.patch
>
>
> LOAD DATA was not supported since ACID was introduced. Need to fill this gap 
> between ACID table and regular hive table.
> Current Documentation is under [DML 
> Operations|https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-DMLOperations]
>  and [Loading files into 
> tables|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables]:
> \\
> * Load Data performs very limited validations of the data, in particular it 
> uses the input file name which may not be in 0_0 which can break some 
> read logic.  (Certainly will for Acid).
> * It does not check the schema of the file.  This may be a non issue for Acid 
> which requires ORC which is self describing so Schema Evolution may handle 
> this seamlessly.  (Assuming Schema is not too different).
> * It does check that _InputFormat_S are compatible. 
> * Bucketed (and thus sorted) tables don't support Load Data (but only if 
> hive.strict.checks.bucketing=true (default)).  Will keep this restriction for 
> Acid.
> * Load Data supports OVERWRITE clause





[jira] [Commented] (HIVE-17958) spark_dynamic_partition_pruning.q fails when hive.tez.dynamic.semijoin.reduction is false

2017-11-01 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234789#comment-16234789
 ] 

Sahil Takiar commented on HIVE-17958:
-

Actually {{RedundantDynamicPruningConditionsRemoval}} is probably doing the 
right thing here. You don't want the above query to trigger DPP twice because 
there is a static partition filter on {{srcpart.hr}}. So maybe 
{{RedundantDynamicPruningConditionsRemoval}} should only be disabled if 
{{hive.tez.dynamic.semijoin.reduction}} is {{true}} and 
{{hive.execution.engine=tez}}.
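
The condition suggested above can be sketched as a small predicate. This is only an illustration under assumptions: the class and method names are hypothetical, and plain arguments stand in for reading {{hive.execution.engine}} and {{hive.tez.dynamic.semijoin.reduction}} from the session configuration. It is not the actual optimizer code.

```java
public class PruningRuleGate {
  // Hypothetical predicate: returns true when the
  // RedundantDynamicPruningConditionsRemoval rule should run.
  // The rule is skipped only for Tez with semijoin reduction enabled.
  public static boolean shouldRunRemovalRule(String executionEngine,
                                             boolean semijoinReduction) {
    boolean disabled = "tez".equalsIgnoreCase(executionEngine) && semijoinReduction;
    return !disabled;
  }

  public static void main(String[] args) {
    System.out.println(shouldRunRemovalRule("tez", true));   // false
    System.out.println(shouldRunRemovalRule("spark", true)); // true
    System.out.println(shouldRunRemovalRule("tez", false));  // true
  }
}
```

With this gating, Hive on Spark would always apply the rule regardless of the Tez-specific setting, which matches the intent of not triggering DPP twice when a static filter on {{srcpart.hr}} is present.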

> spark_dynamic_partition_pruning.q fails when 
> hive.tez.dynamic.semijoin.reduction is false
> -
>
> Key: HIVE-17958
> URL: https://issues.apache.org/jira/browse/HIVE-17958
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> Looks like {{RedundantDynamicPruningConditionsRemoval}} causes DPP to be 
> disabled in a few cases (not sure why). When 
> {{hive.tez.dynamic.semijoin.reduction}} is {{true}} (the default), then this 
> rule is disabled so the normal tests don't hit this issue.
> But when I disable {{hive.tez.dynamic.semijoin.reduction}} then the following 
> query no longer fully triggers DPP:
> {code}
> EXPLAIN select count(*) from srcpart join srcpart_date on (srcpart.ds = 
> srcpart_date.ds) join srcpart_hour on (srcpart.hr = srcpart_hour.hr)
> where srcpart_date.`date` = '2008-04-08' and srcpart_hour.hour = 11 and 
> srcpart.hr = 11
> {code}
> There should be two DPP sinks, but when the config is set to false, there is 
> only one.





[jira] [Updated] (HIVE-17361) Support LOAD DATA for transactional tables

2017-11-01 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17361:
--
Description: 
LOAD DATA was not supported since ACID was introduced. Need to fill this gap 
between ACID table and regular hive table.

Current Documentation is under [DML 
Operations|https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-DMLOperations]
 and [Loading files into 
tables|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables]:

\\
* Load Data performs very limited validations of the data, in particular it 
uses the input file name which may not be in 0_0 which can break some read 
logic.  (Certainly will for Acid).
* It does not check the schema of the file.  This may be a non issue for Acid 
which requires ORC which is self describing so Schema Evolution may handle this 
seamlessly.  (Assuming Schema is not too different).
* It does check that _InputFormat_S are compatible. 
* Bucketed (and thus sorted) tables don't support Load Data.  Will keep this 
restriction for Acid.
* Load Data supports OVERWRITE clause



  was:
LOAD DATA was not supported since ACID was introduced. Need to fill this gap 
between ACID table and regular hive table.

Current Documentation is under : 


> Support LOAD DATA for transactional tables
> --
>
> Key: HIVE-17361
> URL: https://issues.apache.org/jira/browse/HIVE-17361
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17361.1.patch, HIVE-17361.2.patch, 
> HIVE-17361.3.patch, HIVE-17361.4.patch
>
>
> LOAD DATA was not supported since ACID was introduced. Need to fill this gap 
> between ACID table and regular hive table.
> Current Documentation is under [DML 
> Operations|https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-DMLOperations]
>  and [Loading files into 
> tables|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables]:
> \\
> * Load Data performs very limited validations of the data, in particular it 
> uses the input file name which may not be in 0_0 which can break some 
> read logic.  (Certainly will for Acid).
> * It does not check the schema of the file.  This may be a non issue for Acid 
> which requires ORC which is self describing so Schema Evolution may handle 
> this seamlessly.  (Assuming Schema is not too different).
> * It does check that _InputFormat_S are compatible. 
> * Bucketed (and thus sorted) tables don't support Load Data.  Will keep this 
> restriction for Acid.
> * Load Data supports OVERWRITE clause





[jira] [Updated] (HIVE-17361) Support LOAD DATA for transactional tables

2017-11-01 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17361:
--
Description: 
LOAD DATA was not supported since ACID was introduced. Need to fill this gap 
between ACID table and regular hive table.

Current Documentation is under : 

  was:LOAD DATA was not supported since ACID was introduced. Need to fill this 
gap between ACID table and regular hive table.


> Support LOAD DATA for transactional tables
> --
>
> Key: HIVE-17361
> URL: https://issues.apache.org/jira/browse/HIVE-17361
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Wei Zheng
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17361.1.patch, HIVE-17361.2.patch, 
> HIVE-17361.3.patch, HIVE-17361.4.patch
>
>
> LOAD DATA was not supported since ACID was introduced. Need to fill this gap 
> between ACID table and regular hive table.
> Current Documentation is under : 





[jira] [Commented] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null

2017-11-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234732#comment-16234732
 ] 

Hive QA commented on HIVE-15157:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12895208/HIVE-15157.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11350 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_timestamp] 
(batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=156)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc]
 (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=111)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=206)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=223)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=230)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7588/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7588/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7588/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12895208 - PreCommit-HIVE-Build

> Partition Table With timestamp type on S3 storage --> Error in getting fields 
> from serde.Invalid Field null
> ---
>
> Key: HIVE-15157
> URL: https://issues.apache.org/jira/browse/HIVE-15157
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 2.1.0
> Environment: JDK 1.8 101 
>Reporter: thauvin damien
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>  Labels: timestamp
> Attachments: HIVE-15157.01.patch
>
>
> Hello,
> I get the error above when I try to run:
> hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00');
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field null
> Here is the description of the issue.
> --External Hive table with dynamic partitioning enabled on AWS S3 storage.
> --Partitioned table with a timestamp-type partition column.
> When I run "show partitions table;" everything is fine:
> hive>  show partitions table;
> OK
> tsbucket=2016-10-01 11%3A00%3A00
> tsbucket=2016-10-28 16%3A00%3A00
> And when I run "describe FORMATTED table;" everything is fine.
> Is this a bug?
> The stacktrace of hive.log :
> 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 
> main([])]: exec.DDLTask (DDLTask.java:failed(574)) - 
> org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields 
> from serde.Invalid Field null
> at org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414)
> at org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> 

[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234703#comment-16234703
 ] 

Prasanth Jayachandran commented on HIVE-17935:
--

bq. Is it fair to say that the gains from changing the default are potentially 
large while the losses are comparatively small?
For cases where it is beneficial, this is definitely a huge gain. There are 
many gains with this optimization. The point I was trying to make is that users 
should be aware of the regression for some cases until optimizer makes this 
decision automatically.

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then 
> setting hive.optimize.sort.dynamic.partition=true has been used to solve 
> problems where dynamic partitioning produces (1) too many small files on 
> HDFS, which is bad for the cluster and can increase overhead for future Hive 
> queries over those partitions, and (2) OOM issues in the map tasks because it 
> is trying to simultaneously write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.
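
The memory argument in the description can be illustrated with a toy model. This is a sketch under stated assumptions, not Hive code: the class {{RecordWriterModel}} and its method are hypothetical, and each open "writer" is represented only by its partition key. It counts the peak number of writers open at once when rows arrive unsorted versus sorted by partition key.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RecordWriterModel {
  /**
   * Toy model: returns the peak number of simultaneously open record writers
   * while consuming rows tagged with their partition key.
   * When sortedInput is true, the writer for the previous key is closed as
   * soon as the key changes, because that key cannot appear again.
   */
  public static int peakOpenWriters(List<String> partitionKeys, boolean sortedInput) {
    Set<String> open = new HashSet<>();
    int peak = 0;
    String previous = null;
    for (String key : partitionKeys) {
      if (sortedInput && previous != null && !previous.equals(key)) {
        open.remove(previous); // safe to close: sorted input never revisits a key
      }
      open.add(key);
      peak = Math.max(peak, open.size());
      previous = key;
    }
    return peak;
  }

  public static void main(String[] args) {
    List<String> unsorted = Arrays.asList("ds=3", "ds=1", "ds=2", "ds=1", "ds=3");
    List<String> sorted = Arrays.asList("ds=1", "ds=1", "ds=2", "ds=3", "ds=3");
    System.out.println(peakOpenWriters(unsorted, false)); // 3
    System.out.println(peakOpenWriters(sorted, true));    // 1
  }
}
```

In the unsorted case the peak grows with the number of distinct partitions seen by a task, which is where the OOM risk comes from; in the sorted case it stays at one, at the cost of the extra sort and shuffle discussed in the comments.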





[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234701#comment-16234701
 ] 

Prasanth Jayachandran commented on HIVE-17935:
--

bq. Do you think the possible performance regression for some jobs to be large? 
Unfortunately, not quantifiable. Overhead is essentially sort + shuffle + new 
tasks spin up for reduce tasks. If partition column count is low and data size 
is small, the regression factor will be completely different than the case with 
large data set. 

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then 
> setting hive.optimize.sort.dynamic.partition=true has been used to solve 
> problems where dynamic partitioning produces (1) too many small files on 
> HDFS, which is bad for the cluster and can increase overhead for future Hive 
> queries over those partitions, and (2) OOM issues in the map tasks because it 
> is trying to simultaneously write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.





[jira] [Commented] (HIVE-16890) org.apache.hadoop.hive.serde2.io.HiveVarcharWritable - Adds Superfluous Wrapper

2017-11-01 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234693#comment-16234693
 ] 

Naveen Gangam commented on HIVE-16890:
--

The fix makes sense to me. So +1 for me.

> org.apache.hadoop.hive.serde2.io.HiveVarcharWritable - Adds Superfluous 
> Wrapper
> ---
>
> Key: HIVE-16890
> URL: https://issues.apache.org/jira/browse/HIVE-16890
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
> Attachments: HIVE-16890.1.patch, HIVE-16890.1.patch
>
>
> Class {{org.apache.hadoop.hive.serde2.io.HiveVarcharWritable}} creates a 
> superfluous wrapper and then immediately unwraps it.  Don't bother wrapping 
> in this scenario.
> {code}
>   public void set(HiveVarchar val, int len) {
>     set(val.getValue(), len);
>   }
>   public void set(String val, int maxLength) {
>     value.set(HiveBaseChar.enforceMaxLength(val, maxLength));
>   }
>   public HiveVarchar getHiveVarchar() {
>     return new HiveVarchar(value.toString(), -1);
>   }
>   // Here calls getHiveVarchar() which creates a new HiveVarchar object with a string in it
>   // The object is passed to set(HiveVarchar val, int len)
>   // The string is pulled out
>   public void enforceMaxLength(int maxLength) {
>     // Might be possible to truncate the existing Text value, for now just do something simple.
>     if (value.getLength()>maxLength && getCharacterLength()>maxLength)
>       set(getHiveVarchar(), maxLength);
>   }
> {code}
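
One way the wrapper could be avoided is to pass the string straight to {{set(String, int)}}. The sketch below is an assumption about the shape of such a change, not the attached patch: {{VarcharWritableSketch}} is a hypothetical stand-in that holds a plain {{String}} instead of a Hadoop {{Text}}, and its static {{enforceMaxLength}} is a simplified substitute for {{HiveBaseChar.enforceMaxLength}}.

```java
public class VarcharWritableSketch {
  // Stand-in for the Text field held by the real writable (assumption).
  private String value = "";

  // Simplified substitute for HiveBaseChar.enforceMaxLength: truncates to at
  // most maxLength code points; a non-positive maxLength means "no limit".
  public static String enforceMaxLength(String val, int maxLength) {
    if (val == null) {
      return null;
    }
    if (maxLength > 0 && val.codePointCount(0, val.length()) > maxLength) {
      return val.substring(0, val.offsetByCodePoints(0, maxLength));
    }
    return val;
  }

  public void set(String val, int maxLength) {
    value = enforceMaxLength(val, maxLength);
  }

  public void enforceMaxLength(int maxLength) {
    // Pass the string directly instead of wrapping it in a temporary object
    // that would be unwrapped again immediately.
    if (value.codePointCount(0, value.length()) > maxLength) {
      set(value, maxLength);
    }
  }

  public String get() {
    return value;
  }

  public static void main(String[] args) {
    VarcharWritableSketch w = new VarcharWritableSketch();
    w.set("hello world", -1);
    w.enforceMaxLength(5);
    System.out.println(w.get()); // hello
  }
}
```

The behavior is unchanged compared with going through a temporary object; only the intermediate allocation is dropped.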





[jira] [Assigned] (HIVE-16890) org.apache.hadoop.hive.serde2.io.HiveVarcharWritable - Adds Superfluous Wrapper

2017-11-01 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam reassigned HIVE-16890:


Assignee: BELUGA BEHR  (was: Naveen Gangam)

> org.apache.hadoop.hive.serde2.io.HiveVarcharWritable - Adds Superfluous 
> Wrapper
> ---
>
> Key: HIVE-16890
> URL: https://issues.apache.org/jira/browse/HIVE-16890
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
> Attachments: HIVE-16890.1.patch, HIVE-16890.1.patch
>
>
> Class {{org.apache.hadoop.hive.serde2.io.HiveVarcharWritable}} creates a 
> superfluous wrapper and then immediately unwraps it.  Don't bother wrapping 
> in this scenario.
> {code}
>   public void set(HiveVarchar val, int len) {
>     set(val.getValue(), len);
>   }
>   public void set(String val, int maxLength) {
>     value.set(HiveBaseChar.enforceMaxLength(val, maxLength));
>   }
>   public HiveVarchar getHiveVarchar() {
>     return new HiveVarchar(value.toString(), -1);
>   }
>   // Here calls getHiveVarchar() which creates a new HiveVarchar object with a string in it
>   // The object is passed to set(HiveVarchar val, int len)
>   // The string is pulled out
>   public void enforceMaxLength(int maxLength) {
>     // Might be possible to truncate the existing Text value, for now just do something simple.
>     if (value.getLength()>maxLength && getCharacterLength()>maxLength)
>       set(getHiveVarchar(), maxLength);
>   }
> {code}





[jira] [Updated] (HIVE-16890) org.apache.hadoop.hive.serde2.io.HiveVarcharWritable - Adds Superfluous Wrapper

2017-11-01 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-16890:
-
Attachment: HIVE-16890.1.patch

Re-attaching the same patch to kick off the pre-commits against the latest 
codebase.

> org.apache.hadoop.hive.serde2.io.HiveVarcharWritable - Adds Superfluous 
> Wrapper
> ---
>
> Key: HIVE-16890
> URL: https://issues.apache.org/jira/browse/HIVE-16890
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: BELUGA BEHR
>Assignee: Naveen Gangam
> Attachments: HIVE-16890.1.patch, HIVE-16890.1.patch
>
>
> Class {{org.apache.hadoop.hive.serde2.io.HiveVarcharWritable}} creates a 
> superfluous wrapper and then immediately unwraps it.  Don't bother wrapping 
> in this scenario.
> {code}
>   public void set(HiveVarchar val, int len) {
>     set(val.getValue(), len);
>   }
>   public void set(String val, int maxLength) {
>     value.set(HiveBaseChar.enforceMaxLength(val, maxLength));
>   }
>   public HiveVarchar getHiveVarchar() {
>     return new HiveVarchar(value.toString(), -1);
>   }
>   // Here calls getHiveVarchar() which creates a new HiveVarchar object with a string in it
>   // The object is passed to set(HiveVarchar val, int len)
>   // The string is pulled out
>   public void enforceMaxLength(int maxLength) {
>     // Might be possible to truncate the existing Text value, for now just do something simple.
>     if (value.getLength()>maxLength && getCharacterLength()>maxLength)
>       set(getHiveVarchar(), maxLength);
>   }
> {code}





[jira] [Assigned] (HIVE-16890) org.apache.hadoop.hive.serde2.io.HiveVarcharWritable - Adds Superfluous Wrapper

2017-11-01 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam reassigned HIVE-16890:


Assignee: Naveen Gangam  (was: BELUGA BEHR)

> org.apache.hadoop.hive.serde2.io.HiveVarcharWritable - Adds Superfluous 
> Wrapper
> ---
>
> Key: HIVE-16890
> URL: https://issues.apache.org/jira/browse/HIVE-16890
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: BELUGA BEHR
>Assignee: Naveen Gangam
> Attachments: HIVE-16890.1.patch
>
>
> Class {{org.apache.hadoop.hive.serde2.io.HiveVarcharWritable}} creates a 
> superfluous wrapper and then immediately unwraps it.  Don't bother wrapping 
> in this scenario.
> {code}
>   public void set(HiveVarchar val, int len) {
>     set(val.getValue(), len);
>   }
>   public void set(String val, int maxLength) {
>     value.set(HiveBaseChar.enforceMaxLength(val, maxLength));
>   }
>   public HiveVarchar getHiveVarchar() {
>     return new HiveVarchar(value.toString(), -1);
>   }
>   // Here calls getHiveVarchar() which creates a new HiveVarchar object with a string in it
>   // The object is passed to set(HiveVarchar val, int len)
>   // The string is pulled out
>   public void enforceMaxLength(int maxLength) {
>     // Might be possible to truncate the existing Text value, for now just do something simple.
>     if (value.getLength()>maxLength && getCharacterLength()>maxLength)
>       set(getHiveVarchar(), maxLength);
>   }
> {code}





[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-01 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234652#comment-16234652
 ] 

Andrew Sherman commented on HIVE-17935:
---

Thanks [~prasanth_j] for helpful comments. Do you think the possible 
performance regression for some jobs to be large? Is it fair to say that the 
gains from changing the default are potentially large while the losses are 
comparatively small?

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then 
> setting hive.optimize.sort.dynamic.partition=true has been used to solve 
> problems where dynamic partitioning produces (1) too many small files on 
> HDFS, which is bad for the cluster and can increase overhead for future Hive 
> queries over those partitions, and (2) OOM issues in the map tasks because it 
> is trying to simultaneously write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.





[jira] [Updated] (HIVE-17958) spark_dynamic_partition_pruning.q fails when hive.tez.dynamic.semijoin.reduction is false

2017-11-01 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17958:

Priority: Major  (was: Trivial)

> spark_dynamic_partition_pruning.q fails when 
> hive.tez.dynamic.semijoin.reduction is false
> -
>
> Key: HIVE-17958
> URL: https://issues.apache.org/jira/browse/HIVE-17958
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
>
> Looks like {{RedundantDynamicPruningConditionsRemoval}} causes DPP to be 
> disabled in a few cases (not sure why). When 
> {{hive.tez.dynamic.semijoin.reduction}} is {{true}} (the default), then this 
> rule is disabled so the normal tests don't hit this issue.
> But when I disable {{hive.tez.dynamic.semijoin.reduction}} then the following 
> query no longer fully triggers DPP:
> {code}
> EXPLAIN select count(*) from srcpart join srcpart_date on (srcpart.ds = 
> srcpart_date.ds) join srcpart_hour on (srcpart.hr = srcpart_hour.hr)
> where srcpart_date.`date` = '2008-04-08' and srcpart_hour.hour = 11 and 
> srcpart.hr = 11
> {code}
> There should be two DPP sinks, but when the config is set to false, there is 
> only one.





[jira] [Assigned] (HIVE-17958) spark_dynamic_partition_pruning.q fails when hive.tez.dynamic.semijoin.reduction is false

2017-11-01 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-17958:
---


> spark_dynamic_partition_pruning.q fails when 
> hive.tez.dynamic.semijoin.reduction is false
> -
>
> Key: HIVE-17958
> URL: https://issues.apache.org/jira/browse/HIVE-17958
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Looks like {{RedundantDynamicPruningConditionsRemoval}} causes DPP to be 
> disabled in a few cases (not sure why). When 
> {{hive.tez.dynamic.semijoin.reduction}} is {{true}} (the default), then this 
> rule is disabled so the normal tests don't hit this issue.
> But when I disable {{hive.tez.dynamic.semijoin.reduction}} then the following 
> query no longer fully triggers DPP:
> {code}
> EXPLAIN select count(*) from srcpart join srcpart_date on (srcpart.ds = 
> srcpart_date.ds) join srcpart_hour on (srcpart.hr = srcpart_hour.hr)
> where srcpart_date.`date` = '2008-04-08' and srcpart_hour.hour = 11 and 
> srcpart.hr = 11
> {code}
> There should be two DPP sinks, but when the config is set to false, there is 
> only one.





[jira] [Commented] (HIVE-14069) update curator version to 2.10.0

2017-11-01 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234632#comment-16234632
 ] 

Jason Dere commented on HIVE-14069:
---

Trying to go with the shading approach - shading within each Hive module that 
uses Curator.

One issue I have been running into is MSHADE-148. The maven-shade plugin 
normally tries to generate a new POM file which does not include any of the 
shaded dependencies, which should be a good thing because (1) we don't want the 
conflicting Curator versions from Hadoop and Hive being brought into the 
classpath together, and (2) since the Curator version used by Hive would be 
shaded, there is no need to bring it into the classpath. Unfortunately 
MSHADE-148 is causing some kind of infinite loop when generating this new 
dependency-reduced POM file. We can disable this POM generation step in the 
shade options, but then this means the original POM file for the Hive module is 
being used which specifies the new Curator version as a dependency. Thinking we 
can set the Curator dependencies as optional dependencies in maven so they do 
not get brought into the classpath as transitive dependencies.

> update curator version to 2.10.0 
> -
>
> Key: HIVE-14069
> URL: https://issues.apache.org/jira/browse/HIVE-14069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Metastore
>Reporter: Thejas M Nair
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-14069.1.patch
>
>
> curator-2.10.0 has several bug fixes over current version (2.6.0), updating 
> would help improve stability.





[jira] [Assigned] (HIVE-14069) update curator version to 2.10.0

2017-11-01 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-14069:
-

Assignee: Jason Dere  (was: Thejas M Nair)

> update curator version to 2.10.0 
> -
>
> Key: HIVE-14069
> URL: https://issues.apache.org/jira/browse/HIVE-14069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Metastore
>Reporter: Thejas M Nair
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-14069.1.patch
>
>
> curator-2.10.0 has several bug fixes over current version (2.6.0), updating 
> would help improve stability.





[jira] [Updated] (HIVE-17953) Metrics should move to destination atomically

2017-11-01 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated HIVE-17953:
--
Status: Patch Available  (was: Open)

> Metrics should move to destination atomically
> -
>
> Key: HIVE-17953
> URL: https://issues.apache.org/jira/browse/HIVE-17953
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
> Attachments: HIVE-17953.02.patch
>
>
> HIVE-17563 reimplemented metrics using native NIO interfaces. It assumed 
> that {{Files.move()}} is an atomic operation. It turns out that by default 
> it isn't, unless the {{ATOMIC_MOVE}} option is specified. Otherwise the 
> destination file is unlinked and then the source file is copied.
> This may cause test failures since the file may be temporarily unavailable.
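
The difference described above can be sketched with plain java.nio.file 
calls. This is a minimal illustration, not the code from the attached patch: 
the class name AtomicMetricsWriter and the method publish are hypothetical, 
and the temporary file is created next to the destination on the assumption 
that an atomic move needs both paths on the same file system.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only; not taken from the Hive patch.
public class AtomicMetricsWriter {

    // Writes content to a temp file beside dest, then moves it onto dest.
    public static void publish(Path dest, String content) throws IOException {
        Path dir = dest.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "metrics", ".tmp");
        try {
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            // ATOMIC_MOVE requests a single atomic file system operation.
            // If that is not possible, AtomicMoveNotSupportedException is
            // thrown. Whether an existing target is replaced is
            // implementation specific per the Files.move() javadoc.
            Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up if the move failed.
            Files.deleteIfExists(tmp);
        }
    }
}
```

A reader polling the destination path then sees either the old file or the 
new one, assuming the platform replaces the target on an atomic move.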





[jira] [Updated] (HIVE-17953) Metrics should move to destination atomically

2017-11-01 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated HIVE-17953:
--
Status: Open  (was: Patch Available)

> Metrics should move to destination atomically
> -
>
> Key: HIVE-17953
> URL: https://issues.apache.org/jira/browse/HIVE-17953
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
> Attachments: HIVE-17953.02.patch
>
>
> HIVE-17563 reimplemented metrics using native NIO interfaces. It assumed 
> that {{Files.move()}} is an atomic operation. It turns out that by default 
> it isn't, unless the {{ATOMIC_MOVE}} option is specified. Otherwise the 
> destination file is unlinked and then the source file is copied.
> This may cause test failures since the file may be temporarily unavailable.





[jira] [Updated] (HIVE-17953) Metrics should move to destination atomically

2017-11-01 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated HIVE-17953:
--
Attachment: HIVE-17953.02.patch

> Metrics should move to destination atomically
> -
>
> Key: HIVE-17953
> URL: https://issues.apache.org/jira/browse/HIVE-17953
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
> Attachments: HIVE-17953.02.patch
>
>
> HIVE-17563 reimplemented metrics using native NIO interfaces. It assumed 
> that {{Files.move()}} is an atomic operation. It turns out that by default 
> it isn't, unless the {{ATOMIC_MOVE}} option is specified. Otherwise the 
> destination file is unlinked and then the source file is copied.
> This may cause test failures since the file may be temporarily unavailable.





[jira] [Updated] (HIVE-17953) Metrics should move to destination atomically

2017-11-01 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated HIVE-17953:
--
Attachment: (was: HIVE-17953.01.patch)

> Metrics should move to destination atomically
> -
>
> Key: HIVE-17953
> URL: https://issues.apache.org/jira/browse/HIVE-17953
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
> Attachments: HIVE-17953.02.patch
>
>
> HIVE-17563 reimplemented metrics using native NIO interfaces. It assumed 
> that {{Files.move()}} is an atomic operation. It turns out that by default 
> it isn't, unless the {{ATOMIC_MOVE}} option is specified. Otherwise the 
> destination file is unlinked and then the source file is copied.
> This may cause test failures since the file may be temporarily unavailable.





[jira] [Updated] (HIVE-17908) LLAP External client not correctly handling killTask for pending requests

2017-11-01 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-17908:
--
Attachment: HIVE-17908.4.patch

Again, I don't see this Jira in the pre-commit queue. Uploading again.

> LLAP External client not correctly handling killTask for pending requests
> -
>
> Key: HIVE-17908
> URL: https://issues.apache.org/jira/browse/HIVE-17908
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
>Priority: Major
> Attachments: HIVE-17908.1.patch, HIVE-17908.2.patch, 
> HIVE-17908.3.patch, HIVE-17908.4.patch
>
>
> Hitting "Timed out waiting for heartbeat for task ID" errors with the LLAP 
> external client.
> HIVE-17393 fixed some of these errors; however, the error also occurs because 
> the client is not correctly handling the killTask notification when the 
> request is accepted but still waiting for the first task heartbeat. In this 
> situation the client should retry the request, similar to what the LLAP AM 
> does. The current logic ignores the killTask in this situation, which results 
> in a heartbeat timeout: no heartbeats are sent by LLAP because of the 
> killTask notification.
> {noformat}
> 17/08/09 05:36:02 WARN TaskSetManager: Lost task 10.0 in stage 4.0 (TID 14, cn114-10.l42scl.hortonworks.com, executor 5): java.io.IOException: Received reader event error: Timed out waiting for heartbeat for task ID attempt_7739111832518812959_0005_0_00_10_0
> at org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:178)
> at org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:50)
> at org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:121)
> at org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:68)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:266)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211)
> at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
> at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown Source)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
> at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> at org.apache.spark.scheduler.Task.run(Task.scala:99)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: LlapTaskUmbilicalExternalClient(attempt_7739111832518812959_0005_0_00_10_0): Error while attempting to read chunk length
> at org.apache.hadoop.hive.llap.io.ChunkedInputStream.read(ChunkedInputStream.java:82)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> at java.io.FilterInputStream.read(FilterInputStream.java:83)
> at org.apache.hadoop.hive.llap.LlapBaseRecordReader.hasInput(LlapBaseRecordReader.java:267)
> at org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:142)
> ... 22 more
> Caused by: java.net.SocketException: Socket closed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> {noformat}





[jira] [Commented] (HIVE-15157) Partition Table With timestamp type on S3 storage --> Error in getting fields from serde.Invalid Field null

2017-11-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234598#comment-16234598
 ] 

Prasanth Jayachandran commented on HIVE-15157:
--

Should "show partitions" unescape the characters? describe seems to do so.
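
For reference, a minimal sketch of what unescaping would do to the partition 
names shown in the issue below. It decodes %XX sequences by hand; the class 
and method names are hypothetical and this is not Hive's own implementation.

```java
// Hypothetical helper for illustration; not Hive's path-escaping code.
public class PartitionNameUnescape {

    // Replaces each well-formed %XX sequence with the character it encodes.
    // Malformed sequences (for example a trailing '%') are copied unchanged.
    public static String unescape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    sb.append((char) (hi * 16 + lo));
                    i += 3;
                    continue;
                }
            }
            sb.append(c);
            i++;
        }
        return sb.toString();
    }
}
```

Applied to "tsbucket=2016-10-28 16%3A00%3A00", this yields 
"tsbucket=2016-10-28 16:00:00".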

> Partition Table With timestamp type on S3 storage --> Error in getting fields 
> from serde.Invalid Field null
> ---
>
> Key: HIVE-15157
> URL: https://issues.apache.org/jira/browse/HIVE-15157
> Project: Hive
>  Issue Type: Bug
>  Components: Clients
>Affects Versions: 2.1.0
> Environment: JDK 1.8 101 
>Reporter: thauvin damien
>Assignee: Jesus Camacho Rodriguez
>Priority: Critical
>  Labels: timestamp
> Attachments: HIVE-15157.01.patch
>
>
> Hello,
> I get the error above when I try to run:
> hive> DESCRIBE formatted table partition (tsbucket='2016-10-28 16%3A00%3A00');
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from 
> serde.Invalid Field null
> Here is the description of the issue.
> -- External Hive table with dynamic partitioning enabled, on AWS S3 storage.
> -- Table partitioned on a timestamp-type column.
> When I run "show partitions table;" everything is fine:
> hive>  show partitions table;
> OK
> tsbucket=2016-10-01 11%3A00%3A00
> tsbucket=2016-10-28 16%3A00%3A00
> And when I run "describe FORMATTED table;" everything is fine.
> Is this a bug?
> The stack trace from hive.log:
> 2016-11-08T10:30:20,868 ERROR [ac3e0d48-22c5-4d04-a788-aeb004ea94f3 main([])]: exec.DDLTask (DDLTask.java:failed(574)) - org.apache.hadoop.hive.ql.metadata.HiveException: Error in getting fields from serde.Invalid Field null
> at org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3414)
> at org.apache.hadoop.hive.ql.exec.DDLTask.describeTable(DDLTask.java:3109)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:408)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1858)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1562)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1313)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1084)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
> at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> Caused by: MetaException(message:Invalid Field null)
> at org.apache.hadoop.hive.metastore.MetaStoreUtils.getFieldsFromDeserializer(MetaStoreUtils.java:1336)
> at org.apache.hadoop.hive.ql.metadata.Hive.getFieldsFromDeserializer(Hive.java:3409)
> ... 21 more





[jira] [Updated] (HIVE-17939) Bucket map join not being selected when bucketed tables is missing bucket files

2017-11-01 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-17939:
--
Attachment: HIVE-17939.4.patch

Another attempt to trigger ptest run for the same patch as v2.

> Bucket map join not being selected when bucketed tables is missing bucket 
> files
> ---
>
> Key: HIVE-17939
> URL: https://issues.apache.org/jira/browse/HIVE-17939
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-17939.1.patch, HIVE-17939.2.patch, 
> HIVE-17939.3.patch, HIVE-17939.4.patch
>
>
>  Looks like the following logic kicks in during 
> OpTraitsRulesProcFactory.TableScanRule.checkBucketedTable(), which prevents 
> the table from being considered a proper bucketed table:
> // The number of files for the table should be same as number of
> // buckets.
> if (fileNames.size() != 0 && fileNames.size() != numBuckets) {
>   return false;
> }
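
To make the effect of the quoted condition concrete, here is a small sketch 
that isolates it. Only the if-condition comes from the snippet above; the 
class name, the method name, and the sample file names are hypothetical, and 
the rest of checkBucketedTable() is not reproduced.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical wrapper around the quoted condition, for illustration only.
public class BucketFileCountCheck {

    // Returns false when the table has some files but their count differs
    // from the declared number of buckets, as in the quoted snippet.
    public static boolean passesFileCountCheck(List<String> fileNames, int numBuckets) {
        if (fileNames.size() != 0 && fileNames.size() != numBuckets) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Four declared buckets, but two bucket files are missing.
        List<String> sparse = Arrays.asList("000000_0", "000002_0");
        System.out.println(passesFileCountCheck(sparse, 4));
        // An empty table passes the check.
        System.out.println(passesFileCountCheck(Collections.<String>emptyList(), 4));
    }
}
```

A table whose writers produced no rows for some buckets, and therefore no 
files for them, fails this check even though it was declared as bucketed.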





[jira] [Updated] (HIVE-17552) Enable bucket map join by default

2017-11-01 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-17552:
--
Attachment: HIVE-17552.2.patch

Retrying to trigger ptest.

> Enable bucket map join by default
> -
>
> Key: HIVE-17552
> URL: https://issues.apache.org/jira/browse/HIVE-17552
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-17552.1.patch, HIVE-17552.2.patch
>
>
> Currently bucket map join is disabled by default; however, it is potentially 
> the most optimal join we have. We need to enable it by default.





[jira] [Commented] (HIVE-17952) Fix license headers to avoid dangling javadoc warnings

2017-11-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234575#comment-16234575
 ] 

Prasanth Jayachandran commented on HIVE-17952:
--

Yeah. This is one small nagging thing. It is easy to script this change away, 
but as you said it will touch a lot of files. I am fine with doing this; in 
fact, enforcing it by some means in the future (a validation phase) would be 
ideal.

> Fix license headers to avoid dangling javadoc warnings
> --
>
> Key: HIVE-17952
> URL: https://issues.apache.org/jira/browse/HIVE-17952
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>
> All license headers start with "/**", so they are assumed to be javadocs, and 
> the IDE warns about dangling javadoc on the license headers.





[jira] [Commented] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234570#comment-16234570
 ] 

Prasanth Jayachandran commented on HIVE-17935:
--

The thing to note is that this might cause a performance regression for some 
jobs. Jobs with partition column values on the order of tens will regress, 
since they might otherwise run as map-only jobs. This feature will force a 
reducer stage even for small jobs. In some cases, reducer deduplication can 
bring gains, but where there is an extra reducer and a small partition count 
this will slow things down. This optimization is really beneficial when there 
are lots of partitions, which can cause queries to OOM or create GC pressure. 
In all cases, this will also result in an optimal file structure (concurrent 
writers for ORC can result in too many small stripes per file, which is 
suboptimal). So there are pros and cons to this optimization. Ideally we want 
the optimizer to decide during planning whether to enable this, based on 
column stats from the source table. cc/ [~ashutoshc]

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then 
> setting hive.optimize.sort.dynamic.partition=true has been used to solve 
> problems where dynamic partitioning produces (1) too many small files on 
> HDFS, which is bad for the cluster and can increase overhead for future Hive 
> queries over those partitions, and (2) OOM issues in the map tasks because it 
> is trying to simultaneously write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.





[jira] [Updated] (HIVE-17935) Turn on hive.optimize.sort.dynamic.partition by default

2017-11-01 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-17935:
--
Attachment: HIVE-17935.2.patch

Trying again to see if the tests will run.

> Turn on hive.optimize.sort.dynamic.partition by default
> ---
>
> Key: HIVE-17935
> URL: https://issues.apache.org/jira/browse/HIVE-17935
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
>Priority: Major
> Attachments: HIVE-17935.1.patch, HIVE-17935.2.patch
>
>
> The config option hive.optimize.sort.dynamic.partition is an optimization for 
> Hive’s dynamic partitioning feature. It was originally implemented in 
> [HIVE-6455|https://issues.apache.org/jira/browse/HIVE-6455]. With this 
> optimization, the dynamic partition columns and bucketing columns (in case of 
> bucketed tables) are sorted before being fed to the reducers. Since the 
> partitioning and bucketing columns are sorted, each reducer can keep only one 
> record writer open at any time thereby reducing the memory pressure on the 
> reducers. There were some early problems with this optimization and it was 
> disabled by default in HiveConf in 
> [HIVE-8151|https://issues.apache.org/jira/browse/HIVE-8151]. Since then 
> setting hive.optimize.sort.dynamic.partition=true has been used to solve 
> problems where dynamic partitioning produces (1) too many small files on 
> HDFS, which is bad for the cluster and can increase overhead for future Hive 
> queries over those partitions, and (2) OOM issues in the map tasks because it 
> is trying to simultaneously write to 100 different files. 
> It now seems that the feature is probably mature enough that it can be 
> enabled by default.





[jira] [Commented] (HIVE-16917) HiveServer2 guard rails - Limit concurrent connections from user

2017-11-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234523#comment-16234523
 ] 

Prasanth Jayachandran commented on HIVE-16917:
--

[~asherman] Yes. A single connection can be used to execute queries in a loop, 
which can potentially max out the number of worker threads, leading to 
denial-of-service/rejections for other users. Since we have the session handle 
even when we execute a query, I think it should be easy to extend this (in a 
subsequent or follow-up jira?) to add guardrails limiting queries per 
user/IP address as well. The use case for this jira was to avoid creating too 
many connections (and not closing them), which increases the memory pressure 
on HS2 (long GC pauses) and eventually leads to an unusable state. This 
doesn't clean up stale connections but will at least enforce some limits 
(instead of unlimited connections).
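
As a rough illustration of the kind of guardrail discussed here, the sketch 
below counts open connections per user and rejects new ones past a threshold. 
It is not the code from the attached patch: ConnectionLimiter, tryOpen, and 
close are hypothetical names, and how HS2 actually tracks sessions is not 
shown in this thread.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical per-user connection counter; not HiveServer2 code.
public class ConnectionLimiter {
    private final int maxPerUser;
    private final ConcurrentMap<String, Integer> open = new ConcurrentHashMap<>();

    public ConnectionLimiter(int maxPerUser) {
        this.maxPerUser = maxPerUser;
    }

    // Returns true and records the connection if the user is under the limit.
    public boolean tryOpen(String user) {
        boolean[] accepted = {false};
        // compute() runs atomically per key, so check-and-increment is safe.
        open.compute(user, (u, n) -> {
            int current = (n == null) ? 0 : n;
            if (current < maxPerUser) {
                accepted[0] = true;
                return current + 1;
            }
            return n; // limit reached: leave the count unchanged
        });
        return accepted[0];
    }

    // Releases one connection; removes the entry when the count reaches zero.
    public void close(String user) {
        open.computeIfPresent(user, (u, n) -> n > 1 ? n - 1 : null);
    }
}
```

The same counter could in principle be keyed by IP address or extended to 
count queries, which is the follow-up suggested above.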

> HiveServer2 guard rails - Limit concurrent connections from user
> 
>
> Key: HIVE-16917
> URL: https://issues.apache.org/jira/browse/HIVE-16917
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-16917.1.patch
>
>
> Rogue applications can make HS2 unusable for others by making too many 
> connections at a time.
> HS2 should start rejecting connections from a user after the number of that 
> user's connections has reached a configurable threshold.





[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-11-01 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17458:
--
Attachment: HIVE-17458.15.patch

Patch 15 fixes TestVectorizedOrcAcidRowBatchReader (a test issue).

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, 
> HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, 
> HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, 
> HIVE-17458.08.patch, HIVE-17458.09.patch, HIVE-17458.10.patch, 
> HIVE-17458.11.patch, HIVE-17458.12.patch, HIVE-17458.12.patch, 
> HIVE-17458.13.patch, HIVE-17458.14.patch, HIVE-17458.15.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies()





[jira] [Commented] (HIVE-17956) Retrieve "latest" partition from Hive Metastore

2017-11-01 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234512#comment-16234512
 ] 

Mithun Radhakrishnan commented on HIVE-17956:
-

I'm afraid the sort-order is lexicographical, in the order of the keys 
specified.

One uncomfortable truth about Hive's metastore is that the concept of "latest" 
is alien. The partition key-values are stored internally as strings. Any notion 
of data-types for partition keys is a retrofit. Square pegs and round holes.
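
A small sketch of why string ordering and "latest" can disagree. It is plain 
Java with no metastore calls; the class and method names are hypothetical, 
and the sample values are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; does not call any Hive or HCatalog API.
public class LatestPartition {

    // "Latest" as plain string comparison sees it.
    public static String latestLexicographic(List<String> values) {
        return Collections.max(values);
    }

    // "Latest" when the values are interpreted as numbers.
    public static String latestNumeric(List<String> values) {
        return Collections.max(values, Comparator.comparingLong(Long::parseLong));
    }

    public static void main(String[] args) {
        List<String> hours = Arrays.asList("8", "9", "10");
        System.out.println(latestLexicographic(hours)); // prints 9
        System.out.println(latestNumeric(hours));       // prints 10
    }
}
```

Zero-padded or ISO-style values such as "2017-11-01" happen to sort the same 
way under both orderings, which is why string order often looks sufficient 
until a key is formatted differently.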

> Retrieve "latest" partition from Hive Metastore
> ---
>
> Key: HIVE-17956
> URL: https://issues.apache.org/jira/browse/HIVE-17956
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Micah Whitacre
>
> We are trying to utilize the Hive Metastore for our processing needs, 
> specifically focusing on consuming through the HCatalog APIs.  One use case 
> we have is that we want to consume the "latest" partition.  In researching 
> there are a number of posts[1][2] that talk about using queries through Hive 
> Server2 to find that information.  It would be more ideal if this was a first 
> class API offered from the Hive Metastore without requiring a query to be 
> executed.
> The other option would be to retrieve all of the partitions and sort client 
> side.  There is a concern about the efficiency and memory requirements of 
> this especially without the "iterator" concept implemented from HIVE-7195.
> [1] - 
> https://community.hortonworks.com/questions/85330/how-to-optimize-hive-access-to-the-latest-partitio.html
> [2] - 
> https://stackoverflow.com/questions/36095790/how-to-find-the-most-recent-partition-in-hive-table





[jira] [Updated] (HIVE-17767) Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN

2017-11-01 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17767:
---
Attachment: HIVE-17767.3.patch

> Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
> ---
>
> Key: HIVE-17767
> URL: https://issues.apache.org/jira/browse/HIVE-17767
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17767.1.patch, HIVE-17767.2.patch, 
> HIVE-17767.3.patch
>
>
> Currently such queries are rewritten into a group by + inner join with a 
> value generator, which is inefficient. The value generator consists of a 
> join with the outer query to fetch all correlated values. This value 
> generator could be completely eliminated if such queries were instead 
> rewritten into a LEFT SEMI JOIN.
> Note that to do this, Hive first needs to support LEFT SEMI JOIN with a 
> non-equi condition (HIVE-17766).





[jira] [Updated] (HIVE-17767) Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN

2017-11-01 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17767:
---
Status: Patch Available  (was: Open)

> Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
> ---
>
> Key: HIVE-17767
> URL: https://issues.apache.org/jira/browse/HIVE-17767
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17767.1.patch, HIVE-17767.2.patch, 
> HIVE-17767.3.patch
>
>
> Currently such queries are rewritten into a group by + inner join with a 
> value generator, which is inefficient. The value generator consists of a 
> join with the outer query to fetch all correlated values. This value 
> generator could be completely eliminated if such queries were instead 
> rewritten into a LEFT SEMI JOIN.
> Note that to do this, Hive first needs to support LEFT SEMI JOIN with a 
> non-equi condition (HIVE-17766).





[jira] [Updated] (HIVE-17767) Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN

2017-11-01 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17767:
---
Status: Open  (was: Patch Available)

> Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
> ---
>
> Key: HIVE-17767
> URL: https://issues.apache.org/jira/browse/HIVE-17767
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-17767.1.patch, HIVE-17767.2.patch, 
> HIVE-17767.3.patch
>
>
> Currently such queries are rewritten into a group by + inner join with a 
> value generator, which is inefficient. The value generator consists of a 
> join with the outer query to fetch all correlated values. This value 
> generator could be completely eliminated if such queries were instead 
> rewritten into a LEFT SEMI JOIN.
> Note that to do this, Hive first needs to support LEFT SEMI JOIN with a 
> non-equi condition (HIVE-17766).





[jira] [Commented] (HIVE-15552) unable to coalesce DATE and TIMESTAMP types

2017-11-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234510#comment-16234510
 ] 

Hive QA commented on HIVE-15552:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12895034/HIVE-15552.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 32 failed/errored test(s), 11350 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input8] (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_cond_pushdown_unqual5] (batchId=65)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[named_column_join] (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[num_op_type_conv] (batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_decimal] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_constant_expr] (batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_array] (batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_greatest] (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_if] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_least] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_map] (batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_width_bucket] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_coalesce_2] (batchId=70)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_queries] (batchId=97)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_null_agg] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_coalesce] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_coalesce_2] (batchId=163)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=111)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_null_agg] (batchId=134)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=206)
org.apache.hadoop.hive.ql.exec.TestFunctionRegistry.testCommonClass (batchId=282)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=223)
org.apache.hadoop.hive.ql.udf.generic.TestGenericUDFGreatest.testGreatestMixed (batchId=258)
org.apache.hadoop.hive.ql.udf.generic.TestGenericUDFGreatest.testVoids (batchId=258)
org.apache.hadoop.hive.ql.udf.generic.TestGenericUDFLeast.testLeastTypes (batchId=256)
org.apache.hadoop.hive.ql.udf.generic.TestGenericUDFLeast.testVoids (batchId=256)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7587/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7587/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7587/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 32 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12895034 - PreCommit-HIVE-Build

> unable to coalesce DATE and TIMESTAMP types
> ---
>
> Key: HIVE-15552
> URL: https://issues.apache.org/jira/browse/HIVE-15552
> Project: Hive
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: N Campbell
> Assignee: Jesus Camacho Rodriguez
> Priority: Critical
> Labels: timestamp
> Attachments: HIVE-15552.01.patch, HIVE-15552.patch
>
>
> COALESCE expression does not expect DATE and TIMESTAMP types 
> select tdt.rnum, coalesce(tdt.cdt, cast(tdt.cdt as timestamp)) from 
> certtext.tdt
> Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 
> Argument type mismatch 'cdt': The expressions after COALESCE should all have 
> the same type: "date" is expected but "timestamp" is found
> SQLState:  42000
> ErrorCode: 4
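
The error above comes from a check that requires every COALESCE argument to have the type of the first one. As a rough illustration only (this is not Hive's code), the sketch below contrasts that strict check with a rule that looks for a type all arguments can be widened to. The function names and the `IMPLICIT_WIDENING` table are hypothetical, and treating "date" as widenable to "timestamp" is an assumption made for the example.

```python
# Hypothetical widening table: (from_type, to_type) pairs that are assumed
# to be safe implicit conversions. Not taken from Hive.
IMPLICIT_WIDENING = {
    ("date", "timestamp"),
}


def strict_coalesce_type(arg_types):
    """Mimic the reported behaviour: every argument must match the first."""
    expected = arg_types[0]
    for found in arg_types[1:]:
        if found != expected:
            raise TypeError(
                f'"{expected}" is expected but "{found}" is found')
    return expected


def common_coalesce_type(arg_types):
    """Pick a type that every argument can be widened to, if one exists."""
    for candidate in arg_types:
        if all(t == candidate or (t, candidate) in IMPLICIT_WIDENING
               for t in arg_types):
            return candidate
    raise TypeError(f"no common type for {arg_types}")
```

Under these assumptions, `strict_coalesce_type(["date", "timestamp"])` raises a `TypeError`, while `common_coalesce_type(["date", "timestamp"])` resolves to `"timestamp"`. Until the check is relaxed, casting every argument to the same type explicitly in the query should avoid the mismatch.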



--
This message was sent 

[jira] [Updated] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout

2017-11-01 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-17853:
---
Status: Open  (was: Patch Available)

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting 
> after timeout
> ---
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 3.0.0, 2.4.0, 2.2.1
> Reporter: Mithun Radhakrishnan
> Assignee: Chris Drome
> Priority: Critical
> Attachments: HIVE-17853.01-branch-2.patch, HIVE-17853.01.patch
>
>
> The {{RetryingMetaStoreClient}} is used to automatically reconnect to the 
> Hive metastore, after client timeout, transparently to the user.
> In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating 
> a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find 
> that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further 
> metastore operations will be attempted as the login-user ({{oozie}}), as 
> opposed to the effective user ({{mithun}}).
> We should have a fix for this shortly.
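
The description above can be illustrated with a toy model. This is a minimal sketch, not Hadoop's or Hive's API: `do_as`, `current_user`, `Connection` and both client classes are hypothetical stand-ins, and it assumes one plausible way the context could be lost, namely that the reconnect runs outside the original impersonation block. The second client shows one possible remedy, which is to remember the effective user at creation time and reconnect as that user.

```python
import contextlib
import threading

LOGIN_USER = "oozie"          # hypothetical login (super) user
_ctx = threading.local()      # holds the effective user for the current thread


def current_user():
    """Return the effective user, falling back to the login user."""
    return getattr(_ctx, "user", LOGIN_USER)


@contextlib.contextmanager
def do_as(user):
    """Toy stand-in for an impersonation block: run the body as `user`."""
    previous = getattr(_ctx, "user", None)
    _ctx.user = user
    try:
        yield
    finally:
        if previous is None:
            del _ctx.user
        else:
            _ctx.user = previous


class Connection:
    """A connection records the user who opened it."""
    def __init__(self):
        self.user = current_user()


class NaiveRetryingClient:
    """Reconnects in whatever context the caller happens to be in."""
    def __init__(self):
        self.conn = Connection()

    def reconnect(self):
        self.conn = Connection()


class ContextPreservingClient(NaiveRetryingClient):
    """Captures the effective user once and reconnects as that user."""
    def __init__(self):
        self.owner = current_user()
        super().__init__()

    def reconnect(self):
        with do_as(self.owner):
            super().reconnect()


with do_as("mithun"):
    naive = NaiveRetryingClient()
    fixed = ContextPreservingClient()

# Simulate a timeout-triggered reconnect outside the impersonation block.
naive.reconnect()
fixed.reconnect()
print(naive.conn.user)   # login user: the impersonation context was lost
print(fixed.conn.user)   # effective user: the context was preserved
```

In this model the naive client ends up connected as the login user after the reconnect, which mirrors the symptom reported here; whether the actual fix takes the same shape is not stated in this thread.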



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17853) RetryingMetaStoreClient loses UGI impersonation-context when reconnecting after timeout

2017-11-01 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-17853:
---
Status: Patch Available  (was: Open)

> RetryingMetaStoreClient loses UGI impersonation-context when reconnecting 
> after timeout
> ---
>
> Key: HIVE-17853
> URL: https://issues.apache.org/jira/browse/HIVE-17853
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 3.0.0, 2.4.0, 2.2.1
> Reporter: Mithun Radhakrishnan
> Assignee: Chris Drome
> Priority: Critical
> Attachments: HIVE-17853.01-branch-2.patch, HIVE-17853.01.patch
>
>
> The {{RetryingMetaStoreClient}} is used to automatically reconnect to the 
> Hive metastore, after client timeout, transparently to the user.
> In case of user impersonation (e.g. Oozie super-user {{oozie}} impersonating 
> a Hadoop user {{mithun}}, to run a workflow), in case of timeout, we find 
> that the reconnect causes the {{UGI.doAs()}} context to be lost. Any further 
> metastore operations will be attempted as the login-user ({{oozie}}), as 
> opposed to the effective user ({{mithun}}).
> We should have a fix for this shortly.





[jira] [Commented] (HIVE-17949) itests compile is busted on branch-1.2

2017-11-01 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234507#comment-16234507
 ] 

Mithun Radhakrishnan commented on HIVE-17949:
-

On the bright side, the compile issue is fixed. On the other hand, the tests on 
{{branch-1.2}} are busted. 

I'll check in this compile fix now.

> itests compile is busted on branch-1.2
> --
>
> Key: HIVE-17949
> URL: https://issues.apache.org/jira/browse/HIVE-17949
> Project: Hive
> Issue Type: Bug
> Components: Test
> Affects Versions: 1.2.3
> Reporter: Mithun Radhakrishnan
> Assignee: Mithun Radhakrishnan
> Priority: Major
> Attachments: HIVE-17949.01-branch-1.2.patch
>
>
> {{commit 18ddf46e0a8f092358725fc102235cbe6ba3e24d}} on {{branch-1.2}} was for 
> {{Preparing for 1.2.3 development}}. This should have also included 
> corresponding changes to all the pom-files under {{itests}}. As it stands 
> now, the build fails with the following:
> {noformat}
> [ERROR]   location: class org.apache.hadoop.hive.metastore.api.Role
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java:[512,19]
>  no suitable method found for 
> updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.api.Partition,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] method 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.updatePartitionStatsFast(org.apache.hadoop.hive.metastore.partition.spec.PartitionSpecProxy.PartitionIterator,org.apache.hadoop.hive.metastore.Warehouse,boolean,boolean,org.apache.hadoop.hive.metastore.api.EnvironmentContext)
>  is not applicable
> [ERROR]   (actual and formal argument lists differ in length)
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[181,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreWithEnvironmentContext.java:[190,45]
>  incompatible types: org.apache.hadoop.hive.metastore.api.EnvironmentContext 
> cannot be converted to boolean
> [ERROR] 
> /Users/mithunr/workspace/dev/hive/apache/branch-1.2/itests/hive-unit/src/test/java/org/apache/hadoop/hive/thrift/TestZooKeeperTokenStore.java:[53,26]
>  cannot find symbol
> [ERROR]   symbol:   class MiniZooKeeperCluster
> [ERROR]   location: class 
> org.apache.hadoop.hive.thrift.TestZooKeeperTokenStore
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :hive-it-unit
> {noformat}





[jira] [Updated] (HIVE-17841) implement applying the resource plan

2017-11-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17841:

Attachment: HIVE-17841.07.patch

The same patch again... HiveQA won't run

> implement applying the resource plan
> 
>
> Key: HIVE-17841
> URL: https://issues.apache.org/jira/browse/HIVE-17841
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: HIVE-17841.01.patch, HIVE-17841.02.patch, 
> HIVE-17841.03.patch, HIVE-17841.04.patch, HIVE-17841.05.patch, 
> HIVE-17841.06.patch, HIVE-17841.07.patch, HIVE-17841.patch
>
>






[jira] [Updated] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1

2017-11-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17902:

Attachment: (was: HIVE-17902.02.patch)

> add a notions of default pool and unmanaged mapping part 1
> --
>
> Key: HIVE-17902
> URL: https://issues.apache.org/jira/browse/HIVE-17902
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: HIVE-17902.01.patch, HIVE-17902.02.patch, 
> HIVE-17902.03.patch, HIVE-17902.04.patch, HIVE-17902.patch
>
>
> This is needed to map queries between WM and non-WM execution





[jira] [Updated] (HIVE-17902) add a notions of default pool and unmanaged mapping part 1

2017-11-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17902:

Attachment: HIVE-17902.04.patch

The same patch again

> add a notions of default pool and unmanaged mapping part 1
> --
>
> Key: HIVE-17902
> URL: https://issues.apache.org/jira/browse/HIVE-17902
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: HIVE-17902.01.patch, HIVE-17902.02.patch, 
> HIVE-17902.03.patch, HIVE-17902.04.patch, HIVE-17902.patch
>
>
> This is needed to map queries between WM and non-WM execution




