[jira] [Updated] (HIVE-18208) SMB Join : Fix the unit tests to run SMB Joins.

2017-12-07 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-18208:
--
Attachment: (was: HIVE-18208.2.patch)

> SMB Join : Fix the unit tests to run SMB Joins.
> ---
>
> Key: HIVE-18208
> URL: https://issues.apache.org/jira/browse/HIVE-18208
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-18208.1.patch
>
>
> Most of the SMB Join tests are actually not creating SMB Joins. They need to 
> be fixed so they test the intended join.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18208) SMB Join : Fix the unit tests to run SMB Joins.

2017-12-07 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-18208:
--
Attachment: HIVE-18208.2.patch

Updated the patch based on review comments.

> SMB Join : Fix the unit tests to run SMB Joins.
> ---
>
> Key: HIVE-18208
> URL: https://issues.apache.org/jira/browse/HIVE-18208
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-18208.1.patch, HIVE-18208.2.patch
>
>
> Most of the SMB Join tests are actually not creating SMB Joins. They need to 
> be fixed so they test the intended join.





[jira] [Commented] (HIVE-18206) Merge of RC/ORC files should follow other file formats which use the merge configuration parameter

2017-12-07 Thread Wang Haihua (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283129#comment-16283129
 ] 

Wang Haihua commented on HIVE-18206:


[~prasanth_j] OK, I saw it; it's the same problem. Thanks.

> Merge of RC/ORC files should follow other file formats which use the merge 
> configuration parameter
> 
>
> Key: HIVE-18206
> URL: https://issues.apache.org/jira/browse/HIVE-18206
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 1.2.1, 2.1.1
>Reporter: Wang Haihua
>Assignee: Wang Haihua
> Attachments: HIVE-18206.1.patch, HIVE-18206.2.patch
>
>
> The merge configuration parameters, like {{hive.merge.size.per.task}}, decide the 
> average file size after the merge stage.
> But we found they only work for file formats like {{Textfile/SequenceFile}}; 
> with the {{RC/ORC}} file formats they {{do not work}}.
> For the {{RC/ORC}} file formats, the file size after the merge stage depends 
> on parameters like {{mapreduce.input.fileinputformat.split.maxsize}}.
> It would be better to use {{hive.merge.size.per.task}} to decide the average 
> file size for the RC/ORC file formats as well, which unifies the behavior.
> Root cause: for the RC/ORC file formats, the merge class is {{MergeFileTask}} 
> instead of the {{MapRedTask}} used for Textfile/SequenceFile, and {{MergeFileTask}} 
> does not accept the configuration value in MergeFileWork, so the solution 
> is to pass it into {{MergeFileTask}}.
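The root-cause fix described above can be sketched as follows. This is a minimal, hypothetical illustration of the idea, not Hive's actual code: the merge "work" object carries the target size taken from {{hive.merge.size.per.task}}, and the merge task plans output sizes from that value instead of a split-size setting. All class and method names here are illustrative.

```java
// Hypothetical sketch: the work object carries the configured target size,
// so the task that executes it does not fall back to split-size settings.
public class MergeFileWorkSketch {
    private final long targetFileSize; // from hive.merge.size.per.task

    public MergeFileWorkSketch(long targetFileSize) {
        this.targetFileSize = targetFileSize;
    }

    public long getTargetFileSize() {
        return targetFileSize;
    }

    // How many merged output files a batch of inputs should produce,
    // rounding up so no output exceeds the target size on average.
    public int plannedOutputFiles(long totalInputBytes) {
        return (int) Math.max(1, (totalInputBytes + targetFileSize - 1) / targetFileSize);
    }
}
```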





[jira] [Updated] (HIVE-18206) Merge of RC/ORC files should follow other file formats which use the merge configuration parameter

2017-12-07 Thread Wang Haihua (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wang Haihua updated HIVE-18206:
---
Affects Version/s: (was: 3.0.0)
   (was: 2.2.0)

> Merge of RC/ORC files should follow other file formats which use the merge 
> configuration parameter
> 
>
> Key: HIVE-18206
> URL: https://issues.apache.org/jira/browse/HIVE-18206
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 1.2.1, 2.1.1
>Reporter: Wang Haihua
>Assignee: Wang Haihua
> Attachments: HIVE-18206.1.patch, HIVE-18206.2.patch
>
>
> The merge configuration parameters, like {{hive.merge.size.per.task}}, decide the 
> average file size after the merge stage.
> But we found they only work for file formats like {{Textfile/SequenceFile}}; 
> with the {{RC/ORC}} file formats they {{do not work}}.
> For the {{RC/ORC}} file formats, the file size after the merge stage depends 
> on parameters like {{mapreduce.input.fileinputformat.split.maxsize}}.
> It would be better to use {{hive.merge.size.per.task}} to decide the average 
> file size for the RC/ORC file formats as well, which unifies the behavior.
> Root cause: for the RC/ORC file formats, the merge class is {{MergeFileTask}} 
> instead of the {{MapRedTask}} used for Textfile/SequenceFile, and {{MergeFileTask}} 
> does not accept the configuration value in MergeFileWork, so the solution 
> is to pass it into {{MergeFileTask}}.





[jira] [Commented] (HIVE-18191) Vectorization: Add validation of TableScanOperator (gather statistics) back

2017-12-07 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283075#comment-16283075
 ] 

Sankar Hariappan commented on HIVE-18191:
-

I killed the ptest build for this patch as it had been running for more than 8 hours 
and was blocking other builds. Can you please trigger it again?

> Vectorization: Add validation of TableScanOperator (gather statistics) back
> ---
>
> Key: HIVE-18191
> URL: https://issues.apache.org/jira/browse/HIVE-18191
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-18191.01.patch, HIVE-18191.02.patch
>
>
> HIVE-17433 accidentally removed call to validateTableScanOperator.





[jira] [Commented] (HIVE-18112) show create for view having special char in where clause is not showing properly

2017-12-07 Thread Naresh P R (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283062#comment-16283062
 ] 

Naresh P R commented on HIVE-18112:
---

Sorry for the confusion, [~ashutoshc]. This issue exists in branch-2.2, and I want 
it fixed on branch-2.2.

> show create for view having special char in where clause is not showing 
> properly
> 
>
> Key: HIVE-18112
> URL: https://issues.apache.org/jira/browse/HIVE-18112
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-18112-branch-2.2.patch
>
>
> e.g., 
> CREATE VIEW `v2` AS select `evil_byte1`.`a` from `default`.`EVIL_BYTE1` where 
> `evil_byte1`.`a` = 'abcÖdefÖgh';
> Output:
> ==
> 0: jdbc:hive2://172.26.122.227:1> show create table v2;
> ++--+
> | createtab_stmt  
>|
> ++--+
> | CREATE VIEW `v2` AS select `evil_byte1`.`a` from `default`.`EVIL_BYTE1` 
> where `evil_byte1`.`a` = 'abc�def�gh'  |
> ++--+
> Only the show create output contains invalid characters; the actual source table 
> content is displayed properly in the console.





[jira] [Updated] (HIVE-18112) show create for view having special char in where clause is not showing properly

2017-12-07 Thread Naresh P R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R updated HIVE-18112:
--
Attachment: (was: HIVE-18112.patch)

> show create for view having special char in where clause is not showing 
> properly
> 
>
> Key: HIVE-18112
> URL: https://issues.apache.org/jira/browse/HIVE-18112
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-18112-branch-2.2.patch
>
>
> e.g., 
> CREATE VIEW `v2` AS select `evil_byte1`.`a` from `default`.`EVIL_BYTE1` where 
> `evil_byte1`.`a` = 'abcÖdefÖgh';
> Output:
> ==
> 0: jdbc:hive2://172.26.122.227:1> show create table v2;
> ++--+
> | createtab_stmt  
>|
> ++--+
> | CREATE VIEW `v2` AS select `evil_byte1`.`a` from `default`.`EVIL_BYTE1` 
> where `evil_byte1`.`a` = 'abc�def�gh'  |
> ++--+
> Only the show create output contains invalid characters; the actual source table 
> content is displayed properly in the console.





[jira] [Commented] (HIVE-18252) Limit the size of the object inspector caches

2017-12-07 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283039#comment-16283039
 ] 

Gopal V commented on HIVE-18252:


bq. This would require implementing equals() on the constant object inspectors

Also a hashCode(), which has to incorporate the constant in question.
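The point above can be sketched with a hypothetical constant object inspector; this is an illustrative toy, not Hive's actual inspector hierarchy. Both equals() and hashCode() incorporate the constant value, since a cache keyed on inspectors needs both to agree for lookups to hit.

```java
import java.util.Objects;

// Hypothetical constant object inspector: value-based equals() plus a
// matching hashCode() that incorporates the constant in question.
public class ConstantIntOI {
    private final int value; // the constant this inspector describes

    public ConstantIntOI(int value) { this.value = value; }

    @Override
    public boolean equals(Object o) {
        return o instanceof ConstantIntOI && ((ConstantIntOI) o).value == value;
    }

    @Override
    public int hashCode() {
        // Mix in a type marker and the constant itself, so two inspectors
        // for the same constant hash to the same bucket.
        return Objects.hash("ConstantIntOI", value);
    }
}
```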

> Limit the size of the object inspector caches
> -
>
> Key: HIVE-18252
> URL: https://issues.apache.org/jira/browse/HIVE-18252
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Reporter: Jason Dere
>Assignee: Jason Dere
>
> Was running some tests that had a lot of queries with constant values, and 
> noticed that ObjectInspectorFactory.cachedStandardStructObjectInspector 
> started using up a lot of memory.
> It appears that StructObjectInspector caching does not work properly with 
> constant values. Constant ObjectInspectors are not cached, so each constant 
> expression creates a new constant ObjectInspector. And since object 
> inspectors do not override equals(), object inspector comparison relies on 
> object instance comparison. So even if the values are exactly the same as 
> what is already in the cache, the StructObjectInspector cache lookup would 
> fail, and Hive would create a new object inspector and add it to the cache, 
> creating another entry that would never be used. Plus, there is no max cache 
> size - it's just a map that is allowed to grow as long as values keep getting 
> added to it.
> Some possible solutions I can think of:
> 1. Limit the size of the object inspector caches, rather than growing without 
> bound.
> 2. Try to fix the caching to work with constant values. This would require 
> implementing equals() on the constant object inspectors (which could be slow 
> in nested cases), or else we would have to start caching constant object 
> inspectors, which could be expensive in terms of memory usage. Could be used 
> in combination with (1). By itself this is not a great solution because this 
> still has the unbounded cache growth issue.
> 3. Disable caching in the case of constant object inspectors since this 
> scenario currently doesn't work. This could be used in combination with (1).
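Option 1 above (bounding the cache) can be sketched in a few lines using an access-ordered LinkedHashMap, which the JDK supports for exactly this purpose. This is a minimal sketch under assumed names, not Hive's actual cache implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded LRU cache: once the cache exceeds maxEntries,
// the least-recently-used entry is evicted, so the map cannot grow
// without bound the way the current inspector caches do.
public class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder=true gives LRU ordering
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict once the bound is exceeded
    }
}
```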





[jira] [Assigned] (HIVE-18252) Limit the size of the object inspector caches

2017-12-07 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-18252:
-


> Limit the size of the object inspector caches
> -
>
> Key: HIVE-18252
> URL: https://issues.apache.org/jira/browse/HIVE-18252
> Project: Hive
>  Issue Type: Bug
>  Components: Types
>Reporter: Jason Dere
>Assignee: Jason Dere
>
> Was running some tests that had a lot of queries with constant values, and 
> noticed that ObjectInspectorFactory.cachedStandardStructObjectInspector 
> started using up a lot of memory.
> It appears that StructObjectInspector caching does not work properly with 
> constant values. Constant ObjectInspectors are not cached, so each constant 
> expression creates a new constant ObjectInspector. And since object 
> inspectors do not override equals(), object inspector comparison relies on 
> object instance comparison. So even if the values are exactly the same as 
> what is already in the cache, the StructObjectInspector cache lookup would 
> fail, and Hive would create a new object inspector and add it to the cache, 
> creating another entry that would never be used. Plus, there is no max cache 
> size - it's just a map that is allowed to grow as long as values keep getting 
> added to it.
> Some possible solutions I can think of:
> 1. Limit the size of the object inspector caches, rather than growing without 
> bound.
> 2. Try to fix the caching to work with constant values. This would require 
> implementing equals() on the constant object inspectors (which could be slow 
> in nested cases), or else we would have to start caching constant object 
> inspectors, which could be expensive in terms of memory usage. Could be used 
> in combination with (1). By itself this is not a great solution because this 
> still has the unbounded cache growth issue.
> 3. Disable caching in the case of constant object inspectors since this 
> scenario currently doesn't work. This could be used in combination with (1).





[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2017-12-07 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16283012#comment-16283012
 ] 

liyunzhang commented on HIVE-17486:
---

[~xuefuz]: I have uploaded the [design 
doc|https://docs.google.com/document/d/1f4f0oMhN2vKSTCtXbnd3FBYOV02H4QflX1BbkglnC30/edit?usp=sharing].
 I described the problems I met in the [Problem 
Section|https://docs.google.com/document/d/1f4f0oMhN2vKSTCtXbnd3FBYOV02H4QflX1BbkglnC30/edit#heading=h.d0ptagvbv8k3];
 please review them when you have time, thanks!

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
> Attachments: HIVE-17486.1.patch, explain.28.share.false, 
> explain.28.share.true, scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. The optimization is carried out at the 
> physical level. In Hive on Spark, the result of a spark work is cached if the 
> spark work is used by more than 1 child spark work. Once SharedWorkOptimizer 
> is enabled in the physical plan in HoS, identical table scans are merged into 1 
> table scan, whose result is used by more than 1 child spark work, so the same 
> computation is not repeated thanks to the cache mechanism.





[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2017-12-07 Thread liyunzhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282966#comment-16282966
 ] 

liyunzhang commented on HIVE-17486:
---

[~xuefuz]: OK, will upload the doc soon.

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
> Attachments: HIVE-17486.1.patch, explain.28.share.false, 
> explain.28.share.true, scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. The optimization is carried out at the 
> physical level. In Hive on Spark, the result of a spark work is cached if the 
> spark work is used by more than 1 child spark work. Once SharedWorkOptimizer 
> is enabled in the physical plan in HoS, identical table scans are merged into 1 
> table scan, whose result is used by more than 1 child spark work, so the same 
> computation is not repeated thanks to the cache mechanism.





[jira] [Commented] (HIVE-17486) Enable SharedWorkOptimizer in tez on HOS

2017-12-07 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282964#comment-16282964
 ] 

Xuefu Zhang commented on HIVE-17486:


Hi [~kellyzly], I think this thread is getting a little bit long and the 
problem doesn't seem trivial. Could you please create a doc that describes the 
problem or feature we are addressing and your proposal? That's probably easier 
to communicate. Thanks. 

> Enable SharedWorkOptimizer in tez on HOS
> 
>
> Key: HIVE-17486
> URL: https://issues.apache.org/jira/browse/HIVE-17486
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang
>Assignee: liyunzhang
> Attachments: HIVE-17486.1.patch, explain.28.share.false, 
> explain.28.share.true, scanshare.after.svg, scanshare.before.svg
>
>
> HIVE-16602 implemented shared scans with Tez.
> Given a query plan, the goal is to identify scans on input tables that can be 
> merged so the data is read only once. The optimization is carried out at the 
> physical level. In Hive on Spark, the result of a spark work is cached if the 
> spark work is used by more than 1 child spark work. Once SharedWorkOptimizer 
> is enabled in the physical plan in HoS, identical table scans are merged into 1 
> table scan, whose result is used by more than 1 child spark work, so the same 
> computation is not repeated thanks to the cache mechanism.





[jira] [Commented] (HIVE-16688) Make sure Alter Table to set transaction=true acquires X lock

2017-12-07 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282865#comment-16282865
 ] 

Eugene Koifman commented on HIVE-16688:
---

this applies to converting to an MM table, which copies all existing files to a 
delta_x. Need to make sure a parallel insert does not write data to the partition 
root that would then not be copied

> Make sure Alter Table to set transaction=true acquires X lock
> -
>
> Key: HIVE-16688
> URL: https://issues.apache.org/jira/browse/HIVE-16688
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-16688.01.patch, HIVE-16688.02.patch
>
>
> suppose we have non-acid table with some data
> An insert op starts (long running)  (with hive.txn.strict.locking.mode=false 
> this takes shared lock)
> An alter table runs to add (transactional=true)
> An update is run which will read the list of "original" files and assign IDs 
> on the fly which are written to a delta file.
> The long running insert completes.
> Another update is run which now sees a different set of "original" files and 
> will (most likely) assign different IDs.
> Need to make sure to mutex this
> To clarify: The X lock is acquired for "An alter table runs to add 
> (transactional=true)"





[jira] [Commented] (HIVE-18191) Vectorization: Add validation of TableScanOperator (gather statistics) back

2017-12-07 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282951#comment-16282951
 ] 

Rui Li commented on HIVE-18191:
---

Does the comment need to be updated with the change?
{code}
-  if (row instanceof VectorizedRowBatch) {
+  if (vectorized) {
 // We need to check with 'instanceof' instead of just checking
 // vectorized because the row can be a VectorizedRowBatch when
 // FetchOptimizer kicks in even if the operator pipeline is not
{code}

> Vectorization: Add validation of TableScanOperator (gather statistics) back
> ---
>
> Key: HIVE-18191
> URL: https://issues.apache.org/jira/browse/HIVE-18191
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-18191.01.patch, HIVE-18191.02.patch
>
>
> HIVE-17433 accidentally removed call to validateTableScanOperator.





[jira] [Updated] (HIVE-18124) clean up isAcidTable() API vs isInsertOnlyTable()

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18124:
--
Attachment: HIVE-18124.01.patch

>  clean up isAcidTable() API vs isInsertOnlyTable()
> --
>
> Key: HIVE-18124
> URL: https://issues.apache.org/jira/browse/HIVE-18124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18124.01.patch
>
>
> With the addition of MM tables (_AcidUtils.isInsertOnlyTable(table)_) the 
> methods in AcidUtils and dependent places are very muddled.
> Need to clean it up so that there is an isTransactional(Table) that checks the 
> transactional=true setting, isAcid(Table) to mean full ACID, and 
> isInsertOnly(Table) to mean MM tables.
> This would accurately describe the semantics of the tables.





[jira] [Updated] (HIVE-18111) Fix temp path for Spark DPP sink

2017-12-07 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-18111:
--
Attachment: HIVE-18111.5.patch

Fixed some checkstyle issues; trying again.

> Fix temp path for Spark DPP sink
> 
>
> Key: HIVE-18111
> URL: https://issues.apache.org/jira/browse/HIVE-18111
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-18111.1.patch, HIVE-18111.2.patch, 
> HIVE-18111.3.patch, HIVE-18111.4.patch, HIVE-18111.5.patch
>
>
> Before HIVE-17877, each DPP sink has only one target work. The output path of 
> a DPP work is {{TMP_PATH/targetWorkId/dppWorkId}}. When we do the pruning, 
> each map work reads DPP outputs under {{TMP_PATH/targetWorkId}}.
> After HIVE-17877, each DPP sink can have multiple target works. It's possible 
> that a map work needs to read DPP outputs from multiple 
> {{TMP_PATH/targetWorkId}}. To solve this, I think we can have a DPP output 
> path specific to each query, e.g. {{QUERY_TMP_PATH/dpp_output}}. Each DPP 
> work outputs to {{QUERY_TMP_PATH/dpp_output/dppWorkId}}. And each map work 
> reads from {{QUERY_TMP_PATH/dpp_output}}.
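The proposed query-scoped layout above can be sketched as path-building helpers. This is an illustrative sketch only; the method and variable names are hypothetical, not Hive's actual code.

```java
// Hypothetical helpers for the proposed DPP output layout: every DPP work
// writes under one per-query directory, so a map work reads all DPP
// outputs from a single root regardless of how many DPP sinks feed it.
public class DppPaths {
    // Writer side: where a given DPP work puts its output.
    public static String dppWorkOutput(String queryTmpPath, int dppWorkId) {
        return queryTmpPath + "/dpp_output/" + dppWorkId;
    }

    // Reader side: the single root a map work scans for pruning info.
    public static String dppReadRoot(String queryTmpPath) {
        return queryTmpPath + "/dpp_output";
    }
}
```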





[jira] [Commented] (HIVE-17710) LockManager should only lock Managed tables

2017-12-07 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282822#comment-16282822
 ] 

Alan Gates commented on HIVE-17710:
---

What you say about not being able to lock during compile makes sense. I forgot we 
compile then lock (by necessity), not the other way around.

But I don't follow what you're saying about it not being useful to prevent a 
drop. Isn't that exactly what locks are for? Which case are you talking 
about: views, materialized views, or managed tables? For views you've 
convinced me, but not for the others.

> LockManager should only lock Managed tables
> ---
>
> Key: HIVE-17710
> URL: https://issues.apache.org/jira/browse/HIVE-17710
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17710.01.patch
>
>
> should the LM take locks on External tables?  Out of the box Acid LM is being 
> conservative which can cause throughput issues.
> A better strategy may be to exclude External tables but enable explicit "lock 
> table/partition " command (only on external tables?).





[jira] [Assigned] (HIVE-18251) Loosen restriction for some checks

2017-12-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-18251:
---


> Loosen restriction for some checks
> --
>
> Key: HIVE-18251
> URL: https://issues.apache.org/jira/browse/HIVE-18251
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>






[jira] [Updated] (HIVE-18251) Loosen restriction for some checks

2017-12-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-18251:

Attachment: HIVE-18251.patch

> Loosen restriction for some checks
> --
>
> Key: HIVE-18251
> URL: https://issues.apache.org/jira/browse/HIVE-18251
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-18251.patch
>
>






[jira] [Updated] (HIVE-18251) Loosen restriction for some checks

2017-12-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-18251:

Status: Patch Available  (was: Open)

> Loosen restriction for some checks
> --
>
> Key: HIVE-18251
> URL: https://issues.apache.org/jira/browse/HIVE-18251
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-18251.patch
>
>






[jira] [Updated] (HIVE-18124) clean up isAcidTable() API vs isInsertOnlyTable()

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18124:
--
Status: Patch Available  (was: Open)

>  clean up isAcidTable() API vs isInsertOnlyTable()
> --
>
> Key: HIVE-18124
> URL: https://issues.apache.org/jira/browse/HIVE-18124
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18124.01.patch
>
>
> With the addition of MM tables (_AcidUtils.isInsertOnlyTable(table)_) the 
> methods in AcidUtils and dependent places are very muddled.
> Need to clean it up so that there is an isTransactional(Table) that checks the 
> transactional=true setting, isAcid(Table) to mean full ACID, and 
> isInsertOnly(Table) to mean MM tables.
> This would accurately describe the semantics of the tables.





[jira] [Updated] (HIVE-18203) change the way WM is enabled and allow dropping the last resource plan

2017-12-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18203:

Attachment: HIVE-18203.patch

> change the way WM is enabled and allow dropping the last resource plan
> --
>
> Key: HIVE-18203
> URL: https://issues.apache.org/jira/browse/HIVE-18203
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18203.patch
>
>
> Currently it's impossible to drop the last active resource plan even if WM is 
> disabled. It should be possible to deactivate the last resource plan AND 
> disable WM in the same action. Activating a resource plan should enable WM in 
> this case.
> This should interact with the WM queue config in a sensible manner.





[jira] [Assigned] (HIVE-18230) create plan like plan, and replace plan commands for easy modification

2017-12-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-18230:
---

Assignee: Sergey Shelukhin

> create plan like plan, and replace plan commands for easy modification
> --
>
> Key: HIVE-18230
> URL: https://issues.apache.org/jira/browse/HIVE-18230
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> Given that a plan already on the cluster cannot be altered, it would be 
> helpful to have "create plan like plan" and "replace plan" commands that would 
> make a copy to be modified and then rename+apply the copy in place of an 
> existing plan, renaming the existing active plan with a versioned name or 
> dropping it altogether.





[jira] [Updated] (HIVE-18240) support getClientInfo/setClientInfo in JDBC

2017-12-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18240:

Description: 
These are JDBC APIs that allow the user of the driver to provide client info to 
the server; the list of fields supported by the driver is returned as a 
result set by the getClientInfoProperties API.
I've looked at IBM, Oracle, MS, etc. docs, and ApplicationName seems to be a 
common one; there's also ClientHostname, etc., which we don't need because HS2 
derives them already.
The client will then set these properties via setClientInfo if desired. Whether 
any BI tools of significance desire it, I've no idea.
The properties could be sent to the server on connect (which is what Microsoft seems 
to do, but in the Hive model that's impossible because HiveConnection connects in 
the ctor), on the next query (I don't recall where I've seen this), or 
immediately (which is what I do in this patch).
The getClientInfo API on the driver side seems completely pointless, so I cache 
the clientinfo locally for it.
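The "cache clientinfo locally" behavior can be sketched as below. This is an illustrative toy under assumed names, not the actual HiveConnection code: setClientInfo stores the property locally (a real driver would also forward it to the server), and getClientInfo serves reads from that local cache without a round trip.

```java
import java.util.Properties;

// Hypothetical driver-side clientinfo cache: writes are remembered
// locally so reads never need to go to the server.
public class ClientInfoCache {
    private final Properties clientInfo = new Properties();

    public void setClientInfo(String name, String value) {
        clientInfo.setProperty(name, value);
        // a real driver would also send the property to the server here
    }

    public String getClientInfo(String name) {
        return clientInfo.getProperty(name); // served locally, no round trip
    }
}
```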

> support getClientInfo/setClientInfo in JDBC
> ---
>
> Key: HIVE-18240
> URL: https://issues.apache.org/jira/browse/HIVE-18240
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18240.patch
>
>
> These are JDBC APIs that allow the user of the driver to provide client info 
> to the server; the list of fields supported by the driver is returned as 
> a result set by the getClientInfoProperties API.
> I've looked at IBM, Oracle, MS, etc. docs, and ApplicationName seems to be 
> a common one; there's also ClientHostname, etc., which we don't need because 
> HS2 derives them already.
> The client will then set these properties via setClientInfo if desired. 
> Whether any BI tools of significance desire it, I've no idea.
> The properties could be sent to the server on connect (which is what Microsoft 
> seems to do, but in the Hive model that's impossible because HiveConnection 
> connects in the ctor), on the next query (I don't recall where I've seen 
> this), or immediately (which is what I do in this patch).
> The getClientInfo API on the driver side seems completely pointless, so I 
> cache the clientinfo locally for it.





[jira] [Commented] (HIVE-18240) support getClientInfo/setClientInfo in JDBC

2017-12-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282905#comment-16282905
 ] 

Sergey Shelukhin commented on HIVE-18240:
-

Added the description, I will resubmit this for HiveQA once that is working.

> support getClientInfo/setClientInfo in JDBC
> ---
>
> Key: HIVE-18240
> URL: https://issues.apache.org/jira/browse/HIVE-18240
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18240.patch
>
>
> These are JDBC APIs that allow the user of the driver to provide client info 
> to the server; the list of the fields supported by the driver is returned as 
> a result set by getClientInfoProperties API.
> I've looked at IBM, Oracle, MS etc. docs and it seems like ApplicationName is 
> a common one; there's also ClientHostname, etc. that we don't need because 
> HS2 derives them already.
> The client will then set these properties via setClientInfo if desired. 
> Whether it is desired by any BI tools of significance I've no idea. 
> The properties are sent to the server on connect (which is what Microsoft 
> seems to do, but in Hive model it's impossible because HiveConnection 
> connects in ctor), or on the next query (I don't recall where I've seen 
> this), or immediately (which is what I do in this patch).
> The getClientInfo API on the driver side seems completely pointless, so I 
> cache clientinfo locally for it.





[jira] [Updated] (HIVE-18203) change the way WM is enabled and allow dropping the last resource plan

2017-12-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-18203:

Status: Patch Available  (was: Open)

> change the way WM is enabled and allow dropping the last resource plan
> --
>
> Key: HIVE-18203
> URL: https://issues.apache.org/jira/browse/HIVE-18203
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18203.patch
>
>
> Currently it's impossible to drop the last active resource plan even if WM is 
> disabled. It should be possible to deactivate the last resource plan AND 
> disable WM in the same action. Activating a resource plan should enable WM in 
> this case.
> This should interact with the WM queue config in a sensible manner.





[jira] [Commented] (HIVE-15883) HBase mapped table in Hive insert fail for decimal

2017-12-07 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282821#comment-16282821
 ] 

Aihua Xu commented on HIVE-15883:
-

[~ngangam] I will wait for the test to pass. I remember we need to specify 
hbase.mapreduce.hfileoutputformat.table.name for hbase table now.

> HBase mapped table in Hive insert fail for decimal
> --
>
> Key: HIVE-15883
> URL: https://issues.apache.org/jira/browse/HIVE-15883
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-15883.1.patch, HIVE-15883.patch
>
>
> CREATE TABLE hbase_table (
> id int,
> balance decimal(15,2))
> ROW FORMAT DELIMITED
> COLLECTION ITEMS TERMINATED BY '~'
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> "hbase.columns.mapping"=":key,cf:balance#b");
> insert into hbase_table values (1,1);
> 
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
> ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.RuntimeException: 
> Hive internal error.
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:733)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:97)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
> ... 9 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: 
> java.lang.RuntimeException: Hive internal error.
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:286)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:668)
> ... 15 more
> Caused by: java.lang.RuntimeException: Hive internal error.
> at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitive(LazyUtils.java:328)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:220)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serializeField(HBaseRowSerializer.java:194)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:118)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:282)
> ... 16 more 





[jira] [Comment Edited] (HIVE-15120) Storage based auth: allow option to enforce write checks for external tables

2017-12-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16156688#comment-16156688
 ] 

Lefty Leverenz edited comment on HIVE-15120 at 12/8/17 12:30 AM:
-

In the code, the flag 
hive.metastore.authorization.storage.check.externaltable.drop is true by 
default. 
But in the comments, it says "The flag is set to false by default to maintain 
backward compatibility."
Either the comments/doc or the flag's default value should be modified.

Edit 07/Dec/17:  Just a typo fix (flay -> flag) but also a +1 for fixing the 
parameter description.


was (Author: yuan_zac):
In the code, the flag, 
hive.metastore.authorization.storage.check.externaltable.drop, is true by 
default. 
But In comments, it saids "The flag is set to false by default to maintain 
backward compatibility."
Comments /Doc or the flay default value, should be modified.  

> Storage based auth: allow option to enforce write checks for external tables
> 
>
> Key: HIVE-15120
> URL: https://issues.apache.org/jira/browse/HIVE-15120
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Thejas M Nair
>Assignee: Daniel Dai
>  Labels: TODOC1.3, TODOC2.2
> Fix For: 1.3.0, 2.2.0
>
> Attachments: HIVE-15120.1.patch, HIVE-15120.2.patch, 
> HIVE-15120.3.patch, HIVE-15120.4.patch
>
>
> Under storage based authorization, we don't require write permissions on 
> table directory for external table create/drop.
> This is because external table contents are populated often from outside of 
> hive and are not written into from hive. So write access is not needed. Also, 
> we can't require write permissions to drop a table if we don't require them 
> for creation (users who created them should be able to drop them).
> However, this difference in behavior of external tables is not well 
> documented. So users get surprised to learn that drop table can be done by 
> just any user who has read access to the directory. At that point changing 
> the large number of scripts that use external tables is hard. 
> It would be good to have a user config option to have external tables to be 
> treated same as managed tables.
> The option should be off by default, so that the behavior is backward 
> compatible by default.





[jira] [Commented] (HIVE-18191) Vectorization: Add validation of TableScanOperator (gather statistics) back

2017-12-07 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282165#comment-16282165
 ] 

Teddy Choi commented on HIVE-18191:
---

+1 pending tests.

> Vectorization: Add validation of TableScanOperator (gather statistics) back
> ---
>
> Key: HIVE-18191
> URL: https://issues.apache.org/jira/browse/HIVE-18191
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-18191.01.patch, HIVE-18191.02.patch
>
>
> HIVE-17433 accidentally removed call to validateTableScanOperator.





[jira] [Commented] (HIVE-18191) Vectorization: Add validation of TableScanOperator (gather statistics) back

2017-12-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282805#comment-16282805
 ] 

Hive QA commented on HIVE-18191:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
36s{color} | {color:red} ql: The patch generated 1 new + 484 unchanged - 0 
fixed = 485 total (was 484) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 10s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 36f0d89 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8142/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8142/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorization: Add validation of TableScanOperator (gather statistics) back
> ---
>
> Key: HIVE-18191
> URL: https://issues.apache.org/jira/browse/HIVE-18191
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-18191.01.patch, HIVE-18191.02.patch
>
>
> HIVE-17433 accidentally removed call to validateTableScanOperator.





[jira] [Updated] (HIVE-17710) LockManager should only lock Managed tables

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17710:
--
Status: Patch Available  (was: Open)

> LockManager should only lock Managed tables
> ---
>
> Key: HIVE-17710
> URL: https://issues.apache.org/jira/browse/HIVE-17710
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17710.01.patch
>
>
> should the LM take locks on External tables?  Out of the box Acid LM is being 
> conservative which can cause throughput issues.
> A better strategy may be to exclude External tables but enable explicit "lock 
> table/partition " command (only on external tables?).





[jira] [Updated] (HIVE-17710) LockManager should only lock Managed tables

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17710:
--
Attachment: HIVE-17710.01.patch

> LockManager should only lock Managed tables
> ---
>
> Key: HIVE-17710
> URL: https://issues.apache.org/jira/browse/HIVE-17710
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17710.01.patch
>
>
> should the LM take locks on External tables?  Out of the box Acid LM is being 
> conservative which can cause throughput issues.
> A better strategy may be to exclude External tables but enable explicit "lock 
> table/partition " command (only on external tables?).





[jira] [Comment Edited] (HIVE-12300) deprecate MR in Hive 2.0

2017-12-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15020813#comment-15020813
 ] 

Lefty Leverenz edited comment on HIVE-12300 at 12/8/17 12:17 AM:
-

Doc note:  This needs to be documented prominently in the wiki.  Also, the 
wiki's description of *hive.execution.engine* needs to be updated (without 
removing the old description for earlier versions).

* [Configuration Properties -- hive.execution.engine | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.execution.engine]

I suggest documenting MR deprecation on the wiki's home page and in the two 
requirements sections:

* [Home | https://cwiki.apache.org/confluence/display/Hive/Home]
* [Getting Started -- Requirements | 
https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-Requirements]
* [Installing Hive | 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Installation#AdminManualInstallation-InstallingHive]

There might be other appropriate places for it too.  Ideas, anyone?

Update 7/Dec/17:  *hive.execution.engine* has been revised.  Other wiki pages 
still need to be revised.


was (Author: le...@hortonworks.com):
Doc note:  This needs to be documented prominently in the wiki.  Also, the 
wiki's description of *hive.execution.engine* needs to be updated (without 
removing the old description for earlier versions).

* [Configuration Properties -- hive.execution.engine | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.execution.engine]

I suggest documenting MR deprecation on the wiki's home page and in the two 
requirements sections:

* [Home | https://cwiki.apache.org/confluence/display/Hive/Home]
* [Getting Started -- Requirements | 
https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-Requirements]
* [Installing Hive | 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Installation#AdminManualInstallation-InstallingHive]

There might be other appropriate places for it too.  Ideas, anyone?

> deprecate MR in Hive 2.0
> 
>
> Key: HIVE-12300
> URL: https://issues.apache.org/jira/browse/HIVE-12300
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Configuration, Documentation
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12300.01.patch, HIVE-12300.02.patch, 
> HIVE-12300.patch
>
>
> As suggested in the thread on dev alias





[jira] [Updated] (HIVE-17495) CachedStore: prewarm improvement (avoid multiple sql calls to read partition column stats), refactoring and caching some aggregate stats

2017-12-07 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-17495:

Description: 
1. One sql call to retrieve column stats objects for a db
2. Cache some aggregate stats for speedup

> CachedStore: prewarm improvement (avoid multiple sql calls to read partition 
> column stats), refactoring and caching some aggregate stats
> 
>
> Key: HIVE-17495
> URL: https://issues.apache.org/jira/browse/HIVE-17495
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-17495.1.patch, HIVE-17495.2.patch, 
> HIVE-17495.3.patch, HIVE-17495.4.patch, HIVE-17495.5.patch
>
>
> 1. One sql call to retrieve column stats objects for a db
> 2. Cache some aggregate stats for speedup





[jira] [Updated] (HIVE-17495) CachedStore: prewarm improvement (avoid multiple sql calls to read partition column stats), refactoring and caching some aggregate stats

2017-12-07 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-17495:

Attachment: HIVE-17495.5.patch

Rebased on master

> CachedStore: prewarm improvement (avoid multiple sql calls to read partition 
> column stats), refactoring and caching some aggregate stats
> 
>
> Key: HIVE-17495
> URL: https://issues.apache.org/jira/browse/HIVE-17495
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-17495.1.patch, HIVE-17495.2.patch, 
> HIVE-17495.3.patch, HIVE-17495.4.patch, HIVE-17495.5.patch
>
>






[jira] [Commented] (HIVE-18111) Fix temp path for Spark DPP sink

2017-12-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282761#comment-16282761
 ] 

Hive QA commented on HIVE-18111:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12901061/HIVE-18111.4.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8141/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8141/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8141/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: RuntimeException: Waited over fours for hosts, still have 
only 10 hosts out of an expected 12
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12901061 - PreCommit-HIVE-Build

> Fix temp path for Spark DPP sink
> 
>
> Key: HIVE-18111
> URL: https://issues.apache.org/jira/browse/HIVE-18111
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-18111.1.patch, HIVE-18111.2.patch, 
> HIVE-18111.3.patch, HIVE-18111.4.patch
>
>
> Before HIVE-17877, each DPP sink has only one target work. The output path of 
> a DPP work is {{TMP_PATH/targetWorkId/dppWorkId}}. When we do the pruning, 
> each map work reads DPP outputs under {{TMP_PATH/targetWorkId}}.
> After HIVE-17877, each DPP sink can have multiple target works. It's possible 
> that a map work needs to read DPP outputs from multiple 
> {{TMP_PATH/targetWorkId}}. To solve this, I think we can have a DPP output 
> path specific to each query, e.g. {{QUERY_TMP_PATH/dpp_output}}. Each DPP 
> work outputs to {{QUERY_TMP_PATH/dpp_output/dppWorkId}}. And each map work 
> reads from {{QUERY_TMP_PATH/dpp_output}}.
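The layout change described in the issue can be sketched as follows. This is an illustrative model only: the paths and `query-1` identifier are hypothetical, and the real temp directory comes from Hive's scratch-dir configuration.

```python
from pathlib import PurePosixPath

# Hypothetical query-scoped temp dir; the real path is derived from
# Hive's scratch-dir configuration at runtime.
QUERY_TMP_PATH = PurePosixPath("/tmp/hive/query-1")

# Old layout: output keyed by target work, so a map work fed by several
# DPP sinks must read several sibling directories.
def old_dpp_output(target_work_id: int, dpp_work_id: int) -> PurePosixPath:
    return QUERY_TMP_PATH / str(target_work_id) / str(dpp_work_id)

# Proposed layout: one query-wide dpp_output dir; every map work reads
# the same parent regardless of how many DPP sinks feed it.
def new_dpp_output(dpp_work_id: int) -> PurePosixPath:
    return QUERY_TMP_PATH / "dpp_output" / str(dpp_work_id)

print(old_dpp_output(2, 5))  # /tmp/hive/query-1/2/5
print(new_dpp_output(5))     # /tmp/hive/query-1/dpp_output/5
```

With the proposed layout, a map work only needs to list one directory to find all DPP outputs for the query, no matter how many sinks produced them.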





[jira] [Updated] (HIVE-18247) Use DB auto-increment for indexes

2017-12-07 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated HIVE-18247:
--
Labels: datanucleus perfomance  (was: )

> Use DB auto-increment for indexes
> -
>
> Key: HIVE-18247
> URL: https://issues.apache.org/jira/browse/HIVE-18247
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
>  Labels: datanucleus, perfomance
>
> I initially noticed this problem in Apache Sentry - see SENTRY-1960. Hive has 
> the same issue. DataNucleus uses SEQUENCE table to allocate IDs which 
> requires raw locks on multiple tables during transactions and this creates 
> scalability problems. 
> Instead DN should rely on DB auto-increment mechanisms which are much more 
> scalable.
> See SENTRY-1960 for extra details.
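The tradeoff between the two ID-allocation strategies can be sketched with SQLite as a stand-in (the real metastore runs on MySQL, Oracle, Postgres, etc., with DataNucleus mediating allocation; table and sequence names here are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Strategy 1 (DataNucleus default): a shared SEQUENCE table. Every ID
# allocation is a read-modify-write on one row, which serializes
# concurrent transactions on that row.
cur.execute("CREATE TABLE sequence_table (seq_name TEXT PRIMARY KEY, next_val INTEGER)")
cur.execute("INSERT INTO sequence_table VALUES ('TBLS', 1)")

def next_id(cur, seq_name):
    cur.execute("SELECT next_val FROM sequence_table WHERE seq_name = ?", (seq_name,))
    val = cur.fetchone()[0]
    cur.execute("UPDATE sequence_table SET next_val = next_val + 1 WHERE seq_name = ?",
                (seq_name,))
    return val

seq_ids = [next_id(cur, "TBLS") for _ in range(3)]

# Strategy 2 (what this JIRA proposes): let the database assign IDs via
# auto-increment. No shared counter row is locked by the application.
cur.execute("CREATE TABLE tbls (tbl_id INTEGER PRIMARY KEY AUTOINCREMENT, tbl_name TEXT)")
auto_ids = []
for name in ("t1", "t2", "t3"):
    cur.execute("INSERT INTO tbls (tbl_name) VALUES (?)", (name,))
    auto_ids.append(cur.lastrowid)

print(seq_ids)   # [1, 2, 3]
print(auto_ids)  # [1, 2, 3]
```

Both strategies hand out the same IDs here; the difference shows up under concurrency, where the sequence-table row becomes a contention point while auto-increment scales with the engine's internal counter.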





[jira] [Commented] (HIVE-18240) support getClientInfo/setClientInfo in JDBC

2017-12-07 Thread Alexander Kolbasov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282653#comment-16282653
 ] 

Alexander Kolbasov commented on HIVE-18240:
---

This JIRA is missing a description - can you please provide one?

> support getClientInfo/setClientInfo in JDBC
> ---
>
> Key: HIVE-18240
> URL: https://issues.apache.org/jira/browse/HIVE-18240
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18240.patch
>
>






[jira] [Assigned] (HIVE-18249) Remove Thrift dependency on fb303

2017-12-07 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov reassigned HIVE-18249:
-


> Remove Thrift dependency on fb303
> -
>
> Key: HIVE-18249
> URL: https://issues.apache.org/jira/browse/HIVE-18249
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
>
> Looks like we are not really using fb303 and can remove fb303 dependency.





[jira] [Updated] (HIVE-18249) Remove Thrift dependency on fb303

2017-12-07 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated HIVE-18249:
--
Labels: thrift  (was: )

> Remove Thrift dependency on fb303
> -
>
> Key: HIVE-18249
> URL: https://issues.apache.org/jira/browse/HIVE-18249
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
>  Labels: thrift
>
> Looks like we are not really using fb303 and can remove fb303 dependency.





[jira] [Commented] (HIVE-17710) LockManager should only lock Managed tables

2017-12-07 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282746#comment-16282746
 ] 

Eugene Koifman commented on HIVE-17710:
---

The materialization of the view is versioned.
I don't think we can lock anything during compilation since we have to have a 
query plan to know what to lock.

I'm not sure how much value there is in protecting from a drop.  Suppose there 
is a read op and a concurrent drop op is issued.  Yes, the drop will block, but 
the next attempt to read will still get an error.  In other words, this type of 
admin op, like removing some object, needs to be coordinated - a read lock in 
this case seems of minor help if someone issued a drop by mistake.

> LockManager should only lock Managed tables
> ---
>
> Key: HIVE-17710
> URL: https://issues.apache.org/jira/browse/HIVE-17710
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> should the LM take locks on External tables?  Out of the box Acid LM is being 
> conservative which can cause throughput issues.
> A better strategy may be to exclude External tables but enable explicit "lock 
> table/partition " command (only on external tables?).





[jira] [Commented] (HIVE-17333) Schema changes in HIVE-12274 for Oracle may not work for upgrade

2017-12-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282667#comment-16282667
 ] 

Lefty Leverenz commented on HIVE-17333:
---

[~ngangam], branch-2 is for release 2.4.0 not 2.3.0.

Please change the fix version.

> Schema changes in HIVE-12274 for Oracle may not work for upgrade
> 
>
> Key: HIVE-17333
> URL: https://issues.apache.org/jira/browse/HIVE-17333
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 2.3.0, 3.0.0
>
> Attachments: HIVE-17333.1.patch, HIVE-17333.patch
>
>
> According to 
> https://asktom.oracle.com/pls/asktom/f?p=100:11:0P11_QUESTION_ID:1770086700346491686
>  (reported in HIVE-12274)
> The alter table command to change the column datatype from {{VARCHAR}} to 
> {{CLOB}} may not work. So the correct way to accomplish this is to add a new 
> temp column, copy the value from the current column, drop the current column 
> and rename the new column to old column.





[jira] [Commented] (HIVE-18240) support getClientInfo/setClientInfo in JDBC

2017-12-07 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282720#comment-16282720
 ] 

Vaibhav Gumashta commented on HIVE-18240:
-

+1; it would be good to add a description to the jira.

> support getClientInfo/setClientInfo in JDBC
> ---
>
> Key: HIVE-18240
> URL: https://issues.apache.org/jira/browse/HIVE-18240
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18240.patch
>
>






[jira] [Commented] (HIVE-18250) CBO gets turned off with duplicates in RR error

2017-12-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282725#comment-16282725
 ] 

Ashutosh Chauhan commented on HIVE-18250:
-

The workaround is to not use the same alias as an existing column name.
{code}
explain select t1.a as a1, min(t1.a) as a2 from t1 group by t1.a;
{code}

> CBO gets turned off with duplicates in RR error
> ---
>
> Key: HIVE-18250
> URL: https://issues.apache.org/jira/browse/HIVE-18250
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Affects Versions: 2.0.0, 2.1.0, 2.2.0, 2.3.0
>Reporter: Ashutosh Chauhan
>
> {code}
>  create table t1 (a int);
> explain select t1.a as a1, min(t1.a) as a from t1 group by t1.a;
> {code}
> CBO gets turned off with:
> {code}
> WARN [2e80e34e-dc46-49cf-88bf-2c24c0262d41 main] parse.RowResolver: Found 
> duplicate column alias in RR: null.a => {null, a1, _col0: int} adding null.a 
> => {null, null, _col1: int}
> 2017-12-07T15:27:47,651 ERROR [2e80e34e-dc46-49cf-88bf-2c24c0262d41 main] 
> parse.CalcitePlanner: CBO failed, skipping CBO.
> org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException: Cannot 
> add column to RR: null.a => _col1: int due to duplication, see previous 
> warnings
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genSelectLogicalPlan(CalcitePlanner.java:3985)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:4313)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1392)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1322)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> {code}
> After that non-CBO path completes the query.





[jira] [Updated] (HIVE-18133) Parametrize TestTxnNoBuckets wrt Vectorization

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18133:
--
Priority: Minor  (was: Major)

> Parametrize TestTxnNoBuckets wrt Vectorization
> --
>
> Key: HIVE-18133
> URL: https://issues.apache.org/jira/browse/HIVE-18133
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
> Attachments: HIVE-18133.01.patch
>
>
> it currently runs in Vector mode only
> {noformat}
>   public void setUp() throws Exception {
> setUpInternal();
> hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, true);
>   }
> {noformat}
> would be good to run both modes
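The parametrization the issue asks for can be sketched in miniature. This is a hedged illustration, not the real JUnit harness: `run_test_query` is a hypothetical stand-in for executing HiveQL against a test cluster, and the config key mirrors HIVE_VECTORIZATION_ENABLED.

```python
# Run the same test body with vectorization both off and on, instead of
# pinning the flag to true as the current setUp() does.
def run_test_query(conf):
    # Hypothetical stand-in: a real implementation would execute the
    # query against Hive with `conf` applied and return the rows.
    return [(1, "a"), (2, "b")]

results = {}
for vectorized in (False, True):
    conf = {"hive.vectorized.execution.enabled": vectorized}
    results[vectorized] = run_test_query(conf)

# The point of running both modes: results must not depend on the flag.
assert results[False] == results[True]
print("both modes agree:", results[True])
```

In JUnit the same idea is usually expressed with a `@Parameterized` runner supplying the vectorization flag to the test class constructor.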





[jira] [Commented] (HIVE-17710) LockManager should only lock Managed tables

2017-12-07 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282733#comment-16282733
 ] 

Alan Gates commented on HIVE-17710:
---

It seems reasonable to lock views.  We don't want someone dropping a view while 
a query is using it.  For virtual views this lock could be released after 
compilation is finished.  In the case of materialized views we need to make sure 
that refreshing the view does not happen in the midst of a query reading the 
view (or, if it does, that the view itself is versioned so that the query gets 
the old version and not the new one).

> LockManager should only lock Managed tables
> ---
>
> Key: HIVE-17710
> URL: https://issues.apache.org/jira/browse/HIVE-17710
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Should the LM take locks on External tables?  Out of the box the Acid LM is 
> conservative, which can cause throughput issues.
> A better strategy may be to exclude External tables but enable an explicit 
> "lock table/partition" command (only on external tables?).





[jira] [Updated] (HIVE-18133) Parametrize TestTxnNoBuckets wrt Vectorization

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18133:
--
Status: Patch Available  (was: Open)

> Parametrize TestTxnNoBuckets wrt Vectorization
> --
>
> Key: HIVE-18133
> URL: https://issues.apache.org/jira/browse/HIVE-18133
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Minor
> Attachments: HIVE-18133.01.patch
>
>
> it currently runs in Vector mode only
> {noformat}
>   public void setUp() throws Exception {
> setUpInternal();
> hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, true);
>   }
> {noformat}
> would be good to run both modes





[jira] [Updated] (HIVE-17710) LockManager should only lock Managed tables

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17710:
--
Summary: LockManager should only lock Managed tables  (was: LockManager and 
External tables)

> LockManager should only lock Managed tables
> ---
>
> Key: HIVE-17710
> URL: https://issues.apache.org/jira/browse/HIVE-17710
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Should the LM take locks on External tables?  Out of the box the Acid LM is 
> conservative, which can cause throughput issues.
> A better strategy may be to exclude External tables but enable an explicit 
> "lock table/partition" command (only on external tables?).





[jira] [Updated] (HIVE-17232) "No match found" Compactor finds a bucket file thinking it's a directory

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17232:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

>  "No match found"  Compactor finds a bucket file thinking it's a directory
> --
>
> Key: HIVE-17232
> URL: https://issues.apache.org/jira/browse/HIVE-17232
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17232.01.patch
>
>
> {noformat}
> 2017-08-02T12:38:11,996  WARN [main] compactor.CompactorMR: Found a 
> non-bucket file that we thought matched the bucket pattern! 
> file:/Users/ekoifman/dev/hiv\
> erwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands2-1501702264311/warehouse/acidtblpart/p=1/delta_013_013_/bucket_1
>  Matcher=java\
> .util.regex.Matcher[pattern=^[0-9]{6} region=0,12 lastmatch=]
> 2017-08-02T12:38:11,996  INFO [main] mapreduce.JobSubmitter: Cleaning up the 
> staging area 
> file:/tmp/hadoop/mapred/staging/ekoifman1723152463/.staging/job_lo\
> cal1723152463_0183
> 2017-08-02T12:38:11,997 ERROR [main] compactor.Worker: Caught exception while 
> trying to compact 
> id:1,dbname:default,tableName:ACIDTBLPART,partName:null,stat\
> e:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0.
>   Marking failed to avoid repeated failures, java.lang.IllegalStateException: 
> \
> No match found
> at java.util.regex.Matcher.group(Matcher.java:536)
> at java.util.regex.Matcher.group(Matcher.java:496)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.addFileToMap(CompactorMR.java:577)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.getSplits(CompactorMR.java:549)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:330)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:322)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:198)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:320)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:275)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:166)
> at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.runWorker(TestTxnCommands2.java:1138)
> at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.updateDeletePartitioned(TestTxnCommands2.java:894)
> {noformat}
> the stack trace points to 1st runWorker() in updateDeletePartitioned() though 
> the test run was TestTxnCommands2WithSplitUpdateAndVectorization





[jira] [Commented] (HIVE-17232) "No match found" Compactor finds a bucket file thinking it's a directory

2017-12-07 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282700#comment-16282700
 ] 

Eugene Koifman commented on HIVE-17232:
---

This can happen when a table level compaction via ALTER TABLE is requested for 
a partitioned table.
The error propagation for this was improved in HIVE-17361.  Now the error 
includes the file name it couldn't parse.
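The {{IllegalStateException: No match found}} above comes from java.util.regex.Matcher.group(), which throws when called before a successful find()/matches(). A minimal illustration of the failure mode and the guarded alternative — the bucket pattern {{^[0-9]{6}}} is taken from the log above, but this is not the actual compactor fix:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BucketNameCheck {
    // Bucket-file pattern from the log above: six leading digits.
    static final Pattern BUCKET = Pattern.compile("^[0-9]{6}");

    // Guarded lookup: returns the matched prefix, or null for names such as
    // "bucket_1" that do not match. Calling m.group() without a successful
    // find() first is what raises IllegalStateException: No match found.
    static String bucketPrefix(String fileName) {
        Matcher m = BUCKET.matcher(fileName);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(bucketPrefix("000001_0")); // 000001
        System.out.println(bucketPrefix("bucket_1")); // null
    }
}
```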

>  "No match found"  Compactor finds a bucket file thinking it's a directory
> --
>
> Key: HIVE-17232
> URL: https://issues.apache.org/jira/browse/HIVE-17232
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17232.01.patch
>
>
> {noformat}
> 2017-08-02T12:38:11,996  WARN [main] compactor.CompactorMR: Found a 
> non-bucket file that we thought matched the bucket pattern! 
> file:/Users/ekoifman/dev/hiv\
> erwgit/ql/target/tmp/org.apache.hadoop.hive.ql.TestTxnCommands2-1501702264311/warehouse/acidtblpart/p=1/delta_013_013_/bucket_1
>  Matcher=java\
> .util.regex.Matcher[pattern=^[0-9]{6} region=0,12 lastmatch=]
> 2017-08-02T12:38:11,996  INFO [main] mapreduce.JobSubmitter: Cleaning up the 
> staging area 
> file:/tmp/hadoop/mapred/staging/ekoifman1723152463/.staging/job_lo\
> cal1723152463_0183
> 2017-08-02T12:38:11,997 ERROR [main] compactor.Worker: Caught exception while 
> trying to compact 
> id:1,dbname:default,tableName:ACIDTBLPART,partName:null,stat\
> e:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0.
>   Marking failed to avoid repeated failures, java.lang.IllegalStateException: 
> \
> No match found
> at java.util.regex.Matcher.group(Matcher.java:536)
> at java.util.regex.Matcher.group(Matcher.java:496)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.addFileToMap(CompactorMR.java:577)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR$CompactorInputFormat.getSplits(CompactorMR.java:549)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:330)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:322)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:198)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
> at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1338)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1338)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.launchCompactionJob(CompactorMR.java:320)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:275)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:166)
> at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.runWorker(TestTxnCommands2.java:1138)
> at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.updateDeletePartitioned(TestTxnCommands2.java:894)
> {noformat}
> the stack trace points to 1st runWorker() in updateDeletePartitioned() though 
> the test run was TestTxnCommands2WithSplitUpdateAndVectorization





[jira] [Updated] (HIVE-18133) Parametrize TestTxnNoBuckets wrt Vectorization

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18133:
--
Attachment: HIVE-18133.01.patch

> Parametrize TestTxnNoBuckets wrt Vectorization
> --
>
> Key: HIVE-18133
> URL: https://issues.apache.org/jira/browse/HIVE-18133
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18133.01.patch
>
>
> it currently runs in Vector mode only
> {noformat}
>   public void setUp() throws Exception {
> setUpInternal();
> hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, true);
>   }
> {noformat}
> would be good to run both modes





[jira] [Commented] (HIVE-18206) Merge of RC/ORC file should follow other fileformate which use merge configuration parameter

2017-12-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282681#comment-16282681
 ] 

Prasanth Jayachandran commented on HIVE-18206:
--

[~wanghaihua] sorry totally missed an old jira.. is this related to HIVE-15178? 
Are you seeing this issue in 2.2.x version?

> Merge of RC/ORC file should follow other fileformate which use merge 
> configuration parameter
> 
>
> Key: HIVE-18206
> URL: https://issues.apache.org/jira/browse/HIVE-18206
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 1.2.1, 2.1.1, 2.2.0, 3.0.0
>Reporter: Wang Haihua
>Assignee: Wang Haihua
> Attachments: HIVE-18206.1.patch, HIVE-18206.2.patch
>
>
> Merge configuration parameters, like {{hive.merge.size.per.task}}, decide the 
> average file size after the merge stage.
> But we found that this only works for file formats like 
> {{Textfile/SequenceFile}}. With the {{RC/ORC}} file formats, it {{does not 
> work}}.
> For the {{RC/ORC}} file formats we found that the file size after the merge 
> stage depends on parameters like 
> {{mapreduce.input.fileinputformat.split.maxsize}}.
> It would be better to use {{hive.merge.size.per.task}} to decide the average 
> file size for the RC/ORC file formats as well, which unifies the behavior.
> Root cause: for the RC/ORC file formats, the merge class is {{MergeFileTask}} 
> instead of the {{MapRedTask}} used for Textfile/SequenceFile, and 
> {{MergeFileTask}} simply does not read the configuration value from 
> MergeFileWork, so the solution is to pass it into {{MergeFileTask}}.
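For reference, a hedged sketch of the settings involved (values are placeholders): per this issue, the {{hive.merge.*}} size knobs take effect for Textfile/SequenceFile merges, while the RC/ORC merged-file size instead tracks the input split size.

```sql
-- Merge-stage knobs discussed in this issue (placeholder values):
SET hive.merge.mapfiles=true;                -- merge small files after map-only jobs
SET hive.merge.mapredfiles=true;             -- merge small files after map-reduce jobs
SET hive.merge.size.per.task=256000000;      -- target size of each merged file (bytes)
SET hive.merge.smallfiles.avgsize=16000000;  -- merge only when avg output file is smaller
-- For RC/ORC, the merged file size currently follows the split size instead:
SET mapreduce.input.fileinputformat.split.maxsize=256000000;
```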





[jira] [Comment Edited] (HIVE-17921) Aggregation with struct in LLAP produces wrong result

2017-12-07 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282666#comment-16282666
 ] 

Eugene Koifman edited comment on HIVE-17921 at 12/7/17 11:00 PM:
-

I also have {noformat}select ROW__ID from T group by ROW__ID having count(*) > 
1{noformat}
in TestTxnNoBuckets.testInsertFromUnion() which runs MR - works OK


was (Author: ekoifman):
I also have "select ROW__ID from T group by ROW__ID having count(*) > 1"
in TestTxnNoBuckets.testInsertFromUnion() which runs MR - works OK

> Aggregation with struct in LLAP produces wrong result
> -
>
> Key: HIVE-17921
> URL: https://issues.apache.org/jira/browse/HIVE-17921
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap, Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Priority: Blocker
>
> Consider 
> {noformat}
> select ROW__ID, count(*) from over10k_orc_bucketed group by ROW__ID having 
> count(*) > 1;
> {noformat}
>  in acid_vectorization_original.q (available since HIVE-17458)
> when run using TestMiniLlapCliDriver produces "NULL, N" where N varies from 
> run to run.
> The right answer is an empty result set, as can be seen by running
> {noformat}
> select ROW__ID, * from over10k_orc_bucketed where ROW__ID is null
> {noformat}
> in the same test.
> This is with 
> {noformat}
> set hive.vectorized.execution.enabled=true;
> set hive.vectorized.row.identifier.enabled=true;
> {noformat}
> It fails with TestMiniLlapCliDriver but not TestMiniTezCliDriver.  See 
> acid_vectorization_original_tez.q which has identical query.





[jira] [Commented] (HIVE-17921) Aggregation with struct in LLAP produces wrong result

2017-12-07 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282666#comment-16282666
 ] 

Eugene Koifman commented on HIVE-17921:
---

I also have "select ROW__ID from T group by ROW__ID having count(*) > 1"
in TestTxnNoBuckets.testInsertFromUnion() which runs MR - works OK

> Aggregation with struct in LLAP produces wrong result
> -
>
> Key: HIVE-17921
> URL: https://issues.apache.org/jira/browse/HIVE-17921
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap, Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Priority: Blocker
>
> Consider 
> {noformat}
> select ROW__ID, count(*) from over10k_orc_bucketed group by ROW__ID having 
> count(*) > 1;
> {noformat}
>  in acid_vectorization_original.q (available since HIVE-17458)
> when run using TestMiniLlapCliDriver produces "NULL, N" where N varies from 
> run to run.
> The right answer is an empty result set, as can be seen by running
> {noformat}
> select ROW__ID, * from over10k_orc_bucketed where ROW__ID is null
> {noformat}
> in the same test.
> This is with 
> {noformat}
> set hive.vectorized.execution.enabled=true;
> set hive.vectorized.row.identifier.enabled=true;
> {noformat}
> It fails with TestMiniLlapCliDriver but not TestMiniTezCliDriver.  See 
> acid_vectorization_original_tez.q which has identical query.





[jira] [Updated] (HIVE-18245) clean up acid_vectorization_original.q

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18245:
--
Status: Patch Available  (was: Open)

> clean up acid_vectorization_original.q
> --
>
> Key: HIVE-18245
> URL: https://issues.apache.org/jira/browse/HIVE-18245
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 3.0.0
>
> Attachments: HIVE-18245.01.patch
>
>
> now that HIVE-17923 is fixed, 
> acid_vectorization_original_tez.q/acid_vectorization_original.q can be 
> cleaned up





[jira] [Updated] (HIVE-18246) Replace toString with getExprString in AbstractOperatorDesc::getColumnExprMapForExplain

2017-12-07 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18246:
---
Attachment: HIVE-18246.1.patch

> Replace toString with getExprString in 
> AbstractOperatorDesc::getColumnExprMapForExplain
> ---
>
> Key: HIVE-18246
> URL: https://issues.apache.org/jira/browse/HIVE-18246
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18246.1.patch
>
>
> AbstractOperatorDesc::getColumnExprMapForExplain uses toString on ExprNode to 
> get the string representation of an expr. getExprString is better suited here, 
> since each ExprNode class has a suitable implementation of it.





[jira] [Updated] (HIVE-18245) clean up acid_vectorization_original.q

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18245:
--
Attachment: HIVE-18245.01.patch

removed unnecessary "cluster by" from insert statements

> clean up acid_vectorization_original.q
> --
>
> Key: HIVE-18245
> URL: https://issues.apache.org/jira/browse/HIVE-18245
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 3.0.0
>
> Attachments: HIVE-18245.01.patch
>
>
> now that HIVE-17923 is fixed, 
> acid_vectorization_original_tez.q/acid_vectorization_original.q can be 
> cleaned up





[jira] [Assigned] (HIVE-18247) Use DB auto-increment for indexes

2017-12-07 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov reassigned HIVE-18247:
-

Assignee: Alexander Kolbasov

> Use DB auto-increment for indexes
> -
>
> Key: HIVE-18247
> URL: https://issues.apache.org/jira/browse/HIVE-18247
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Metastore
>Affects Versions: 3.0.0
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
>
> I initially noticed this problem in Apache Sentry - see SENTRY-1960. Hive has 
> the same issue. DataNucleus uses a SEQUENCE table to allocate IDs, which 
> requires row locks on multiple tables during transactions, and this creates 
> scalability problems. 
> Instead DN should rely on DB auto-increment mechanisms, which are much more 
> scalable.
> See SENTRY-1960 for extra details.
> See SENTRY-1960 for extra details.
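The two ID-allocation schemes can be sketched in DDL (MySQL-flavored; {{SEQUENCE_TABLE}} mirrors the metastore schema, but the {{TBLS}} shape below is purely illustrative): the DataNucleus value-generation table funnels every allocation through a shared, locked counter row, while a native auto-increment column lets the database hand out IDs itself.

```sql
-- DataNucleus "table" value strategy: every insert into any managed table
-- also updates (and row-locks) the matching counter row here.
CREATE TABLE SEQUENCE_TABLE (
  SEQUENCE_NAME VARCHAR(255) NOT NULL PRIMARY KEY,
  NEXT_VAL BIGINT NOT NULL
);

-- DB-native alternative: the engine assigns TBL_ID, no shared counter row.
CREATE TABLE TBLS (
  TBL_ID BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  TBL_NAME VARCHAR(256)
);
```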





[jira] [Assigned] (HIVE-18248) Clean up parameters

2017-12-07 Thread Janaki Lahorani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani reassigned HIVE-18248:
--


> Clean up parameters
> ---
>
> Key: HIVE-18248
> URL: https://issues.apache.org/jira/browse/HIVE-18248
> Project: Hive
>  Issue Type: Bug
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
>
> Clean up of parameters that need not change at run time.





[jira] [Commented] (HIVE-17921) Aggregation with struct in LLAP produces wrong result

2017-12-07 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282645#comment-16282645
 ] 

Gopal V commented on HIVE-17921:


This might be more related to GROUP BY  instead of the specifics of 
ROW__ID ?

> Aggregation with struct in LLAP produces wrong result
> -
>
> Key: HIVE-17921
> URL: https://issues.apache.org/jira/browse/HIVE-17921
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap, Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Priority: Blocker
>
> Consider 
> {noformat}
> select ROW__ID, count(*) from over10k_orc_bucketed group by ROW__ID having 
> count(*) > 1;
> {noformat}
>  in acid_vectorization_original.q (available since HIVE-17458)
> when run using TestMiniLlapCliDriver produces "NULL, N" where N varies from 
> run to run.
> The right answer is an empty result set, as can be seen by running
> {noformat}
> select ROW__ID, * from over10k_orc_bucketed where ROW__ID is null
> {noformat}
> in the same test.
> This is with 
> {noformat}
> set hive.vectorized.execution.enabled=true;
> set hive.vectorized.row.identifier.enabled=true;
> {noformat}
> It fails with TestMiniLlapCliDriver but not TestMiniTezCliDriver.  See 
> acid_vectorization_original_tez.q which has identical query.





[jira] [Updated] (HIVE-18246) Replace toString with getExprString in AbstractOperatorDesc::getColumnExprMapForExplain

2017-12-07 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-18246:
---
Status: Patch Available  (was: Open)

> Replace toString with getExprString in 
> AbstractOperatorDesc::getColumnExprMapForExplain
> ---
>
> Key: HIVE-18246
> URL: https://issues.apache.org/jira/browse/HIVE-18246
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-18246.1.patch
>
>
> AbstractOperatorDesc::getColumnExprMapForExplain uses toString on ExprNode to 
> get the string representation of an expr. getExprString is better suited here, 
> since each ExprNode class has a suitable implementation of it.





[jira] [Assigned] (HIVE-18246) Replace toString with getExprString in AbstractOperatorDesc::getColumnExprMapForExplain

2017-12-07 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-18246:
--


> Replace toString with getExprString in 
> AbstractOperatorDesc::getColumnExprMapForExplain
> ---
>
> Key: HIVE-18246
> URL: https://issues.apache.org/jira/browse/HIVE-18246
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>
> AbstractOperatorDesc::getColumnExprMapForExplain uses toString on ExprNode to 
> get the string representation of an expr. getExprString is better suited here, 
> since each ExprNode class has a suitable implementation of it.





[jira] [Commented] (HIVE-17921) Aggregation with struct in LLAP produces wrong result

2017-12-07 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282631#comment-16282631
 ] 

Eugene Koifman commented on HIVE-17921:
---

FYI, [~teddy.choi], [~mmccline], [~gopalv]

> Aggregation with struct in LLAP produces wrong result
> -
>
> Key: HIVE-17921
> URL: https://issues.apache.org/jira/browse/HIVE-17921
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap, Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Priority: Blocker
>
> Consider 
> {noformat}
> select ROW__ID, count(*) from over10k_orc_bucketed group by ROW__ID having 
> count(*) > 1;
> {noformat}
>  in acid_vectorization_original.q (available since HIVE-17458)
> when run using TestMiniLlapCliDriver produces "NULL, N" where N varies from 
> run to run.
> The right answer is an empty result set, as can be seen by running
> {noformat}
> select ROW__ID, * from over10k_orc_bucketed where ROW__ID is null
> {noformat}
> in the same test.
> This is with 
> {noformat}
> set hive.vectorized.execution.enabled=true;
> set hive.vectorized.row.identifier.enabled=true;
> {noformat}
> It fails with TestMiniLlapCliDriver but not TestMiniTezCliDriver.  See 
> acid_vectorization_original_tez.q which has identical query.





[jira] [Updated] (HIVE-17921) Aggregation with struct in LLAP produces wrong result

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17921:
--
Priority: Blocker  (was: Major)

> Aggregation with struct in LLAP produces wrong result
> -
>
> Key: HIVE-17921
> URL: https://issues.apache.org/jira/browse/HIVE-17921
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap, Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Priority: Blocker
>
> Consider 
> {noformat}
> select ROW__ID, count(*) from over10k_orc_bucketed group by ROW__ID having 
> count(*) > 1;
> {noformat}
>  in acid_vectorization_original.q (available since HIVE-17458)
> when run using TestMiniLlapCliDriver produces "NULL, N" where N varies from 
> run to run.
> The right answer is an empty result set, as can be seen by running
> {noformat}
> select ROW__ID, * from over10k_orc_bucketed where ROW__ID is null
> {noformat}
> in the same test.
> This is with 
> {noformat}
> set hive.vectorized.execution.enabled=true;
> set hive.vectorized.row.identifier.enabled=true;
> {noformat}
> It fails with TestMiniLlapCliDriver but not TestMiniTezCliDriver.  See 
> acid_vectorization_original_tez.q which has identical query.





[jira] [Updated] (HIVE-18245) clean up acid_vectorization_original.q

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18245:
--
Summary: clean up acid_vectorization_original.q  (was: clean up )

> clean up acid_vectorization_original.q
> --
>
> Key: HIVE-18245
> URL: https://issues.apache.org/jira/browse/HIVE-18245
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 3.0.0
>
>
> now that HIVE-17923 is fixed, 
> acid_vectorization_original_tez.q/acid_vectorization_original.q can be 
> cleaned up





[jira] [Assigned] (HIVE-18245) clean up

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-18245:
-


> clean up 
> -
>
> Key: HIVE-18245
> URL: https://issues.apache.org/jira/browse/HIVE-18245
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> now that HIVE-17923 is fixed, 
> acid_vectorization_original_tez.q/acid_vectorization_original.q can be 
> cleaned up





[jira] [Commented] (HIVE-17002) decimal (binary) is not working when creating external table for hbase

2017-12-07 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282576#comment-16282576
 ] 

Naveen Gangam commented on HIVE-17002:
--

[~arturt] Could you try the patch that I just attached to HIVE-15883 to see if 
it resolves this issue? Thanks

> decimal (binary) is not working when creating external table for hbase
> --
>
> Key: HIVE-17002
> URL: https://issues.apache.org/jira/browse/HIVE-17002
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
> Environment: HBase 1.2.0, Hive 2.1.1
>Reporter: Artur Tamazian
>Assignee: Naveen Gangam
>
> I have a table in Hbase which has a column stored using 
> Bytes.toBytes((BigDecimal) value). Hbase version is 1.2.0
> I'm creating an external table in hive to access it like this:
> {noformat}
> create external table `Users`(key int, ..., `example_column` decimal) 
> stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> with serdeproperties ("hbase.columns.mapping" = ":key, 
> db:example_column") 
> tblproperties("hbase.table.name" = 
> "Users","hbase.table.default.storage.type" = "binary");
> {noformat}
> Table is created without errors. After that I try running "select * from 
> users;" and see this error:
> {noformat}
> org.apache.hive.service.cli.HiveSQLException:java.io.IOException: 
> java.lang.RuntimeException: java.lang.RuntimeException: Hive Internal Error: 
> no LazyObject for 

[jira] [Commented] (HIVE-17481) LLAP workload management (umbrella)

2017-12-07 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282573#comment-16282573
 ] 

Prasanth Jayachandran commented on HIVE-17481:
--

[~thai.bui] With workload management, you should be able to de-prioritize a query 
that is hogging all the resources. You can define a resource plan that includes 
different pools, each with its own cluster allocation fraction, and move (or kill) 
queries based on conditions (elapsed time, bytes read, task parallelism, etc.) to 
a pool with lower priority. Below are some references you can use to try it out, 
but note that some of the features are still under development and evolving, so 
expect some rough edges:
https://github.com/apache/hive/blob/master/ql/src/test/queries/clientpositive/resourceplan.q
https://github.com/apache/hive/blob/master/itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestTriggersMoveWorkloadManager.java
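As a rough illustration, a resource plan with a lower-priority pool and a move 
trigger might look like the following. This is a hedged sketch modeled on the 
resourceplan.q test above; the plan, pool, and trigger names are made up, and the 
exact DDL syntax (pool paths, trigger conditions, time units) is still evolving 
and may differ in your build.
{noformat}
-- Hypothetical plan: a small "slow" pool for de-prioritized queries
CREATE RESOURCE PLAN daytime;
CREATE POOL daytime.slow WITH ALLOC_FRACTION=0.2, QUERY_PARALLELISM=2;
-- Move queries running longer than 60s out of the default pool
CREATE TRIGGER daytime.long_running WHEN ELAPSED_TIME > 60000 DO MOVE TO slow;
ALTER POOL daytime.default ADD TRIGGER long_running;
ALTER RESOURCE PLAN daytime ENABLE ACTIVATE;
{noformat}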

> LLAP workload management (umbrella)
> ---
>
> Key: HIVE-17481
> URL: https://issues.apache.org/jira/browse/HIVE-17481
> Project: Hive
>  Issue Type: New Feature
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: Workload management design doc.pdf
>
>
> This effort is intended to improve various aspects of cluster sharing for 
> LLAP. Some of these are applicable to non-LLAP queries and may later be 
> extended to all queries. Administrators will be able to specify and apply 
> policies for workload management ("resource plans") that apply to the entire 
> cluster, with only one resource plan being active at a time. The policies 
> will be created and modified using new Hive DDL statements. 
> The policies will cover:
> * Dividing the cluster into a set of (optionally, nested) query pools that 
> are each allocated a fraction of the cluster, a set query parallelism, 
> resource sharing policy between queries, and potentially others like 
> priority, etc.
> * Mapping the incoming queries into pools based on the query user, groups, 
> explicit configuration, etc.
> * Specifying rules that perform actions on queries based on counter values 
> (e.g. killing or moving queries).
> One would also be able to switch policies on a live cluster without (usually) 
> affecting running queries, including e.g. to change policies for daytime and 
> nighttime usage patterns, and other similar scenarios. The switches would be 
> safe and atomic; versioning may eventually be supported.
> Some implementation details:
> * WM will only be supported in HS2 (for obvious reasons).
> * All LLAP query AMs will run in "interactive" YARN queue and will be 
> fungible between Hive pools.
> * We will use the concept of "guaranteed tasks" (also known as ducks) to 
> enforce cluster allocation without a central scheduler and without 
> compromising throughput. Guaranteed tasks preempt other (speculative) tasks 
> and are distributed from HS2 to AMs, and from AMs to tasks, in accordance 
> with percentage allocations in the policy. Each "duck" corresponds to a CPU 
> resource on the cluster. The implementation will be isolated so as to allow 
> different ones later.
> * In future, we may consider improved task placement and late binding, 
> similar to the ones described in Sparrow paper, to work around potential 
> hotspots/etc. that are not avoided with the decentralized scheme.
> * Only one HS2 will initially be supported to avoid split-brain workload 
> management. We will also implement (in a tangential set of work items) 
> active-passive HS2 recovery. Eventually, we intend to switch to full 
> active-active HS2 configuration with shared WM and Tez session pool (unlike 
> the current case with 2 separate session pools). 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17002) decimal (binary) is not working when creating external table for hbase

2017-12-07 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam reassigned HIVE-17002:


Assignee: Naveen Gangam

> decimal (binary) is not working when creating external table for hbase
> --
>
> Key: HIVE-17002
> URL: https://issues.apache.org/jira/browse/HIVE-17002
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
> Environment: HBase 1.2.0, Hive 2.1.1
>Reporter: Artur Tamazian
>Assignee: Naveen Gangam
>
> I have a table in Hbase which has a column stored using 
> Bytes.toBytes((BigDecimal) value). Hbase version is 1.2.0
> I'm creating an external table in hive to access it like this:
> {noformat}
> create external table `Users`(key int, ..., `example_column` decimal) 
> stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> with serdeproperties ("hbase.columns.mapping" = ":key, 
> db:example_column") 
> tblproperties("hbase.table.name" = 
> "Users","hbase.table.default.storage.type" = "binary");
> {noformat}
> Table is created without errors. After that I try running "select * from 
> users;" and see this error:
> {noformat}
> org.apache.hive.service.cli.HiveSQLException:java.io.IOException: 
> java.lang.RuntimeException: java.lang.RuntimeException: Hive Internal Error: 
> no LazyObject for 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyHiveDecimalObjectInspector@1f18cebb:25:24
>   
>
> org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:484
>   
>
> org.apache.hive.service.cli.operation.OperationManager:getOperationNextRowSet:OperationManager.java:308
>   
>
> org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:847
>   
>sun.reflect.GeneratedMethodAccessor11:invoke::-1  
>
> sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43
>   
>java.lang.reflect.Method:invoke:Method.java:498  
>
> org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78
>   
>
> org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36
>   
>
> org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63
>   
>java.security.AccessController:doPrivileged:AccessController.java:-2  
>javax.security.auth.Subject:doAs:Subject.java:422  
>
> org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1698
>   
>
> org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59
>   
>com.sun.proxy.$Proxy33:fetchResults::-1  
>org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:504  
>
> org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:698
>   
>
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1717
>   
>
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1702
>   
>org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39  
>org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39  
>
> org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56
>   
>
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286
>   
>
> java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1142
>   
>
> java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:617
>   
>java.lang.Thread:run:Thread.java:748  
>*java.io.IOException:java.lang.RuntimeException: 
> java.lang.RuntimeException: Hive Internal Error: no LazyObject for 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyHiveDecimalObjectInspector@1f18cebb:27:2
>   
>org.apache.hadoop.hive.ql.exec.FetchTask:fetch:FetchTask.java:164  
>org.apache.hadoop.hive.ql.Driver:getResults:Driver.java:2098  
>
> org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:479
>   
>*java.lang.RuntimeException:java.lang.RuntimeException: Hive Internal 
> Error: no LazyObject for 
> org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyHiveDecimalObjectInspector@1f18cebb:43:16
>   
>
> org.apache.hadoop.hive.serde2.lazy.LazyStruct:initLazyFields:LazyStruct.java:172
>   
>org.apache.hadoop.hive.hbase.LazyHBaseRow:initFields:LazyHBaseRow.java:122 
>  
>org.apache.hadoop.hive.hbase.LazyHBaseRow:getField:LazyHBaseRow.java:116  
>
> org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector:getStructFieldData:LazySimpleStructObjectInspector.java:128
>   
>
> 

[jira] [Updated] (HIVE-15883) HBase mapped table in Hive insert fail for decimal

2017-12-07 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-15883:
-
Status: Patch Available  (was: Open)

[~aihuaxu] [~ashutoshc] Can you please re-review the patch? I just added a test 
case. No other code changes. Thanks

> HBase mapped table in Hive insert fail for decimal
> --
>
> Key: HIVE-15883
> URL: https://issues.apache.org/jira/browse/HIVE-15883
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-15883.1.patch, HIVE-15883.patch
>
>
> CREATE TABLE hbase_table (
> id int,
> balance decimal(15,2))
> ROW FORMAT DELIMITED
> COLLECTION ITEMS TERMINATED BY '~'
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> "hbase.columns.mapping"=":key,cf:balance#b");
> insert into hbase_table values (1,1);
> 
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
> ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.RuntimeException: 
> Hive internal error.
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:733)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:97)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
> ... 9 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: 
> java.lang.RuntimeException: Hive internal error.
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:286)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:668)
> ... 15 more
> Caused by: java.lang.RuntimeException: Hive internal error.
> at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitive(LazyUtils.java:328)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:220)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serializeField(HBaseRowSerializer.java:194)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:118)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:282)
> ... 16 more 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15883) HBase mapped table in Hive insert fail for decimal

2017-12-07 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-15883:
-
Attachment: HIVE-15883.1.patch

Attaching a new patch: a) rebased to the latest source; b) added a qtest for the 
HBase CLI driver.

> HBase mapped table in Hive insert fail for decimal
> --
>
> Key: HIVE-15883
> URL: https://issues.apache.org/jira/browse/HIVE-15883
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-15883.1.patch, HIVE-15883.patch
>
>
> CREATE TABLE hbase_table (
> id int,
> balance decimal(15,2))
> ROW FORMAT DELIMITED
> COLLECTION ITEMS TERMINATED BY '~'
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> "hbase.columns.mapping"=":key,cf:balance#b");
> insert into hbase_table values (1,1);
> 
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
> ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.RuntimeException: 
> Hive internal error.
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:733)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:97)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
> ... 9 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: 
> java.lang.RuntimeException: Hive internal error.
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:286)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:668)
> ... 15 more
> Caused by: java.lang.RuntimeException: Hive internal error.
> at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitive(LazyUtils.java:328)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:220)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serializeField(HBaseRowSerializer.java:194)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:118)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:282)
> ... 16 more 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-15883) HBase mapped table in Hive insert fail for decimal

2017-12-07 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-15883:
-
Status: Open  (was: Patch Available)

Will upload a new one today.

> HBase mapped table in Hive insert fail for decimal
> --
>
> Key: HIVE-15883
> URL: https://issues.apache.org/jira/browse/HIVE-15883
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-15883.patch
>
>
> CREATE TABLE hbase_table (
> id int,
> balance decimal(15,2))
> ROW FORMAT DELIMITED
> COLLECTION ITEMS TERMINATED BY '~'
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> "hbase.columns.mapping"=":key,cf:balance#b");
> insert into hbase_table values (1,1);
> 
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1783)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"tmp_values_col1":"1","tmp_values_col2":"1"}
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
> ... 8 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.serde2.SerDeException: java.lang.RuntimeException: 
> Hive internal error.
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:733)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:97)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
> ... 9 more
> Caused by: org.apache.hadoop.hive.serde2.SerDeException: 
> java.lang.RuntimeException: Hive internal error.
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:286)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:668)
> ... 15 more
> Caused by: java.lang.RuntimeException: Hive internal error.
> at 
> org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitive(LazyUtils.java:328)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:220)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serializeField(HBaseRowSerializer.java:194)
> at 
> org.apache.hadoop.hive.hbase.HBaseRowSerializer.serialize(HBaseRowSerializer.java:118)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.serialize(HBaseSerDe.java:282)
> ... 16 more 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17923) 'cluster by' should not be needed for a bucketed table

2017-12-07 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal reassigned HIVE-17923:
-

Assignee: Deepak Jaiswal

> 'cluster by' should not be needed for a bucketed table
> --
>
> Key: HIVE-17923
> URL: https://issues.apache.org/jira/browse/HIVE-17923
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Deepak Jaiswal
>Priority: Blocker
>
> given 
> {noformat}
> CREATE TABLE over10k_orc_bucketed(t tinyint,
>si smallint,
>i int,
>b bigint,
>f float,
>d double,
>bo boolean,
>s string,
>ts timestamp,
>`dec` decimal(4,2),
>bin binary) CLUSTERED BY(si) INTO 4 BUCKETS STORED AS ORC;
> {noformat}
> {noformat}
> insert into over10k_orc_bucketed select * from over10k
> {noformat}
> produces 1 data file (bucket 0).  It should produce 4 based on input data.
> {noformat}
> insert into over10k_orc_bucketed select * from over10k cluster by si
> {noformat}
> does the right thing.
> acid_vectorization_original.q has the full script (HIVE-17458)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HIVE-17923) 'cluster by' should not be needed for a bucketed table

2017-12-07 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal resolved HIVE-17923.
---
Resolution: Duplicate

Duplicate of https://issues.apache.org/jira/browse/HIVE-18157

> 'cluster by' should not be needed for a bucketed table
> --
>
> Key: HIVE-17923
> URL: https://issues.apache.org/jira/browse/HIVE-17923
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Deepak Jaiswal
>Priority: Blocker
>
> given 
> {noformat}
> CREATE TABLE over10k_orc_bucketed(t tinyint,
>si smallint,
>i int,
>b bigint,
>f float,
>d double,
>bo boolean,
>s string,
>ts timestamp,
>`dec` decimal(4,2),
>bin binary) CLUSTERED BY(si) INTO 4 BUCKETS STORED AS ORC;
> {noformat}
> {noformat}
> insert into over10k_orc_bucketed select * from over10k
> {noformat}
> produces 1 data file (bucket 0).  It should produce 4 based on input data.
> {noformat}
> insert into over10k_orc_bucketed select * from over10k cluster by si
> {noformat}
> does the right thing.
> acid_vectorization_original.q has the full script (HIVE-17458)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18242) VectorizedRowBatch cast exception when analyzing partitioned table

2017-12-07 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282350#comment-16282350
 ] 

Matt McCline commented on HIVE-18242:
-

[~lirui] and [~gopalv] thanks for determining the cause.

> VectorizedRowBatch cast exception when analyzing partitioned table
> --
>
> Key: HIVE-18242
> URL: https://issues.apache.org/jira/browse/HIVE-18242
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>
> Happens when I run the following (vectorization enabled):
> {code}
> ANALYZE TABLE srcpart PARTITION(ds, hr) COMPUTE STATISTICS;
> {code}
> The stack trace is:
> {noformat}
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch cannot be cast to 
> org.apache.hadoop.io.Text
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.copyObject(WritableStringObjectInspector.java:36)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:425)
>   at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.partialCopyToStandardObject(ObjectInspectorUtils.java:314)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.gatherStats(TableScanOperator.java:191)
>   at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:138)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.setupPartitionContextVars(VectorMapOperator.java:682)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.cleanUpInputFileChangedOp(VectorMapOperator.java:607)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1187)
>   at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:784)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18221) test acid default

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18221:
--
Status: Patch Available  (was: Open)

> test acid default
> -
>
> Key: HIVE-18221
> URL: https://issues.apache.org/jira/browse/HIVE-18221
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18221.01.patch, HIVE-18221.02.patch, 
> HIVE-18221.03.patch, HIVE-18221.04.patch, HIVE-18221.07.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17495) CachedStore: prewarm improvement (avoid multiple sql calls to read partition column stats), refactoring and caching some aggregate stats

2017-12-07 Thread Daniel Dai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282395#comment-16282395
 ] 

Daniel Dai commented on HIVE-17495:
---

I didn't see ptest run on the latest patch, but in any case this patch needs to 
be rebased after HIVE-17495. 

> CachedStore: prewarm improvement (avoid multiple sql calls to read partition 
> column stats), refactoring and caching some aggregate stats
> 
>
> Key: HIVE-17495
> URL: https://issues.apache.org/jira/browse/HIVE-17495
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-17495.1.patch, HIVE-17495.2.patch, 
> HIVE-17495.3.patch, HIVE-17495.4.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18112) show create for view having special char in where clause is not showing properly

2017-12-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282418#comment-16282418
 ] 

Ashutosh Chauhan commented on HIVE-18112:
-

Do you want this fixed on branch-2.1 or 2.2? Your two previous comments are a 
little confusing.

> show create for view having special char in where clause is not showing 
> properly
> 
>
> Key: HIVE-18112
> URL: https://issues.apache.org/jira/browse/HIVE-18112
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-18112-branch-2.2.patch, HIVE-18112.patch
>
>
> e.g., 
> CREATE VIEW `v2` AS select `evil_byte1`.`a` from `default`.`EVIL_BYTE1` where 
> `evil_byte1`.`a` = 'abcÖdefÖgh';
> Output:
> ==
> 0: jdbc:hive2://172.26.122.227:1> show create table v2;
> ++--+
> | createtab_stmt  
>|
> ++--+
> | CREATE VIEW `v2` AS select `evil_byte1`.`a` from `default`.`EVIL_BYTE1` 
> where `evil_byte1`.`a` = 'abc�def�gh'  |
> ++--+
> Only show create output is having invalid characters, actual source table 
> content is displayed properly in the console.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18221) test acid default

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18221:
--
Attachment: (was: HIVE-18221.06.patch)

> test acid default
> -
>
> Key: HIVE-18221
> URL: https://issues.apache.org/jira/browse/HIVE-18221
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18221.01.patch, HIVE-18221.02.patch, 
> HIVE-18221.03.patch, HIVE-18221.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18221) test acid default

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18221:
--
Status: Open  (was: Patch Available)

> test acid default
> -
>
> Key: HIVE-18221
> URL: https://issues.apache.org/jira/browse/HIVE-18221
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18221.01.patch, HIVE-18221.02.patch, 
> HIVE-18221.03.patch, HIVE-18221.04.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17710) LockManager and External tables

2017-12-07 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282402#comment-16282402
 ] 

Eugene Koifman commented on HIVE-17710:
---

Perhaps the LM should only lock TableType.MANAGED_TABLE.  It's not clear that 
locking makes sense for any other type:
{noformat}
public enum TableType {
  MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW, INDEX_TABLE, MATERIALIZED_VIEW
}
{noformat}
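For illustration only, the check could be as small as a predicate over the table 
type. This is a hypothetical sketch (the class and method names are made up, not 
from any patch), just to show the intent:

```java
// Hypothetical sketch: only managed tables participate in lock acquisition.
public class LockFilterSketch {
    public enum TableType {
        MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW, INDEX_TABLE, MATERIALIZED_VIEW
    }

    // Acquire locks only for managed tables; external tables, views, etc. are skipped.
    public static boolean shouldLock(TableType type) {
        return type == TableType.MANAGED_TABLE;
    }

    public static void main(String[] args) {
        System.out.println(shouldLock(TableType.MANAGED_TABLE));   // true
        System.out.println(shouldLock(TableType.EXTERNAL_TABLE));  // false
    }
}
```

An explicit "lock table/partition" command could then bypass this predicate for 
external tables when the user asks for it.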


> LockManager and External tables
> ---
>
> Key: HIVE-17710
> URL: https://issues.apache.org/jira/browse/HIVE-17710
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> Should the LM take locks on external tables?  Out of the box, the Acid LM is 
> conservative, which can cause throughput issues.
> A better strategy may be to exclude external tables but enable an explicit "lock 
> table/partition" command (only on external tables?).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18221) test acid default

2017-12-07 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-18221:
--
Attachment: HIVE-18221.07.patch

Patch 7 makes the HCat tests use external tables and ensures we don't 
automatically make external tables ACID.

> test acid default
> -
>
> Key: HIVE-18221
> URL: https://issues.apache.org/jira/browse/HIVE-18221
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-18221.01.patch, HIVE-18221.02.patch, 
> HIVE-18221.03.patch, HIVE-18221.04.patch, HIVE-18221.07.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18203) change the way WM is enabled and allow dropping the last resource plan

2017-12-07 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-18203:
---

Assignee: Sergey Shelukhin

> change the way WM is enabled and allow dropping the last resource plan
> --
>
> Key: HIVE-18203
> URL: https://issues.apache.org/jira/browse/HIVE-18203
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aswathy Chellammal Sreekumar
>Assignee: Sergey Shelukhin
>
> Currently it's impossible to drop the last active resource plan even if WM is 
> disabled. It should be possible to deactivate the last resource plan AND 
> disable WM in the same action. Activating a resource plan should enable WM in 
> this case.
> This should interact with the WM queue config in a sensible manner.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18240) support getClientInfo/setClientInfo in JDBC

2017-12-07 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282394#comment-16282394
 ] 

Sergey Shelukhin commented on HIVE-18240:
-

[~vgumashta] maybe you can take a look?

> support getClientInfo/setClientInfo in JDBC
> ---
>
> Key: HIVE-18240
> URL: https://issues.apache.org/jira/browse/HIVE-18240
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-18240.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18202) Automatically migrate hbase.table.name to hbase.mapreduce.hfileoutputformat.table.name for hbase-based table

2017-12-07 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282368#comment-16282368
 ] 

Aihua Xu commented on HIVE-18202:
-

[~ngangam] Thanks for reviewing. Hadoop 3 will be compatible with HBase 2, not 
with HBase 1. It may not be a good idea to keep both properties, since that would 
be inconsistent with newly created tables (which will only have one property). 
Yes, it will be backward incompatible. 

> Automatically migrate hbase.table.name to 
> hbase.mapreduce.hfileoutputformat.table.name for hbase-based table
> 
>
> Key: HIVE-18202
> URL: https://issues.apache.org/jira/browse/HIVE-18202
> Project: Hive
>  Issue Type: Sub-task
>  Components: HBase Handler
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-18202.1.patch
>
>
> The property name for HBase table mapping changed from hbase.table.name to 
> hbase.mapreduce.hfileoutputformat.table.name in HBase 2.
> We can include such an upgrade for existing HBase-based tables in the DB upgrade 
> script to automatically change these values.
> For the new tables, the query will be like:
> create table hbase_table(key int, val string) stored by 
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties 
> ('hbase.columns.mapping' = ':key,cf:val') tblproperties 
> ('hbase.mapreduce.hfileoutputformat.table.name' = 
> 'positive_hbase_handler_bulk')



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18244) CachedStore: Fix UT when CachedStore is enabled

2017-12-07 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-18244:

Status: Patch Available  (was: Open)

> CachedStore: Fix UT when CachedStore is enabled
> ---
>
> Key: HIVE-18244
> URL: https://issues.apache.org/jira/browse/HIVE-18244
> Project: Hive
>  Issue Type: Bug
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-18244.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-14069) update curator version to 2.12.0

2017-12-07 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14069:
--
Status: Patch Available  (was: Open)

> update curator version to 2.12.0 
> -
>
> Key: HIVE-14069
> URL: https://issues.apache.org/jira/browse/HIVE-14069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Metastore
>Reporter: Thejas M Nair
>Assignee: Jason Dere
> Attachments: HIVE-14069.1.patch, HIVE-14069.2.patch, 
> HIVE-14069.3.patch, HIVE-14069.4.patch, HIVE-14069.5.patch
>
>
> curator-2.10.0 has several bug fixes over the current version (2.6.0); updating 
> would help improve stability.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18244) CachedStore: Fix UT when CachedStore is enabled

2017-12-07 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-18244:

Attachment: HIVE-18244.1.patch

> CachedStore: Fix UT when CachedStore is enabled
> ---
>
> Key: HIVE-18244
> URL: https://issues.apache.org/jira/browse/HIVE-18244
> Project: Hive
>  Issue Type: Bug
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-18244.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-14069) update curator version to 2.12.0

2017-12-07 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14069:
--
Attachment: HIVE-14069.5.patch

New patch to just change the curator version.

> update curator version to 2.12.0 
> -
>
> Key: HIVE-14069
> URL: https://issues.apache.org/jira/browse/HIVE-14069
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Metastore
>Reporter: Thejas M Nair
>Assignee: Jason Dere
> Attachments: HIVE-14069.1.patch, HIVE-14069.2.patch, 
> HIVE-14069.3.patch, HIVE-14069.4.patch, HIVE-14069.5.patch
>
>
> curator-2.10.0 has several bug fixes over the current version (2.6.0); updating 
> would help improve stability.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18244) CachedStore: Fix UT when CachedStore is enabled

2017-12-07 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-18244:
---


> CachedStore: Fix UT when CachedStore is enabled
> ---
>
> Key: HIVE-18244
> URL: https://issues.apache.org/jira/browse/HIVE-18244
> Project: Hive
>  Issue Type: Bug
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17981) Create a set of builders for Thrift classes

2017-12-07 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282270#comment-16282270
 ] 

Alan Gates commented on HIVE-17981:
---

[~pvary] reviewers are always welcome, no need to ask.  Also note that I have 
refactored the builders a little in the patch for HIVE-17990 which is currently 
on the standalone-metastore branch in commit 
bd212257f2c8f5472894e22501b17a56fd86318c

> Create a set of builders for Thrift classes
> ---
>
> Key: HIVE-17981
> URL: https://issues.apache.org/jira/browse/HIVE-17981
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17981.patch
>
>
> Instantiating some of the Thrift classes is painful.  Consider building a 
> {{Table}} object, which requires a {{StorageDescriptor}}, which requires a 
> {{SerDeInfo}} and a list of {{FieldInfo}}.  All that is really necessary for 
> a Table in the most simple case is a name, a database, and some columns.  But 
> currently creating even a simple Table requires 20+ lines of code.  This is 
> particularly painful in tests.  
> I propose to add a set of builders.  These will come with reasonable defaults 
> to minimize the boilerplate code.  They will also include simple methods for 
> common operations (like adding columns, or a parameter) without requiring the 
> user to create all the sub-objects (like {{StorageDescriptor}}).
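To illustrate the intent (class and method names here are hypothetical; the real builders are defined in the HIVE-17981 patch), building a simple table could collapse to a few chained calls with defaults filled in:

```java
import java.util.ArrayList;
import java.util.List;

public class TableBuilderSketch {
    // Minimal stand-in for the Thrift Table; the real class also carries a
    // StorageDescriptor, SerDeInfo, etc., which the builder would default.
    static class Table {
        String db, name;
        List<String[]> cols = new ArrayList<>();
    }

    static class TableBuilder {
        private final Table t = new Table();
        TableBuilder setDbName(String db) { t.db = db; return this; }
        TableBuilder setTableName(String name) { t.name = name; return this; }
        TableBuilder addCol(String colName, String colType) {
            t.cols.add(new String[]{colName, colType});
            return this;
        }
        Table build() { return t; } // reasonable defaults would be applied here
    }

    public static void main(String[] args) {
        Table t = new TableBuilder()
                .setDbName("default")
                .setTableName("simple")
                .addCol("id", "int")
                .addCol("name", "string")
                .build();
        System.out.println(t.db + "." + t.name + " cols=" + t.cols.size());
    }
}
```

Compare this with the 20+ lines currently needed to assemble the nested Thrift objects by hand.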



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17981) Create a set of builders for Thrift classes

2017-12-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282267#comment-16282267
 ] 

ASF GitHub Bot commented on HIVE-17981:
---

GitHub user alanfgates opened a pull request:

https://github.com/apache/hive/pull/274

HIVE-17981 Create a set of builders for Thrift classes



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/alanfgates/hive hive17981

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/274.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #274


commit 5360d684f12444bbaa846c80bdf972ca1748e875
Author: Alan Gates 
Date:   2017-09-12T20:18:09Z

HIVE-17981 Create a set of builders for Thrift classes




> Create a set of builders for Thrift classes
> ---
>
> Key: HIVE-17981
> URL: https://issues.apache.org/jira/browse/HIVE-17981
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17981.patch
>
>
> Instantiating some of the Thrift classes is painful.  Consider building a 
> {{Table}} object, which requires a {{StorageDescriptor}}, which requires a 
> {{SerDeInfo}} and a list of {{FieldInfo}}.  All that is really necessary for 
> a Table in the most simple case is a name, a database, and some columns.  But 
> currently creating even a simple Table requires 20+ lines of code.  This is 
> particularly painful in tests.  
> I propose to add a set of builders.  These will come with reasonable defaults 
> to minimize the boilerplate code.  They will also include simple methods for 
> common operations (like adding columns, or a parameter) without requiring the 
> user to create all the sub-objects (like {{StorageDescriptor}}).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17981) Create a set of builders for Thrift classes

2017-12-07 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-17981:
--
Labels: pull-request-available  (was: )

> Create a set of builders for Thrift classes
> ---
>
> Key: HIVE-17981
> URL: https://issues.apache.org/jira/browse/HIVE-17981
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17981.patch
>
>
> Instantiating some of the Thrift classes is painful.  Consider building a 
> {{Table}} object, which requires a {{StorageDescriptor}}, which requires a 
> {{SerDeInfo}} and a list of {{FieldInfo}}.  All that is really necessary for 
> a Table in the most simple case is a name, a database, and some columns.  But 
> currently creating even a simple Table requires 20+ lines of code.  This is 
> particularly painful in tests.  
> I propose to add a set of builders.  These will come with reasonable defaults 
> to minimize the boilerplate code.  They will also include simple methods for 
> common operations (like adding columns, or a parameter) without requiring the 
> user to create all the sub-objects (like {{StorageDescriptor}}).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17981) Create a set of builders for Thrift classes

2017-12-07 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17981:
--
Attachment: HIVE-17981.patch

> Create a set of builders for Thrift classes
> ---
>
> Key: HIVE-17981
> URL: https://issues.apache.org/jira/browse/HIVE-17981
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: pull-request-available
> Attachments: HIVE-17981.patch
>
>
> Instantiating some of the Thrift classes is painful.  Consider building a 
> {{Table}} object, which requires a {{StorageDescriptor}}, which requires a 
> {{SerDeInfo}} and a list of {{FieldInfo}}.  All that is really necessary for 
> a Table in the most simple case is a name, a database, and some columns.  But 
> currently creating even a simple Table requires 20+ lines of code.  This is 
> particularly painful in tests.  
> I propose to add a set of builders.  These will come with reasonable defaults 
> to minimize the boilerplate code.  They will also include simple methods for 
> common operations (like adding columns, or a parameter) without requiring the 
> user to create all the sub-objects (like {{StorageDescriptor}}).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17981) Create a set of builders for Thrift classes

2017-12-07 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17981:
--
Status: Patch Available  (was: Open)

> Create a set of builders for Thrift classes
> ---
>
> Key: HIVE-17981
> URL: https://issues.apache.org/jira/browse/HIVE-17981
> Project: Hive
>  Issue Type: Sub-task
>  Components: Standalone Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17981.patch
>
>
> Instantiating some of the Thrift classes is painful.  Consider building a 
> {{Table}} object, which requires a {{StorageDescriptor}}, which requires a 
> {{SerDeInfo}} and a list of {{FieldInfo}}.  All that is really necessary for 
> a Table in the most simple case is a name, a database, and some columns.  But 
> currently creating even a simple Table requires 20+ lines of code.  This is 
> particularly painful in tests.  
> I propose to add a set of builders.  These will come with reasonable defaults 
> to minimize the boilerplate code.  They will also include simple methods for 
> common operations (like adding columns, or a parameter) without requiring the 
> user to create all the sub-objects (like {{StorageDescriptor}}).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18140) Partitioned tables statistics can go wrong in basic stats mixed case

2017-12-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282172#comment-16282172
 ] 

Ashutosh Chauhan commented on HIVE-18140:
-

[~kgyrtkirk] Does this need more work?

> Partitioned tables statistics can go wrong in basic stats mixed case
> 
>
> Key: HIVE-18140
> URL: https://issues.apache.org/jira/browse/HIVE-18140
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-18140.01wip01.patch
>
>
> suppose the following scenario:
> * part1 has basic stats {{RC=10,DS=1K}}
> * all other partition has no basic stats (and a bunch of rows)
then 
[this|https://github.com/apache/hive/blob/d9924ab3e285536f7e2cc15ecbea36a78c59c66d/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L378]
> condition would be false, which in turn produces estimates for the whole 
> partitioned table: {{RC=10,DS=1K}}
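A simplified model of the failure mode (an illustration of the described behavior, not the actual StatsUtils code): partitions without basic stats report -1, and an aggregation that sums only stat-bearing partitions makes the whole table look as small as its single analyzed partition:

```java
import java.util.Arrays;
import java.util.List;

public class StatsAggregation {
    // Partitions without basic stats report -1 in this simplified model.
    record PartStats(long rowCount, long dataSize) {}

    // Flawed aggregation: sums only partitions that carry stats, so one small
    // stat-bearing partition dominates the whole-table estimate.
    static long estimateRowCount(List<PartStats> parts) {
        return parts.stream()
                .filter(p -> p.rowCount() >= 0)
                .mapToLong(PartStats::rowCount)
                .sum();
    }

    public static void main(String[] args) {
        List<PartStats> parts = Arrays.asList(
                new PartStats(10, 1024),  // part1: RC=10, DS=1K
                new PartStats(-1, -1),    // remaining partitions: no stats
                new PartStats(-1, -1));
        System.out.println(estimateRowCount(parts)); // prints 10 despite unknowns
    }
}
```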



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18206) Merge of RC/ORC file should follow other fileformate which use merge configuration parameter

2017-12-07 Thread Wang Haihua (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282123#comment-16282123
 ] 

Wang Haihua commented on HIVE-18206:


[~prasanth_j] Thanks. Are the failed tests related to this patch?

> Merge of RC/ORC file should follow other fileformate which use merge 
> configuration parameter
> 
>
> Key: HIVE-18206
> URL: https://issues.apache.org/jira/browse/HIVE-18206
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 1.2.1, 2.1.1, 2.2.0, 3.0.0
>Reporter: Wang Haihua
>Assignee: Wang Haihua
> Attachments: HIVE-18206.1.patch, HIVE-18206.2.patch
>
>
> Merge configuration parameters, like {{hive.merge.size.per.task}}, decide the 
> average file size after the merge stage.
> But we found this only works for file formats like {{Textfile/SequenceFile}}; 
> for the {{RC/ORC}} file formats, it {{does not work}}.
> For the {{RC/ORC}} file formats, we found the file size after the merge stage 
> depends on parameters like {{mapreduce.input.fileinputformat.split.maxsize}}.
> It would be better to use {{hive.merge.size.per.task}} to decide the average 
> file size for the RC/ORC file formats as well, for consistency.
> Root cause: for the RC/ORC file formats, the merge class is {{MergeFileTask}} 
> instead of {{MapRedTask}} (used for Textfile/SequenceFile), and {{MergeFileTask}} 
> does not pick up the configuration value from MergeFileWork. The solution is to 
> pass it into {{MergeFileTask}}.
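As a back-of-the-envelope sketch of what honoring the parameter means: the merge stage should produce roughly total-input-size divided by {{hive.merge.size.per.task}} output files, regardless of file format (this is an illustration of the expected arithmetic, not Hive code):

```java
public class MergeEstimate {
    // Rough expected output-file count if the merge stage honors
    // hive.merge.size.per.task (ceiling division).
    static long expectedOutputFiles(long totalInputBytes, long mergeSizePerTask) {
        return (totalInputBytes + mergeSizePerTask - 1) / mergeSizePerTask;
    }

    public static void main(String[] args) {
        // e.g. 10 GB of small files with a 256 MB per-task merge target
        System.out.println(expectedOutputFiles(10L << 30, 256L << 20)); // prints 40
    }
}
```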



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18054) Make Lineage work with concurrent queries on a Session

2017-12-07 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-18054:
--
Attachment: HIVE-18054.11.patch

try to get tests to run

>  Make Lineage work with concurrent queries on a Session
> ---
>
> Key: HIVE-18054
> URL: https://issues.apache.org/jira/browse/HIVE-18054
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-18054.1.patch, HIVE-18054.10.patch, 
> HIVE-18054.11.patch, HIVE-18054.2.patch, HIVE-18054.3.patch, 
> HIVE-18054.4.patch, HIVE-18054.5.patch, HIVE-18054.6.patch, 
> HIVE-18054.7.patch, HIVE-18054.8.patch, HIVE-18054.9.patch
>
>
> A Hive Session can contain multiple concurrent sql Operations.
> Lineage is currently tracked in SessionState and is cleared when a query 
> completes. This results in Lineage for other running queries being lost.
> To fix this, move LineageState from SessionState to QueryState.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18228) Azure credential properties should be added to the HiveConf hidden list

2017-12-07 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman updated HIVE-18228:
--
Attachment: HIVE-18228.2.patch

try to get tests to run

> Azure credential properties should be added to the HiveConf hidden list
> ---
>
> Key: HIVE-18228
> URL: https://issues.apache.org/jira/browse/HIVE-18228
> Project: Hive
>  Issue Type: Bug
>Reporter: Andrew Sherman
>Assignee: Andrew Sherman
> Attachments: HIVE-18228.1.patch, HIVE-18228.2.patch
>
>
> The HIVE_CONF_HIDDEN_LIST ("hive.conf.hidden.list") already contains keys 
> containing AWS credentials. The Azure properties to be added are:
> * dfs.adls.oauth2.credential
> * fs.adl.oauth2.credential



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18111) Fix temp path for Spark DPP sink

2017-12-07 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16282083#comment-16282083
 ] 

Hive QA commented on HIVE-18111:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
35s{color} | {color:red} ql: The patch generated 5 new + 543 unchanged - 1 
fixed = 548 total (was 544) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 20s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 36f0d89 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8141/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8141/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Fix temp path for Spark DPP sink
> 
>
> Key: HIVE-18111
> URL: https://issues.apache.org/jira/browse/HIVE-18111
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-18111.1.patch, HIVE-18111.2.patch, 
> HIVE-18111.3.patch, HIVE-18111.4.patch
>
>
> Before HIVE-17877, each DPP sink has only one target work. The output path of 
> a DPP work is {{TMP_PATH/targetWorkId/dppWorkId}}. When we do the pruning, 
> each map work reads DPP outputs under {{TMP_PATH/targetWorkId}}.
> After HIVE-17877, each DPP sink can have multiple target works. It's possible 
> that a map work needs to read DPP outputs from multiple 
> {{TMP_PATH/targetWorkId}}. To solve this, I think we can have a DPP output 
> path specific to each query, e.g. {{QUERY_TMP_PATH/dpp_output}}. Each DPP 
> work outputs to {{QUERY_TMP_PATH/dpp_output/dppWorkId}}. And each map work 
> reads from {{QUERY_TMP_PATH/dpp_output}}.
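The two layouts can be sketched as plain path construction (directory names are illustrative, taken from the description above):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class DppPaths {
    // Before HIVE-17877: DPP output keyed by target work, then DPP work.
    static Path oldOutput(Path tmpPath, String targetWorkId, String dppWorkId) {
        return tmpPath.resolve(targetWorkId).resolve(dppWorkId);
    }

    // Proposed: one query-scoped dpp_output directory every map work reads from.
    static Path newOutput(Path queryTmpPath, String dppWorkId) {
        return queryTmpPath.resolve("dpp_output").resolve(dppWorkId);
    }

    public static void main(String[] args) {
        System.out.println(oldOutput(Paths.get("/tmp/query"), "work_1", "dpp_1"));
        System.out.println(newOutput(Paths.get("/tmp/query"), "dpp_1"));
    }
}
```

With the query-scoped layout, a map work that prunes on several DPP sources scans a single directory instead of several per-target directories.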



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

