[jira] [Commented] (HIVE-14979) Removing stale Zookeeper locks at HiveServer2 initialization

2016-10-17 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581634#comment-15581634
 ] 

Peter Vary commented on HIVE-14979:
---

[~ashutoshc] While creating this patch I found the following code, which I do not 
really understand (the line in question is marked by "->"):
{code}
package org.apache.hadoop.hive.ql.lockmgr.zookeeper;
[..]
public class ZooKeeperHiveLockManager implements HiveLockManager {
[..]
  private static List getLocks(HiveConf conf,
      HiveLockObject key, String parent, boolean verifyTablePartition,
      boolean fetchData)
      throws LockException {
[..]
        if (fetchData) {
          try {
            data = new HiveLockObjectData(
                new String(curatorFramework.getData().watched().forPath(curChild)));
->          data.setClientIp(clientIp);
          } catch (Exception e) {
            LOG.error("Error in getting data for " + curChild, e);
            // ignore error
          }
        }
[..]
{code}

Why do we update the clientIp of every lock when fetching (reading) data from 
Zookeeper? By any chance, do you remember why this was needed? It seems 
to be done on purpose, but during my testing I haven't found any occasion 
when this was unset.

It is set by the following code, which seems quite safe to me and runs every 
time a new lock is created:
{code}
  private ZooKeeperHiveLock lockPrimitive(HiveLockObject key,
      HiveLockMode mode, boolean keepAlive, boolean parentCreated,
      Set conflictingLocks)
      throws Exception {
[..]
    HiveLockObjectData lockData = key.getData();
    lockData.setClientIp(clientIp);
{code}
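
To make the question concrete, here is a minimal sketch of the effect of the marked line. It assumes that clientIp in getLocks holds the address of the process doing the read (the snippets above do not show where it comes from). LockData is a hypothetical stand-in for HiveLockObjectData with only the one field, and the addresses are made up.

```java
public class LockDataClientIp {
    // Hypothetical stand-in for HiveLockObjectData; only the field under discussion.
    static class LockData {
        private String clientIp;

        LockData(String clientIp) {
            this.clientIp = clientIp;
        }

        String getClientIp() {
            return clientIp;
        }

        void setClientIp(String clientIp) {
            this.clientIp = clientIp;
        }
    }

    // Mirrors the read path with the marked line: the stored value is loaded
    // and then replaced by the reader's own address (assumption, see above).
    static LockData readAndOverwrite(LockData stored, String readerIp) {
        LockData data = new LockData(stored.getClientIp());
        data.setClientIp(readerIp);
        return data;
    }

    // Read path without the marked line: the stored value is returned unchanged.
    static LockData readOnly(LockData stored) {
        return new LockData(stored.getClientIp());
    }

    public static void main(String[] args) {
        // Set once at lock creation, as in lockPrimitive.
        LockData stored = new LockData("192.0.2.10");

        System.out.println(readAndOverwrite(stored, "192.0.2.20").getClientIp());
        System.out.println(readOnly(stored).getClientIp());
    }
}
```

Under that assumption, the first call reports the reader's address and the second reports the address of the instance that created the lock, which is the one that matters when deciding whether a lock is stale.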

Thanks,
Peter

> Removing stale Zookeeper locks at HiveServer2 initialization
> 
>
> Key: HIVE-14979
> URL: https://issues.apache.org/jira/browse/HIVE-14979
> Project: Hive
>  Issue Type: Improvement
>  Components: Locking
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14979.patch
>
>
> HiveServer2 can use Zookeeper to store tokens that indicate that particular 
> tables are locked, by creating persistent Zookeeper objects. 
> A problem can occur when a HiveServer2 instance creates a lock on a table and 
> then crashes ("Out of Memory", for example), so that the locks 
> are not released in Zookeeper. Such a lock then remains until it is 
> manually cleared by an admin.
> There should be a way to remove stale locks at HiveServer2 initialization, 
> making the admin's life easier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14979) Removing stale Zookeeper locks at HiveServer2 initialization

2016-10-17 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14979:
--
Attachment: HIVE-14979.patch

First version of the patch



[jira] [Commented] (HIVE-14957) HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union

2016-10-17 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581530#comment-15581530
 ] 

Jesus Camacho Rodriguez commented on HIVE-14957:


LGTM, +1

> HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union
> --
>
> Key: HIVE-14957
> URL: https://issues.apache.org/jira/browse/HIVE-14957
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14957.01.patch, HIVE-14957.02.patch
>
>
> {code}
> call.transformTo(parent.copy(parent.getTraitSet(), 
> ImmutableList.of(relBuilder.build())));
> {code}
> When the parent is a union operator with 2 inputs, parent.copy will 
> copy only the input that has the SortLimit and ignore the other branches.
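
The effect described above can be sketched without Calcite. In the snippet below plain strings stand in for the input RelNodes, and the method names are hypothetical; it is an illustration of the problem and of one possible shape of a fix, not the content of the attached patches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class UnionInputCopy {
    // Shape of the reported problem: the new input list holds only the
    // rewritten branch, so every other input of the parent is lost.
    static List<String> copyOnlyRewritten(List<String> parentInputs, String rewritten) {
        return Collections.singletonList(rewritten);
    }

    // One possible fix: keep all parent inputs and swap in the rewritten
    // branch at the position of the input that matched the rule.
    static List<String> copyAllInputs(List<String> parentInputs, int matchedIndex,
            String rewritten) {
        List<String> inputs = new ArrayList<>(parentInputs);
        inputs.set(matchedIndex, rewritten);
        return inputs;
    }

    public static void main(String[] args) {
        List<String> unionInputs = Arrays.asList("SortLimit(branchA)", "branchB");
        String rewritten = "Project(SortLimit(branchA))";

        // One input left: branchB has been dropped.
        System.out.println(copyOnlyRewritten(unionInputs, rewritten));
        // Both inputs kept, with the first one replaced.
        System.out.println(copyAllInputs(unionInputs, 0, rewritten));
    }
}
```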





[jira] [Commented] (HIVE-14959) Support distinct with windowing when CBO is disabled

2016-10-17 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581548#comment-15581548
 ] 

Jesus Camacho Rodriguez commented on HIVE-14959:


[~ashutoshc], could you take a look? Thanks

> Support distinct with windowing when CBO is disabled
> 
>
> Key: HIVE-14959
> URL: https://issues.apache.org/jira/browse/HIVE-14959
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-14959.01.patch, HIVE-14959.patch
>
>
> For instance, the following query with CBO off:
> {code:sql}
> select distinct last_value(i) over ( partition by si order by i ),
>   first_value(t)  over ( partition by si order by i )
> from over10k limit 50;
> {code}
> will fail, with the following message:
> {noformat}
> SELECT DISTINCT not allowed in the presence of windowing functions when CBO 
> is off
> {noformat}





[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10

2016-10-17 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13316:
---
Attachment: HIVE-13316.10.patch

> Upgrade to Calcite 1.10
> ---
>
> Key: HIVE-13316
> URL: https://issues.apache.org/jira/browse/HIVE-13316
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, 
> HIVE-13316.05.patch, HIVE-13316.07.patch, HIVE-13316.08.patch, 
> HIVE-13316.09.patch, HIVE-13316.10.patch, HIVE-13316.patch
>
>






[jira] [Commented] (HIVE-14979) Removing stale Zookeeper locks at HiveServer2 initialization

2016-10-17 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581715#comment-15581715
 ] 

Peter Vary commented on HIVE-14979:
---

CC: [~namit] - I found out that you were the one who wrote this code. 
Could you please chime in if you remember anything?

Thanks,
Peter



[jira] [Updated] (HIVE-14979) Removing stale Zookeeper locks at HiveServer2 initialization

2016-10-17 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14979:
--
Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14940:
-
Attachment: HIVE-14940.4.patch

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14940.1.patch, HIVE-14940.1.patch, 
> HIVE-14940.2.patch, HIVE-14940.2.patch, HIVE-14940.3.patch, 
> HIVE-14940.3.patch, HIVE-14940.4.patch, HIVE-14940.4.patch
>
>
> HBase setup for the metastore in MiniTez takes around 3 minutes. The 
> actual runtime of the queries is typically much lower. To avoid the high 
> overhead, we should switch back to the SQL metastore as the default and, if 
> required, have a dedicated set of tests that run against the HBase metastore. 





[jira] [Updated] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2

2016-10-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14921:
-
Attachment: HIVE-14921.3.patch

> Move slow CliDriver tests to MiniLlap - part 2
> --
>
> Key: HIVE-14921
> URL: https://issues.apache.org/jira/browse/HIVE-14921
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch, 
> HIVE-14921.2.patch, HIVE-14921.2.patch, HIVE-14921.3.patch, HIVE-14921.3.patch
>
>
> Continuation to HIVE-14877





[jira] [Updated] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14993:
--
Status: Patch Available  (was: Open)

> make WriteEntity distinguish writeType
> --
>
> Key: HIVE-14993
> URL: https://issues.apache.org/jira/browse/HIVE-14993
> Project: Hive
>  Issue Type: Bug
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14993.patch
>
>






[jira] [Updated] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14993:
--
Attachment: HIVE-14993.patch



[jira] [Updated] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14993:
--
Component/s: Transactions

> make WriteEntity distinguish writeType
> --
>
> Key: HIVE-14993
> URL: https://issues.apache.org/jira/browse/HIVE-14993
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-14993.patch
>
>






[jira] [Updated] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2016-10-17 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-9941:
---
Target Version/s: 2.2.0  (was: 1.3.0, 1.2.2, 2.2.0)
  Status: Patch Available  (was: Open)

> sql std authorization on partitioned table: truncate and insert
> ---
>
> Key: HIVE-9941
> URL: https://issues.apache.org/jira/browse/HIVE-9941
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.2.0, 1.0.0
>Reporter: Olaf Flebbe
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9941.2.patch, HIVE-9941.patch
>
>
> sql std authorization works as expected.
> However if a table is partitioned any user can truncate it
> User foo:
> {code}
> create table bla (a string) partitioned by (b string);
> #.. loading values ...
> {code}
> Admin:
> {code}
> 0: jdbc:hive2://localhost:1/default> set role admin;
> No rows affected (0,074 seconds)
> 0: jdbc:hive2://localhost:1/default> show grant on bla;
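> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  |   grant_time   | grantor  |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | default   | bla    |            |         | foo             | USER            | DELETE     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | INSERT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | SELECT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | UPDATE     | true          | 1426158997000  | foo      |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+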
> now user olaf
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from bla;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: Principal [name=olaf, type=USER] does not have following 
> privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, 
> name=default.bla]] (state=42000,code=4)
> {code}
> works as expected.
> _BUT_
> {code}
> 0: jdbc:hive2://localhost:1/default> truncate table bla;
> No rows affected (0,18 seconds)
> {code}
> _And table is empty afterwards_.
> Similarly: {{insert into table}} works, too.





[jira] [Updated] (HIVE-14992) Relocate several common libraries in hive jdbc uber jar

2016-10-17 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-14992:
--
Attachment: HIVE-14992.1.patch

> Relocate several common libraries in hive jdbc uber jar
> ---
>
> Key: HIVE-14992
> URL: https://issues.apache.org/jira/browse/HIVE-14992
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-14992.1.patch
>
>






[jira] [Commented] (HIVE-14927) Remove code duplication from tests in TestLdapAtnProviderWithMiniDS

2016-10-17 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583744#comment-15583744
 ] 

Chaoyu Tang commented on HIVE-14927:


Yeah, it seems that the precommit build has some issues.

> Remove code duplication from tests in TestLdapAtnProviderWithMiniDS
> ---
>
> Key: HIVE-14927
> URL: https://issues.apache.org/jira/browse/HIVE-14927
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Attachments: HIVE-14927.1.patch, HIVE-14927.2.patch, 
> HIVE-14927.3.patch
>
>
> * Extract inner class User and implement a proper builder for it.
> * Extract all common code to LdapAuthenticationTestCase class 
> ** setting up the test case
> ** executing test case
> ** result validation





[jira] [Updated] (HIVE-12764) Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all) in Hive

2016-10-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12764:
---
Summary: Support Intersect (distinct/all) Except (distinct/all) Minus 
(distinct/all) in Hive  (was: Support set operators in Hive)

> Support Intersect (distinct/all) Except (distinct/all) Minus (distinct/all) 
> in Hive
> ---
>
> Key: HIVE-12764
> URL: https://issues.apache.org/jira/browse/HIVE-12764
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> We plan to address union distinct (already done), intersect (all, distinct) 
> and except (all, distinct) by leveraging the power of relational algebra 
> through query rewriting.





[jira] [Commented] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2016-10-17 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583988#comment-15583988
 ] 

Jason Dere commented on HIVE-9941:
--

+1 if the tests pass



[jira] [Issue Comment Deleted] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2016-10-17 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-9941:
-
Comment: was deleted

(was: +1 if the tests pass)



[jira] [Commented] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2016-10-17 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583993#comment-15583993
 ] 

Jason Dere commented on HIVE-9941:
--

Actually, I'll hold off on my +1 until we see the ptest run, per the newly 
discussed guidelines for waiting on test results before committing.
But the test cases look good to me.



[jira] [Commented] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584003#comment-15584003
 ] 

Hive QA commented on HIVE-14921:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833204/HIVE-14921.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10567 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=199)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_mult_tables_compact]
 (batchId=32)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[pcs] 
(batchId=144)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=157)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=157)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=157)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=206)
org.apache.hive.spark.client.TestSparkClient.testJobSubmission (batchId=265)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1604/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1604/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1604/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833204 - PreCommit-HIVE-Build



[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Status: Patch Available  (was: Open)

Address review comments

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch, HIVE-14913.5.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead





[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Attachment: HIVE-14913.5.patch



[jira] [Updated] (HIVE-14913) Add new unit tests

2016-10-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14913:
---
Status: Open  (was: Patch Available)



[jira] [Updated] (HIVE-14969) add test cases for ACID

2016-10-17 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14969:
--
Component/s: Transactions

> add test cases for ACID
> ---
>
> Key: HIVE-14969
> URL: https://issues.apache.org/jira/browse/HIVE-14969
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Sergey Shelukhin
>
> I think the following tests should be added
> 1) CTAS into transactional table must be transactional.
> 2) tablesample with buckets from ACID table - judging by HIVE-14967, 
> selecting buckets with nested directories may have bugs on Tez
> 3) insert with union - same reason, if the test doesn't already exist it 
> would be nice to see that bases and deltas are processed correctly given that 
> union creates 2 directories for the results of the same insert





[jira] [Updated] (HIVE-14995) double conversion can corrupt partition column values for insert overwrite with DP

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14995:

Description: 
{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;
{noformat}

The result of the select query has the column converted to double (because 
src.key is string). 
{noformat}
498 499.0   499.0
{noformat}

When inserting that into table, the value is converted correctly to integer for 
the regular column, but not for partition column.
Explain for insert (extracted)
{noformat}
Map Reduce
  Map Operator Tree:
...
  Select Operator
expressions: (UDFToDouble(key) + 1.0) (type: double)
...
Reduce Output Operator
  key expressions: _col0 (type: double)
  sort order: -
...
  Reduce Operator Tree:
Select Operator
  expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
(type: double)
...
Select Operator
  expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
  followed by FSOP and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499 NULL
{noformat}
Woops!



  was:
{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;
{noformat}

The result of the select query has the column converted to double (because 
src.key is string). 
The value is converted correctly to integer for the regular column, but not for 
partition column.
{noformat}
498 499.0   499.0
{noformat}

Explain for insert (extracted)
{noformat}
Map Reduce
  Map Operator Tree:
...
  Select Operator
expressions: (UDFToDouble(key) + 1.0) (type: double)
...
Reduce Output Operator
  key expressions: _col0 (type: double)
  sort order: -
...
  Reduce Operator Tree:
Select Operator
  expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
(type: double)
...
Select Operator
  expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
  followed by FSOP and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499 NULL
{noformat}
Woops!




> double conversion can corrupt partition column values for insert overwrite 
> with DP
> --
>
> Key: HIVE-14995
> URL: https://issues.apache.org/jira/browse/HIVE-14995
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Priority: Critical
>
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.explain.user=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.fetch.task.conversion=none;
> drop table iow1; 
> create table iow1(key int) partitioned by (key2 int);
> select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
> desc limit 1;
> explain
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> {noformat}
> The result of the select query has the column converted to double (because 
> src.key is string). 
> {noformat}
> 498   499.0   499.0
> {noformat}
> When inserting that into table, the value is converted correctly to integer 
> for the regular column, but not for partition column.
> Explain for insert (extracted)
> {noformat}
> Map 

[jira] [Updated] (HIVE-14995) double conversion can corrupt partition column values for insert overwrite with DP

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14995:

Description: 
{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;
{noformat}

The result of the select query has the column converted to double (because 
src.key is string). 
{noformat}
498 499.0   499.0
{noformat}

When inserting that into table, the value is converted correctly to integer for 
the regular column, but not for partition column.
Explain for insert (extracted)
{noformat}
Map Reduce
  Map Operator Tree:
...
  Select Operator
expressions: (UDFToDouble(key) + 1.0) (type: double)
...
Reduce Output Operator
  key expressions: _col0 (type: double)
...
  Reduce Operator Tree:
Select Operator
  expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
(type: double)
...
Select Operator
  expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
  followed by FSOP and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499 NULL
{noformat}
Woops!



  was:
{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;
{noformat}

The result of the select query has the column converted to double (because 
src.key is string). 
{noformat}
498 499.0   499.0
{noformat}

When inserting that into table, the value is converted correctly to integer for 
the regular column, but not for partition column.
Explain for insert (extracted)
{noformat}
Map Reduce
  Map Operator Tree:
...
  Select Operator
expressions: (UDFToDouble(key) + 1.0) (type: double)
...
Reduce Output Operator
  key expressions: _col0 (type: double)
  sort order: -
...
  Reduce Operator Tree:
Select Operator
  expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
(type: double)
...
Select Operator
  expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
  followed by FSOP and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499 NULL
{noformat}
Woops!




> double conversion can corrupt partition column values for insert overwrite 
> with DP
> --
>
> Key: HIVE-14995
> URL: https://issues.apache.org/jira/browse/HIVE-14995
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Priority: Critical
>
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.explain.user=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.fetch.task.conversion=none;
> drop table iow1; 
> create table iow1(key int) partitioned by (key2 int);
> select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
> desc limit 1;
> explain
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> {noformat}
> The result of the select query has the column converted to double (because 
> src.key is string). 
> {noformat}
> 498   499.0   499.0
> {noformat}
> When inserting that into table, the value is converted correctly to integer 
> for the regular column, but not for partition column.
> Explain for insert (extracted)
> {noformat}
> Map 

[jira] [Updated] (HIVE-14995) double conversion can corrupt partition column values for insert overwrite with DP

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14995:

Description: 
{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;
{noformat}

The result of the select query has the column converted to double (because 
src.key is string). 
{noformat}
498 499.0   499.0
{noformat}

When inserting that into table, the value is converted correctly to integer for 
the regular column, but not for partition column.
Explain for insert (extracted)
{noformat}
  Map Operator Tree:
...
  Select Operator
expressions: (UDFToDouble(key) + 1.0) (type: double)
...
Reduce Output Operator
  key expressions: _col0 (type: double)
...
  Reduce Operator Tree:
Select Operator
  expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
(type: double)
...
Select Operator
  expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
... followed by FSOP and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499 NULL
{noformat}
Woops!



  was:
{noformat}
set hive.mapred.mode=nonstrict;
set hive.explain.user=false;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.fetch.task.conversion=none;

drop table iow1; 
create table iow1(key int) partitioned by (key2 int);

select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
desc limit 1;

explain
insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;

insert overwrite table iow1 partition (key2)
select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
limit 1;
{noformat}

The result of the select query has the column converted to double (because 
src.key is string). 
{noformat}
498 499.0   499.0
{noformat}

When inserting that into table, the value is converted correctly to integer for 
the regular column, but not for partition column.
Explain for insert (extracted)
{noformat}
Map Reduce
  Map Operator Tree:
...
  Select Operator
expressions: (UDFToDouble(key) + 1.0) (type: double)
...
Reduce Output Operator
  key expressions: _col0 (type: double)
...
  Reduce Operator Tree:
Select Operator
  expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
(type: double)
...
Select Operator
  expressions: UDFToInteger(_col0) (type: int), _col1 (type: double)
  followed by FSOP and load into table
{noformat}
The result of the select from the resulting table is:
{noformat}
POSTHOOK: query: select key, key2 from iow1
...
POSTHOOK: Input: default@iow1@key2=499.0
...
499 NULL
{noformat}
Woops!




> double conversion can corrupt partition column values for insert overwrite 
> with DP
> --
>
> Key: HIVE-14995
> URL: https://issues.apache.org/jira/browse/HIVE-14995
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Priority: Critical
>
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.explain.user=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.fetch.task.conversion=none;
> drop table iow1; 
> create table iow1(key int) partitioned by (key2 int);
> select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
> desc limit 1;
> explain
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> {noformat}
> The result of the select query has the column converted to double (because 
> src.key is string). 
> {noformat}
> 498   499.0   499.0
> {noformat}
> When inserting that into table, the value is converted correctly to integer 
> for the regular column, but not for partition column.
> Explain for insert (extracted)
> {noformat}
>   Map Operator Tree:
> ...
>   Select 

[jira] [Commented] (HIVE-14995) double conversion can corrupt partition column values for insert overwrite with DP

2016-10-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584061#comment-15584061
 ] 

Sergey Shelukhin commented on HIVE-14995:
-

[~hagleitn] [~ashutoshc] another interesting one... incorrect results

> double conversion can corrupt partition column values for insert overwrite 
> with DP
> --
>
> Key: HIVE-14995
> URL: https://issues.apache.org/jira/browse/HIVE-14995
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Priority: Critical
>
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.explain.user=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.fetch.task.conversion=none;
> drop table iow1; 
> create table iow1(key int) partitioned by (key2 int);
> select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
> desc limit 1;
> explain
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> {noformat}
> The result of the select query has the column converted to double (because 
> src.key is string). 
> {noformat}
> 498   499.0   499.0
> {noformat}
> When inserting that into table, the value is converted correctly to integer 
> for the regular column, but not for partition column.
> Explain for insert (extracted)
> {noformat}
>   Map Operator Tree:
> ...
>   Select Operator
> expressions: (UDFToDouble(key) + 1.0) (type: double)
> ...
> Reduce Output Operator
>   key expressions: _col0 (type: double)
> ...
>   Reduce Operator Tree:
> Select Operator
>   expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
> (type: double)
> ...
> Select Operator
>   expressions: UDFToInteger(_col0) (type: int), _col1 (type: 
> double)
> ... followed by FSOP and load into table
> {noformat}
> The result of the select from the resulting table is:
> {noformat}
> POSTHOOK: query: select key, key2 from iow1
> ...
> POSTHOOK: Input: default@iow1@key2=499.0
> ...
> 499   NULL
> {noformat}
> Woops!





[jira] [Updated] (HIVE-13873) Column pruning for nested fields

2016-10-17 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-13873:

Attachment: HIVE-13873.4.patch

> Column pruning for nested fields
> 
>
> Key: HIVE-13873
> URL: https://issues.apache.org/jira/browse/HIVE-13873
> Project: Hive
> Issue Type: New Feature
> Components: Logical Optimizer
> Reporter: Xuefu Zhang
> Assignee: Ferdinand Xu
> Attachments: HIVE-13873.1.patch, HIVE-13873.2.patch, 
> HIVE-13873.3.patch, HIVE-13873.4.patch, HIVE-13873.patch, HIVE-13873.wip.patch
>
>
> Some columnar file formats such as Parquet also store fields of struct type 
> column by column, using the encoding described in the Google Dremel paper. It is 
> very common in big data for data to be stored in structs while queries need only 
> a subset of the fields in the structs. However, Hive presently still 
> needs to read the whole struct regardless of whether all fields are selected. 
> Therefore, pruning unwanted sub-fields in struct or nested fields at file 
> reading time would be a big performance boost for such scenarios.
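
To make the idea in the description above concrete, here is a minimal sketch of nested-field pruning over an in-memory struct. It is only an illustration: the class name NestedPruneSketch, the prune method, and the dotted-path notation are hypothetical and are not taken from the HIVE-13873 patches. A real reader would presumably prune at the file schema level so that unwanted columns are never read at all.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: structs are modelled as nested maps.
final class NestedPruneSketch {

    // Keeps only the fields named by dotted paths such as "address.city".
    static Map<String, Object> prune(Map<String, Object> struct, List<String> paths) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String path : paths) {
            copyPath(struct, out, path.split("\\."), 0);
        }
        return out;
    }

    @SuppressWarnings("unchecked")
    private static void copyPath(Map<String, Object> src, Map<String, Object> dst,
                                 String[] parts, int depth) {
        String name = parts[depth];
        if (!src.containsKey(name)) {
            return; // requested field is absent; nothing to copy
        }
        Object value = src.get(name);
        if (depth == parts.length - 1) {
            dst.put(name, value); // last path element: copy the field itself
        } else if (value instanceof Map) {
            // Intermediate struct: descend, creating the pruned parent on demand.
            Map<String, Object> child = (Map<String, Object>)
                dst.computeIfAbsent(name, k -> new LinkedHashMap<String, Object>());
            copyPath((Map<String, Object>) value, child, parts, depth + 1);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Budapest");
        address.put("zip", "1111");
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("address", address);

        // Prints {id=1, address={city=Budapest}}
        System.out.println(prune(row, Arrays.asList("id", "address.city")));
    }
}
```

The sketch copies values after the whole struct has been materialised, which is exactly the cost the issue wants to avoid; it only shows which sub-fields a query such as "select id, address.city" would need to keep.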





[jira] [Commented] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584092#comment-15584092
 ] 

Hive QA commented on HIVE-14940:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833845/HIVE-14940.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10592 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1605/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1605/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1605/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833845 - PreCommit-HIVE-Build

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
> Issue Type: Sub-task
> Components: Tests
> Affects Versions: 2.2.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Attachments: HIVE-14940.1.patch, HIVE-14940.1.patch, 
> HIVE-14940.2.patch, HIVE-14940.2.patch, HIVE-14940.3.patch, 
> HIVE-14940.3.patch, HIVE-14940.4.patch, HIVE-14940.4.patch
>
>
> HBase setup for the metastore in MiniTez takes around 3 minutes. The 
> actual runtime of the queries is typically much lower. To avoid the high 
> overhead we should switch back to the SQL metastore as the default and, if 
> required, we can have a dedicated set of tests that run against the HBase metastore. 





[jira] [Resolved] (HIVE-14994) double conversion can corrupt partition column values for insert overwrite with DP

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-14994.
-
Resolution: Duplicate

Dup of HIVE-14995. Resolving this as I edited the description in the other one 
to improve it.

> double conversion can corrupt partition column values for insert overwrite 
> with DP
> --
>
> Key: HIVE-14994
> URL: https://issues.apache.org/jira/browse/HIVE-14994
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Priority: Critical
>
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.explain.user=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.fetch.task.conversion=none;
> drop table iow1; 
> create table iow1(key int) partitioned by (key2 int);
> select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
> desc limit 1;
> explain
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> {noformat}
> The result of the select query has the column converted to double (because 
> src.key is string). 
> The value is converted correctly to integer for the regular column, but not 
> for partition column.
> {noformat}
> 498   499.0   499.0
> {noformat}
> Explain for insert (extracted)
> {noformat}
> Map Reduce
>   Map Operator Tree:
> ...
>   Select Operator
> expressions: (UDFToDouble(key) + 1.0) (type: double)
> ...
> Reduce Output Operator
>   key expressions: _col0 (type: double)
>   sort order: -
> ...
>   Reduce Operator Tree:
> Select Operator
>   expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
> (type: double)
> ...
> Select Operator
>   expressions: UDFToInteger(_col0) (type: int), _col1 (type: 
> double)
>   followed by FSOP and load into table
> {noformat}
> The result of the select from the resulting table is:
> {noformat}
> POSTHOOK: query: select key, key2 from iow1
> ...
> POSTHOOK: Input: default@iow1@key2=499.0
> ...
> 499   NULL
> {noformat}
> Woops!





[jira] [Commented] (HIVE-14969) add test cases for ACID

2016-10-17 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584091#comment-15584091
 ] 

Sergey Shelukhin commented on HIVE-14969:
-

Note: tablesample pruner is super buggy, so it's probably best to disable it 
for ACID tables like it's disabled for some other stuff

> add test cases for ACID
> ---
>
> Key: HIVE-14969
> URL: https://issues.apache.org/jira/browse/HIVE-14969
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Reporter: Sergey Shelukhin
>
> I think the following tests should be added
> 1) CTAS into transactional table must be transactional.
> 2) tablesample with buckets from ACID table - judging by HIVE-14967, 
> selecting buckets with nested directories may have bugs on Tez
> 3) insert with union - same reason, if the test doesn't already exist it 
> would be nice to see that bases and deltas are processed correctly given that 
> union creates 2 directories for the results of the same insert





[jira] [Updated] (HIVE-14995) double conversion can corrupt partition column values for insert with dynamic partitions

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14995:

Summary: double conversion can corrupt partition column values for insert 
with dynamic partitions  (was: double conversion can corrupt partition column 
values for insert overwrite with DP)

> double conversion can corrupt partition column values for insert with dynamic 
> partitions
> 
>
> Key: HIVE-14995
> URL: https://issues.apache.org/jira/browse/HIVE-14995
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Priority: Critical
>
> {noformat}
> set hive.mapred.mode=nonstrict;
> set hive.explain.user=false;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.fetch.task.conversion=none;
> drop table iow1; 
> create table iow1(key int) partitioned by (key2 int);
> select key, key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 
> desc limit 1;
> explain
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> insert overwrite table iow1 partition (key2)
> select key + 1 as k1, key + 1 as k2 from src where key >= 0 order by k1 desc 
> limit 1;
> {noformat}
> The result of the select query has the column converted to double (because 
> src.key is string). 
> {noformat}
> 498   499.0   499.0
> {noformat}
> When inserting that into table, the value is converted correctly to integer 
> for the regular column, but not for partition column.
> Explain for insert (extracted)
> {noformat}
>   Map Operator Tree:
> ...
>   Select Operator
> expressions: (UDFToDouble(key) + 1.0) (type: double)
> ...
> Reduce Output Operator
>   key expressions: _col0 (type: double)
> ...
>   Reduce Operator Tree:
> Select Operator
>   expressions: KEY.reducesinkkey0 (type: double), KEY.reducesinkkey0 
> (type: double)
> ...
> Select Operator
>   expressions: UDFToInteger(_col0) (type: int), _col1 (type: 
> double)
> ... followed by FSOP and load into table
> {noformat}
> The result of the select from the resulting table is:
> {noformat}
> POSTHOOK: query: select key, key2 from iow1
> ...
> POSTHOOK: Input: default@iow1@key2=499.0
> ...
> 499   NULL
> {noformat}
> Woops!
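
The symptom in the description above (a partition named key2=499.0 that reads back as NULL) can be mimicked outside Hive with a small sketch. The assumptions here are loud ones: the class PartitionValueSketch and its helpers are hypothetical, and the "unparseable value becomes NULL" behaviour is modelled with a simple try/catch rather than with Hive's actual partition-value handling.

```java
// Hypothetical illustration only; none of these names come from Hive.
final class PartitionValueSketch {

    // Models a lenient int read: a value that does not parse becomes null (NULL).
    static Integer parseIntOrNull(String raw) {
        try {
            return Integer.valueOf(raw);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Partition name built from the double as-is, i.e. with no integer cast.
    static String partitionDirWithoutCast(double value) {
        return "key2=" + value;
    }

    // Partition name built after an explicit cast, as UDFToInteger does for the regular column.
    static String partitionDirWithCast(double value) {
        return "key2=" + (int) value;
    }

    // Reads the partition value back as the declared int type.
    static Integer readBack(String partitionDir) {
        return parseIntOrNull(partitionDir.substring(partitionDir.indexOf('=') + 1));
    }

    public static void main(String[] args) {
        double k2 = 498 + 1.0; // key + 1 evaluated as double, as in the explain output

        String uncast = partitionDirWithoutCast(k2);
        String cast = partitionDirWithCast(k2);

        System.out.println(uncast + " -> " + readBack(uncast)); // key2=499.0 -> null
        System.out.println(cast + " -> " + readBack(cast));     // key2=499 -> 499
    }
}
```

Under these assumptions the difference between the two columns comes down to whether the cast is applied before the value is turned into a string; the regular column gets UDFToInteger in the plan, while the partition expression stays a double.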





[jira] [Updated] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2016-10-17 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-9941:
---
Attachment: HIVE-9941.3.patch

Actually, I have a further update, with import and drop ptn as well. I was 
assuming this was tested elsewhere, but apparently not. Added them in.

> sql std authorization on partitioned table: truncate and insert
> ---
>
> Key: HIVE-9941
> URL: https://issues.apache.org/jira/browse/HIVE-9941
> Project: Hive
> Issue Type: Bug
> Components: Authorization
> Affects Versions: 1.0.0, 1.2.0
> Reporter: Olaf Flebbe
> Assignee: Sushanth Sowmyan
> Attachments: HIVE-9941.2.patch, HIVE-9941.3.patch, HIVE-9941.patch
>
>
> SQL standard authorization works as expected.
> However, if a table is partitioned, any user can truncate it.
> User foo:
> {code}
> create table bla (a string) partitioned by (b string);
> #.. loading values ...
> {code}
> Admin:
> {code}
> 0: jdbc:hive2://localhost:1/default> set role admin;
> No rows affected (0,074 seconds)
> 0: jdbc:hive2://localhost:1/default> show grant on bla;
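> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  |   grant_time   | grantor  |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+
> | default   | bla    |            |         | foo             | USER            | DELETE     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | INSERT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | SELECT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | UPDATE     | true          | 1426158997000  | foo      |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+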
> {code}
> now user olaf
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from bla;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: Principal [name=olaf, type=USER] does not have following 
> privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, 
> name=default.bla]] (state=42000,code=4)
> {code}
> works as expected.
> _BUT_
> {code}
> 0: jdbc:hive2://localhost:1/default> truncate table bla;
> No rows affected (0,18 seconds)
> {code}
> _And table is empty afterwards_.
> Similarly, {{insert into table}} works, too.





[jira] [Commented] (HIVE-14887) Reduce the memory requirements for tests

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584186#comment-15584186
 ] 

Hive QA commented on HIVE-14887:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833300/HIVE-14887.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10594 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_partitioned] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] 
(batchId=89)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1606/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1606/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1606/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833300 - PreCommit-HIVE-Build

> Reduce the memory requirements for tests
> 
>
> Key: HIVE-14887
> URL: https://issues.apache.org/jira/browse/HIVE-14887
> Project: Hive
> Issue Type: Sub-task
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
> Attachments: HIVE-14887.01.patch, HIVE-14887.02.patch, 
> HIVE-14887.03.patch
>
>
> The clusters that we spin up end up requiring 16GB at times. Also, the Maven 
> arguments seem a little heavyweight.
> Reducing this will allow for additional ptest drones per box, which should 
> bring down the runtime.





[jira] [Commented] (HIVE-14921) Move slow CliDriver tests to MiniLlap - part 2

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584278#comment-15584278
 ] 

Hive QA commented on HIVE-14921:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833847/HIVE-14921.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10553 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=199)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
 (batchId=92)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=157)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=157)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=157)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=206)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1607/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1607/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1607/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833847 - PreCommit-HIVE-Build

> Move slow CliDriver tests to MiniLlap - part 2
> --
>
> Key: HIVE-14921
> URL: https://issues.apache.org/jira/browse/HIVE-14921
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14921.1.patch, HIVE-14921.1.patch, 
> HIVE-14921.2.patch, HIVE-14921.2.patch, HIVE-14921.3.patch, HIVE-14921.3.patch
>
>
> Continuation to HIVE-14877





[jira] [Updated] (HIVE-14642) handle insert overwrite for MM tables

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14642:

Fix Version/s: hive-14535
   Status: Patch Available  (was: Open)

> handle insert overwrite for MM tables
> -
>
> Key: HIVE-14642
> URL: https://issues.apache.org/jira/browse/HIVE-14642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14642.patch
>
>






[jira] [Updated] (HIVE-14642) handle insert overwrite for MM tables

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14642:

Attachment: HIVE-14642.patch

Relatively small patch (mostly test changes, one fix for DP).
Seems like non-ORC merge is also broken... need to take a look separately

> handle insert overwrite for MM tables
> -
>
> Key: HIVE-14642
> URL: https://issues.apache.org/jira/browse/HIVE-14642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14642.patch
>
>






[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-17 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584307#comment-15584307
 ] 

Rui Li commented on HIVE-14797:
---

Thanks for the update [~roncenzhao]. I have one more question.
{{ObjectInspectorUtils.getBucketHashCode}} is also used in several places other 
than RS, e.g. in FS. Now if the # of reducers is 31, RS will compute the hash 
code differently from the other places. Wondering if we need to keep some kind 
of consistency among these calling paths. [~xuefuz] do you have any ideas?

> reducer number estimating may lead to data skew
> ---
>
> Key: HIVE-14797
> URL: https://issues.apache.org/jira/browse/HIVE-14797
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: roncenzhao
>Assignee: roncenzhao
> Attachments: HIVE-14797.2.patch, HIVE-14797.3.patch, 
> HIVE-14797.4.patch, HIVE-14797.patch
>
>
> HiveKey's hash code is generated by multiplying by 31, key by key, as 
> implemented in the method `ObjectInspectorUtils.getBucketHashCode()`:
> for (int i = 0; i < bucketFields.length; i++) {
>   int fieldHash = ObjectInspectorUtils.hashCode(bucketFields[i], bucketFieldInspectors[i]);
>   hashCode = 31 * hashCode + fieldHash;
> }
> The following example leads to data skew:
> I have two tables, tbl1 and tbl2, with the same columns: a int, b 
> string. The values of column 'a' in both tables are not skewed, but the values 
> of column 'b' in both tables are skewed.
> When my SQL is "select * from tbl1 join tbl2 on tbl1.a=tbl2.a and 
> tbl1.b=tbl2.b" and the estimated reducer number is 31, it leads to data 
> skew.
> As we know, the HiveKey's hash code is generated by `hash(a)*31 + hash(b)`. 
> When the reducer number is 31, the reducer No. of each row is `hash(b)%31`, 
> because `hash(a)*31` is a multiple of 31. As a result, the job will be skewed.
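
The skew described above can be reproduced outside Hive. The following is a minimal Java sketch, not Hive's code: `bucketHash` copies the combining loop quoted in the description, while `reducerFor` (non-negative hash modulo the reducer count), `nonEmptyReducers`, and the small integer stand-ins for `hash(a)` and `hash(b)` are assumptions made for illustration.

```java
final class BucketHashSkewDemo {
    // Same combining step as the loop quoted in the description.
    static int bucketHash(int... fieldHashes) {
        int hashCode = 0;
        for (int fieldHash : fieldHashes) {
            hashCode = 31 * hashCode + fieldHash;
        }
        return hashCode;
    }

    // Assumption: the reducer is the non-negative hash modulo the reducer count.
    static int reducerFor(int hash, int numReducers) {
        return (hash & Integer.MAX_VALUE) % numReducers;
    }

    // Counts the reducers that receive at least one row when hash(a) varies
    // over 0..999 and hash(b) is fixed (a fully skewed column b).
    static int nonEmptyReducers(int numReducers, int hashB) {
        boolean[] used = new boolean[numReducers];
        int count = 0;
        for (int hashA = 0; hashA < 1000; hashA++) {
            int reducer = reducerFor(bucketHash(hashA, hashB), numReducers);
            if (!used[reducer]) {
                used[reducer] = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(nonEmptyReducers(31, 7)); // prints 1
        System.out.println(nonEmptyReducers(32, 7)); // prints 32
    }
}
```

With 31 reducers the term `31 * hash(a)` vanishes modulo 31 (as long as the int arithmetic does not overflow), so only `hash(b)` selects the reducer. With 32 reducers, `hash(a)` contributes again and the rows spread out.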





[jira] [Commented] (HIVE-14029) Update Spark version to 2.0.0

2016-10-17 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584325#comment-15584325
 ] 

Rui Li commented on HIVE-14029:
---

Hmm, even with a shim layer, it's difficult to support different Spark versions 
if backward compatibility is not maintained between minor releases of Spark.
I'm wondering if the Spark used by Hive can be considered a kind of embedded 
binary that is used exclusively for HoS. On the Hive side, we just need to 
set spark.home to point to this Spark. Users' other Spark applications, e.g. 
SparkSQL and streaming, can still run against the Spark they currently have in 
the cluster. Would this make the upgrade easier?
I think we also need to be more careful about upgrading Spark in the future if 
the upgrade breaks compatibility. For such an upgrade, we first need to make 
sure there's no obvious regression in functionality or performance.

> Update Spark version to 2.0.0
> -
>
> Key: HIVE-14029
> URL: https://issues.apache.org/jira/browse/HIVE-14029
> Project: Hive
>  Issue Type: Bug
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>  Labels: Incompatible, TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14029.1.patch, HIVE-14029.2.patch, 
> HIVE-14029.3.patch, HIVE-14029.4.patch, HIVE-14029.5.patch, 
> HIVE-14029.6.patch, HIVE-14029.7.patch, HIVE-14029.8.patch, HIVE-14029.patch
>
>
> There are quite a few new optimizations in Spark 2.0.0. We need to bump up 
> Spark to 2.0.0 to benefit from those performance improvements.
> To update Spark version to 2.0.0, the following changes are required:
> * Spark API updates:
> ** SparkShuffler#call returns Iterator instead of Iterable
> ** SparkListener -> JavaSparkListener
> ** InputMetrics constructor doesn’t accept readMethod
> ** Method remoteBlocksFetched and localBlocksFetched in ShuffleReadMetrics 
> return long type instead of integer
> * Dependency upgrade:
> ** Jackson: 2.4.2 -> 2.6.5
> ** Netty version: 4.0.23.Final -> 4.0.29.Final
> ** Scala binary version: 2.10 -> 2.11
> ** Scala version: 2.10.4 -> 2.11.8





[jira] [Commented] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584369#comment-15584369
 ] 

Hive QA commented on HIVE-14940:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833845/HIVE-14940.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10592 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1608/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1608/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1608/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833845 - PreCommit-HIVE-Build

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14940.1.patch, HIVE-14940.1.patch, 
> HIVE-14940.2.patch, HIVE-14940.2.patch, HIVE-14940.3.patch, 
> HIVE-14940.3.patch, HIVE-14940.4.patch, HIVE-14940.4.patch
>
>
> HBase setup for the metastore in MiniTez takes around 3 minutes. The 
> actual runtime of the queries is typically much lower. To avoid the high 
> overhead, we should switch back to the SQL metastore as the default and, if 
> required, have a dedicated set of tests that run against the HBase metastore.





[jira] [Updated] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-17 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14940:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master.

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0
>
> Attachments: HIVE-14940.1.patch, HIVE-14940.1.patch, 
> HIVE-14940.2.patch, HIVE-14940.2.patch, HIVE-14940.3.patch, 
> HIVE-14940.3.patch, HIVE-14940.4.patch, HIVE-14940.4.patch
>
>


[jira] [Updated] (HIVE-14642) handle insert overwrite for MM tables

2016-10-17 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14642:

Summary: handle insert overwrite for MM tables  (was: handle insert 
overwrite, load, import)

> handle insert overwrite for MM tables
> -
>
> Key: HIVE-14642
> URL: https://issues.apache.org/jira/browse/HIVE-14642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>






[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-17 Thread roncenzhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584262#comment-15584262
 ] 

roncenzhao commented on HIVE-14797:
---

Hi [~lirui], I have resolved this problem in the new patch.
Please check it. Thanks~

> reducer number estimating may lead to data skew
> ---
>
> Key: HIVE-14797
> URL: https://issues.apache.org/jira/browse/HIVE-14797
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: roncenzhao
>Assignee: roncenzhao
> Attachments: HIVE-14797.2.patch, HIVE-14797.3.patch, 
> HIVE-14797.4.patch, HIVE-14797.patch
>
>


[jira] [Updated] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-17 Thread roncenzhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

roncenzhao updated HIVE-14797:
--
Attachment: HIVE-14797.4.patch

Resolve the problem with running on Spark/Tez.

> reducer number estimating may lead to data skew
> ---
>
> Key: HIVE-14797
> URL: https://issues.apache.org/jira/browse/HIVE-14797
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: roncenzhao
>Assignee: roncenzhao
> Attachments: HIVE-14797.2.patch, HIVE-14797.3.patch, 
> HIVE-14797.4.patch, HIVE-14797.patch
>
>


[jira] [Commented] (HIVE-14940) MiniTezCliDriver - switch back to SQL metastore as default

2016-10-17 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584377#comment-15584377
 ] 

Prasanth Jayachandran commented on HIVE-14940:
--

These tests have been failing consistently in master for a while now, since 
the ptest migration.

> MiniTezCliDriver - switch back to SQL metastore as default
> --
>
> Key: HIVE-14940
> URL: https://issues.apache.org/jira/browse/HIVE-14940
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14940.1.patch, HIVE-14940.1.patch, 
> HIVE-14940.2.patch, HIVE-14940.2.patch, HIVE-14940.3.patch, 
> HIVE-14940.3.patch, HIVE-14940.4.patch, HIVE-14940.4.patch
>
>


[jira] [Commented] (HIVE-14989) FIELDS TERMINATED BY parsing broken when delimiter is more than 1 byte

2016-10-17 Thread Niklaus Xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584420#comment-15584420
 ] 

Niklaus Xiao commented on HIVE-14989:
-

You should use {{MultiDelimitSerDe}} in this case.

> FIELDS TERMINATED BY parsing broken when delimiter is more than 1 byte
> --
>
> Key: HIVE-14989
> URL: https://issues.apache.org/jira/browse/HIVE-14989
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Parser, Reader
>Affects Versions: 0.13.0, 0.13.1
>Reporter: Ruslan Dautkhanov
>
> FIELDS TERMINATED BY parsing is broken when the delimiter is more than 1 byte. 
> From its 2nd character onward, the delimiter becomes part of the returned 
> data; it is not parsed properly.
> Test case:
> {noformat}
> CREATE external TABLE test_muldelim
> (  string1 STRING,
>    string2 STRING,
>    string3 STRING
> )
>  ROW FORMAT 
>    DELIMITED FIELDS TERMINATED BY '<>'
>   LINES TERMINATED BY '\n'
>  STORED AS TEXTFILE
>   location '/user/hive/test_muldelim'
> {noformat}
> Create a text file under /user/hive/test_muldelim with the following 2 lines:
> {noformat}
> data1<>data2<>data3
> aa<>bb<>cc
> {noformat}
> Now notice that the two-character delimiter wasn't parsed properly:
> {noformat}
> jdbc:hive2://host.domain.com:1> select * from ruslan_test.test_muldelim ;
> +------------------------+------------------------+------------------------+--+
> | test_muldelim.string1  | test_muldelim.string2  | test_muldelim.string3  |
> +------------------------+------------------------+------------------------+--+
> | data1                  | >data2                 | >data3                 |
> | aa                     | >bb                    | >cc                    |
> +------------------------+------------------------+------------------------+--+
> 2 rows selected (0.453 seconds)
> {noformat}
> The delimiter's second character ('>') became part of the columns to the 
> right (`string2` and `string3`).
> Table DDL:
> {noformat}
> 0: jdbc:hive2://host.domain.com:1> show create table dafault.test_muldelim ;
> +-----------------------------------------------------------------+--+
> |                         createtab_stmt                          |
> +-----------------------------------------------------------------+--+
> | CREATE EXTERNAL TABLE `default.test_muldelim`(                  |
> |   `string1` string,                                             |
> |   `string2` string,                                             |
> |   `string3` string)                                             |
> | ROW FORMAT DELIMITED                                            |
> |   FIELDS TERMINATED BY '<>'                                     |
> |   LINES TERMINATED BY '\n'                                      |
> | STORED AS INPUTFORMAT                                           |
> |   'org.apache.hadoop.mapred.TextInputFormat'                    |
> | OUTPUTFORMAT                                                    |
> |   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'  |
> | LOCATION                                                        |
> |   'hdfs://epsdatalake/user/hive/test_muldelim'                  |
> | TBLPROPERTIES (                                                 |
> |   'transient_lastDdlTime'='1476727100')                         |
> +-----------------------------------------------------------------+--+
> 15 rows selected (0.286 seconds)
> {noformat}
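
The output reported above is what one would expect if only the first character of the delimiter were treated as the field separator. The following Java sketch illustrates that reading of the symptom; it is not Hive's SerDe code, and the class and method names (`DelimiterSplitDemo`, `splitOnFirstChar`, `splitOnFullDelimiter`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class DelimiterSplitDemo {
    // Splits on the first character of the delimiter only, which reproduces
    // the symptom: the rest of the delimiter stays in the next field.
    static List<String> splitOnFirstChar(String line, String delimiter) {
        char separator = delimiter.charAt(0);
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == separator) {
                fields.add(line.substring(start, i));
                start = i + 1;
            }
        }
        fields.add(line.substring(start));
        return fields;
    }

    // Splits on the whole delimiter string, which is the expected behavior.
    static List<String> splitOnFullDelimiter(String line, String delimiter) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> fields = new ArrayList<>();
        int start = 0;
        int hit;
        while ((hit = line.indexOf(delimiter, start)) >= 0) {
            fields.add(line.substring(start, hit));
            start = hit + delimiter.length();
        }
        fields.add(line.substring(start));
        return fields;
    }

    public static void main(String[] args) {
        String line = "data1<>data2<>data3";
        System.out.println(splitOnFirstChar(line, "<>"));     // [data1, >data2, >data3]
        System.out.println(splitOnFullDelimiter(line, "<>")); // [data1, data2, data3]
    }
}
```

The first call yields the same `>data2` and `>data3` values as the query output in the report; the second shows the split the reporter expected.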





[jira] [Commented] (HIVE-14993) make WriteEntity distinguish writeType

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1558#comment-1558
 ] 

Hive QA commented on HIVE-14993:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833848/HIVE-14993.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 849 failed/errored test(s), 10594 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[add_part_exist] 
(batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter1] (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter2] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter3] (batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter4] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter5] (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_char1] (batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_char2] (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_file_format] 
(batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge] (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_2] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_2_orc] 
(batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_3] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_orc] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_stats] 
(batchId=54)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table2_h23]
 (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table_h23]
 (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_change_col]
 (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_clusterby_sortby]
 (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] 
(batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_format_loc]
 (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_rename_partition] 
(batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_rename_partition_authorization]
 (batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_skewed_table] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_cascade] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_invalidate_column_stats]
 (batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_location] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_not_sorted] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_partition_drop]
 (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde2] 
(batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_varchar1] 
(batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_varchar2] 
(batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_view_as_select] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_view_rename] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[analyze_table_null_partition]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_filter] 
(batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby2] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_join_pkfk]
 (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_limit] 
(batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_select] 
(batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_table] 
(batchId=19)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_union] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[archive_excludeHadoop20] 
(batchId=59)

[jira] [Commented] (HIVE-14989) FIELDS TERMINATED BY parsing broken when delimiter is more than 1 byte

2016-10-17 Thread Ruslan Dautkhanov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584443#comment-15584443
 ] 

Ruslan Dautkhanov commented on HIVE-14989:
--

Thank you.

> FIELDS TERMINATED BY parsing broken when delimiter is more than 1 byte
> --
>
> Key: HIVE-14989
> URL: https://issues.apache.org/jira/browse/HIVE-14989
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Parser, Reader
>Affects Versions: 0.13.0, 0.13.1
>Reporter: Ruslan Dautkhanov
>


[jira] [Resolved] (HIVE-14989) FIELDS TERMINATED BY parsing broken when delimiter is more than 1 byte

2016-10-17 Thread Ruslan Dautkhanov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruslan Dautkhanov resolved HIVE-14989.
--
   Resolution: Duplicate
Fix Version/s: 0.14.1

> FIELDS TERMINATED BY parsing broken when delimiter is more than 1 byte
> --
>
> Key: HIVE-14989
> URL: https://issues.apache.org/jira/browse/HIVE-14989
> Project: Hive
>  Issue Type: Bug
>  Components: File Formats, Parser, Reader
>Affects Versions: 0.13.0, 0.13.1
>Reporter: Ruslan Dautkhanov
> Fix For: 0.14.1
>
>


[jira] [Commented] (HIVE-14913) Add new unit tests

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584522#comment-15584522
 ] 

Hive QA commented on HIVE-14913:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833860/HIVE-14913.5.patch

{color:green}SUCCESS:{color} +1 due to 9 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10592 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] 
(batchId=89)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=131)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[alter_merge_orc] 
(batchId=119)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1610/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1610/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1610/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833860 - PreCommit-HIVE-Build

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch, HIVE-14913.5.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead.





[jira] [Updated] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic

2016-10-17 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-13557:

Attachment: HIVE-13557.1.patch

> Make interval keyword optional while specifying DAY in interval arithmetic
> --
>
> Key: HIVE-13557
> URL: https://issues.apache.org/jira/browse/HIVE-13557
> Project: Hive
>  Issue Type: Sub-task
>  Components: Types
>Reporter: Ashutosh Chauhan
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13557.1.patch, HIVE-13557.1.patch, 
> HIVE-13557.1.patch
>
>
> Currently we support expressions like: {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31'))  - INTERVAL '30' DAY) AND 
> DATE('2000-01-31')
> {code}
> We should support:
> {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND 
> DATE('2000-01-31')
> {code}
>   





[jira] [Updated] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-14981:

Status: In Progress  (was: Patch Available)

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 is 
> unnecessary.  It caused the LLAP orc_llap.q test to time out on Hive QA because 
> the regular VectorMapJoinOperator is too slow.





[jira] [Commented] (HIVE-13423) Handle the overflow case for decimal datatype for sum()

2016-10-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583055#comment-15583055
 ] 

Xuefu Zhang commented on HIVE-13423:


[~ctang.ma]/[~aihuaxu], sacrificing scale for a bigger integer part doesn't 
seem to be a viable option. The precision/scale of the result type of the sum 
UDF is determined statically as result metadata. That is, the result type of 
sum(decimal(p, s)) is decimal(p+10, s), which is decided before seeing any 
actual data. Thus, at run time, when the data is actually processed, we cannot 
return a result of decimal(p+10+d, s-d) because the data (result) would not 
conform to the metadata (type decimal(p+10, s)).

Please feel free to check the standards or what other DBs are doing. As far as I 
know, there is no standard that permits this.
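
To make the static-typing argument concrete, here is a minimal sketch in plain Java. It is not Hive code: the class and method names are hypothetical, the cap of 38 comes from the issue description below, and the HALF_UP rounding is an assumption made only for illustration. It shows that the result type is computed from (p, s) alone, and that a value can only be checked against that fixed type afterwards.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical sketch; the names here are not Hive's actual classes. */
final class DecimalSumTypeSketch {
    // Maximum precision, as stated in the issue description.
    static final int MAX_PRECISION = 38;

    /** Result precision of sum(decimal(p, s)): p + 10, capped at the maximum. */
    static int sumResultPrecision(int p) {
        return Math.min(p + 10, MAX_PRECISION);
    }

    /**
     * True if the value fits decimal(precision, scale) with the declared scale
     * kept as-is. Rounding to the scale (HALF_UP) is an assumption.
     */
    static boolean fits(BigDecimal value, int precision, int scale) {
        BigDecimal scaled = value.setScale(scale, RoundingMode.HALF_UP);
        return scaled.precision() <= precision;
    }
}
```

Because the scale is fixed up front, a running total that needs more integer digits than (precision - scale) simply does not fit; trading scale for integer digits at run time would change the type after it was already published as metadata.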

> Handle the overflow case for decimal datatype for sum()
> ---
>
> Key: HIVE-13423
> URL: https://issues.apache.org/jira/browse/HIVE-13423
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13423.1.patch
>
>
> When a column col1 is defined as decimal and the sum of the column overflows, 
> we try to increase the decimal precision by 10. But if it reaches 38 
> (the max precision), the overflow can still happen. Right now, in that case, 
> the following exception is thrown because Hive is writing incorrect 
> data.
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:219)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}





[jira] [Commented] (HIVE-13423) Handle the overflow case for decimal datatype for sum()

2016-10-17 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582911#comment-15582911
 ] 

Chaoyu Tang commented on HIVE-13423:


[~aihuaxu] The patch looks good. I believe the issue might also exist in other 
arithmetic functions or operations such as plus (+) and multiplication (*). I 
wonder if we need to truncate the scale, like SQL Server does, to fit the 
intermediate data to the precision, as discussed in HIVE-14281.

> Handle the overflow case for decimal datatype for sum()
> ---
>
> Key: HIVE-13423
> URL: https://issues.apache.org/jira/browse/HIVE-13423
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13423.1.patch
>
>
> When a column col1 is defined as decimal and the sum of the column overflows, 
> we try to increase the decimal precision by 10. But if it reaches 38 
> (the max precision), the overflow can still happen. Right now, in that case, 
> the following exception is thrown because Hive is writing incorrect 
> data.
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:219)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}





[jira] [Commented] (HIVE-13423) Handle the overflow case for decimal datatype for sum()

2016-10-17 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582946#comment-15582946
 ] 

Aihua Xu commented on HIVE-13423:
-

Yes. The ArrayIndexOutOfBoundsException is caused by the overflow, and with this 
patch we do return NULL when the overflow occurs.

When there is an overflow, we write a 'non-null' flag to the file since 
the data is indeed non-null, but later, when we try to interpret the data, 
it resolves to null since it overflows. That produces a corrupted 
file.





> Handle the overflow case for decimal datatype for sum()
> ---
>
> Key: HIVE-13423
> URL: https://issues.apache.org/jira/browse/HIVE-13423
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13423.1.patch
>
>
> When a column col1 is defined as decimal and the sum of the column overflows, 
> we try to increase the decimal precision by 10. But if it reaches 38 
> (the max precision), the overflow can still happen. Right now, in that case, 
> the following exception is thrown because Hive is writing incorrect 
> data.
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:219)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}





[jira] [Commented] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-17 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582954#comment-15582954
 ] 

Siddharth Seth commented on HIVE-14981:
---

[~mmccline] - thank you for working on the patch to fix the test.
The test used to run in less than 2 minutes, and now ends up timing out after 
running for 40 minutes. Will the change mentioned here have that kind of impact?

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 is 
> unnecessary.  It caused the LLAP orc_llap.q test to time out on Hive QA because 
> the regular VectorMapJoinOperator is too slow.





[jira] [Commented] (HIVE-14957) HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union

2016-10-17 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582984#comment-15582984
 ] 

Pengcheng Xiong commented on HIVE-14957:


Pushed to master and 2.1. Thanks [~jcamachorodriguez] for the review.

> HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union
> --
>
> Key: HIVE-14957
> URL: https://issues.apache.org/jira/browse/HIVE-14957
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14957.01.patch, HIVE-14957.02.patch
>
>
> {code}
> call.transformTo(parent.copy(parent.getTraitSet(), 
> ImmutableList.of(relBuilder.build())));
> {code}
> When the parent is a union operator with 2 inputs, parent.copy will 
> only copy the one that has SortLimit and ignore the other branches.
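
The idea behind a fix can be sketched without Calcite: rather than building a one-element input list, copy all of the parent's inputs and replace only the branch that was rewritten. The helper below is a hypothetical illustration that uses plain lists in place of the operator inputs; it is not taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; plain lists stand in for the parent's inputs. */
final class UnionInputsSketch {
    /** Returns a copy of inputs in which only the entry at index is replaced. */
    static <T> List<T> replaceInput(List<T> inputs, int index, T replacement) {
        List<T> copy = new ArrayList<>(inputs); // keeps every other branch
        copy.set(index, replacement);           // swaps in the rewritten branch
        return copy;
    }
}
```

With a two-input union, the result still has two entries, whereas a single-element list would silently drop the sibling branch.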





[jira] [Updated] (HIVE-14973) Flaky test: TestJdbcWithSQLAuthorization.testBlackListedUdfUsage

2016-10-17 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14973:
--
Issue Type: Sub-task  (was: Bug)
Parent: HIVE-14547

> Flaky test: TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
> 
>
> Key: HIVE-14973
> URL: https://issues.apache.org/jira/browse/HIVE-14973
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 2.2.0
>Reporter: Gopal V
>
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> AlreadyExistsException(message:Table test_jdbc_sql_auth_udf already exists)
>  at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:854)
>  at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:862)
>  at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4052)
>  at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:340)
>  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199)
>  at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>  at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1988)
>  at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1679)
>  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1410)
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1143)
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1136)
>  at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:248)
> {code}





[jira] [Updated] (HIVE-14957) HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union

2016-10-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14957:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union
> --
>
> Key: HIVE-14957
> URL: https://issues.apache.org/jira/browse/HIVE-14957
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0
>
> Attachments: HIVE-14957.01.patch, HIVE-14957.02.patch
>
>
> {code}
> call.transformTo(parent.copy(parent.getTraitSet(), 
> ImmutableList.of(relBuilder.build())));
> {code}
> When the parent is a union operator with 2 inputs, parent.copy will 
> only copy the one that has SortLimit and ignore the other branches.





[jira] [Updated] (HIVE-14957) HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union

2016-10-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14957:
---
Affects Version/s: 2.1.0

> HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union
> --
>
> Key: HIVE-14957
> URL: https://issues.apache.org/jira/browse/HIVE-14957
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0
>
> Attachments: HIVE-14957.01.patch, HIVE-14957.02.patch
>
>
> {code}
> call.transformTo(parent.copy(parent.getTraitSet(), 
> ImmutableList.of(relBuilder.build())));
> {code}
> When the parent is a union operator with 2 inputs, parent.copy will 
> only copy the one that has SortLimit and ignore the other branches.





[jira] [Updated] (HIVE-14957) HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union

2016-10-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14957:
---
Fix Version/s: 2.2.0

> HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union
> --
>
> Key: HIVE-14957
> URL: https://issues.apache.org/jira/browse/HIVE-14957
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0
>
> Attachments: HIVE-14957.01.patch, HIVE-14957.02.patch
>
>
> {code}
> call.transformTo(parent.copy(parent.getTraitSet(), 
> ImmutableList.of(relBuilder.build())));
> {code}
> When the parent is a union operator with 2 inputs, parent.copy will 
> only copy the one that has SortLimit and ignore the other branches.





[jira] [Updated] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-17 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-14981:

Status: Patch Available  (was: In Progress)

Build #1 failed due to infrastructure.

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch, HIVE-14981.02.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 is 
> unnecessary.  It caused the LLAP orc_llap.q test to time out on Hive QA because 
> the regular VectorMapJoinOperator is too slow.





[jira] [Updated] (HIVE-14935) Add tests for beeline force option

2016-10-17 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14935:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master, thanks Kavan.

> Add tests for beeline force option
> --
>
> Key: HIVE-14935
> URL: https://issues.apache.org/jira/browse/HIVE-14935
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Kavan Suresh
>Assignee: Kavan Suresh
> Fix For: 2.2.0
>
> Attachments: HIVE-14935.1.patch
>
>
> Add unit test for beeline with force option to ensure continuation of running 
> script even after errors.





[jira] [Commented] (HIVE-14891) Parallelize TestHCatStorer

2016-10-17 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583047#comment-15583047
 ] 

Siddharth Seth commented on HIVE-14891:
---

HIVE-14973 to HIVE-14978, along with HIVE-14910, cover the flaky tests. They're 
not related to the patch.
+1 for the patch. There's one downside, which is that any new formats would need 
to explicitly add a test class (earlier these were discovered automatically). I 
think that's acceptable for now.

> Parallelize TestHCatStorer
> --
>
> Key: HIVE-14891
> URL: https://issues.apache.org/jira/browse/HIVE-14891
> Project: Hive
>  Issue Type: Sub-task
>  Components: HCatalog, Testing Infrastructure
>Affects Versions: 2.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-14891.1.patch, HIVE-14891.1.patch, 
> HIVE-14891.1.patch
>
>
> Currently TestHCatStorer runs as a parameterized test, where it runs the same 
> tests for each storage format but within the same junit test case. This 
> prevents it from being parallelized using ptest where parallelism granularity 
> is at a test case level. Instead of using parameterized tests, it makes sense 
> to create a new test case for each storage format.





[jira] [Commented] (HIVE-14837) JDBC: standalone jar is missing hadoop core dependencies

2016-10-17 Thread Tao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583053#comment-15583053
 ] 

Tao Li commented on HIVE-14837:
---

[~gopalv] Can you please describe how to reproduce this error? Thanks.

> JDBC: standalone jar is missing hadoop core dependencies
> 
>
> Key: HIVE-14837
> URL: https://issues.apache.org/jira/browse/HIVE-14837
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.2.0
>Reporter: Gopal V
>
> {code}
> 2016/09/24 00:31:57 ERROR - jmeter.threads.JMeterThread: Test failed! 
> java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
> at 
> org.apache.hive.jdbc.HiveConnection.createUnderlyingTransport(HiveConnection.java:418)
> at 
> org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:438)
> at 
> org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:225)
> at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:182)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
> {code}





[jira] [Updated] (HIVE-13046) DependencyResolver should not lowercase the dependency URI's authority

2016-10-17 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-13046:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master.

> DependencyResolver should not lowercase the dependency URI's authority
> --
>
> Key: HIVE-13046
> URL: https://issues.apache.org/jira/browse/HIVE-13046
> Project: Hive
>  Issue Type: Bug
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Fix For: 2.2.0
>
> Attachments: HIVE-13046.1.patch, HIVE-13046.2.patch
>
>
> When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, 
> Hive will lowercase it to {{1.2.3-snapshot}} due to:
> {code:title=DependencyResolver.java#84}
> String[] authorityTokens = authority.toLowerCase().split(":");
> {code}
> We should not call {{.toLowerCase()}} on the authority.
> RB: https://reviews.apache.org/r/43513
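
The effect described above is easy to reproduce in isolation. The sketch below is hypothetical (the class, the method names and the sample authority string are made up for illustration); only the `toLowerCase().split(":")` expression comes from the quoted line of DependencyResolver.java.

```java
/** Hypothetical sketch of the two tokenizing variants. */
final class AuthorityCaseSketch {
    /** Mirrors the quoted line: lowercases the whole authority first. */
    static String[] tokensLowercased(String authority) {
        return authority.toLowerCase().split(":");
    }

    /** Proposed behavior: split without changing the case. */
    static String[] tokensPreserved(String authority) {
        return authority.split(":");
    }
}
```

For an authority such as `org.example:mylib:1.2.3-SNAPSHOT`, the first variant yields the version token `1.2.3-snapshot`, while the second keeps `1.2.3-SNAPSHOT`.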





[jira] [Updated] (HIVE-13046) DependencyResolver should not lowercase the dependency URI's authority

2016-10-17 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-13046:
--
Fix Version/s: 2.2.0

> DependencyResolver should not lowercase the dependency URI's authority
> --
>
> Key: HIVE-13046
> URL: https://issues.apache.org/jira/browse/HIVE-13046
> Project: Hive
>  Issue Type: Bug
>Reporter: Anthony Hsu
>Assignee: Anthony Hsu
> Fix For: 2.2.0
>
> Attachments: HIVE-13046.1.patch, HIVE-13046.2.patch
>
>
> When using {{ADD JAR ivy://...}} to add a jar version {{1.2.3-SNAPSHOT}}, 
> Hive will lowercase it to {{1.2.3-snapshot}} due to:
> {code:title=DependencyResolver.java#84}
> String[] authorityTokens = authority.toLowerCase().split(":");
> {code}
> We should not call {{.toLowerCase()}} on the authority.
> RB: https://reviews.apache.org/r/43513





[jira] [Commented] (HIVE-14987) CombineHiveInputFormat with Tez fails to initiate vertex if table is empty

2016-10-17 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582878#comment-15582878
 ] 

Hitesh Shah commented on HIVE-14987:


Moved this to Hive as this seems to be Hive specific. 

> CombineHiveInputFormat with Tez fails to initiate vertex if table is empty
> --
>
> Key: HIVE-14987
> URL: https://issues.apache.org/jira/browse/HIVE-14987
> Project: Hive
>  Issue Type: Bug
>Reporter: Yi Zhang
>
> Sometimes users have developed a custom InputFormat that extends 
> CombineHiveInputFormat, due to the difficulty of extending HiveInputFormat 
> directly, for example to filter out old data files.
> In this use case, the vertex fails to get initialized:
> SELECT city.cid
> FROM
> (select city_id as cid,
> row_number() over(partition by timezone order by population) rnum
> from cities) city
> JOIN
>   (select datestr, id from yizhang.emptyparts where datestr >= 
> date_sub(current_date(),30)) emp
> on city.cid = emp.id
> ;
> 
> VERTICES   STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> 
> Map 1      KILLED     -1          0        0       -1       0       0
> Map 3      FAILED     -1          0        0       -1       0       0
> Reducer 2  KILLED      1          0        0        1       0       0
> 
> VERTICES: 00/03  [>>--] 0%  ELAPSED TIME: 0.34 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 3, vertexId=vertex_1476217616538_398108_1_01, 
> diagnostics=[Vertex vertex_1476217616538_398108_1_01 [Map 3] killed/failed 
> due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: emp initializer failed, 
> vertex=vertex_1476217616538_398108_1_01 [Map 3], 
> java.lang.IllegalArgumentException
>   at 
> java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1307)
>   at 
> java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1195)
>   at java.util.concurrent.Executors.newFixedThreadPool(Executors.java:89)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:519)
>   at 
> org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:447)
>   at 
> org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplitsToMem(MRInputHelpers.java:299)
>   at 
> org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.initialize(MRInputAMSplitGenerator.java:121)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:264)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:258)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:258)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:245)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> ]
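
The trace above ends in the ThreadPoolExecutor constructor, reached through Executors.newFixedThreadPool from CombineHiveInputFormat.getSplits. One plausible reading, which is an assumption and not confirmed in this thread, is that the pool size is derived from the number of inputs and becomes 0 when the table is empty. The JDK behavior itself can be checked with a small sketch; the class and the `safePoolSize` guard are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch; not code from CombineHiveInputFormat. */
final class EmptySplitPoolSketch {
    /** Hypothetical guard: never ask for fewer than one thread. */
    static int safePoolSize(int requested) {
        return Math.max(1, requested);
    }

    /** Returns true if creating a fixed pool of the given size is rejected. */
    static boolean rejects(int nThreads) {
        try {
            ExecutorService pool = Executors.newFixedThreadPool(nThreads);
            pool.shutdown(); // release the pool right away; it is only a probe
            return false;
        } catch (IllegalArgumentException e) {
            return true; // thrown when nThreads <= 0
        }
    }
}
```

A size of 0 is rejected with IllegalArgumentException, while clamping the size to at least 1 avoids the exception. Whether clamping or returning early with no splits is the right fix for Hive is left open here.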





[jira] [Moved] (HIVE-14987) CombineHiveInputFormat with Tez fails to initiate vertex if table is empty

2016-10-17 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah moved TEZ-3474 to HIVE-14987:
-

Affects Version/s: (was: 0.7.1)
  Key: HIVE-14987  (was: TEZ-3474)
  Project: Hive  (was: Apache Tez)

> CombineHiveInputFormat with Tez fails to initiate vertex if table is empty
> --
>
> Key: HIVE-14987
> URL: https://issues.apache.org/jira/browse/HIVE-14987
> Project: Hive
>  Issue Type: Bug
>Reporter: Yi Zhang
>
> Sometimes users have developed a custom InputFormat that extends 
> CombineHiveInputFormat, due to the difficulty of extending HiveInputFormat 
> directly, for example to filter out old data files.
> In this use case, the vertex fails to get initialized:
> SELECT city.cid
> FROM
> (select city_id as cid,
> row_number() over(partition by timezone order by population) rnum
> from cities) city
> JOIN
>   (select datestr, id from yizhang.emptyparts where datestr >= 
> date_sub(current_date(),30)) emp
> on city.cid = emp.id
> ;
> 
> VERTICES   STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> 
> Map 1      KILLED     -1          0        0       -1       0       0
> Map 3      FAILED     -1          0        0       -1       0       0
> Reducer 2  KILLED      1          0        0        1       0       0
> 
> VERTICES: 00/03  [>>--] 0%  ELAPSED TIME: 0.34 s
>  
> 
> Status: Failed
> Vertex failed, vertexName=Map 3, vertexId=vertex_1476217616538_398108_1_01, 
> diagnostics=[Vertex vertex_1476217616538_398108_1_01 [Map 3] killed/failed 
> due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: emp initializer failed, 
> vertex=vertex_1476217616538_398108_1_01 [Map 3], 
> java.lang.IllegalArgumentException
>   at 
> java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1307)
>   at 
> java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1195)
>   at java.util.concurrent.Executors.newFixedThreadPool(Executors.java:89)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:519)
>   at 
> org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:447)
>   at 
> org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplitsToMem(MRInputHelpers.java:299)
>   at 
> org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.initialize(MRInputAMSplitGenerator.java:121)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:264)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:258)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:258)
>   at 
> org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:245)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> ]





[jira] [Commented] (HIVE-13423) Handle the overflow case for decimal datatype for sum()

2016-10-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582931#comment-15582931
 ] 

Xuefu Zhang commented on HIVE-13423:


[~aihuaxu], thanks for looking into this. Do we know the cause of the 
ArrayIndexOutOfBoundsException?

Giving a warning message is fine, though that may not help in all cases. I think 
the right behavior is to return NULL when the result overflows, but to provide a 
strict mode in which an error is thrown instead. This should be considered 
for all such cases.

One thing to find out is what happens when summing integer columns. In that 
case, overflow can also occur. I expect that NULL will be returned. For decimal, 
we should do the same until a general strict mode is implemented.
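
The proposed semantics can be sketched for the integer case in plain Java. This illustrates the behavior suggested above; it does not describe what Hive currently does, and the class name and the `strict` flag are hypothetical.

```java
/** Hypothetical sketch of NULL-on-overflow with an optional strict mode. */
final class OverflowSumSketch {
    /**
     * Sums the values. On overflow, returns null (standing in for SQL NULL),
     * or rethrows the ArithmeticException when strict is true.
     */
    static Long sum(long[] values, boolean strict) {
        long total = 0L;
        for (long v : values) {
            try {
                total = Math.addExact(total, v); // throws on long overflow
            } catch (ArithmeticException overflow) {
                if (strict) {
                    throw overflow;
                }
                return null;
            }
        }
        return total;
    }
}
```

In the default mode the caller sees a null result for an overflowing sum; in strict mode the same input surfaces as an error, which matches the two behaviors discussed in the comment.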



> Handle the overflow case for decimal datatype for sum()
> ---
>
> Key: HIVE-13423
> URL: https://issues.apache.org/jira/browse/HIVE-13423
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13423.1.patch
>
>
> When a column col1 is defined as decimal and the sum of the column overflows, 
> we try to increase the decimal precision by 10. But if it reaches 38 
> (the max precision), the overflow can still happen. Right now, in that case, 
> the following exception is thrown because Hive is writing incorrect 
> data.
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:219)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}





[jira] [Comment Edited] (HIVE-13423) Handle the overflow case for decimal datatype for sum()

2016-10-17 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582911#comment-15582911
 ] 

Chaoyu Tang edited comment on HIVE-13423 at 10/17/16 5:59 PM:
--

[~aihuaxu] The patch looks good. I believe the issue might also exist in other 
arithmetic functions or operations such as plus (+) and multiplication (*). I 
wonder if we need to truncate the scale, like SQL Server does, to fit the 
intermediate data to the precision, as discussed in HIVE-14281. [~xuefuz], you 
have more insight into decimals; what are your thoughts?


was (Author: ctang.ma):
[~aihuaxu] The patch looks good. I believe the issue might also exist in other 
arithmetic functions or operations such as plus (+) and multiplication (*). I 
wonder if we need to truncate the scale, like SQL Server does, to fit the 
intermediate data to the precision, as discussed in HIVE-14281.

> Handle the overflow case for decimal datatype for sum()
> ---
>
> Key: HIVE-13423
> URL: https://issues.apache.org/jira/browse/HIVE-13423
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13423.1.patch
>
>
> When a column col1 is defined as decimal and the sum of the column overflows, 
> we try to increase the decimal precision by 10. But if it reaches 38 
> (the max precision), the overflow can still happen. Right now, in that case, 
> the following exception is thrown because Hive is writing incorrect 
> data.
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:219)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}
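
To illustrate the overflow condition described above, here is a minimal sketch that uses java.math.BigDecimal rather than Hive's own decimal classes. The 38-digit limit comes from the description; the names DecimalSumSketch and sumOrNull are hypothetical, and returning null on overflow is only an assumed policy, not necessarily what the attached patch does.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;

final class DecimalSumSketch {
    // Maximum decimal precision mentioned in the issue description.
    static final int MAX_PRECISION = 38;

    // Sums the values and returns null as soon as the running total needs
    // more than MAX_PRECISION digits, instead of emitting a value that the
    // declared type cannot represent. Returning null is an assumed policy.
    static BigDecimal sumOrNull(List<BigDecimal> values) {
        BigDecimal total = BigDecimal.ZERO;
        for (BigDecimal v : values) {
            total = total.add(v);
            if (total.precision() > MAX_PRECISION) {
                return null; // overflow detected
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Largest 38-digit integer: 10^38 - 1.
        BigDecimal max38 = BigDecimal.TEN.pow(38).subtract(BigDecimal.ONE);
        System.out.println(sumOrNull(Arrays.asList(BigDecimal.ONE, BigDecimal.TEN)));
        System.out.println(sumOrNull(Arrays.asList(max38, BigDecimal.ONE)));
    }
}
```

The point of the check is that the overflow is detected before anything is serialized, so a reader never sees a malformed intermediate value.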



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14958) Improve the 'TestClass' did not produce a TEST-*.xml file message to include list of all qfiles in a batch, batch id

2016-10-17 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14958:
--
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Thanks for the review. Committed.

> Improve the 'TestClass' did not produce a TEST-*.xml file message to include 
> list of all qfiles in a batch, batch id
> 
>
> Key: HIVE-14958
> URL: https://issues.apache.org/jira/browse/HIVE-14958
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Fix For: 2.2.0
>
> Attachments: HIVE-14958.01.patch, HIVE-14958.02.patch
>
>
> Should make it easier to hunt down the logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14980) Minor compaction when triggered simultaniously on the same table/partition deletes data

2016-10-17 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583692#comment-15583692
 ] 

Eugene Koifman commented on HIVE-14980:
---

Relying on "show compactions" is not atomic so it's not a complete fix.
It should use locks of some kind, but not in the current lock manager.  
MutexAPI.acquireLock(String) was meant to support the kind of locking that this 
needs but it's not quite complete.  If you use  for the 
key, and use this from Worker, it will achieve the proper synchronization 
atomically and the "lock" will be released if the process dies.
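
The difference between a check-then-act test and an atomic per-key lock can be sketched as follows. This is an in-process illustration only: it is not Hive's MutexAPI, the class CompactionMutexSketch and its methods are hypothetical, and a real fix would need a lock that is shared across metastore instances and released if the holder dies.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class CompactionMutexSketch {
    // Keys (for example a table/partition name) currently being compacted.
    private final Set<String> active = ConcurrentHashMap.newKeySet();

    // Atomically claims the key; returns false if another worker holds it.
    // Set.add on a concurrent set is a single test-and-set, unlike
    // "list the running compactions, then start one".
    boolean tryAcquire(String key) {
        return active.add(key);
    }

    void release(String key) {
        active.remove(key);
    }

    // Runs the job only if the key could be claimed, and always releases it.
    boolean runExclusive(String key, Runnable job) {
        if (!tryAcquire(key)) {
            return false; // someone else is already compacting this key
        }
        try {
            job.run();
            return true;
        } finally {
            release(key);
        }
    }
}
```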


> Minor compaction when triggered simultaniously on the same table/partition 
> deletes data
> ---
>
> Key: HIVE-14980
> URL: https://issues.apache.org/jira/browse/HIVE-14980
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 2.1.0
>Reporter: Mahipal Jupalli
>Assignee: Mahipal Jupalli
>Priority: Critical
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> I have two tables (TABLEA, TABLEB). If I manually trigger compaction after 
> each INSERT into TABLEB from TABLEA, the compactions are triggered 
> asynchronously on random metastore instances and step on each other, which 
> causes data to be deleted.
> Example here: 
> TABLEA - has 10k rows. 
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> insert into mj.tableb select * from mj.tablea;
> alter table mj.tableb compact 'MINOR';
> Once all the compactions are complete, I should ideally see 20k rows in 
> TABLEB. But I see only 10k rows (Only the rows INSERTED before the last 
> compaction persist, the old rows are deleted. I believe the old delta files 
> are deleted). 
> To further confirm the bug, if I do only one compaction after two inserts, I 
> see 20k rows in TABLEB.
> Proposed Fix:
> I have identified the bug in the code; it requires an additional check in the 
> org.apache.hadoop.hive.ql.txn.compactor.Worker class for any active 
> compactions on the table/partition. I will share the details of the fix once 
> I test it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14992) Relocate several common libraries in hive jdbc uber jar

2016-10-17 Thread Tao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583711#comment-15583711
 ] 

Tao Li commented on HIVE-14992:
---

This is to avoid dependency version conflicts when users are using some common 
libs along with the JDBC standalone jar.
cc [~gopalv], [~thejas]

> Relocate several common libraries in hive jdbc uber jar
> ---
>
> Key: HIVE-14992
> URL: https://issues.apache.org/jira/browse/HIVE-14992
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14927) Remove code duplication from tests in TestLdapAtnProviderWithMiniDS

2016-10-17 Thread Illya Yalovyy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583621#comment-15583621
 ] 

Illya Yalovyy commented on HIVE-14927:
--

At the moment I can see many builds are failing with similar symptoms.

> Remove code duplication from tests in TestLdapAtnProviderWithMiniDS
> ---
>
> Key: HIVE-14927
> URL: https://issues.apache.org/jira/browse/HIVE-14927
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Attachments: HIVE-14927.1.patch, HIVE-14927.2.patch, 
> HIVE-14927.3.patch
>
>
> * Extract inner class User and implement a proper builder for it.
> * Extract all common code to LdapAuthenticationTestCase class 
> ** setting up the test case
> ** executing test case
> ** result validation



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14753) Track the number of open/closed/abandoned sessions in HS2

2016-10-17 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-14753:
---
Status: Patch Available  (was: Open)

> Track the number of open/closed/abandoned sessions in HS2
> -
>
> Key: HIVE-14753
> URL: https://issues.apache.org/jira/browse/HIVE-14753
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-14753.1.patch, HIVE-14753.2.patch, HIVE-14753.patch
>
>
> We should be able to track the nr. of sessions since the startup of the HS2 
> instance as well as the average lifetime of a session.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14753) Track the number of open/closed/abandoned sessions in HS2

2016-10-17 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-14753:
---
Attachment: HIVE-14753.2.patch

Rebased the patch as TestSessionManagerMetrics has changed.
Made changes to the test to prevent race conditions from introducing flaky test 
failures.

> Track the number of open/closed/abandoned sessions in HS2
> -
>
> Key: HIVE-14753
> URL: https://issues.apache.org/jira/browse/HIVE-14753
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
> Fix For: 2.2.0
>
> Attachments: HIVE-14753.1.patch, HIVE-14753.2.patch, HIVE-14753.patch
>
>
> We should be able to track the nr. of sessions since the startup of the HS2 
> instance as well as the average lifetime of a session.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14985) Remove UDF-s created during test runs

2016-10-17 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14985:
--
Status: Patch Available  (was: Open)

> Remove UDF-s created during test runs
> -
>
> Key: HIVE-14985
> URL: https://issues.apache.org/jira/browse/HIVE-14985
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14985.patch
>
>
> When I ran llap_udf.q repeatedly from my IDE, the first run passed but the 
> following runs failed. 
> The query file does not remove the functions it creates, which could cause 
> problems for subsequent tests.
> The same problem could happen if a query test fails in the middle of the 
> script: even though the file contains the removal SQL commands, they are 
> not executed.
> It might be a good idea to clean up not just tables and keys, but also 
> functions created during the test run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14985) Remove UDF-s created during test runs

2016-10-17 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14985:
--
Attachment: HIVE-14985.patch

Based on clearTablesCreatedDuringTests, created clearUDFsCreatedDuringTests 
to remove extra UDFs.
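
A minimal sketch of the idea, assuming the cleanup works by comparing the set of function names before and after a test. The class and method names below are hypothetical and do not reproduce the patch; in Hive the resulting names would then be dropped by the test harness.

```java
import java.util.Set;
import java.util.TreeSet;

final class UdfCleanupSketch {
    // Returns the functions that exist after the test but did not exist
    // before it, i.e. the ones a cleanup step should drop.
    static Set<String> functionsToDrop(Set<String> before, Set<String> after) {
        Set<String> created = new TreeSet<>(after);
        created.removeAll(before);
        return created;
    }
}
```

Running such a step in the teardown phase, rather than at the end of the query file, would also cover the case where the script fails halfway through.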


> Remove UDF-s created during test runs
> -
>
> Key: HIVE-14985
> URL: https://issues.apache.org/jira/browse/HIVE-14985
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14985.patch
>
>
> When I ran llap_udf.q repeatedly from my IDE, the first run passed but the 
> following runs failed. 
> The query file does not remove the functions it creates, which could cause 
> problems for subsequent tests.
> The same problem could happen if a query test fails in the middle of the 
> script: even though the file contains the removal SQL commands, they are 
> not executed.
> It might be a good idea to clean up not just tables and keys, but also 
> functions created during the test run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14984) Hive-WebUI access results in Request is a replay (34) attack

2016-10-17 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582413#comment-15582413
 ] 

Aihua Xu commented on HIVE-14984:
-

[~szehon] Do you have any idea? 

> Hive-WebUI access results in Request is a replay (34) attack
> 
>
> Key: HIVE-14984
> URL: https://issues.apache.org/jira/browse/HIVE-14984
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.0
>Reporter: Venkat Sambath
>
> When trying to access the kerberized web UI of HS2, the following error is received:
> GSSException: Failure unspecified at GSS-API level (Mechanism level: Request 
> is a replay (34))
> While this is not happening for RM webui (checked if kerberos webui is 
> enabled)
> To reproduce the issue 
> Try running
> curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 
> http://:10002/
> from any cluster nodes
> or 
> Try accessing the URL from a VM with windows machine and firefox browser to 
> replicate the issue
> The following workaround helped, but a permanent solution for the bug is needed.
> Workaround:
> =
> First access the index.html directly and then actual URL of webui
> curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 
> http://:10002/index.html
> curl --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt 
> http://:10002
> In browser:
> First access
> http://:10002/index.html
> then
> http://:10002



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13423) Handle the overflow case for decimal datatype for sum()

2016-10-17 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582458#comment-15582458
 ] 

Aihua Xu commented on HIVE-13423:
-

[~xuefuz] and [~ctang.ma] I remember that we initially had issues with GroupBy 
on this decimal data type, but I no longer see such issues (it seems they have 
been fixed by HIVE-6459).

But we still have a small issue: when the sum overflows, it produces a 
corrupted intermediate file and throws ArrayIndexOutOfBoundsException. 

Could you take a look at the simple fix, or do you have a better idea?

> Handle the overflow case for decimal datatype for sum()
> ---
>
> Key: HIVE-13423
> URL: https://issues.apache.org/jira/browse/HIVE-13423
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13423.1.patch
>
>
> When a column col1 is defined as decimal and the sum of the column overflows, 
> we try to increase the decimal precision by 10. But once the precision reaches 
> 38 (the max precision), the overflow can still happen. Right now, in that case, 
> the following exception is thrown because Hive is writing incorrect data.
> {noformat}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.readVInt(LazyBinaryUtils.java:314)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:219)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142)
>  ~[hive-exec-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14864) Distcp is not called from MoveTask when src is a directory

2016-10-17 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582459#comment-15582459
 ] 

Sergio Peña commented on HIVE-14864:


A couple of comments:

* Will srcFS.getContentSummary(src) cause extra time when source is on S3? If 
so, maybe we want to put this line inside the if() statement.

* While we're here, could you fix the message for HIVE_EXEC_COPYFILE_MAXSIZE to 
say (in bytes) instead of (in Mb)?

> Distcp is not called from MoveTask when src is a directory
> --
>
> Key: HIVE-14864
> URL: https://issues.apache.org/jira/browse/HIVE-14864
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Sahil Takiar
> Attachments: HIVE-14864.1.patch, HIVE-14864.2.patch, HIVE-14864.patch
>
>
> In FileUtils.java the following code does not get executed even when src 
> directory size is greater than HIVE_EXEC_COPYFILE_MAXSIZE because 
> srcFS.getFileStatus(src).getLen() returns 0 when src is a directory. We 
> should use srcFS.getContentSummary(src).getLength() instead.
> {noformat}
> /* Run distcp if source file/dir is too big */
> if (srcFS.getUri().getScheme().equals("hdfs") &&
>     srcFS.getFileStatus(src).getLen() > conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE)) {
>   LOG.info("Source is " + srcFS.getFileStatus(src).getLen() + " bytes. (MAX: "
>       + conf.getLongVar(HiveConf.ConfVars.HIVE_EXEC_COPYFILE_MAXSIZE) + ")");
>   LOG.info("Launch distributed copy (distcp) job.");
>   HiveConfUtil.updateJobCredentialProviders(conf);
>   copied = shims.runDistCp(src, dst, conf);
>   if (copied && deleteSource) {
>     srcFS.delete(src, true);
>   }
> }
> {noformat}
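
The decision logic discussed above, including the review concern about the cost of computing a directory's total size, can be sketched without any Hadoop dependency. Here the LongSupplier stands in for a call such as srcFS.getContentSummary(src).getLength(); the class and method names are hypothetical, and this is an illustration of the evaluation order rather than the actual patch.

```java
import java.util.function.LongSupplier;

final class DistCpDecisionSketch {
    // Decides whether to launch a distributed copy. The scheme test comes
    // first, so the potentially expensive size lookup is only evaluated for
    // HDFS sources (&& short-circuits when the left side is false).
    static boolean shouldRunDistCp(String scheme, LongSupplier totalLength, long maxSize) {
        return "hdfs".equals(scheme) && totalLength.getAsLong() > maxSize;
    }
}
```

Passing the size as a supplier keeps the lookup lazy: for a source on another file system the supplier is never invoked.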



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14958) Improve the 'TestClass' did not produce a TEST-*.xml file message to include list of all qfiles in a batch, batch id

2016-10-17 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582448#comment-15582448
 ] 

Sergio Peña commented on HIVE-14958:


LGTM +1

> Improve the 'TestClass' did not produce a TEST-*.xml file message to include 
> list of all qfiles in a batch, batch id
> 
>
> Key: HIVE-14958
> URL: https://issues.apache.org/jira/browse/HIVE-14958
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14958.01.patch, HIVE-14958.02.patch
>
>
> Should make it easier to hunt down the logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14968) Fix compilation failure on branch-1

2016-10-17 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582480#comment-15582480
 ] 

Sergio Peña commented on HIVE-14968:


I see a LONG being cast to INT. Isn't that going to have overflow issues if the 
LONG value is too large to fit?
Btw, you have to attach the file as HIVE-14968-branch-1.1.patch 
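
For reference, a small standalone illustration of the concern (not taken from the patch; the class name is hypothetical): a plain narrowing cast silently keeps only the low 32 bits, while Math.toIntExact fails loudly when the value does not fit.

```java
final class NarrowingSketch {
    // Plain cast: silently discards the high 32 bits of the long.
    static int unchecked(long value) {
        return (int) value;
    }

    // Checked conversion: throws ArithmeticException if the value is
    // outside the int range.
    static int checked(long value) {
        return Math.toIntExact(value);
    }

    public static void main(String[] args) {
        long tooLarge = (1L << 32) + 1; // 4294967297
        System.out.println(unchecked(tooLarge)); // wraps around
        System.out.println(checked(42L));        // fits, returned unchanged
    }
}
```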

> Fix compilation failure on branch-1
> ---
>
> Key: HIVE-14968
> URL: https://issues.apache.org/jira/browse/HIVE-14968
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0
>
> Attachments: HIVE-14968.1.patch
>
>
> branch-1 compilation failure due to:
> HIVE-14436: Hive 1.2.1/Hitting "ql.Driver: FAILED: IllegalArgumentException 
> Error: , expected at the end of 'decimal(9'" after enabling 
> hive.optimize.skewjoin and with MR engine
> HIVE-14483 : java.lang.ArrayIndexOutOfBoundsException 
> org.apache.orc.impl.TreeReaderFactory.commonReadByteArrays
> 1.2 branch is fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic

2016-10-17 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-13557:

Attachment: HIVE-13557.1.patch

> Make interval keyword optional while specifying DAY in interval arithmetic
> --
>
> Key: HIVE-13557
> URL: https://issues.apache.org/jira/browse/HIVE-13557
> Project: Hive
>  Issue Type: Sub-task
>  Components: Types
>Reporter: Ashutosh Chauhan
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13557.1.patch
>
>
> Currently we support expressions like: {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31'))  - INTERVAL '30' DAY) AND 
> DATE('2000-01-31')
> {code}
> We should support:
> {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND 
> DATE('2000-01-31')
> {code}
>   
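
As a quick sanity check that the two WHERE clauses above describe the same lower bound, here is the date arithmetic written with java.time. This only illustrates the intended equivalence; it says nothing about how Hive's parser or interval types implement it, and the class name is hypothetical.

```java
import java.time.LocalDate;

final class IntervalSketch {
    // Counterpart of "DATE(...) - INTERVAL 'n' DAY".
    static LocalDate minusIntervalDays(LocalDate date, long days) {
        return date.minusDays(days);
    }

    // Counterpart of "DATE(...) + (n) DAY", where n may be negative.
    static LocalDate plusDays(LocalDate date, long days) {
        return date.plusDays(days);
    }

    public static void main(String[] args) {
        LocalDate end = LocalDate.of(2000, 1, 31);
        System.out.println(minusIntervalDays(end, 30));
        System.out.println(plusDays(end, -30));
    }
}
```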



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic

2016-10-17 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-13557:

Attachment: (was: HIVE-13557.1.patch)

> Make interval keyword optional while specifying DAY in interval arithmetic
> --
>
> Key: HIVE-13557
> URL: https://issues.apache.org/jira/browse/HIVE-13557
> Project: Hive
>  Issue Type: Sub-task
>  Components: Types
>Reporter: Ashutosh Chauhan
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13557.1.patch
>
>
> Currently we support expressions like: {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31'))  - INTERVAL '30' DAY) AND 
> DATE('2000-01-31')
> {code}
> We should support:
> {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND 
> DATE('2000-01-31')
> {code}
>   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2

2016-10-17 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-14822:
---
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, [~vihangk1]!

> Add support for credential provider for jobs launched from Hiveserver2
> --
>
> Key: HIVE-14822
> URL: https://issues.apache.org/jira/browse/HIVE-14822
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Fix For: 2.2.0
>
> Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, 
> HIVE-14822.03.patch, HIVE-14822.05.patch, HIVE-14822.06.patch, 
> HIVE-14822.07.patch
>
>
> When using encrypted passwords via the Hadoop Credential Provider, 
> HiveServer2 currently does not correctly forward enough information to the 
> job configuration for jobs to read those secrets. If your job needs to access 
> any secrets, like S3 credentials, then there's no convenient and secure way 
> to configure this today.
> You could specify the decryption key in files like mapred-site.xml that 
> HiveServer2 uses, but this would place the encryption password on local disk 
> in plaintext, which can be a security concern.
> To solve this problem, HiveServer2 should modify job configuration to include 
> the environment variable settings needed to decrypt the passwords. 
> Specifically, it will need to modify:
> * For MR2 jobs:
> ** yarn.app.mapreduce.am.admin.user.env
> ** mapreduce.admin.user.env
> * For Spark jobs:
> ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD
> ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD
> HiveServer2 can get the decryption password from its own environment, the 
> same way it does for its own credential provider store today.
> Additionally, it can be desirable for HiveServer2 to have a separate 
> encrypted password file than what is used by the job. HiveServer2 may have 
> secrets that the job should not have, such as the metastore database password 
> or the password to decrypt its private SSL certificate. It is also best 
> practice to keep separate passwords in separate files. To facilitate this, 
> Hive will also accept:
> * A configuration for a path to a credential store to use for jobs. This 
> should already be uploaded in HDFS. (hive.server2.job.keystore.location or a 
> better name) If this is not specified, then HS2 will simply use the value of 
> hadoop.security.credential.provider.path.
> * An environment variable for the password to decrypt the credential store 
> (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 
> will simply use the standard environment variable for decrypting the Hadoop 
> Credential Provider.
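
The fallback rules in the description can be sketched with plain maps standing in for the HiveServer2 configuration, its environment, and the job configuration. The property and variable names are the ones proposed in the description (which itself calls them provisional); the class name, the "NAME=value" format used for the MR2 env settings, and the choice to overwrite rather than append to existing values are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class JobCredentialConfigSketch {
    static final String JOB_KEYSTORE_KEY = "hive.server2.job.keystore.location";
    static final String PROVIDER_PATH_KEY = "hadoop.security.credential.provider.path";
    static final String JOB_PASSWORD_ENV = "HIVE_JOB_KEYSTORE_PASSWORD";
    static final String STANDARD_PASSWORD_ENV = "HADOOP_CREDSTORE_PASSWORD";

    // Computes the settings to add to the job configuration.
    static Map<String, String> jobSettings(Map<String, String> hs2Conf, Map<String, String> env) {
        Map<String, String> job = new LinkedHashMap<>();
        // Prefer the job-specific store; otherwise fall back to HS2's own path.
        String store = hs2Conf.getOrDefault(JOB_KEYSTORE_KEY, hs2Conf.get(PROVIDER_PATH_KEY));
        // Prefer the job-specific password; otherwise use the standard variable.
        String password = env.getOrDefault(JOB_PASSWORD_ENV, env.get(STANDARD_PASSWORD_ENV));
        if (store == null || password == null) {
            return job; // nothing to forward
        }
        job.put(PROVIDER_PATH_KEY, store);
        String envEntry = STANDARD_PASSWORD_ENV + "=" + password; // assumed format
        job.put("yarn.app.mapreduce.am.admin.user.env", envEntry);
        job.put("mapreduce.admin.user.env", envEntry);
        job.put("spark.yarn.appMasterEnv." + STANDARD_PASSWORD_ENV, password);
        job.put("spark.executorEnv." + STANDARD_PASSWORD_ENV, password);
        return job;
    }
}
```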



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14063) beeline to auto connect to the HiveServer2

2016-10-17 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-14063:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks Vihang for the work.

BTW: can you update the doc in the wiki?

> beeline to auto connect to the HiveServer2
> --
>
> Key: HIVE-14063
> URL: https://issues.apache.org/jira/browse/HIVE-14063
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-14063.01.patch, HIVE-14063.02.patch, 
> beeline.conf.template
>
>
> Currently one has to give a jdbc:hive2 URL in order for Beeline to connect to a 
> HiveServer2 instance. It would be great if Beeline could get the info somehow 
> (from a properties file at a well-known location?) and connect automatically 
> if the user doesn't specify such a URL. If the properties file is not present, 
> then Beeline would expect the user to provide the URL and credentials using 
> the !connect or ./beeline -u .. commands.
> While Beeline is flexible (being a mere JDBC client), most environments would 
> have just a single HS2. Requiring users to connect to it manually via either 
> "beeline ~/.propsfile" or -u or !connect statements degrades the user 
> experience.
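
The lookup order described above (an explicit URL wins, otherwise use defaults from a properties file, otherwise ask the user) might look roughly like this. The property key "beeline.url" and the class name are hypothetical placeholders; the committed patch and beeline.conf.template may use a different file format and key names.

```java
import java.util.Optional;
import java.util.Properties;

final class BeelineUrlSketch {
    // Hypothetical key; not necessarily what Beeline actually reads.
    static final String URL_KEY = "beeline.url";

    // explicitUrl: value given via -u or !connect, or null if absent.
    // defaults: properties loaded from the well-known file, or null if the
    // file does not exist.
    static Optional<String> resolveUrl(String explicitUrl, Properties defaults) {
        if (explicitUrl != null && !explicitUrl.isEmpty()) {
            return Optional.of(explicitUrl);
        }
        if (defaults != null) {
            return Optional.ofNullable(defaults.getProperty(URL_KEY));
        }
        return Optional.empty(); // caller must prompt for a URL
    }
}
```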



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14968) Fix compilation failure on branch-1

2016-10-17 Thread Daniel Dai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-14968:
--
Attachment: HIVE-14968-branch-1.1.patch

Thanks [~spena], changing the filename to branch-1 spec. The long/int part is 
taken from HIVE-14483. [~Spring], can you clarify?

> Fix compilation failure on branch-1
> ---
>
> Key: HIVE-14968
> URL: https://issues.apache.org/jira/browse/HIVE-14968
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 1.3.0
>
> Attachments: HIVE-14968-branch-1.1.patch, HIVE-14968.1.patch
>
>
> branch-1 compilation failure due to:
> HIVE-14436: Hive 1.2.1/Hitting "ql.Driver: FAILED: IllegalArgumentException 
> Error: , expected at the end of 'decimal(9'" after enabling 
> hive.optimize.skewjoin and with MR engine
> HIVE-14483 : java.lang.ArrayIndexOutOfBoundsException 
> org.apache.orc.impl.TreeReaderFactory.commonReadByteArrays
> 1.2 branch is fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-17 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583344#comment-15583344
 ] 

Matt McCline commented on HIVE-14981:
-

Still have a problem.

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch, HIVE-14981.02.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 is 
> unnecessary.  It caused the LLAP orc_llap.q test to time out on Hive QA because 
> the regular VectorMapJoinOperator is too slow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14029) Update Spark version to 2.0.0

2016-10-17 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583356#comment-15583356
 ] 

Xuefu Zhang commented on HIVE-14029:


[~spena], Keeping backward compatibility (b/c) is a good thing in general. Before 
we take on the effort (which seems considerable), I think we should clearly 
understand and define what b/c means in this case. Spark is releasing rapidly 
without much b/c in mind. So far, Hive on Spark has depended, at one time or 
another, on Spark 1.2, 1.3, 1.4, 1.5, and 1.6. I'm not sure which versions of 
Spark Hive has been released with, but one thing is clear: Spark isn't b/c 
between these releases. Until the Spark community has a good sense of keeping 
b/c in their APIs, it's going to be very hard and burdensome for Hive to 
maintain support for different Spark releases, not to mention the library 
dependency issues we have had.

I'm okay with starting to think about a shim layer to support multiple versions 
of Spark, but it sounds daunting to me due to the dynamics of the Spark project.
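
For concreteness, the general shape of such a shim layer is sketched below. Everything here is hypothetical: the interface, the version switch, and the names are placeholders for discussion, not an existing Hive or Spark API, and a real shim would have to wrap each API that differs between releases.

```java
final class SparkShimSketch {
    // Hypothetical abstraction over the parts of Spark that differ per release.
    interface SparkShim {
        String name();
    }

    // Picks an implementation from the major version of a string like "1.6.0".
    static SparkShim forVersion(String sparkVersion) {
        int major = Integer.parseInt(sparkVersion.split("\\.")[0]);
        switch (major) {
            case 1:
                return () -> "spark-1.x";
            case 2:
                return () -> "spark-2.x";
            default:
                throw new IllegalArgumentException("Unsupported Spark version: " + sparkVersion);
        }
    }
}
```

The cost mentioned above shows up in the number of such implementations: if every minor release breaks compatibility, selecting on the major version alone is not enough.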

> Update Spark version to 2.0.0
> -
>
> Key: HIVE-14029
> URL: https://issues.apache.org/jira/browse/HIVE-14029
> Project: Hive
>  Issue Type: Bug
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
>  Labels: Incompatible, TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14029.1.patch, HIVE-14029.2.patch, 
> HIVE-14029.3.patch, HIVE-14029.4.patch, HIVE-14029.5.patch, 
> HIVE-14029.6.patch, HIVE-14029.7.patch, HIVE-14029.8.patch, HIVE-14029.patch
>
>
> There are quite a few new optimizations in Spark 2.0.0. We need to bump up 
> Spark to 2.0.0 to benefit from those performance improvements.
> To update Spark version to 2.0.0, the following changes are required:
> * Spark API updates:
> ** SparkShuffler#call return Iterator instead of Iterable
> ** SparkListener -> JavaSparkListener
> ** InputMetrics constructor doesn’t accept readMethod
> ** Method remoteBlocksFetched and localBlocksFetched in ShuffleReadMetrics 
> return long type instead of integer
> * Dependency upgrade:
> ** Jackson: 2.4.2 -> 2.6.5
> ** Netty version: 4.0.23.Final -> 4.0.29.Final
> ** Scala binary version: 2.10 -> 2.11
> ** Scala version: 2.10.4 -> 2.11.8



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14982) Remove some reserved keywords in 2.2

2016-10-17 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583386#comment-15583386
 ] 

Pengcheng Xiong commented on HIVE-14982:


We are moving towards SQL2011 standard compliance.  Those keywords conflict 
with SQL2011 standard.


> Remove some reserved keywords in 2.2
> 
>
> Key: HIVE-14982
> URL: https://issues.apache.org/jira/browse/HIVE-14982
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> It seems that CACHE, DAYOFWEEK, VIEWS are reserved keywords in master. This 
> conflicts with SQL2011 standard.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14981) Eliminate unnecessary MapJoin restriction in HIVE-11394

2016-10-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15583371#comment-15583371
 ] 

Hive QA commented on HIVE-14981:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833777/HIVE-14981.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10564 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=143)

[orc_llap.q,vectorization_pushdown.q,correlationoptimizer2.q,cbo_gby_empty.q,vectorization_short_regress.q,identity_project_remove_skip.q,auto_sortmerge_join_1.q,lineage3.q,cross_product_check_1.q,cbo_join.q,vector_struct_in.q,correlationoptimizer6.q,union_remove_26.q,vectorization_13.q,union2.q,groupby2.q,schema_evol_text_vec_table.q,dynpart_sort_opt_vectorization.q,exchgpartition2lel.q,multiMapJoin1.q,sample10.q,vectorized_timestamp_ints_casts.q,vector_char_simple.q,dynpart_sort_optimization_acid.q,auto_sortmerge_join_2.q,bucketizedhiveinputformat.q,leftsemijoin.q,special_character_in_tabnames_1.q,cte_mat_2.q,vectorization_8.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
org.apache.hive.spark.client.TestSparkClient.testJobSubmission (batchId=263)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1600/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1600/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1600/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833777 - PreCommit-HIVE-Build

> Eliminate unnecessary MapJoin restriction in HIVE-11394
> ---
>
> Key: HIVE-14981
> URL: https://issues.apache.org/jira/browse/HIVE-14981
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14981.01.patch, HIVE-14981.02.patch
>
>
> The No Empty Key restriction for Native Vector MapJoin added with HIVE-11394 
> is unnecessary.  It caused the Llap orc_llap.q test to time out on Hive QA 
> because the regular VectorMapJoinOperator is too slow.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14924) MSCK REPAIR table with single threaded is throwing null pointer exception

2016-10-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14924:
---
Status: Patch Available  (was: Open)

> MSCK REPAIR table with single threaded is throwing null pointer exception
> -
>
> Key: HIVE-14924
> URL: https://issues.apache.org/jira/browse/HIVE-14924
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 2.2.0
>Reporter: Ratheesh Kamoor
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14924.01.patch
>
>
> MSCK REPAIR TABLE throws a NullPointerException when running in 
> single-threaded mode (hive.mv.files.thread=0)
> Error:
> 2016-10-10T22:27:13,564 ERROR [e9ce04a8-2a84-426d-8e79-a2d15b8cee09 
> main([])]: exec.DDLTask (DDLTask.java:failed(581)) - 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkPartitionDirs(HiveMetaStoreChecker.java:423)
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.findUnknownPartitions(HiveMetaStoreChecker.java:315)
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkTable(HiveMetaStoreChecker.java:291)
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkTable(HiveMetaStoreChecker.java:236)
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveMetaStoreChecker.checkMetastore(HiveMetaStoreChecker.java:113)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.msck(DDLTask.java:1834)
> To reproduce:
> set hive.mv.files.thread=0 and run the MSCK REPAIR TABLE command





[jira] [Updated] (HIVE-14924) MSCK REPAIR table with single threaded is throwing null pointer exception

2016-10-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14924:
---
Status: Open  (was: Patch Available)



[jira] [Updated] (HIVE-14924) MSCK REPAIR table with single threaded is throwing null pointer exception

2016-10-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14924:
---
Attachment: (was: HIVE-14924.01.patch)



[jira] [Updated] (HIVE-14924) MSCK REPAIR table with single threaded is throwing null pointer exception

2016-10-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14924:
---
Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-14924) MSCK REPAIR table with single threaded is throwing null pointer exception

2016-10-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14924:
---
Status: Open  (was: Patch Available)



[jira] [Updated] (HIVE-14924) MSCK REPAIR table with single threaded is throwing null pointer exception

2016-10-17 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14924:
---
Attachment: HIVE-14924.01.patch


