[jira] [Commented] (HIVE-15203) Hive export command does export to non HDFS default file system

2016-11-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666378#comment-15666378
 ] 

Thejas M Nair commented on HIVE-15203:
--

[~rajesh.balamohan] Thanks for checking. I realized my description wasn't 
accurate: this happens only if no scheme is used with the export URI.
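To make the failure mode concrete, a hypothetical example (assuming fs.defaultFS points at something like s3a://bucket; the bucket name is illustrative):

```sql
-- Fails: no scheme on the export URI, so "hdfs" gets hard-coded and
-- combined with the default file system's authority.
EXPORT TABLE empl TO '/datastore';

-- Works: the scheme is given explicitly, so no hdfs fallback happens.
EXPORT TABLE empl TO 's3a://bucket/datastore';
```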


> Hive export command does export to non HDFS default file system
> ---
>
> Key: HIVE-15203
> URL: https://issues.apache.org/jira/browse/HIVE-15203
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Thejas M Nair
>
> If a non-HDFS filesystem is the default file system, then the export command 
> tries to use the hdfs scheme against the URL of the default file system, if 
> the URL doesn't have a scheme.
> For example, the following command would fail if the default file system is 
> not HDFS: export table empl to '/datastore'; 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15203) Hive export command does export to non HDFS default file system

2016-11-14 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-15203:
-
Description: 
If a non-HDFS filesystem is the default file system, then the export command 
tries to use the hdfs scheme against the URL of the default file system, if the 
URL doesn't have a scheme.

For example, the following command would fail if the default file system is not 
HDFS: export table empl to '/datastore'; 


  was:
Hive export command does export to non HDFS file system.
If a non-HDFS filesystem is the default file system, then the export command 
tries to use the hdfs scheme against the URL of the default file system.



> Hive export command does export to non HDFS default file system
> ---
>
> Key: HIVE-15203
> URL: https://issues.apache.org/jira/browse/HIVE-15203
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Thejas M Nair
>
> If a non-HDFS filesystem is the default file system, then the export command 
> tries to use the hdfs scheme against the URL of the default file system, if 
> the URL doesn't have a scheme.
> For example, the following command would fail if the default file system is 
> not HDFS: export table empl to '/datastore'; 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15203) Hive export command does export to non HDFS default file system

2016-11-14 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-15203:
-
Summary: Hive export command does export to non HDFS default file system  
(was: Hive export command does export to non HDFS file system)

> Hive export command does export to non HDFS default file system
> ---
>
> Key: HIVE-15203
> URL: https://issues.apache.org/jira/browse/HIVE-15203
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Thejas M Nair
>
> Hive export command does export to non HDFS file system.
> If a non-HDFS filesystem is the default file system, then the export command 
> tries to use the hdfs scheme against the URL of the default file system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-14 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666351#comment-15666351
 ] 

Rui Li commented on HIVE-15202:
---

[~ashutoshc], would you mind sharing your ideas on this? Thank you!

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>
> If two compactions run concurrently on a single partition, it may generate 
> folder structure like this: (nested base dir)
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15203) Hive export command does export to non HDFS file system

2016-11-14 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666334#comment-15666334
 ] 

Rajesh Balamohan commented on HIVE-15203:
-

[~thejas] - Does this happen with 
{{hive.exim.uri.scheme.whitelist=hdfs,pfile,s3a}} ? I tried adding s3a to the 
whitelist and was able to proceed with export.
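For reference, the whitelist can be extended in hive-site.xml (a sketch; the s3a entry is the addition being tested here):

```xml
<property>
  <name>hive.exim.uri.scheme.whitelist</name>
  <value>hdfs,pfile,s3a</value>
</property>
```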

> Hive export command does export to non HDFS file system
> ---
>
> Key: HIVE-15203
> URL: https://issues.apache.org/jira/browse/HIVE-15203
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Thejas M Nair
>
> Hive export command does export to non HDFS file system.
> If a non-HDFS filesystem is the default file system, then the export command 
> tries to use the hdfs scheme against the URL of the default file system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15202) Concurrent compactions for the same partition may generate malformed folder structure

2016-11-14 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666323#comment-15666323
 ] 

Rui Li commented on HIVE-15202:
---

Prior to HIVE-13040, selecting from such a table fails with an NPE in split 
generation. With HIVE-13040, the select returns properly. But I'm not sure it 
100% solves the problem, because this isn't the original goal of HIVE-13040.

The root cause is in {{CompactorOutputCommitter::commitJob}}: we're calling 
rename to move the output from the tmp location to the final location. However, 
if the final location already exists, i.e. it was already created by another 
compaction task, the rename will merge the two outputs, resulting in the nested 
base dir we see.
A mitigation is to delete the existing final location before the rename, but I 
guess it won't 100% solve the race condition here.
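The merge-on-rename behaviour described above can be reproduced with plain java.nio as a rough stand-in for HDFS semantics (a sketch; {{hdfsStyleRename}} is a hypothetical helper that mimics how HDFS {{FileSystem.rename}} moves a source *into* an already-existing destination directory):

```java
import java.nio.file.*;

public class RenameNesting {
    // Rough stand-in for HDFS FileSystem.rename(src, dst): when dst already
    // exists as a directory, HDFS moves src *inside* dst rather than
    // replacing it. java.nio.Files.move has no such mode, so emulate it.
    static Path hdfsStyleRename(Path src, Path dst) throws Exception {
        if (Files.isDirectory(dst)) {
            return Files.move(src, dst.resolve(src.getFileName()));
        }
        return Files.move(src, dst);
    }

    public static void main(String[] args) throws Exception {
        Path root = Files.createTempDirectory("compaction-demo");
        // First compaction task has already committed its final base dir.
        Path finalBase = Files.createDirectories(root.resolve("base_0000007"));
        // Second task commits its tmp output to the same final location.
        Path tmpBase = Files.createDirectories(
            root.resolve("_tmp_task2").resolve("base_0000007"));
        hdfsStyleRename(tmpBase, finalBase);
        // The second rename nested one base dir inside the other.
        System.out.println(Files.isDirectory(finalBase.resolve("base_0000007")));
    }
}
```

This is why deleting the existing final location before the rename only narrows the window: another task can still recreate it between the delete and the rename.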

> Concurrent compactions for the same partition may generate malformed folder 
> structure
> -
>
> Key: HIVE-15202
> URL: https://issues.apache.org/jira/browse/HIVE-15202
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>
> If two compactions run concurrently on a single partition, it may generate 
> folder structure like this: (nested base dir)
> {noformat}
> drwxr-xr-x   - root supergroup  0 2016-11-14 22:23 
> /user/hive/warehouse/test/z=1/base_007/base_007
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_0
> -rw-r--r--   3 root supergroup611 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_1
> -rw-r--r--   3 root supergroup614 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_2
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_3
> -rw-r--r--   3 root supergroup621 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_4
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_5
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_6
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_7
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_8
> -rw-r--r--   3 root supergroup201 2016-11-14 21:46 
> /user/hive/warehouse/test/z=1/base_007/bucket_9
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15203) Hive export command does export to non HDFS file system

2016-11-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666316#comment-15666316
 ] 

Thejas M Nair commented on HIVE-15203:
--

Thanks to [~cnauroth] for the analysis of the issue:

It looks like export runs some logic that hard-codes the URI scheme to "hdfs" 
but then pulls the authority (host) from the default file system:

https://github.com/apache/hive/blob/rel/release-2.1.0/ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java#L68
https://github.com/apache/hive/blob/rel/release-2.1.0/ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java#L94-L99
{code}
      // set correct scheme and authority
      if (StringUtils.isEmpty(scheme)) {
        if (testMode) {
          scheme = "pfile";
        } else {
          scheme = "hdfs";
        }
      }

      // if scheme is specified but not authority then use the default
      // authority
      if (StringUtils.isEmpty(authority)) {
        URI defaultURI = FileSystem.get(conf).getUri();
        authority = defaultURI.getAuthority();
      }
{code}

I think that's our problem.  This logic is definitely wrong for the case of 
exporting to any file system that is not HDFS.
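A minimal sketch of the direction a fix could take (an assumption, not the committed Hive patch; {{ExportUriResolver}} and {{resolveExportUri}} are hypothetical names): when the export path omits a scheme, borrow both scheme and authority from the default file system's URI instead of hard-coding "hdfs".

```java
import java.net.URI;

public class ExportUriResolver {
    // Hypothetical fix sketch: fall back to the default file system's
    // scheme and authority rather than pairing a hard-coded "hdfs" scheme
    // with a foreign (non-HDFS) authority.
    static URI resolveExportUri(URI exportPath, URI defaultFsUri) {
        String scheme = exportPath.getScheme();
        String authority = exportPath.getAuthority();
        if (scheme == null) {
            scheme = defaultFsUri.getScheme();       // e.g. "s3a", not "hdfs"
        }
        if (authority == null) {
            authority = defaultFsUri.getAuthority(); // e.g. the bucket name
        }
        return URI.create(scheme + "://"
            + (authority == null ? "" : authority) + exportPath.getPath());
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("s3a://bucket");
        // A scheme-less export path now resolves against the default FS.
        System.out.println(resolveExportUri(URI.create("/datastore"), defaultFs));
    }
}
```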

> Hive export command does export to non HDFS file system
> ---
>
> Key: HIVE-15203
> URL: https://issues.apache.org/jira/browse/HIVE-15203
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Thejas M Nair
>
> Hive export command does export to non HDFS file system.
> If a non-HDFS filesystem is the default file system, then the export command 
> tries to use the hdfs scheme against the URL of the default file system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15203) Hive export command does export to non HDFS file system

2016-11-14 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666318#comment-15666318
 ] 

Thejas M Nair commented on HIVE-15203:
--

I am not working on this at the moment; feel free to take it on.


> Hive export command does export to non HDFS file system
> ---
>
> Key: HIVE-15203
> URL: https://issues.apache.org/jira/browse/HIVE-15203
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Reporter: Thejas M Nair
>
> Hive export command does export to non HDFS file system.
> If a non-HDFS filesystem is the default file system, then the export command 
> tries to use the hdfs scheme against the URL of the default file system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13931) Add support for HikariCP and replace BoneCP usage with HikariCP

2016-11-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13931:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the reviews [~sushanth]!

> Add support for HikariCP and replace BoneCP usage with HikariCP
> ---
>
> Key: HIVE-13931
> URL: https://issues.apache.org/jira/browse/HIVE-13931
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sushanth Sowmyan
>Assignee: Prasanth Jayachandran
> Fix For: 2.2.0
>
> Attachments: HIVE-13931.2.patch, HIVE-13931.3.patch, 
> HIVE-13931.4.patch, HIVE-13931.patch
>
>
> Currently, we use BoneCP as our primary connection pooling mechanism 
> (overridable by users). However, BoneCP is no longer being actively 
> developed, and is considered deprecated, replaced by HikariCP.
> Thus, we should add support for HikariCP, and try to replace our primary 
> usage of BoneCP with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13931) Add support for HikariCP and replace BoneCP usage with HikariCP

2016-11-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666264#comment-15666264
 ] 

Prasanth Jayachandran commented on HIVE-13931:
--

The test failures are flaky and are tracked in HIVE-15115, HIVE-15201, and HIVE-15116.


> Add support for HikariCP and replace BoneCP usage with HikariCP
> ---
>
> Key: HIVE-13931
> URL: https://issues.apache.org/jira/browse/HIVE-13931
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sushanth Sowmyan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13931.2.patch, HIVE-13931.3.patch, 
> HIVE-13931.4.patch, HIVE-13931.patch
>
>
> Currently, we use BoneCP as our primary connection pooling mechanism 
> (overridable by users). However, BoneCP is no longer being actively 
> developed, and is considered deprecated, replaced by HikariCP.
> Thus, we should add support for HikariCP, and try to replace our primary 
> usage of BoneCP with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14990) run all tests for MM tables and fix the issues that are found

2016-11-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666247#comment-15666247
 ] 

Hive QA commented on HIVE-14990:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12838913/HIVE-14990.09.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2123/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2123/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2123/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-11-15 06:19:39.132
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-2123/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-11-15 06:19:39.135
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 6536e30 HIVE-15069: Optimize MetaStoreDirectSql:: 
aggrColStatsForPartitions during query compilation (Rajesh Balamohan reviewed 
by Sergey Shelukhin)
+ git clean -f -d
Removing common/src/java/org/apache/hadoop/hive/conf/HiveConf.java.orig
Removing itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java.orig
Removing 
metastore/src/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java.orig
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 6536e30 HIVE-15069: Optimize MetaStoreDirectSql:: 
aggrColStatsForPartitions during query compilation (Rajesh Balamohan reviewed 
by Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-11-15 06:19:40.122
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp:1240
error: metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp: patch does not 
apply
error: patch failed: 
metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp:17735
error: metastore/src/gen/thrift/gen-cpp/hive_metastore_types.cpp: patch does 
not apply
error: patch failed: metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h:384
error: metastore/src/gen/thrift/gen-cpp/hive_metastore_types.h: patch does not 
apply
error: patch failed: 
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java:28995
error: 
metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java:
 patch does not apply
error: patch failed: 
metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php:10856
error: metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php: 
patch does not apply
error: patch failed: 
metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py:11367
error: metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py: 
patch does not apply
error: patch failed: 
metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:51
error: metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java: 
patch does not apply
error: patch failed: 
metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java:28
error: 
metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java: 
patch does not apply
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java:318
error: ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java: patch 
does not apply
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/parse/ExportSemanticAnalyzer.java:171
error: 

[jira] [Commented] (HIVE-13931) Add support for HikariCP and replace BoneCP usage with HikariCP

2016-11-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666239#comment-15666239
 ] 

Hive QA commented on HIVE-13931:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12838623/HIVE-13931.4.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10694 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[transform_ppr2] 
(batchId=133)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2122/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2122/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2122/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12838623 - PreCommit-HIVE-Build

> Add support for HikariCP and replace BoneCP usage with HikariCP
> ---
>
> Key: HIVE-13931
> URL: https://issues.apache.org/jira/browse/HIVE-13931
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sushanth Sowmyan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13931.2.patch, HIVE-13931.3.patch, 
> HIVE-13931.4.patch, HIVE-13931.patch
>
>
> Currently, we use BoneCP as our primary connection pooling mechanism 
> (overridable by users). However, BoneCP is no longer being actively 
> developed, and is considered deprecated, replaced by HikariCP.
> Thus, we should add support for HikariCP, and try to replace our primary 
> usage of BoneCP with it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-15185) Extend JSONMessageFactory to store additional information about Partition metadata objects on different partition events

2016-11-14 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta resolved HIVE-15185.
-
   Resolution: Duplicate
Fix Version/s: 2.2.0

Will be uploading a consolidated patch on HIVE-15180.

> Extend JSONMessageFactory to store additional information about Partition 
> metadata objects on different partition events
> 
>
> Key: HIVE-15185
> URL: https://issues.apache.org/jira/browse/HIVE-15185
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Fix For: 2.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15180) Extend JSONMessageFactory to store additional information about metadata objects on different table events

2016-11-14 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-15180:

Summary: Extend JSONMessageFactory to store additional information about 
metadata objects on different table events  (was: Extend JSONMessageFactory to 
store additional information about Table metadata objects on different table 
events)

> Extend JSONMessageFactory to store additional information about metadata 
> objects on different table events
> --
>
> Key: HIVE-15180
> URL: https://issues.apache.org/jira/browse/HIVE-15180
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-15180.1.patch
>
>
> We want the {{NOTIFICATION_LOG}} table to capture additional information 
> about the metadata objects when {{DbNotificationListener}} captures different 
> events for a table (create/drop/alter).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic

2016-11-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666127#comment-15666127
 ] 

Hive QA commented on HIVE-13557:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12838875/HIVE-13557.2.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10695 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=105)

[auto_join30.q,timestamp_null.q,date_udf.q,join16.q,groupby_ppr.q,bucketmapjoin7.q,smb_mapjoin_18.q,join19.q,vector_varchar_4.q,union6.q,cbo_subq_in.q,vectorization_part.q,sample8.q,vectorized_timestamp_funcs.q,join_star.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_functions] 
(batchId=66)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] 
(batchId=91)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[interval_2] 
(batchId=83)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[interval_3] 
(batchId=83)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2121/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2121/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2121/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12838875 - PreCommit-HIVE-Build

> Make interval keyword optional while specifying DAY in interval arithmetic
> --
>
> Key: HIVE-13557
> URL: https://issues.apache.org/jira/browse/HIVE-13557
> Project: Hive
>  Issue Type: Sub-task
>  Components: Types
>Reporter: Ashutosh Chauhan
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13557.1.patch, HIVE-13557.1.patch, 
> HIVE-13557.1.patch, HIVE-13557.2.patch
>
>
> Currently we support expressions like: {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31'))  - INTERVAL '30' DAY) AND 
> DATE('2000-01-31')
> {code}
> We should support:
> {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND 
> DATE('2000-01-31')
> {code}
>   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15069) Optimize MetaStoreDirectSql:: aggrColStatsForPartitions during query compilation

2016-11-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-15069:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Patch committed to master. Thanks for the patch [~rajesh.balamohan]!

> Optimize MetaStoreDirectSql:: aggrColStatsForPartitions during query 
> compilation
> 
>
> Key: HIVE-15069
> URL: https://issues.apache.org/jira/browse/HIVE-15069
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-15069.1.patch, HIVE-15069.2.patch, 
> HIVE-15069.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15148) disallow loading data into bucketed tables (by default)

2016-11-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15666011#comment-15666011
 ] 

Hive QA commented on HIVE-15148:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12838864/HIVE-15148.02.patch

{color:green}SUCCESS:{color} +1 due to 100 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10664 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=121)

[groupby_complex_types.q,multigroupby_singlemr.q,mapjoin_decimal.q,groupby7.q,join5.q,bucketmapjoin_negative2.q,vectorization_div0.q,union_script.q,add_part_multiple.q,limit_pushdown.q,union_remove_17.q,uniquejoin.q,metadata_only_queries_with_filters.q,union25.q,load_dyn_part13.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=127)

[groupby6_map.q,groupby2_noskew_multi_distinct.q,load_dyn_part12.q,scriptfile1.q,join15.q,auto_join17.q,join_hive_626.q,tez_join_tests.q,auto_join21.q,join_view.q,join_cond_pushdown_4.q,vectorization_0.q,union_null.q,auto_join3.q,vectorization_decimal_date.q]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2120/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2120/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2120/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12838864 - PreCommit-HIVE-Build

> disallow loading data into bucketed tables (by default)
> ---
>
> Key: HIVE-15148
> URL: https://issues.apache.org/jira/browse/HIVE-15148
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15148.01.patch, HIVE-15148.02.patch, 
> HIVE-15148.patch
>
>
> A few q file tests still use the following, allowed, pattern:
> {noformat}
> CREATE TABLE bucket_small (key string, value string) partitioned by (ds 
> string) CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> load data local inpath '../../data/files/smallsrcsortbucket1outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> load data local inpath '../../data/files/smallsrcsortbucket2outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> {noformat}
> This relies on the user to load the correct number of files with correctly 
> hashed data and the correct order of file names; if there's some discrepancy 
> in any of the above, the queries will fail or may produce incorrect results 
> if some bucket-based optimizations kick in.
> Additionally, even if the user does everything correctly, as far as I know 
> some code derives bucket number from file name, which won't work in this case 
> (as opposed to getting buckets based on the order of files, which will work 
> here but won't work as per  HIVE-14970... sigh).
> Hive enforces bucketing in other cases (the check cannot even be disabled 
> these days), so I suggest that we either prohibit the above outright, or at 
> least add a safety config setting that would disallow it by default.
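
The enforced alternative the description alludes to is routing the data through Hive itself, so Hive computes the bucket hashing and file layout (a sketch; the staging-table name is hypothetical):

```sql
-- Stage the raw files in a plain (non-bucketed) table first.
CREATE TABLE bucket_small_stage (key string, value string);
LOAD DATA LOCAL INPATH '../../data/files/smallsrcsortbucket1outof4.txt'
  INTO TABLE bucket_small_stage;

-- Then let Hive do the bucketing on insert, instead of trusting
-- pre-hashed files loaded directly into the bucketed table.
INSERT INTO TABLE bucket_small PARTITION (ds='2008-04-08')
  SELECT key, value FROM bucket_small_stage;
```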



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15197) count and sum query on empty table, returning empty output

2016-11-14 Thread vishal.rajan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vishal.rajan updated HIVE-15197:

Affects Version/s: 2.0.0
   2.0.1

> count and sum query on empty table, returning empty output 
> ---
>
> Key: HIVE-15197
> URL: https://issues.apache.org/jira/browse/HIVE-15197
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.0.0, 2.1.0, 2.0.1
>Reporter: vishal.rajan
>
> When the query below is run in Hive 1.2.0, it returns 'NULL NULL 0' for an 
> empty table, but when the same query is run on Hive 2.1.0, nothing is 
> returned for an empty table.
> hive 1.2.0 -
> hive>  SELECT sum(destination_pincode),sum(length(source_city)),count(*)  
> from test_stage.geo_zone;
> MapReduce Jobs Launched: 
> Stage-Stage-1: Map: 1   Cumulative CPU: 4.79 sec   HDFS Read: 7354 HDFS 
> Write: 114 SUCCESS
> Total MapReduce CPU Time Spent: 4 seconds 790 msec
> OK
> NULL   NULL   0
> Time taken: 38.168 seconds, Fetched: 1 row(s)
> -hive 2.1.0-
> hive> SELECT sum(destination_pincode),sum(length(source_city)),count(*)  from 
> test_stage.geo_zone
> WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the 
> future versions. Consider using a different execution engine (i.e. spark, 
> tez) or using Hive 1.X releases.
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> 2016-11-14 19:06:15,421 WARN  [Thread-215] mapreduce.JobResourceUploader 
> (JobResourceUploader.java:uploadFiles(64)) - Hadoop command-line option 
> parsing not performed. Implement the Tool interface and execute your 
> application with ToolRunner to remedy this.
> 2016-11-14 19:06:19,222 INFO  [Thread-215] input.FileInputFormat 
> (FileInputFormat.java:listStatus(283)) - Total input paths to process : 1
> 2016-11-14 19:06:20,000 INFO  [Thread-215] mapreduce.JobSubmitter 
> (JobSubmitter.java:submitJobInternal(198)) - number of splits:0
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
> 2016-11-14 19:06:39,405 Stage-1 map = 0%,  reduce = 0%
> Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> OK
> Time taken: 28.302 seconds
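For context, standard SQL semantics say an ungrouped aggregate over an empty table still returns exactly one row: SUM of no values is NULL and COUNT(*) is 0, which matches the 1.2.0 output. A minimal sketch of those semantics (plain Python, not Hive code; names are illustrative):

```python
def sql_sum(values):
    """SUM per SQL semantics: NULLs are skipped; empty or all-NULL input yields NULL (None)."""
    vals = [v for v in values if v is not None]
    return sum(vals) if vals else None

rows = []  # an empty table: (destination_pincode, source_city) tuples

# One result row is still expected, even with zero input rows: (NULL, NULL, 0)
result = (sql_sum(r[0] for r in rows),
          sql_sum(len(r[1]) for r in rows if r[1] is not None),
          len(rows))
print(result)  # (None, None, 0)
```

Returning no row at all, as 2.1.0 does here, is the deviation being reported.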





[jira] [Commented] (HIVE-15197) count and sum query on empty table, returning empty output

2016-11-14 Thread vishal.rajan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665953#comment-15665953
 ] 

vishal.rajan commented on HIVE-15197:
-

[~ashutoshc] We need this issue to be fixed ASAP.

> count and sum query on empty table, returning empty output 
> ---
>
> Key: HIVE-15197
> URL: https://issues.apache.org/jira/browse/HIVE-15197
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: vishal.rajan
>
> When the query below is run in Hive 1.2.0, it returns 'NULL  NULL  0' on an 
> empty table, but when the same query is run on Hive 2.1.0, nothing is 
> returned for an empty table.
> hive 1.2.0 -
> hive>  SELECT sum(destination_pincode),sum(length(source_city)),count(*)  
> from test_stage.geo_zone;
> MapReduce Jobs Launched: 
> Stage-Stage-1: Map: 1   Cumulative CPU: 4.79 sec   HDFS Read: 7354 HDFS 
> Write: 114 SUCCESS
> Total MapReduce CPU Time Spent: 4 seconds 790 msec
> OK
> NULL   NULL   0
> Time taken: 38.168 seconds, Fetched: 1 row(s)
> -hive 2.1.0-
> hive> SELECT sum(destination_pincode),sum(length(source_city)),count(*)  from 
> test_stage.geo_zone
> WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the 
> future versions. Consider using a different execution engine (i.e. spark, 
> tez) or using Hive 1.X releases.
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> 2016-11-14 19:06:15,421 WARN  [Thread-215] mapreduce.JobResourceUploader 
> (JobResourceUploader.java:uploadFiles(64)) - Hadoop command-line option 
> parsing not performed. Implement the Tool interface and execute your 
> application with ToolRunner to remedy this.
> 2016-11-14 19:06:19,222 INFO  [Thread-215] input.FileInputFormat 
> (FileInputFormat.java:listStatus(283)) - Total input paths to process : 1
> 2016-11-14 19:06:20,000 INFO  [Thread-215] mapreduce.JobSubmitter 
> (JobSubmitter.java:submitJobInternal(198)) - number of splits:0
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
> 2016-11-14 19:06:39,405 Stage-1 map = 0%,  reduce = 0%
> Stage-Stage-1:  HDFS Read: 0 HDFS Write: 0 SUCCESS
> Total MapReduce CPU Time Spent: 0 msec
> OK
> Time taken: 28.302 seconds





[jira] [Updated] (HIVE-14990) run all tests for MM tables and fix the issues that are found

2016-11-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14990:

Attachment: HIVE-14990.09.patch

A few more fixes

> run all tests for MM tables and fix the issues that are found
> -
>
> Key: HIVE-14990
> URL: https://issues.apache.org/jira/browse/HIVE-14990
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14990.01.patch, HIVE-14990.02.patch, 
> HIVE-14990.03.patch, HIVE-14990.04.patch, HIVE-14990.04.patch, 
> HIVE-14990.05.patch, HIVE-14990.05.patch, HIVE-14990.06.patch, 
> HIVE-14990.06.patch, HIVE-14990.07.patch, HIVE-14990.08.patch, 
> HIVE-14990.09.patch, HIVE-14990.patch
>
>
> Expected failures 
> 1) All HCat tests (cannot write MM tables via the HCat writer)
> 2) Almost all merge tests (alter .. concat is not supported).
> 3) Tests that run dfs commands with specific paths (path changes).
> 4) Truncate column (not supported).
> 5) Describe formatted will have the new table fields in the output (before 
> merging MM with ACID).
> 6) Many tests w/explain extended - diff in partition "base file name" (path 
> changes).
> 7) TestTxnCommands - all the conversion tests, as they check for bucket count 
> using file lists (path changes).
> 8) HBase metastore tests, because the methods are not implemented.
> 9) Some load and ExIm tests that export a table and then rely on specific 
> path for load (path changes).
> 10) Bucket map join/etc. - diffs; disabled the optimization for MM tables due 
> to how it accounts for buckets





[jira] [Comment Edited] (HIVE-14990) run all tests for MM tables and fix the issues that are found

2016-11-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15648980#comment-15648980
 ] 

Sergey Shelukhin edited comment on HIVE-14990 at 11/15/16 3:28 AM:
---

Looked at all the remaining tests. Out of 749 failed tests, about 100 failures 
and diffs are (or at least might be) relevant.
Many of them are similar, e.g. missing stats, but I don't know if they are 
missing stats for the same reason.
Many, e.g. the exim ones, may be due to unsupported or path-dependent 
scenarios that were not immediately obvious.

Not sure why TestSparkCliDriver fails. It fails in client init for me with no 
useful logs (it logs that the child process exited with 127, then times out).
I think we'll fix that during branch merge, if still broken.

Crossing out ones that are actually irrelevant
{panel}
TestCliDriver:
authorization_insert
create_default_prop
exim_04_evolved_parts
-exim_11_managed_external-
-exim_12_external_location-
-exim_15_external_part-
-exim_18_part_external-
-exim_19_00_part_external_location-
-exim_19_part_external_location-
insert1
-list_bucket_dml_8-
-mm_all-
orc_createas1
ppd_join4
stats_empty_dyn_part
stats_partscan_1_23
temp_table_display_colstats_tbllvl
-temp_table_options1-
vector_udf2
list_bucket_dml_14,list_bucket_*
llap_acid
insert_overwrite_directory2
authorization_load
autoColumnStats_9
create_like
drop_database_removes_partition_dirs
drop_table_removes_partition_dirs
index_auto_update
exim_01_nonpart,exim_02_part,-exim_04_all_part,exim_05_some_part-,exim_06_one_part,exim_20_part_managed_location
exim_16_part_external,exim_17_part_managed
load_overwrite
materialized_view_authorization_sqlstd,materialized_*
merge_dynamic_partition, merge_dynamic_partition*
orc_int_type_promotion
orc_vectorization_ppd
parquet_join2
-partition_wise_fileformat,partition_wise_fileformat3-
-repl_1_drop-,repl_3_exim_metadata 
sample6
sample_islocalmode_hook
show_tablestatus
smb_bucket_1
smb_mapjoin_2,smb_mapjoin_3,smb_mapjoin_7
stats_list_bucket
stats_noscan_2
symlink_text_input_format
temp_table_precedence
offset_limit_global_optimizer
rand_partitionpruner2

TestEncryptedHDFSCliDriver:
encryption_ctas
encryption_drop_partition
encryption_insert_values
encryption_join_unencrypted_tbl
encryption_load_data_to_encrypted_tables

MiniLlapLocal:
exchgpartition2lel
cbo_rp_lineage2
create_merge_compressed
deleteAnalyze
delete_where_no_match
delete_where_non_partitioned
dynpart_sort_optimization
escape2
insert1
lineage2
lineage3
orc_llap
schema_evol_orc_nonvec_part
schema_evol_orc_vec_part
schema_evol_text_nonvec_part
schema_evol_text_vec_part
schema_evol_text_vecrow_part
smb_mapjoin_6
tez_dml
union_fast_stats
update_all_types
update_tmp_table
update_where_no_match
update_where_non_partitioned
vector_outer_join1
vector_outer_join4

MiniLlap:
load_fs2
orc_ppd_basic
external_table_with_space_in_location_path
file_with_header_footer
import_exported_table
schemeAuthority,schemeAuthority2
table_nonprintable

Minimr:
infer_bucket_sort_map_operators
infer_bucket_sort_merge
infer_bucket_sort_reducers_power_two
root_dir_external_table
scriptfile1

TestSymlinkTextInputFormat#testCombine 
TestJdbcWithLocalClusterSpark, etc.
{panel}



was (Author: sershe):
Looked at all the remaining tests. Out of 749 failed tests, about 100 failures 
and diffs are (or at least might be) relevant.
Many of them are similar, e.g. missing stats, but I don't know if they are 
missing stats for the same reason.
Many, e.g. the exim ones, may be due to unsupported or path-dependent 
scenarios that were not immediately obvious.

Not sure why TestSparkCliDriver fails. It fails in client init for me with no 
useful logs (it logs that the child process exited with 127, then times out).
I think we'll fix that during branch merge, if still broken.

Crossing out ones that are actually irrelevant
{panel}
TestCliDriver:
authorization_insert
create_default_prop
exim_04_evolved_parts
-exim_11_managed_external-
-exim_12_external_location-
-exim_15_external_part-
-exim_18_part_external-
-exim_19_00_part_external_location-
-exim_19_part_external_location-
insert1
-list_bucket_dml_8-
-mm_all-
orc_createas1
ppd_join4
stats_empty_dyn_part
stats_partscan_1_23
temp_table_display_colstats_tbllvl
-temp_table_options1-
vector_udf2
list_bucket_dml_14,list_bucket_*
llap_acid
insert_overwrite_directory2
authorization_load
autoColumnStats_9
create_like
drop_database_removes_partition_dirs
drop_table_removes_partition_dirs
index_auto_update
exim_01_nonpart,exim_02_part,-exim_04_all_part,exim_05_some_part-,exim_06_one_part,exim_20_part_managed_location
exim_16_part_external,exim_17_part_managed
load_overwrite
materialized_view_authorization_sqlstd,materialized_*
merge_dynamic_partition, merge_dynamic_partition*
orc_int_type_promotion
orc_vectorization_ppd
parquet_join2
partition_wise_fileformat,partition_wise_fileformat3
repl_1_drop,repl_3_exim_metadata 
sample6
sample_islocalmode_hook
show_tablestatus

[jira] [Comment Edited] (HIVE-14804) HPLSQL multiple db connection does not switch back to Hive

2016-11-14 Thread Fei Hui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665841#comment-15665841
 ] 

Fei Hui edited comment on HIVE-14804 at 11/15/16 3:04 AM:
--

hi [~dkozlovdaisy], HIVE-13540's commit contains the change below:

hplsql/src/main/java/org/apache/hive/hplsql/Select.java 
hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java

change exec.conf.defaultConnection to conn;

diff --git hplsql/src/main/java/org/apache/hive/hplsql/Select.java 
hplsql/src/main/java/org/apache/hive/hplsql/Select.java
index 589e984..403810c 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Select.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Select.java
@@ -147,10 +147,10 @@ else if (ctx.parent instanceof HplsqlParser.StmtContext) {
 }
..
-  exec.closeQuery(query, exec.conf.defaultConnection);
+  exec.closeQuery(query, conn);

   return 1;
 }

-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);

 return 0;
   }



diff --git hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java 
hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
index 17d2195..c044616 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
..
@@ -793,7 +803,7 @@ else if (type == Conn.Type.HIVE && conf.insertValues == 
Conf.InsertValues.SELECT
   return 1;
 }
 exec.setSqlSuccess();
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0;
   }

you can patch it


was (Author: ferhui):
hi [~dkozlovdaisy], HIVE-13540's commit contains the change below:

hplsql/src/main/java/org/apache/hive/hplsql/Select.java 
hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java

change exec.conf.defaultConnection to conn;

diff --git hplsql/src/main/java/org/apache/hive/hplsql/Select.java 
hplsql/src/main/java/org/apache/hive/hplsql/Select.java
index 589e984..403810c 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Select.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Select.java
@@ -147,10 +147,10 @@ else if (ctx.parent instanceof HplsqlParser.StmtContext) {
 }
 catch (SQLException e) {
   exec.signal(query);

-  exec.closeQuery(query, exec.conf.defaultConnection);
+  exec.closeQuery(query, conn);

   return 1;
 }

-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);

 return 0;
   }



diff --git hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java 
hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
index 17d2195..c044616 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
...
@@ -793,7 +803,7 @@ else if (type == Conn.Type.HIVE && conf.insertValues == 
Conf.InsertValues.SELECT
   return 1;
 }
 exec.setSqlSuccess();
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0;
   }

you can patch it

> HPLSQL multiple db connection does not switch back to Hive
> --
>
> Key: HIVE-14804
> URL: https://issues.apache.org/jira/browse/HIVE-14804
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Dmitry Kozlov
>Assignee: Dmitry Tolpeko
>Priority: Blocker
>
> I have a problem with a multi-database connection. I have 3 environments that 
> I would like to connect to in my HPLSQL code: Hive, DB2 and MySql. As soon as 
> I map any table from either DB2 or MySQL, my code stops recognizing Hive 
> tables. Actually it starts to think every table belongs to the database 
> (DB2 or MySql) that was mapped last. It means your example 
> http://www.hplsql.org/map-object works only one way, from Hive to MySQL, and 
> it is not possible to go back to Hive.
> Here is a simple piece of code.
> declare cnt int;
> begin
> /*
> PRINT 'Start MySQL';
> MAP OBJECT tbls TO hive.TBLS AT mysqlconn;
> select count(*)
> into cnt
> from tbls;
> PRINT cnt;
> PRINT 'Start Db2';
> MAP OBJECT exch TO DBDEV2.TEST_EXCHANGE AT db2conn;
> select count(1) 
> into cnt
> from exch;
> PRINT cnt;*/
> PRINT 'Check Hive';
> SELECT count(1) 
> into cnt
> FROM dev.test_sqoop;
> PRINT cnt;
> end;
> It has three blocks. One select from MySQL, second from DB2 and third from 
> Hive ORC table.
> When the first two blocks are commented out, block 3 works. See below:
> Check Hive
> 16/09/20 18:08:08 INFO jdbc.Utils: Supplied authorities: localhost:1
> 16/09/20 18:08:08 INFO jdbc.Utils: Resolved authority: localhost:1
> 16/09/20 18:08:08 INFO jdbc.HiveConnection: Will try to open client transport 
> with JDBC Uri: jdbc:hive2://localhost:1
> Open connection: jdbc:hive2://localhost:1 (497 ms)
> Starting query
> 

[jira] [Commented] (HIVE-13931) Add support for HikariCP and replace BoneCP usage with HikariCP

2016-11-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665857#comment-15665857
 ] 

Hive QA commented on HIVE-13931:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12838623/HIVE-13931.4.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10679 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=112)

[escape_distributeby1.q,join9.q,groupby2.q,groupby4_map.q,udf_max.q,vectorization_pushdown.q,cbo_gby_empty.q,join_cond_pushdown_unqual3.q,vectorization_short_regress.q,join8.q,stats5.q,sample10.q,cross_product_check_1.q,auto_join_stats.q,input_part2.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=43)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] 
(batchId=90)
org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testTaskStatus 
(batchId=207)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2119/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2119/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2119/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12838623 - PreCommit-HIVE-Build

> Add support for HikariCP and replace BoneCP usage with HikariCP
> ---
>
> Key: HIVE-13931
> URL: https://issues.apache.org/jira/browse/HIVE-13931
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sushanth Sowmyan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13931.2.patch, HIVE-13931.3.patch, 
> HIVE-13931.4.patch, HIVE-13931.patch
>
>
> Currently, we use BoneCP as our primary connection pooling mechanism 
> (overridable by users). However, BoneCP is no longer being actively 
> developed, and is considered deprecated, replaced by HikariCP.
> Thus, we should add support for HikariCP, and try to replace our primary 
> usage of BoneCP with it.





[jira] [Comment Edited] (HIVE-14804) HPLSQL multiple db connection does not switch back to Hive

2016-11-14 Thread Fei Hui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665841#comment-15665841
 ] 

Fei Hui edited comment on HIVE-14804 at 11/15/16 2:59 AM:
--

hi [~dkozlovdaisy], HIVE-13540's commit contains the change below:

hplsql/src/main/java/org/apache/hive/hplsql/Select.java 
hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java

change exec.conf.defaultConnection to conn;

diff --git hplsql/src/main/java/org/apache/hive/hplsql/Select.java 
hplsql/src/main/java/org/apache/hive/hplsql/Select.java
index 589e984..403810c 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Select.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Select.java
@@ -147,10 +147,10 @@ else if (ctx.parent instanceof HplsqlParser.StmtContext) {
 }
 catch (SQLException e) {
   exec.signal(query);

-  exec.closeQuery(query, exec.conf.defaultConnection);
+  exec.closeQuery(query, conn);

   return 1;
 }

-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);

 return 0;
   }



diff --git hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java 
hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
index 17d2195..c044616 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
...
@@ -793,7 +803,7 @@ else if (type == Conn.Type.HIVE && conf.insertValues == 
Conf.InsertValues.SELECT
   return 1;
 }
 exec.setSqlSuccess();
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0;
   }

you can patch it


was (Author: ferhui):
hi [~dkozlovdaisy], HIVE-13540's commit contains the change below:

diff --git hplsql/src/main/java/org/apache/hive/hplsql/Select.java 
hplsql/src/main/java/org/apache/hive/hplsql/Select.java
index 589e984..403810c 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Select.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Select.java
@@ -147,10 +147,10 @@ else if (ctx.parent instanceof HplsqlParser.StmtContext) {
 }
 catch (SQLException e) {
   exec.signal(query);
-  exec.closeQuery(query, exec.conf.defaultConnection);
+  exec.closeQuery(query, conn);
   return 1;
 }
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0;
   }



diff --git hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java 
hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
index 17d2195..c044616 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
...
@@ -793,7 +803,7 @@ else if (type == Conn.Type.HIVE && conf.insertValues == 
Conf.InsertValues.SELECT
   return 1;
 }
 exec.setSqlSuccess();
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0;
   }

you can patch it

> HPLSQL multiple db connection does not switch back to Hive
> --
>
> Key: HIVE-14804
> URL: https://issues.apache.org/jira/browse/HIVE-14804
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Dmitry Kozlov
>Assignee: Dmitry Tolpeko
>Priority: Blocker
>
> I have a problem with a multi-database connection. I have 3 environments that 
> I would like to connect to in my HPLSQL code: Hive, DB2 and MySql. As soon as 
> I map any table from either DB2 or MySQL, my code stops recognizing Hive 
> tables. Actually it starts to think every table belongs to the database 
> (DB2 or MySql) that was mapped last. It means your example 
> http://www.hplsql.org/map-object works only one way, from Hive to MySQL, and 
> it is not possible to go back to Hive.
> Here is a simple piece of code.
> declare cnt int;
> begin
> /*
> PRINT 'Start MySQL';
> MAP OBJECT tbls TO hive.TBLS AT mysqlconn;
> select count(*)
> into cnt
> from tbls;
> PRINT cnt;
> PRINT 'Start Db2';
> MAP OBJECT exch TO DBDEV2.TEST_EXCHANGE AT db2conn;
> select count(1) 
> into cnt
> from exch;
> PRINT cnt;*/
> PRINT 'Check Hive';
> SELECT count(1) 
> into cnt
> FROM dev.test_sqoop;
> PRINT cnt;
> end;
> It has three blocks. One select from MySQL, second from DB2 and third from 
> Hive ORC table.
> When the first two blocks are commented out, block 3 works. See below:
> Check Hive
> 16/09/20 18:08:08 INFO jdbc.Utils: Supplied authorities: localhost:1
> 16/09/20 18:08:08 INFO jdbc.Utils: Resolved authority: localhost:1
> 16/09/20 18:08:08 INFO jdbc.HiveConnection: Will try to open client transport 
> with JDBC Uri: jdbc:hive2://localhost:1
> Open connection: jdbc:hive2://localhost:1 (497 ms)
> Starting query
> Query executed successfully (177 ms)
> 82
> When I try to uncomment any of those blocks then block 3 stops 

[jira] [Comment Edited] (HIVE-14804) HPLSQL multiple db connection does not switch back to Hive

2016-11-14 Thread Fei Hui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665841#comment-15665841
 ] 

Fei Hui edited comment on HIVE-14804 at 11/15/16 2:57 AM:
--

hi [~dkozlovdaisy], HIVE-13540's commit contains the change below:

diff --git hplsql/src/main/java/org/apache/hive/hplsql/Select.java 
hplsql/src/main/java/org/apache/hive/hplsql/Select.java
index 589e984..403810c 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Select.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Select.java
@@ -147,10 +147,10 @@ else if (ctx.parent instanceof HplsqlParser.StmtContext) {
 }
 catch (SQLException e) {
   exec.signal(query);
-  exec.closeQuery(query, exec.conf.defaultConnection);
+  exec.closeQuery(query, conn);
   return 1;
 }
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0;
   }



diff --git hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java 
hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
index 17d2195..c044616 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
...
@@ -793,7 +803,7 @@ else if (type == Conn.Type.HIVE && conf.insertValues == 
Conf.InsertValues.SELECT
   return 1;
 }
 exec.setSqlSuccess();
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0;
   }

you can patch it


was (Author: ferhui):
hi [~dkozlovdaisy], HIVE-13540's commit contains the change below:
diff --git hplsql/src/main/java/org/apache/hive/hplsql/Select.java 
hplsql/src/main/java/org/apache/hive/hplsql/Select.java
index 589e984..403810c 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Select.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Select.java
@@ -147,10 +147,10 @@ else if (ctx.parent instanceof HplsqlParser.StmtContext) {
 }
 catch (SQLException e) {
   exec.signal(query);
- exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
   return 1;
 }
-  exec.closeQuery(query, exec.conf.defaultConnection);
+  exec.closeQuery(query, conn);
 return 0; 
   }  



diff --git hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java 
hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
index 17d2195..c044616 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
...
@@ -793,7 +803,7 @@ else if (type == Conn.Type.HIVE && conf.insertValues == 
Conf.InsertValues.SELECT
   return 1;
 }
 exec.setSqlSuccess();
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0; 
   }

you can patch it

> HPLSQL multiple db connection does not switch back to Hive
> --
>
> Key: HIVE-14804
> URL: https://issues.apache.org/jira/browse/HIVE-14804
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Dmitry Kozlov
>Assignee: Dmitry Tolpeko
>Priority: Blocker
>
> I have a problem with a multi-database connection. I have 3 environments that 
> I would like to connect to in my HPLSQL code: Hive, DB2 and MySql. As soon as 
> I map any table from either DB2 or MySQL, my code stops recognizing Hive 
> tables. Actually it starts to think every table belongs to the database 
> (DB2 or MySql) that was mapped last. It means your example 
> http://www.hplsql.org/map-object works only one way, from Hive to MySQL, and 
> it is not possible to go back to Hive.
> Here is a simple piece of code.
> declare cnt int;
> begin
> /*
> PRINT 'Start MySQL';
> MAP OBJECT tbls TO hive.TBLS AT mysqlconn;
> select count(*)
> into cnt
> from tbls;
> PRINT cnt;
> PRINT 'Start Db2';
> MAP OBJECT exch TO DBDEV2.TEST_EXCHANGE AT db2conn;
> select count(1) 
> into cnt
> from exch;
> PRINT cnt;*/
> PRINT 'Check Hive';
> SELECT count(1) 
> into cnt
> FROM dev.test_sqoop;
> PRINT cnt;
> end;
> It has three blocks. One select from MySQL, second from DB2 and third from 
> Hive ORC table.
> When the first two blocks are commented out, block 3 works. See below:
> Check Hive
> 16/09/20 18:08:08 INFO jdbc.Utils: Supplied authorities: localhost:1
> 16/09/20 18:08:08 INFO jdbc.Utils: Resolved authority: localhost:1
> 16/09/20 18:08:08 INFO jdbc.HiveConnection: Will try to open client transport 
> with JDBC Uri: jdbc:hive2://localhost:1
> Open connection: jdbc:hive2://localhost:1 (497 ms)
> Starting query
> Query executed successfully (177 ms)
> 82
> When I try to uncomment any of those blocks, block 3 stops working. For 
> example, if I uncomment block 1 I get this output. It now assumes that 
> dev.test_sqoop is a MySQL table, contrary to your example:
> Start MySQL
> 

[jira] [Assigned] (HIVE-15073) Schematool should detect malformed URIs

2016-11-14 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen reassigned HIVE-15073:
---

Assignee: Yongzhi Chen

> Schematool should detect malformed URIs
> ---
>
> Key: HIVE-15073
> URL: https://issues.apache.org/jira/browse/HIVE-15073
> Project: Hive
>  Issue Type: Improvement
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>
> For various causes (mostly unknown), HMS DB tables sometimes have invalid 
> entries, for example a URI missing its scheme in the SDS table's LOCATION 
> column or DBS's DB_LOCATION_URI column. These malformed URIs lead to 
> hard-to-analyze errors in HIVE and SENTRY. Schematool needs to provide a 
> command to detect these malformed URIs, give a warning, and provide an 
> option to fix them.
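A sketch of the detection side (plain Python, not actual Schematool code; the table/column names are as above, the location values are hypothetical):

```python
from urllib.parse import urlparse

def is_malformed(location: str) -> bool:
    # A metastore location URI should carry a scheme such as hdfs:// or s3a://;
    # an empty scheme marks the entry as malformed.
    return not urlparse(location).scheme

# Values as they might appear in SDS.LOCATION / DBS.DB_LOCATION_URI
locations = ["hdfs://nn:8020/warehouse/t1", "/warehouse/t2"]
for loc in locations:
    if is_malformed(loc):
        print(f"WARNING: missing scheme: {loc}")
```

The fix option would then rewrite flagged entries by prepending the cluster's default filesystem scheme.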





[jira] [Comment Edited] (HIVE-14990) run all tests for MM tables and fix the issues that are found

2016-11-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15648980#comment-15648980
 ] 

Sergey Shelukhin edited comment on HIVE-14990 at 11/15/16 2:57 AM:
---

Looked at all the remaining tests. Out of 749 failed tests, about 100 failures 
and diffs are (or at least might be) relevant.
Many of them are similar, e.g. missing stats, but I don't know if they are 
missing stats for the same reason.
Many, e.g. the exim ones, may be due to unsupported or path-dependent 
scenarios that were not immediately obvious.

Not sure why TestSparkCliDriver fails. It fails in client init for me with no 
useful logs (it logs that the child process exited with 127, then times out).
I think we'll fix that during branch merge, if still broken.

Crossing out ones that are actually irrelevant
{panel}
TestCliDriver:
authorization_insert
create_default_prop
exim_04_evolved_parts
-exim_11_managed_external-
-exim_12_external_location-
-exim_15_external_part-
-exim_18_part_external-
-exim_19_00_part_external_location-
-exim_19_part_external_location-
insert1
-list_bucket_dml_8-
-mm_all-
orc_createas1
ppd_join4
stats_empty_dyn_part
stats_partscan_1_23
temp_table_display_colstats_tbllvl
-temp_table_options1-
vector_udf2
list_bucket_dml_14,list_bucket_*
llap_acid
insert_overwrite_directory2
authorization_load
autoColumnStats_9
create_like
drop_database_removes_partition_dirs
drop_table_removes_partition_dirs
index_auto_update
exim_01_nonpart,exim_02_part,-exim_04_all_part,exim_05_some_part-,exim_06_one_part,exim_20_part_managed_location
exim_16_part_external,exim_17_part_managed
load_overwrite
materialized_view_authorization_sqlstd,materialized_*
merge_dynamic_partition, merge_dynamic_partition*
orc_int_type_promotion
orc_vectorization_ppd
parquet_join2
partition_wise_fileformat,partition_wise_fileformat3
repl_1_drop,repl_3_exim_metadata 
sample6
sample_islocalmode_hook
show_tablestatus
smb_bucket_1
smb_mapjoin_2,smb_mapjoin_3,smb_mapjoin_7
stats_list_bucket
stats_noscan_2
symlink_text_input_format
temp_table_precedence
offset_limit_global_optimizer
rand_partitionpruner2

TestEncryptedHDFSCliDriver:
encryption_ctas
encryption_drop_partition
encryption_insert_values
encryption_join_unencrypted_tbl
encryption_load_data_to_encrypted_tables

MiniLlapLocal:
exchgpartition2lel
cbo_rp_lineage2
create_merge_compressed
deleteAnalyze
delete_where_no_match
delete_where_non_partitioned
dynpart_sort_optimization
escape2
insert1
lineage2
lineage3
orc_llap
schema_evol_orc_nonvec_part
schema_evol_orc_vec_part
schema_evol_text_nonvec_part
schema_evol_text_vec_part
schema_evol_text_vecrow_part
smb_mapjoin_6
tez_dml
union_fast_stats
update_all_types
update_tmp_table
update_where_no_match
update_where_non_partitioned
vector_outer_join1
vector_outer_join4

MiniLlap:
load_fs2
orc_ppd_basic
external_table_with_space_in_location_path
file_with_header_footer
import_exported_table
schemeAuthority,schemeAuthority2
table_nonprintable

Minimr:
infer_bucket_sort_map_operators
infer_bucket_sort_merge
infer_bucket_sort_reducers_power_two
root_dir_external_table
scriptfile1

TestSymlinkTextInputFormat#testCombine 
TestJdbcWithLocalClusterSpark, etc.
{panel}



was (Author: sershe):
Looked at all the remaining tests. Out of 749 failed tests, about 100 failures 
and diffs are (or at least might be) relevant.
Many of them are similar (e.g. missing stats), but I don't know if they are 
missing stats for the same reason. 
Many (e.g. exim) may be due to unsupported or path-dependent scenarios that 
were not immediately obvious.

Not sure why TestSparkCliDriver fails. For me it fails in client init with no 
useful logs (it logs that the child process exited with 127, then times out).
I think we'll fix that during the branch merge, if it is still broken.

Crossing out the ones that are actually irrelevant
{panel}
TestCliDriver:
authorization_insert
create_default_prop
exim_04_evolved_parts
-exim_11_managed_external-
-exim_12_external_location-
-exim_15_external_part-
-exim_18_part_external-
-exim_19_00_part_external_location-
-exim_19_part_external_location-
insert1
-list_bucket_dml_8-
-mm_all-
orc_createas1
ppd_join4
stats_empty_dyn_part
stats_partscan_1_23
temp_table_display_colstats_tbllvl
temp_table_options1
vector_udf2
list_bucket_dml_14,list_bucket_*
llap_acid
insert_overwrite_directory2
authorization_load
autoColumnStats_9
create_like
drop_database_removes_partition_dirs
drop_table_removes_partition_dirs
index_auto_update
exim_01_nonpart,exim_02_part,exim_04_all_part,exim_05_some_part,exim_06_one_part,exim_16_part_external,exim_17_part_managed,exim_20_part_managed_location
load_overwrite
materialized_view_authorization_sqlstd,materialized_*
merge_dynamic_partition, merge_dynamic_partition*
orc_int_type_promotion
orc_vectorization_ppd
parquet_join2
partition_wise_fileformat,partition_wise_fileformat3
repl_1_drop,repl_3_exim_metadata 
sample6
sample_islocalmode_hook
show_tablestatus

[jira] [Comment Edited] (HIVE-14804) HPLSQL multiple db connection does not switch back to Hive

2016-11-14 Thread Fei Hui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665841#comment-15665841
 ] 

Fei Hui edited comment on HIVE-14804 at 11/15/16 2:55 AM:
--

Hi [~dkozlovdaisy], the commit for HIVE-13540 contains the change below:
diff --git hplsql/src/main/java/org/apache/hive/hplsql/Select.java 
hplsql/src/main/java/org/apache/hive/hplsql/Select.java
index 589e984..403810c 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Select.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Select.java
@@ -147,10 +147,10 @@ else if (ctx.parent instanceof HplsqlParser.StmtContext) {
 }
 catch (SQLException e) {
   exec.signal(query);
- exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
   return 1;
 }
-  exec.closeQuery(query, exec.conf.defaultConnection);
+  exec.closeQuery(query, conn);
 return 0; 
   }  



diff --git hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java 
hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
index 17d2195..c044616 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
...
@@ -793,7 +803,7 @@ else if (type == Conn.Type.HIVE && conf.insertValues == 
Conf.InsertValues.SELECT
   return 1;
 }
 exec.setSqlSuccess();
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0; 
   }

You can apply this patch.


was (Author: ferhui):
Hi [~dkozlovdaisy], the commit for HIVE-13540 contains the change below:
diff --git hplsql/src/main/java/org/apache/hive/hplsql/Select.java 
hplsql/src/main/java/org/apache/hive/hplsql/Select.java
index 589e984..403810c 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Select.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Select.java
@@ -147,10 +147,10 @@ else if (ctx.parent instanceof HplsqlParser.StmtContext) {
 }
 catch (SQLException e) {
   exec.signal(query);
-  exec.closeQuery(query, exec.conf.defaultConnection);
+  exec.closeQuery(query, conn);
   return 1;
 }
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0; 
   }  



diff --git hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java 
hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
index 17d2195..c044616 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
...
@@ -793,7 +803,7 @@ else if (type == Conn.Type.HIVE && conf.insertValues == 
Conf.InsertValues.SELECT
   return 1;
 }
 exec.setSqlSuccess();
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0; 
   }

You can apply this patch.

> HPLSQL multiple db connection does not switch back to Hive
> --
>
> Key: HIVE-14804
> URL: https://issues.apache.org/jira/browse/HIVE-14804
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Dmitry Kozlov
>Assignee: Dmitry Tolpeko
>Priority: Blocker
>
> I have a problem with multi-database connections. I have three environments 
> that I would like to connect to in my HPLSQL code: Hive, DB2, and MySQL. As 
> soon as I map any table from either DB2 or MySQL, my code stops recognizing 
> Hive tables. It starts to treat them as tables from whichever database (DB2 
> or MySQL) was mapped last. This means your example 
> http://www.hplsql.org/map-object works only one way, from Hive to MySQL, and 
> it is not possible to go back to Hive.  
> Here is a simple piece of code.
> declare cnt int;
> begin
> /*
> PRINT 'Start MySQL';
> MAP OBJECT tbls TO hive.TBLS AT mysqlconn;
> select count(*)
> into cnt
> from tbls;
> PRINT cnt;
> PRINT 'Start Db2';
> MAP OBJECT exch TO DBDEV2.TEST_EXCHANGE AT db2conn;
> select count(1) 
> into cnt
> from exch;
> PRINT cnt;*/
> PRINT 'Check Hive';
> SELECT count(1) 
> into cnt
> FROM dev.test_sqoop;
> PRINT cnt;
> end;
> It has three blocks: one selects from MySQL, the second from DB2, and the 
> third from a Hive ORC table.
> When the first two blocks are commented out, block 3 works. See below:
> Check Hive
> 16/09/20 18:08:08 INFO jdbc.Utils: Supplied authorities: localhost:1
> 16/09/20 18:08:08 INFO jdbc.Utils: Resolved authority: localhost:1
> 16/09/20 18:08:08 INFO jdbc.HiveConnection: Will try to open client transport 
> with JDBC Uri: jdbc:hive2://localhost:1
> Open connection: jdbc:hive2://localhost:1 (497 ms)
> Starting query
> Query executed successfully (177 ms)
> 82
> When I try to uncomment either of those blocks, block 3 stops working. For 
> example, if I uncomment block 1 I get this output. It now assumes that 
> dev.test_sqoop is a MySQL table, contrary to your example.
> Start MySQL
> 

[jira] [Commented] (HIVE-14804) HPLSQL multiple db connection does not switch back to Hive

2016-11-14 Thread Fei Hui (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665841#comment-15665841
 ] 

Fei Hui commented on HIVE-14804:


Hi [~dkozlovdaisy], the commit for HIVE-13540 contains the change below:
diff --git hplsql/src/main/java/org/apache/hive/hplsql/Select.java 
hplsql/src/main/java/org/apache/hive/hplsql/Select.java
index 589e984..403810c 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Select.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Select.java
@@ -147,10 +147,10 @@ else if (ctx.parent instanceof HplsqlParser.StmtContext) {
 }
 catch (SQLException e) {
   exec.signal(query);
-  exec.closeQuery(query, exec.conf.defaultConnection);
+  exec.closeQuery(query, conn);
   return 1;
 }
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0; 
   }  



diff --git hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java 
hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
index 17d2195..c044616 100644
--- hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
+++ hplsql/src/main/java/org/apache/hive/hplsql/Stmt.java
...
@@ -793,7 +803,7 @@ else if (type == Conn.Type.HIVE && conf.insertValues == 
Conf.InsertValues.SELECT
   return 1;
 }
 exec.setSqlSuccess();
-exec.closeQuery(query, exec.conf.defaultConnection);
+exec.closeQuery(query, conn);
 return 0; 
   }

You can apply this patch.
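For readers unfamiliar with the mechanics, here is a minimal, self-contained sketch (plain Python standing in for the real HPL/SQL Java classes; all names are illustrative, not the actual API) of the two ideas in this thread: {{MAP OBJECT}} routes a table name to a named connection while unmapped tables should fall back to the default Hive connection, and the HIVE-13540 fix closes a query on the connection that executed it ({{conn}}) rather than on {{conf.defaultConnection}}.

```python
# Illustrative simulation only -- not the real org.apache.hive.hplsql classes.
class Exec:
    def __init__(self, default_conn="hiveconn"):
        self.default_conn = default_conn
        self.object_map = {}    # lowercase table name -> named connection
        self.open_queries = {}  # connection -> count of open queries

    def map_object(self, table, conn):
        # Corresponds to: MAP OBJECT <table> TO <remote table> AT <conn>
        self.object_map[table.lower()] = conn

    def conn_for(self, table):
        # Unmapped tables fall back to the default (Hive) connection.
        return self.object_map.get(table.lower(), self.default_conn)

    def run_query(self, table):
        conn = self.conn_for(table)
        self.open_queries[conn] = self.open_queries.get(conn, 0) + 1
        return conn

    def close_query(self, conn):
        # The HIVE-13540 fix: close on the connection that executed the
        # query, not on the default connection.
        self.open_queries[conn] -= 1

exec_ = Exec()
exec_.map_object("tbls", "mysqlconn")
conn = exec_.run_query("tbls")      # routed to mysqlconn
exec_.close_query(conn)             # closed on mysqlconn, not hiveconn
exec_.run_query("dev.test_sqoop")   # unmapped -> default hiveconn
```

The bug report above says the fallback step is what breaks in practice: once any object is mapped, unmapped tables end up on the last-mapped connection instead of the default one.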

> HPLSQL multiple db connection does not switch back to Hive
> --
>
> Key: HIVE-14804
> URL: https://issues.apache.org/jira/browse/HIVE-14804
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Dmitry Kozlov
>Assignee: Dmitry Tolpeko
>Priority: Blocker
>
> I have a problem with multi-database connections. I have three environments 
> that I would like to connect to in my HPLSQL code: Hive, DB2, and MySQL. As 
> soon as I map any table from either DB2 or MySQL, my code stops recognizing 
> Hive tables. It starts to treat them as tables from whichever database (DB2 
> or MySQL) was mapped last. This means your example 
> http://www.hplsql.org/map-object works only one way, from Hive to MySQL, and 
> it is not possible to go back to Hive.  
> Here is a simple piece of code.
> declare cnt int;
> begin
> /*
> PRINT 'Start MySQL';
> MAP OBJECT tbls TO hive.TBLS AT mysqlconn;
> select count(*)
> into cnt
> from tbls;
> PRINT cnt;
> PRINT 'Start Db2';
> MAP OBJECT exch TO DBDEV2.TEST_EXCHANGE AT db2conn;
> select count(1) 
> into cnt
> from exch;
> PRINT cnt;*/
> PRINT 'Check Hive';
> SELECT count(1) 
> into cnt
> FROM dev.test_sqoop;
> PRINT cnt;
> end;
> It has three blocks: one selects from MySQL, the second from DB2, and the 
> third from a Hive ORC table.
> When the first two blocks are commented out, block 3 works. See below:
> Check Hive
> 16/09/20 18:08:08 INFO jdbc.Utils: Supplied authorities: localhost:1
> 16/09/20 18:08:08 INFO jdbc.Utils: Resolved authority: localhost:1
> 16/09/20 18:08:08 INFO jdbc.HiveConnection: Will try to open client transport 
> with JDBC Uri: jdbc:hive2://localhost:1
> Open connection: jdbc:hive2://localhost:1 (497 ms)
> Starting query
> Query executed successfully (177 ms)
> 82
> When I try to uncomment either of those blocks, block 3 stops working. For 
> example, if I uncomment block 1 I get this output. It now assumes that 
> dev.test_sqoop is a MySQL table, contrary to your example.
> Start MySQL
> Open connection: jdbc:mysql://10.11.12.144:3306/hive (489 ms)
> Starting query
> Query executed successfully (4 ms)
> 539
> Check Hive
> Starting query
> Unhandled exception in HPL/SQL
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Table 
> 'dev.test_sqoop' doesn't exist
> If I also uncomment the second block, it assumes that dev.test_sqoop is 
> a DB2 table. See below. So switching between DB2 and MySQL works; however, 
> the Hive table still does not.
> Start MySQL
> Open connection: jdbc:mysql://10.11.12.144:3306/hive (485 ms)
> Starting query
> Query executed successfully (5 ms)
> 539
> Start Db2
> Open connection: jdbc:db2://10.11.12.141:5/WM (227 ms)
> Starting query
> Query executed successfully (48 ms)
> 0
> Check Hive
> Starting query
> Unhandled exception in HPL/SQL
> com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-204, 
> SQLSTATE=42704, SQLERRMC=DEV.TEST_SQOOP, DRIVER=4.16.53
> Could you please provide your feedback on this finding? In addition, I 
> would like to know whether it is possible, once a DB2 table is properly 
> mapped, to insert records selected from Hive into that DB2 table with a 
> single statement. Please explain.
> Looking forward to hearing from you soon.
> Regards,
> Dmitry Kozlov
> Daisy Intelligence   



--
This message was sent 

[jira] [Commented] (HIVE-13590) Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case

2016-11-14 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665832#comment-15665832
 ] 

Chaoyu Tang commented on HIVE-13590:


The concern we had is that new rules added to Hadoop auth_to_local for HS2 
LDAP users might introduce a security hole into the kerberized cluster.

> Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case
> -
>
> Key: HIVE-13590
> URL: https://issues.apache.org/jira/browse/HIVE-13590
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13590.1.patch, HIVE-13590.1.patch, 
> HIVE-13590.patch, HIVE-13590.patch
>
>
> In a kerberized HS2 with LDAP authentication enabled, an LDAP user usually 
> logs in with a username of the form username@domain in the multi-domain 
> case. But login fails if the domain is not covered by a Hadoop auth_to_local 
> mapping rule; the error is as follows:
> {code}
> Caused by: 
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to ct...@mydomain.com
> at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:389)
> at org.apache.hadoop.security.User.(User.java:48)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
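The failure quoted above can be illustrated with a small simulation (plain Python, not Hadoop's actual implementation; the realm names and the rule shape are hypothetical): Hadoop's DEFAULT auth_to_local rule strips the realm only when it matches the cluster's default realm, so an LDAP-style name such as user@MYDOMAIN.COM matches no rule and {{KerberosName.getShortName}} throws {{NoMatchingRule}}.

```python
# Toy model of auth_to_local short-name resolution -- illustrative only.
DEFAULT_REALM = "EXAMPLE.COM"  # hypothetical cluster default realm

class NoMatchingRule(Exception):
    pass

def get_short_name(principal, extra_rules=()):
    name, _, realm = principal.partition("@")
    # DEFAULT rule: strip the realm only if it is the default realm.
    if realm.upper() == DEFAULT_REALM:
        return name
    # Custom auth_to_local-style rules: (realm, transform) pairs.
    for rule_realm, transform in extra_rules:
        if realm.upper() == rule_realm:
            return transform(name)
    raise NoMatchingRule("No rules applied to " + principal)

get_short_name("alice@EXAMPLE.COM")      # default realm -> "alice"
# get_short_name("ctang@MYDOMAIN.COM")   # no rule -> raises NoMatchingRule
# The workaround under discussion: add a rule covering the LDAP domain.
get_short_name("ctang@MYDOMAIN.COM",
               extra_rules=[("MYDOMAIN.COM", lambda n: n)])  # -> "ctang"
```

This also makes the security trade-off above concrete: adding such a rule affects every component that resolves principals through auth_to_local, not just HS2.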


[jira] [Commented] (HIVE-13590) Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case

2016-11-14 Thread Ruslan Dautkhanov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665734#comment-15665734
 ] 

Ruslan Dautkhanov commented on HIVE-13590:
--

Thank you [~ctang.ma].

On your point #1: what if this mapping logic applied only when 
hive.server2.authentication.ldap.url and/or hive.server2.authentication are 
set? Or perhaps there should be a new knob to turn on auth_to_local for LDAP 
authentication. If the above were true, would it address your concerns?

Thanks again.

> Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case
> -
>
> Key: HIVE-13590
> URL: https://issues.apache.org/jira/browse/HIVE-13590
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13590.1.patch, HIVE-13590.1.patch, 
> HIVE-13590.patch, HIVE-13590.patch
>
>
> In a kerberized HS2 with LDAP authentication enabled, an LDAP user usually 
> logs in with a username of the form username@domain in the multi-domain 
> case. But login fails if the domain is not covered by a Hadoop auth_to_local 
> mapping rule; the error is as follows:
> {code}
> Caused by: 
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to ct...@mydomain.com
> at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:389)
> at org.apache.hadoop.security.User.(User.java:48)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15167) remove SerDe interface; undeprecate Deserializer and Serializer

2016-11-14 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665714#comment-15665714
 ] 

Lefty Leverenz commented on HIVE-15167:
---

bq.  We should un-deprecate (reprecate? precate?) them.

I like reprecate, but perhaps dedeprecate would be clearer.  Or how about 
de2precate?

> remove SerDe interface; undeprecate Deserializer and Serializer
> ---
>
> Key: HIVE-15167
> URL: https://issues.apache.org/jira/browse/HIVE-15167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-15167.patch
>
>
> SerDe interfaces were deprecated in HIVE-4007 to suggest that users do not 
> implement them. However, this results in deprecation warnings all over the 
> codebase where they are actually used.
> We should un-deprecate (reprecate? precate?) them. We can add a comment for 
> implementers instead (we could add a method with a clearly bogus name like 
> useThisAbstractClassInstead, and implement it in the class, so it would be 
> noticeable, but that would break compat).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10901) Optimize multi column distinct queries

2016-11-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665710#comment-15665710
 ] 

Hive QA commented on HIVE-10901:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12838854/HIVE-10901.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 10695 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multi_count_distinct] 
(batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_grouping_sets] 
(batchId=75)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown3]
 (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_grouping_sets]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_distinct_gby]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_1] 
(batchId=90)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] 
(batchId=91)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query70] 
(batchId=219)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_join18_multi_distinct]
 (batchId=103)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join18_multi_distinct]
 (batchId=104)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown] 
(batchId=121)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2118/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2118/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2118/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12838854 - PreCommit-HIVE-Build

> Optimize multi column distinct queries 
> 
>
> Key: HIVE-10901
> URL: https://issues.apache.org/jira/browse/HIVE-10901
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Pengcheng Xiong
> Attachments: HIVE-10901.02.patch, HIVE-10901.patch
>
>
> HIVE-10568 is useful only when there is a distinct on one column. It can be 
> expanded for multiple column cases too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15167) remove SerDe interface; undeprecate Deserializer and Serializer

2016-11-14 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665706#comment-15665706
 ] 

Lefty Leverenz commented on HIVE-15167:
---

Should this (and HIVE-4007) be documented in the wiki?

> remove SerDe interface; undeprecate Deserializer and Serializer
> ---
>
> Key: HIVE-15167
> URL: https://issues.apache.org/jira/browse/HIVE-15167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-15167.patch
>
>
> SerDe interfaces were deprecated in HIVE-4007 to suggest that users do not 
> implement them. However, this results in deprecation warnings all over the 
> codebase where they are actually used.
> We should un-deprecate (reprecate? precate?) them. We can add a comment for 
> implementers instead (we could add a method with a clearly bogus name like 
> useThisAbstractClassInstead, and implement it in the class, so it would be 
> noticeable, but that would break compat).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13590) Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case

2016-11-14 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665704#comment-15665704
 ] 

Chaoyu Tang commented on HIVE-13590:


[~Tagar] As I recall, we did not use KerberosName.getShortName for LDAP users, 
for two reasons:
1. Security: adding rules for LDAP users to Hadoop auth_to_local might 
introduce security holes for other components that access Hadoop only via 
Kerberos.
2. Backward compatibility: for HS2 deployments using only LDAP authentication, 
the KerberosName/auth_to_local rules were previously not required.

HTH, thanks

> Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case
> -
>
> Key: HIVE-13590
> URL: https://issues.apache.org/jira/browse/HIVE-13590
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13590.1.patch, HIVE-13590.1.patch, 
> HIVE-13590.patch, HIVE-13590.patch
>
>
> In a kerberized HS2 with LDAP authentication enabled, an LDAP user usually 
> logs in with a username of the form username@domain in the multi-domain 
> case. But login fails if the domain is not covered by a Hadoop auth_to_local 
> mapping rule; the error is as follows:
> {code}
> Caused by: 
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to ct...@mydomain.com
> at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:389)
> at org.apache.hadoop.security.User.(User.java:48)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15167) remove SerDe interface; undeprecate Deserializer and Serializer

2016-11-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15167:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> remove SerDe interface; undeprecate Deserializer and Serializer
> ---
>
> Key: HIVE-15167
> URL: https://issues.apache.org/jira/browse/HIVE-15167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15167.patch
>
>
> SerDe interfaces were deprecated in HIVE-4007 to suggest that users do not 
> implement them. However, this results in deprecation warnings all over the 
> codebase where they are actually used.
> We should un-deprecate (reprecate? precate?) them. We can add a comment for 
> implementers instead (we could add a method with a clearly bogus name like 
> useThisAbstractClassInstead, and implement it in the class, so it would be 
> noticeable, but that would break compat).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-15167) remove SerDe interface; undeprecate Deserializer and Serializer

2016-11-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15167:

Fix Version/s: 2.2.0

> remove SerDe interface; undeprecate Deserializer and Serializer
> ---
>
> Key: HIVE-15167
> URL: https://issues.apache.org/jira/browse/HIVE-15167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-15167.patch
>
>
> SerDe interfaces were deprecated in HIVE-4007 to suggest that users do not 
> implement them. However, this results in deprecation warnings all over the 
> codebase where they are actually used.
> We should un-deprecate (reprecate? precate?) them. We can add a comment for 
> implementers instead (we could add a method with a clearly bogus name like 
> useThisAbstractClassInstead, and implement it in the class, so it would be 
> noticeable, but that would break compat).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14089) complex type support in LLAP IO is broken

2016-11-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665647#comment-15665647
 ] 

Prasanth Jayachandran commented on HIVE-14089:
--

lgtm, +1

> complex type support in LLAP IO is broken 
> --
>
> Key: HIVE-14089
> URL: https://issues.apache.org/jira/browse/HIVE-14089
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14089.04.patch, HIVE-14089.05.patch, 
> HIVE-14089.06.patch, HIVE-14089.07.patch, HIVE-14089.08.patch, 
> HIVE-14089.09.patch, HIVE-14089.10.patch, HIVE-14089.10.patch, 
> HIVE-14089.10.patch, HIVE-14089.11.patch, HIVE-14089.12.patch, 
> HIVE-14089.WIP.2.patch, HIVE-14089.WIP.3.patch, HIVE-14089.WIP.patch
>
>
> HIVE-13617 is causing MiniLlapCliDriver following test failures
> {code}
> org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
> org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-15057) Support other types of operators (other than SELECT)

2016-11-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665594#comment-15665594
 ] 

Hive QA commented on HIVE-15057:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12838852/HIVE-15057.wip.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 477 failed/errored test(s), 10492 tests 
executed
*Failed tests:*
{noformat}
TestCBOMaxNumToCNF - did not produce a TEST-*.xml file (likely timed out) 
(batchId=255)
TestCBORuleFiredOnlyOnce - did not produce a TEST-*.xml file (likely timed out) 
(batchId=255)
TestCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=11)

[auto_join18.q,input1_limit.q,load_dyn_part3.q,autoColumnStats_4.q,auto_sortmerge_join_14.q,drop_table.q,bucket_map_join_tez2.q,auto_join33.q,merge4.q,parquet_external_time.q,storage_format_descriptor.q,mapjoin_hook.q,multi_column_in_single.q,schema_evol_orc_nonvec_table.q,cbo_rp_subq_in.q,authorization_view_disable_cbo_4.q,list_bucket_dml_2.q,cbo_rp_semijoin.q,char_2.q,union_remove_14.q,non_ascii_literal2.q,load_part_authsuccess.q,auto_sortmerge_join_15.q,explain_rearrange.q,varchar_union1.q,input21.q,vector_udf2.q,groupby_cube_multi_gby.q,bucketmapjoin8.q,union34.q]
TestCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=4)

[join_vc.q,varchar_join1.q,join7.q,insert_values_tmp_table.q,json_serde_tsformat.q,tez_union2.q,script_env_var1.q,bucketsortoptimize_insert_8.q,stats16.q,union20.q,inputddl5.q,select_transform_hint.q,parallel_join1.q,compute_stats_string.q,union_remove_7.q,union27.q,optional_outer.q,vector_include_no_sel.q,insert0.q,folder_predicate.q,groupby_cube1.q,groupby7_map_multi_single_reducer.q,join_reorder4.q,vector_interval_arithmetic.q,smb_mapjoin_17.q,groupby7_map.q,input_part10.q,udf_mask_show_first_n.q,union.q,cbo_udf_udaf.q]
TestCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=5)

[ptf_general_queries.q,correlationoptimizer9.q,auto_join_reordering_values.q,sample2.q,decimal_join.q,mapjoin_subquery2.q,join43.q,bucket_if_with_path_filter.q,udf_month.q,mapjoin1.q,avro_partitioned_native.q,join25.q,nullformatdir.q,authorization_admin_almighty1.q,udf_avg.q,cte_mat_4.q,groupby3.q,cbo_rp_union.q,udaf_covar_samp.q,exim_03_nonpart_over_compat.q,udf_logged_in_user.q,index_stale.q,union12.q,skewjoinopt2.q,skewjoinopt18.q,colstats_all_nulls.q,bucketsortoptimize_insert_2.q,quote2.q,udf_classloader.q,authorization_owner_actions.q]
TestCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=58)

[touch.q,auto_sortmerge_join_13.q,join4.q,join35.q,filter_cond_pushdown2.q,except_distinct.q,vector_left_outer_join2.q,udf_ucase.q,udf_ceil.q,vectorized_ptf.q,exim_25_export_parentpath_has_inaccessible_children.q,udf_array.q,join_filters.q,udf_current_user.q,acid_vectorization.q,join_reorder3.q,auto_join19.q,distinct_windowing_no_cbo.q,vectorization_15.q,union7.q,vectorization_nested_udf.q,database_properties.q,partition_varchar1.q,vector_groupby_3.q,udf_sort_array.q,cte_6.q,vector_mr_diff_schema_alias.q,rcfile_union.q,explain_logical.q,interval_3.q]
TestColumnPrunerProcCtx - did not produce a TEST-*.xml file (likely timed out) 
(batchId=255)
TestGenMapRedUtilsUsePartitionColumnsNegative - did not produce a TEST-*.xml 
file (likely timed out) (batchId=255)
TestHiveMetaStoreChecker - did not produce a TEST-*.xml file (likely timed out) 
(batchId=255)
TestNegativePartitionPrunerCompactExpr - did not produce a TEST-*.xml file 
(likely timed out) (batchId=255)
TestPositivePartitionPrunerCompactExpr - did not produce a TEST-*.xml file 
(likely timed out) (batchId=255)
TestTableIterable - did not produce a TEST-*.xml file (likely timed out) 
(batchId=255)
TestTxnCommands2 - did not produce a TEST-*.xml file (likely timed out) 
(batchId=255)
TestVectorizer - did not produce a TEST-*.xml file (likely timed out) 
(batchId=255)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_joins] 
(batchId=215)
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=215)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=65)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_join] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_vectorization_partition]
 (batchId=65)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_vectorization_project]
 (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge] (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_part] 

[jira] [Commented] (HIVE-14271) FileSinkOperator should not rename files to final paths when S3 is the default destination

2016-11-14 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665533#comment-15665533
 ] 

Sahil Takiar commented on HIVE-14271:
-

[~spena] I looked more into what we discussed this morning; you are correct, 
there are two places where the {{FileSinkOperator}} renames files. The first 
is in the {{commit(FileSystem)}} method, which is invoked inside each map 
task. The second is in the {{jobCloseOp(boolean)}} method, which is invoked 
inside HiveServer2.

I think we can break this work down into two JIRAs:

1: Eliminate the rename that occurs in HiveServer2
2: Eliminate the rename that occurs inside each map task

When running on S3, I can't think of a reason why either would be necessary. I 
think the first priority will be to eliminate the rename that occurs in 
HiveServer2 (as you said this morning).

> FileSinkOperator should not rename files to final paths when S3 is the 
> default destination
> --
>
> Key: HIVE-14271
> URL: https://issues.apache.org/jira/browse/HIVE-14271
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>
> FileSinkOperator does a rename of {{outPaths -> finalPaths}} when it finished 
> writing all rows to a temporary path. The problem is that S3 does not support 
> renaming.
> Two options can be considered:
> a. Use a copy operation instead. After FileSinkOperator writes all rows to 
> outPaths, then the commit method will do a copy() call instead of move().
> b. Write row by row directly to the S3 path (see HIVE-1620). This may add 
> better performance calls, but we should take care of the cleanup part in case 
> of writing errors.
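Option (a) from the description can be sketched as follows. This is a hedged illustration, not Hive's actual {{FileSinkOperator}} code: a Python dict stands in for an S3-like object store (where a "rename" is really copy-then-delete rather than an atomic metadata operation), and the paths are made up.

```python
# Minimal sketch of commit-by-copy on an object store. The dict stands in
# for S3; write_tmp/commit_by_copy are hypothetical helpers, not Hive APIs.
store: dict[str, bytes] = {}

def write_tmp(out_path: str, data: bytes) -> None:
    # FileSinkOperator first writes all rows to a temporary path.
    store[out_path] = data

def commit_by_copy(out_path: str, final_path: str) -> None:
    """Copy the temporary object to its final path, then delete the temp.
    On S3 this takes two requests; on HDFS a single rename would suffice."""
    store[final_path] = store[out_path]   # copy
    del store[out_path]                   # clean up the temporary object

write_tmp("/tmp/_task0/part-0", b"rows")
commit_by_copy("/tmp/_task0/part-0", "/warehouse/t/part-0")
```

The cleanup step is what option (b) would have to handle carefully on write errors, since a failed task may leave partial objects behind.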



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13557) Make interval keyword optional while specifying DAY in interval arithmetic

2016-11-14 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-13557:

Attachment: HIVE-13557.2.patch

patch #2)

* {{1 day}} or {{'1' day}}
* {{(1+x) day}}
* {{interval 1 day}} or {{interval '1' day}}
* {{interval (1+x) day}}

I would be happy to also support
* {{interval 1+x day}}
but it seems problematic.

While I keep thinking about that, a test execution would be good; I may have 
missed a few qtests which are affected by this change.
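The accepted shapes listed above can be summarized with a small sketch. This is only an illustration of the surface forms (an assumed regex, not Hive's actual ANTLR grammar), restricted to the DAY unit:

```python
import re

# The forms patch #2 reportedly supports, modeled as one pattern:
#   1 day | '1' day | (1+x) day, each with an optional leading 'interval'.
# The unsupported 'interval 1+x day' has an unparenthesized expression.
_INTERVAL_DAY = re.compile(
    r"^(?:interval\s+)?"          # 'interval' keyword is optional
    r"(?:\d+|'\d+'|\([^()]+\))"   # bare number, quoted number, or (expr)
    r"\s+day$",
    re.IGNORECASE,
)

def accepts(expr: str) -> bool:
    """Return True if the interval expression matches a supported form."""
    return _INTERVAL_DAY.match(expr.strip()) is not None
```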

> Make interval keyword optional while specifying DAY in interval arithmetic
> --
>
> Key: HIVE-13557
> URL: https://issues.apache.org/jira/browse/HIVE-13557
> Project: Hive
>  Issue Type: Sub-task
>  Components: Types
>Reporter: Ashutosh Chauhan
>Assignee: Zoltan Haindrich
> Attachments: HIVE-13557.1.patch, HIVE-13557.1.patch, 
> HIVE-13557.1.patch, HIVE-13557.2.patch
>
>
> Currently we support expressions like: {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31'))  - INTERVAL '30' DAY) AND 
> DATE('2000-01-31')
> {code}
> We should support:
> {code}
> WHERE SOLD_DATE BETWEEN ((DATE('2000-01-31')) + (-30) DAY) AND 
> DATE('2000-01-31')
> {code}
>   





[jira] [Commented] (HIVE-15173) Allow dec as an alias for decimal

2016-11-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665404#comment-15665404
 ] 

Ashutosh Chauhan commented on HIVE-15173:
-

Yeah, at some point we need to break backward compatibility if we want to adhere to the standard.

> Allow dec as an alias for decimal
> -
>
> Key: HIVE-15173
> URL: https://issues.apache.org/jira/browse/HIVE-15173
> Project: Hive
>  Issue Type: Sub-task
>  Components: Parser
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-15173.patch
>
>
> The SQL standard allows dec as an alias for decimal





[jira] [Updated] (HIVE-15135) Add an llap mode which fails if queries cannot run in llap

2016-11-14 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-15135:
--
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master.

> Add an llap mode which fails if queries cannot run in llap
> --
>
> Key: HIVE-15135
> URL: https://issues.apache.org/jira/browse/HIVE-15135
> Project: Hive
>  Issue Type: Task
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-15135.01.patch, HIVE-15135.02.patch, 
> HIVE-15135.03.patch, HIVE-15135.04.patch
>
>
> ALL currently ends up launching new containers for queries which cannot run 
> in llap.
> There should be a mode where these queries don't run.





[jira] [Commented] (HIVE-15135) Add an llap mode which fails if queries cannot run in llap

2016-11-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665381#comment-15665381
 ] 

Siddharth Seth commented on HIVE-15135:
---

Test failures covered by HIVE-15116, HIVE-15115.

> Add an llap mode which fails if queries cannot run in llap
> --
>
> Key: HIVE-15135
> URL: https://issues.apache.org/jira/browse/HIVE-15135
> Project: Hive
>  Issue Type: Task
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>  Labels: TODOC2.2
> Attachments: HIVE-15135.01.patch, HIVE-15135.02.patch, 
> HIVE-15135.03.patch, HIVE-15135.04.patch
>
>
> ALL currently ends up launching new containers for queries which cannot run 
> in llap.
> There should be a mode where these queries don't run.





[jira] [Commented] (HIVE-14089) complex type support in LLAP IO is broken

2016-11-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665371#comment-15665371
 ] 

Hive QA commented on HIVE-14089:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12838835/HIVE-14089.12.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10670 tests 
executed
*Failed tests:*
{noformat}
TestSSL - did not produce a TEST-*.xml file (likely timed out) (batchId=209)
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=93)

[join_cond_pushdown_unqual4.q,union_remove_7.q,join13.q,join_vc.q,groupby_cube1.q,bucket_map_join_spark2.q,sample3.q,smb_mapjoin_19.q,stats16.q,union23.q,union.q,union31.q,cbo_udf_udaf.q,ptf_decimal.q,bucketmapjoin2.q]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=145)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2116/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2116/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2116/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12838835 - PreCommit-HIVE-Build

> complex type support in LLAP IO is broken 
> --
>
> Key: HIVE-14089
> URL: https://issues.apache.org/jira/browse/HIVE-14089
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14089.04.patch, HIVE-14089.05.patch, 
> HIVE-14089.06.patch, HIVE-14089.07.patch, HIVE-14089.08.patch, 
> HIVE-14089.09.patch, HIVE-14089.10.patch, HIVE-14089.10.patch, 
> HIVE-14089.10.patch, HIVE-14089.11.patch, HIVE-14089.12.patch, 
> HIVE-14089.WIP.2.patch, HIVE-14089.WIP.3.patch, HIVE-14089.WIP.patch
>
>
> HIVE-13617 is causing MiniLlapCliDriver following test failures
> {code}
> org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
> org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
> {code}





[jira] [Commented] (HIVE-15154) Fix rest of q test file changes in branch-2.1

2016-11-14 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665364#comment-15665364
 ] 

Jesus Camacho Rodriguez commented on HIVE-15154:


[~spena], thanks for offering your help! This week I'm in ApacheCon Big Data, 
so I do not think I will have time to push the RC for the bug-fix release. If 
you would like to have the release before then, please go ahead, I think there 
is no constraint for creating a bug fix release even if you were not the RM for 
the minor revision. Otherwise, I will try to push it myself as soon as I have 
some cycles.

> Fix rest of q test file changes in branch-2.1
> -
>
> Key: HIVE-15154
> URL: https://issues.apache.org/jira/browse/HIVE-15154
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.1.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
> Fix For: 2.1.1
>
> Attachments: HIVE-15154-branch-2.1.patch, 
> HIVE-15154-branch-2.1.patch, HIVE-15154.2-branch-2.1.patch
>
>






[jira] [Updated] (HIVE-15148) disallow loading data into bucketed tables (by default)

2016-11-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15148:

Attachment: (was: HIVE-15148.02.patch)

> disallow loading data into bucketed tables (by default)
> ---
>
> Key: HIVE-15148
> URL: https://issues.apache.org/jira/browse/HIVE-15148
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15148.01.patch, HIVE-15148.02.patch, 
> HIVE-15148.patch
>
>
> A few q file tests still use the following, allowed, pattern:
> {noformat}
> CREATE TABLE bucket_small (key string, value string) partitioned by (ds 
> string) CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> load data local inpath '../../data/files/smallsrcsortbucket1outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> load data local inpath '../../data/files/smallsrcsortbucket2outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> {noformat}
> This relies on the user to load the correct number of files with correctly 
> hashed data and the correct order of file names; if there's some discrepancy 
> in any of the above, the queries will fail or may produce incorrect results 
> if some bucket-based optimizations kick in.
> Additionally, even if the user does everything correctly, as far as I know 
> some code derives bucket number from file name, which won't work in this case 
> (as opposed to getting buckets based on the order of files, which will work 
> here but won't work as per  HIVE-14970... sigh).
> Hive enforces bucketing in other cases (the check cannot even be disabled 
> these days), so I suggest that we either prohibit the above outright, or at 
> least add a safety config setting that would disallow it by default.
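The fragility described above comes down to bucket placement: Hive expects the rows for a key to live in the file for bucket {{hash(key) % numBuckets}}, and LOAD DATA trusts the user to have arranged that. A minimal sketch of the invariant (a deterministic stand-in hash, not Hive's ObjectInspector-based one; {{validate_load}} is a hypothetical helper, not an actual Hive check):

```python
# Why LOAD DATA into a bucketed table is fragile: Hive assumes
# bucket = hash(key) % num_buckets, but nothing verifies the loaded files.
def bucket_for(key: str, num_buckets: int) -> int:
    # Simple reproducible stand-in for Hive's hash function.
    h = sum(ord(c) for c in key)
    return h % num_buckets

def validate_load(files: dict[int, list[str]], num_buckets: int) -> bool:
    """files maps bucket index -> keys the user placed in that file.
    True only if every key sits in the bucket Hive would compute for it."""
    return all(
        bucket_for(k, num_buckets) == b
        for b, keys in files.items()
        for k in keys
    )
```

If any key sits in the wrong file, bucket-based optimizations (bucket map join, SMB join) silently read the wrong data, which is the incorrect-results case the description warns about.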





[jira] [Updated] (HIVE-15148) disallow loading data into bucketed tables (by default)

2016-11-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15148:

Attachment: HIVE-15148.02.patch

> disallow loading data into bucketed tables (by default)
> ---
>
> Key: HIVE-15148
> URL: https://issues.apache.org/jira/browse/HIVE-15148
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15148.01.patch, HIVE-15148.02.patch, 
> HIVE-15148.patch
>
>
> A few q file tests still use the following, allowed, pattern:
> {noformat}
> CREATE TABLE bucket_small (key string, value string) partitioned by (ds 
> string) CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> load data local inpath '../../data/files/smallsrcsortbucket1outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> load data local inpath '../../data/files/smallsrcsortbucket2outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> {noformat}
> This relies on the user to load the correct number of files with correctly 
> hashed data and the correct order of file names; if there's some discrepancy 
> in any of the above, the queries will fail or may produce incorrect results 
> if some bucket-based optimizations kick in.
> Additionally, even if the user does everything correctly, as far as I know 
> some code derives bucket number from file name, which won't work in this case 
> (as opposed to getting buckets based on the order of files, which will work 
> here but won't work as per  HIVE-14970... sigh).
> Hive enforces bucketing in other cases (the check cannot even be disabled 
> these days), so I suggest that we either prohibit the above outright, or at 
> least add a safety config setting that would disallow it by default.





[jira] [Updated] (HIVE-15148) disallow loading data into bucketed tables (by default)

2016-11-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15148:

Attachment: (was: HIVE-15148.02.patch)

> disallow loading data into bucketed tables (by default)
> ---
>
> Key: HIVE-15148
> URL: https://issues.apache.org/jira/browse/HIVE-15148
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15148.01.patch, HIVE-15148.02.patch, 
> HIVE-15148.patch
>
>
> A few q file tests still use the following, allowed, pattern:
> {noformat}
> CREATE TABLE bucket_small (key string, value string) partitioned by (ds 
> string) CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> load data local inpath '../../data/files/smallsrcsortbucket1outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> load data local inpath '../../data/files/smallsrcsortbucket2outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> {noformat}
> This relies on the user to load the correct number of files with correctly 
> hashed data and the correct order of file names; if there's some discrepancy 
> in any of the above, the queries will fail or may produce incorrect results 
> if some bucket-based optimizations kick in.
> Additionally, even if the user does everything correctly, as far as I know 
> some code derives bucket number from file name, which won't work in this case 
> (as opposed to getting buckets based on the order of files, which will work 
> here but won't work as per  HIVE-14970... sigh).
> Hive enforces bucketing in other cases (the check cannot even be disabled 
> these days), so I suggest that we either prohibit the above outright, or at 
> least add a safety config setting that would disallow it by default.





[jira] [Updated] (HIVE-15148) disallow loading data into bucketed tables (by default)

2016-11-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15148:

Attachment: HIVE-15148.02.patch

Missed some q files

> disallow loading data into bucketed tables (by default)
> ---
>
> Key: HIVE-15148
> URL: https://issues.apache.org/jira/browse/HIVE-15148
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15148.01.patch, HIVE-15148.02.patch, 
> HIVE-15148.patch
>
>
> A few q file tests still use the following, allowed, pattern:
> {noformat}
> CREATE TABLE bucket_small (key string, value string) partitioned by (ds 
> string) CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> load data local inpath '../../data/files/smallsrcsortbucket1outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> load data local inpath '../../data/files/smallsrcsortbucket2outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> {noformat}
> This relies on the user to load the correct number of files with correctly 
> hashed data and the correct order of file names; if there's some discrepancy 
> in any of the above, the queries will fail or may produce incorrect results 
> if some bucket-based optimizations kick in.
> Additionally, even if the user does everything correctly, as far as I know 
> some code derives bucket number from file name, which won't work in this case 
> (as opposed to getting buckets based on the order of files, which will work 
> here but won't work as per  HIVE-14970... sigh).
> Hive enforces bucketing in other cases (the check cannot even be disabled 
> these days), so I suggest that we either prohibit the above outright, or at 
> least add a safety config setting that would disallow it by default.





[jira] [Commented] (HIVE-15188) Hive Runtime Error processing row

2016-11-14 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665277#comment-15665277
 ] 

Wei Zheng commented on HIVE-15188:
--

[~gliaw] Do you mind attaching your test case (query and data)?

> Hive Runtime Error processing row
> -
>
> Key: HIVE-15188
> URL: https://issues.apache.org/jira/browse/HIVE-15188
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.1
>Reporter: George Liaw
>Assignee: Wei Zheng
>
> Originally asked this over the email group but was asked to file a bug report 
> by [~sershe].
> I'm running into the below error occasionally and it seems related to Hybrid 
> Grace. Removed row contents but there are multiple columns.
> Got around the issue for the time being with [~gopalv]'s suggestion to 
> disable {{hive.mapjoin.hybridgrace.hashtable}}.
> {code}
> Vertex failed, vertexName=Map 5, vertexId=vertex_1478744516303_0282_1_07, 
> diagnostics=
>  Task failed, taskId=task_1478744516303_0282_1_07_71, diagnostics=
>  TaskAttempt 0 failed, info=
>  Error: Error while running task ( failure ) : 
> attempt_1478744516303_0282_1_07_71_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row {"some_col":"some_val"...}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:195)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"some_col":"some_val"...}
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:356)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172)
> ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"some_col":"some_val"...}
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:574)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
> ... 17 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> exception from MapJoinOperator : null
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:454)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:97)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:128)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:170)
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:564)
> ... 18 more
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer$ReusableRowContainer.setFromOutput(HybridHashTableContainer.java:844)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer$GetAdaptor.setFromRow(HybridHashTableContainer.java:725)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.setMapJoinKey(MapJoinOperator.java:339)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:390)
> ... 26 more
> {code}



--

[jira] [Assigned] (HIVE-15188) Hive Runtime Error processing row

2016-11-14 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng reassigned HIVE-15188:


Assignee: Wei Zheng

> Hive Runtime Error processing row
> -
>
> Key: HIVE-15188
> URL: https://issues.apache.org/jira/browse/HIVE-15188
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.1
>Reporter: George Liaw
>Assignee: Wei Zheng
>
> Originally asked this over the email group but was asked to file a bug report 
> by [~sershe].
> I'm running into the below error occasionally and it seems related to Hybrid 
> Grace. Removed row contents but there are multiple columns.
> Got around the issue for the time being with [~gopalv]'s suggestion to 
> disable {{hive.mapjoin.hybridgrace.hashtable}}.
> {code}
> Vertex failed, vertexName=Map 5, vertexId=vertex_1478744516303_0282_1_07, 
> diagnostics=
>  Task failed, taskId=task_1478744516303_0282_1_07_71, diagnostics=
>  TaskAttempt 0 failed, info=
>  Error: Error while running task ( failure ) : 
> attempt_1478744516303_0282_1_07_71_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row {"some_col":"some_val"...}
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:195)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row {"some_col":"some_val"...}
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:95)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:70)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:356)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172)
> ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"some_col":"some_val"...}
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:574)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
> ... 17 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> exception from MapJoinOperator : null
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:454)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:97)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:128)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:170)
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:564)
> ... 18 more
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer$ReusableRowContainer.setFromOutput(HybridHashTableContainer.java:844)
> at 
> org.apache.hadoop.hive.ql.exec.persistence.HybridHashTableContainer$GetAdaptor.setFromRow(HybridHashTableContainer.java:725)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.setMapJoinKey(MapJoinOperator.java:339)
> at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:390)
> ... 26 more
> {code}





[jira] [Commented] (HIVE-13931) Add support for HikariCP and replace BoneCP usage with HikariCP

2016-11-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665257#comment-15665257
 ] 

Prasanth Jayachandran commented on HIVE-13931:
--

The test runs seem to be very flaky, with different runs showing different 
failures. Since we are not changing the defaults for the connection pool, I 
don't expect the failures to be related to this patch. I have started another 
test run to make sure.

> Add support for HikariCP and replace BoneCP usage with HikariCP
> ---
>
> Key: HIVE-13931
> URL: https://issues.apache.org/jira/browse/HIVE-13931
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sushanth Sowmyan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13931.2.patch, HIVE-13931.3.patch, 
> HIVE-13931.4.patch, HIVE-13931.patch
>
>
> Currently, we use BoneCP as our primary connection pooling mechanism 
> (overridable by users). However, BoneCP is no longer being actively 
> developed, and is considered deprecated, replaced by HikariCP.
> Thus, we should add support for HikariCP, and try to replace our primary 
> usage of BoneCP with it.





[jira] [Commented] (HIVE-15148) disallow loading data into bucketed tables (by default)

2016-11-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665236#comment-15665236
 ] 

Hive QA commented on HIVE-15148:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12838830/HIVE-15148.02.patch

{color:green}SUCCESS:{color} +1 due to 95 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10694 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_1] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucket_map_join_2] 
(batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_orig_table] 
(batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table]
 (batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=56)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_orig_table]
 (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=91)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_1] 
(batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_2] 
(batchId=116)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2115/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2115/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2115/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12838830 - PreCommit-HIVE-Build

> disallow loading data into bucketed tables (by default)
> ---
>
> Key: HIVE-15148
> URL: https://issues.apache.org/jira/browse/HIVE-15148
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15148.01.patch, HIVE-15148.02.patch, 
> HIVE-15148.patch
>
>
> A few q file tests still use the following, allowed, pattern:
> {noformat}
> CREATE TABLE bucket_small (key string, value string) partitioned by (ds 
> string) CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> load data local inpath '../../data/files/smallsrcsortbucket1outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> load data local inpath '../../data/files/smallsrcsortbucket2outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> {noformat}
> This relies on the user to load the correct number of files with correctly 
> hashed data and the correct order of file names; if there's some discrepancy 
> in any of the above, the queries will fail or may produce incorrect results 
> if some bucket-based optimizations kick in.
> Additionally, even if the user does everything correctly, as far as I know 
> some code derives bucket number from file name, which won't work in this case 
> (as opposed to getting buckets based on the order of files, which will work 
> here but won't work as per  HIVE-14970... sigh).
> Hive enforces bucketing in other cases (the check cannot even be disabled 
> these days), so I suggest that we either prohibit the above outright, or at 
> least add a safety config setting that would disallow it by default.
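As an illustration of what the user has to reproduce by hand, here is a minimal sketch of bucket assignment (an assumption-laden model, not Hive's actual code: it assumes string keys hashed with Java String.hashCode semantics and the sign bit masked before the modulus):

```python
# Illustrative sketch only: Hive assigns a row to bucket hash(key) mod
# numBuckets. String hashing is modeled here on Java's String.hashCode;
# the real logic lives inside Hive's ObjectInspector machinery.
def java_string_hashcode(s: str) -> int:
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # reinterpret as a signed 32-bit int, as Java does
    return h - 0x100000000 if h >= 0x80000000 else h

def bucket_for(key: str, num_buckets: int) -> int:
    # mask off the sign bit before taking the modulus
    return (java_string_hashcode(key) & 0x7FFFFFFF) % num_buckets

# A user hand-loading files into a 2-bucket table must place every row in
# the file whose position matches this computed bucket; any mismatch can
# silently break bucket-based optimizations.
assignment = {k: bucket_for(k, 2) for k in ["key1", "key2", "key3"]}
```

If the hand-built files do not agree with this assignment for every key, queries fail or return wrong results exactly as the description warns.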





[jira] [Updated] (HIVE-10901) Optimize multi column distinct queries

2016-11-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10901:
---
Status: Patch Available  (was: Open)

> Optimize multi column distinct queries
> 
>
> Key: HIVE-10901
> URL: https://issues.apache.org/jira/browse/HIVE-10901
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Pengcheng Xiong
> Attachments: HIVE-10901.02.patch, HIVE-10901.patch
>
>
> HIVE-10568 is useful only when there is a distinct on one column. It can be 
> expanded for multiple column cases too.





[jira] [Updated] (HIVE-10901) Optimize multi column distinct queries

2016-11-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10901:
---
Status: Open  (was: Patch Available)

> Optimize multi column distinct queries
> 
>
> Key: HIVE-10901
> URL: https://issues.apache.org/jira/browse/HIVE-10901
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Pengcheng Xiong
> Attachments: HIVE-10901.02.patch, HIVE-10901.patch
>
>
> HIVE-10568 is useful only when there is a distinct on one column. It can be 
> expanded for multiple column cases too.





[jira] [Updated] (HIVE-10901) Optimize multi column distinct queries

2016-11-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10901:
---
Attachment: HIVE-10901.02.patch

> Optimize multi column distinct queries
> 
>
> Key: HIVE-10901
> URL: https://issues.apache.org/jira/browse/HIVE-10901
> Project: Hive
>  Issue Type: New Feature
>  Components: CBO, Logical Optimizer
>Affects Versions: 1.2.0
>Reporter: Mostafa Mokhtar
>Assignee: Pengcheng Xiong
> Attachments: HIVE-10901.02.patch, HIVE-10901.patch
>
>
> HIVE-10568 is useful only when there is a distinct on one column. It can be 
> expanded for multiple column cases too.





[jira] [Updated] (HIVE-15057) Support other types of operators (other than SELECT)

2016-11-14 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-15057:

Attachment: (was: HIVE-15057.wip.patch)

> Support other types of operators (other than SELECT)
> 
>
> Key: HIVE-15057
> URL: https://issues.apache.org/jira/browse/HIVE-15057
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer, Physical Optimizer
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-15057.wip.patch
>
>
> Currently only SELECT operators are supported for nested column pruning. We 
> should add support for other types of operators so the optimization can work 
> for complex queries.
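A toy model (hypothetical; these are not Hive's operator classes) of what extending nested column pruning beyond SELECT amounts to: collecting the required struct field paths across all operator types, so only accessed sub-fields survive:

```python
# Toy model of nested column pruning: each operator in the plan reports the
# nested field paths it reads (e.g. "s.a.b" for struct s, field a, subfield
# b), and pruning keeps only the union of required paths per top-level column.
def prune_nested_columns(operators):
    """operators: list of (op_type, accessed_paths) pairs."""
    required = set()
    for _op_type, paths in operators:
        required.update(paths)
    # group required leaf paths by their top-level column
    by_column = {}
    for p in sorted(required):
        by_column.setdefault(p.split(".")[0], []).append(p)
    return by_column

# Once FILTER/GROUPBY also report their paths, pruning works for the whole plan:
plan = [("SELECT", {"s.a"}), ("FILTER", {"s.b.c"}), ("GROUPBY", {"t.x"})]
```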





[jira] [Updated] (HIVE-15057) Support other types of operators (other than SELECT)

2016-11-14 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-15057:

Attachment: HIVE-15057.wip.patch

> Support other types of operators (other than SELECT)
> 
>
> Key: HIVE-15057
> URL: https://issues.apache.org/jira/browse/HIVE-15057
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer, Physical Optimizer
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-15057.wip.patch
>
>
> Currently only SELECT operators are supported for nested column pruning. We 
> should add support for other types of operators so the optimization can work 
> for complex queries.





[jira] [Commented] (HIVE-13931) Add support for HikariCP and replace BoneCP usage with HikariCP

2016-11-14 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665111#comment-15665111
 ] 

Sushanth Sowmyan commented on HIVE-13931:
-

+1, .4.patch looks good to me. Thanks, [~prasanth_j]

> Add support for HikariCP and replace BoneCP usage with HikariCP
> ---
>
> Key: HIVE-13931
> URL: https://issues.apache.org/jira/browse/HIVE-13931
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sushanth Sowmyan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13931.2.patch, HIVE-13931.3.patch, 
> HIVE-13931.4.patch, HIVE-13931.patch
>
>
> Currently, we use BoneCP as our primary connection pooling mechanism 
> (overridable by users). However, BoneCP is no longer being actively 
> developed, and is considered deprecated, replaced by HikariCP.
> Thus, we should add support for HikariCP, and try to replace our primary 
> usage of BoneCP with it.





[jira] [Commented] (HIVE-13931) Add support for HikariCP and replace BoneCP usage with HikariCP

2016-11-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665104#comment-15665104
 ] 

Hive QA commented on HIVE-13931:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12838623/HIVE-13931.4.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10679 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=94)

[parallel_join1.q,union27.q,union12.q,groupby7_map_multi_single_reducer.q,varchar_join1.q,join7.q,join_reorder4.q,skewjoinopt2.q,bucketsortoptimize_insert_2.q,smb_mapjoin_17.q,script_env_var1.q,groupby7_map.q,groupby3.q,bucketsortoptimize_insert_8.q,union20.q]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[join_acid_non_acid]
 (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=145)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_1] 
(batchId=90)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] 
(batchId=90)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2114/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2114/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2114/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12838623 - PreCommit-HIVE-Build

> Add support for HikariCP and replace BoneCP usage with HikariCP
> ---
>
> Key: HIVE-13931
> URL: https://issues.apache.org/jira/browse/HIVE-13931
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sushanth Sowmyan
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13931.2.patch, HIVE-13931.3.patch, 
> HIVE-13931.4.patch, HIVE-13931.patch
>
>
> Currently, we use BoneCP as our primary connection pooling mechanism 
> (overridable by users). However, BoneCP is no longer being actively 
> developed, and is considered deprecated, replaced by HikariCP.
> Thus, we should add support for HikariCP, and try to replace our primary 
> usage of BoneCP with it.





[jira] [Updated] (HIVE-15114) Remove extra MoveTask operators

2016-11-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15114:
---
Status: Patch Available  (was: Open)

> Remove extra MoveTask operators
> ---
>
> Key: HIVE-15114
> URL: https://issues.apache.org/jira/browse/HIVE-15114
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Sahil Takiar
>Assignee: Sergio Peña
> Attachments: HIVE-15114.WIP.1.patch, HIVE-15114.WIP.2.patch
>
>
> When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES 
> ...}}) an extraneous {{MoveTask}} is created.
> This is problematic when the scratch directory is on S3 since renames require 
> copying the entire dataset.
> For simple queries (like the one above), there are two MoveTasks. The first 
> one moves the output data from one file in the scratch directory to another 
> file in the scratch directory. The second MoveTask moves the data from the 
> scratch directory to its final table location.
> The first MoveTask should not be necessary. The goal of this JIRA is to 
> remove it. This should help improve performance when running on S3.
> It seems that the first Move might be caused by a dependency resolution 
> problem in the optimizer, where a dependent task doesn't get properly removed 
> when the task it depends on is filtered by a condition resolver.
> A dummy {{MoveTask}} is added in the 
> {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a 
> conditional task which launches a job to merge tasks at the end of the file. 
> At the end of the conditional job there is a MoveTask.
> Even though Hive decides that the conditional merge job is not needed, it 
> seems the MoveTask is still added to the plan.
> Seems this extra {{MoveTask}} may have been added intentionally. Not sure why 
> yet. The {{ConditionalResolverMergeFiles}} says that one of three tasks will 
> be returned: move task only, merge task only, merge task followed by a move 
> task.
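The redundancy described in the issue can be sketched abstractly (a hypothetical model, not Hive's optimizer code): two chained MoveTasks whose intermediate path matches can collapse into a single move, which is what removing the extra task would achieve:

```python
# Hypothetical sketch: consecutive moves where the first task's destination
# equals the next task's source are fused into one move. On S3 this matters
# because each "rename" is a full copy of the data, not a metadata operation.
def collapse_moves(moves):
    """moves: ordered list of (src, dst) pairs; returns the fused plan."""
    out = []
    for src, dst in moves:
        if out and out[-1][1] == src:
            out[-1] = (out[-1][0], dst)  # fuse with the preceding move
        else:
            out.append((src, dst))
    return out

# The two MoveTasks from the description: scratch-to-scratch, then
# scratch-to-table. Fusing them leaves a single scratch-to-table move.
plan = [("scratch/tmp1", "scratch/tmp2"), ("scratch/tmp2", "warehouse/t1")]
```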





[jira] [Commented] (HIVE-13590) Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case

2016-11-14 Thread Ruslan Dautkhanov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665041#comment-15665041
 ] 

Ruslan Dautkhanov commented on HIVE-13590:
--

thank you [~ctang.ma]

Looking at your patch 
https://issues.apache.org/jira/secure/attachment/12811366/HIVE-13590.1.patch

{noformat}
...
+// KerberosName.getShorName can only be used for kerberos user, but 
not for the user
+// logged in via other authentications such as LDAP
+KerberosNameShim fullKerberosName = 
ShimLoader.getHadoopShims().getKerberosNameShim(userName);
+ret = fullKerberosName.getShortName();
...
{noformat}

Is there any specific reason getShortName() can't be used for LDAP principals 
too?
Looking around for potential solutions for HIVE-15174

Thanks!

> Kerberized HS2 with LDAP auth enabled fails in multi-domain LDAP case
> -
>
> Key: HIVE-13590
> URL: https://issues.apache.org/jira/browse/HIVE-13590
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Security
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13590.1.patch, HIVE-13590.1.patch, 
> HIVE-13590.patch, HIVE-13590.patch
>
>
> In a kerberized HS2 with LDAP authentication enabled, an LDAP user in a 
> multi-domain setup usually logs in with a username of the form 
> username@domain. Login fails if the domain is not covered by the Hadoop 
> auth_to_local mapping rules; the error is as follows:
> {code}
> Caused by: 
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to ct...@mydomain.com
> at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:389)
> at org.apache.hadoop.security.User.(User.java:48)
> {code}
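The quoted failure can be modeled with a small sketch (a deliberate simplification; this is not Hadoop's actual rule engine, and the names below are hypothetical) showing why a principal from an unmapped realm raises an error instead of falling back to a stripped short name:

```python
# Simplified model of KerberosName.getShortName: auth_to_local rules map a
# principal user@REALM to a short name. When no rule matches the realm, the
# real implementation raises NoMatchingRule rather than just stripping the
# realm, which is exactly the failure quoted above.
class NoMatchingRule(Exception):
    pass

def get_short_name(principal, known_realms):
    if "@" not in principal:
        return principal  # already a short name
    user, realm = principal.rsplit("@", 1)
    if realm.upper() in known_realms:
        return user  # a DEFAULT-style rule strips a known realm
    raise NoMatchingRule(f"No rules applied to {principal}")

# A principal whose LDAP domain has no auth_to_local rule reproduces the
# error, while a mapped realm resolves normally.
short = get_short_name("someuser@EXAMPLE.COM", {"EXAMPLE.COM"})
```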





[jira] [Updated] (HIVE-15114) Remove extra MoveTask operators

2016-11-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-15114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-15114:
---
Attachment: HIVE-15114.WIP.2.patch

Attaching another file to run a full set of tests. A couple of things are 
needed before committing this:

- HIVE-15199 must be fixed
- S3 optimizations must be disabled by default (I enabled it for running the 
tests)

> Remove extra MoveTask operators
> ---
>
> Key: HIVE-15114
> URL: https://issues.apache.org/jira/browse/HIVE-15114
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Sahil Takiar
>Assignee: Sergio Peña
> Attachments: HIVE-15114.WIP.1.patch, HIVE-15114.WIP.2.patch
>
>
> When running simple insert queries (e.g. {{INSERT INTO TABLE ... VALUES 
> ...}}) an extraneous {{MoveTask}} is created.
> This is problematic when the scratch directory is on S3 since renames require 
> copying the entire dataset.
> For simple queries (like the one above), there are two MoveTasks. The first 
> one moves the output data from one file in the scratch directory to another 
> file in the scratch directory. The second MoveTask moves the data from the 
> scratch directory to its final table location.
> The first MoveTask should not be necessary. The goal of this JIRA is to 
> remove it. This should help improve performance when running on S3.
> It seems that the first Move might be caused by a dependency resolution 
> problem in the optimizer, where a dependent task doesn't get properly removed 
> when the task it depends on is filtered by a condition resolver.
> A dummy {{MoveTask}} is added in the 
> {{GenMapRedUtils.createMRWorkForMergingFiles}} method. This method creates a 
> conditional task which launches a job to merge tasks at the end of the file. 
> At the end of the conditional job there is a MoveTask.
> Even though Hive decides that the conditional merge job is not needed, it 
> seems the MoveTask is still added to the plan.
> Seems this extra {{MoveTask}} may have been added intentionally. Not sure why 
> yet. The {{ConditionalResolverMergeFiles}} says that one of three tasks will 
> be returned: move task only, merge task only, merge task followed by a move 
> task.





[jira] [Comment Edited] (HIVE-15167) remove SerDe interface; undeprecate Deserializer and Serializer

2016-11-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664980#comment-15664980
 ] 

Sergey Shelukhin edited comment on HIVE-15167 at 11/14/16 9:11 PM:
---

All the tests are known flaky tests (PPD one is HIVE-14936, the rest are very 
old).




was (Author: sershe):
All the tests are known flaky tests (PPD one is 
https://issues.apache.org/jira/browse/HIVE-14936, the rest are very old).



> remove SerDe interface; undeprecate Deserializer and Serializer
> ---
>
> Key: HIVE-15167
> URL: https://issues.apache.org/jira/browse/HIVE-15167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15167.patch
>
>
> SerDe interfaces were deprecated in HIVE-4007 to suggest that users do not 
> implement them. However, this results in deprecation warnings all over the 
> codebase where they are actually used.
> We should un-deprecate (reprecate? precate?) them. We can add a comment for 
> implementers instead (we could add a method with a clearly bogus name like 
> useThisAbstractClassInstead, and implement it in the class, so it would be 
> noticeable, but that would break compat).





[jira] [Updated] (HIVE-15178) ORC stripe merge may produce many MR jobs and no merge if split size is small

2016-11-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-15178:

Attachment: HIVE-15178.01.patch

Cannot repro and the test report is gone, trying the same patch again.

> ORC stripe merge may produce many MR jobs and no merge if split size is small
> -
>
> Key: HIVE-15178
> URL: https://issues.apache.org/jira/browse/HIVE-15178
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15178.01.patch, HIVE-15178.patch
>
>
> orc_createas1
> logs the following:
> {noformat}
> 2016-11-10T13:38:54,366  INFO [LocalJobRunner Map Task Executor #0] 
> mapred.MapTask: Processing split: 
> Paths:/Users/sergey/git/hivegit2/itests/qtest/target/warehouse/.hive-staging_hive_2016-11-10_13-38-52_334_1323113125332102866-1/-ext-10004/01_0:2400+100InputFormatClass:
>  org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeInputFormat
> 2016-11-10T13:38:54,373  INFO [LocalJobRunner Map Task Executor #0] 
> mapred.MapTask: Processing split: 
> Paths:/Users/sergey/git/hivegit2/itests/qtest/target/warehouse/.hive-staging_hive_2016-11-10_13-38-52_334_1323113125332102866-1/-ext-10004/01_0:2500+100InputFormatClass:
>  org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeInputFormat
> 2016-11-10T13:38:54,380  INFO [LocalJobRunner Map Task Executor #0] 
> mapred.MapTask: Processing split: 
> Paths:/Users/sergey/git/hivegit2/itests/qtest/target/warehouse/.hive-staging_hive_2016-11-10_13-38-52_334_1323113125332102866-1/-ext-10004/01_0:2600+100InputFormatClass:
>  org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeInputFormat
> 2016-11-10T13:38:54,387  INFO [LocalJobRunner Map Task Executor #0] 
> mapred.MapTask: Processing split: 
> Paths:/Users/sergey/git/hivegit2/itests/qtest/target/warehouse/.hive-staging_hive_2016-11-10_13-38-52_334_1323113125332102866-1/-ext-10004/01_0:2700+100InputFormatClass:
>  org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeInputFormat
> ...
> {noformat}
> It tries to merge 2 files, but instead ends up running an MR task for every 
> 100-byte split and produces 2 files again (I assume most tasks don't produce 
> output because a split at an arbitrary 100-byte offset is invalid).
> {noformat}
> 2016-11-10T13:38:53,985  INFO [LocalJobRunner Map Task Executor #0] 
> OrcFileMergeOperator: Merged stripe from file 
> pfile:/Users/sergey/git/hivegit2/itests/qtest/target/warehouse/.hive-staging_hive_2016-11-10_13-38-52_334_1323113125332102866-1/-ext-10004/00_0
>  [ offset : 3 length: 2770 row: 500 ]
> 2016-11-10T13:38:53,995  INFO [LocalJobRunner Map Task Executor #0] 
> exec.AbstractFileMergeOperator: renamed path 
> pfile:/Users/sergey/git/hivegit2/itests/qtest/target/warehouse/.hive-staging_hive_2016-11-10_13-38-52_334_1323113125332102866-1/_task_tmp.-ext-10002/_tmp.02_0
>  to 
> pfile:/Users/sergey/git/hivegit2/itests/qtest/target/warehouse/.hive-staging_hive_2016-11-10_13-38-52_334_1323113125332102866-1/_tmp.-ext-10002/02_0
>  . File size is 2986
> 2016-11-10T13:38:54,206  INFO [LocalJobRunner Map Task Executor #0] 
> OrcFileMergeOperator: Merged stripe from file 
> pfile:/Users/sergey/git/hivegit2/itests/qtest/target/warehouse/.hive-staging_hive_2016-11-10_13-38-52_334_1323113125332102866-1/-ext-10004/01_0
>  [ offset : 3 length: 2770 row: 500 ]
> 2016-11-10T13:38:54,215  INFO [LocalJobRunner Map Task Executor #0] 
> exec.AbstractFileMergeOperator: renamed path 
> pfile:/Users/sergey/git/hivegit2/itests/qtest/target/warehouse/.hive-staging_hive_2016-11-10_13-38-52_334_1323113125332102866-1/_task_tmp.-ext-10002/_tmp.30_0
>  to 
> pfile:/Users/sergey/git/hivegit2/itests/qtest/target/warehouse/.hive-staging_hive_2016-11-10_13-38-52_334_1323113125332102866-1/_tmp.-ext-10002/30_0
>  . File size is 2986
> {noformat}
> This is because the test sets the max split size to 100. The merge job is 
> supposed to override that setting, but somehow it does not.
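A back-of-the-envelope sketch (assumed split behavior, not Hive's actual split-generation code) of why a 100-byte max split size turns a two-file merge into dozens of map tasks:

```python
# If the max split size is not overridden by the merge job, a file of size S
# is carved into roughly ceil(S / maxSplitSize) splits, each becoming a map
# task. With the test's 100-byte limit, two ~2986-byte ORC files fan out into
# dozens of tasks instead of the one or two a merge should need.
import math

def num_splits(file_size_bytes, max_split_size):
    return math.ceil(file_size_bytes / max_split_size)

# Two ~2986-byte files at a 100-byte max split size:
tasks = 2 * num_splits(2986, 100)
```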





[jira] [Commented] (HIVE-15167) remove SerDe interface; undeprecate Deserializer and Serializer

2016-11-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665016#comment-15665016
 ] 

Ashutosh Chauhan commented on HIVE-15167:
-

+1

> remove SerDe interface; undeprecate Deserializer and Serializer
> ---
>
> Key: HIVE-15167
> URL: https://issues.apache.org/jira/browse/HIVE-15167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15167.patch
>
>
> SerDe interfaces were deprecated in HIVE-4007 to suggest that users do not 
> implement them. However, this results in deprecation warnings all over the 
> codebase where they are actually used.
> We should un-deprecate (reprecate? precate?) them. We can add a comment for 
> implementers instead (we could add a method with a clearly bogus name like 
> useThisAbstractClassInstead, and implement it in the class, so it would be 
> noticeable, but that would break compat).





[jira] [Commented] (HIVE-15148) disallow loading data into bucketed tables (by default)

2016-11-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665006#comment-15665006
 ] 

Ashutosh Chauhan commented on HIVE-15148:
-

+1

> disallow loading data into bucketed tables (by default)
> ---
>
> Key: HIVE-15148
> URL: https://issues.apache.org/jira/browse/HIVE-15148
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15148.01.patch, HIVE-15148.02.patch, 
> HIVE-15148.patch
>
>
> A few q file tests still use the following, allowed, pattern:
> {noformat}
> CREATE TABLE bucket_small (key string, value string) partitioned by (ds 
> string) CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> load data local inpath '../../data/files/smallsrcsortbucket1outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> load data local inpath '../../data/files/smallsrcsortbucket2outof4.txt' INTO 
> TABLE bucket_small partition(ds='2008-04-08');
> {noformat}
> This relies on the user to load the correct number of files with correctly 
> hashed data and the correct order of file names; if there's some discrepancy 
> in any of the above, the queries will fail or may produce incorrect results 
> if some bucket-based optimizations kick in.
> Additionally, even if the user does everything correctly, as far as I know 
> some code derives bucket number from file name, which won't work in this case 
> (as opposed to getting buckets based on the order of files, which will work 
> here but won't work as per  HIVE-14970... sigh).
> Hive enforces bucketing in other cases (the check cannot even be disabled 
> these days), so I suggest that we either prohibit the above outright, or at 
> least add a safety config setting that would disallow it by default.





[jira] [Commented] (HIVE-15199) INSERT INTO data on S3 is replacing the old rows with the new ones

2016-11-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-15199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665000#comment-15665000
 ] 

Sergio Peña commented on HIVE-15199:


This issue is happening on the following code (Hive.java):
{noformat}
private static void copyFiles(...) {
  ...
  if (renameNonLocal) {
    for (int counter = 1; !destFs.rename(srcP, destPath); counter++) {
      destPath = new Path(destf, name + ("_copy_" + counter) + filetype);
    }
  } else {
    destPath = mvFile(conf, srcP, destPath, isSrcLocal, srcFs, destFs, name,
        filetype);
  }
  ...
}
{noformat}

Even if the destination file already exists on S3, the {{destFs.rename()}} call 
succeeds and overwrites it. 
This does not happen with HDFS: if the destination file exists on HDFS, the 
rename fails, a _copy_ suffix is appended to the filename, and the rename is 
retried.
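The difference in rename semantics can be sketched with a toy filesystem (an illustration only; the dict-based FS and `place_file` are hypothetical, not Hadoop's FileSystem API):

```python
# Toy model of the copyFiles loop: an HDFS-like rename refuses to clobber an
# existing destination, so the loop retries with a _copy_N suffix. An S3A-like
# rename that silently overwrites makes the loop exit on the first iteration,
# replacing the old data, which matches the INSERT INTO bug reported here.
def place_file(fs, src, dest, overwriting_rename):
    counter = 1
    while True:
        if dest not in fs or overwriting_rename:
            fs[dest] = fs.pop(src)  # rename succeeded
            return dest
        # rename failed: retry with a _copy_ suffix, mirroring Hive.java
        dest = f"{dest}_copy_{counter}"
        counter += 1

# HDFS-like: the old file survives and the new data lands beside it.
hdfs = {"t1/000000_0": "old", "scratch/x": "new"}
place_file(hdfs, "scratch/x", "t1/000000_0", overwriting_rename=False)
```

With `overwriting_rename=True` the same call would replace the existing file outright, losing the previously inserted rows.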

[~ste...@apache.org] Do you know if this is a known bug on the Hadoop side?

> INSERT INTO data on S3 is replacing the old rows with the new ones
> --
>
> Key: HIVE-15199
> URL: https://issues.apache.org/jira/browse/HIVE-15199
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Critical
>
> Any INSERT INTO statement run on S3 tables and when the scratch directory is 
> saved on S3 is deleting old rows of the table.
> {noformat}
> hive> set hive.blobstore.use.blobstore.as.scratchdir=true;
> hive> create table t1 (id int, name string) location 's3a://spena-bucket/t1';
> hive> insert into table t1 values (1,'name1');
> hive> select * from t1;
> 1   name1
> hive> insert into table t1 values (2,'name2');
> hive> select * from t1;
> 2   name2
> {noformat}





[jira] [Commented] (HIVE-15180) Extend JSONMessageFactory to store additional information about Table metadata objects on different table events

2016-11-14 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664983#comment-15664983
 ] 

Sushanth Sowmyan commented on HIVE-15180:
-

+1 on the code changes; they look good to me. Could you look into why we have 
test failures on TestHCatClientNotification.createTable, 
TestHCatClientNotification.dropTable, TestDbNotificationListener.alterTable and 
TestDbNotificationListener.dropTable? I think those tests might need further 
updates.

> Extend JSONMessageFactory to store additional information about Table 
> metadata objects on different table events
> 
>
> Key: HIVE-15180
> URL: https://issues.apache.org/jira/browse/HIVE-15180
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
> Attachments: HIVE-15180.1.patch
>
>
> We want the {{NOTIFICATION_LOG}} table to capture additional information 
> about the metadata objects when {{DbNotificationListener}} captures different 
> events for a table (create/drop/alter).





[jira] [Commented] (HIVE-15167) remove SerDe interface; undeprecate Deserializer and Serializer

2016-11-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664980#comment-15664980
 ] 

Sergey Shelukhin commented on HIVE-15167:
-

All the tests are known flaky tests (PPD one is 
https://issues.apache.org/jira/browse/HIVE-14936, the rest are very old).



> remove SerDe interface; undeprecate Deserializer and Serializer
> ---
>
> Key: HIVE-15167
> URL: https://issues.apache.org/jira/browse/HIVE-15167
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15167.patch
>
>
> SerDe interfaces were deprecated in HIVE-4007 to suggest that users do not 
> implement them. However, this results in deprecation warnings all over the 
> codebase where they are actually used.
> We should un-deprecate (reprecate? precate?) them. We can add a comment for 
> implementers instead (we could add a method with a clearly bogus name like 
> useThisAbstractClassInstead, and implement it in the class, so it would be 
> noticeable, but that would break compat).





[jira] [Updated] (HIVE-14089) complex type support in LLAP IO is broken

2016-11-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14089:

Attachment: HIVE-14089.12.patch

Addressing the RB feedback

> complex type support in LLAP IO is broken 
> --
>
> Key: HIVE-14089
> URL: https://issues.apache.org/jira/browse/HIVE-14089
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14089.04.patch, HIVE-14089.05.patch, 
> HIVE-14089.06.patch, HIVE-14089.07.patch, HIVE-14089.08.patch, 
> HIVE-14089.09.patch, HIVE-14089.10.patch, HIVE-14089.10.patch, 
> HIVE-14089.10.patch, HIVE-14089.11.patch, HIVE-14089.12.patch, 
> HIVE-14089.WIP.2.patch, HIVE-14089.WIP.3.patch, HIVE-14089.WIP.patch
>
>
> HIVE-13617 is causing MiniLlapCliDriver following test failures
> {code}
> org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
> org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
> {code}





[jira] [Commented] (HIVE-15194) Hive on Tez - Hive Runtime Error while closing operators

2016-11-14 Thread Shankar M (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15663871#comment-15663871
 ] 

Shankar M commented on HIVE-15194:
--

Thank you very much. This is much-needed help, and I hope the patch will be 
included in future versions.

> Hive on Tez - Hive Runtime Error while closing operators
> 
>
> Key: HIVE-15194
> URL: https://issues.apache.org/jira/browse/HIVE-15194
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Tez
>Affects Versions: 2.1.0
> Environment: Hive 2.1.0 
> Tez 0.8.4
> 4 Nodes x CentOS-6 x64 (32GB Memory, 8 CPUs)
> Hadoop 2.7.1
>Reporter: Shankar M
>
> Please help me to solve below issue.. 
> --
> I am setting below commands in hive CLI: 
> set hive.execution.engine=tez;
> set hive.vectorized.execution.enabled = true;
> set hive.vectorized.execution.reduce.enabled = true;
> set hive.cbo.enable=true;
> set hive.compute.query.using.stats=true;
> set hive.stats.fetch.column.stats=true;
> set hive.stats.fetch.partition.stats=true;
> SET hive.tez.container.size=4096;
> SET hive.tez.java.opts=-Xmx3072m;
> --
> {code}
> hive> CREATE TABLE tmp_parquet_newtable STORED AS PARQUET AS 
> > select a.* from orc_very_large_table a where a.event = 1 and EXISTS 
> (SELECT 1 FROM tmp_small_parquet_table b WHERE b.session_id = a.session_id ) ;
> Query ID = hadoop_20161114132930_65843cb3-557c-4b42-b662-2901caf5be2d
> Total jobs = 1
> Launching Job 1 out of 1
> Status: Running (Executing on YARN cluster with App id 
> application_1479059955967_0049)
> --
> VERTICES      MODE       STATUS     TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> --
> Map 1 .       container  FAILED       384          4       40      340      26       0
> Map 2 ..      container  SUCCEEDED      1          1        0        0       0       0
> --
> VERTICES: 01/02  [===>>---] 11%   ELAPSED TIME: 43.76 s
> --
> Status: Failed
> Vertex failed, vertexName=Map 1, vertexId=vertex_1479059955967_0049_2_01, 
> diagnostics=[Task failed, taskId=task_1479059955967_0049_2_01_48, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( 
> failure ) : 
> attempt_1479059955967_0049_2_01_48_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: Hive Runtime Error while closing operators
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:198)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>   at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Hive Runtime Error while closing 
> operators
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:422)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
>   ... 14 more
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:513)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:682)
>   at