[jira] [Commented] (HIVE-17144) export of temporary tables not working and it seems to be using distcp rather than filesystem copy

2017-08-01 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110358#comment-16110358
 ] 

anishek commented on HIVE-17144:


* org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2]: runs fine on the local machine after increasing the mvn test memory
* org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]: runs fine on the local machine.
* org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]: runs fine on the local machine.


Other failures are from previous builds.

[~daijy] can you please review!

> export of temporary tables not working and it seems to be using distcp rather 
> than filesystem copy
> --
>
> Key: HIVE-17144
> URL: https://issues.apache.org/jira/browse/HIVE-17144
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-17144.1.patch
>
>
> create temporary table t1 (i int);
> insert into t1 values (3);
> export table t1 to 'hdfs://somelocation';
> The above fails. Additionally, it should use a filesystem copy rather than 
> distcp to do the job.
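
For illustration, a minimal sketch of the plain FileSystem copy the description asks for instead of launching distcp (the paths and class name here are placeholders, not the actual export layout):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class FsCopySketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path src = new Path("/tmp/t1/data");           // placeholder source
    Path dst = new Path("hdfs://somelocation/t1"); // placeholder destination
    // Plain filesystem copy; no MR job is launched, unlike distcp.
    FileUtil.copy(src.getFileSystem(conf), src,
        dst.getFileSystem(conf), dst,
        false /* deleteSource */, conf);
  }
}
{code}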



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17217) SMB Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger

2017-08-01 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110331#comment-16110331
 ] 

Gopal V commented on HIVE-17217:


Thanks, LGTM - +1.

> SMB Join : Assert if paths are different in TezGroupedSplit in 
> KeyValueInputMerger
> --
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-17217.1.patch, HIVE-17217.2.patch
>
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split. 
> However, the splits should all belong to the same path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17213) HoS: file merging doesn't work for union all

2017-08-01 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-17213:

Attachment: HIVE-17213.3.patch

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch, 
> HIVE-17213.2.patch, HIVE-17213.3.patch
>
>
> HoS file merging doesn't work properly since it doesn't set the linked file 
> sinks that are used to generate move tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17217) SMB Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger

2017-08-01 Thread Deepak Jaiswal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-17217:
--
Attachment: HIVE-17217.2.patch

Updated the patch with cleaner code.

> SMB Join : Assert if paths are different in TezGroupedSplit in 
> KeyValueInputMerger
> --
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-17217.1.patch, HIVE-17217.2.patch
>
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split. 
> However, the splits should all belong to the same path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17213) HoS: file merging doesn't work for union all

2017-08-01 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-17213:

Attachment: (was: HIVE-17213.3.patch)

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch, 
> HIVE-17213.2.patch
>
>
> HoS file merging doesn't work properly since it doesn't set the linked file 
> sinks that are used to generate move tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17227) Incremental replication load should create tasks in execution phase rather than semantic phase

2017-08-01 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek reassigned HIVE-17227:
--

Assignee: anishek

> Incremental replication load should create tasks in execution phase rather 
> than semantic phase 
> 
>
> Key: HIVE-17227
> URL: https://issues.apache.org/jira/browse/HIVE-17227
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
>
> As we did for bootstrap replication load in HIVE-16896, we should use a 
> mechanism to dynamically create the task DAG for incremental replication as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17213) HoS: file merging doesn't work for union all

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110312#comment-16110312
 ] 

Hive QA commented on HIVE-17213:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879946/HIVE-17213.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6224/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6224/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6224/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-08-02 05:07:15.923
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6224/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-08-02 05:07:15.926
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 5c147f0 HIVE-17209: ObjectCacheFactory should return null when 
tez shared object registry is not setup (Rajesh Balamohan, reviewed by Sergey 
Shelukhin)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 5c147f0 HIVE-17209: ObjectCacheFactory should return null when 
tez shared object registry is not setup (Rajesh Balamohan, reviewed by Sergey 
Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-08-02 05:07:22.269
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java:181
error: 
llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java:
 patch does not apply
error: patch failed: ql/src/test/queries/clientpositive/llap_smb.q:1
error: ql/src/test/queries/clientpositive/llap_smb.q: patch does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879946 - PreCommit-HIVE-Build

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch, 
> HIVE-17213.2.patch, HIVE-17213.3.patch
>
>
> HoS file merging doesn't work properly since it doesn't set the linked file 
> sinks that are used to generate move tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17217) SMB Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger

2017-08-01 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110309#comment-16110309
 ] 

Gopal V commented on HIVE-17217:


The patch took me a few reads to understand.

If someone removes the assert, it ends up with the last path replacing all 
others, which might not be obvious.

{code}
for (int i = 0; i < splits.size(); i++) {
{code}

is better written starting at i = 1, so that the loop only compares and doesn't 
do a put().

+1, with that minor nit.
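
For illustration, a self-contained sketch of the suggested shape (the helper and class names here are assumptions, not the actual KeyValueInputMerger code):

{code}
import java.util.List;

import org.apache.hadoop.fs.Path;

public class SinglePathCheckSketch {
  // Starting at i = 1 means the loop body only compares against the first
  // path and never has to put()/replace an already-recorded path.
  static void assertSinglePath(List<Path> splitPaths) {
    Path first = splitPaths.get(0);
    for (int i = 1; i < splitPaths.size(); i++) {
      assert first.equals(splitPaths.get(i))
          : "TezGroupedSplit contains more than one path: " + splitPaths.get(i);
    }
  }
}
{code}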

> SMB Join : Assert if paths are different in TezGroupedSplit in 
> KeyValueInputMerger
> --
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
> Attachments: HIVE-17217.1.patch
>
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split. 
> However, the splits should all belong to the same path.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17089) make acid 2.0 the default

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110308#comment-16110308
 ] 

Hive QA commented on HIVE-17089:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879936/HIVE-17089.03.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 11003 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=236)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=74)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_3]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.ql.io.TestAcidUtils.testAcidOperationalPropertiesSettersAndGetters
 (batchId=262)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testEmpty (batchId=265)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testNewBaseAndDelta 
(batchId=265)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderDelta 
(batchId=265)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderIncompleteDelta
 (batchId=265)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta
 (batchId=265)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderOldBaseAndDelta
 (batchId=265)
org.apache.hadoop.hive.ql.io.orc.TestOrcRecordUpdater.testUpdates (batchId=265)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits
 (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testMultipleTransactionBatchCommits
 (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbortAndCommit
 (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Delimited
 (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_DelimitedUGI
 (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
 (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Regex
 (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_RegexUGI
 (batchId=191)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testMulti (batchId=191)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchCommitPartitioned
 (batchId=191)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchCommitUnpartitioned
 (batchId=191)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testUpdatesAndDeletes 
(batchId=191)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6223/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6223/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6223/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879936 - PreCommit-HIVE-Build

> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch
>
>
> acid 2.0 was introduced in HIVE-14035.  It replaces Update events with a 
> combination of Delete + Insert events.  This now makes U=D+I the default (and 
> only) supported acid table type in Hive 3.0.  
> The expectation for upgrade is that Major compaction has to be run on all 
> acid tables in the existing Hive cluster and that no new writes to these 
> tables take place after the start of compaction (Need to add a mechanism to 
> put a table in read-only mode - this way it can still be read while it's 
> being compacted).  Then the upgrade to Hive 3.0 can take place.
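
For illustration, one way to trigger that major compaction from a client via HiveServer2 JDBC (the URL, class name, and table name are placeholders):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CompactBeforeUpgrade {
  public static void main(String[] args) throws Exception {
    // Placeholder connection string; adjust host/port/database as needed.
    try (Connection con = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default");
         Statement stmt = con.createStatement()) {
      // Queue a major compaction for the (placeholder) acid table.
      stmt.execute("ALTER TABLE acid_tbl COMPACT 'major'");
    }
  }
}
{code}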



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-12631) LLAP: support ORC ACID tables

2017-08-01 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-12631:
--
Attachment: HIVE-12631.26.patch

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Transactions
>Reporter: Sergey Shelukhin
>Assignee: Teddy Choi
> Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch, 
> HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch, 
> HIVE-12631.13.patch, HIVE-12631.15.patch, HIVE-12631.16.patch, 
> HIVE-12631.17.patch, HIVE-12631.18.patch, HIVE-12631.19.patch, 
> HIVE-12631.1.patch, HIVE-12631.20.patch, HIVE-12631.21.patch, 
> HIVE-12631.22.patch, HIVE-12631.23.patch, HIVE-12631.24.patch, 
> HIVE-12631.25.patch, HIVE-12631.26.patch, HIVE-12631.2.patch, 
> HIVE-12631.3.patch, HIVE-12631.4.patch, HIVE-12631.5.patch, 
> HIVE-12631.6.patch, HIVE-12631.7.patch, HIVE-12631.8.patch, 
> HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and 
> parallelization of reads and processing. This path does not support ACID. As 
> far as I remember ACID logic is embedded inside ORC format; we need to 
> refactor it to be on top of some interface, if practical; or just port it to 
> LLAP read path.
> Another consideration is how the logic will work with cache. The cache is 
> currently low-level (CB-level in ORC), so we could just use it to read bases 
> and deltas (deltas should be cached with higher priority) and merge as usual. 
> We could also cache merged representation in future.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16896) move replication load related work in semantic analysis phase to execution phase using a task

2017-08-01 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110279#comment-16110279
 ] 

anishek commented on HIVE-16896:


Not sure why the pull request is not shown here: 
https://github.com/apache/hive/pull/214


> move replication load related work in semantic analysis phase to execution 
> phase using a task
> -
>
> Key: HIVE-16896
> URL: https://issues.apache.org/jira/browse/HIVE-16896
> Project: Hive
>  Issue Type: Sub-task
>Reporter: anishek
>Assignee: anishek
> Attachments: HIVE-16896.1.patch
>
>
> We want to avoid creating too many tasks in memory in the analysis phase while 
> loading data. Currently we load all the files in the bootstrap dump location 
> as {{FileStatus[]}} and then iterate over them to load objects; we should 
> rather move to 
> {code}
> org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f, boolean recursive)
> {code}
> which would internally batch and return values. 
> Additionally, since we can't hand off partial tasks from the analysis phase to the 
> execution phase, we are going to move the whole repl load functionality to the 
> execution phase so we can better control creation/execution of tasks (not 
> related to the hive {{Task}}; we may get rid of ReplCopyTask).
> An additional consideration at the end of this jira is to 
> see if we want to specifically do a multi-threaded load of the bootstrap dump.
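
For reference, a minimal sketch of iterating a dump location with the batched listing API quoted above (the path and class name are placeholders):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class BatchedListingSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path dumpRoot = new Path("/repl/bootstrap/dump"); // placeholder path
    FileSystem fs = dumpRoot.getFileSystem(conf);
    // listFiles() batches results internally instead of materializing a
    // FileStatus[] for the whole dump up front.
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(dumpRoot, true);
    while (it.hasNext()) {
      LocatedFileStatus status = it.next();
      System.out.println(status.getPath()); // load the object for this file here
    }
  }
}
{code}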



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList

2017-08-01 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110257#comment-16110257
 ] 

Deepak Jaiswal commented on HIVE-17172:
---

Thanks for adding comments. As far as I understand, it looks good. An RB link 
would have been more helpful in understanding the code, though.

+1

> add ordering checks to DiskRangeList
> 
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17172.01.patch, HIVE-17172.02.patch, 
> HIVE-17172.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17170) Move thrift generated code to stand alone metastore

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110227#comment-16110227
 ] 

Hive QA commented on HIVE-17170:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879930/HIVE-17170.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=236)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout 
(batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6222/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6222/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6222/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879930 - PreCommit-HIVE-Build

> Move thrift generated code to stand alone metastore
> ---
>
> Key: HIVE-17170
> URL: https://issues.apache.org/jira/browse/HIVE-17170
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17170.2.patch, HIVE-17170.patch
>
>
> hive_metastore.thrift and the code it generates needs to be moved into the 
> standalone metastore module.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17225) HoS DPP pruning sink ops can target parallel work objects

2017-08-01 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110187#comment-16110187
 ] 

Sahil Takiar edited comment on HIVE-17225 at 8/2/17 3:31 AM:
-

The query above fails because the {{Spark Partition Pruning Sink Operator}} in 
Map 3 has a target work of Map 1. This means that when Map 1 runs, it will 
look for a tmp file on HDFS that contains all the partitions it should scan. 
The problem is that Map 1 and Map 3 will run in parallel, so Map 1 will fail 
with a FNF exception.

Here is a brief explanation of what's happening in the query above. There 
are three tables: pt1, rt1 and rt2. pt1 is partitioned, rt1 and rt2 aren't. In 
terms of data size pt1 < rt1 = rt2. Map-joins are enabled. Since pt1 is the 
smallest table, it is scanned and written to a hash table. rt2 is also scanned 
and written to a hash table. rt1 is treated as the big table in the map-join. 
The hashtables for pt1 and rt2 are generated in the same Spark job. If DPP is 
enabled, then the scan for rt2 will result in a pruning sink targeting the scan 
for pt1. This will cause the FNF exception shown above, because the scans for 
rt2 and pt1 run in parallel.


was (Author: stakiar):
Here is another example of when this can happen. Say there are three tables: 
pt1, pt2 and r1. pt1 and pt2 are partitioned and r1 is not. In terms of data 
size pt1 < r1 < pt2. If map-joins are enabled, and all three tables are joined 
the following scenario may occur. pt1 is scanned and written to a hash table, 
r1 is scanned and written to a hash table. pt2 is treated as the big table in 
the map-join. The hashtables for pt1 and r1 are generated in the same Spark 
job. If DPP is enabled, then the scan for r1 will result in a pruning sink 
targeting the scan for pt1. This will cause the FNF exception shown above, 
because the scans for r1 and pt1 run in parallel.

> HoS DPP pruning sink ops can target parallel work objects
> -
>
> Key: HIVE-17225
> URL: https://issues.apache.org/jira/browse/HIVE-17225
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Setup:
> {code:sql}
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.strict.checks.cartesian.product=false;
> SET hive.auto.convert.join=true;
> CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
> CREATE TABLE regular_table1 (col int);
> CREATE TABLE regular_table2 (col int);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);
> INSERT INTO table regular_table1 VALUES (1), (2), (3), (4), (5), (6);
> INSERT INTO table regular_table2 VALUES (1), (2), (3), (4), (5), (6);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (2);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (3);
> SELECT *
> FROM   partitioned_table1,
>regular_table1 rt1,
>regular_table2 rt2
> WHERE  rt1.col = partitioned_table1.part_col
>AND rt2.col = partitioned_table1.part_col;
> {code}
> Exception:
> {code}
> 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] 
> ql.Driver: FAILED: Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.FileNotFoundException: File 
> file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
>  does not exist
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
>   at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at 

[jira] [Updated] (HIVE-17225) HoS DPP pruning sink ops can target parallel work objects

2017-08-01 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17225:

Description: 
Setup:

{code:sql}
SET hive.spark.dynamic.partition.pruning=true;
SET hive.strict.checks.cartesian.product=false;
SET hive.auto.convert.join=true;

CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);

CREATE TABLE regular_table1 (col int);
CREATE TABLE regular_table2 (col int);

ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);

INSERT INTO table regular_table1 VALUES (1), (2), (3), (4), (5), (6);
INSERT INTO table regular_table2 VALUES (1), (2), (3), (4), (5), (6);

INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1);
INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (2);
INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (3);

SELECT *
FROM   partitioned_table1,
   regular_table1 rt1,
   regular_table2 rt2
WHERE  rt1.col = partitioned_table1.part_col
   AND rt2.col = partitioned_table1.part_col;
{code}

Exception:

{code}
2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] 
ql.Driver: FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.FileNotFoundException: File 
file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
 does not exist
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at 

[jira] [Commented] (HIVE-17225) HoS DPP pruning sink ops can target parallel work objects

2017-08-01 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110189#comment-16110189
 ] 

Sahil Takiar commented on HIVE-17225:
-

A simple solution would be to add a rule that removes a DPP branch whenever the 
target work is in a parallel work object. A trickier solution would be to split 
the work objects into different stages (and thus different Spark jobs), or add 
some other dependency between the work objects (not sure if a Map Work can have 
a dependency on another Map Work).
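
For illustration, a toy model of the simple rule (all names here are hypothetical, not the actual Hive operator APIs): keep only DPP branches whose target work runs in a different Spark job than the work containing the pruning sink.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DppBranchRuleSketch {
  // A DPP branch: the work holding the pruning sink and the work it targets.
  static final class Branch {
    final String sourceWork;
    final String targetWork;
    Branch(String sourceWork, String targetWork) {
      this.sourceWork = sourceWork;
      this.targetWork = targetWork;
    }
  }

  // Keep only branches whose target runs in a different Spark job; a branch
  // within the same job may run in parallel with its target, so the pruning
  // output file might not exist yet when the target scan starts.
  static List<Branch> keepSafeBranches(List<Branch> branches,
                                       Map<String, Integer> workToSparkJob) {
    List<Branch> kept = new ArrayList<>();
    for (Branch b : branches) {
      if (!workToSparkJob.get(b.sourceWork)
              .equals(workToSparkJob.get(b.targetWork))) {
        kept.add(b);
      }
    }
    return kept;
  }
}
{code}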

> HoS DPP pruning sink ops can target parallel work objects
> -
>
> Key: HIVE-17225
> URL: https://issues.apache.org/jira/browse/HIVE-17225
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Setup:
> {code:sql}
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.strict.checks.cartesian.product=false;
> SET hive.auto.convert.join=true;
> CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
> CREATE TABLE regular_table1 (col1 int, col2 int);
> CREATE TABLE regular_table2 (col1 int, col2 int);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);
> INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2);
> INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), 
> (2), (3);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), 
> (2), (3);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), 
> (2), (3);
> SELECT * 
> FROM   regular_table1, 
>regular_table2, 
>partitioned_table1 
> WHERE  partitioned_table1.part_col IN (SELECT regular_table1.col2 
>FROM   regular_table1 
>WHERE  regular_table1.col1 > 0) 
>AND partitioned_table1.part_col IN (SELECT regular_table2.col2 
>FROM   regular_table2 
>WHERE  regular_table2.col1 > 1); 
> {code}
> Exception:
> {code}
> 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] 
> ql.Driver: FAILED: Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.FileNotFoundException: File 
> file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
>  does not exist
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
>   at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.immutable.List.map(List.scala:285)
>   at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> 

[jira] [Commented] (HIVE-17225) HoS DPP pruning sink ops can target parallel work objects

2017-08-01 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110187#comment-16110187
 ] 

Sahil Takiar commented on HIVE-17225:
-

Here is another example of when this can happen. Say there are three tables: 
pt1, pt2 and r1. pt1 and pt2 are partitioned and r1 is not. In terms of data 
size pt1 < r1 < pt2. If map-joins are enabled, and all three tables are joined 
the following scenario may occur. pt1 is scanned and written to a hash table, 
r1 is scanned and written to a hash table. pt2 is treated as the big table in 
the map-join. The hashtables for pt1 and r1 are generated in the same Spark 
job. If DPP is enabled, then the scan for r1 will result in a pruning sink 
targeting the scan for pt1. This will cause the FNF exception shown above, 
because the scans for r1 and pt1 run in parallel.

> HoS DPP pruning sink ops can target parallel work objects
> -
>
> Key: HIVE-17225
> URL: https://issues.apache.org/jira/browse/HIVE-17225
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Setup:
> {code:sql}
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.strict.checks.cartesian.product=false;
> SET hive.auto.convert.join=true;
> CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
> CREATE TABLE regular_table1 (col1 int, col2 int);
> CREATE TABLE regular_table2 (col1 int, col2 int);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);
> INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2);
> INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), 
> (2), (3);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), 
> (2), (3);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), 
> (2), (3);
> SELECT * 
> FROM   regular_table1, 
>regular_table2, 
>partitioned_table1 
> WHERE  partitioned_table1.part_col IN (SELECT regular_table1.col2 
>FROM   regular_table1 
>WHERE  regular_table1.col1 > 0) 
>AND partitioned_table1.part_col IN (SELECT regular_table2.col2 
>FROM   regular_table2 
>WHERE  regular_table2.col1 > 1); 
> {code}
> Exception:
> {code}
> 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] 
> ql.Driver: FAILED: Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.FileNotFoundException: File 
> file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
>  does not exist
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
>   at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.immutable.List.map(List.scala:285)
>   at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at 

[jira] [Updated] (HIVE-17213) HoS: file merging doesn't work for union all

2017-08-01 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-17213:

Attachment: HIVE-17213.3.patch

Test failures are unrelated to patch v2. Attaching patch v3 with a qtest. [~xuefuz], 
can you take another look?

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch, 
> HIVE-17213.2.patch, HIVE-17213.3.patch
>
>
> HoS file merging doesn't work properly since it doesn't set the linked file 
> sinks that are used to generate move tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17225) HoS DPP pruning sink ops can target parallel work objects

2017-08-01 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17225:

Summary: HoS DPP pruning sink ops can target parallel work objects  (was: 
HoS DPP throws FileNotFoundException in HiveInputFormat#init when target work 
is in the same Spark job)

> HoS DPP pruning sink ops can target parallel work objects
> -
>
> Key: HIVE-17225
> URL: https://issues.apache.org/jira/browse/HIVE-17225
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Setup:
> {code:sql}
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.strict.checks.cartesian.product=false;
> SET hive.auto.convert.join=true;
> CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
> CREATE TABLE regular_table1 (col1 int, col2 int);
> CREATE TABLE regular_table2 (col1 int, col2 int);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);
> INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2);
> INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), 
> (2), (3);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), 
> (2), (3);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), 
> (2), (3);
> SELECT * 
> FROM   regular_table1, 
>regular_table2, 
>partitioned_table1 
> WHERE  partitioned_table1.part_col IN (SELECT regular_table1.col2 
>FROM   regular_table1 
>WHERE  regular_table1.col1 > 0) 
>AND partitioned_table1.part_col IN (SELECT regular_table2.col2 
>FROM   regular_table2 
>WHERE  regular_table2.col1 > 1); 
> {code}
> Exception:
> {code}
> 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] 
> ql.Driver: FAILED: Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.FileNotFoundException: File 
> file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
>  does not exist
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
>   at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.immutable.List.map(List.scala:285)
>   at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 

[jira] [Updated] (HIVE-17225) HoS DPP throws FileNotFoundException in HiveInputFormat#init when target work is in the same Spark job

2017-08-01 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17225:

Description: 
Setup:

{code:sql}
SET hive.spark.dynamic.partition.pruning=true;
SET hive.strict.checks.cartesian.product=false;
SET hive.auto.convert.join=true;

CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
CREATE TABLE regular_table1 (col1 int, col2 int);
CREATE TABLE regular_table2 (col1 int, col2 int);

ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);

INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2);
INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2);

INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), (2), 
(3);
INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), (2), 
(3);
INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), (2), 
(3);

SELECT * 
FROM   regular_table1, 
   regular_table2, 
   partitioned_table1 
WHERE  partitioned_table1.part_col IN (SELECT regular_table1.col2 
   FROM   regular_table1 
   WHERE  regular_table1.col1 > 0) 
   AND partitioned_table1.part_col IN (SELECT regular_table2.col2 
   FROM   regular_table2 
   WHERE  regular_table2.col1 > 1); 
{code}

Exception:

{code}
2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] 
ql.Driver: FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.FileNotFoundException: File 
file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
 does not exist
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)

[jira] [Updated] (HIVE-17209) ObjectCacheFactory should return null when tez shared object registry is not setup

2017-08-01 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-17209:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Created ORC-221 for the ORC-related change and it got committed as well.

Thanks [~sershe]. Committed this patch to master.

> ObjectCacheFactory should return null when tez shared object registry is not 
> setup
> --
>
> Key: HIVE-17209
> URL: https://issues.apache.org/jira/browse/HIVE-17209
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-17209.1.patch
>
>
> HIVE-15269 introduced the dynamic min/max bloom filter 
> ("hive.tez.dynamic.semijoin.reduction=true"). This needs to access the 
> ObjectCache, and in tez the ObjectCache can only be created by {{TezProcessor}}.
> In the following case {{AM --> splits --> 
> OrcInputFormat.pickStripes::evaluatePredicateMinMax --> 
> DynamicValue.getLiteral --> objectCache access}}, the AM ends up throwing lots of 
> NPEs since the AM has not created the ObjectCache.  
> The ORC reader catches these exceptions, skips PPD and proceeds further. For example, 
> in Q95 it ends up throwing ~30,000 NPEs before completing split information.
> ObjectCacheFactory should return null when the tez shared object registry is not 
> set up. 
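
For illustration, a self-contained toy of the guard being described (all names are hypothetical, not the actual ObjectCacheFactory code): the factory hands back null when the per-thread registry was never set up, so AM-side callers can skip caching instead of hitting an NPE per access.

{code}
import java.util.HashMap;
import java.util.Map;

public class CacheFactorySketch {
  // Populated only where a processor runs; never in the AM.
  private static final ThreadLocal<Map<String, Object>> REGISTRY =
      new ThreadLocal<>();

  static void setupRegistry() {
    REGISTRY.set(new HashMap<>());
  }

  // Returns null instead of constructing a cache over a missing registry;
  // callers (e.g. AM-side split generation) must handle the null and skip PPD.
  static Map<String, Object> getCacheOrNull() {
    return REGISTRY.get();
  }
}
{code}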



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17225) HoS DPP throws FileNotFoundException in HiveInputFormat#init when target work is in the same Spark job

2017-08-01 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17225:

Summary: HoS DPP throws FileNotFoundException in HiveInputFormat#init when 
target work is in the same Spark job  (was: FileNotFoundException in 
HiveInputFormat#init for query HoS DPP query with multiple left semi-joins 
against the same partition column)

> HoS DPP throws FileNotFoundException in HiveInputFormat#init when target work 
> is in the same Spark job
> --
>
> Key: HIVE-17225
> URL: https://issues.apache.org/jira/browse/HIVE-17225
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Setup:
> {code:sql}
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.strict.checks.cartesian.product=false;
> SET hive.auto.convert.join=true;
> CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
> CREATE TABLE regular_table1 (col1 int, col2 int);
> CREATE TABLE regular_table2 (col1 int, col2 int);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);
> INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2);
> INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), 
> (2), (3);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), 
> (2), (3);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), 
> (2), (3);
> SELECT * 
> FROM   regular_table1, 
>regular_table2, 
>partitioned_table1 
> WHERE  partitioned_table1.part_col IN (SELECT regular_table1.col2 
>FROM   regular_table1 
>WHERE  regular_table1.col1 > 0) 
>AND partitioned_table1.part_col IN (SELECT regular_table2.col2 
>FROM   regular_table2 
>WHERE  regular_table2.col1 > 1); 
> {code}
> Exception:
> {code}
> 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] 
> ql.Driver: FAILED: Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.FileNotFoundException: File 
> file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
>  does not exist
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
>   at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.immutable.List.map(List.scala:285)
>   at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> 

[jira] [Updated] (HIVE-17225) FileNotFoundException in HiveInputFormat#init for query HoS DPP query with multiple left semi-joins against the same partition column

2017-08-01 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17225:

Description: 
Setup:

{code:sql}
SET hive.spark.dynamic.partition.pruning=true;
SET hive.strict.checks.cartesian.product=false;
SET hive.auto.convert.join=true;

CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
CREATE TABLE regular_table1 (col1 int, col2 int);
CREATE TABLE regular_table2 (col1 int, col2 int);

ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);

INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2);
INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2);

INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), (2), 
(3);
INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), (2), 
(3);
INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), (2), 
(3);

SELECT * 
FROM   regular_table1, 
   regular_table2, 
   partitioned_table1 
WHERE  partitioned_table1.part_col IN (SELECT regular_table1.col2 
   FROM   regular_table1 
   WHERE  regular_table1.col1 > 0) 
   AND partitioned_table1.part_col IN (SELECT regular_table2.col2 
   FROM   regular_table2 
   WHERE  regular_table2.col1 > 1); 
{code}

Exception:

{code}
2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] 
ql.Driver: FAILED: Execution Error, return code 3 from 
org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.FileNotFoundException: File 
file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
 does not exist
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)

[jira] [Commented] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110143#comment-16110143
 ] 

Hive QA commented on HIVE-17220:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879912/HIVE-17220.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11058 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=236)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterByte (batchId=178)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterBytes (batchId=178)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterDouble 
(batchId=178)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterFloat (batchId=178)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterInt (batchId=178)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterLong (batchId=178)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterString 
(batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6221/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6221/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6221/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879912 - PreCommit-HIVE-Build

> Bloomfilter probing in semijoin reduction is thrashing L1 dcache
> 
>
> Key: HIVE-17220
> URL: https://issues.apache.org/jira/browse/HIVE-17220
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17220.1.patch, HIVE-17220.WIP.patch
>
>
> [~gopalv] observed perf profiles showing bloomfilter probes as the bottleneck for 
> some of the TPC-DS queries, resulting in L1 data cache thrashing. 
> This is because the huge bitset in the bloom filter doesn't fit in any 
> level of cache; also, the hash bits corresponding to a single key map to 
> different segments of the bitset, which are spread out. This can result in K-1 
> memory accesses (K being the number of hash functions) in the worst case for 
> every key that gets probed, because of locality misses in the L1 cache. 
> Ran a JMH microbenchmark to verify this. The following is the JMH perf 
> profile for bloom filter probing:
> {code}
> Perf stats:
> --
>5101.935637  task-clock (msec) #0.461 CPUs utilized
>346  context-switches  #0.068 K/sec
>336  cpu-migrations#0.066 K/sec
>  6,207  page-faults   #0.001 M/sec
> 10,016,486,301  cycles#1.963 GHz  
> (26.90%)
>  5,751,692,176  stalled-cycles-frontend   #   57.42% frontend cycles 
> idle (27.05%)
>  stalled-cycles-backend
> 14,359,914,397  instructions  #1.43  insns per cycle
>   #0.40  stalled cycles 
> per insn  (33.78%)
>  2,200,632,861  branches  #  431.333 M/sec
> (33.84%)
>  1,162,860  branch-misses #0.05% of all branches  
> (33.97%)
>  1,025,992,254  L1-dcache-loads   #  201.099 M/sec
> (26.56%)
>432,663,098  L1-dcache-load-misses #   42.17% of all L1-dcache 
> hits(14.49%)
>331,383,297  LLC-loads #   64.952 M/sec
> (14.47%)
>203,524  LLC-load-misses   #0.06% of all LL-cache 
> hits (21.67%)
>  L1-icache-loads
>  1,633,821  L1-icache-load-misses #0.320 M/sec
> (28.85%)
>950,368,796  dTLB-loads#  186.276 M/sec
> (28.61%)
>246,813,393   

[jira] [Commented] (HIVE-16820) TezTask may not shut down correctly before submit

2017-08-01 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110071#comment-16110071
 ] 

Mithun Radhakrishnan commented on HIVE-16820:
-

[~sershe], I wonder if a similar fix should go into 
{{MergeFileTask::execute()}}, to check for cancellation before job-submission. 

> TezTask may not shut down correctly before submit
> -
>
> Key: HIVE-16820
> URL: https://issues.apache.org/jira/browse/HIVE-16820
> Project: Hive
>  Issue Type: Bug
>Reporter: Visakh Nair
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-16820.01.patch, HIVE-16820.patch
>
>
> The query will run and only fail at the very end when the driver checks its 
> own shutdown flag.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17133) NoSuchMethodError in Hadoop FileStatus.compareTo

2017-08-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110072#comment-16110072
 ] 

Sergey Shelukhin commented on HIVE-17133:
-

The solution would be to upgrade to 2.8.2 once that is released. We'd always 
build with the old signature and thus support old and new versions but not the 
ones in the middle (need to verify that we do indeed refer to a method with old 
signature when building).
Thus, Hive will not support Hadoop 2.8.0 and 2.8.1.
The problem is if we release something referring to the new signature soon (or 
have already). We might need 2.2.1 and 2.3.0 if these were built against Hadoop 
2.8.0/1, as far as I understand cc [~owen.omalley] [~pxiong]

> NoSuchMethodError in Hadoop FileStatus.compareTo
> 
>
> Key: HIVE-17133
> URL: https://issues.apache.org/jira/browse/HIVE-17133
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>
> The stack trace is:
> {noformat}
> Caused by: java.lang.NoSuchMethodError: 
> org.apache.hadoop.fs.FileStatus.compareTo(Lorg/apache/hadoop/fs/FileStatus;)I
>   at 
> org.apache.hadoop.hive.ql.io.AcidUtils.lambda$getAcidState$0(AcidUtils.java:931)
>   at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
>   at java.util.TimSort.sort(TimSort.java:234)
>   at java.util.Arrays.sort(Arrays.java:1512)
>   at java.util.ArrayList.sort(ArrayList.java:1454)
>   at java.util.Collections.sort(Collections.java:175)
>   at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:929)
> {noformat}
> I'm on Hive master and using Hadoop 2.7.2. The method signature in Hadoop 
> 2.7.2 is:
> https://github.com/apache/hadoop/blob/release-2.7.2-RC2/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L336
> In Hadoop 2.8.0 it becomes:
> https://github.com/apache/hadoop/blob/release-2.8.0-RC3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L332
> I think that breaks binary compatibility.
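
For context, a self-contained sketch of why this is binary-incompatible even though it is source-compatible; the classes below are stand-ins, not Hadoop's actual source:

{code}
// javac bakes the exact method descriptor into the caller, so the JVM
// throws NoSuchMethodError at link time when the descriptor changes.
class V27Status implements Comparable<V27Status> {
  public int compareTo(V27Status o) { return 0; }   // like 2.7.x: compareTo(FileStatus)
}

class Caller {
  // Compiled against V27Status, this call is emitted with the descriptor
  // compareTo(LV27Status;)I. If the class on the runtime classpath only
  // declares compareTo(Ljava/lang/Object;)I (as in 2.8.0's FileStatus),
  // the source still compiles, but the old binary fails with NoSuchMethodError.
  static int cmp(V27Status a, V27Status b) {
    return a.compareTo(b);
  }
}
{code}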



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17133) NoSuchMethodError in Hadoop FileStatus.compareTo

2017-08-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110072#comment-16110072
 ] 

Sergey Shelukhin edited comment on HIVE-17133 at 8/2/17 12:51 AM:
--

The solution would be to upgrade to 2.8.2 once that is released. We'd always 
build with the old signature and thus support old and new versions but not the 
ones in the middle (need to verify that we do indeed refer to a method with old 
signature when building).
Thus, Hive will not support Hadoop 2.8.0 and 2.8.1.
The problem is if we release something referring to the new signature soon (or 
have already). We might need 2.2.1 and 2.3.1 if these were built against Hadoop 
2.8.0/1, as far as I understand cc [~owen.omalley] [~pxiong]


was (Author: sershe):
The solution would be to upgrade to 2.8.2 once that is released. We'd always 
build with the old signature and thus support old and new versions but not the 
ones in the middle (need to verify that we do indeed refer to a method with old 
signature when building).
Thus, Hive will not support Hadoop 2.8.0 and 2.8.1.
The problem is if we release something referring to the new signature soon (or 
have already). We might need 2.2.1 and 2.3.0 if these were built against Hadoop 
2.8.0/1, as far as I understand cc [~owen.omalley] [~pxiong]

> NoSuchMethodError in Hadoop FileStatus.compareTo
> 
>
> Key: HIVE-17133
> URL: https://issues.apache.org/jira/browse/HIVE-17133
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>
> The stack trace is:
> {noformat}
> Caused by: java.lang.NoSuchMethodError: 
> org.apache.hadoop.fs.FileStatus.compareTo(Lorg/apache/hadoop/fs/FileStatus;)I
>   at 
> org.apache.hadoop.hive.ql.io.AcidUtils.lambda$getAcidState$0(AcidUtils.java:931)
>   at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355)
>   at java.util.TimSort.sort(TimSort.java:234)
>   at java.util.Arrays.sort(Arrays.java:1512)
>   at java.util.ArrayList.sort(ArrayList.java:1454)
>   at java.util.Collections.sort(Collections.java:175)
>   at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:929)
> {noformat}
> I'm on Hive master and using Hadoop 2.7.2. The method signature in Hadoop 
> 2.7.2 is:
> https://github.com/apache/hadoop/blob/release-2.7.2-RC2/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L336
> In Hadoop 2.8.0 it becomes:
> https://github.com/apache/hadoop/blob/release-2.8.0-RC3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L332
> I think that breaks binary compatibility.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110050#comment-16110050
 ] 

Hive QA commented on HIVE-17220:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879912/HIVE-17220.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 11058 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=236)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=241)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterByte (batchId=178)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterBytes (batchId=178)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterDouble 
(batchId=178)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterFloat (batchId=178)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterInt (batchId=178)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterLong (batchId=178)
org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterString 
(batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDecimalX 
(batchId=182)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6220/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6220/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6220/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879912 - PreCommit-HIVE-Build

> Bloomfilter probing in semijoin reduction is thrashing L1 dcache
> 
>
> Key: HIVE-17220
> URL: https://issues.apache.org/jira/browse/HIVE-17220
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17220.1.patch, HIVE-17220.WIP.patch
>
>
> [~gopalv] observed perf profiles showing bloomfilter probes as the bottleneck for 
> some of the TPC-DS queries, resulting in L1 data cache thrashing. 
> This is because the huge bitset in the bloom filter doesn't fit in any 
> level of cache; also, the hash bits corresponding to a single key map to 
> different segments of the bitset, which are spread out. This can result in K-1 
> memory accesses (K being the number of hash functions) in the worst case for 
> every key that gets probed, because of locality misses in the L1 cache. 
> Ran a JMH microbenchmark to verify this. The following is the JMH perf 
> profile for bloom filter probing:
> {code}
> Perf stats:
> --
>5101.935637  task-clock (msec) #0.461 CPUs utilized
>346  context-switches  #0.068 K/sec
>336  cpu-migrations#0.066 K/sec
>  6,207  page-faults   #0.001 M/sec
> 10,016,486,301  cycles#1.963 GHz  
> (26.90%)
>  5,751,692,176  stalled-cycles-frontend   #   57.42% frontend cycles 
> idle (27.05%)
>  stalled-cycles-backend
> 14,359,914,397  instructions  #1.43  insns per cycle
>   #0.40  stalled cycles 
> per insn  (33.78%)
>  2,200,632,861  branches  #  431.333 M/sec
> (33.84%)
>  1,162,860  branch-misses #0.05% of all branches  
> (33.97%)
>  1,025,992,254  L1-dcache-loads   #  201.099 M/sec
> (26.56%)
>432,663,098  L1-dcache-load-misses #   42.17% of all L1-dcache 
> hits(14.49%)
>331,383,297  LLC-loads #   64.952 M/sec
> (14.47%)
>203,524  LLC-load-misses   #0.06% of all 

[jira] [Updated] (HIVE-17089) make acid 2.0 the default

2017-08-01 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17089:
--
Description: 
acid 2.0 is introduced in HIVE-14035.  It replaces Update events with a 
combination of Delete + Insert events.  This now makes U=D+I the default (and 
only) supported acid table type in Hive 3.0.  

The expectation for upgrade is that Major compaction has to be run on all acid 
tables in the existing Hive cluster and that no new writes to these tables take 
place since the start of compaction (need to add a mechanism to put a table in 
read-only mode - this way it can still be read while it's being compacted).  
Then upgrade to Hive 3.0 can take place.

  was:
acid 2.0 is introduced in HIVE-14035.  It replaces Update events with a 
combination of Delete + Insert events.  This now makes U=D+I the default (and 
only) supported acid table type in Hive 3.0.  

The expectation for upgrade is that Major compaction has to be run on all acid 
tables in the existing Hive cluster and that no new writes to these tables take 
place since the start of compaction.  Then upgrade to Hive 3.0 can take place.


> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch
>
>
> acid 2.0 is introduced in HIVE-14035.  It replaces Update events with a 
> combination of Delete + Insert events.  This now makes U=D+I the default (and 
> only) supported acid table type in Hive 3.0.  
> The expectation for upgrade is that Major compaction has to be run on all 
> acid tables in the existing Hive cluster and that no new writes to these 
> tables take place since the start of compaction (need to add a mechanism to 
> put a table in read-only mode - this way it can still be read while it's 
> being compacted).  Then upgrade to Hive 3.0 can take place.
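
As a rough sketch of the upgrade preparation described above (table names are illustrative, not from the issue):

{code:sql}
-- Run a major compaction on every acid table, wait for it to finish,
-- and block new writes until the Hive 3.0 upgrade completes.
ALTER TABLE acid_orders COMPACT 'major';
ALTER TABLE acid_events PARTITION (ds = '2017-08-01') COMPACT 'major';
SHOW COMPACTIONS;  -- poll until the requested compactions are no longer pending
{code}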



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17089) make acid 2.0 the default

2017-08-01 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17089:
--
Description: 
acid 2.0 is introduced in HIVE-14035.  It replaces Update events with a 
combination of Delete + Insert events.  This now makes U=D+I the default (and 
only) supported acid table type in Hive 3.0.  

The expectation for upgrade is that Major compaction has to be run on all acid 
tables in the existing Hive cluster and that no new writes to these tables take 
place since the start of compaction.  Then upgrade to Hive 3.0 can take place.

  was:acid 2.0 is introduced in HIVE-14035.  It replaces Update events with a 
combination of Delete + Insert events.  This now makes U=D+I the default (and 
only) supported acid table type in Hive 3.0


> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch
>
>
> acid 2.0 is introduced in HIVE-14035.  It replaces Update events with a 
> combination of Delete + Insert events.  This now makes U=D+I the default (and 
> only) supported acid table type in Hive 3.0.  
> The expectation for upgrade is that Major compaction has to be run on all 
> acid tables in the existing Hive cluster and that no new writes to these 
> tables take place since the start of compaction.  Then upgrade to Hive 3.0 can 
> take place.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17089) make acid 2.0 the default

2017-08-01 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17089:
--
Description: acid 2.0 is introduced in HIVE-14035.  It replaces Update 
events with a combination of Delete + Insert events.  This now makes U=D+I the 
default (and only) supported acid table type in Hive 3.0

> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch
>
>
> acid 2.0 is introduced in HIVE-14035.  It replaces Update events with a 
> combination of Delete + Insert events.  This now makes U=D+I the default (and 
> only) supported acid table type in Hive 3.0



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17089) make acid 2.0 the default

2017-08-01 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17089:
--
Issue Type: New Feature  (was: Test)

> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17089) make acid 2.0 the default

2017-08-01 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17089:
--
Status: Patch Available  (was: Open)

> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17089) make acid 2.0 the default

2017-08-01 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17089:
--
Attachment: HIVE-17089.03.patch

> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17164) Vectorization: Support PTF (Part 2: Unbounded Support-- Turn ON by default)

2017-08-01 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109995#comment-16109995
 ] 

Teddy Choi commented on HIVE-17164:
---

+1 LGTM and tests pending.

> Vectorization: Support PTF (Part 2: Unbounded Support-- Turn ON by default)
> ---
>
> Key: HIVE-17164
> URL: https://issues.apache.org/jira/browse/HIVE-17164
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17164.01.patch, HIVE-17164.02.patch
>
>
> Add disk storage backing.  Turn hive.vectorized.execution.ptf.enabled on by 
> default.
> Add hive.vectorized.ptf.max.memory.buffering.batch.count to specify the 
> maximum number of vectorized row batches to buffer in memory before spilling to 
> disk.
> Add the hive.vectorized.testing.reducer.batch.size parameter to have the Tez 
> Reducer produce small batches, creating many key-group batches that exercise 
> memory buffering and disk storage backing.
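
A sketch of exercising the settings named above; the values shown are illustrative, not confirmed defaults:

{code:sql}
SET hive.vectorized.execution.ptf.enabled=true;
SET hive.vectorized.ptf.max.memory.buffering.batch.count=25;
SET hive.vectorized.testing.reducer.batch.size=128;
-- An unbounded windowing query that would now buffer batches and spill to disk:
SELECT key, ROW_NUMBER() OVER (PARTITION BY key ORDER BY value) AS rn FROM src;
{code}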



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17170) Move thrift generated code to stand alone metastore

2017-08-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17170:
--
Status: Patch Available  (was: Open)

> Move thrift generated code to stand alone metastore
> ---
>
> Key: HIVE-17170
> URL: https://issues.apache.org/jira/browse/HIVE-17170
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17170.2.patch, HIVE-17170.patch
>
>
> hive_metastore.thrift and the code it generates needs to be moved into the 
> standalone metastore module.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17170) Move thrift generated code to stand alone metastore

2017-08-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17170:
--
Attachment: HIVE-17170.2.patch

New version of the patch that I think addresses the issue (it builds for me 
locally, but I'm not sure).

> Move thrift generated code to stand alone metastore
> ---
>
> Key: HIVE-17170
> URL: https://issues.apache.org/jira/browse/HIVE-17170
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17170.2.patch, HIVE-17170.patch
>
>
> hive_metastore.thrift and the code it generates needs to be moved into the 
> standalone metastore module.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17170) Move thrift generated code to stand alone metastore

2017-08-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17170:
--
Status: Open  (was: Patch Available)

> Move thrift generated code to stand alone metastore
> ---
>
> Key: HIVE-17170
> URL: https://issues.apache.org/jira/browse/HIVE-17170
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17170.2.patch, HIVE-17170.patch
>
>
> hive_metastore.thrift and the code it generates needs to be moved into the 
> standalone metastore module.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109974#comment-16109974
 ] 

Hive QA commented on HIVE-17172:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879904/HIVE-17172.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11041 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=236)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] 
(batchId=56)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6219/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6219/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6219/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879904 - PreCommit-HIVE-Build

> add ordering checks to DiskRangeList
> 
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17172.01.patch, HIVE-17172.02.patch, 
> HIVE-17172.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16979) Cache UGI for metastore

2017-08-01 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109964#comment-16109964
 ] 

Gopal V commented on HIVE-16979:


>  A metastore call normally takes less than a second.

Oh, I thought that TUGIAssumingProcessor is used by ThriftBinaryCLIService as 
well.

ThriftBinaryCLIService::run() -> 
KerberosSaslHelper$CLIServiceProcessorFactory::getProcessor() -> 
HadoopThriftAuthBridge::wrapNonAssumingProcessor() -> new 
TUGIAssumingProcessor()

> Cache UGI for metastore
> ---
>
> Key: HIVE-16979
> URL: https://issues.apache.org/jira/browse/HIVE-16979
> Project: Hive
>  Issue Type: Improvement
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, 
> HIVE-16979.3.patch
>
>
> FileSystem.closeAllForUGI is called per request against the metastore to dispose 
> of the UGI, which involves talking to the HDFS name node and is time consuming. 
> So the perf improvement would be caching and reusing the UGI.
> Each FileSystem.closeAllForUGI call could take up to 20 ms of E2E latency 
> against HDFS. A Hive query usually results in several calls against the 
> metastore, so we can save up to 50-100 ms per Hive query.
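
A minimal sketch (not the actual patch) of the proposed caching, using a Guava cache keyed by user name; the key choice and the 24-hour retention mentioned later in this thread are assumptions:

{code}
import java.util.concurrent.TimeUnit;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import org.apache.hadoop.security.UserGroupInformation;

public class UgiCacheSketch {
  private final Cache<String, UserGroupInformation> ugis = CacheBuilder.newBuilder()
      .expireAfterAccess(24, TimeUnit.HOURS)
      .build();

  public UserGroupInformation getOrCreate(String user) throws Exception {
    // Reuse the cached UGI instead of creating one per request and paying
    // for FileSystem.closeAllForUGI afterwards.
    return ugis.get(user, () -> UserGroupInformation.createRemoteUser(user));
  }
}
{code}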



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16974) Change the sort key for the schema tool validator to be

2017-08-01 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-16974:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Patch has been pushed to master. Thank you for the review [~aihuaxu]

> Change the sort key for the schema tool validator to be 
> 
>
> Key: HIVE-16974
> URL: https://issues.apache.org/jira/browse/HIVE-16974
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Fix For: 3.0.0
>
> Attachments: HIVE-16974.patch, HIVE-16974.patch
>
>
> In HIVE-16729, we introduced ordering of the results/failures returned by 
> schematool's validators. This allows fault-injection testing to expect 
> results that can be verified. However, they were sorted on NAME values, which 
> in the HMS schema can be NULL. So if the introduced fault has a NULL/blank 
> name column value, the result could differ depending on the backend 
> database (whether it sorts NULLs first or last).
> So I think it is better to sort on a non-null column value.
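
A generic sketch of the instability (illustrative table and columns, not the HMS schema):

{code:sql}
CREATE TABLE items (item_id INT NOT NULL, name VARCHAR(100));
INSERT INTO items VALUES (1, 'a'), (2, NULL), (3, 'b');

SELECT name FROM items ORDER BY name;     -- NULL sorts last on PostgreSQL, first on MySQL
SELECT name FROM items ORDER BY item_id;  -- deterministic on any backend
{code}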



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17226) Use strong hashing as security improvement

2017-08-01 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17226:
--
Component/s: (was: Hive)
 Security

> Use strong hashing as security improvement
> --
>
> Key: HIVE-17226
> URL: https://issues.apache.org/jira/browse/HIVE-17226
> Project: Hive
>  Issue Type: Improvement
>  Components: Security
>Reporter: Tao Li
>Assignee: Tao Li
>
> Two places have been identified where weak hashing needs to be replaced 
> by SHA-256.
> 1. CookieSigner.java uses MessageDigest.getInstance("SHA"). SHA is usually 
> mapped to SHA-1, which is not secure enough by today's standards. 
> We should use SHA-256 instead.
> 2. GenericUDFMaskHash.java uses DigestUtils.md5Hex. MD5 is considered weak 
> and should be replaced by DigestUtils.sha256Hex.
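
A sketch of the two replacements described above, using the standard JDK MessageDigest API and commons-codec's DigestUtils:

{code}
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import org.apache.commons.codec.digest.DigestUtils;

public class StrongHashSketch {
  static MessageDigest signerDigest() throws NoSuchAlgorithmException {
    return MessageDigest.getInstance("SHA-256");   // instead of "SHA" (SHA-1)
  }

  static String maskHash(String value) {
    return DigestUtils.sha256Hex(value);           // instead of DigestUtils.md5Hex(value)
  }
}
{code}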



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17164) Vectorization: Support PTF (Part 2: Unbounded Support-- Turn ON by default)

2017-08-01 Thread Teddy Choi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109901#comment-16109901
 ] 

Teddy Choi commented on HIVE-17164:
---

The patch looks good, but some tests fail. 
llap/vector_ptf_part_simple.q.out fails because of different fractions. 
vector_windowing_expressions.q.out for TestCliDriver also needs to be updated.

> Vectorization: Support PTF (Part 2: Unbounded Support-- Turn ON by default)
> ---
>
> Key: HIVE-17164
> URL: https://issues.apache.org/jira/browse/HIVE-17164
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17164.01.patch, HIVE-17164.02.patch
>
>
> Add disk storage backing.  Turn hive.vectorized.execution.ptf.enabled on by 
> default.
> Add hive.vectorized.ptf.max.memory.buffering.batch.count to specify the 
> maximum number of vectorized row batches to buffer in memory before spilling to 
> disk.
> Add the hive.vectorized.testing.reducer.batch.size parameter to have the Tez 
> Reducer produce small batches, creating many key-group batches that exercise 
> memory buffering and disk storage backing.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17226) Use strong hashing as security improvement

2017-08-01 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-17226:
--
Component/s: Hive

> Use strong hashing as security improvement
> --
>
> Key: HIVE-17226
> URL: https://issues.apache.org/jira/browse/HIVE-17226
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Tao Li
>Assignee: Tao Li
>
> Two places have been identified where weak hashing needs to be replaced 
> by SHA-256.
> 1. CookieSigner.java uses MessageDigest.getInstance("SHA"). SHA is usually 
> mapped to SHA-1, which is not secure enough by today's standards. 
> We should use SHA-256 instead.
> 2. GenericUDFMaskHash.java uses DigestUtils.md5Hex. MD5 is considered weak 
> and should be replaced by DigestUtils.sha256Hex.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17226) Use strong hashing as security improvement

2017-08-01 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li reassigned HIVE-17226:
-


> Use strong hashing as security improvement
> --
>
> Key: HIVE-17226
> URL: https://issues.apache.org/jira/browse/HIVE-17226
> Project: Hive
>  Issue Type: Improvement
>Reporter: Tao Li
>Assignee: Tao Li
>
> Two places have been identified where weak hashing needs to be replaced 
> by SHA-256.
> 1. CookieSigner.java uses MessageDigest.getInstance("SHA"). SHA is usually 
> mapped to SHA-1, which is not secure enough by today's standards. 
> We should use SHA-256 instead.
> 2. GenericUDFMaskHash.java uses DigestUtils.md5Hex. MD5 is considered weak 
> and should be replaced by DigestUtils.sha256Hex.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17089) make acid 2.0 the default

2017-08-01 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17089:
--
Summary: make acid 2.0 the default  (was: run ptest with acid 2.0 the 
default)

> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
>  Issue Type: Test
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17213) HoS: file merging doesn't work for union all

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109852#comment-16109852
 ] 

Hive QA commented on HIVE-17213:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879885/HIVE-17213.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=236)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6218/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6218/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6218/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879885 - PreCommit-HIVE-Build

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch, 
> HIVE-17213.2.patch
>
>
> HoS file merging doesn't work properly because it doesn't set the linked file 
> sinks that are used to generate move tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache

2017-08-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109789#comment-16109789
 ] 

Prasanth Jayachandran commented on HIVE-17220:
--

Just found it and fixed it :)

> Bloomfilter probing in semijoin reduction is thrashing L1 dcache
> 
>
> Key: HIVE-17220
> URL: https://issues.apache.org/jira/browse/HIVE-17220
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17220.1.patch, HIVE-17220.WIP.patch
>
>
> [~gopalv] observed perf profiles showing bloomfilter probes as the bottleneck for 
> some of the TPC-DS queries, resulting in L1 data cache thrashing. 
> This is because the huge bitset in the bloom filter doesn't fit in any 
> level of cache; also, the hash bits corresponding to a single key map to 
> different segments of the bitset, which are spread out. This can result in K-1 
> memory accesses (K being the number of hash functions) in the worst case for 
> every key that gets probed, because of locality misses in the L1 cache. 
> Ran a JMH microbenchmark to verify this. The following is the JMH perf 
> profile for bloom filter probing:
> {code}
> Perf stats:
> --
>5101.935637  task-clock (msec) #0.461 CPUs utilized
>346  context-switches  #0.068 K/sec
>336  cpu-migrations#0.066 K/sec
>  6,207  page-faults   #0.001 M/sec
> 10,016,486,301  cycles#1.963 GHz  
> (26.90%)
>  5,751,692,176  stalled-cycles-frontend   #   57.42% frontend cycles 
> idle (27.05%)
>  stalled-cycles-backend
> 14,359,914,397  instructions  #1.43  insns per cycle
>   #0.40  stalled cycles 
> per insn  (33.78%)
>  2,200,632,861  branches  #  431.333 M/sec
> (33.84%)
>  1,162,860  branch-misses #0.05% of all branches  
> (33.97%)
>  1,025,992,254  L1-dcache-loads   #  201.099 M/sec
> (26.56%)
>432,663,098  L1-dcache-load-misses #   42.17% of all L1-dcache 
> hits(14.49%)
>331,383,297  LLC-loads #   64.952 M/sec
> (14.47%)
>203,524  LLC-load-misses   #0.06% of all LL-cache 
> hits (21.67%)
>  L1-icache-loads
>  1,633,821  L1-icache-load-misses #0.320 M/sec
> (28.85%)
>950,368,796  dTLB-loads#  186.276 M/sec
> (28.61%)
>246,813,393  dTLB-load-misses  #   25.97% of all dTLB 
> cache hits   (14.53%)
> 25,451  iTLB-loads#0.005 M/sec
> (14.48%)
> 35,415  iTLB-load-misses  #  139.15% of all iTLB 
> cache hits   (21.73%)
>  L1-dcache-prefetches
>175,958  L1-dcache-prefetch-misses #0.034 M/sec
> (28.94%)
>   11.064783140 seconds time elapsed
> {code}
> This shows 42.17% of L1 data cache misses. 
> This jira is to use a cache-efficient bloom filter for semijoin probing.
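
As a rough illustration of the direction described above, a minimal sketch of a blocked ("cache-efficient") bloom filter, assuming a 64-bit key hash; this is a sketch under those assumptions, not Hive's actual implementation:

{code}
// Confine all k probe bits of a key to one 64-byte block, so a lookup
// touches at most one cache line instead of up to k-1 scattered ones.
public class BlockedBloomSketch {
  private static final int BLOCK_LONGS = 8;   // 8 longs = 64 bytes = one cache line
  private final long[] bits;
  private final int numBlocks;
  private final int k;

  public BlockedBloomSketch(int numBlocks, int k) {
    this.bits = new long[numBlocks * BLOCK_LONGS];
    this.numBlocks = numBlocks;
    this.k = k;
  }

  public void add(long hash) {
    int base = blockOf(hash);
    long h = hash;
    for (int i = 0; i < k; i++) {
      h = remix(h);
      bits[base + wordIn(h)] |= 1L << bitIn(h);
    }
  }

  public boolean mightContain(long hash) {
    int base = blockOf(hash);
    long h = hash;
    for (int i = 0; i < k; i++) {
      h = remix(h);
      if ((bits[base + wordIn(h)] & (1L << bitIn(h))) == 0) {
        return false;   // a single unset bit rules the key out
      }
    }
    return true;
  }

  private int blockOf(long hash) {
    return (int) ((hash & Long.MAX_VALUE) % numBlocks) * BLOCK_LONGS;
  }

  private static long remix(long h) {           // cheap per-probe remix (assumed)
    return h * 0x9E3779B97F4A7C15L + 1;
  }

  private static int wordIn(long h) { return (int) ((h >>> 55) & 7); }  // long within block
  private static int bitIn(long h)  { return (int) (h >>> 58); }        // bit within long
}
{code}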



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache

2017-08-01 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17220:
-
Attachment: (was: HIVE-17220.1.patch)

> Bloomfilter probing in semijoin reduction is thrashing L1 dcache
> 
>
> Key: HIVE-17220
> URL: https://issues.apache.org/jira/browse/HIVE-17220
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17220.1.patch, HIVE-17220.WIP.patch
>
>
> [~gopalv] observed perf profiles showing bloomfilter probes as the bottleneck for 
> some of the TPC-DS queries, resulting in L1 data cache thrashing. 
> This is because the huge bitset in the bloom filter doesn't fit in any 
> level of cache; also, the hash bits corresponding to a single key map to 
> different segments of the bitset, which are spread out. This can result in K-1 
> memory accesses (K being the number of hash functions) in the worst case for 
> every key that gets probed, because of locality misses in the L1 cache. 
> Ran a JMH microbenchmark to verify this. The following is the JMH perf 
> profile for bloom filter probing:
> {code}
> Perf stats:
> --
>5101.935637  task-clock (msec) #0.461 CPUs utilized
>346  context-switches  #0.068 K/sec
>336  cpu-migrations#0.066 K/sec
>  6,207  page-faults   #0.001 M/sec
> 10,016,486,301  cycles#1.963 GHz  
> (26.90%)
>  5,751,692,176  stalled-cycles-frontend   #   57.42% frontend cycles 
> idle (27.05%)
>  stalled-cycles-backend
> 14,359,914,397  instructions  #1.43  insns per cycle
>   #0.40  stalled cycles 
> per insn  (33.78%)
>  2,200,632,861  branches  #  431.333 M/sec
> (33.84%)
>  1,162,860  branch-misses #0.05% of all branches  
> (33.97%)
>  1,025,992,254  L1-dcache-loads   #  201.099 M/sec
> (26.56%)
>432,663,098  L1-dcache-load-misses #   42.17% of all L1-dcache 
> hits(14.49%)
>331,383,297  LLC-loads #   64.952 M/sec
> (14.47%)
>203,524  LLC-load-misses   #0.06% of all LL-cache 
> hits (21.67%)
>  L1-icache-loads
>  1,633,821  L1-icache-load-misses #0.320 M/sec
> (28.85%)
>950,368,796  dTLB-loads#  186.276 M/sec
> (28.61%)
>246,813,393  dTLB-load-misses  #   25.97% of all dTLB 
> cache hits   (14.53%)
> 25,451  iTLB-loads#0.005 M/sec
> (14.48%)
> 35,415  iTLB-load-misses  #  139.15% of all iTLB 
> cache hits   (21.73%)
>  L1-dcache-prefetches
>175,958  L1-dcache-prefetch-misses #0.034 M/sec
> (28.94%)
>   11.064783140 seconds time elapsed
> {code}
> This shows 42.17% of L1 data cache misses. 
> This jira is to use a cache-efficient bloom filter for semijoin probing.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache

2017-08-01 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17220:
-
Attachment: HIVE-17220.1.patch

> Bloomfilter probing in semijoin reduction is thrashing L1 dcache
> 
>
> Key: HIVE-17220
> URL: https://issues.apache.org/jira/browse/HIVE-17220
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17220.1.patch, HIVE-17220.WIP.patch
>
>
> [~gopalv] observed perf profiles showing bloomfilter probes as the bottleneck for 
> some of the TPC-DS queries, resulting in L1 data cache thrashing. 
> This is because the huge bitset in the bloom filter doesn't fit in any 
> level of cache; also, the hash bits corresponding to a single key map to 
> different segments of the bitset, which are spread out. This can result in K-1 
> memory accesses (K being the number of hash functions) in the worst case for 
> every key that gets probed, because of locality misses in the L1 cache. 
> Ran a JMH microbenchmark to verify this. The following is the JMH perf 
> profile for bloom filter probing:
> {code}
> Perf stats:
> --
>5101.935637  task-clock (msec) #0.461 CPUs utilized
>346  context-switches  #0.068 K/sec
>336  cpu-migrations#0.066 K/sec
>  6,207  page-faults   #0.001 M/sec
> 10,016,486,301  cycles#1.963 GHz  
> (26.90%)
>  5,751,692,176  stalled-cycles-frontend   #   57.42% frontend cycles 
> idle (27.05%)
>  stalled-cycles-backend
> 14,359,914,397  instructions  #1.43  insns per cycle
>   #0.40  stalled cycles 
> per insn  (33.78%)
>  2,200,632,861  branches  #  431.333 M/sec
> (33.84%)
>  1,162,860  branch-misses #0.05% of all branches  
> (33.97%)
>  1,025,992,254  L1-dcache-loads   #  201.099 M/sec
> (26.56%)
>432,663,098  L1-dcache-load-misses #   42.17% of all L1-dcache 
> hits(14.49%)
>331,383,297  LLC-loads #   64.952 M/sec
> (14.47%)
>203,524  LLC-load-misses   #0.06% of all LL-cache 
> hits (21.67%)
>  L1-icache-loads
>  1,633,821  L1-icache-load-misses #0.320 M/sec
> (28.85%)
>950,368,796  dTLB-loads#  186.276 M/sec
> (28.61%)
>246,813,393  dTLB-load-misses  #   25.97% of all dTLB 
> cache hits   (14.53%)
> 25,451  iTLB-loads#0.005 M/sec
> (14.48%)
> 35,415  iTLB-load-misses  #  139.15% of all iTLB 
> cache hits   (21.73%)
>  L1-dcache-prefetches
>175,958  L1-dcache-prefetch-misses #0.034 M/sec
> (28.94%)
>   11.064783140 seconds time elapsed
> {code}
> This shows 42.17% of L1 data cache misses. 
> This jira is to use a cache-efficient bloom filter for semijoin probing.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache

2017-08-01 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109771#comment-16109771
 ] 

Gopal V commented on HIVE-17220:


[~prasanth_j]: cut-paste issue?

{code}
TEZ_BLOOM_FILTER_FPP("hive.tez.bloom.filter.factor", 0.03f,
{code}

> Bloomfilter probing in semijoin reduction is thrashing L1 dcache
> 
>
> Key: HIVE-17220
> URL: https://issues.apache.org/jira/browse/HIVE-17220
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17220.1.patch, HIVE-17220.WIP.patch
>
>
> [~gopalv] observed perf profiles showing bloomfilter probes as the bottleneck for 
> some of the TPC-DS queries, resulting in L1 data cache thrashing. 
> This is because the huge bitset in the bloom filter doesn't fit in any 
> level of cache; also, the hash bits corresponding to a single key map to 
> different segments of the bitset, which are spread out. This can result in K-1 
> memory accesses (K being the number of hash functions) in the worst case for 
> every key that gets probed, because of locality misses in the L1 cache. 
> Ran a JMH microbenchmark to verify this. The following is the JMH perf 
> profile for bloom filter probing:
> {code}
> Perf stats:
> --
>5101.935637  task-clock (msec) #0.461 CPUs utilized
>346  context-switches  #0.068 K/sec
>336  cpu-migrations#0.066 K/sec
>  6,207  page-faults   #0.001 M/sec
> 10,016,486,301  cycles#1.963 GHz  
> (26.90%)
>  5,751,692,176  stalled-cycles-frontend   #   57.42% frontend cycles 
> idle (27.05%)
>  stalled-cycles-backend
> 14,359,914,397  instructions  #1.43  insns per cycle
>   #0.40  stalled cycles 
> per insn  (33.78%)
>  2,200,632,861  branches  #  431.333 M/sec
> (33.84%)
>  1,162,860  branch-misses #0.05% of all branches  
> (33.97%)
>  1,025,992,254  L1-dcache-loads   #  201.099 M/sec
> (26.56%)
>432,663,098  L1-dcache-load-misses #   42.17% of all L1-dcache 
> hits(14.49%)
>331,383,297  LLC-loads #   64.952 M/sec
> (14.47%)
>203,524  LLC-load-misses   #0.06% of all LL-cache 
> hits (21.67%)
>  L1-icache-loads
>  1,633,821  L1-icache-load-misses #0.320 M/sec
> (28.85%)
>950,368,796  dTLB-loads#  186.276 M/sec
> (28.61%)
>246,813,393  dTLB-load-misses  #   25.97% of all dTLB 
> cache hits   (14.53%)
> 25,451  iTLB-loads#0.005 M/sec
> (14.48%)
> 35,415  iTLB-load-misses  #  139.15% of all iTLB 
> cache hits   (21.73%)
>  L1-dcache-prefetches
>175,958  L1-dcache-prefetch-misses #0.034 M/sec
> (28.94%)
>   11.064783140 seconds time elapsed
> {code}
> This shows 42.17% of L1 data cache misses. 
> This jira is to use a cache-efficient bloom filter for semijoin probing.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache

2017-08-01 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17220:
-
Status: Patch Available  (was: Open)

> Bloomfilter probing in semijoin reduction is thrashing L1 dcache
> 
>
> Key: HIVE-17220
> URL: https://issues.apache.org/jira/browse/HIVE-17220
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17220.1.patch, HIVE-17220.WIP.patch
>
>
> [~gopalv] observed perf profiles showing bloomfilter probes as the bottleneck for 
> some of the TPC-DS queries, resulting in L1 data cache thrashing. 
> This is because the huge bitset in the bloom filter doesn't fit in any 
> level of cache; also, the hash bits corresponding to a single key map to 
> different segments of the bitset, which are spread out. This can result in K-1 
> memory accesses (K being the number of hash functions) in the worst case for 
> every key that gets probed, because of locality misses in the L1 cache. 
> Ran a JMH microbenchmark to verify this. The following is the JMH perf 
> profile for bloom filter probing:
> {code}
> Perf stats:
> --
>5101.935637  task-clock (msec) #0.461 CPUs utilized
>346  context-switches  #0.068 K/sec
>336  cpu-migrations#0.066 K/sec
>  6,207  page-faults   #0.001 M/sec
> 10,016,486,301  cycles#1.963 GHz  
> (26.90%)
>  5,751,692,176  stalled-cycles-frontend   #   57.42% frontend cycles 
> idle (27.05%)
>  stalled-cycles-backend
> 14,359,914,397  instructions  #1.43  insns per cycle
>   #0.40  stalled cycles 
> per insn  (33.78%)
>  2,200,632,861  branches  #  431.333 M/sec
> (33.84%)
>  1,162,860  branch-misses #0.05% of all branches  
> (33.97%)
>  1,025,992,254  L1-dcache-loads   #  201.099 M/sec
> (26.56%)
>432,663,098  L1-dcache-load-misses #   42.17% of all L1-dcache 
> hits(14.49%)
>331,383,297  LLC-loads #   64.952 M/sec
> (14.47%)
>203,524  LLC-load-misses   #0.06% of all LL-cache 
> hits (21.67%)
>  L1-icache-loads
>  1,633,821  L1-icache-load-misses #0.320 M/sec
> (28.85%)
>950,368,796  dTLB-loads#  186.276 M/sec
> (28.61%)
>246,813,393  dTLB-load-misses  #   25.97% of all dTLB 
> cache hits   (14.53%)
> 25,451  iTLB-loads#0.005 M/sec
> (14.48%)
> 35,415  iTLB-load-misses  #  139.15% of all iTLB 
> cache hits   (21.73%)
>  L1-dcache-prefetches
>175,958  L1-dcache-prefetch-misses #0.034 M/sec
> (28.94%)
>   11.064783140 seconds time elapsed
> {code}
> This shows a 42.17% L1 data cache miss rate. 
> This jira is to use a cache-efficient bloom filter for semijoin probing.
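
For readers unfamiliar with the approach: the usual cache-efficient variant is a 
blocked Bloom filter, which confines all k probe bits for a key to a single 
cache-line-sized block, so a probe touches one line instead of up to K-1 
scattered ones. Below is a minimal illustrative sketch in Java; it is not Hive's 
implementation, and the class name, block size, and hash-mixing scheme are all 
assumptions.

{code}
// Illustrative sketch of a cache-line-blocked Bloom filter; not Hive's
// implementation. Class name, block size, and hash mixing are assumptions.
public class BlockedBloomFilter {
  private static final int LONGS_PER_BLOCK = 8; // 8 x 8 bytes = one 64-byte cache line
  private final long[] bits;
  private final int numBlocks;
  private final int k; // number of hash functions

  public BlockedBloomFilter(int numBlocks, int k) {
    this.numBlocks = numBlocks;
    this.k = k;
    this.bits = new long[numBlocks * LONGS_PER_BLOCK];
  }

  public void add(long hash) {
    int base = blockIndex(hash) * LONGS_PER_BLOCK;
    int h = (int) hash;
    for (int i = 1; i <= k; i++) {
      int pos = (h * i) >>> 23; // 9 bits: a position within the 512-bit block
      bits[base + (pos >>> 6)] |= 1L << (pos & 63);
    }
  }

  public boolean mightContain(long hash) {
    int base = blockIndex(hash) * LONGS_PER_BLOCK;
    int h = (int) hash;
    for (int i = 1; i <= k; i++) {
      int pos = (h * i) >>> 23;
      if ((bits[base + (pos >>> 6)] & (1L << (pos & 63))) == 0) {
        return false; // every probe for this key stays in the same cache line
      }
    }
    return true;
  }

  private int blockIndex(long hash) {
    // The upper hash bits pick the block, so the block choice is independent
    // of the in-block bit positions derived from the lower bits.
    return (int) ((hash >>> 32) % numBlocks);
  }
}
{code}

The known trade-off of blocking is a slightly higher false-positive rate for the 
same bit budget, which is consistent with making fpp configurable in this patch.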



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache

2017-08-01 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17220:
-
Attachment: HIVE-17220.1.patch

Made fpp configurable; also changed the default fpp to 0.03 for bloom-1.

> Bloomfilter probing in semijoin reduction is thrashing L1 dcache
> 
>
> Key: HIVE-17220
> URL: https://issues.apache.org/jira/browse/HIVE-17220
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17220.1.patch, HIVE-17220.WIP.patch
>
>
> [~gopalv] observed perf profiles showing bloomfilter probes as the bottleneck for 
> some of the TPC-DS queries, resulting in L1 data cache thrashing. 
> This is because the huge bitset in the bloom filter doesn't fit in any 
> level of cache; also, the hash bits corresponding to a single key map to 
> different segments of the bitset, which are spread out. This can result in K-1 
> memory accesses (K being the number of hash functions) in the worst case for 
> every key that gets probed, because of locality misses in the L1 cache. 
> Ran a JMH microbenchmark to verify the same. Following is the JMH perf 
> profile for bloom filter probing:
> {code}
> Perf stats:
> --
>5101.935637  task-clock (msec) #0.461 CPUs utilized
>346  context-switches  #0.068 K/sec
>336  cpu-migrations#0.066 K/sec
>  6,207  page-faults   #0.001 M/sec
> 10,016,486,301  cycles#1.963 GHz  
> (26.90%)
>  5,751,692,176  stalled-cycles-frontend   #   57.42% frontend cycles 
> idle (27.05%)
>   <not supported>  stalled-cycles-backend
> 14,359,914,397  instructions  #1.43  insns per cycle
>   #0.40  stalled cycles 
> per insn  (33.78%)
>  2,200,632,861  branches  #  431.333 M/sec
> (33.84%)
>  1,162,860  branch-misses #0.05% of all branches  
> (33.97%)
>  1,025,992,254  L1-dcache-loads   #  201.099 M/sec
> (26.56%)
>432,663,098  L1-dcache-load-misses #   42.17% of all L1-dcache 
> hits(14.49%)
>331,383,297  LLC-loads #   64.952 M/sec
> (14.47%)
>203,524  LLC-load-misses   #0.06% of all LL-cache 
> hits (21.67%)
>   <not supported>  L1-icache-loads
>  1,633,821  L1-icache-load-misses #0.320 M/sec
> (28.85%)
>950,368,796  dTLB-loads#  186.276 M/sec
> (28.61%)
>246,813,393  dTLB-load-misses  #   25.97% of all dTLB 
> cache hits   (14.53%)
> 25,451  iTLB-loads#0.005 M/sec
> (14.48%)
> 35,415  iTLB-load-misses  #  139.15% of all iTLB 
> cache hits   (21.73%)
>   <not supported>  L1-dcache-prefetches
>175,958  L1-dcache-prefetch-misses #0.034 M/sec
> (28.94%)
>   11.064783140 seconds time elapsed
> {code}
> This shows a 42.17% L1 data cache miss rate. 
> This jira is to use a cache-efficient bloom filter for semijoin probing.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16979) Cache UGI for metastore

2017-08-01 Thread Tao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109720#comment-16109720
 ] 

Tao Li commented on HIVE-16979:
---

[~gopalv] The original code creates and closes a UGI instance per metastore 
request; our change caches the UGI for 24 hours after its last access. A 
metastore call normally takes less than a second, so 24 hours is long enough to 
ensure we will not fail any ongoing metastore call. Does that answer your 
question?
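
A minimal sketch of the caching scheme described above, assuming a Guava 
LoadingCache keyed by user name; the class and the keying are illustrative, not 
the actual patch.

{code}
import java.io.IOException;
import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.cache.RemovalNotification;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

public class UgiCache {
  // Entries expire 24 hours after their *last access* (not after write), so an
  // active long-running query keeps its entry alive.
  private final LoadingCache<String, UserGroupInformation> cache =
      CacheBuilder.newBuilder()
          .expireAfterAccess(24, TimeUnit.HOURS)
          .removalListener((RemovalNotification<String, UserGroupInformation> n) -> {
            try {
              // Dispose of the UGI's cached FileSystem handles only on
              // eviction, instead of once per metastore request.
              FileSystem.closeAllForUGI(n.getValue());
            } catch (IOException e) {
              // Log and move on; eviction must not fail the caller.
            }
          })
          .build(new CacheLoader<String, UserGroupInformation>() {
            @Override
            public UserGroupInformation load(String user) throws IOException {
              return UserGroupInformation.createProxyUser(
                  user, UserGroupInformation.getLoginUser());
            }
          });

  public UserGroupInformation get(String user) {
    return cache.getUnchecked(user);
  }
}
{code}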

> Cache UGI for metastore
> ---
>
> Key: HIVE-16979
> URL: https://issues.apache.org/jira/browse/HIVE-16979
> Project: Hive
>  Issue Type: Improvement
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, 
> HIVE-16979.3.patch
>
>
> FileSystem.closeAllForUGI is called per request against the metastore to 
> dispose of the UGI, which involves talking to the HDFS name node and is time 
> consuming. So the perf improvement would be caching and reusing the UGI.
> Each FileSystem.closeAllForUGI call could take up to 20 ms of E2E latency 
> against HDFS. Usually a Hive query results in several calls against the 
> metastore, so we can save up to 50-100 ms per Hive query.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17225) FileNotFoundException in HiveInputFormat#init for HoS DPP query with multiple left semi-joins against the same partition column

2017-08-01 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-17225:
---


> FileNotFoundException in HiveInputFormat#init for HoS DPP query with 
> multiple left semi-joins against the same partition column
> -
>
> Key: HIVE-17225
> URL: https://issues.apache.org/jira/browse/HIVE-17225
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Affects Versions: 3.0.0
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> Setup:
> {code:sql}
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.strict.checks.cartesian.product=false;
> CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
> CREATE TABLE regular_table1 (col1 int, col2 int);
> CREATE TABLE regular_table2 (col1 int, col2 int);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);
> INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2);
> INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), 
> (2), (3);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), 
> (2), (3);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), 
> (2), (3);
> SELECT * 
> FROM   regular_table1, 
>regular_table2, 
>partitioned_table1 
> WHERE  partitioned_table1.part_col IN (SELECT regular_table1.col2 
>FROM   regular_table1 
>WHERE  regular_table1.col1 > 0) 
>AND partitioned_table1.part_col IN (SELECT regular_table2.col2 
>FROM   regular_table2 
>WHERE  regular_table2.col1 > 1); 
> {code}
> Exception:
> {code}
> 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] 
> ql.Driver: FAILED: Execution Error, return code 3 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.FileNotFoundException: File 
> file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
>  does not exist
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
>   at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.immutable.List.map(List.scala:285)
>   at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at 

[jira] [Commented] (HIVE-16979) Cache UGI for metastore

2017-08-01 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109684#comment-16109684
 ] 

Gopal V commented on HIVE-16979:


[~taoli-hwx]: does this fail queries which take > 24 hours?

Is there something we can do to mark "liveness" from the query progress loop to 
make sure the FileSystem.closeAllForUgi() -> deleteOnExit doesn't clean up any 
directory currently being written to inside the cluster?

> Cache UGI for metastore
> ---
>
> Key: HIVE-16979
> URL: https://issues.apache.org/jira/browse/HIVE-16979
> Project: Hive
>  Issue Type: Improvement
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, 
> HIVE-16979.3.patch
>
>
> FileSystem.closeAllForUGI is called per request against the metastore to 
> dispose of the UGI, which involves talking to the HDFS name node and is time 
> consuming. So the perf improvement would be caching and reusing the UGI.
> Each FileSystem.closeAllForUGI call could take up to 20 ms of E2E latency 
> against HDFS. Usually a Hive query results in several calls against the 
> metastore, so we can save up to 50-100 ms per Hive query.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17170) Move thrift generated code to stand alone metastore

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109672#comment-16109672
 ] 

Hive QA commented on HIVE-17170:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879879/HIVE-17170.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6217/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6217/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6217/

Messages:
{noformat}
 This message was trimmed, see log for full details 
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[41,44]
 package org.apache.hadoop.hive.metastore.api does not exist
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[42,44]
 package org.apache.hadoop.hive.metastore.api does not exist
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[216,30]
 cannot find symbol
  symbol:   class Table
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[216,50]
 cannot find symbol
  symbol:   class MetaException
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[249,35]
 cannot find symbol
  symbol:   class Table
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[249,55]
 cannot find symbol
  symbol:   class MetaException
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[273,33]
 cannot find symbol
  symbol:   class Table
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[273,53]
 cannot find symbol
  symbol:   class MetaException
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[278,31]
 cannot find symbol
  symbol:   class Table
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[278,70]
 cannot find symbol
  symbol:   class MetaException
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[455,28]
 cannot find symbol
  symbol:   class Table
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[455,48]
 cannot find symbol
  symbol:   class MetaException
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[460,33]
 cannot find symbol
  symbol:   class Table
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[460,53]
 cannot find symbol
  symbol:   class MetaException
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[465,31]
 cannot find symbol
  symbol:   class Table
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[465,71]
 cannot find symbol
  symbol:   class MetaException
  location: class org.apache.hadoop.hive.druid.DruidStorageHandler
[ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[497,33]
 cannot find symbol
  symbol:   class 

[jira] [Updated] (HIVE-17172) add ordering checks to DiskRangeList

2017-08-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17172:

Attachment: (was: HIVE-17172.02.patch)

> add ordering checks to DiskRangeList
> 
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17172.01.patch, HIVE-17172.02.patch, 
> HIVE-17172.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17172) add ordering checks to DiskRangeList

2017-08-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17172:

Attachment: HIVE-17172.02.patch

> add ordering checks to DiskRangeList
> 
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17172.01.patch, HIVE-17172.02.patch, 
> HIVE-17172.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17172) add ordering checks to DiskRangeList

2017-08-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17172:

Attachment: HIVE-17172.02.patch

Added some comments on the usage of the method (which I needed to look at to fix 
the test).
Fixed the test to actually test something, and altered one test case that is 
valid and shouldn't cause an error.

> add ordering checks to DiskRangeList
> 
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17172.01.patch, HIVE-17172.02.patch, 
> HIVE-17172.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17190) Schema changes for bitvectors for unpartitioned tables

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109652#comment-16109652
 ] 

Hive QA commented on HIVE-17190:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879856/HIVE-17190.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11137 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=236)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6216/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6216/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6216/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879856 - PreCommit-HIVE-Build

> Schema changes for bitvectors for unpartitioned tables
> --
>
> Key: HIVE-17190
> URL: https://issues.apache.org/jira/browse/HIVE-17190
> Project: Hive
>  Issue Type: Test
>  Components: Metastore, Statistics
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-17190.2.patch, HIVE-17190.3.patch
>
>
> Missed in HIVE-16997



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList

2017-08-01 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109606#comment-16109606
 ] 

Deepak Jaiswal commented on HIVE-17172:
---

[~sershe] Can you please put the next patch in RB? It is much easier to review 
that way.

> add ordering checks to DiskRangeList
> 
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17172.01.patch, HIVE-17172.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList

2017-08-01 Thread Deepak Jaiswal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109604#comment-16109604
 ] 

Deepak Jaiswal commented on HIVE-17172:
---

+1, LGTM.

> add ordering checks to DiskRangeList
> 
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17172.01.patch, HIVE-17172.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList

2017-08-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109593#comment-16109593
 ] 

Sergey Shelukhin commented on HIVE-17172:
-

actually there's a bug in the test :) will update it

> add ordering checks to DiskRangeList
> 
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17172.01.patch, HIVE-17172.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList

2017-08-01 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109572#comment-16109572
 ] 

Prasanth Jayachandran commented on HIVE-17172:
--

+1

> add ordering checks to DiskRangeList
> 
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17172.01.patch, HIVE-17172.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList

2017-08-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109553#comment-16109553
 ] 

Sergey Shelukhin commented on HIVE-17172:
-

[~prasanth_j] [~owen.omalley] [~djaiswal] ping

> add ordering checks to DiskRangeList
> 
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17172.01.patch, HIVE-17172.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17212) Dynamic add partition by insert shouldn't generate INSERT event.

2017-08-01 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109522#comment-16109522
 ] 

Sankar Hariappan commented on HIVE-17212:
-

Thanks [~anishek] for the review!
Requesting [~daijy]/[~thejas] to commit this patch to master!

> Dynamic add partition by insert shouldn't generate INSERT event.
> 
>
> Key: HIVE-17212
> URL: https://issues.apache.org/jira/browse/HIVE-17212
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 2.1.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>  Labels: DR, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17212.01.patch
>
>
> A partition is dynamically added if INSERT INTO is invoked on a non-existing 
> partition.
> Generally, an insert operation generates an INSERT event to notify about the 
> operation with the new data files.
> In this case, Hive should generate only ADD_PARTITION events with the new 
> files added. It shouldn't create an INSERT event.
> Need to test and verify this behaviour.
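
As a sketch of the intended event selection (the method and type names below 
are hypothetical stand-ins, not Hive's actual listener API):

{code}
import java.util.List;

// Illustrative sketch only: fireAddPartitionEvent, fireInsertEvent, and the
// Partition stand-in are hypothetical, not Hive's metastore listener API.
public class DynamicPartitionEvents {
  static class Partition { String name; Partition(String n) { name = n; } }

  void onDynamicInsert(Partition p, boolean partitionWasCreated, List<String> newFiles) {
    if (partitionWasCreated) {
      // Newly created partition: the new files ride along with the
      // ADD_PARTITION event; no separate INSERT event is generated.
      fireAddPartitionEvent(p, newFiles);
    } else {
      // Pre-existing partition: the data change is reported via INSERT.
      fireInsertEvent(p, newFiles);
    }
  }

  void fireAddPartitionEvent(Partition p, List<String> files) { /* notify listeners */ }
  void fireInsertEvent(Partition p, List<String> files) { /* notify listeners */ }
}
{code}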



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109512#comment-16109512
 ] 

Hive QA commented on HIVE-16357:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879795/HIVE-16357.04.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11040 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=236)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat5]
 (batchId=3)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6215/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6215/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6215/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879795 - PreCommit-HIVE-Build

> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16357.01.patch, HIVE-16357.02.patch, 
> HIVE-16357.03.patch, HIVE-16357.04.patch
>
>
> If the directory for a Hive table could not be created, then the HMS will 
> throw a MetaException:
> {code}
>  if (tblPath != null) {
>   if (!wh.isDir(tblPath)) {
> if (!wh.mkdirs(tblPath, true)) {
>   throw new MetaException(tblPath
>   + " is not a directory or unable to create one");
> }
> madeDir = true;
>   }
> }
> {code}
> However, in the finally block we always try to call the 
> DbNotificationListener, which in turn will also throw an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)

[jira] [Commented] (HIVE-17194) JDBC: Implement Gzip servlet filter

2017-08-01 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109500#comment-16109500
 ] 

Gopal V commented on HIVE-17194:


[~vgumashta]/[~thejas]: can you review?

> JDBC: Implement Gzip servlet filter
> ---
>
> Key: HIVE-17194
> URL: https://issues.apache.org/jira/browse/HIVE-17194
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-17194.1.patch, HIVE-17194.2.patch, 
> HIVE-17194.3.patch
>
>
> {code}
> POST /cliservice HTTP/1.1
> Content-Type: application/x-thrift
> Accept: application/x-thrift
> User-Agent: Java/THttpClient/HC
> Authorization: Basic YW5vbnltb3VzOmFub255bW91cw==
> Content-Length: 71
> Host: localhost:10007
> Connection: Keep-Alive
> Accept-Encoding: gzip,deflate
> X-XSRF-HEADER: true
> {code}
> The Beeline client clearly sends out HTTP compression headers which are 
> ignored by the HTTP service layer in HS2.
> After patch, result looks like
> {code}
> HTTP/1.1 200 OK
> Date: Tue, 01 Aug 2017 01:47:23 GMT
> Content-Type: application/x-thrift
> Vary: Accept-Encoding, User-Agent
> Content-Encoding: gzip
> Transfer-Encoding: chunked
> Server: Jetty(9.3.8.v20160314)
> {code}
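
The JIRA title describes a servlet filter; one minimal way to produce the same 
response headers with the Jetty version shown above is Jetty's built-in 
GzipHandler. The sketch below is illustrative wiring under that assumption, 
not the actual patch.

{code}
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.handler.gzip.GzipHandler;
import org.eclipse.jetty.servlet.ServletContextHandler;

// Illustrative sketch, assuming Jetty 9.3 (as in the headers above). The
// handler honours Accept-Encoding and adds Content-Encoding: gzip on the
// way out. Port and context path are taken from the quoted request.
public class GzipWiring {
  public static void main(String[] args) throws Exception {
    Server server = new Server(10007);

    ServletContextHandler context = new ServletContextHandler();
    context.setContextPath("/");
    // ... register the /cliservice Thrift servlet on `context` here ...

    GzipHandler gzip = new GzipHandler();
    gzip.addIncludedMimeTypes("application/x-thrift"); // compress Thrift payloads too
    gzip.setHandler(context);

    server.setHandler(gzip);
    server.start();
    server.join();
  }
}
{code}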



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17222) Llap: Iotrace throws java.lang.UnsupportedOperationException with IncompleteCb

2017-08-01 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109468#comment-16109468
 ] 

Sergey Shelukhin commented on HIVE-17222:
-

+1

> Llap: Iotrace throws  java.lang.UnsupportedOperationException with 
> IncompleteCb
> ---
>
> Key: HIVE-17222
> URL: https://issues.apache.org/jira/browse/HIVE-17222
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17222.1.patch
>
>
> branch: hive master 
> Running Q76 at 1 TB generates the following exception.
> {noformat}
> Caused by: java.io.IOException: java.lang.UnsupportedOperationException
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.rethrowErrorIfAny(LlapRecordReader.java:349)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.nextCvb(LlapRecordReader.java:304)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:244)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:67)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
> ... 23 more
> Caused by: java.lang.UnsupportedOperationException
> at 
> org.apache.hadoop.hive.common.io.DiskRange.getData(DiskRange.java:86)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.IoTrace.logRange(IoTrace.java:304)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.IoTrace.logRanges(IoTrace.java:291)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:328)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:426)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:250)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:247)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:247)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:96)
> ... 6 more
> {noformat}
> When {{IncompleteCb}} is encountered, it ends up throwing this error.
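
A purely illustrative sketch of the guard the fix needs, with a hypothetical 
{{Range}} stand-in for DiskRange: skip dereferencing the buffer for ranges that 
carry no data and log only their coordinates.

{code}
// Illustrative only: `Range` is a hypothetical stand-in for DiskRange,
// not Hive's actual class.
interface Range {
  long getOffset();
  long getEnd();
  boolean hasData();          // IncompleteCb-style ranges would return false
  java.nio.ByteBuffer getData();
}

class IoTraceSketch {
  void logRange(Range r) {
    if (!r.hasData()) {
      // An incomplete cache buffer has coordinates but no backing buffer yet,
      // so calling getData() would throw; log only the offsets instead.
      System.out.printf("range [%d, %d) (incomplete)%n", r.getOffset(), r.getEnd());
      return;
    }
    System.out.printf("range [%d, %d) data=%d%n",
        r.getOffset(), r.getEnd(), System.identityHashCode(r.getData()));
  }
}
{code}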



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17213) HoS: file merging doesn't work for union all

2017-08-01 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-17213:

Attachment: HIVE-17213.2.patch

Attaching patch v2 to address the case when linkedFileSinkDesc is not set.

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch, 
> HIVE-17213.2.patch
>
>
> HoS file merging doesn't work properly since it doesn't set the linked file 
> sinks properly, which are used to generate move tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17224) Move JDO classes to standalone metastore

2017-08-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates reassigned HIVE-17224:
-


> Move JDO classes to standalone metastore
> 
>
> Key: HIVE-17224
> URL: https://issues.apache.org/jira/browse/HIVE-17224
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>
> The JDO model classes (MDatabase, MTable, etc.) and the package.jdo file that 
> defines the DB mapping need to be moved to the standalone metastore.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17170) Move thrift generated code to stand alone metastore

2017-08-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17170:
--
Status: Patch Available  (was: Open)

> Move thrift generated code to stand alone metastore
> ---
>
> Key: HIVE-17170
> URL: https://issues.apache.org/jira/browse/HIVE-17170
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17170.patch
>
>
> hive_metastore.thrift and the code it generates need to be moved into the 
> standalone metastore module.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17170) Move thrift generated code to stand alone metastore

2017-08-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17170:
--
Attachment: HIVE-17170.patch

The patch is huge because it moves all the Thrift-generated files around.  It 
will be much easier to review via the PR.

> Move thrift generated code to stand alone metastore
> ---
>
> Key: HIVE-17170
> URL: https://issues.apache.org/jira/browse/HIVE-17170
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Attachments: HIVE-17170.patch
>
>
> hive_metastore.thrift and the code it generates need to be moved into the 
> standalone metastore module.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17170) Move thrift generated code to stand alone metastore

2017-08-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109406#comment-16109406
 ] 

ASF GitHub Bot commented on HIVE-17170:
---

GitHub user alanfgates opened a pull request:

https://github.com/apache/hive/pull/216

HIVE-17170 Move thrift generated code to stand alone metastore



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/alanfgates/hive hive17170

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/216.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #216


commit ffd8599cd3db1cbb0464606901dc6f73916bdc69
Author: Alan Gates 
Date:   2017-07-25T20:50:38Z

HIVE-17170 Move thrift generated code to stand alone metastore




> Move thrift generated code to stand alone metastore
> ---
>
> Key: HIVE-17170
> URL: https://issues.apache.org/jira/browse/HIVE-17170
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
>
> hive_metastore.thrift and the code it generates need to be moved into the 
> standalone metastore module.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17189) Fix backwards incompatibility in HiveMetaStoreClient

2017-08-01 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17189:
---
   Resolution: Fixed
Fix Version/s: 2.4.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master and branch-2. Thanks for the review [~alangates]

> Fix backwards incompatibility in HiveMetaStoreClient
> 
>
> Key: HIVE-17189
> URL: https://issues.apache.org/jira/browse/HIVE-17189
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.1
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17189.01.patch, HIVE-17189.02.patch
>
>
> HIVE-12730 adds the ability to edit the basic stats using {{alter table}} and 
> {{alter partition}} commands. However, it changes the signature of the @public 
> interface of MetastoreClient and removes some methods, which breaks backwards 
> compatibility. This can be fixed easily by re-introducing the removed methods 
> and making them call into the newly added method 
> {{alter_table_with_environment_context}}.
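
A minimal sketch of that shape of fix; the signatures and the Table type below 
are illustrative stand-ins, not copied from HiveMetaStoreClient.

{code}
import org.apache.thrift.TException;

// Illustrative back-compat shim: the removed method signature is restored
// and delegates to the environment-context variant. Not the actual patch.
public abstract class CompatShim {
  public static class Table { /* stand-in for the metastore Table type */ }

  /** Re-introduced legacy method; old callers keep compiling and linking. */
  @Deprecated
  public void alter_table(String dbName, String tblName, Table newTable)
      throws TException {
    // Delegate to the newly added variant with an empty environment context.
    alter_table_with_environment_context(dbName, tblName, newTable, null);
  }

  protected abstract void alter_table_with_environment_context(
      String dbName, String tblName, Table newTable, Object envContext)
      throws TException;
}
{code}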



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17222) Llap: Iotrace throws java.lang.UnsupportedOperationException with IncompleteCb

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109364#comment-16109364
 ] 

Hive QA commented on HIVE-17222:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879789/HIVE-17222.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11018 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6214/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6214/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6214/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879789 - PreCommit-HIVE-Build

> Llap: Iotrace throws  java.lang.UnsupportedOperationException with 
> IncompleteCb
> ---
>
> Key: HIVE-17222
> URL: https://issues.apache.org/jira/browse/HIVE-17222
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17222.1.patch
>
>
> branch: hive master 
> Running Q76 at 1 TB generates the following exception.
> {noformat}
> Caused by: java.io.IOException: java.lang.UnsupportedOperationException
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.rethrowErrorIfAny(LlapRecordReader.java:349)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.nextCvb(LlapRecordReader.java:304)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:244)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:67)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
> ... 23 more
> Caused by: java.lang.UnsupportedOperationException
> at 
> org.apache.hadoop.hive.common.io.DiskRange.getData(DiskRange.java:86)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.IoTrace.logRange(IoTrace.java:304)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.IoTrace.logRanges(IoTrace.java:291)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:328)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:426)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:250)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:247)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:247)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:96)
> ... 6 more
> {noformat}
> When {{IncompleteCb}} is encountered, it ends up throwing this error.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17167) Create metastore specific configuration tool

2017-08-01 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-17167:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Patch 2 committed.  Thanks Vihang for the reviews and feedback.

> Create metastore specific configuration tool
> 
>
> Key: HIVE-17167
> URL: https://issues.apache.org/jira/browse/HIVE-17167
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Alan Gates
>Assignee: Alan Gates
> Fix For: 3.0.0
>
> Attachments: HIVE-17167.2.patch, HIVE-17167.patch
>
>
> As part of making the metastore a separately releasable module, we need 
> configuration tools that are specific to that module.  It cannot use or 
> extend HiveConf, as that lives in hive-common.  But it must take a HiveConf 
> object and be able to operate on it.
> The best way to achieve this is using Hadoop's Configuration object (which 
> HiveConf extends) together with enums and static methods.
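
A minimal sketch of that enum-plus-static-methods pattern over a plain Hadoop 
Configuration (and therefore also over a HiveConf, which extends it); the names 
and keys below are illustrative, not the committed tool.

{code}
import org.apache.hadoop.conf.Configuration;

// Illustrative sketch of the described pattern; keys and constants are
// assumptions, not the actual committed configuration tool.
public class MetastoreConfSketch {
  public enum ConfVars {
    // Each enum constant carries its key and its default value.
    THRIFT_URIS("metastore.thrift.uris", ""),
    CONNECT_RETRIES("metastore.connect.retries", "3");

    private final String key;
    private final String defaultVal;

    ConfVars(String key, String defaultVal) {
      this.key = key;
      this.defaultVal = defaultVal;
    }
  }

  // Static accessors keep the module free of any HiveConf dependency.
  public static String getVar(Configuration conf, ConfVars var) {
    return conf.get(var.key, var.defaultVal);
  }

  public static void setVar(Configuration conf, ConfVars var, String value) {
    conf.set(var.key, value);
  }
}
{code}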



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17189) Fix backwards incompatibility in HiveMetaStoreClient

2017-08-01 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109212#comment-16109212
 ] 

Alan Gates commented on HIVE-17189:
---

+1

> Fix backwards incompatibility in HiveMetaStoreClient
> 
>
> Key: HIVE-17189
> URL: https://issues.apache.org/jira/browse/HIVE-17189
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.1.1
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17189.01.patch, HIVE-17189.02.patch
>
>
> HIVE-12730 adds the ability to edit the basic stats using {{alter table}} and 
> {{alter partition}} commands. However, it changes the signature of the @public 
> interface of MetastoreClient and removes some methods, which breaks backwards 
> compatibility. This can be fixed easily by re-introducing the removed methods 
> and making them call into the newly added method 
> {{alter_table_with_environment_context}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16845) INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE

2017-08-01 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109205#comment-16109205
 ] 

Sahil Takiar commented on HIVE-16845:
-

+1

> INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE
> -
>
> Key: HIVE-16845
> URL: https://issues.apache.org/jira/browse/HIVE-16845
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
> Attachments: HIVE-16845.1.patch, HIVE-16845.2.patch, 
> HIVE-16845.3.patch, HIVE-16845.4.patch
>
>
> *How to reproduce*
> - Create a partitioned table on S3:
> {noformat}
> CREATE EXTERNAL TABLE s3table(user_id string COMMENT '', event_name string 
> COMMENT '') PARTITIONED BY (reported_date string, product_id int) LOCATION 
> 's3a://'; 
> {noformat}
> - Create a temp table:
> {noformat}
> create table tmp_table (id string, name string, date string, pid int) row 
> format delimited fields terminated by '\t' lines terminated by '\n' stored as 
> textfile;
> {noformat}
> - Load the following rows to the tmp table:
> {noformat}
> u1  value1  2017-04-10  1
> u2  value2  2017-04-10  1
> u3  value3  2017-04-10  10001
> {noformat}
> - Set the following parameters:
> -- hive.exec.dynamic.partition.mode=nonstrict
> -- mapreduce.input.fileinputformat.split.maxsize=10
> -- hive.blobstore.optimizations.enabled=true
> -- hive.blobstore.use.blobstore.as.scratchdir=false
> -- hive.merge.mapfiles=true
> - Insert the rows from the temp table into the s3 table:
> {noformat}
> INSERT OVERWRITE TABLE s3table
> PARTITION (reported_date, product_id)
> SELECT
>   t.id as user_id,
>   t.name as event_name,
>   t.date as reported_date,
>   t.pid as product_id
> FROM tmp_table t;
> {noformat}
> An NPE will occur with the following stack trace:
> {noformat}
> 2017-05-08 21:32:50,607 ERROR 
> org.apache.hive.service.cli.operation.Operation: 
> [HiveServer2-Background-Pool: Thread-184028]: Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code -101 from 
> org.apache.hadoop.hive.ql.exec.ConditionalTask. null
> at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:239)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.plan.ConditionalResolverMergeFiles.generateActualTasks(ConditionalResolverMergeFiles.java:290)
> at 
> org.apache.hadoop.hive.ql.plan.ConditionalResolverMergeFiles.getTasks(ConditionalResolverMergeFiles.java:175)
> at 
> org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1977)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1690)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1422)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1206)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1201)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
> ... 11 more 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17208) Repl dump should pass in db/table information to authorization API

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109160#comment-16109160
 ] 

Hive QA commented on HIVE-17208:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879777/HIVE-17208.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11019 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[repl_dump_requires_admin]
 (batchId=90)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[repl_load_requires_admin]
 (batchId=90)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6213/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6213/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6213/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879777 - PreCommit-HIVE-Build

> Repl dump should pass in db/table information to authorization API
> --
>
> Key: HIVE-17208
> URL: https://issues.apache.org/jira/browse/HIVE-17208
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-17208.1.patch, HIVE-17208.2.patch
>
>
> "repl dump" does not provide db/table information. That is necessary for 
> authorization replication in ranger.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17190) Schema changes for bitvectors for unpartitioned tables

2017-08-01 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17190:

Status: Patch Available  (was: Open)

> Schema changes for bitvectors for unpartitioned tables
> --
>
> Key: HIVE-17190
> URL: https://issues.apache.org/jira/browse/HIVE-17190
> Project: Hive
>  Issue Type: Test
>  Components: Metastore, Statistics
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-17190.2.patch, HIVE-17190.3.patch
>
>
> Missed in HIVE-16997



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17190) Schema changes for bitvectors for unpartitioned tables

2017-08-01 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17190:

Status: Open  (was: Patch Available)

> Schema changes for bitvectors for unpartitioned tables
> --
>
> Key: HIVE-17190
> URL: https://issues.apache.org/jira/browse/HIVE-17190
> Project: Hive
>  Issue Type: Test
>  Components: Metastore, Statistics
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-17190.2.patch, HIVE-17190.3.patch
>
>
> Missed in HIVE-16997



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17190) Schema changes for bitvectors for unpartitioned tables

2017-08-01 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17190:

Attachment: HIVE-17190.3.patch

> Schema changes for bitvectors for unpartitioned tables
> --
>
> Key: HIVE-17190
> URL: https://issues.apache.org/jira/browse/HIVE-17190
> Project: Hive
>  Issue Type: Test
>  Components: Metastore, Statistics
>Affects Versions: 3.0.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-17190.2.patch, HIVE-17190.3.patch
>
>
> Missed in HIVE-16997



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-08-01 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109104#comment-16109104
 ] 

Sergio Peña commented on HIVE-16357:


[~pvary] Why is the DbNotificationListener called when creating a table fails? I 
see this event is fired inside the try block, and if the table creation fails and 
an exception is thrown, then this shouldn't be called, right? What am I missing? 
Btw, DbNotificationListener is a transactional listener, so it is only called 
inside the try block (not in the finally).

See this line
https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L1547
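
For illustration, here is the masking pattern under discussion in miniature, 
plus one generic way (Throwable.addSuppressed) to keep the root cause visible 
when a listener fails after the primary exception; the names are generic, not 
HMS code.

{code}
// Illustrative sketch only: a failure raised while notifying listeners can
// replace the original exception unless the first failure is preserved.
public class ListenerMasking {
  static void createTableCore() throws Exception {
    Exception primary = null;
    try {
      // Stands in for the MetaException thrown when mkdirs fails.
      throw new Exception("tblPath is not a directory or unable to create one");
    } catch (Exception e) {
      primary = e;
      throw e;
    } finally {
      try {
        notifyListeners(); // may itself fail, e.g. with FileNotFoundException
      } catch (Exception listenerFailure) {
        if (primary != null) {
          primary.addSuppressed(listenerFailure); // keep the root cause visible
        } else {
          throw listenerFailure;
        }
      }
    }
  }

  static void notifyListeners() throws Exception {
    throw new Exception("File file:/.../0 does not exist");
  }

  public static void main(String[] args) {
    try {
      createTableCore();
    } catch (Exception e) {
      System.out.println("caller sees: " + e.getMessage());
      for (Throwable s : e.getSuppressed()) {
        System.out.println("  suppressed: " + s.getMessage());
      }
    }
  }
}
{code}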

> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16357.01.patch, HIVE-16357.02.patch, 
> HIVE-16357.03.patch, HIVE-16357.04.patch
>
>
> If the directory for a Hive table could not be created, then the HMS will 
> throw a MetaException:
> {code}
>  if (tblPath != null) {
>   if (!wh.isDir(tblPath)) {
> if (!wh.mkdirs(tblPath, true)) {
>   throw new MetaException(tblPath
>   + " is not a directory or unable to create one");
> }
> madeDir = true;
>   }
> }
> {code}
> However in the finally block we always try to call the 
> DbNotificationListener, which in turn will also throw an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File 
> file:/.../0 does not exist
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.<init>(DbNotificationListener.java:203)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463)
>   at 
> 

[jira] [Commented] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-08-01 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109081#comment-16109081
 ] 

Peter Vary commented on HIVE-16357:
---

[~spena]: The original issue with the original code is the following:
- When the table creation fails
- DBNotificationListener will be triggered with incomplete data
- DBNotificationListener will throw a new exception which will mask the 
original from the user, making it harder to find the root problem

The patch fixes it by adding the check inside the DBNotificationListener, thus 
avoiding the second exception. Also, the patch makes sure that if there 
are any further exceptions thrown by the DBNotificationListener, or any other 
Listener configured by the user, they are caught, logged, and not propagated 
further.
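
To illustrate, a minimal sketch of such a resilient notification loop (not the 
actual patch; the helper class and method are hypothetical, while 
MetaStoreEventListener.onCreateTable is the real listener callback):

{code:java}
import java.util.List;

import org.apache.hadoop.hive.metastore.MetaStoreEventListener;
import org.apache.hadoop.hive.metastore.events.CreateTableEvent;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical helper illustrating the behaviour described above.
class ResilientNotifier {
  private static final Logger LOG = LoggerFactory.getLogger(ResilientNotifier.class);

  static void notifyCreateTable(List<MetaStoreEventListener> listeners,
      CreateTableEvent event) {
    for (MetaStoreEventListener listener : listeners) {
      try {
        // each listener decides whether to handle or skip the event
        listener.onCreateTable(event);
      } catch (Exception e) {
        // caught and logged: one faulty listener must not mask the original
        // exception or prevent the remaining listeners from being notified
        LOG.error("Listener " + listener.getClass().getName()
            + " failed on onCreateTable", e);
      }
    }
  }
}
{code}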

The rationale behind this solution is the following:
- Listeners can be configured by the user, so they can have different usages. 
For example, one listener might only collect failed table creation events. In 
this case the listeners should be notified of every event, and each listener 
should decide which events to handle and which to omit.
- DBNotificationListener, on the other hand, is not interested in the failed 
events, so it should skip them.
- Also, we thought that since listeners are configured by users, their code 
can be unstable or buggy. With this in mind, it would be good to be sure that they 
do not affect each other. So if there is an error in one listener, the 
other listeners should still be notified.

We might not be aware of every usage pattern of the Listeners, or we might be 
overcomplicating this Listener architecture. What do you think [~spena]? Is it 
worth being prepared for these use cases?

Thanks,
Peter

> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16357.01.patch, HIVE-16357.02.patch, 
> HIVE-16357.03.patch, HIVE-16357.04.patch
>
>
> If the directory for a Hive table could not be created, then the HMS will 
> throw a MetaException:
> {code}
>  if (tblPath != null) {
>   if (!wh.isDir(tblPath)) {
> if (!wh.mkdirs(tblPath, true)) {
>   throw new MetaException(tblPath
>   + " is not a directory or unable to create one");
> }
> madeDir = true;
>   }
> }
> {code}
> However in the finally block we always try to call the 
> DbNotificationListener, which in turn will also throw an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> 

[jira] [Commented] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-08-01 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109056#comment-16109056
 ] 

Sergio Peña commented on HIVE-16357:


[~zsombor.klara] I don't understand why triggering events for failed 
operations causes this error. It seems the fix will not address the original 
problem.

This patch fixes an issue that shouldn't cause issues on HMS. Btw, for a future 
fix, I would think it is better to check the status on the HiveMetaStore side 
itself and not trigger the event, instead of doing it in the 
DbNotificationListener. Even if this avoids triggering failed events, 
developers will still get confused about why notifyEvent() is called on the 
HiveMetaStore with failed transactions.



> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16357.01.patch, HIVE-16357.02.patch, 
> HIVE-16357.03.patch, HIVE-16357.04.patch
>
>
> If the directory for a Hive table could not be created, then the HMS will 
> throw a MetaException:
> {code}
>  if (tblPath != null) {
>   if (!wh.isDir(tblPath)) {
> if (!wh.mkdirs(tblPath, true)) {
>   throw new MetaException(tblPath
>   + " is not a directory or unable to create one");
> }
> madeDir = true;
>   }
> }
> {code}
> However in the finally block we always try to call the 
> DbNotificationListener, which in turn will also throw an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File 
> file:/.../0 does not exist
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.<init>(DbNotificationListener.java:203)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463)
>   at 
> 

[jira] [Commented] (HIVE-17144) export of temporary tables not working and it seems to be using distcp rather than filesystem copy

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109016#comment-16109016
 ] 

Hive QA commented on HIVE-17144:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879775/HIVE-17144.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11019 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]
 (batchId=240)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]
 (batchId=240)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6212/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6212/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6212/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879775 - PreCommit-HIVE-Build

> export of temporary tables not working and it seems to be using distcp rather 
> than filesystem copy
> --
>
> Key: HIVE-17144
> URL: https://issues.apache.org/jira/browse/HIVE-17144
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-17144.1.patch
>
>
> create temporary table t1 (i int);
> insert into t1 values (3);
> export table t1 to 'hdfs://somelocation';
> The above fails. Additionally, it should use a filesystem copy and not distcp to do 
> the job.
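
For reference, a minimal sketch of a plain filesystem copy (not the patch 
itself; the paths are placeholders):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class PlainCopy {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path src = new Path("/tmp/hive/export/t1");     // placeholder source
    Path dst = new Path("hdfs://somelocation/t1");  // placeholder target
    FileSystem srcFs = src.getFileSystem(conf);
    FileSystem dstFs = dst.getFileSystem(conf);
    // single-process filesystem copy; no distcp MR job is launched
    FileUtil.copy(srcFs, src, dstFs, dst, /* deleteSource */ false, conf);
  }
}
{code}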



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-08-01 Thread Aroop Maliakkal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108896#comment-16108896
 ] 

Aroop Maliakkal commented on HIVE-17115:


[~daijy] :: The table creations succeeded. We might have been using the direct mysql 
connection from the hive client when we were creating these tables. We have now 
enforced all clients to go through the metastore instead of a direct mysql connection.


> MetaStoreUtils.getDeserializer doesn't catch the 
> java.lang.ClassNotFoundException
> -
>
> Key: HIVE-17115
> URL: https://issues.apache.org/jira/browse/HIVE-17115
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Erik.fang
>Assignee: Erik.fang
> Attachments: HIVE-17115.1.patch, HIVE-17115.patch
>
>
> Suppose we create a table with a custom SerDe, then call 
> HiveMetaStoreClient.getSchema(String db, String tableName) to extract the 
> metadata from the HiveMetaStore service.
> The thrift client hangs, with an exception such as the following in the 
> HiveMetaStore service's log:
> {code:java}
> Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Bytes
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDeParameters.<init>(HBaseSerDeParameters.java:73)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636)
> at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9146)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9130)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.util.Bytes
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16896) move replication load related work in semantic analysis phase to execution phase using a task

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108868#comment-16108868
 ] 

Hive QA commented on HIVE-16896:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879776/HIVE-16896.1.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11019 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[repl_load_requires_admin]
 (batchId=90)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6211/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6211/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6211/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879776 - PreCommit-HIVE-Build

> move replication load related work in semantic analysis phase to execution 
> phase using a task
> -
>
> Key: HIVE-16896
> URL: https://issues.apache.org/jira/browse/HIVE-16896
> Project: Hive
>  Issue Type: Sub-task
>Reporter: anishek
>Assignee: anishek
> Attachments: HIVE-16896.1.patch
>
>
> We want to avoid creating too many tasks in memory in the analysis phase while 
> loading data. Currently we load all the files in the bootstrap dump location 
> as {{FileStatus[]}} and then iterate over it to load objects; we should 
> rather move to 
> {code}
> org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f, boolean recursive)
> {code}
> which would internally batch and return values. 
> Additionally, since we can't hand off partial tasks from the analysis phase => 
> execution phase, we are going to move the whole repl load functionality to the 
> execution phase so we can better control creation/execution of tasks (not 
> related to hive {{Task}}; we may get rid of ReplCopyTask).
> An additional consideration at the end of this jira is to 
> see if we want to specifically do a multi-threaded load of the bootstrap dump.
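
For reference, a minimal sketch of the batched listing described above (the 
dump path is a placeholder):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class BootstrapDumpWalker {
  public static void main(String[] args) throws Exception {
    Path dumpRoot = new Path("/tmp/repl/bootstrap");  // placeholder location
    FileSystem fs = dumpRoot.getFileSystem(new Configuration());
    // listFiles returns a RemoteIterator that fetches results in batches
    // instead of materializing the whole listing as a FileStatus[]
    RemoteIterator<LocatedFileStatus> files = fs.listFiles(dumpRoot, true);
    while (files.hasNext()) {
      LocatedFileStatus file = files.next();
      // hand each file off to load-task creation here, one at a time
      System.out.println(file.getPath());
    }
  }
}
{code}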



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17213) HoS: file merging doesn't work for union all

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108759#comment-16108759
 ] 

Hive QA commented on HIVE-17213:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879761/HIVE-17213.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 11018 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=46)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge1]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge2]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge3]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge4]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge5]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge6]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge7]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge9]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge_diff_fs]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[merge1] 
(batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[merge2] 
(batchId=101)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6210/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6210/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6210/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879761 - PreCommit-HIVE-Build

> HoS: file merging doesn't work for union all
> 
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Chao Sun
>Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch
>
>
> HoS file merging doesn't work since it doesn't set the linked file sinks 
> properly, which are used to generate move tasks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17194) JDBC: Implement Gzip servlet filter

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108682#comment-16108682
 ] 

Hive QA commented on HIVE-17194:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879759/HIVE-17194.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11012 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=242)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6209/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6209/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6209/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879759 - PreCommit-HIVE-Build

> JDBC: Implement Gzip servlet filter
> ---
>
> Key: HIVE-17194
> URL: https://issues.apache.org/jira/browse/HIVE-17194
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Affects Versions: 3.0.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-17194.1.patch, HIVE-17194.2.patch, 
> HIVE-17194.3.patch
>
>
> {code}
> POST /cliservice HTTP/1.1
> Content-Type: application/x-thrift
> Accept: application/x-thrift
> User-Agent: Java/THttpClient/HC
> Authorization: Basic YW5vbnltb3VzOmFub255bW91cw==
> Content-Length: 71
> Host: localhost:10007
> Connection: Keep-Alive
> Accept-Encoding: gzip,deflate
> X-XSRF-HEADER: true
> {code}
> The Beeline client clearly sends out HTTP compression headers which are 
> ignored by the HTTP service layer in HS2.
> After patch, result looks like
> {code}
> HTTP/1.1 200 OK
> Date: Tue, 01 Aug 2017 01:47:23 GMT
> Content-Type: application/x-thrift
> Vary: Accept-Encoding, User-Agent
> Content-Encoding: gzip
> Transfer-Encoding: chunked
> Server: Jetty(9.3.8.v20160314)
> {code}
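
For reference, a minimal Jetty 9.3 sketch of honoring these headers (not the 
HS2 patch itself; the port and bare context setup are illustrative):

{code:java}
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.handler.gzip.GzipHandler;
import org.eclipse.jetty.servlet.ServletContextHandler;

public class GzipHttpServer {
  public static void main(String[] args) throws Exception {
    Server server = new Server(10007);  // port taken from the request above
    ServletContextHandler context = new ServletContextHandler();
    context.setContextPath("/");
    // the /cliservice thrift servlet would be registered on the context here

    // wrap the context so responses honor Accept-Encoding: gzip
    GzipHandler gzip = new GzipHandler();
    gzip.setIncludedMimeTypes("application/x-thrift");
    gzip.setHandler(context);

    server.setHandler(gzip);
    server.start();
    server.join();
  }
}
{code}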



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException

2017-08-01 Thread Erik.fang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108672#comment-16108672
 ] 

Erik.fang commented on HIVE-17115:
--

In our cluster, I think both a hive client with a local-mode metastore and a 
standalone metastore service are deployed, sharing the same backend mysql.
In the hive client with the local-mode metastore, users can add jars themselves, so they 
can add hbase.jar and create the table.
However, the metastore service doesn't load hbase.jar, so a NoClassDefFoundError 
is raised by HiveMetaStoreClient.getSchema.

This might be a deployment issue; however, it is never appropriate to miss 
the NoClassDefFoundError and crash the worker thread in the metastore service.
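
As an illustration, a minimal sketch of the defensive loading this asks for 
(not the actual patch; the helper is hypothetical, and it assumes the fix 
converts the Error into a checked MetaException):

{code:java}
import org.apache.hadoop.hive.metastore.api.MetaException;

// Hypothetical helper: load and instantiate a SerDe class by name,
// converting a missing-class Error into a checked MetaException so the
// metastore worker thread survives.
public class SafeSerDeLoader {
  public static Object instantiate(String serdeClassName) throws MetaException {
    try {
      Class<?> clazz = Class.forName(serdeClassName, true,
          Thread.currentThread().getContextClassLoader());
      return clazz.newInstance();
    } catch (NoClassDefFoundError | ReflectiveOperationException e) {
      // surface the problem to the thrift client instead of letting the
      // Error propagate and kill the pool thread
      throw new MetaException("SerDe class " + serdeClassName
          + " not found or failed to initialize: " + e);
    }
  }
}
{code}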



> MetaStoreUtils.getDeserializer doesn't catch the 
> java.lang.ClassNotFoundException
> -
>
> Key: HIVE-17115
> URL: https://issues.apache.org/jira/browse/HIVE-17115
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1
>Reporter: Erik.fang
>Assignee: Erik.fang
> Attachments: HIVE-17115.1.patch, HIVE-17115.patch
>
>
> Suppose we create a table with a custom SerDe, then call 
> HiveMetaStoreClient.getSchema(String db, String tableName) to extract the 
> metadata from the HiveMetaStore service.
> The thrift client hangs, with an exception such as the following in the 
> HiveMetaStore service's log:
> {code:java}
> Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/util/Bytes
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDeParameters.<init>(HBaseSerDeParameters.java:73)
> at 
> org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117)
> at 
> org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
> at 
> org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636)
> at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
> at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown 
> Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9146)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9130)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.util.Bytes
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-08-01 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108643#comment-16108643
 ] 

Peter Vary commented on HIVE-16357:
---

Thanks for the patch [~zsombor.klara]!
I like this solution. +1 pending tests.

Peter

> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16357.01.patch, HIVE-16357.02.patch, 
> HIVE-16357.03.patch, HIVE-16357.04.patch
>
>
> If the directory for a Hive table could not be created, then the HMS will 
> throw a MetaException:
> {code}
>  if (tblPath != null) {
>   if (!wh.isDir(tblPath)) {
> if (!wh.mkdirs(tblPath, true)) {
>   throw new MetaException(tblPath
>   + " is not a directory or unable to create one");
> }
> madeDir = true;
>   }
> }
> {code}
> However in the finally block we always try to call the 
> DbNotificationListener, which in turn will also throw an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File 
> file:/.../0 does not exist
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.<init>(DbNotificationListener.java:203)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1482)
>   ... 20 more
> Caused by: java.io.FileNotFoundException: File file:/.../0 does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:429)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1555)
>   at 
> 

[jira] [Commented] (HIVE-14013) Describe table doesn't show unicode properly

2017-08-01 Thread hzfeng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108625#comment-16108625
 ] 

hzfeng commented on HIVE-14013:
---

Thanks a lot!
Would it be convenient for you to explain how you configured hive-2.3.0?
I would appreciate it.


> Describe table doesn't show unicode properly
> 
>
> Key: HIVE-14013
> URL: https://issues.apache.org/jira/browse/HIVE-14013
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.3.0
>
> Attachments: HIVE-14013.1.patch, HIVE-14013.2.patch, 
> HIVE-14013.3.patch, HIVE-14013.4.patch
>
>
> Describe table output will show comments incorrectly rather than the unicode 
> itself.
> {noformat}
> hive> desc formatted t1;
> # Detailed Table Information 
> Table Type: MANAGED_TABLE
> Table Parameters:
> COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
> comment \u8868\u4E2D\u6587\u6D4B\u8BD5
> numFiles0   
> {noformat}
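
For illustration, a small standalone decode of the escaped comment above, 
showing the unicode text that desc formatted should display:

{code:java}
public class UnescapeCheck {
  public static void main(String[] args) {
    // the comment as stored/displayed by desc formatted (escaped form)
    String stored = "\\u8868\\u4E2D\\u6587\\u6D4B\\u8BD5";
    StringBuilder decoded = new StringBuilder();
    int i = 0;
    while (i < stored.length()) {
      if (stored.startsWith("\\u", i)) {
        // parse the 4 hex digits following \u into a single char
        decoded.append((char) Integer.parseInt(stored.substring(i + 2, i + 6), 16));
        i += 6;
      } else {
        decoded.append(stored.charAt(i++));
      }
    }
    System.out.println(decoded);  // prints the original unicode comment
  }
}
{code}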



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-08-01 Thread Barna Zsombor Klara (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108620#comment-16108620
 ] 

Barna Zsombor Klara edited comment on HIVE-16357 at 8/1/17 9:25 AM:


Removed changes from the MetaStore class and made notifications more resilient. 
An exception from one listener should not affect the rest of the listeners.


was (Author: zsombor.klara):
Remove changes from the MetaStore class and made notifications more resilient. 
An exception from one listener should not affect the rest of the listeners.

> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16357.01.patch, HIVE-16357.02.patch, 
> HIVE-16357.03.patch, HIVE-16357.04.patch
>
>
> If the directory for a Hive table could not be created, then the HMS will 
> throw a MetaException:
> {code}
>  if (tblPath != null) {
>   if (!wh.isDir(tblPath)) {
> if (!wh.mkdirs(tblPath, true)) {
>   throw new MetaException(tblPath
>   + " is not a directory or unable to create one");
> }
> madeDir = true;
>   }
> }
> {code}
> However in the finally block we always try to call the 
> DbNotificationListener, which in turn will also throw an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File 
> file:/.../0 does not exist
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.<init>(DbNotificationListener.java:203)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1482)
>   ... 20 more
> Caused by: java.io.FileNotFoundException: File 

[jira] [Updated] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly

2017-08-01 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-16357:
---
Attachment: HIVE-16357.04.patch

Removed changes from the MetaStore class and made notifications more resilient. 
An exception from one listener should not affect the rest of the listeners.

> Failed folder creation when creating a new table is reported incorrectly
> 
>
> Key: HIVE-16357
> URL: https://issues.apache.org/jira/browse/HIVE-16357
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 2.3.0, 3.0.0
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HIVE-16357.01.patch, HIVE-16357.02.patch, 
> HIVE-16357.03.patch, HIVE-16357.04.patch
>
>
> If the directory for a Hive table could not be created, then the HMS will 
> throw a MetaException:
> {code}
>  if (tblPath != null) {
>   if (!wh.isDir(tblPath)) {
> if (!wh.mkdirs(tblPath, true)) {
>   throw new MetaException(tblPath
>   + " is not a directory or unable to create one");
> }
> madeDir = true;
>   }
> }
> {code}
> However in the finally block we always try to call the 
> DbNotificationListener, which in turn will also throw an exception because 
> the directory is missing, overwriting the initial exception with a 
> FileNotFoundException.
> Actual stacktrace seen by the caller:
> {code}
> 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: 
> MetaException(message:java.lang.RuntimeException: 
> java.io.FileNotFoundException: File file:/.../0 does not exist)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
>   at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File 
> file:/.../0 does not exist
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.<init>(DbNotificationListener.java:203)
>   at 
> org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1482)
>   ... 20 more
> Caused by: java.io.FileNotFoundException: File file:/.../0 does not exist
>   at 
> org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:429)
>   at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515)
>   at 

[jira] [Commented] (HIVE-16845) INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE

2017-08-01 Thread Marta Kuczora (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108598#comment-16108598
 ] 

Marta Kuczora commented on HIVE-16845:
--

[~stakiar], [~pvary], could you please have a look at the patch?

> INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE
> -
>
> Key: HIVE-16845
> URL: https://issues.apache.org/jira/browse/HIVE-16845
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
> Attachments: HIVE-16845.1.patch, HIVE-16845.2.patch, 
> HIVE-16845.3.patch, HIVE-16845.4.patch
>
>
> *How to reproduce*
> - Create a partitioned table on S3:
> {noformat}
> CREATE EXTERNAL TABLE s3table(user_id string COMMENT '', event_name string 
> COMMENT '') PARTITIONED BY (reported_date string, product_id int) LOCATION 
> 's3a://'; 
> {noformat}
> - Create a temp table:
> {noformat}
> create table tmp_table (id string, name string, date string, pid int) row 
> format delimited fields terminated by '\t' lines terminated by '\n' stored as 
> textfile;
> {noformat}
> - Load the following rows to the tmp table:
> {noformat}
> u1value1  2017-04-10  1
> u2value2  2017-04-10  1
> u3value3  2017-04-10  10001
> {noformat}
> - Set the following parameters:
> -- hive.exec.dynamic.partition.mode=nonstrict
> -- mapreduce.input.fileinputformat.split.maxsize=10
> -- hive.blobstore.optimizations.enabled=true
> -- hive.blobstore.use.blobstore.as.scratchdir=false
> -- hive.merge.mapfiles=true
> - Insert the rows from the temp table into the s3 table:
> {noformat}
> INSERT OVERWRITE TABLE s3table
> PARTITION (reported_date, product_id)
> SELECT
>   t.id as user_id,
>   t.name as event_name,
>   t.date as reported_date,
>   t.pid as product_id
> FROM tmp_table t;
> {noformat}
> An NPE will occur with the following stacktrace:
> {noformat}
> 2017-05-08 21:32:50,607 ERROR 
> org.apache.hive.service.cli.operation.Operation: 
> [HiveServer2-Background-Pool: Thread-184028]: Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code -101 from 
> org.apache.hadoop.hive.ql.exec.ConditionalTask. null
> at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:239)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.plan.ConditionalResolverMergeFiles.generateActualTasks(ConditionalResolverMergeFiles.java:290)
> at 
> org.apache.hadoop.hive.ql.plan.ConditionalResolverMergeFiles.getTasks(ConditionalResolverMergeFiles.java:175)
> at 
> org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1977)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1690)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1422)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1206)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1201)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
> ... 11 more 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-14261) Support set/unset partition parameters

2017-08-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108587#comment-16108587
 ] 

Hive QA commented on HIVE-14261:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12818468/HIVE-14261.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11019 tests 
executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite]
 (batchId=240)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] 
(batchId=99)
org.apache.hadoop.hive.metastore.TestMarkPartitionRemote.testMarkingPartitionSet
 (batchId=214)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema
 (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6208/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6208/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6208/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12818468 - PreCommit-HIVE-Build

> Support set/unset partition parameters
> --
>
> Key: HIVE-14261
> URL: https://issues.apache.org/jira/browse/HIVE-14261
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14261.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17222) Llap: Iotrace throws java.lang.UnsupportedOperationException with IncompleteCb

2017-08-01 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan reassigned HIVE-17222:
---

Assignee: Rajesh Balamohan

> Llap: Iotrace throws  java.lang.UnsupportedOperationException with 
> IncompleteCb
> ---
>
> Key: HIVE-17222
> URL: https://issues.apache.org/jira/browse/HIVE-17222
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-17222.1.patch
>
>
> branch: hive master 
> Running Q76 at 1 TB generates the following exception.
> {noformat}
> Caused by: java.io.IOException: java.lang.UnsupportedOperationException
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.rethrowErrorIfAny(LlapRecordReader.java:349)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.nextCvb(LlapRecordReader.java:304)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:244)
> at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:67)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
> ... 23 more
> Caused by: java.lang.UnsupportedOperationException
> at 
> org.apache.hadoop.hive.common.io.DiskRange.getData(DiskRange.java:86)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.IoTrace.logRange(IoTrace.java:304)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.IoTrace.logRanges(IoTrace.java:291)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:328)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:426)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:250)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:247)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:247)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:96)
> ... 6 more
> {noformat}
> When {{IncompleteCb}} is encountered, it ends up throwing this error.
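
For illustration, a hypothetical guard (not the committed fix; it assumes 
DiskRange's hasData() accessor, which data-less ranges such as IncompleteCb 
leave false):

{code:java}
import org.apache.hadoop.hive.common.io.DiskRange;

public class IoTraceGuard {
  static void logRangeSafely(DiskRange range) {
    if (!range.hasData()) {
      // fall back to toString() for data-less ranges instead of getData(),
      // which throws UnsupportedOperationException on them
      System.out.println("range without data: " + range);
      return;
    }
    // safe: this range actually holds a buffer
    System.out.println("range with " + range.getData().remaining() + " bytes");
  }
}
{code}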



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

