[jira] [Commented] (HIVE-17144) export of temporary tables not working and it seems to be using distcp rather than filesystem copy
[ https://issues.apache.org/jira/browse/HIVE-17144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110358#comment-16110358 ]

anishek commented on HIVE-17144:

* org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2]: runs fine on the local machine after increasing the mvn test memory
* org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1]: runs fine on the local machine
* org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed]: runs fine on the local machine

The other failures are from previous builds. [~daijy] can you please review!

> export of temporary tables not working and it seems to be using distcp rather
> than filesystem copy
> --
>
> Key: HIVE-17144
> URL: https://issues.apache.org/jira/browse/HIVE-17144
> Project: Hive
> Issue Type: Bug
> Components: Hive, HiveServer2
> Affects Versions: 3.0.0
> Reporter: anishek
> Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-17144.1.patch
>
>
> create temporary table t1 (i int);
> insert into t1 values (3);
> export table t1 to 'hdfs://somelocation';
> The above fails. Additionally, it should use a filesystem copy and not distcp to do
> the job.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (HIVE-17217) SMB Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger
[ https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110331#comment-16110331 ]

Gopal V commented on HIVE-17217:

Thanks, LGTM - +1.

> SMB Join : Assert if paths are different in TezGroupedSplit in
> KeyValueInputMerger
> --
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Attachments: HIVE-17217.1.patch, HIVE-17217.2.patch
>
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split.
> However, the splits should all belong to the same path.
[jira] [Updated] (HIVE-17213) HoS: file merging doesn't work for union all
[ https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao Sun updated HIVE-17213:

Attachment: HIVE-17213.3.patch

> HoS: file merging doesn't work for union all
> --
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Chao Sun
> Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch,
> HIVE-17213.2.patch, HIVE-17213.3.patch
>
>
> HoS file merging doesn't work properly since it doesn't set linked file sinks
> properly, which are used to generate move tasks.
[jira] [Updated] (HIVE-17217) SMB Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger
[ https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deepak Jaiswal updated HIVE-17217:
--

Attachment: HIVE-17217.2.patch

Updated the patch with cleaner code.

> SMB Join : Assert if paths are different in TezGroupedSplit in
> KeyValueInputMerger
> --
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Attachments: HIVE-17217.1.patch, HIVE-17217.2.patch
>
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split.
> However, the splits should all belong to the same path.
[jira] [Updated] (HIVE-17213) HoS: file merging doesn't work for union all
[ https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao Sun updated HIVE-17213:

Attachment: (was: HIVE-17213.3.patch)

> HoS: file merging doesn't work for union all
> --
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Chao Sun
> Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch,
> HIVE-17213.2.patch
>
>
> HoS file merging doesn't work properly since it doesn't set linked file sinks
> properly, which are used to generate move tasks.
[jira] [Assigned] (HIVE-17227) Incremental replication load should create tasks in execution phase rather than semantic phase
[ https://issues.apache.org/jira/browse/HIVE-17227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

anishek reassigned HIVE-17227:
--

Assignee: anishek

> Incremental replication load should create tasks in execution phase rather
> than semantic phase
>
> Key: HIVE-17227
> URL: https://issues.apache.org/jira/browse/HIVE-17227
> Project: Hive
> Issue Type: Sub-task
> Components: Hive, HiveServer2
> Affects Versions: 3.0.0
> Reporter: anishek
> Assignee: anishek
> Fix For: 3.0.0
>
>
> As we did for the bootstrap replication load in HIVE-16896, we should use a
> mechanism to dynamically create the DAG for incremental replication as well.
[jira] [Commented] (HIVE-17213) HoS: file merging doesn't work for union all
[ https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110312#comment-16110312 ]

Hive QA commented on HIVE-17213:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879946/HIVE-17213.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6224/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6224/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6224/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-08-02 05:07:15.923
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-6224/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-08-02 05:07:15.926
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 5c147f0 HIVE-17209: ObjectCacheFactory should return null when tez shared object registry is not setup (Rajesh Balamohan, reviewed by Sergey Shelukhin)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 5c147f0 HIVE-17209: ObjectCacheFactory should return null when tez shared object registry is not setup (Rajesh Balamohan, reviewed by Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-08-02 05:07:22.269
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: patch failed: llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java:181
error: llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java: patch does not apply
error: patch failed: ql/src/test/queries/clientpositive/llap_smb.q:1
error: ql/src/test/queries/clientpositive/llap_smb.q: patch does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879946 - PreCommit-HIVE-Build

> HoS: file merging doesn't work for union all
> --
>
> Key: HIVE-17213
> URL: https://issues.apache.org/jira/browse/HIVE-17213
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Chao Sun
> Assignee: Chao Sun
> Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch,
> HIVE-17213.2.patch, HIVE-17213.3.patch
>
>
> HoS file merging doesn't work properly since it doesn't set linked file sinks
> properly, which are used to generate move tasks.
[jira] [Commented] (HIVE-17217) SMB Join : Assert if paths are different in TezGroupedSplit in KeyValueInputMerger
[ https://issues.apache.org/jira/browse/HIVE-17217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110309#comment-16110309 ]

Gopal V commented on HIVE-17217:

The patch took me a few reads to understand. If someone removes the assert, it ends up with the last path replacing all others, which might not be obvious.

{code}
for (int i = 0; i < splits.size(); i++) {
{code}

is better written with i = 1, so that the loop only compares and doesn't do a put().

+1, with that minor nit.

> SMB Join : Assert if paths are different in TezGroupedSplit in
> KeyValueInputMerger
> --
>
> Key: HIVE-17217
> URL: https://issues.apache.org/jira/browse/HIVE-17217
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Attachments: HIVE-17217.1.patch
>
>
> In KeyValueInputMerger, a TezGroupedSplit may contain more than one split.
> However, the splits should all belong to the same path.
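The reviewer's suggestion can be sketched in isolation (a minimal, hypothetical stand-in — the class, method, and the plain `List<String>` of paths are illustrative, not Hive's actual KeyValueInputMerger code): starting the loop at i = 1 turns it into pure comparisons against the first split's path, so there is no put() whose last-entry-wins overwrite the assert would otherwise be masking.

```java
import java.util.Arrays;
import java.util.List;

public class PathCheck {
    // Compare every split's path against the first one, starting at i = 1.
    // No map put() is involved, so removing an assert around this check
    // cannot silently let a later path replace the earlier ones.
    static boolean allSamePath(List<String> splitPaths) {
        for (int i = 1; i < splitPaths.size(); i++) {
            if (!splitPaths.get(0).equals(splitPaths.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allSamePath(Arrays.asList("/a", "/a", "/a"))); // true
        System.out.println(allSamePath(Arrays.asList("/a", "/b")));       // false
    }
}
```

With i = 0 the first iteration would compare (or store) the first path against itself, which is where the redundant put() the reviewer objects to would live.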
[jira] [Commented] (HIVE-17089) make acid 2.0 the default
[ https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110308#comment-16110308 ]

Hive QA commented on HIVE-17089:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879936/HIVE-17089.03.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 11003 tests executed

*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=236)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=74)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_3] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168)
org.apache.hadoop.hive.ql.io.TestAcidUtils.testAcidOperationalPropertiesSettersAndGetters (batchId=262)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testEmpty (batchId=265)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testNewBaseAndDelta (batchId=265)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderDelta (batchId=265)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderIncompleteDelta (batchId=265)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta (batchId=265)
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderOldBaseAndDelta (batchId=265)
org.apache.hadoop.hive.ql.io.orc.TestOrcRecordUpdater.testUpdates (batchId=265)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179)
org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testMultipleTransactionBatchCommits (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbortAndCommit (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Delimited (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_DelimitedUGI (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Regex (batchId=191)
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_RegexUGI (batchId=191)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testMulti (batchId=191)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchCommitPartitioned (batchId=191)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testTransactionBatchCommitUnpartitioned (batchId=191)
org.apache.hive.hcatalog.streaming.mutate.TestMutations.testUpdatesAndDeletes (batchId=191)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6223/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6223/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6223/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879936 - PreCommit-HIVE-Build

> make acid 2.0 the default
> -
>
> Key: HIVE-17089
> URL: https://issues.apache.org/jira/browse/HIVE-17089
> Project: Hive
> Issue Type: New Feature
> Components: Transactions
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch
>
>
> acid 2.0 is introduced in HIVE-14035. It replaces Update events with a
> combination of Delete + Insert events. This now makes U=D+I the default (and
> only) supported acid table type in Hive 3.0.
> The expectation for upgrade is that Major compaction has to be run on all
> acid tables in the existing Hive cluster and that no new writes to these
> tables take place after the start of compaction (we need to add a mechanism to
> put a table in read-only mode; this way it can still be read while it's
> being compacted). Then the upgrade to Hive 3.0 can take place.
[jira] [Updated] (HIVE-12631) LLAP: support ORC ACID tables
[ https://issues.apache.org/jira/browse/HIVE-12631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Teddy Choi updated HIVE-12631:
--

Attachment: HIVE-12631.26.patch

> LLAP: support ORC ACID tables
> -
>
> Key: HIVE-12631
> URL: https://issues.apache.org/jira/browse/HIVE-12631
> Project: Hive
> Issue Type: Bug
> Components: llap, Transactions
> Reporter: Sergey Shelukhin
> Assignee: Teddy Choi
> Attachments: HIVE-12631.10.patch, HIVE-12631.10.patch,
> HIVE-12631.11.patch, HIVE-12631.11.patch, HIVE-12631.12.patch,
> HIVE-12631.13.patch, HIVE-12631.15.patch, HIVE-12631.16.patch,
> HIVE-12631.17.patch, HIVE-12631.18.patch, HIVE-12631.19.patch,
> HIVE-12631.1.patch, HIVE-12631.20.patch, HIVE-12631.21.patch,
> HIVE-12631.22.patch, HIVE-12631.23.patch, HIVE-12631.24.patch,
> HIVE-12631.25.patch, HIVE-12631.26.patch, HIVE-12631.2.patch,
> HIVE-12631.3.patch, HIVE-12631.4.patch, HIVE-12631.5.patch,
> HIVE-12631.6.patch, HIVE-12631.7.patch, HIVE-12631.8.patch,
> HIVE-12631.8.patch, HIVE-12631.9.patch
>
>
> LLAP uses a completely separate read path in ORC to allow for caching and
> parallelization of reads and processing. This path does not support ACID. As
> far as I remember, ACID logic is embedded inside the ORC format; we need to
> refactor it to be on top of some interface, if practical, or just port it to
> the LLAP read path.
> Another consideration is how the logic will work with the cache. The cache is
> currently low-level (CB-level in ORC), so we could just use it to read bases
> and deltas (deltas should be cached with higher priority) and merge as usual.
> We could also cache the merged representation in the future.
[jira] [Commented] (HIVE-16896) move replication load related work in semantic analysis phase to execution phase using a task
[ https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110279#comment-16110279 ]

anishek commented on HIVE-16896:

Not sure why the pull request is not shown here: https://github.com/apache/hive/pull/214

> move replication load related work in semantic analysis phase to execution
> phase using a task
> -
>
> Key: HIVE-16896
> URL: https://issues.apache.org/jira/browse/HIVE-16896
> Project: Hive
> Issue Type: Sub-task
> Reporter: anishek
> Assignee: anishek
> Attachments: HIVE-16896.1.patch
>
>
> We want to avoid creating too many tasks in memory in the analysis phase while
> loading data. Currently we load all the files in the bootstrap dump location
> as {{FileStatus[]}} and then iterate over it to load objects; we should
> rather move to
> {code}
> org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path f, boolean recursive)
> {code}
> which would internally batch and return values.
> Additionally, since we can't hand off partial tasks from the analysis phase to the
> execution phase, we are going to move the whole repl load functionality to the
> execution phase so we can better control creation/execution of tasks (not
> related to hive {{Task}}; we may get rid of ReplCopyTask).
> An additional consideration at the end of this jira is to see if we want to
> specifically do a multi-threaded load of the bootstrap dump.
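The shift described in the issue, from materializing a full {{FileStatus[]}} up front to a remote iterator that fetches results in batches, can be illustrated with a pure-Java analogue. This is a hypothetical sketch, not Hadoop code: `BatchingIterator` only simulates, with an in-memory list, the way Hadoop's real `RemoteIterator` pulls listing batches from the NameNode on demand instead of holding every entry in memory at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class BatchedListing {
    // Hypothetical stand-in for a remote, batching listing iterator: only one
    // batch of entries is held in memory at a time, and the next batch is
    // fetched lazily when the current one is exhausted.
    static class BatchingIterator implements Iterator<String> {
        private final List<String> source; // stands in for the remote listing
        private final int batchSize;
        private final List<String> batch = new ArrayList<>();
        private int fetched = 0; // how many entries have been pulled so far
        private int pos = 0;     // cursor within the current batch

        BatchingIterator(List<String> source, int batchSize) {
            this.source = source;
            this.batchSize = batchSize;
        }

        @Override
        public boolean hasNext() {
            if (pos < batch.size()) return true;
            if (fetched >= source.size()) return false;
            // Fetch the next batch lazily, as a remote iterator would.
            batch.clear();
            pos = 0;
            int end = Math.min(fetched + batchSize, source.size());
            batch.addAll(source.subList(fetched, end));
            fetched = end;
            return true;
        }

        @Override
        public String next() {
            if (!hasNext()) throw new NoSuchElementException();
            return batch.get(pos++);
        }
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("f1", "f2", "f3", "f4", "f5");
        BatchingIterator it = new BatchingIterator(files, 2);
        int count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        System.out.println(count); // 5
    }
}
```

The caller's loop is identical either way; the memory win comes entirely from `hasNext()` deferring each fetch until the previous batch is consumed, which is the property the issue wants for bootstrap loads with very large dump directories.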
[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList
[ https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110257#comment-16110257 ]

Deepak Jaiswal commented on HIVE-17172:
---

Thanks for adding comments. As far as I understand, it looks good. An RB link would have been more helpful in understanding the code, though.

+1

> add ordering checks to DiskRangeList
>
> Key: HIVE-17172
> URL: https://issues.apache.org/jira/browse/HIVE-17172
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-17172.01.patch, HIVE-17172.02.patch,
> HIVE-17172.patch
>
[jira] [Commented] (HIVE-17170) Move thrift generated code to stand alone metastore
[ https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110227#comment-16110227 ]

Hive QA commented on HIVE-17170:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12879930/HIVE-17170.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11040 tests executed

*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=236)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpRetryOnServerIdleTimeout (batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6222/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6222/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6222/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12879930 - PreCommit-HIVE-Build

> Move thrift generated code to stand alone metastore
> ---
>
> Key: HIVE-17170
> URL: https://issues.apache.org/jira/browse/HIVE-17170
> Project: Hive
> Issue Type: Sub-task
> Components: Metastore
> Reporter: Alan Gates
> Assignee: Alan Gates
> Attachments: HIVE-17170.2.patch, HIVE-17170.patch
>
>
> hive_metastore.thrift and the code it generates needs to be moved into the
> standalone metastore module.
[jira] [Comment Edited] (HIVE-17225) HoS DPP pruning sink ops can target parallel work objects
[ https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110187#comment-16110187 ]

Sahil Takiar edited comment on HIVE-17225 at 8/2/17 3:31 AM:
-

The query above fails because the {{Spark Partition Pruning Sink Operator}} in Map 3 has a target work of Map 1. This means that when Map 1 runs, it will look for a tmp file on HDFS that contains all the partitions it should scan. The problem is that Map 1 and Map 3 run in parallel, so Map 1 will fail with a FNF exception.

Here is a brief explanation of what's happening in the query above. There are three tables: pt1, rt1 and rt2. pt1 is partitioned; rt1 and rt2 aren't. In terms of data size, pt1 < rt1 = rt2. Map-joins are enabled. Since pt1 is the smallest table, it is scanned and written to a hash table. rt2 is also scanned and written to a hash table. rt1 is treated as the big table in the map-join. The hash tables for pt1 and rt2 are generated in the same Spark job. If DPP is enabled, then the scan for rt2 will result in a pruning sink targeting the scan for pt1. This causes the FNF exception shown above, because the scans for rt2 and pt1 run in parallel.

was (Author: stakiar):
Here is another example of when this can happen. Say there are three tables: pt1, pt2 and r1. pt1 and pt2 are partitioned and r1 is not. In terms of data size, pt1 < r1 < pt2. If map-joins are enabled and all three tables are joined, the following scenario may occur. pt1 is scanned and written to a hash table; r1 is scanned and written to a hash table; pt2 is treated as the big table in the map-join. The hash tables for pt1 and r1 are generated in the same Spark job. If DPP is enabled, then the scan for r1 will result in a pruning sink targeting the scan for pt1. This causes the FNF exception shown above, because the scans for r1 and pt1 run in parallel.
> HoS DPP pruning sink ops can target parallel work objects
> -
>
> Key: HIVE-17225
> URL: https://issues.apache.org/jira/browse/HIVE-17225
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Affects Versions: 3.0.0
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
>
> Setup:
> {code:sql}
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.strict.checks.cartesian.product=false;
> SET hive.auto.convert.join=true;
> CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
> CREATE TABLE regular_table1 (col int);
> CREATE TABLE regular_table2 (col int);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);
> INSERT INTO table regular_table1 VALUES (1), (2), (3), (4), (5), (6);
> INSERT INTO table regular_table2 VALUES (1), (2), (3), (4), (5), (6);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (2);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (3);
> SELECT *
> FROM partitioned_table1,
>      regular_table1 rt1,
>      regular_table2 rt2
> WHERE rt1.col = partitioned_table1.part_col
>   AND rt2.col = partitioned_table1.part_col;
> {code}
> Exception:
> {code}
> 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main]
> ql.Driver: FAILED: Execution Error, return code 3 from
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.
> java.lang.RuntimeException:
> org.apache.hadoop.hive.ql.metadata.HiveException:
> java.io.FileNotFoundException: File
> file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
> does not exist
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
> at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
> at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
> at scala.Option.getOrElse(Option.scala:121)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
> at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
> at scala.Option.getOrElse(Option.scala:121)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
> at
[jira] [Updated] (HIVE-17225) HoS DPP pruning sink ops can target parallel work objects
[ https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sahil Takiar updated HIVE-17225:

Description:
Setup:
{code:sql}
SET hive.spark.dynamic.partition.pruning=true;
SET hive.strict.checks.cartesian.product=false;
SET hive.auto.convert.join=true;

CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
CREATE TABLE regular_table1 (col int);
CREATE TABLE regular_table2 (col int);

ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);

INSERT INTO table regular_table1 VALUES (1), (2), (3), (4), (5), (6);
INSERT INTO table regular_table2 VALUES (1), (2), (3), (4), (5), (6);

INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1);
INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (2);
INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (3);

SELECT *
FROM partitioned_table1,
     regular_table1 rt1,
     regular_table2 rt2
WHERE rt1.col = partitioned_table1.part_col
  AND rt2.col = partitioned_table1.part_col;
{code}

Exception:
{code}
2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] ql.Driver: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask.
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.FileNotFoundException: File file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5 does not exist
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
    at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
    at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.immutable.List.map(List.scala:285)
    at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
    at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
    at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.immutable.List.map(List.scala:285)
    at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
    at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
    at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at
[jira] [Commented] (HIVE-17225) HoS DPP pruning sink ops can target parallel work objects
[ https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110189#comment-16110189 ]

Sahil Takiar commented on HIVE-17225:
-

A simple solution would be to add a rule that removes a DPP branch whenever the target work is in a parallel work object. A trickier solution would be to split the work objects into different stages (and thus different Spark jobs), or to add some other dependency between the work objects (not sure if a Map Work can have a dependency on another Map Work).

> HoS DPP pruning sink ops can target parallel work objects
> -
>
> Key: HIVE-17225
> URL: https://issues.apache.org/jira/browse/HIVE-17225
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Affects Versions: 3.0.0
> Reporter: Sahil Takiar
> Assignee: Sahil Takiar
>
> Setup:
> {code:sql}
> SET hive.spark.dynamic.partition.pruning=true;
> SET hive.strict.checks.cartesian.product=false;
> SET hive.auto.convert.join=true;
> CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int);
> CREATE TABLE regular_table1 (col1 int, col2 int);
> CREATE TABLE regular_table2 (col1 int, col2 int);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2);
> ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3);
> INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2);
> INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), (2), (3);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), (2), (3);
> INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), (2), (3);
> SELECT *
> FROM regular_table1,
>      regular_table2,
>      partitioned_table1
> WHERE partitioned_table1.part_col IN (SELECT regular_table1.col2
>                                       FROM regular_table1
>                                       WHERE regular_table1.col1 > 0)
>   AND partitioned_table1.part_col IN (SELECT regular_table2.col2
>                                       FROM regular_table2
>                                       WHERE regular_table2.col1 > 1);
> {code}
> Exception:
> {code}
> 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main]
> ql.Driver: FAILED: Execution Error, return code 3 from
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException:
> org.apache.hadoop.hive.ql.metadata.HiveException:
> java.io.FileNotFoundException: File
> file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5
> does not exist
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408)
> at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498)
> at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
> at scala.Option.getOrElse(Option.scala:121)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
> at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
> at scala.Option.getOrElse(Option.scala:121)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
> at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
> at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.immutable.List.foreach(List.scala:381)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.immutable.List.map(List.scala:285)
> at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
> at scala.Option.getOrElse(Option.scala:121)
> at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
> at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
> at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
> at
[jira] [Commented] (HIVE-17225) HoS DPP pruning sink ops can target parallel work objects
[ https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110187#comment-16110187 ] Sahil Takiar commented on HIVE-17225: - Here is another example of when this can happen. Say there are three tables: pt1, pt2 and r1. pt1 and pt2 are partitioned and r1 is not. In terms of data size pt1 < r1 < pt2. If map-joins are enabled, and all three tables are joined, the following scenario may occur. pt1 is scanned and written to a hash table, r1 is scanned and written to a hash table. pt2 is treated as the big table in the map-join. The hash tables for pt1 and r1 are generated in the same Spark job. If DPP is enabled, then the scan for r1 will result in a pruning sink targeting the scan for pt1. This will cause the FNF exception shown above, because the scans for r1 and pt1 run in parallel. > HoS DPP pruning sink ops can target parallel work objects > - > > Key: HIVE-17225 > URL: https://issues.apache.org/jira/browse/HIVE-17225 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: 3.0.0 >Reporter: Sahil Takiar >Assignee: Sahil Takiar > > Setup: > {code:sql} > SET hive.spark.dynamic.partition.pruning=true; > SET hive.strict.checks.cartesian.product=false; > SET hive.auto.convert.join=true; > CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int); > CREATE TABLE regular_table1 (col1 int, col2 int); > CREATE TABLE regular_table2 (col1 int, col2 int); > ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1); > ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2); > ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3); > INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2); > INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2); > INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), > (2), (3); > INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), > (2), (3); > INSERT INTO TABLE partitioned_table1 
PARTITION (part_col = 3) VALUES (1), > (2), (3); > SELECT * > FROM regular_table1, >regular_table2, >partitioned_table1 > WHERE partitioned_table1.part_col IN (SELECT regular_table1.col2 >FROM regular_table1 >WHERE regular_table1.col1 > 0) >AND partitioned_table1.part_col IN (SELECT regular_table2.col2 >FROM regular_table2 >WHERE regular_table2.col1 > 1); > {code} > Exception: > {code} > 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] > ql.Driver: FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.FileNotFoundException: File > file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5 > does not exist > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408) > at > org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498) > at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) > at > org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at scala.collection.immutable.List.foreach(List.scala:381) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.immutable.List.map(List.scala:285) > at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) > at
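The unsafe configuration Sahil describes can be detected mechanically: a pruning sink is only safe if its target work runs strictly after the work containing the sink. Below is a minimal sketch of such a check over a hypothetical adjacency-list model of the work graph; the names (`runsBefore`, `r1Scan`, `pt1Scan`, `joinWork`) are illustrative stand-ins, not Hive's actual SparkWork/BaseWork API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a Spark work graph; not Hive's real API.
public class DppParallelCheck {
    // childEdges.get(w) = works that depend on (run after) w.
    static boolean runsBefore(Map<String, List<String>> childEdges,
                              String source, String target) {
        // BFS downstream from the source work: the target is safe only if it
        // is strictly downstream of the work holding the pruning sink.
        Deque<String> queue = new ArrayDeque<>(childEdges.getOrDefault(source, List.of()));
        Set<String> seen = new HashSet<>();
        while (!queue.isEmpty()) {
            String w = queue.poll();
            if (!seen.add(w)) continue;
            if (w.equals(target)) return true;
            queue.addAll(childEdges.getOrDefault(w, List.of()));
        }
        return false;
    }

    public static void main(String[] args) {
        // r1Scan and pt1Scan both feed the same hash-table stage, so they are
        // siblings (parallel) rather than ordered.
        Map<String, List<String>> edges = Map.of(
                "r1Scan", List.of("joinWork"),
                "pt1Scan", List.of("joinWork"));
        // A DPP sink in r1Scan targeting pt1Scan is unsafe: pt1Scan does not
        // run after r1Scan, so the pruning output may not exist yet (FNF).
        boolean safe = runsBefore(edges, "r1Scan", "pt1Scan");
        System.out.println(safe ? "keep DPP branch" : "remove DPP branch");
    }
}
```

This corresponds to the "simple solution" in the comment above: when the check fails, drop the DPP branch rather than risk the FileNotFoundException.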
[jira] [Updated] (HIVE-17213) HoS: file merging doesn't work for union all
[ https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-17213: Attachment: HIVE-17213.3.patch Test failures unrelated for patch v2. Attaching patch v3 with qtest. [~xuefuz], can you take another look? > HoS: file merging doesn't work for union all > > > Key: HIVE-17213 > URL: https://issues.apache.org/jira/browse/HIVE-17213 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch, > HIVE-17213.2.patch, HIVE-17213.3.patch > > > HoS file merging doesn't work properly since it doesn't set linked file sinks > properly which is used to generate move tasks. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17225) HoS DPP pruning sink ops can target parallel work objects
[ https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-17225: Summary: HoS DPP pruning sink ops can target parallel work objects (was: HoS DPP throws FileNotFoundException in HiveInputFormat#init when target work is in the same Spark job) > HoS DPP pruning sink ops can target parallel work objects > - > > Key: HIVE-17225 > URL: https://issues.apache.org/jira/browse/HIVE-17225 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: 3.0.0 >Reporter: Sahil Takiar >Assignee: Sahil Takiar > > Setup: > {code:sql} > SET hive.spark.dynamic.partition.pruning=true; > SET hive.strict.checks.cartesian.product=false; > SET hive.auto.convert.join=true; > CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int); > CREATE TABLE regular_table1 (col1 int, col2 int); > CREATE TABLE regular_table2 (col1 int, col2 int); > ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1); > ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2); > ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3); > INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2); > INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2); > INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), > (2), (3); > INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), > (2), (3); > INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), > (2), (3); > SELECT * > FROM regular_table1, >regular_table2, >partitioned_table1 > WHERE partitioned_table1.part_col IN (SELECT regular_table1.col2 >FROM regular_table1 >WHERE regular_table1.col1 > 0) >AND partitioned_table1.part_col IN (SELECT regular_table2.col2 >FROM regular_table2 >WHERE regular_table2.col1 > 1); > {code} > Exception: > {code} > 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] > ql.Driver: FAILED: Execution Error, return code 3 from > 
org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.FileNotFoundException: File > file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5 > does not exist > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408) > at > org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498) > at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) > at > org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at scala.collection.immutable.List.foreach(List.scala:381) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.immutable.List.map(List.scala:285) > at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) > at scala.Option.getOrElse(Option.scala:121) > at 
org.apache.spark.rdd.RDD.partitions(RDD.scala:246) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at
[jira] [Updated] (HIVE-17225) HoS DPP throws FileNotFoundException in HiveInputFormat#init when target work is in the same Spark job
[ https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-17225: Description: Setup: {code:sql} SET hive.spark.dynamic.partition.pruning=true; SET hive.strict.checks.cartesian.product=false; SET hive.auto.convert.join=true; CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int); CREATE TABLE regular_table1 (col1 int, col2 int); CREATE TABLE regular_table2 (col1 int, col2 int); ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1); ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2); ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3); INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2); INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2); INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), (2), (3); INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), (2), (3); INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), (2), (3); SELECT * FROM regular_table1, regular_table2, partitioned_table1 WHERE partitioned_table1.part_col IN (SELECT regular_table1.col2 FROM regular_table1 WHERE regular_table1.col1 > 0) AND partitioned_table1.part_col IN (SELECT regular_table2.col2 FROM regular_table2 WHERE regular_table2.col1 > 1); {code} Exception: {code} 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] ql.Driver: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.FileNotFoundException: File file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5 does not exist at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498) at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.immutable.List.foreach(List.scala:381) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.immutable.List.map(List.scala:285) at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) at 
org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.immutable.List.foreach(List.scala:381) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.immutable.List.map(List.scala:285) at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
[jira] [Updated] (HIVE-17209) ObjectCacheFactory should return null when tez shared object registry is not setup
[ https://issues.apache.org/jira/browse/HIVE-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-17209: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Created ORC-221 for the ORC-related change and it got committed as well. Thanks [~sershe]. Committed this patch to master. > ObjectCacheFactory should return null when tez shared object registry is not > setup > -- > > Key: HIVE-17209 > URL: https://issues.apache.org/jira/browse/HIVE-17209 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-17209.1.patch > > > HIVE-15269 introduced dynamic min/max bloom filter > ("hive.tez.dynamic.semijoin.reduction=true"). This needs to access > ObjectCache and in tez, ObjectCache can only be created by {{TezProcessor}}. > In the following case {{AM --> splits --> > OrcInputFormat.pickStripes::evaluatePredicateMinMax --> > DynamicValue.getLiteral --> objectCache access}}, AM ends up throwing lots of > NPE since AM has not created ObjectCache. > Orc reader catches these exceptions, skips PPD and proceeds further. For e.g, > in Q95 it ends up throwing ~30,000 NPE before completing split information. > ObjectCacheFactory should return null when tez shared object registry is not > setup. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
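The shape of the fix, returning null from the factory when no registry was installed so callers fall back instead of repeatedly hitting NPEs, can be sketched generically. The names below (`CacheFactorySketch`, `getOrCompute`) are hypothetical stand-ins, not Hive's ObjectCacheFactory or Tez's ObjectRegistry API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Illustrative sketch only; not the actual Hive/Tez classes.
public class CacheFactorySketch {
    // The processor installs a registry per thread; the AM never does.
    static final ThreadLocal<ConcurrentMap<String, Object>> registry = new ThreadLocal<>();

    // Return null instead of throwing when no registry was set up (e.g. in
    // the AM during split generation).
    static ConcurrentMap<String, Object> getCache() {
        return registry.get(); // may be null
    }

    static Object getOrCompute(String key, Supplier<Object> s) {
        ConcurrentMap<String, Object> cache = getCache();
        if (cache == null) {
            return s.get(); // no cache available: compute directly, no NPE spam
        }
        return cache.computeIfAbsent(key, k -> s.get());
    }

    public static void main(String[] args) {
        // AM-like context: no registry installed, lookup still succeeds.
        System.out.println(getOrCompute("minmax", () -> 42));
        // Task-like context: registry installed, values are cached.
        registry.set(new ConcurrentHashMap<>());
        getOrCompute("minmax", () -> 1);
        System.out.println(getOrCompute("minmax", () -> 2)); // still the cached 1
    }
}
```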
[jira] [Updated] (HIVE-17225) HoS DPP throws FileNotFoundException in HiveInputFormat#init when target work is in the same Spark job
[ https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-17225: Summary: HoS DPP throws FileNotFoundException in HiveInputFormat#init when target work is in the same Spark job (was: FileNotFoundException in HiveInputFormat#init for query HoS DPP query with multiple left semi-joins against the same partition column) > HoS DPP throws FileNotFoundException in HiveInputFormat#init when target work > is in the same Spark job > -- > > Key: HIVE-17225 > URL: https://issues.apache.org/jira/browse/HIVE-17225 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: 3.0.0 >Reporter: Sahil Takiar >Assignee: Sahil Takiar > > Setup: > {code:sql} > SET hive.spark.dynamic.partition.pruning=true; > SET hive.strict.checks.cartesian.product=false; > SET hive.auto.convert.join=true; > CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int); > CREATE TABLE regular_table1 (col1 int, col2 int); > CREATE TABLE regular_table2 (col1 int, col2 int); > ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1); > ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2); > ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3); > INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2); > INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2); > INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), > (2), (3); > INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), > (2), (3); > INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), > (2), (3); > SELECT * > FROM regular_table1, >regular_table2, >partitioned_table1 > WHERE partitioned_table1.part_col IN (SELECT regular_table1.col2 >FROM regular_table1 >WHERE regular_table1.col1 > 0) >AND partitioned_table1.part_col IN (SELECT regular_table2.col2 >FROM regular_table2 >WHERE regular_table2.col1 > 1); > {code} > Exception: > {code} > 
2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] > ql.Driver: FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.FileNotFoundException: File > file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5 > does not exist > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408) > at > org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498) > at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) > at > org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at scala.collection.immutable.List.foreach(List.scala:381) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.immutable.List.map(List.scala:285) > at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82) > at 
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at >
[jira] [Updated] (HIVE-17225) FileNotFoundException in HiveInputFormat#init for query HoS DPP query with multiple left semi-joins against the same partition column
[ https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-17225: Description: Setup: {code:sql} SET hive.spark.dynamic.partition.pruning=true; SET hive.strict.checks.cartesian.product=false; SET hive.auto.convert.join=true; CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int); CREATE TABLE regular_table1 (col1 int, col2 int); CREATE TABLE regular_table2 (col1 int, col2 int); ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1); ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2); ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3); INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2); INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2); INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), (2), (3); INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), (2), (3); INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), (2), (3); SELECT * FROM regular_table1, regular_table2, partitioned_table1 WHERE partitioned_table1.part_col IN (SELECT regular_table1.col2 FROM regular_table1 WHERE regular_table1.col1 > 0) AND partitioned_table1.part_col IN (SELECT regular_table2.col2 FROM regular_table2 WHERE regular_table2.col1 > 1); {code} Exception: {code} 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] ql.Driver: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.FileNotFoundException: File file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5 does not exist at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498) at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.immutable.List.foreach(List.scala:381) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.immutable.List.map(List.scala:285) at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) at 
org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.immutable.List.foreach(List.scala:381) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.immutable.List.map(List.scala:285) at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82)
[jira] [Commented] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache
[ https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110143#comment-16110143 ] Hive QA commented on HIVE-17220: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879912/HIVE-17220.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 11058 tests executed *Failed tests:* {noformat} TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=236) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterByte (batchId=178) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterBytes (batchId=178) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterDouble (batchId=178) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterFloat (batchId=178) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterInt (batchId=178) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterLong (batchId=178) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterString (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6221/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6221/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6221/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase 
Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 12 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12879912 - PreCommit-HIVE-Build > Bloomfilter probing in semijoin reduction is thrashing L1 dcache > > > Key: HIVE-17220 > URL: https://issues.apache.org/jira/browse/HIVE-17220 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17220.1.patch, HIVE-17220.WIP.patch > > > [~gopalv] observed perf profiles showing bloomfilter probes as bottleneck for > some of the TPC-DS queries and resulted L1 data cache thrashing. > This is because of the huge bitset in bloom filter that doesn't fit in any > levels of cache, also the hash bits corresponding to a single key map to > different segments of bitset which are spread out. This can result in K-1 > memory access (K being number of hash functions) in worst case for every key > that gets probed because of locality miss in L1 cache. > Ran a JMH microbenchmark to verify the same. 
Following is the JMH perf > profile for bloom filter probing > {code} > Perf stats: > -- >5101.935637 task-clock (msec) #0.461 CPUs utilized >346 context-switches #0.068 K/sec >336 cpu-migrations#0.066 K/sec > 6,207 page-faults #0.001 M/sec > 10,016,486,301 cycles#1.963 GHz > (26.90%) > 5,751,692,176 stalled-cycles-frontend # 57.42% frontend cycles > idle (27.05%) > stalled-cycles-backend > 14,359,914,397 instructions #1.43 insns per cycle > #0.40 stalled cycles > per insn (33.78%) > 2,200,632,861 branches # 431.333 M/sec > (33.84%) > 1,162,860 branch-misses #0.05% of all branches > (33.97%) > 1,025,992,254 L1-dcache-loads # 201.099 M/sec > (26.56%) >432,663,098 L1-dcache-load-misses # 42.17% of all L1-dcache > hits(14.49%) >331,383,297 LLC-loads # 64.952 M/sec > (14.47%) >203,524 LLC-load-misses #0.06% of all LL-cache > hits (21.67%) > L1-icache-loads > 1,633,821 L1-icache-load-misses #0.320 M/sec > (28.85%) >950,368,796 dTLB-loads# 186.276 M/sec > (28.61%) >246,813,393
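The up-to-K-1 scattered memory accesses per probed key described above are exactly what a blocked (single-cache-line) Bloom filter avoids: derive one block index per key and confine all k probe bits to that 64-byte block, so a lookup touches one cache line. Below is a minimal sketch of the idea; the sizing and hash mixing are illustrative, not the patch's actual Bloom1Filter code.

```java
// Minimal blocked ("cache-line") Bloom filter sketch. All k probes for a key
// land in one 64-byte block (8 longs). Illustrative only.
public class BlockedBloom {
    private static final int LONGS_PER_BLOCK = 8; // 8 * 64 bits = 512 bits = 64 bytes
    private final long[] bits;
    private final int numBlocks;
    private final int k;

    BlockedBloom(int numBlocks, int k) {
        this.numBlocks = numBlocks;
        this.bits = new long[numBlocks * LONGS_PER_BLOCK];
        this.k = k;
    }

    private static long mix(long h) { // splitmix64-style finalizer
        h ^= h >>> 33; h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33; h *= 0xc4ceb9fe1a85ec53L;
        return h ^ (h >>> 33);
    }

    void add(long key) {
        long h = mix(key);
        int block = (int) Long.remainderUnsigned(h, numBlocks) * LONGS_PER_BLOCK;
        for (int i = 1; i <= k; i++) {
            // All k bit positions fall inside the same 512-bit block.
            int bit = (int) Long.remainderUnsigned(mix(h + i), 512);
            bits[block + (bit >>> 6)] |= 1L << (bit & 63);
        }
    }

    boolean mightContain(long key) {
        long h = mix(key);
        int block = (int) Long.remainderUnsigned(h, numBlocks) * LONGS_PER_BLOCK;
        for (int i = 1; i <= k; i++) {
            int bit = (int) Long.remainderUnsigned(mix(h + i), 512);
            if ((bits[block + (bit >>> 6)] & (1L << (bit & 63))) == 0) return false;
        }
        return true; // no false negatives; false positives possible
    }

    public static void main(String[] args) {
        BlockedBloom bf = new BlockedBloom(1024, 4);
        for (long v = 0; v < 1000; v++) bf.add(v);
        boolean allFound = true;
        for (long v = 0; v < 1000; v++) allFound &= bf.mightContain(v);
        System.out.println(allFound); // true: Bloom filters never drop inserted keys
    }
}
```

The trade-off is a slightly higher false-positive rate for the same total bit count, since each key's bits are confined to one block, in exchange for far fewer L1 misses per probe.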
[jira] [Commented] (HIVE-16820) TezTask may not shut down correctly before submit
[ https://issues.apache.org/jira/browse/HIVE-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110071#comment-16110071 ] Mithun Radhakrishnan commented on HIVE-16820: - [~sershe], I wonder if a similar fix should go into {{MergeFileTask::execute()}}, to check for cancellation before job-submission. > TezTask may not shut down correctly before submit > - > > Key: HIVE-16820 > URL: https://issues.apache.org/jira/browse/HIVE-16820 > Project: Hive > Issue Type: Bug >Reporter: Visakh Nair >Assignee: Sergey Shelukhin > Fix For: 3.0.0 > > Attachments: HIVE-16820.01.patch, HIVE-16820.patch > > > The query will run and only fail at the very end when the driver checks its > own shutdown flag. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
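The fix being discussed is essentially a re-read of the driver's shutdown flag immediately before the expensive, hard-to-undo submit step, rather than only at the end of execution. A generic sketch of that pattern, with hypothetical names (`CancellableSubmit`, `submitJob`) rather than the real TezTask/MergeFileTask API:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Generic "check for cancellation before submit" sketch; not Hive's actual code.
public class CancellableSubmit {
    private final AtomicBoolean shutdown = new AtomicBoolean(false);

    void cancel() { shutdown.set(true); }

    String execute() {
        // Re-check the flag at the last moment before submission, so a
        // cancelled task never reaches the cluster and the query fails fast
        // instead of running to completion first.
        if (shutdown.get()) {
            return "cancelled-before-submit";
        }
        return submitJob();
    }

    private String submitJob() { return "submitted"; } // stand-in for DAG submission

    public static void main(String[] args) {
        CancellableSubmit t1 = new CancellableSubmit();
        System.out.println(t1.execute()); // submitted

        CancellableSubmit t2 = new CancellableSubmit();
        t2.cancel();
        System.out.println(t2.execute()); // cancelled-before-submit
    }
}
```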
[jira] [Commented] (HIVE-17133) NoSuchMethodError in Hadoop FileStatus.compareTo
[ https://issues.apache.org/jira/browse/HIVE-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110072#comment-16110072 ] Sergey Shelukhin commented on HIVE-17133: - The solution would be to upgrade to 2.8.2 once that is released. We'd always build with the old signature and thus support old and new versions but not the ones in the middle (need to verify that we do indeed refer to a method with old signature when building). Thus, Hive will not support Hadoop 2.8.0 and 2.8.1. The problem is if we release something referring to the new signature soon (or have already). We might need 2.2.1 and 2.3.0 if these were built against Hadoop 2.8.0/1, as far as I understand cc [~owen.omalley] [~pxiong] > NoSuchMethodError in Hadoop FileStatus.compareTo > > > Key: HIVE-17133 > URL: https://issues.apache.org/jira/browse/HIVE-17133 > Project: Hive > Issue Type: Bug >Reporter: Rui Li > > The stack trace is: > {noformat} > Caused by: java.lang.NoSuchMethodError: > org.apache.hadoop.fs.FileStatus.compareTo(Lorg/apache/hadoop/fs/FileStatus;)I > at > org.apache.hadoop.hive.ql.io.AcidUtils.lambda$getAcidState$0(AcidUtils.java:931) > at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355) > at java.util.TimSort.sort(TimSort.java:234) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:929) > {noformat} > I'm on Hive master and using Hadoop 2.7.2. The method signature in Hadoop > 2.7.2 is: > https://github.com/apache/hadoop/blob/release-2.7.2-RC2/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L336 > In Hadoop 2.8.0 it becomes: > https://github.com/apache/hadoop/blob/release-2.8.0-RC3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L332 > I think that breaks binary compatibility. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17133) NoSuchMethodError in Hadoop FileStatus.compareTo
[ https://issues.apache.org/jira/browse/HIVE-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110072#comment-16110072 ] Sergey Shelukhin edited comment on HIVE-17133 at 8/2/17 12:51 AM: -- The solution would be to upgrade to 2.8.2 once that is released. We'd always build with the old signature and thus support old and new versions but not the ones in the middle (need to verify that we do indeed refer to a method with old signature when building). Thus, Hive will not support Hadoop 2.8.0 and 2.8.1. The problem is if we release something referring to the new signature soon (or have already). We might need 2.2.1 and 2.3.1 if these were built against Hadoop 2.8.0/1, as far as I understand cc [~owen.omalley] [~pxiong] was (Author: sershe): The solution would be to upgrade to 2.8.2 once that is released. We'd always build with the old signature and thus support old and new versions but not the ones in the middle (need to verify that we do indeed refer to a method with old signature when building). Thus, Hive will not support Hadoop 2.8.0 and 2.8.1. The problem is if we release something referring to the new signature soon (or have already). 
We might need 2.2.1 and 2.3.0 if these were built against Hadoop 2.8.0/1, as far as I understand cc [~owen.omalley] [~pxiong] > NoSuchMethodError in Hadoop FileStatus.compareTo > > > Key: HIVE-17133 > URL: https://issues.apache.org/jira/browse/HIVE-17133 > Project: Hive > Issue Type: Bug >Reporter: Rui Li > > The stack trace is: > {noformat} > Caused by: java.lang.NoSuchMethodError: > org.apache.hadoop.fs.FileStatus.compareTo(Lorg/apache/hadoop/fs/FileStatus;)I > at > org.apache.hadoop.hive.ql.io.AcidUtils.lambda$getAcidState$0(AcidUtils.java:931) > at java.util.TimSort.countRunAndMakeAscending(TimSort.java:355) > at java.util.TimSort.sort(TimSort.java:234) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:929) > {noformat} > I'm on Hive master and using Hadoop 2.7.2. The method signature in Hadoop > 2.7.2 is: > https://github.com/apache/hadoop/blob/release-2.7.2-RC2/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L336 > In Hadoop 2.8.0 it becomes: > https://github.com/apache/hadoop/blob/release-2.8.0-RC3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileStatus.java#L332 > I think that breaks binary compatibility. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache
[ https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110050#comment-16110050 ] Hive QA commented on HIVE-17220: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879912/HIVE-17220.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 11058 tests executed *Failed tests:* {noformat} TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=236) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=241) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterByte (batchId=178) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterBytes (batchId=178) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterDouble (batchId=178) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterFloat (batchId=178) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterInt (batchId=178) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterLong (batchId=178) org.apache.hive.common.util.TestBloom1Filter.testBloom1FilterString (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) org.apache.hive.hcatalog.pig.TestTextFileHCatStorer.testWriteDecimalX (batchId=182) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6220/testReport Console output: 
https://builds.apache.org/job/PreCommit-HIVE-Build/6220/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6220/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12879912 - PreCommit-HIVE-Build > Bloomfilter probing in semijoin reduction is thrashing L1 dcache > > > Key: HIVE-17220 > URL: https://issues.apache.org/jira/browse/HIVE-17220 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17220.1.patch, HIVE-17220.WIP.patch > > > [~gopalv] observed perf profiles showing bloomfilter probes as bottleneck for > some of the TPC-DS queries and resulted L1 data cache thrashing. > This is because of the huge bitset in bloom filter that doesn't fit in any > levels of cache, also the hash bits corresponding to a single key map to > different segments of bitset which are spread out. This can result in K-1 > memory access (K being number of hash functions) in worst case for every key > that gets probed because of locality miss in L1 cache. > Ran a JMH microbenchmark to verify the same. 
Following is the JMH perf > profile for bloom filter probing > {code} > Perf stats: > -- >5101.935637 task-clock (msec) #0.461 CPUs utilized >346 context-switches #0.068 K/sec >336 cpu-migrations#0.066 K/sec > 6,207 page-faults #0.001 M/sec > 10,016,486,301 cycles#1.963 GHz > (26.90%) > 5,751,692,176 stalled-cycles-frontend # 57.42% frontend cycles > idle (27.05%) > stalled-cycles-backend > 14,359,914,397 instructions #1.43 insns per cycle > #0.40 stalled cycles > per insn (33.78%) > 2,200,632,861 branches # 431.333 M/sec > (33.84%) > 1,162,860 branch-misses #0.05% of all branches > (33.97%) > 1,025,992,254 L1-dcache-loads # 201.099 M/sec > (26.56%) >432,663,098 L1-dcache-load-misses # 42.17% of all L1-dcache > hits(14.49%) >331,383,297 LLC-loads # 64.952 M/sec > (14.47%) >203,524 LLC-load-misses #0.06% of all
[jira] [Updated] (HIVE-17089) make acid 2.0 the default
[ https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17089: -- Description: acid 2.0 is introduced in HIVE-14035. It replaces Update events with a combination of Delete + Insert events. This now makes U=D+I the default (and only) supported acid table type in Hive 3.0. The expectation for upgrade is that Major compaction has to be run on all acid tables in the existing Hive cluster and that no new writes to these table take place since the start of compaction (Need to add a mechanism to put a table in read-only mode - this way it can still be read while it's being compacted). Then upgrade to Hive 3.0 can take place. was: acid 2.0 is introduced in HIVE-14035. It replaces Update events with a combination of Delete + Insert events. This now makes U=D+I the default (and only) supported acid table type in Hive 3.0. The expectation for upgrade is that Major compaction has to be run on all acid tables in the existing Hive cluster and that no new writes to these table take place since the start of compaction. Then upgrade to Hive 3.0 can take place. > make acid 2.0 the default > - > > Key: HIVE-17089 > URL: https://issues.apache.org/jira/browse/HIVE-17089 > Project: Hive > Issue Type: New Feature > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch > > > acid 2.0 is introduced in HIVE-14035. It replaces Update events with a > combination of Delete + Insert events. This now makes U=D+I the default (and > only) supported acid table type in Hive 3.0. > The expectation for upgrade is that Major compaction has to be run on all > acid tables in the existing Hive cluster and that no new writes to these > table take place since the start of compaction (Need to add a mechanism to > put a table in read-only mode - this way it can still be read while it's > being compacted). 
Then upgrade to Hive 3.0 can take place. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
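[Editorial note] The U=D+I rewrite described in the HIVE-17089 summary above can be modeled as event replay: an UPDATE is never recorded as such, only as a DELETE of the old row plus an INSERT of the new one. A toy Python sketch of that idea, purely illustrative — the event tuples and function names here are hypothetical and are not Hive's actual delta-file format:

```python
def rewrite_update(row_id, old_row, new_row):
    # acid 2.0's U=D+I: one logical UPDATE becomes a DELETE event
    # for the old row followed by an INSERT event for the new row.
    return [("DELETE", row_id, old_row), ("INSERT", row_id, new_row)]

def apply_events(table, events):
    # Replay delete/insert events against a {row_id: row} snapshot.
    for op, row_id, row in events:
        if op == "DELETE":
            table.pop(row_id, None)
        else:  # "INSERT"
            table[row_id] = row
    return table

# An update of row 1 from ("a", 10) to ("a", 20) becomes two events,
# and replaying them yields the same final state a direct UPDATE would.
snapshot = {1: ("a", 10)}
events = rewrite_update(1, ("a", 10), ("a", 20))
final = apply_events(snapshot, events)
```

One consequence of this representation, consistent with the upgrade note above: readers only ever merge inserts and deletes, so old-format Update events must be compacted away before the new reader can take over.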
[jira] [Updated] (HIVE-17089) make acid 2.0 the default
[ https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17089: -- Description: acid 2.0 is introduced in HIVE-14035. It replaces Update events with a combination of Delete + Insert events. This now makes U=D+I the default (and only) supported acid table type in Hive 3.0. The expectation for upgrade is that Major compaction has to be run on all acid tables in the existing Hive cluster and that no new writes to these table take place since the start of compaction. Then upgrade to Hive 3.0 can take place. was:acid 2.0 is introduced in HIVE-14035. It replaces Update events with a combination of Delete + Insert events. This now makes U=D+I the default (and only) supported acid table type in Hive 3.0 > make acid 2.0 the default > - > > Key: HIVE-17089 > URL: https://issues.apache.org/jira/browse/HIVE-17089 > Project: Hive > Issue Type: New Feature > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch > > > acid 2.0 is introduced in HIVE-14035. It replaces Update events with a > combination of Delete + Insert events. This now makes U=D+I the default (and > only) supported acid table type in Hive 3.0. > The expectation for upgrade is that Major compaction has to be run on all > acid tables in the existing Hive cluster and that no new writes to these > table take place since the start of compaction. Then upgrade to Hive 3.0 can > take place. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17089) make acid 2.0 the default
[ https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17089: -- Description: acid 2.0 is introduced in HIVE-14035. It replaces Update events with a combination of Delete + Insert events. This now makes U=D+I the default (and only) supported acid table type in Hive 3.0 > make acid 2.0 the default > - > > Key: HIVE-17089 > URL: https://issues.apache.org/jira/browse/HIVE-17089 > Project: Hive > Issue Type: New Feature > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch > > > acid 2.0 is introduced in HIVE-14035. It replaces Update events with a > combination of Delete + Insert events. This now makes U=D+I the default (and > only) supported acid table type in Hive 3.0 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17089) make acid 2.0 the default
[ https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17089: -- Issue Type: New Feature (was: Test) > make acid 2.0 the default > - > > Key: HIVE-17089 > URL: https://issues.apache.org/jira/browse/HIVE-17089 > Project: Hive > Issue Type: New Feature > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17089) make acid 2.0 the default
[ https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17089: -- Status: Patch Available (was: Open) > make acid 2.0 the default > - > > Key: HIVE-17089 > URL: https://issues.apache.org/jira/browse/HIVE-17089 > Project: Hive > Issue Type: Test > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17089) make acid 2.0 the default
[ https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17089: -- Attachment: HIVE-17089.03.patch > make acid 2.0 the default > - > > Key: HIVE-17089 > URL: https://issues.apache.org/jira/browse/HIVE-17089 > Project: Hive > Issue Type: Test > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-17089.01.patch, HIVE-17089.03.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17164) Vectorization: Support PTF (Part 2: Unbounded Support-- Turn ON by default)
[ https://issues.apache.org/jira/browse/HIVE-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109995#comment-16109995 ] Teddy Choi commented on HIVE-17164: --- +1 LGTM and tests pending. > Vectorization: Support PTF (Part 2: Unbounded Support-- Turn ON by default) > --- > > Key: HIVE-17164 > URL: https://issues.apache.org/jira/browse/HIVE-17164 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-17164.01.patch, HIVE-17164.02.patch > > > Add disk storage backing. Turn hive.vectorized.execution.ptf.enabled on by > default. > Add hive.vectorized.ptf.max.memory.buffering.batch.count to specify the > maximum number of vectorized row batch to buffer in memory before spilling to > disk. > Add hive.vectorized.testing.reducer.batch.size parameter to have the Tez > Reducer make small batches for making a lot of key group batches that cause > memory buffering and disk storage backing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17170) Move thrift generated code to stand alone metastore
[ https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17170: -- Status: Patch Available (was: Open) > Move thrift generated code to stand alone metastore > --- > > Key: HIVE-17170 > URL: https://issues.apache.org/jira/browse/HIVE-17170 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-17170.2.patch, HIVE-17170.patch > > > hive_metastore.thrift and the code it generates needs to be moved into the > standalone metastore module. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17170) Move thrift generated code to stand alone metastore
[ https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17170: -- Attachment: HIVE-17170.2.patch New version of the patch that I think addresses the issue (it builds for me locally, so I'm not sure). > Move thrift generated code to stand alone metastore > --- > > Key: HIVE-17170 > URL: https://issues.apache.org/jira/browse/HIVE-17170 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-17170.2.patch, HIVE-17170.patch > > > hive_metastore.thrift and the code it generates needs to be moved into the > standalone metastore module. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17170) Move thrift generated code to stand alone metastore
[ https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17170: -- Status: Open (was: Patch Available) > Move thrift generated code to stand alone metastore > --- > > Key: HIVE-17170 > URL: https://issues.apache.org/jira/browse/HIVE-17170 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-17170.2.patch, HIVE-17170.patch > > > hive_metastore.thrift and the code it generates needs to be moved into the > standalone metastore module. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList
[ https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109974#comment-16109974 ] Hive QA commented on HIVE-17172: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879904/HIVE-17172.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11041 tests executed *Failed tests:* {noformat} TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=236) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=241) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] (batchId=56) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=158) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6219/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6219/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6219/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12879904 - PreCommit-HIVE-Build > add ordering checks to DiskRangeList > > > Key: HIVE-17172 > URL: https://issues.apache.org/jira/browse/HIVE-17172 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17172.01.patch, HIVE-17172.02.patch, > HIVE-17172.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109964#comment-16109964 ] Gopal V commented on HIVE-16979: > A metastore call normally takes less than a second. Oh, I thought that TUGIAssumingProcessor is used by ThriftBinaryCLIService as well. ThriftBinaryCLIService::run() -> KerberosSaslHelper$CLIServiceProcessorFactory::getProcessor() -> HadoopThriftAuthBridge::wrapNonAssumingProcessor() -> new TUGIAssumingProcessor() > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, > HIVE-16979.3.patch > > > FileSystem.closeAllForUGI is called per request against metastore to dispose > UGI, which involves talking to HDFS name node and is time consuming. So the > perf improvement would be caching and reusing the UGI. > Per FileSystem.closeAllForUG call could take up to 20 ms as E2E latency > against HDFS. Usually a Hive query could result in several calls against > metastore, so we can save up to 50-100 ms per hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16974) Change the sort key for the schema tool validator to be
[ https://issues.apache.org/jira/browse/HIVE-16974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-16974: - Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Patch has been pushed to master. Thank you for the review [~aihuaxu] > Change the sort key for the schema tool validator to be > > > Key: HIVE-16974 > URL: https://issues.apache.org/jira/browse/HIVE-16974 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: Naveen Gangam >Assignee: Naveen Gangam > Fix For: 3.0.0 > > Attachments: HIVE-16974.patch, HIVE-16974.patch > > > In HIVE-16729, we introduced ordering of results/failures returned by > schematool's validators. This allows fault injection testing to expect > results that can be verified. However, they were sorted on NAME values which > in the HMS schema can be NULL. So if the introduced fault has a NULL/BLANK > name column value, the result could be different depending on the backend > database(if they sort NULLs first or last). > So I think it is better to sort on a non-null column value. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17226) Use strong hashing as security improvement
[ https://issues.apache.org/jira/browse/HIVE-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li updated HIVE-17226: -- Component/s: (was: Hive) Security > Use strong hashing as security improvement > -- > > Key: HIVE-17226 > URL: https://issues.apache.org/jira/browse/HIVE-17226 > Project: Hive > Issue Type: Improvement > Components: Security >Reporter: Tao Li >Assignee: Tao Li > > There have been 2 places identified where weak hashing needs to be replaced > by SHA256. > 1. CookieSigner.java uses MessageDigest.getInstance("SHA"). Mostly SHA is > mapped to SHA-1, which is not secure enough according to today's standards. > We should use SHA-256 instead. > 2. GenericUDFMaskHash.java uses DigestUtils.md5Hex. MD5 is considered weak > and should be replaced by DigestUtils.sha256Hex. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17164) Vectorization: Support PTF (Part 2: Unbounded Support-- Turn ON by default)
[ https://issues.apache.org/jira/browse/HIVE-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109901#comment-16109901 ] Teddy Choi commented on HIVE-17164: --- The patch looks good, but some tests are failed. llap/vector_ptf_part_simple.q.out is failed because of different fractions. Also vector_windowing_expressions.q.out for TestCliDriver needs to be updated, too. > Vectorization: Support PTF (Part 2: Unbounded Support-- Turn ON by default) > --- > > Key: HIVE-17164 > URL: https://issues.apache.org/jira/browse/HIVE-17164 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-17164.01.patch, HIVE-17164.02.patch > > > Add disk storage backing. Turn hive.vectorized.execution.ptf.enabled on by > default. > Add hive.vectorized.ptf.max.memory.buffering.batch.count to specify the > maximum number of vectorized row batch to buffer in memory before spilling to > disk. > Add hive.vectorized.testing.reducer.batch.size parameter to have the Tez > Reducer make small batches for making a lot of key group batches that cause > memory buffering and disk storage backing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17226) Use strong hashing as security improvement
[ https://issues.apache.org/jira/browse/HIVE-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li updated HIVE-17226: -- Component/s: Hive > Use strong hashing as security improvement > -- > > Key: HIVE-17226 > URL: https://issues.apache.org/jira/browse/HIVE-17226 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Tao Li >Assignee: Tao Li > > There have been 2 places identified where weak hashing needs to be replaced > by SHA256. > 1. CookieSigner.java uses MessageDigest.getInstance("SHA"). Mostly SHA is > mapped to SHA-1, which is not secure enough according to today's standards. > We should use SHA-256 instead. > 2. GenericUDFMaskHash.java uses DigestUtils.md5Hex. MD5 is considered weak > and should be replaced by DigestUtils.sha256Hex. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17226) Use strong hashing as security improvement
[ https://issues.apache.org/jira/browse/HIVE-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li reassigned HIVE-17226: - > Use strong hashing as security improvement > -- > > Key: HIVE-17226 > URL: https://issues.apache.org/jira/browse/HIVE-17226 > Project: Hive > Issue Type: Improvement >Reporter: Tao Li >Assignee: Tao Li > > There have been 2 places identified where weak hashing needs to be replaced > by SHA256. > 1. CookieSigner.java uses MessageDigest.getInstance("SHA"). Mostly SHA is > mapped to SHA-1, which is not secure enough according to today's standards. > We should use SHA-256 instead. > 2. GenericUDFMaskHash.java uses DigestUtils.md5Hex. MD5 is considered weak > and should be replaced by DigestUtils.sha256Hex. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17089) make acid 2.0 the default
[ https://issues.apache.org/jira/browse/HIVE-17089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-17089: -- Summary: make acid 2.0 the default (was: run ptest with acid 2.0 the default) > make acid 2.0 the default > - > > Key: HIVE-17089 > URL: https://issues.apache.org/jira/browse/HIVE-17089 > Project: Hive > Issue Type: Test > Components: Transactions >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-17089.01.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17213) HoS: file merging doesn't work for union all
[ https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109852#comment-16109852 ] Hive QA commented on HIVE-17213: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879885/HIVE-17213.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11040 tests executed *Failed tests:* {noformat} TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=236) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6218/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6218/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6218/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12879885 - PreCommit-HIVE-Build > HoS: file merging doesn't work for union all > > > Key: HIVE-17213 > URL: https://issues.apache.org/jira/browse/HIVE-17213 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch, > HIVE-17213.2.patch > > > HoS file merging doesn't work properly since it doesn't set linked file sinks > properly which is used to generate move tasks. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache
[ https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109789#comment-16109789 ] Prasanth Jayachandran commented on HIVE-17220: -- Just found it and fixed it :) > Bloomfilter probing in semijoin reduction is thrashing L1 dcache > > > Key: HIVE-17220 > URL: https://issues.apache.org/jira/browse/HIVE-17220 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-17220.1.patch, HIVE-17220.WIP.patch > > > [~gopalv] observed perf profiles showing bloomfilter probes as bottleneck for > some of the TPC-DS queries and resulted L1 data cache thrashing. > This is because of the huge bitset in bloom filter that doesn't fit in any > levels of cache, also the hash bits corresponding to a single key map to > different segments of bitset which are spread out. This can result in K-1 > memory access (K being number of hash functions) in worst case for every key > that gets probed because of locality miss in L1 cache. > Ran a JMH microbenchmark to verify the same. 
> Following is the JMH perf profile for bloom filter probing:
{code}
Perf stats:
-------------------------------------------------------------------------------
     5101.935637      task-clock (msec)         #    0.461 CPUs utilized
             346      context-switches          #    0.068 K/sec
             336      cpu-migrations            #    0.066 K/sec
           6,207      page-faults               #    0.001 M/sec
  10,016,486,301      cycles                    #    1.963 GHz                     (26.90%)
   5,751,692,176      stalled-cycles-frontend   #   57.42% frontend cycles idle    (27.05%)
 <not supported>      stalled-cycles-backend
  14,359,914,397      instructions              #    1.43  insns per cycle
                                                #    0.40  stalled cycles per insn (33.78%)
   2,200,632,861      branches                  #  431.333 M/sec                   (33.84%)
       1,162,860      branch-misses             #    0.05% of all branches         (33.97%)
   1,025,992,254      L1-dcache-loads           #  201.099 M/sec                   (26.56%)
     432,663,098      L1-dcache-load-misses     #   42.17% of all L1-dcache hits   (14.49%)
     331,383,297      LLC-loads                 #   64.952 M/sec                   (14.47%)
         203,524      LLC-load-misses           #    0.06% of all LL-cache hits    (21.67%)
 <not supported>      L1-icache-loads
       1,633,821      L1-icache-load-misses     #    0.320 M/sec                   (28.85%)
     950,368,796      dTLB-loads                #  186.276 M/sec                   (28.61%)
     246,813,393      dTLB-load-misses          #   25.97% of all dTLB cache hits  (14.53%)
          25,451      iTLB-loads                #    0.005 M/sec                   (14.48%)
          35,415      iTLB-load-misses          #  139.15% of all iTLB cache hits  (21.73%)
 <not supported>      L1-dcache-prefetches
         175,958      L1-dcache-prefetch-misses #    0.034 M/sec                   (28.94%)

    11.064783140 seconds time elapsed
{code}
> This shows a 42.17% L1 data cache miss rate. This jira is to use a cache-efficient bloom filter for semijoin probing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
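[Editor's note] The cache-efficient layout discussed on this issue (a blocked or "bloom-1" style filter, which a later comment mentions when changing the default fpp) confines all k bits of a key to a single cache-line-sized block, so a probe touches one 64-byte line instead of up to K scattered ones. A minimal sketch of that idea — the class, constants, and hash mixing below are illustrative assumptions, not Hive's actual implementation:

```java
// Blocked ("bloom-1" style) filter: all k bits for a key land in one
// 64-byte block, so a membership probe costs at most one cache-line fetch.
class BlockedBloomFilter {
    private static final int LONGS_PER_BLOCK = 8;  // 8 * 8 bytes = one 64-byte line
    private final long[] bits;
    private final int numBlocks;
    private final int k;  // number of hash functions

    BlockedBloomFilter(int numBlocks, int k) {
        this.numBlocks = numBlocks;
        this.k = k;
        this.bits = new long[numBlocks * LONGS_PER_BLOCK];
    }

    void add(long hash) {
        int block = selectBlock(hash);
        long h = hash;
        for (int i = 0; i < k; i++) {
            h = mix(h + i);
            int bit = (int) (h & 511);  // 512 bits per 64-byte block
            bits[block * LONGS_PER_BLOCK + (bit >>> 6)] |= 1L << (bit & 63);
        }
    }

    boolean mightContain(long hash) {
        int block = selectBlock(hash);
        long h = hash;
        for (int i = 0; i < k; i++) {
            h = mix(h + i);
            int bit = (int) (h & 511);
            if ((bits[block * LONGS_PER_BLOCK + (bit >>> 6)] & (1L << (bit & 63))) == 0) {
                return false;  // a zero bit proves the key was never added
            }
        }
        return true;
    }

    // High bits pick the block; all k probe bits then stay inside it.
    private int selectBlock(long hash) {
        return (int) ((hash >>> 32) % numBlocks);
    }

    // splitmix64-style finalizer used as a cheap re-hash per probe bit.
    private static long mix(long z) {
        z = (z ^ (z >>> 33)) * 0xff51afd7ed558ccdL;
        z = (z ^ (z >>> 33)) * 0xc4ceb9fe1a85ec53L;
        return z ^ (z >>> 33);
    }
}
```

The trade-off is a slightly higher false-positive rate for the same bit budget (keys collide within their block), which is consistent with the comment below about compensating by making the fpp configurable.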
[jira] [Updated] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache
[ https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17220: - Attachment: (was: HIVE-17220.1.patch)
[jira] [Updated] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache
[ https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17220: - Attachment: HIVE-17220.1.patch
[jira] [Commented] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache
[ https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109771#comment-16109771 ] Gopal V commented on HIVE-17220: [~prasanth_j]: cut-paste issue? {code} TEZ_BLOOM_FILTER_FPP("hive.tez.bloom.filter.factor", 0.03f, {code}
[jira] [Updated] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache
[ https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17220: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-17220) Bloomfilter probing in semijoin reduction is thrashing L1 dcache
[ https://issues.apache.org/jira/browse/HIVE-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17220: - Attachment: HIVE-17220.1.patch Made fpp configurable, also changed default fpp to 0.03 for bloom-1.
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109720#comment-16109720 ] Tao Li commented on HIVE-16979: --- [~gopalv] The original code creates and closes a UGI instance per metastore request; our change caches the UGI and disposes of it only 24 hours after its last access. A metastore call normally takes less than a second, so 24 hours is long enough to make sure we will not fail any ongoing metastore call. Does that answer your question? > Cache UGI for metastore > --- > > Key: HIVE-16979 > URL: https://issues.apache.org/jira/browse/HIVE-16979 > Project: Hive > Issue Type: Improvement > Reporter: Tao Li > Assignee: Tao Li > Attachments: HIVE-16979.1.patch, HIVE-16979.2.patch, HIVE-16979.3.patch > > > FileSystem.closeAllForUGI is called per request against the metastore to dispose of the UGI, which involves talking to the HDFS name node and is time consuming. So the perf improvement would be caching and reusing the UGI. > Each FileSystem.closeAllForUGI call can take up to 20 ms of E2E latency against HDFS. A Hive query usually results in several calls against the metastore, so we can save up to 50-100 ms per Hive query. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
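[Editor's note] The policy described above — reuse a cached UGI and dispose of it only after 24 hours of inactivity — is an expire-after-access cache. A minimal sketch of that eviction policy (names and structure are illustrative assumptions, not the patch's actual code; time is passed explicitly for determinism):

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Expire-after-access cache: an entry's clock resets on every get, so a
// value is evicted only after TTL milliseconds of continuous inactivity.
class ExpireAfterAccessCache<K, V> {
    private static final class Entry<V> {
        final V value;
        volatile long lastAccessMillis;
        Entry(V value, long now) { this.value = value; this.lastAccessMillis = now; }
    }

    private final ConcurrentHashMap<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long ttlMillis;

    ExpireAfterAccessCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    V get(K key, long now) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        e.lastAccessMillis = now;  // touching the entry restarts its TTL
        return e.value;
    }

    void put(K key, V value, long now) {
        map.put(key, new Entry<>(value, now));
    }

    // Called periodically by a janitor thread; returns entries disposed of.
    // In the UGI case, this is where FileSystem.closeAllForUGI would run.
    int evictExpired(long now) {
        int evicted = 0;
        for (Iterator<Map.Entry<K, Entry<V>>> it = map.entrySet().iterator(); it.hasNext(); ) {
            if (now - it.next().getValue().lastAccessMillis >= ttlMillis) {
                it.remove();
                evicted++;
            }
        }
        return evicted;
    }
}
```

With a 24-hour TTL, any request that touched the UGI within the last day keeps it alive — which is the basis of the "a metastore call takes less than a second" argument above, and also of the long-running-query concern raised in the follow-up comment.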
[jira] [Assigned] (HIVE-17225) FileNotFoundException in HiveInputFormat#init for query HoS DPP query with multiple left semi-joins against the same partition column
[ https://issues.apache.org/jira/browse/HIVE-17225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar reassigned HIVE-17225: --- > FileNotFoundException in HiveInputFormat#init for query HoS DPP query with > multiple left semi-joins against the same partition column > - > > Key: HIVE-17225 > URL: https://issues.apache.org/jira/browse/HIVE-17225 > Project: Hive > Issue Type: Sub-task > Components: Spark >Affects Versions: 3.0.0 >Reporter: Sahil Takiar >Assignee: Sahil Takiar > > Setup: > {code:sql} > SET hive.spark.dynamic.partition.pruning=true; > SET hive.strict.checks.cartesian.product=false; > CREATE TABLE partitioned_table1 (col int) PARTITIONED BY (part_col int); > CREATE TABLE regular_table1 (col1 int, col2 int); > CREATE TABLE regular_table2 (col1 int, col2 int); > ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 1); > ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 2); > ALTER TABLE partitioned_table1 ADD PARTITION (part_col = 3); > INSERT INTO table regular_table1 VALUES (0, 0), (1, 1), (2, 2); > INSERT INTO table regular_table2 VALUES (0, 0), (1, 1), (2, 2); > INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 1) VALUES (1), > (2), (3); > INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 2) VALUES (1), > (2), (3); > INSERT INTO TABLE partitioned_table1 PARTITION (part_col = 3) VALUES (1), > (2), (3); > SELECT * > FROM regular_table1, >regular_table2, >partitioned_table1 > WHERE partitioned_table1.part_col IN (SELECT regular_table1.col2 >FROM regular_table1 >WHERE regular_table1.col1 > 0) >AND partitioned_table1.part_col IN (SELECT regular_table2.col2 >FROM regular_table2 >WHERE regular_table2.col1 > 1); > {code} > Exception: > {code} > 2017-08-01T13:27:47,483 ERROR [b0d354a8-4cdb-4ba9-acec-27d14926aaf4 main] > ql.Driver: FAILED: Execution Error, return code 3 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: > java.io.FileNotFoundException: File > file:/Users/stakiar/Documents/idea/apache-hive/itests/qtest-spark/target/tmp/scratchdir/stakiar/b0d354a8-4cdb-4ba9-acec-27d14926aaf4/hive_2017-08-01_13-27-45_553_1088589686371686526-1/-mr-10004/3/5 > does not exist > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:408) > at > org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:498) > at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) > at > org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at scala.collection.immutable.List.foreach(List.scala:381) > at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) > at scala.collection.immutable.List.map(List.scala:285) > at org.apache.spark.rdd.UnionRDD.getPartitions(UnionRDD.scala:82) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.rdd.RDD.partitions(RDD.scala:246) 
> at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply(UnionRDD.scala:82) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at scala.collection.immutable.List.foreach(List.scala:381) > at
[jira] [Commented] (HIVE-16979) Cache UGI for metastore
[ https://issues.apache.org/jira/browse/HIVE-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109684#comment-16109684 ] Gopal V commented on HIVE-16979: [~taoli-hwx]: does this fail queries which take > 24 hours? Is there something we can do to mark "liveness" from the query progress loop to make sure the FileSystem.closeAllForUgi() -> deleteOnExit doesn't cleanup any directory currently being written to inside the cluster?
[jira] [Commented] (HIVE-17170) Move thrift generated code to stand alone metastore
[ https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109672#comment-16109672 ] Hive QA commented on HIVE-17170: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879879/HIVE-17170.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6217/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6217/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6217/ Messages: {noformat} This message was trimmed, see log for full details [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[41,44] package org.apache.hadoop.hive.metastore.api does not exist [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[42,44] package org.apache.hadoop.hive.metastore.api does not exist [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[216,30] cannot find symbol symbol: class Table location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[216,50] cannot find symbol symbol: class MetaException location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[249,35] cannot find symbol symbol: class Table location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[249,55] cannot find 
symbol symbol: class MetaException location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[273,33] cannot find symbol symbol: class Table location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[273,53] cannot find symbol symbol: class MetaException location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[278,31] cannot find symbol symbol: class Table location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[278,70] cannot find symbol symbol: class MetaException location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[455,28] cannot find symbol symbol: class Table location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[455,48] cannot find symbol symbol: class MetaException location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[460,33] cannot find symbol symbol: class Table location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] 
/data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[460,53] cannot find symbol symbol: class MetaException location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[465,31] cannot find symbol symbol: class Table location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[465,71] cannot find symbol symbol: class MetaException location: class org.apache.hadoop.hive.druid.DruidStorageHandler [ERROR] /data/hiveptest/working/apache-github-source-source/druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java:[497,33] cannot find symbol symbol: class
[jira] [Updated] (HIVE-17172) add ordering checks to DiskRangeList
[ https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17172: Attachment: (was: HIVE-17172.02.patch) > add ordering checks to DiskRangeList > > > Key: HIVE-17172 > URL: https://issues.apache.org/jira/browse/HIVE-17172 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17172.01.patch, HIVE-17172.02.patch, > HIVE-17172.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17172) add ordering checks to DiskRangeList
[ https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17172: Attachment: HIVE-17172.02.patch
[jira] [Updated] (HIVE-17172) add ordering checks to DiskRangeList
[ https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17172: Attachment: HIVE-17172.02.patch Added some comments to the usage of the method (that I needed to look at to fix the test). Fixed the test to actually test something, and altered one test case that is valid and shouldn't cause an error.
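[Editor's note] The patch under review adds ordering checks to DiskRangeList, Hive's linked list of file byte ranges. A hypothetical sketch of what such a check looks like — the class below is a stand-in for illustration, not Hive's actual DiskRangeList code:

```java
// Stand-in for a disk-range linked list: each node covers [offset, end)
// and must start at or after the end of its predecessor.
class DiskRange {
    final long offset, end;  // half-open interval [offset, end)
    DiskRange next;

    DiskRange(long offset, long end) {
        if (end < offset) throw new IllegalArgumentException("end < offset");
        this.offset = offset;
        this.end = end;
    }

    // Appends a new range after this node and returns it, for chaining.
    DiskRange append(long offset, long end) {
        this.next = new DiskRange(offset, end);
        return this.next;
    }

    // Walks the list and fails fast on out-of-order or overlapping ranges,
    // so a corrupted list is caught where it is built, not where it is read.
    static void checkOrder(DiskRange head) {
        for (DiskRange cur = head; cur != null && cur.next != null; cur = cur.next) {
            if (cur.next.offset < cur.end) {
                throw new AssertionError("out-of-order or overlapping ranges: ["
                    + cur.offset + ", " + cur.end + ") followed by ["
                    + cur.next.offset + ", " + cur.next.end + ")");
            }
        }
    }
}
```

Adjacent ranges (one ending exactly where the next begins) and gapped ranges both pass; only overlap or reordering trips the assertion, which matches the comment above about one test case being valid and not an error.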
[jira] [Commented] (HIVE-17190) Schema changes for bitvectors for unpartitioned tables
[ https://issues.apache.org/jira/browse/HIVE-17190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109652#comment-16109652 ] Hive QA commented on HIVE-17190: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879856/HIVE-17190.3.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11137 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=158) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=236) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6216/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6216/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6216/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12879856 - PreCommit-HIVE-Build > Schema changes for bitvectors for unpartitioned tables > -- > > Key: HIVE-17190 > URL: https://issues.apache.org/jira/browse/HIVE-17190 > Project: Hive > Issue Type: Test > Components: Metastore, Statistics >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-17190.2.patch, HIVE-17190.3.patch > > > Missed in HIVE-16997 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList
[ https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109606#comment-16109606 ] Deepak Jaiswal commented on HIVE-17172: --- [~sershe] Can you please put the next patch in RB? It is much easier to review that way.
[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList
[ https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109604#comment-16109604 ] Deepak Jaiswal commented on HIVE-17172: --- +1 Lgtm.
[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList
[ https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109593#comment-16109593 ] Sergey Shelukhin commented on HIVE-17172: - actually there's a bug in the test :) will update it > add ordering checks to DiskRangeList > > > Key: HIVE-17172 > URL: https://issues.apache.org/jira/browse/HIVE-17172 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17172.01.patch, HIVE-17172.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList
[ https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109572#comment-16109572 ] Prasanth Jayachandran commented on HIVE-17172: -- +1 > add ordering checks to DiskRangeList > > > Key: HIVE-17172 > URL: https://issues.apache.org/jira/browse/HIVE-17172 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17172.01.patch, HIVE-17172.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17172) add ordering checks to DiskRangeList
[ https://issues.apache.org/jira/browse/HIVE-17172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109553#comment-16109553 ] Sergey Shelukhin commented on HIVE-17172: - [~prasanth_j] [~owen.omalley] [~djaiswal] ping > add ordering checks to DiskRangeList > > > Key: HIVE-17172 > URL: https://issues.apache.org/jira/browse/HIVE-17172 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17172.01.patch, HIVE-17172.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17212) Dynamic add partition by insert shouldn't generate INSERT event.
[ https://issues.apache.org/jira/browse/HIVE-17212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109522#comment-16109522 ] Sankar Hariappan commented on HIVE-17212: - Thanks [~anishek] for the review! Request [~daijy]/[~thejas] to commit this patch to master! > Dynamic add partition by insert shouldn't generate INSERT event. > > > Key: HIVE-17212 > URL: https://issues.apache.org/jira/browse/HIVE-17212 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, repl >Affects Versions: 2.1.0 >Reporter: Sankar Hariappan >Assignee: Sankar Hariappan > Labels: DR, replication > Fix For: 3.0.0 > > Attachments: HIVE-17212.01.patch > > > A partition is dynamically added if INSERT INTO is invoked on a non-existing > partition. > Generally, insert operation generated INSERT event to notify the operation > with new data files. > In this case, Hive should generate only ADD_PARTITION events with the new > files added. It shouldn't create INSERT event. > Need to test and verify this behaviour. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly
[ https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109512#comment-16109512 ] Hive QA commented on HIVE-16357: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879795/HIVE-16357.04.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11040 tests executed *Failed tests:* {noformat} TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=236) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=241) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_wise_fileformat5] (batchId=3) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6215/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6215/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6215/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12879795 - PreCommit-HIVE-Build > Failed folder creation when creating a new table is reported incorrectly > > > Key: HIVE-16357 > URL: https://issues.apache.org/jira/browse/HIVE-16357 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.3.0, 3.0.0 >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-16357.01.patch, HIVE-16357.02.patch, > HIVE-16357.03.patch, HIVE-16357.04.patch > > > If the directory for a Hive table could not be created, them the HMS will > throw a metaexception: > {code} > if (tblPath != null) { > if (!wh.isDir(tblPath)) { > if (!wh.mkdirs(tblPath, true)) { > throw new MetaException(tblPath > + " is not a directory or unable to create one"); > } > madeDir = true; > } > } > {code} > However in the finally block we always try to call the > DbNotificationListener, which in turn will also throw an exception because > the directory is missing, overwriting the initial exception with a > FileNotFoundException. 
> Actual stacktrace seen by the caller: > {code} > 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: > MetaException(message:java.lang.RuntimeException: > java.io.FileNotFoundException: File file:/.../0 does not exist) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) > at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown > Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
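[Editor's note] The exception-masking behaviour HIVE-16357 describes — a secondary failure in a finally block replacing the original MetaException — is a general Java pitfall and can be sketched in a few lines. The class and messages below are illustrative, not Hive's actual code:

```java
// Minimal illustration of exception masking: a throw inside a finally block
// replaces the exception already in flight, so the caller only ever sees the
// secondary failure (here, the listener's FileNotFoundException-style error),
// never the root-cause MetaException-style error.
public class FinallyMasking {
    static void createTable(boolean dirCreationFails) throws Exception {
        try {
            if (dirCreationFails) {
                // Root cause, analogous to the MetaException thrown by the HMS.
                throw new Exception("tblPath is not a directory or unable to create one");
            }
        } finally {
            // A listener that also depends on the directory fails here; its
            // exception masks the original one above.
            throw new Exception("File file:/.../0 does not exist");
        }
    }

    public static void main(String[] args) {
        try {
            createTable(true);
        } catch (Exception e) {
            // Only the finally-block exception survives.
            System.out.println(e.getMessage());
        }
    }
}
```

A common fix is to either guard the finally-block work so it cannot throw, or attach the secondary failure via `Throwable.addSuppressed` so the root cause is preserved.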
[jira] [Commented] (HIVE-17194) JDBC: Implement Gzip servlet filter
[ https://issues.apache.org/jira/browse/HIVE-17194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109500#comment-16109500 ] Gopal V commented on HIVE-17194: [~vgumashta]/[~thejas]: can you review? > JDBC: Implement Gzip servlet filter > --- > > Key: HIVE-17194 > URL: https://issues.apache.org/jira/browse/HIVE-17194 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, JDBC >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-17194.1.patch, HIVE-17194.2.patch, > HIVE-17194.3.patch > > > {code} > POST /cliservice HTTP/1.1 > Content-Type: application/x-thrift > Accept: application/x-thrift > User-Agent: Java/THttpClient/HC > Authorization: Basic YW5vbnltb3VzOmFub255bW91cw== > Content-Length: 71 > Host: localhost:10007 > Connection: Keep-Alive > Accept-Encoding: gzip,deflate > X-XSRF-HEADER: true > {code} > The Beeline client clearly sends out HTTP compression headers which are > ignored by the HTTP service layer in HS2. > After patch, result looks like > {code} > HTTP/1.1 200 OK > Date: Tue, 01 Aug 2017 01:47:23 GMT > Content-Type: application/x-thrift > Vary: Accept-Encoding, User-Agent > Content-Encoding: gzip > Transfer-Encoding: chunked > Server: Jetty(9.3.8.v20160314) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17222) Llap: Iotrace throws java.lang.UnsupportedOperationException with IncompleteCb
[ https://issues.apache.org/jira/browse/HIVE-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109468#comment-16109468 ] Sergey Shelukhin commented on HIVE-17222: - +1 > Llap: Iotrace throws java.lang.UnsupportedOperationException with > IncompleteCb > --- > > Key: HIVE-17222 > URL: https://issues.apache.org/jira/browse/HIVE-17222 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-17222.1.patch > > > branch: hive master > Running Q76 at 1 TB generates the following exception. > {noformat} > Caused by: java.io.IOException: java.lang.UnsupportedOperationException > at > org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.rethrowErrorIfAny(LlapRecordReader.java:349) > at > org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.nextCvb(LlapRecordReader.java:304) > at > org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:244) > at > org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:67) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ... 
23 more > Caused by: java.lang.UnsupportedOperationException > at > org.apache.hadoop.hive.common.io.DiskRange.getData(DiskRange.java:86) > at > org.apache.hadoop.hive.ql.io.orc.encoded.IoTrace.logRange(IoTrace.java:304) > at > org.apache.hadoop.hive.ql.io.orc.encoded.IoTrace.logRanges(IoTrace.java:291) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:328) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:426) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:250) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:247) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:247) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:96) > ... 6 more > {noformat} > When {{IncompleteCb}} is encountered, it ends up throwing this error. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
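[Editor's note] The failure mode in HIVE-17222 — `DiskRange.getData` throwing `UnsupportedOperationException` when the trace logger hits an `IncompleteCb` — follows a common pattern: a base class method that some subclasses do not support must be guarded before it is called. The mini-model below is hypothetical (Hive's `DiskRange` has no `hasData()` method; the name is invented for illustration):

```java
import java.nio.ByteBuffer;

// Hypothetical model of a range hierarchy where the base type carries no data
// buffer, mirroring how DiskRange.getData throws for an incomplete cache block.
public class RangeTrace {
    static class DiskRange {
        public boolean hasData() { return false; }
        public ByteBuffer getData() { throw new UnsupportedOperationException(); }
    }

    static class BufferChunk extends DiskRange {
        private final ByteBuffer data;
        BufferChunk(ByteBuffer data) { this.data = data; }
        @Override public boolean hasData() { return true; }
        @Override public ByteBuffer getData() { return data; }
    }

    // Trace helper that skips ranges with no backing data instead of throwing.
    static int loggedBytes(DiskRange r) {
        return r.hasData() ? r.getData().remaining() : 0;
    }
}
```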
[jira] [Updated] (HIVE-17213) HoS: file merging doesn't work for union all
[ https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-17213: Attachment: HIVE-17213.2.patch Attaching patch v2 to address the case when linkedFileSinkDesc is not set. > HoS: file merging doesn't work for union all > > > Key: HIVE-17213 > URL: https://issues.apache.org/jira/browse/HIVE-17213 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch, > HIVE-17213.2.patch > > > HoS file merging doesn't work properly since it doesn't set linked file sinks > properly which is used to generate move tasks. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17224) Move JDO classes to standalone metastore
[ https://issues.apache.org/jira/browse/HIVE-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates reassigned HIVE-17224: - > Move JDO classes to standalone metastore > > > Key: HIVE-17224 > URL: https://issues.apache.org/jira/browse/HIVE-17224 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > > The JDO model classes (MDatabase, MTable, etc.) and the package.jdo file that > defines the DB mapping need to be moved to the standalone metastore. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17170) Move thrift generated code to stand alone metastore
[ https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17170: -- Status: Patch Available (was: Open) > Move thrift generated code to stand alone metastore > --- > > Key: HIVE-17170 > URL: https://issues.apache.org/jira/browse/HIVE-17170 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-17170.patch > > > hive_metastore.thrift and the code it generates needs to be moved into the > standalone metastore module. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17170) Move thrift generated code to stand alone metastore
[ https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17170: -- Attachment: HIVE-17170.patch The patch is huge because it moves all the Thrift generated files around. It will be much easier to review the PR. > Move thrift generated code to stand alone metastore > --- > > Key: HIVE-17170 > URL: https://issues.apache.org/jira/browse/HIVE-17170 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Attachments: HIVE-17170.patch > > > hive_metastore.thrift and the code it generates needs to be moved into the > standalone metastore module. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17170) Move thrift generated code to stand alone metastore
[ https://issues.apache.org/jira/browse/HIVE-17170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109406#comment-16109406 ] ASF GitHub Bot commented on HIVE-17170: --- GitHub user alanfgates opened a pull request: https://github.com/apache/hive/pull/216 HIVE-17170 Move thrift generated code to stand alone metastore You can merge this pull request into a Git repository by running: $ git pull https://github.com/alanfgates/hive hive17170 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/hive/pull/216.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #216 commit ffd8599cd3db1cbb0464606901dc6f73916bdc69 Author: Alan Gates Date: 2017-07-25T20:50:38Z HIVE-17170 Move thrift generated code to stand alone metastore > Move thrift generated code to stand alone metastore > --- > > Key: HIVE-17170 > URL: https://issues.apache.org/jira/browse/HIVE-17170 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > > hive_metastore.thrift and the code it generates needs to be moved into the > standalone metastore module. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17189) Fix backwards incompatibility in HiveMetaStoreClient
[ https://issues.apache.org/jira/browse/HIVE-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17189: --- Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Status: Resolved (was: Patch Available) Pushed to master and branch-2. Thanks for the review [~alangates] > Fix backwards incompatibility in HiveMetaStoreClient > > > Key: HIVE-17189 > URL: https://issues.apache.org/jira/browse/HIVE-17189 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.1.1 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-17189.01.patch, HIVE-17189.02.patch > > > HIVE-12730 adds the ability to edit the basic stats using {{alter table}} and > {{alter partition}} commands. However, it changes the signature of @public > interface of MetastoreClient and removes some methods which breaks backwards > compatibility. This can be fixed easily by re-introducing the removed methods > and making them call into newly added method > {{alter_table_with_environment_context}} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17222) Llap: Iotrace throws java.lang.UnsupportedOperationException with IncompleteCb
[ https://issues.apache.org/jira/browse/HIVE-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109364#comment-16109364 ] Hive QA commented on HIVE-17222: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879789/HIVE-17222.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11018 tests executed *Failed tests:* {noformat} TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=240) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6214/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6214/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6214/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12879789 - PreCommit-HIVE-Build > Llap: Iotrace throws java.lang.UnsupportedOperationException with > IncompleteCb > --- > > Key: HIVE-17222 > URL: https://issues.apache.org/jira/browse/HIVE-17222 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-17222.1.patch > > > branch: hive master > Running Q76 at 1 TB generates the following exception. > {noformat} > Caused by: java.io.IOException: java.lang.UnsupportedOperationException > at > org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.rethrowErrorIfAny(LlapRecordReader.java:349) > at > org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.nextCvb(LlapRecordReader.java:304) > at > org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:244) > at > org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:67) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ... 
23 more > Caused by: java.lang.UnsupportedOperationException > at > org.apache.hadoop.hive.common.io.DiskRange.getData(DiskRange.java:86) > at > org.apache.hadoop.hive.ql.io.orc.encoded.IoTrace.logRange(IoTrace.java:304) > at > org.apache.hadoop.hive.ql.io.orc.encoded.IoTrace.logRanges(IoTrace.java:291) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:328) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:426) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:250) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:247) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:247) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:96) > ... 6 more > {noformat} > When {{IncompleteCb}} is encountered, it ends up throwing this error. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17167) Create metastore specific configuration tool
[ https://issues.apache.org/jira/browse/HIVE-17167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-17167: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Patch 2 committed. Thanks Vihang for the reviews and feedback. > Create metastore specific configuration tool > > > Key: HIVE-17167 > URL: https://issues.apache.org/jira/browse/HIVE-17167 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Alan Gates >Assignee: Alan Gates > Fix For: 3.0.0 > > Attachments: HIVE-17167.2.patch, HIVE-17167.patch > > > As part of making the metastore a separately releasable module we need > configuration tools that are specific to that module. It cannot use or > extend HiveConf as that is in hive common. But it must take a HiveConf > object and be able to operate on it. > The best way to achieve this is using Hadoop's Configuration object (which > HiveConf extends) together with enums and static methods. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17189) Fix backwards incompatibility in HiveMetaStoreClient
[ https://issues.apache.org/jira/browse/HIVE-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109212#comment-16109212 ] Alan Gates commented on HIVE-17189: --- +1 > Fix backwards incompatibility in HiveMetaStoreClient > > > Key: HIVE-17189 > URL: https://issues.apache.org/jira/browse/HIVE-17189 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.1.1 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17189.01.patch, HIVE-17189.02.patch > > > HIVE-12730 adds the ability to edit the basic stats using {{alter table}} and > {{alter partition}} commands. However, it changes the signature of @public > interface of MetastoreClient and removes some methods which breaks backwards > compatibility. This can be fixed easily by re-introducing the removed methods > and making them call into newly added method > {{alter_table_with_environment_context}} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16845) INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109205#comment-16109205 ] Sahil Takiar commented on HIVE-16845: - +1 > INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE > - > > Key: HIVE-16845 > URL: https://issues.apache.org/jira/browse/HIVE-16845 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Marta Kuczora >Assignee: Marta Kuczora > Attachments: HIVE-16845.1.patch, HIVE-16845.2.patch, > HIVE-16845.3.patch, HIVE-16845.4.patch > > > *How to reproduce* > - Create a partitioned table on S3: > {noformat} > CREATE EXTERNAL TABLE s3table(user_id string COMMENT '', event_name string > COMMENT '') PARTITIONED BY (reported_date string, product_id int) LOCATION > 's3a://'; > {noformat} > - Create a temp table: > {noformat} > create table tmp_table (id string, name string, date string, pid int) row > format delimited fields terminated by '\t' lines terminated by '\n' stored as > textfile; > {noformat} > - Load the following rows to the tmp table: > {noformat} > u1value1 2017-04-10 1 > u2value2 2017-04-10 1 > u3value3 2017-04-10 10001 > {noformat} > - Set the following parameters: > -- hive.exec.dynamic.partition.mode=nonstrict > -- mapreduce.input.fileinputformat.split.maxsize=10 > -- hive.blobstore.optimizations.enabled=true > -- hive.blobstore.use.blobstore.as.scratchdir=false > -- hive.merge.mapfiles=true > - Insert the rows from the temp table into the s3 table: > {noformat} > INSERT OVERWRITE TABLE s3table > PARTITION (reported_date, product_id) > SELECT > t.id as user_id, > t.name as event_name, > t.date as reported_date, > t.pid as product_id > FROM tmp_table t; > {noformat} > A NPE will occur with the following stacktrace: > {noformat} > 2017-05-08 21:32:50,607 ERROR > org.apache.hive.service.cli.operation.Operation: > [HiveServer2-Background-Pool: Thread-184028]: Error running hive query: > org.apache.hive.service.cli.HiveSQLException: Error while 
processing > statement: FAILED: Execution Error, return code -101 from > org.apache.hadoop.hive.ql.exec.ConditionalTask. null > at > org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:239) > at > org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88) > at > org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920) > at > org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.plan.ConditionalResolverMergeFiles.generateActualTasks(ConditionalResolverMergeFiles.java:290) > at > org.apache.hadoop.hive.ql.plan.ConditionalResolverMergeFiles.getTasks(ConditionalResolverMergeFiles.java:175) > at > org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1977) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1690) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1422) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1206) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1201) > at > 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237) > ... 11 more > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17208) Repl dump should pass in db/table information to authorization API
[ https://issues.apache.org/jira/browse/HIVE-17208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109160#comment-16109160 ] Hive QA commented on HIVE-17208: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879777/HIVE-17208.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11019 tests executed *Failed tests:* {noformat} TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=158) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[repl_dump_requires_admin] (batchId=90) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[repl_load_requires_admin] (batchId=90) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6213/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6213/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6213/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12879777 - PreCommit-HIVE-Build > Repl dump should pass in db/table information to authorization API > -- > > Key: HIVE-17208 > URL: https://issues.apache.org/jira/browse/HIVE-17208 > Project: Hive > Issue Type: Bug > Components: Authorization >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-17208.1.patch, HIVE-17208.2.patch > > > "repl dump" does not provide db/table information. That is necessary for > authorization replication in ranger. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17190) Schema changes for bitvectors for unpartitioned tables
[ https://issues.apache.org/jira/browse/HIVE-17190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-17190: Status: Patch Available (was: Open) > Schema changes for bitvectors for unpartitioned tables > -- > > Key: HIVE-17190 > URL: https://issues.apache.org/jira/browse/HIVE-17190 > Project: Hive > Issue Type: Test > Components: Metastore, Statistics >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-17190.2.patch, HIVE-17190.3.patch > > > Missed in HIVE-16997 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17190) Schema changes for bitvectors for unpartitioned tables
[ https://issues.apache.org/jira/browse/HIVE-17190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-17190: Status: Open (was: Patch Available) > Schema changes for bitvectors for unpartitioned tables > -- > > Key: HIVE-17190 > URL: https://issues.apache.org/jira/browse/HIVE-17190 > Project: Hive > Issue Type: Test > Components: Metastore, Statistics >Affects Versions: 3.0.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-17190.2.patch, HIVE-17190.3.patch > > > Missed in HIVE-16997 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17190) Schema changes for bitvectors for unpartitioned tables
[ https://issues.apache.org/jira/browse/HIVE-17190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-17190: Attachment: HIVE-17190.3.patch
[jira] [Commented] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly
[ https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109104#comment-16109104 ] Sergio Peña commented on HIVE-16357: [~pvary] Why is the DbNotificationListener called when table creation fails? I see this event is called inside the try block, and if table creation fails and an exception is thrown, then this shouldn't be called, right? What am I missing? Btw, DbNotificationListener is a transactional listener, so it is only called inside the try block (not in the finally). See this line https://github.com/apache/hive/blob/master/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L1547 > Failed folder creation when creating a new table is reported incorrectly > > > Key: HIVE-16357 > URL: https://issues.apache.org/jira/browse/HIVE-16357 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.3.0, 3.0.0 >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-16357.01.patch, HIVE-16357.02.patch, > HIVE-16357.03.patch, HIVE-16357.04.patch > > > If the directory for a Hive table could not be created, then the HMS will > throw a MetaException: > {code} > if (tblPath != null) { > if (!wh.isDir(tblPath)) { > if (!wh.mkdirs(tblPath, true)) { > throw new MetaException(tblPath > + " is not a directory or unable to create one"); > } > madeDir = true; > } > } > {code} > However, in the finally block we always try to call the > DbNotificationListener, which in turn will also throw an exception because > the directory is missing, overwriting the initial exception with a > FileNotFoundException. 
> Actual stacktrace seen by the caller: > {code} > 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: > MetaException(message:java.lang.RuntimeException: > java.io.FileNotFoundException: File file:/.../0 does not exist) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) > at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown > Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) > at > 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File > file:/.../0 does not exist > at > org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.(DbNotificationListener.java:203) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463) > at >
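The exception masking described in the report (an exception thrown from a finally-block notification path replacing the original MetaException) can be reproduced with a minimal, self-contained sketch. All class and message names below are hypothetical stand-ins, not actual Hive code:

```java
public class ListenerMasking {
    // run() imitates the create_table_core shape: the "table work" happens in
    // the inner try, the "listener notification" in the finally block.
    public static String run(boolean dirCreationFails, boolean listenerFails) {
        try {
            try {
                if (dirCreationFails) {
                    throw new IllegalStateException("tblPath is not a directory or unable to create one");
                }
                return "created";
            } finally {
                // Unconditional notification: if it throws, its exception
                // REPLACES any exception already propagating from the try.
                if (listenerFails) {
                    throw new RuntimeException("File file:/.../0 does not exist");
                }
            }
        } catch (RuntimeException e) {
            return e.getMessage(); // what the caller ultimately sees
        }
    }
}
```

With both failures enabled, the caller sees only the listener's FileNotFoundException-style message; the original mkdirs failure is lost, which is exactly the confusion the report describes.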
[jira] [Commented] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly
[ https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109081#comment-16109081 ] Peter Vary commented on HIVE-16357: --- [~spena]: With the original code, the issue is the following: - When table creation fails - DBNotificationListener will be triggered with incomplete data - DBNotificationListener will throw a new exception which masks the original one from the user, making it harder to find the root problem. The patch fixes this by adding the check inside the DBNotificationListener, thus avoiding the second exception. The patch also makes sure that any further exceptions thrown by the DBNotificationListener, or by any other Listener configured by the user, are caught, logged, and not propagated further. The rationale behind this solution is the following: - Listeners can be configured by the user, so they can have different usages. For example, one listener might only collect failed table creation events. In that case the listeners should be notified on every event, and each listener should decide which events to handle and which to omit. - DBNotificationListener, on the other hand, is not interested in failed events, so it should skip them. - Also, since listeners are configured by users, their code can be unstable or buggy. With this in mind, it would be good to be sure that they do not affect each other: if there is an error in one listener, the other listeners should still be notified. We might not be aware of every usage pattern of the Listeners, or might be overcomplicating this Listener architecture. What do you think [~spena]? Is it worth being prepared for these use cases? 
Thanks, Peter
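The listener-isolation idea Peter describes can be sketched in a few lines: each listener is notified inside its own try/catch, so one buggy listener can neither mask the caller's exception nor block the remaining listeners. The names are hypothetical, not the actual Hive listener API:

```java
import java.util.ArrayList;
import java.util.List;

public class SafeNotifier {
    public interface Listener {
        void onEvent(String event) throws Exception;
    }

    // Notify every configured listener. A failure in one listener is recorded
    // (in real code: logged) and does not stop the remaining listeners from
    // being notified.
    public static List<String> notifyAllListeners(List<Listener> listeners, String event) {
        List<String> errors = new ArrayList<>();
        for (Listener l : listeners) {
            try {
                l.onEvent(event);
            } catch (Exception e) {
                errors.add(e.getMessage());
            }
        }
        return errors;
    }
}
```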
[jira] [Commented] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly
[ https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109056#comment-16109056 ] Sergio Peña commented on HIVE-16357: [~zsombor.klara] I don't understand why triggering events for failed operations causes this error. It seems the fix will not address the original problem; this patch fixes an issue that shouldn't cause problems on the HMS. Btw, for a future fix, I think it is better to check the status on the HiveMetaStore side itself and not trigger the event at all, instead of doing it in the DbNotificationListener. Even if the current change avoids acting on failed events, developers will still be confused about why notifyEvent() is called on the HiveMetaStore for failed transactions.
[jira] [Commented] (HIVE-17144) export of temporary tables not working and it seems to be using distcp rather than filesystem copy
[ https://issues.apache.org/jira/browse/HIVE-17144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16109016#comment-16109016 ] Hive QA commented on HIVE-17144: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879775/HIVE-17144.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11019 tests executed *Failed tests:* {noformat} TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=240) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=240) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=158) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6212/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6212/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6212/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited 
with: TestsFailedException: 10 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12879775 - PreCommit-HIVE-Build > export of temporary tables not working and it seems to be using distcp rather > than filesystem copy > -- > > Key: HIVE-17144 > URL: https://issues.apache.org/jira/browse/HIVE-17144 > Project: Hive > Issue Type: Bug > Components: Hive, HiveServer2 >Affects Versions: 3.0.0 >Reporter: anishek >Assignee: anishek > Fix For: 3.0.0 > > Attachments: HIVE-17144.1.patch > > > create temporary table t1 (i int); > insert into t1 values (3); > export table t1 to 'hdfs://somelocation'; > above fails. additionally it should use filesystem copy and not distcp to do > the job. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException
[ https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108896#comment-16108896 ] Aroop Maliakkal commented on HIVE-17115: [~daijy] :: The create tables succeeded. We might be using the direct mysql connection from the hive client when we were creating these tables. Now we enforced all clients to go through metastore instead of direct mysql connection. > MetaStoreUtils.getDeserializer doesn't catch the > java.lang.ClassNotFoundException > - > > Key: HIVE-17115 > URL: https://issues.apache.org/jira/browse/HIVE-17115 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 1.2.1 >Reporter: Erik.fang >Assignee: Erik.fang > Attachments: HIVE-17115.1.patch, HIVE-17115.patch > > > Suppose we create a table with Custom SerDe, then call > HiveMetaStoreClient.getSchema(String db, String tableName) to extract the > metadata from HiveMetaStore Service > the thrift client hangs there with exception in HiveMetaStore Service's log, > such as > {code:java} > Exception in thread "pool-5-thread-129" java.lang.NoClassDefFoundError: > org/apache/hadoop/hbase/util/Bytes > at > org.apache.hadoop.hive.hbase.HBaseSerDe.parseColumnsMapping(HBaseSerDe.java:184) > at > org.apache.hadoop.hive.hbase.HBaseSerDeParameters.(HBaseSerDeParameters.java:73) > at > org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:117) > at > org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53) > at > org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:401) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_fields_with_environment_context(HiveMetaStore.java:3556) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_schema_with_environment_context(HiveMetaStore.java:3636) > at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source) > at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) > at com.sun.proxy.$Proxy4.get_schema_with_environment_context(Unknown > Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9146) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_schema_with_environment_context.getResult(ThriftHiveMetastore.java:9130) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassNotFoundException: > org.apache.hadoop.hbase.util.Bytes > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) > at 
java.lang.ClassLoader.loadClass(ClassLoader.java:357) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
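The fix direction the issue title points at can be sketched in isolation: NoClassDefFoundError is an Error, not an Exception, so a plain catch (Exception e) misses it and the failure escapes to the worker thread. Catching LinkageError as well turns a missing dependency into a reportable condition. The class below is a hypothetical stand-in, not MetaStoreUtils itself:

```java
public class SerDeClassProbe {
    // Loading a SerDe class by name, catching both the checked
    // ClassNotFoundException and the unchecked LinkageError family
    // (e.g. NoClassDefFoundError) so neither can kill the calling thread.
    public static String tryLoad(String className) {
        try {
            Class.forName(className);
            return "loaded";
        } catch (ClassNotFoundException | LinkageError e) {
            return "unavailable: " + e.getClass().getSimpleName();
        }
    }
}
```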
[jira] [Commented] (HIVE-16896) move replication load related work in semantic analysis phase to execution phase using a task
[ https://issues.apache.org/jira/browse/HIVE-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108868#comment-16108868 ] Hive QA commented on HIVE-16896: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879776/HIVE-16896.1.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11019 tests executed *Failed tests:* {noformat} TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=158) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[repl_load_requires_admin] (batchId=90) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6211/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6211/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6211/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12879776 - PreCommit-HIVE-Build > move replication load related work in semantic analysis phase to execution > phase using a task > - > > Key: HIVE-16896 > URL: https://issues.apache.org/jira/browse/HIVE-16896 > Project: Hive > Issue Type: Sub-task >Reporter: anishek >Assignee: anishek > Attachments: HIVE-16896.1.patch > > > we want to avoid creating too many tasks in memory in the analysis phase while > loading data. Currently we load all the files in the bootstrap dump location > as {{FileStatus[]}} and then iterate over them to load objects; we should > rather move to > {code} > org.apache.hadoop.fs.RemoteIterator<LocatedFileStatus> listFiles(Path > f, boolean recursive) > {code} > which would internally batch and return values. > additionally, since we can't hand off partial tasks from the analysis phase => > execution phase, we are going to move the whole repl load functionality to the > execution phase so we can better control creation/execution of tasks (not > related to hive {{Task}}; we may get rid of ReplCopyTask). > An additional consideration at the end of this jira is to > see if we want to specifically do a multi-threaded load of the bootstrap dump. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17213) HoS: file merging doesn't work for union all
[ https://issues.apache.org/jira/browse/HIVE-17213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108759#comment-16108759 ] Hive QA commented on HIVE-17213: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879761/HIVE-17213.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 11018 tests executed *Failed tests:* {noformat} TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=46) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge1] (batchId=168) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge2] (batchId=171) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge3] (batchId=168) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge4] (batchId=170) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge5] (batchId=169) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge6] (batchId=169) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge7] (batchId=171) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge9] (batchId=168) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge_diff_fs] (batchId=168) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[merge1] (batchId=124) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[merge2] (batchId=101) 
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6210/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6210/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6210/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 18 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12879761 - PreCommit-HIVE-Build > HoS: file merging doesn't work for union all > > > Key: HIVE-17213 > URL: https://issues.apache.org/jira/browse/HIVE-17213 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-17213.0.patch, HIVE-17213.1.patch > > > HoS file merging doesn't work properly since it doesn't set linked file sinks > properly which is used to generate move tasks. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17194) JDBC: Implement Gzip servlet filter
[ https://issues.apache.org/jira/browse/HIVE-17194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108682#comment-16108682 ] Hive QA commented on HIVE-17194: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12879759/HIVE-17194.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11012 tests executed *Failed tests:* {noformat} TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=242) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6209/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6209/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6209/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12879759 - PreCommit-HIVE-Build > JDBC: Implement Gzip servlet filter > --- > > Key: HIVE-17194 > URL: https://issues.apache.org/jira/browse/HIVE-17194 > Project: Hive > Issue Type: Improvement > Components: HiveServer2, JDBC >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-17194.1.patch, HIVE-17194.2.patch, > HIVE-17194.3.patch > > > {code} > POST /cliservice HTTP/1.1 > Content-Type: application/x-thrift > Accept: application/x-thrift > User-Agent: Java/THttpClient/HC > Authorization: Basic YW5vbnltb3VzOmFub255bW91cw== > Content-Length: 71 > Host: localhost:10007 > Connection: Keep-Alive > Accept-Encoding: gzip,deflate > X-XSRF-HEADER: true > {code} > The Beeline client clearly sends out HTTP compression headers which are > ignored by the HTTP service layer in HS2. > After patch, result looks like > {code} > HTTP/1.1 200 OK > Date: Tue, 01 Aug 2017 01:47:23 GMT > Content-Type: application/x-thrift > Vary: Accept-Encoding, User-Agent > Content-Encoding: gzip > Transfer-Encoding: chunked > Server: Jetty(9.3.8.v20160314) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
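Independent of the actual HS2 patch (which wires this into Jetty), the core contract a gzip servlet filter implements for the request/response exchange shown above can be sketched with the JDK alone: compress the body only when the client advertised gzip support. This is an illustrative sketch, not the filter from the patch:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipBody {
    // Compress the response body only when the client's Accept-Encoding
    // header contains gzip; otherwise pass it through unchanged. A real
    // filter would also set "Content-Encoding: gzip" and the Vary header.
    public static byte[] encode(String acceptEncoding, byte[] body) throws IOException {
        if (acceptEncoding == null || !acceptEncoding.contains("gzip")) {
            return body;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(body);
        }
        return out.toByteArray();
    }

    // Helper mirroring what a gzip-aware HTTP client does on receipt.
    public static byte[] decode(byte[] gzBody) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzBody))) {
            return in.readAllBytes();
        }
    }
}
```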
[jira] [Commented] (HIVE-17115) MetaStoreUtils.getDeserializer doesn't catch the java.lang.ClassNotFoundException
[ https://issues.apache.org/jira/browse/HIVE-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108672#comment-16108672 ] Erik.fang commented on HIVE-17115: -- In our cluster, I think both a Hive client with a local-mode metastore and a standalone metastore service are deployed, sharing the backend MySQL. In a Hive client with a local-mode metastore, users can add jars themselves, so they can add hbase.jar and create the table. However, the metastore service doesn't load hbase.jar, so a NoClassDefFoundError is raised by HiveMetaStoreClient.getSchema. This might be a deployment issue; however, it is always inappropriate to miss the NoClassDefFoundError and crash the worker thread in the metastore service.
by: java.lang.ClassNotFoundException: > org.apache.hadoop.hbase.util.Bytes > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
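The comment above argues that the metastore should catch the class-loading failure instead of letting it kill a worker thread. A minimal Java sketch of that idea, with illustrative names (SafeDeserializerLoader and SerDeLoadException are not Hive's actual API): LinkageError, the parent of NoClassDefFoundError, is caught alongside ClassNotFoundException and rethrown as a checked exception.

```java
public class SafeDeserializerLoader {

    /** Checked wrapper thrown instead of letting a LinkageError escape. */
    public static class SerDeLoadException extends Exception {
        public SerDeLoadException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Instantiate a (de)serializer class by name without risking the thread. */
    public static Object loadSerDe(String className) throws SerDeLoadException {
        try {
            Class<?> clazz = Class.forName(className, true,
                    Thread.currentThread().getContextClassLoader());
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError extends LinkageError: a SerDe whose
            // dependencies (e.g. hbase.jar) are absent from the metastore
            // classpath is reported cleanly instead of crashing the thread.
            throw new SerDeLoadException("Unable to load SerDe class " + className, e);
        } catch (ReflectiveOperationException e) {
            throw new SerDeLoadException("Unable to instantiate SerDe class " + className, e);
        }
    }
}
```

Catching LinkageError rather than bare NoClassDefFoundError also covers UnsatisfiedLinkError and ExceptionInInitializerError, which can surface the same way when a SerDe's transitive dependencies are missing.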
[jira] [Commented] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly
[ https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108643#comment-16108643 ] Peter Vary commented on HIVE-16357: --- Thanks for the patch [~zsombor.klara]! I like this solution. +1 pending tests. Peter > Failed folder creation when creating a new table is reported incorrectly > > > Key: HIVE-16357 > URL: https://issues.apache.org/jira/browse/HIVE-16357 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 2.3.0, 3.0.0 >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara >Priority: Minor > Fix For: 3.0.0 > > Attachments: HIVE-16357.01.patch, HIVE-16357.02.patch, > HIVE-16357.03.patch, HIVE-16357.04.patch > > > If the directory for a Hive table could not be created, then the HMS will > throw a MetaException: > {code} > if (tblPath != null) { > if (!wh.isDir(tblPath)) { > if (!wh.mkdirs(tblPath, true)) { > throw new MetaException(tblPath > + " is not a directory or unable to create one"); > } > madeDir = true; > } > } > {code} > However, in the finally block we always try to call the > DbNotificationListener, which in turn will also throw an exception because > the directory is missing, overwriting the initial exception with a > FileNotFoundException. 
> Actual stacktrace seen by the caller: > {code} > 2017-04-03T05:58:00,128 ERROR [pool-7-thread-2] metastore.RetryingHMSHandler: > MetaException(message:java.lang.RuntimeException: > java.io.FileNotFoundException: File file:/.../0 does not exist) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6074) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1496) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) > at com.sun.proxy.$Proxy28.create_table_with_environment_context(Unknown > Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11125) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:11109) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118) > at > 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: File > file:/.../0 does not exist > at > org.apache.hive.hcatalog.listener.DbNotificationListener$FileIterator.<init>(DbNotificationListener.java:203) > at > org.apache.hive.hcatalog.listener.DbNotificationListener.onCreateTable(DbNotificationListener.java:137) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1463) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1482) > ... 20 more > Caused by: java.io.FileNotFoundException: File file:/.../0 does not exist > at > org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:429) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1515) > at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1555) > at >
[jira] [Commented] (HIVE-14013) Describe table doesn't show unicode properly
[ https://issues.apache.org/jira/browse/HIVE-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108625#comment-16108625 ] hzfeng commented on HIVE-14013: --- Thanks a lot! Could you guide me on how you configured hive-2.3.0? I would appreciate it. > Describe table doesn't show unicode properly > > > Key: HIVE-14013 > URL: https://issues.apache.org/jira/browse/HIVE-14013 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 2.2.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 2.3.0 > > Attachments: HIVE-14013.1.patch, HIVE-14013.2.patch, > HIVE-14013.3.patch, HIVE-14013.4.patch > > > Describe table output will show comments incorrectly rather than the unicode > itself. > {noformat} > hive> desc formatted t1; > # Detailed Table Information > Table Type: MANAGED_TABLE > Table Parameters: > COLUMN_STATS_ACCURATE {"BASIC_STATS":"true"} > comment \u8868\u4E2D\u6587\u6D4B\u8BD5 > numFiles 0 > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
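The comment parameter shown above is stored as Java-style unicode escapes; decoding \u8868\u4E2D\u6587\u6D4B\u8BD5 yields the original Chinese comment. A small decoder, purely illustrative and unrelated to the actual HIVE-14013 patch, makes the stored value readable:

```java
public class UnicodeUnescape {

    /** Decode escapes of the form backslash-u-XXXX into real characters. */
    public static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            if (s.charAt(i) == '\\' && i + 5 < s.length() && s.charAt(i + 1) == 'u') {
                // four hex digits follow the two-character escape prefix
                out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(s.charAt(i));
                i++;
            }
        }
        return out.toString();
    }
}
```

Running it on the escaped comment from the report returns "表中文测试", which is what `desc formatted` should have displayed. A malformed escape (non-hex digits after the prefix) would throw NumberFormatException; a production decoder would validate first.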
[jira] [Comment Edited] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly
[ https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108620#comment-16108620 ] Barna Zsombor Klara edited comment on HIVE-16357 at 8/1/17 9:25 AM: Removed changes from the MetaStore class and made notifications more resilient. An exception from one listener should not affect the rest of the listeners. was (Author: zsombor.klara): Remove changes from the MetaStore class and made notifications more resilient. An exception from one listener should not affect the rest of the listeners.
[jira] [Updated] (HIVE-16357) Failed folder creation when creating a new table is reported incorrectly
[ https://issues.apache.org/jira/browse/HIVE-16357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Barna Zsombor Klara updated HIVE-16357: --- Attachment: HIVE-16357.04.patch Remove changes from the MetaStore class and made notifications more resilient. An exception from one listener should not affect the rest of the listeners.
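A minimal Java sketch of the "resilient notification" behavior described in the comment above (the Listener interface here is illustrative, not Hive's actual MetaStoreEventListener API): every listener runs even when an earlier one throws, and only the first failure propagates, so a later listener cannot mask the original error.

```java
import java.util.List;

public class ResilientNotifier {

    public interface Listener {
        void onEvent(String event) throws Exception;
    }

    /** Invoke every listener; rethrow only the first failure afterwards. */
    public static void notifyListeners(List<Listener> listeners, String event) throws Exception {
        Exception first = null;
        for (Listener l : listeners) {
            try {
                l.onEvent(event);
            } catch (Exception e) {
                if (first == null) {
                    first = e;   // remember the first failure only
                }
                // later failures are dropped so remaining listeners still run
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

Applied to the bug report's scenario, this pattern keeps the meaningful MetaException from table creation visible instead of letting a FileNotFoundException from one notification listener overwrite it.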
[jira] [Commented] (HIVE-16845) INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108598#comment-16108598 ] Marta Kuczora commented on HIVE-16845: -- [~stakiar], [~pvary], could you please have a look at the patch? > INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE > - > > Key: HIVE-16845 > URL: https://issues.apache.org/jira/browse/HIVE-16845 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Marta Kuczora >Assignee: Marta Kuczora > Attachments: HIVE-16845.1.patch, HIVE-16845.2.patch, > HIVE-16845.3.patch, HIVE-16845.4.patch > > > *How to reproduce* > - Create a partitioned table on S3: > {noformat} > CREATE EXTERNAL TABLE s3table(user_id string COMMENT '', event_name string > COMMENT '') PARTITIONED BY (reported_date string, product_id int) LOCATION > 's3a://'; > {noformat} > - Create a temp table: > {noformat} > create table tmp_table (id string, name string, date string, pid int) row > format delimited fields terminated by '\t' lines terminated by '\n' stored as > textfile; > {noformat} > - Load the following rows to the tmp table: > {noformat} > u1	value1	2017-04-10	1 > u2	value2	2017-04-10	1 > u3	value3	2017-04-10	10001 > {noformat} > - Set the following parameters: > -- hive.exec.dynamic.partition.mode=nonstrict > -- mapreduce.input.fileinputformat.split.maxsize=10 > -- hive.blobstore.optimizations.enabled=true > -- hive.blobstore.use.blobstore.as.scratchdir=false > -- hive.merge.mapfiles=true > - Insert the rows from the temp table into the s3 table: > {noformat} > INSERT OVERWRITE TABLE s3table > PARTITION (reported_date, product_id) > SELECT > t.id as user_id, > t.name as event_name, > t.date as reported_date, > t.pid as product_id > FROM tmp_table t; > {noformat} > An NPE will occur with the following stacktrace: > {noformat} > 2017-05-08 21:32:50,607 ERROR > org.apache.hive.service.cli.operation.Operation: > [HiveServer2-Background-Pool: Thread-184028]: Error running hive 
query: > org.apache.hive.service.cli.HiveSQLException: Error while processing > statement: FAILED: Execution Error, return code -101 from > org.apache.hadoop.hive.ql.exec.ConditionalTask. null > at > org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:239) > at > org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88) > at > org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920) > at > org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.plan.ConditionalResolverMergeFiles.generateActualTasks(ConditionalResolverMergeFiles.java:290) > at > org.apache.hadoop.hive.ql.plan.ConditionalResolverMergeFiles.getTasks(ConditionalResolverMergeFiles.java:175) > at > org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1977) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1690) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1422) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1206) > at 
org.apache.hadoop.hive.ql.Driver.run(Driver.java:1201) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237) > ... 11 more > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-14261) Support set/unset partition parameters
[ https://issues.apache.org/jira/browse/HIVE-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108587#comment-16108587 ] Hive QA commented on HIVE-14261: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12818468/HIVE-14261.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11019 tests executed *Failed tests:* {noformat} TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=235) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite] (batchId=240) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99) org.apache.hadoop.hive.metastore.TestMarkPartitionRemote.testMarkingPartitionSet (batchId=214) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6208/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6208/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6208/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12818468 - PreCommit-HIVE-Build > Support set/unset partition parameters > -- > > Key: HIVE-14261 > URL: https://issues.apache.org/jira/browse/HIVE-14261 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14261.01.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17222) Llap: Iotrace throws java.lang.UnsupportedOperationException with IncompleteCb
[ https://issues.apache.org/jira/browse/HIVE-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan reassigned HIVE-17222: --- Assignee: Rajesh Balamohan > Llap: Iotrace throws java.lang.UnsupportedOperationException with > IncompleteCb > --- > > Key: HIVE-17222 > URL: https://issues.apache.org/jira/browse/HIVE-17222 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-17222.1.patch > > > branch: hive master > Running Q76 at 1 TB generates the following exception. > {noformat} > Caused by: java.io.IOException: java.lang.UnsupportedOperationException > at > org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.rethrowErrorIfAny(LlapRecordReader.java:349) > at > org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.nextCvb(LlapRecordReader.java:304) > at > org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:244) > at > org.apache.hadoop.hive.llap.io.api.impl.LlapRecordReader.next(LlapRecordReader.java:67) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ... 
23 more > Caused by: java.lang.UnsupportedOperationException > at > org.apache.hadoop.hive.common.io.DiskRange.getData(DiskRange.java:86) > at > org.apache.hadoop.hive.ql.io.orc.encoded.IoTrace.logRange(IoTrace.java:304) > at > org.apache.hadoop.hive.ql.io.orc.encoded.IoTrace.logRanges(IoTrace.java:291) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:328) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:426) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:250) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:247) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:247) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:96) > ... 6 more > {noformat} > When {{IncompleteCb}} is encountered, it ends up throwing this error. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
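The stack trace above shows the trace code calling DiskRange.getData() on a range that has no materialized buffer (an IncompleteCb), which throws UnsupportedOperationException from the base class. A self-contained Java sketch of that failure mode and a guard against it, using an illustrative API rather than Hive's actual DiskRange/IoTrace classes:

```java
import java.nio.ByteBuffer;

public class IoTraceSketch {

    /** Base range: incomplete, no data buffer attached yet. */
    public static class DiskRange {
        public ByteBuffer getData() { throw new UnsupportedOperationException(); }
        public boolean hasData() { return false; }
    }

    /** A range whose bytes have actually been read into memory. */
    public static class BufferChunk extends DiskRange {
        private final ByteBuffer data;
        public BufferChunk(ByteBuffer data) { this.data = data; }
        @Override public ByteBuffer getData() { return data; }
        @Override public boolean hasData() { return true; }
    }

    /** Describe a range for a trace log without assuming its data is present. */
    public static String logRange(DiskRange r) {
        // Probing hasData() first avoids the UnsupportedOperationException
        // an incomplete range throws from getData().
        return r.hasData()
                ? "range with " + r.getData().remaining() + " bytes"
                : "incomplete range (no data yet)";
    }
}
```

The design choice is the usual one for class hierarchies where only some subtypes carry data: expose a cheap capability probe and make every consumer of the base type check it before requesting the payload.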