[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16187297#comment-16187297 ]

Lefty Leverenz commented on HIVE-16895:
---------------------------------------

Doc update: HIVE-17625 changes the default value of hive.repl.partitions.dump.parallelism to 100, also in release 3.0.0. The wiki has been updated:

* [hive.repl.partitions.dump.parallelism | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.repl.partitions.dump.parallelism]

> Multi-threaded execution of bootstrap dump of partitions
> --------------------------------------------------------
>
>                 Key: HIVE-16895
>                 URL: https://issues.apache.org/jira/browse/HIVE-16895
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: anishek
>            Assignee: anishek
>             Fix For: 3.0.0
>
>         Attachments: HIVE-16895.1.patch, HIVE-16895.2.patch
>
>
> To allow faster execution of the bootstrap dump phase, we dump multiple partitions
> from the same table simultaneously.
> Even though dumping functions is not going to be a blocker, moving to
> similar execution modes for all metastore objects will make the code more
> coherent.
> Bootstrap dump at db level does:
> * bootstrap of all tables
> ** bootstrap of all partitions in a table (scope of current jira)
> * bootstrap of all functions

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
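For readers unfamiliar with the approach described in the issue, the following is a minimal sketch of dumping the partitions of one table on a bounded thread pool. It is not the code from the attached patches. The class PartitionDumpSketch and the methods dumpPartition and dumpAll are hypothetical names made up for illustration, and a plain java.util.Properties object stands in for Hive's configuration. Only the property name and the default of 100 come from this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch only; all names in this class are hypothetical.
final class PartitionDumpSketch {
    // Property name as documented in this thread.
    static final String PARALLELISM_KEY = "hive.repl.partitions.dump.parallelism";
    // Default reported in this thread after HIVE-17625.
    static final int DEFAULT_PARALLELISM = 100;

    // Hypothetical stand-in for the real per-partition export work.
    static String dumpPartition(String partitionName) {
        return "dumped " + partitionName;
    }

    // Dumps every partition of one table, running up to N dumps at a time,
    // and returns the results in the same order as the input list.
    static List<String> dumpAll(List<String> partitions, Properties conf)
            throws InterruptedException, ExecutionException {
        int threads = Integer.parseInt(
                conf.getProperty(PARALLELISM_KEY, String.valueOf(DEFAULT_PARALLELISM)));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String partition : partitions) {
                Callable<String> task = () -> dumpPartition(partition);
                pending.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                // get() blocks until the task finishes and rethrows a failed
                // dump wrapped in an ExecutionException.
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        conf.setProperty(PARALLELISM_KEY, "4");
        List<String> partitions =
                Arrays.asList("ds=2017-08-01", "ds=2017-08-02", "ds=2017-08-03");
        for (String line : dumpAll(partitions, conf)) {
            System.out.println(line);
        }
    }
}
```

A fixed-size pool is used here so that the number of simultaneous partition dumps never exceeds the configured value. Whether the actual patch bounds concurrency the same way is not stated in this thread.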
[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16159718#comment-16159718 ]

Lefty Leverenz commented on HIVE-16895:
---------------------------------------

Thanks for the documentation, [~anishek]. I removed the TODOC3.0 label.

Here's a direct link to the doc:

* [hive.repl.partitions.dump.parallelism | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.repl.partitions.dump.parallelism]
[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16147807#comment-16147807 ]

ASF GitHub Bot commented on HIVE-16895:
---------------------------------------

Github user anishek closed the pull request at:

    https://github.com/apache/hive/pull/217
[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119448#comment-16119448 ]

anishek commented on HIVE-16895:
--------------------------------

[~leftylev] added the required configuration in doc.
[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16117910#comment-16117910 ]

Lefty Leverenz commented on HIVE-16895:
---------------------------------------

Doc note: This adds *hive.repl.partitions.dump.parallelism* to HiveConf.java, so it needs to be documented in the wiki.

* [Configuration Properties -- Replication | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Replication]

Added a TODOC3.0 label.
[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16116346#comment-16116346 ]

anishek commented on HIVE-16895:
--------------------------------

* org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] -- passes on local machine
* org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] -- fails on local machine, but the test does not touch the code changed by the patch
* org.apache.hive.jdbc.TestJdbcWithMiniHS2.testConcurrentStatements -- passes on local machine

Other failures are from older builds.

[~thejas]/[~daijy] can you please commit this patch.
[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114447#comment-16114447 ]

Hive QA commented on HIVE-16895:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880359/HIVE-16895.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11144 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] (batchId=56)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testConcurrentStatements (batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6258/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6258/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6258/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880359 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16114026#comment-16114026 ]

anishek commented on HIVE-16895:
--------------------------------

Added patch with license header! Thanks for review [~sankarh]
[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16113265#comment-16113265 ]

Daniel Dai commented on HIVE-16895:
-----------------------------------

The new file needs a license header.
[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112726#comment-16112726 ]

Sankar Hariappan commented on HIVE-16895:
-----------------------------------------

+1
Patch looks good to me

Request [~daijy] to review/commit to master!
[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112612#comment-16112612 ]

anishek commented on HIVE-16895:
--------------------------------

* org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning] : runs fine on local machine
* org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_combine_equivalent_work] : runs fine on local machine
* org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_use_op_stats] : runs fine on local machine
* org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge_diff_fs] : runs fine on local machine
* org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[quotedid_smb] : runs fine on local machine
* org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[truncate_column_buckets] : runs fine on local machine
* org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge3] : runs fine on local machine

other tests are failing from old builds.

[~thejas]/[~sankarh]/[~daijy] please review
[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110958#comment-16110958 ]

Hive QA commented on HIVE-16895:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880022/HIVE-16895.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11040 tests executed
*Failed tests:*
{noformat}
TestPerfCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=236)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge1] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge3] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[orc_merge_diff_fs] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[quotedid_smb] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_combine_equivalent_work] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_use_op_stats] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[truncate_column_buckets] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=99)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6231/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6231/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6231/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880022 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-16895) Multi-threaded execution of bootstrap dump of partitions
[ https://issues.apache.org/jira/browse/HIVE-16895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16110839#comment-16110839 ]

ASF GitHub Bot commented on HIVE-16895:
---------------------------------------

GitHub user anishek opened a pull request:

    https://github.com/apache/hive/pull/217

    HIVE-16895: Multi-threaded execution of bootstrap dump of partitions

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/anishek/hive HIVE-16895

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/217.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #217

commit 3411d1447137c91408ef9f3138e608b11504a7f9
Author: Anishek Agarwal
Date:   2017-08-02T12:33:02Z

    HIVE-16895: Multi-threaded execution of bootstrap dump of partitions