[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16527138#comment-16527138 ]

Hive QA commented on HIVE-14925:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832969/HIVE-14925.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12234/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12234/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12234/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-06-29 03:47:57.242
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-12234/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-06-29 03:47:57.246
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
1b3ac73..5b2cbb5 master -> origin/master
+ git reset --hard HEAD
HEAD is now at 1b3ac73 HIVE-20010: Fix create view over literals (Zoltan Haindrich, reviewed by Ashutosh Chauhan, Daniel Dai)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
(use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at 5b2cbb5 HIVE-18786 : NPE in Hive windowing functions (Dongwook Kwon via Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-06-29 03:47:59.050
+ rm -rf ../yetus_PreCommit-HIVE-Build-12234
+ mkdir ../yetus_PreCommit-HIVE-Build-12234
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-12234
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12234/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java: does not exist in index
error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java:426
Falling back to three-way merge...
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java' with conflicts.
Going to apply patch with: git apply -p1
error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java:426
Falling back to three-way merge...
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java' with conflicts.
U ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-12234
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832969 - PreCommit-HIVE-Build

> MSCK repair table hang while running with multi threading enabled
> -----------------------------------------------------------------
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 2.2.0
> Reporter: Ratheesh Kamoor
> Assignee: Ratheesh Kamoor
> Priority: Critical
> Fix For: 3.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled (the
> default). I think it is because of a major design flaw in how the thread pool
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method.
> This method has a thread pool which registers a Callable, but the callable
> makes a recursive call to the checkPartitionDirs method again. This code will
> hang when the number of directories is more than
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431848#comment-16431848 ]

Hive QA commented on HIVE-14925:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832969/HIVE-14925.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/10111/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/10111/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-10111/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-04-10 07:30:17.238
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-10111/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-04-10 07:30:17.240
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at dcd9b59 HIVE-19146 : Delete dangling q.out
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at dcd9b59 HIVE-19146 : Delete dangling q.out
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-04-10 07:30:17.774
+ rm -rf ../yetus_PreCommit-HIVE-Build-10111
+ mkdir ../yetus_PreCommit-HIVE-Build-10111
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-10111
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-10111/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: a/ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java: does not exist in index
error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java:426
Falling back to three-way merge...
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java' with conflicts.
Going to apply patch with: git apply -p1
error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java:426
Falling back to three-way merge...
Applied patch to 'ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java' with conflicts.
U ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832969 - PreCommit-HIVE-Build

> MSCK repair table hang while running with multi threading enabled
> -----------------------------------------------------------------
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 2.2.0
> Reporter: Ratheesh Kamoor
> Assignee: Ratheesh Kamoor
> Priority: Critical
> Fix For: 3.1.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled (the
> default). I think it is because of a major design flaw in how the thread pool
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method.
> This method has a thread pool which registers a Callable, but the callable
> makes a recursive call to the checkPartitionDirs method again. This code will
> hang when the number of directories is more than the thread pool size.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15572698#comment-15572698 ]

Pengcheng Xiong commented on HIVE-14925:
----------------------------------------

[~rkamoor], there are several test cases failing; could you take a look at them? And, as [~rajesh.balamohan] suggested, we need to add a test case for the patch. If q tests are hard to add, maybe you can add a JUnit test with some artificial delay (e.g., Thread.sleep) to expose the problem and prove the benefit of your patch. Thanks again for your efforts.

> MSCK repair table hang while running with multi threading enabled
> -----------------------------------------------------------------
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 2.2.0
> Reporter: Ratheesh Kamoor
> Assignee: Ratheesh Kamoor
> Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled (the
> default). I think it is because of a major design flaw in how the thread pool
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method.
> This method has a thread pool which registers a Callable, but the callable
> makes a recursive call to the checkPartitionDirs method again. This code will
> hang when the number of directories is more than the thread pool size.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
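The kind of test suggested in the comment above (an artificial delay that exposes the hang) might look roughly like the following standalone sketch. It is not Hive code: the class `RecursivePoolHangDemo`, the methods `walk` and `completes`, the two-level synthetic directory tree and every numeric parameter are assumptions made for illustration. It only reproduces the pattern from the issue description, where a pool task recurses into the same fixed-size pool and then blocks on the tasks it submitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RecursivePoolHangDemo {

    // Hypothetical stand-in for a recursive directory check: one task per child
    // "directory"; each task recurses into the SAME pool and the caller then
    // blocks until all of the tasks it submitted have finished.
    public static void walk(ExecutorService pool, int depth, int fanout, long delayMs)
            throws Exception {
        if (depth == 0) {
            return;
        }
        List<Future<Void>> futures = new ArrayList<>();
        for (int i = 0; i < fanout; i++) {
            futures.add(pool.submit((Callable<Void>) () -> {
                Thread.sleep(delayMs);                  // artificial latency (assumed value)
                walk(pool, depth - 1, fanout, delayMs); // recursion from inside a pool thread
                return null;
            }));
        }
        for (Future<Void> f : futures) {
            f.get(); // the calling thread (a pool thread when nested) waits here
        }
    }

    // Returns true if a two-level walk finishes within timeoutMs, false if it
    // is still blocked when the timeout expires.
    public static boolean completes(int poolSize, int fanout, long delayMs, long timeoutMs)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        ExecutorService driver = Executors.newSingleThreadExecutor();
        try {
            Future<Void> root = driver.submit((Callable<Void>) () -> {
                walk(pool, 2, fanout, delayMs);
                return null;
            });
            root.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } finally {
            // Interrupt anything still blocked so the JVM can exit.
            driver.shutdownNow();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pool=4, fanout=2 completes: " + completes(4, 2, 10, 2000));
        System.out.println("pool=2, fanout=2 completes: " + completes(2, 2, 10, 2000));
    }
}
```

With a pool larger than the fan-out, a thread stays free for the child tasks and the walk should finish. Once the first level alone occupies every pool thread, the children stay queued behind parents that are waiting for them, which matches the "# of dirs > # of threads" condition reported in this thread.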
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570693#comment-15570693 ]

Hive QA commented on HIVE-14925:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12832969/HIVE-14925.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10560 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[msck_repair_1]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[msck_repair_3]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[msck_repair_batchsize]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[repair]
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk]
org.apache.hadoop.hive.ql.metadata.TestHiveMetaStoreChecker.testPartitionsCheck
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1517/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1517/console
Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1517/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12832969 - PreCommit-HIVE-Build

> MSCK repair table hang while running with multi threading enabled
> -----------------------------------------------------------------
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 2.2.0
> Reporter: Ratheesh Kamoor
> Assignee: Ratheesh Kamoor
> Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled (the
> default). I think it is because of a major design flaw in how the thread pool
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method.
> This method has a thread pool which registers a Callable, but the callable
> makes a recursive call to the checkPartitionDirs method again. This code will
> hang when the number of directories is more than the thread pool size.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570618#comment-15570618 ]

Rajesh Balamohan commented on HIVE-14925:
-----------------------------------------

This was tried with partitions in S3. One of the main reasons for making it multi-threaded is to improve the runtime for systems like S3 and Azure.

> MSCK repair table hang while running with multi threading enabled
> -----------------------------------------------------------------
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 2.2.0
> Reporter: Ratheesh Kamoor
> Assignee: Ratheesh Kamoor
> Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled (the
> default). I think it is because of a major design flaw in how the thread pool
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method.
> This method has a thread pool which registers a Callable, but the callable
> makes a recursive call to the checkPartitionDirs method again. This code will
> hang when the number of directories is more than the thread pool size.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570328#comment-15570328 ]

Ratheesh Kamoor commented on HIVE-14925:
----------------------------------------

Are you trying with partitions in HDFS? You may not run into issues if the threads are fast enough to finish execution before the recursive call happens. File systems like S3 will clearly show the error because of network latency.

> MSCK repair table hang while running with multi threading enabled
> -----------------------------------------------------------------
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 2.2.0
> Reporter: Ratheesh Kamoor
> Assignee: Ratheesh Kamoor
> Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled (the
> default). I think it is because of a major design flaw in how the thread pool
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method.
> This method has a thread pool which registers a Callable, but the callable
> makes a recursive call to the checkPartitionDirs method again. This code will
> hang when the number of directories is more than the thread pool size.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570317#comment-15570317 ]

Rajesh Balamohan commented on HIVE-14925:
-----------------------------------------

It would be helpful to have a repro for this. We have tried with 10K partitions and with 10 and 15 threads in MSCK, which worked fine without issues.

> MSCK repair table hang while running with multi threading enabled
> -----------------------------------------------------------------
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 2.2.0
> Reporter: Ratheesh Kamoor
> Assignee: Ratheesh Kamoor
> Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled (the
> default). I think it is because of a major design flaw in how the thread pool
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method.
> This method has a thread pool which registers a Callable, but the callable
> makes a recursive call to the checkPartitionDirs method again. This code will
> hang when the number of directories is more than the thread pool size.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570225#comment-15570225 ]

Ratheesh Kamoor commented on HIVE-14925:
----------------------------------------

Done. This is the first time I am using the RB tool; please let me know if I need to provide more info. Thanks.

> MSCK repair table hang while running with multi threading enabled
> -----------------------------------------------------------------
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 2.2.0
> Reporter: Ratheesh Kamoor
> Assignee: Rajesh Balamohan
> Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled (the
> default). I think it is because of a major design flaw in how the thread pool
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method.
> This method has a thread pool which registers a Callable, but the callable
> makes a recursive call to the checkPartitionDirs method again. This code will
> hang when the number of directories is more than the thread pool size.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15570022#comment-15570022 ]

Pengcheng Xiong commented on HIVE-14925:
----------------------------------------

That is fast... I was planning to do this today. Could you create an RB for it? Thanks.

> MSCK repair table hang while running with multi threading enabled
> -----------------------------------------------------------------
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 2.2.0
> Reporter: Ratheesh Kamoor
> Assignee: Pengcheng Xiong
> Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled (the
> default). I think it is because of a major design flaw in how the thread pool
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method.
> This method has a thread pool which registers a Callable, but the callable
> makes a recursive call to the checkPartitionDirs method again. This code will
> hang when the number of directories is more than the thread pool size.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569746#comment-15569746 ]

Ratheesh Kamoor commented on HIVE-14925:
----------------------------------------

[~pxiong] I moved the logic in the inline callable to an external class so that the code can be reused in both the multi-threaded and the single-threaded scenario. It will also fix the thread lock issue. Could you please review? Tested with the very large partition counts (5K+) we have, and it worked fine.

> MSCK repair table hang while running with multi threading enabled
> -----------------------------------------------------------------
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 2.2.0
> Reporter: Ratheesh Kamoor
> Assignee: Pengcheng Xiong
> Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14925.patch
>
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled (the
> default). I think it is because of a major design flaw in how the thread pool
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method.
> This method has a thread pool which registers a Callable, but the callable
> makes a recursive call to the checkPartitionDirs method again. This code will
> hang when the number of directories is more than the thread pool size.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
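The contents of HIVE-14925.patch are not shown in this thread, so the following is only a guess at the general shape of such a change, not the patch itself. It sketches one way a standalone task class can serve both the threaded and the single-threaded path while keeping pool threads from ever waiting on the pool: the walk proceeds level by level and only the calling thread blocks. The names `LevelWalker`, `ListDirTask`, `walk` and the synthetic `lister` function are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class LevelWalker {

    // Hypothetical standalone task: lists one directory and returns its children.
    // It never submits work to, or waits on, the pool it runs in.
    static final class ListDirTask implements Callable<List<String>> {
        private final String dir;
        private final Function<String, List<String>> lister;

        ListDirTask(String dir, Function<String, List<String>> lister) {
            this.dir = dir;
            this.lister = lister;
        }

        @Override
        public List<String> call() {
            return lister.apply(dir);
        }
    }

    // Walks the tree one level at a time and returns every directory below root.
    // Passing pool == null selects the single-threaded path with the same task class.
    public static List<String> walk(String root,
                                    Function<String, List<String>> lister,
                                    ExecutorService pool) throws Exception {
        List<String> all = new ArrayList<>();
        List<String> level = new ArrayList<>();
        level.add(root);
        while (!level.isEmpty()) {
            List<ListDirTask> tasks = new ArrayList<>();
            for (String dir : level) {
                tasks.add(new ListDirTask(dir, lister));
            }
            List<String> next = new ArrayList<>();
            if (pool == null) {
                for (ListDirTask task : tasks) {
                    next.addAll(task.call());
                }
            } else {
                // invokeAll blocks only the calling thread until the level is done.
                for (Future<List<String>> f : pool.invokeAll(tasks)) {
                    next.addAll(f.get());
                }
            }
            all.addAll(next);
            level = next;
        }
        return all;
    }

    public static void main(String[] args) throws Exception {
        // Synthetic 3 x 3 layout: more directories per level than pool threads.
        Function<String, List<String>> lister = dir -> {
            List<String> children = new ArrayList<>();
            long depth = dir.chars().filter(c -> c == '/').count();
            if (depth < 2) {
                for (int i = 0; i < 3; i++) {
                    children.add(dir + "/p" + i);
                }
            }
            return children;
        };
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println("threaded:   " + walk("tbl", lister, pool).size() + " dirs");
            System.out.println("sequential: " + walk("tbl", lister, null).size() + " dirs");
        } finally {
            pool.shutdown();
        }
    }
}
```

Because no task ever holds a pool thread while waiting for another task, the number of directories can exceed the pool size without starving the queue. The trade-off in this sketch is a synchronization point at the end of every level.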
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566691#comment-15566691 ]

Ashutosh Chauhan commented on HIVE-14925:
-----------------------------------------

[~pxiong] The repro is an msck statement with # of dirs > # of threads.

> MSCK repair table hang while running with multi threading enabled
> -----------------------------------------------------------------
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 2.2.0
> Reporter: Ratheesh Kamoor
> Assignee: Pengcheng Xiong
> Priority: Critical
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled (the
> default). I think it is because of a major design flaw in how the thread pool
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method.
> This method has a thread pool which registers a Callable, but the callable
> makes a recursive call to the checkPartitionDirs method again. This code will
> hang when the number of directories is more than the thread pool size.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-14925) MSCK repair table hang while running with multi threading enabled
[ https://issues.apache.org/jira/browse/HIVE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564334#comment-15564334 ]

Pengcheng Xiong commented on HIVE-14925:
----------------------------------------

When we implemented this, we tried to reuse the thread pool. Is it possible for you to provide us with a repro case? Thanks.

> MSCK repair table hang while running with multi threading enabled
> -----------------------------------------------------------------
>
> Key: HIVE-14925
> URL: https://issues.apache.org/jira/browse/HIVE-14925
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 2.2.0
> Reporter: Ratheesh Kamoor
> Priority: Critical
>
> MSCK REPAIR TABLE hangs while running with multi-threading enabled (the
> default). I think it is because of a major design flaw in how the thread pool
> is implemented in the HiveMetaStoreChecker class / checkPartitionDirs method.
> This method has a thread pool which registers a Callable, but the callable
> makes a recursive call to the checkPartitionDirs method again. This code will
> hang when the number of directories is more than the thread pool size.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)