[
https://issues.apache.org/jira/browse/HIVE-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882053#comment-15882053
]
Hive QA commented on HIVE-16014:
--------------------------------
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854275/HIVE-16014.02.patch
{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10258 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223)
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testSaslWithHiveMetaStore (batchId=220)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=211)
{noformat}
Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3743/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3743/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3743/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12854275 - PreCommit-HIVE-Build
> HiveMetastoreChecker should use hive.metastore.fshandler.threads instead of
> hive.mv.files.thread for pool size
> --------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-16014
> URL: https://issues.apache.org/jira/browse/HIVE-16014
> Project: Hive
> Issue Type: Improvement
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Attachments: HIVE-16014.01.patch, HIVE-16014.02.patch
>
>
> HiveMetastoreChecker uses the hive.mv.files.thread configuration value to
> determine the pool size, as shown below:
> {noformat}
> private void checkPartitionDirs(Path basePath, Set<Path> allDirs, int maxDepth)
>     throws IOException, HiveException {
>   ConcurrentLinkedQueue<Path> basePaths = new ConcurrentLinkedQueue<>();
>   basePaths.add(basePath);
>   Set<Path> dirSet = Collections.newSetFromMap(new ConcurrentHashMap<Path, Boolean>());
>   // Here we just reuse the THREAD_COUNT configuration for
>   // HIVE_MOVE_FILES_THREAD_COUNT
>   int poolSize = conf.getInt(ConfVars.HIVE_MOVE_FILES_THREAD_COUNT.varname, 15);
>   // Check if too low config is provided for move files. 2x CPU is
>   // reasonable max count.
>   poolSize = poolSize == 0 ? poolSize
>       : Math.max(poolSize, Runtime.getRuntime().availableProcessors() * 2);
> {noformat}
> msck is commonly used to add partitions that exist on the filesystem but are
> missing from the table. In that case, different pool sizes for HMSHandler and
> HiveMetastoreChecker can hurt performance. E.g., if
> {{hive.metastore.fshandler.threads}} is set to a low value like 15 and
> {{hive.mv.files.thread}} is much higher, like 100, or vice versa, the smaller
> pool becomes the bottleneck. It would be good to use
> {{hive.metastore.fshandler.threads}} to size the pool for
> HiveMetastoreChecker, since the number of missing partitions and the number
> of partitions to be added will most likely be the same, so the query performs
> best when both pools are the same size. Because the two configs can be tuned
> individually, they are likely to differ. But since there is a strong
> correlation between the amount of work done by HiveMetastoreChecker and by
> the HiveMetastore.add_partitions call, it might be a good idea to use
> {{hive.metastore.fshandler.threads}} for the pool size instead of
> {{hive.mv.files.thread}}.
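
For illustration only, the proposal in the description could look roughly like the sketch below. It is not the attached patch: the class `CheckerPoolSizing`, the method `resolvePoolSize`, the use of `java.util.Properties` in place of HiveConf, the fallback default of 15, and the clamping of negative values to 0 are all assumptions made to keep the example self-contained.

```java
import java.util.Properties;

// Hypothetical helper, not part of Hive: shows the checker's pool size being
// read from hive.metastore.fshandler.threads instead of hive.mv.files.thread.
final class CheckerPoolSizing {
    // Property name taken from the issue description.
    static final String FS_HANDLER_THREADS = "hive.metastore.fshandler.threads";

    // Assumed fallback when the property is unset; the real default may differ.
    static final int ASSUMED_DEFAULT = 15;

    private CheckerPoolSizing() {}

    // Returns the pool size to use for the checker. Properties stands in for
    // HiveConf here. Negative values are clamped to 0 (an assumption); what a
    // size of 0 means is left to the caller.
    static int resolvePoolSize(Properties conf) {
        String raw = conf.getProperty(FS_HANDLER_THREADS);
        int size = (raw == null) ? ASSUMED_DEFAULT : Integer.parseInt(raw.trim());
        return Math.max(size, 0);
    }
}
```

With this shape, the checker's pool follows the same setting as the one named for HMSHandler in the description, so tuning one value sizes both.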
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)