[ https://issues.apache.org/jira/browse/HIVE-16014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882053#comment-15882053 ]
Hive QA commented on HIVE-16014:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12854275/HIVE-16014.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10258 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223)
org.apache.hadoop.hive.thrift.TestHadoopAuthBridge23.testSaslWithHiveMetaStore (batchId=220)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgressParallel (batchId=211)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3743/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3743/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3743/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12854275 - PreCommit-HIVE-Build

> HiveMetastoreChecker should use hive.metastore.fshandler.threads instead of
> hive.mv.files.thread for pool size
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-16014
>                 URL: https://issues.apache.org/jira/browse/HIVE-16014
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>         Attachments: HIVE-16014.01.patch, HIVE-16014.02.patch
>
>
> HiveMetastoreChecker uses the hive.mv.files.thread configuration value to
> determine the pool size, as shown below:
> {noformat}
>   private void checkPartitionDirs(Path basePath, Set<Path> allDirs, int maxDepth) throws IOException, HiveException {
>     ConcurrentLinkedQueue<Path> basePaths = new ConcurrentLinkedQueue<>();
>     basePaths.add(basePath);
>     Set<Path> dirSet = Collections.newSetFromMap(new ConcurrentHashMap<Path, Boolean>());
>     // Here we just reuse the THREAD_COUNT configuration for
>     // HIVE_MOVE_FILES_THREAD_COUNT
>     int poolSize = conf.getInt(ConfVars.HIVE_MOVE_FILES_THREAD_COUNT.varname, 15);
>     // Check if too low config is provided for move files. 2x CPU is reasonable max count.
>     poolSize = poolSize == 0 ? poolSize : Math.max(poolSize,
>         Runtime.getRuntime().availableProcessors() * 2);
> {noformat}
> msck is commonly used to add a table's missing partitions from the
> filesystem. In that case, different pool sizes for HMSHandler and
> HiveMetastoreChecker can hurt performance. E.g., if
> {{hive.metastore.fshandler.threads}} is set to a lower value like 15 and
> {{hive.mv.files.thread}} is much higher, like 100, or vice versa, the smaller
> pool becomes the bottleneck. It would be good to use
> {{hive.metastore.fshandler.threads}} to size the pool for
> HiveMetastoreChecker, since the number of missing partitions and the number of
> partitions to be added will most likely be the same. In such a case the
> query performs best when both pool sizes are the same.
> Since the two configs can be tuned individually, it is quite likely that
> they will differ. But since there is a strong correlation between the
> amount of work done by HiveMetastoreChecker and by the
> HiveMetastore.add_partitions call, it might be a good idea to use
> {{hive.metastore.fshandler.threads}} for the pool size instead of
> {{hive.mv.files.thread}}.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
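
To make the bottleneck argument in the issue description concrete, here is a minimal sketch in plain Java. It does not use Hive's HiveConf or ConfVars classes; the Map-based config, the threads and bottleneck helpers, and the class name PoolSizeSketch are all hypothetical, and the fallback of 15 is simply the value quoted in the snippet above. It also leaves out the 2x-CPU adjustment from the quoted code, and it models the slower stage as simply the smaller of the two pools, which is a simplification.

```java
import java.util.HashMap;
import java.util.Map;

public class PoolSizeSketch {
    // Config key names as they appear in the issue text.
    static final String FS_HANDLER_THREADS = "hive.metastore.fshandler.threads";
    static final String MOVE_FILES_THREADS = "hive.mv.files.thread";

    // Hypothetical helper: look up a thread-count setting with a fallback.
    static int threads(Map<String, Integer> conf, String key, int fallback) {
        return conf.getOrDefault(key, fallback);
    }

    // Simplified model: when two pooled stages handle the same partitions,
    // the smaller pool limits how much work proceeds in parallel.
    static int bottleneck(int checkerThreads, int handlerThreads) {
        return Math.min(checkerThreads, handlerThreads);
    }

    public static void main(String[] args) {
        // Example values taken from the issue description (15 vs. 100).
        Map<String, Integer> conf = new HashMap<>();
        conf.put(FS_HANDLER_THREADS, 15);
        conf.put(MOVE_FILES_THREADS, 100);

        int handler = threads(conf, FS_HANDLER_THREADS, 15);

        // Behaviour described in the issue: checker pool sized from hive.mv.files.thread.
        int checkerBefore = threads(conf, MOVE_FILES_THREADS, 15);
        // Proposed behaviour: checker pool sized from hive.metastore.fshandler.threads.
        int checkerAfter = threads(conf, FS_HANDLER_THREADS, 15);

        System.out.println("before: checker=" + checkerBefore + " handler=" + handler
                + " bottleneck=" + bottleneck(checkerBefore, handler));
        System.out.println("after:  checker=" + checkerAfter + " handler=" + handler
                + " bottleneck=" + bottleneck(checkerAfter, handler));
    }
}
```

With the proposed lookup, both stages read the same key, so tuning one setting keeps the two pools matched instead of leaving the extra checker threads waiting on the smaller handler pool.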