LantaoJin opened a new pull request #28938: URL: https://github.com/apache/spark/pull/28938
### What changes were proposed in this pull request?

Use a `ReadWriteLock` per database instead of a single synchronized block, to improve performance.

### Why are the changes needed?

In `HiveExternalCatalog`, all metastore operations are synchronized on the same object lock. On a heavily loaded Spark Thrift Server or Spark driver, users' queries can get stuck behind any single long-running operation. For example, if a user accesses a table with a massive number of partitions, `loadDynamicPartitions()` holds the object lock for a long time, and all other queries block waiting for it. In the thread dump below, `Thread-61500` was holding the object lock for extended periods while accessing such a table, which left many queries stuck.

```
61500 HiveServer2-Background-Pool: Thread-61500
java.lang.Object.wait(Native Method)
org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1542)
org.apache.hadoop.ipc.Client.call(Client.java:1498)
org.apache.hadoop.ipc.Client.call(Client.java:1398)
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
com.sun.proxy.$Proxy10.getEZForPath(Unknown Source)
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getEZForPath(ClientNamenodeProtocolTranslatorPB.java:1448)
sun.reflect.GeneratedMethodAccessor292.invoke(Unknown Source)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
com.sun.proxy.$Proxy11.getEZForPath(Unknown Source)
org.apache.hadoop.hdfs.DFSClient.getEZForPath(DFSClient.java:3408)
org.apache.hadoop.hdfs.DistributedFileSystem.getEZForPath(DistributedFileSystem.java:2259)
org.apache.hadoop.hdfs.client.HdfsAdmin.getEncryptionZoneForPath(HdfsAdmin.java:339)
org.apache.hadoop.hive.shims.Hadoop23Shims$HdfsEncryptionShim.isPathEncrypted(Hadoop23Shims.java:1221)
org.apache.hadoop.hive.ql.metadata.Hive.needToCopy(Hive.java:2687)
org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2621)
org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:2748)
org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1403)
org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:1593)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.apache.spark.sql.hive.client.Shim_v1_2.loadDynamicPartitions(HiveShim.scala:1001)
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadDynamicPartitions$1.apply$mcV$sp(HiveClientImpl.scala:961)
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadDynamicPartitions$1.apply(HiveClientImpl.scala:959)
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$loadDynamicPartitions$1.apply(HiveClientImpl.scala:959)
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1$$anonfun$apply$2.apply(HiveClientImpl.scala:326)
org.apache.spark.sql.hive.client.HiveClientImpl.org$apache$spark$sql$hive$client$HiveClientImpl$$retryLocked(HiveClientImpl.scala:255)
org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:309)
org.apache.spark.sql.hive.client.HiveClientImpl.updateCallMetrics(HiveClientImpl.scala:339)
org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:308)
org.apache.spark.sql.hive.client.HiveClientImpl.loadDynamicPartitions(HiveClientImpl.scala:959)
org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadDynamicPartitions$1.apply$mcV$sp(HiveExternalCatalog.scala:993)
org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadDynamicPartitions$1.apply(HiveExternalCatalog.scala:981)
org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$loadDynamicPartitions$1.apply(HiveExternalCatalog.scala:981)
org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:127)
org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:152)
org.apache.spark.sql.hive.HiveExternalCatalog.loadDynamicPartitions(HiveExternalCatalog.scala:981)
org.apache.spark.sql.hive.execution.InsertIntoHiveTable.processInsert(InsertIntoHiveTable.scala:262)
org.apache.spark.sql.hive.execution.InsertIntoHiveTable.run(InsertIntoHiveTable.scala:111)
org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:111) => holding Monitor(org.apache.spark.sql.execution.command.DataWritingCommandExec@708550291})
org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:109)
org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:126)
org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.executeCollect(AdaptiveSparkPlanExec.scala:137) => holding Monitor(java.lang.Object@1464134318})
org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:197)
org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:197)
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.
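The per-database locking idea can be sketched roughly as follows. This is an illustrative Java outline, not the PR's actual code: the `PerDatabaseLockRegistry` class and its method names are hypothetical, and it only shows the core pattern of keying a `ReentrantReadWriteLock` on the database name so a long write against one database no longer blocks operations on other databases.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical sketch of per-database read/write locking.
// Instead of one global synchronized block guarding every metastore call,
// each database gets its own ReentrantReadWriteLock, so a long write on one
// database (e.g. loadDynamicPartitions) does not block reads on another.
public class PerDatabaseLockRegistry {
    private final Map<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();

    // Lazily create one lock per database name.
    private ReentrantReadWriteLock lockFor(String db) {
        return locks.computeIfAbsent(db, k -> new ReentrantReadWriteLock());
    }

    // Multiple readers of the same database may proceed concurrently.
    public <T> T withReadLock(String db, Supplier<T> body) {
        Lock l = lockFor(db).readLock();
        l.lock();
        try {
            return body.get();
        } finally {
            l.unlock();
        }
    }

    // Writers get exclusive access, but only within their own database.
    public <T> T withWriteLock(String db, Supplier<T> body) {
        Lock l = lockFor(db).writeLock();
        l.lock();
        try {
            return body.get();
        } finally {
            l.unlock();
        }
    }
}
```

Under this scheme a slow `loadDynamicPartitions` on `db_a` would take `db_a`'s write lock, while queries touching `db_b` acquire a different lock and proceed unblocked, which is the contention the thread dump above illustrates.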
