GitHub user wzhfy opened a pull request:
https://github.com/apache/spark/pull/19605
[SPARK-22394] [SQL] Remove redundant synchronization for metastore access
## What changes were proposed in this pull request?
Before Spark 2.x, metastore access was synchronized at
[line 229 in
ClientWrapper](https://github.com/apache/spark/blob/branch-1.6/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/ClientWrapper.scala#L229)
(now at [line 203 in
HiveClientImpl](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L203)).
In Spark 2.x, `HiveExternalCatalog` was introduced by
[SPARK-13080](https://github.com/apache/spark/pull/11293), which added an extra
level of synchronization at
[line 95](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L95).
That is, there are now two levels of synchronization: one on
`HiveExternalCatalog` and the other on the `IsolatedClientLoader` in
`HiveClientImpl`. Since both `HiveExternalCatalog` and
`IsolatedClientLoader` are shared among all Spark sessions, the extra level of
synchronization in `HiveExternalCatalog` is redundant and can be removed.
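To illustrate the argument, here is a minimal sketch of the two lock levels. The classes `ClientLoader`, `Client`, and `Catalog` are hypothetical stand-ins for `IsolatedClientLoader`, `HiveClientImpl`, and `HiveExternalCatalog`; they are not the real Spark code, and the method names are made up for this example. It assumes, as described above, that the client serializes every call on a loader object shared by all sessions.

```scala
// Hypothetical stand-in for the shared IsolatedClientLoader (used only as a lock).
class ClientLoader

// Hypothetical stand-in for HiveClientImpl.
class Client(loader: ClientLoader) {
  private var calls = 0

  // Inner lock: every metastore call serializes on the shared loader.
  def getTable(name: String): String = loader.synchronized {
    calls += 1
    s"table:$name"
  }

  def callCount: Int = loader.synchronized { calls }
}

// Hypothetical stand-in for HiveExternalCatalog.
class Catalog(client: Client) {
  // Outer lock on the catalog, wrapping a call that already locks the loader.
  def getTableWithOuterLock(name: String): String = synchronized {
    client.getTable(name)
  }

  // Without the outer lock: calls are still serialized by the client.
  def getTable(name: String): String = client.getTable(name)
}
```

In this sketch both methods of `Catalog` give the same mutual exclusion, because every path reaches `loader.synchronized`; the outer `synchronized` only adds a second lock acquisition around an already serialized call.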
## How was this patch tested?
Manual test and existing tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wzhfy/spark redundant_sync
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19605.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19605
----
commit 072b27d083f2c2ed8d8bdd20caa5b0fe0ba267f6
Author: Zhenhua Wang <[email protected]>
Date: 2017-10-30T01:47:12Z
remove redundant sync
----