[GitHub] [spark] cloud-fan commented on pull request #28938: [SPARK-32118][SQL] Use fine-grained read write lock for each database in HiveExternalCatalog

2020-07-07 Thread GitBox

cloud-fan commented on pull request #28938: URL: https://github.com/apache/spark/pull/28938#issuecomment-654881394 The problem is that we don't know why the Hive client is not thread-safe. You need to investigate and explain the internal details (i.e., what the internal states of the Hive client are).

2020-07-07 Thread GitBox

cloud-fan commented on pull request #28938: URL: https://github.com/apache/spark/pull/28938#issuecomment-654809755 Your comments seem to rely on Hive internal details; it would be better to cite official documentation to support them. BTW, I'm not sure the Hive client thread-unsafety comes from

2020-07-07 Thread GitBox

cloud-fan commented on pull request #28938: URL: https://github.com/apache/spark/pull/28938#issuecomment-654706001 Can you reference some Hive documentation? IIRC we use a global lock because the Hive client is not thread-safe. I'm not convinced that the Hive client is thread-safe when operating on
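
For readers following the thread, the idea named in the PR title (one read-write lock per database instead of a single global lock) can be sketched roughly as below. This is only an illustration under stated assumptions: `PerDatabaseLocks`, `withReadLock`, and `withWriteLock` are hypothetical names, not code from Spark, Hive, or the PR. The sketch also does not answer the concern raised in the comments above: if the underlying Hive client keeps shared internal state, separate locks per database would not by themselves make concurrent calls safe.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not part of Spark or Hive.
final class PerDatabaseLocks {
    // One read-write lock per database name, created lazily on first use.
    private final ConcurrentHashMap<String, ReentrantReadWriteLock> locks =
        new ConcurrentHashMap<>();

    private ReentrantReadWriteLock lockFor(String db) {
        // computeIfAbsent is atomic, so every caller sees the same lock for a given name.
        return locks.computeIfAbsent(db, name -> new ReentrantReadWriteLock());
    }

    // Readers of the same database may run concurrently with each other.
    <T> T withReadLock(String db, Supplier<T> body) {
        ReentrantReadWriteLock lock = lockFor(db);
        lock.readLock().lock();
        try {
            return body.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    // A writer excludes all other readers and writers of the same database only.
    <T> T withWriteLock(String db, Supplier<T> body) {
        ReentrantReadWriteLock lock = lockFor(db);
        lock.writeLock().lock();
        try {
            return body.get();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With a single global lock, every catalog call is serialized. In the sketch, operations on different databases never wait on each other, which is safe only if the wrapped client has no state shared across databases. That is the assumption the reviewer is asking to have documented.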