[GitHub] [spark] cxzl25 commented on a change in pull request #34431: [SPARK-35437][SQL] Use expressions to filter Hive partitions at client side

GitBox Sun, 07 Nov 2021 09:52:52 -0800


cxzl25 commented on a change in pull request #34431:
URL: https://github.com/apache/spark/pull/34431#discussion_r744293312




##########
File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/HivePartitionFilteringSuite.scala
##########
@@ -62,6 +63,9 @@ class HivePartitionFilteringSuite(version: String)
     properties = Map.empty
   )
 
+  // Avoid repeatedly constructing multiple hive instances that do not use direct sql
+  private var disableDirectSqlClient: HiveClient = _

Review comment:
   First, `METASPACE_SIZE` is set to 1g.
   Then `HiveClientVersions` needs to test "0.14", "1.0", "1.1", "1.2", "2.0", "2.1", "2.2", "2.3", "3.0", "3.1", 11 Hive versions.
   Each version is initialized 3 times with `IsolatedClientLoader`, and the Hive-related classes cannot be unloaded after the unit test ends.
   
   If we create the Hive client several times (`val client = init(false)`), it causes `OutOfMemoryError: Metaspace`.
   For example, see this test commit and its run:
   
   https://github.com/cxzl25/spark/commit/888ac1ef068ed88f4802079d8120cac11b46999c
   
   https://github.com/cxzl25/spark/runs/4023095713
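
   To illustrate the idea behind the cached field, here is a minimal sketch of creating the expensive client once and reusing it. It is not the actual change in the PR: `FakeClient`, `ClientFactory`, `SuiteSketch` and `clientWithoutDirectSql` are hypothetical stand-ins, `init` here only mimics the `init(false)` call mentioned above, and an `Option` is used in place of the `var ... = _` field from the diff.

```scala
// Hypothetical stand-in for an expensive, isolated Hive client.
// This is NOT Spark's HiveClient; it only models "costly to construct".
final class FakeClient(val directSql: Boolean)

// Hypothetical factory that counts how many clients were constructed.
object ClientFactory {
  var created: Int = 0

  def init(directSql: Boolean): FakeClient = {
    created += 1
    new FakeClient(directSql)
  }
}

// Sketch of a suite that builds the non-direct-SQL client at most once.
class SuiteSketch {
  // Empty until the first test asks for the client.
  private var cached: Option[FakeClient] = None

  def clientWithoutDirectSql: FakeClient = cached match {
    case Some(client) => client
    case None =>
      val client = ClientFactory.init(false)
      cached = Some(client)
      client
  }
}
```

   With this shape, every test in the suite that needs a client without direct SQL shares one instance, so the number of class loaders created per Hive version does not grow with the number of tests.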




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


