xushiyan commented on code in PR #7180:
URL: https://github.com/apache/hudi/pull/7180#discussion_r1020298273
##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/TestHiveIncrementalPuller.java:
##########
@@ -51,11 +53,16 @@
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
-public class TestHiveIncrementalPuller {
+public class TestHiveIncrementalPuller extends UtilitiesTestBase {
private HiveIncrementalPuller.Config config;
private String targetBasePath = null;
+ @BeforeAll
+ public static void setupOnce() throws Exception {
+ initTestServices();
+ }
Review Comment:
but this test did not need `UtilitiesTestBase` before — what test resources are
we adding here that require it?
##########
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestUtils.java:
##########
@@ -92,7 +92,9 @@ public class HoodieClientTestUtils {
*/
public static SparkConf getSparkConfForTest(String appName) {
SparkConf sparkConf = new SparkConf().setAppName(appName)
- .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer").setMaster("local[8]");
+ .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer").setMaster("local[8]")
+ .set("spark.sql.shuffle.partitions", "4")
+ .set("spark.default.parallelism", "4");
Review Comment:
we can reduce the master to `local[4]` if parallelism is set to 4
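The point of the suggestion is that the local master's core count and the configured parallelism should agree, so cores are neither idle nor oversubscribed. A minimal sketch of one way to keep them in sync (the class and method names here are illustrative, not from the PR):

```java
public class SparkTestConfSketch {
    // Single knob for test parallelism; the master core count follows it.
    static final int TEST_PARALLELISM = 4;

    // Derive the local master string from the parallelism setting, so
    // "local[4]" and "spark.default.parallelism=4" can never drift apart.
    static String localMaster(int parallelism) {
        return "local[" + parallelism + "]";
    }

    public static void main(String[] args) {
        System.out.println(localMaster(TEST_PARALLELISM)); // prints local[4]
    }
}
```

With such a helper, `setMaster(localMaster(TEST_PARALLELISM))` and `.set("spark.default.parallelism", String.valueOf(TEST_PARALLELISM))` would both change from the same constant.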
##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/testutils/UtilitiesTestBase.java:
##########
@@ -155,6 +155,11 @@ public static void initTestServices(boolean needsHdfs, boolean needsHive, boolea
zookeeperTestService = new ZookeeperTestService(hadoopConf);
zookeeperTestService.start();
}
+
+ jsc = UtilHelpers.buildSparkContext(UtilitiesTestBase.class.getName() + "-hoodie", "local[2]");
Review Comment:
should we define a constant for the Spark master (`local[4]`) used across test cases?
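A shared constant would keep every test on the same master string instead of scattering `local[2]`/`local[4]` literals. A sketch of what the reviewer may have in mind (the class and constant names are hypothetical, not from the Hudi codebase):

```java
public class HoodieTestConstants {
    // One shared Spark master for test cases, as the review suggests.
    // The value is illustrative; the PR currently uses both local[2]
    // and local[4] in different places.
    public static final String SPARK_TEST_MASTER = "local[4]";

    public static void main(String[] args) {
        System.out.println(SPARK_TEST_MASTER);
    }
}
```

Call sites such as `UtilHelpers.buildSparkContext(appName, HoodieTestConstants.SPARK_TEST_MASTER)` would then all change together when the test parallelism changes.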
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/UtilHelpers.java:
##########
@@ -280,6 +280,8 @@ private static SparkConf buildSparkConf(String appName, String defaultMaster, Ma
sparkConf.set("spark.hadoop.mapred.output.compression.codec", "org.apache.hadoop.io.compress.GzipCodec");
sparkConf.set("spark.hadoop.mapred.output.compression.type", "BLOCK");
sparkConf.set("spark.driver.allowMultipleContexts", "true");
+ sparkConf.set("spark.sql.shuffle.partitions", "4");
+ sparkConf.set("spark.default.parallelism", "4");
Review Comment:
but this is not a test class, is it? Hard-coding these values here would
affect production runs, not just tests.
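If test-friendly values are wanted in a shared (non-test) helper, one option is to apply them only as overridable defaults rather than hard-coded settings; Spark's `SparkConf.setIfMissing` provides exactly that. The sketch below illustrates the pattern with a plain `Map` so it runs without a Spark dependency:

```java
import java.util.HashMap;
import java.util.Map;

public class DefaultConfSketch {
    // Mirrors SparkConf.setIfMissing: the value is applied only when the
    // key is absent, so user/production settings are never clobbered.
    static void setIfMissing(Map<String, String> conf, String key, String value) {
        conf.putIfAbsent(key, value);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("spark.default.parallelism", "200"); // production setting
        setIfMissing(conf, "spark.default.parallelism", "4");
        setIfMissing(conf, "spark.sql.shuffle.partitions", "4");
        System.out.println(conf.get("spark.default.parallelism"));   // prints 200
        System.out.println(conf.get("spark.sql.shuffle.partitions")); // prints 4
    }
}
```

In `buildSparkConf` itself, using `sparkConf.setIfMissing("spark.default.parallelism", "4")` instead of `set` would be one way to keep the test-sized defaults without forcing them on production jobs.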
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]