jonvex commented on code in PR #7071:
URL: https://github.com/apache/hudi/pull/7071#discussion_r1012031298


##########
packaging/bundle-validation/validate.sh:
##########
@@ -111,6 +114,73 @@ test_utilities_bundle () {
     echo "::warning::validate.sh done validating deltastreamer in spark shell"
 }
 
+##
+# Function to test the utilities bundle and utilities slim bundle + spark bundle.
+# It runs deltastreamer and then verifies that deltastreamer worked correctly.
+#
+# 1st arg: main jar to run with spark-submit, usually it's the utilities(-slim) bundle
+# 2nd arg and beyond: any additional jars to pass to --jars option
+#
+# env vars (defined in container):
+#   SPARK_HOME: path to the spark directory
+##
+test_utilities_bundle () {
+    OUTPUT_DIR=/tmp/hudi-utilities-test/
+    rm -r $OUTPUT_DIR
+    EXPECTED_SIZE=580
+    test_utilities_bundle_helper $1 "${@:2}"
+    exit $?
+}
+
+
+##
+# Function to test upgrading the utilities bundle and
+# utilities slim bundle + spark bundle.
+# It runs deltastreamer and then verifies that deltastreamer worked correctly on
+# half the data. Then, using an upgraded hudi, runs deltastreamer and verifies
+# that deltastreamer worked correctly on the rest of the data
+#
+#
+# env vars (defined in container):
+#   SPARK_HOME: path to the spark directory
+#   FIRST_MAIN_ARG: what you would put as the first arg to test_utilities_bundle
+#       and that is used for running deltastreamer on the first batch of data
+#   FIRST_ADDITIONAL_ARG: what you would put as extra args to test_utilities_bundle
+#       and that is used for running deltastreamer on the first batch of data
+#   SECOND_MAIN_ARG: what you would put as the first arg to test_utilities_bundle
+#       and that is used for running deltastreamer on the second batch of data
+#   SECOND_ADDITIONAL_ARG: what you would put as extra args to test_utilities_bundle
+#       and that is used for running deltastreamer on the second batch of data
+##
+test_upgrade_bundle () {

Review Comment:
   I wrote it to run only on the release branch, but for now the code also runs
on this branch so that the tests execute. Once you approve what I have, I will
change it back so that it runs only on the release branch, and then it can be merged.
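
   For illustration only, here is a minimal sketch of what such a release-branch gate could look like. The `BRANCH_NAME` variable, the `release-*` naming pattern, and both helper function names are assumptions made for this example; they are not taken from the PR, and the actual change may gate the test differently.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only when the given branch name
# matches the assumed "release-*" pattern.
is_release_branch () {
    case "$1" in
        release-*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Hypothetical wrapper: runs the upgrade test on release branches
# and skips it everywhere else. BRANCH_NAME is an assumed env var.
maybe_test_upgrade_bundle () {
    if is_release_branch "${BRANCH_NAME:-}"; then
        echo "running upgrade test"
        # test_upgrade_bundle would be invoked here
    else
        echo "skipping upgrade test on non-release branch"
    fi
}
```

   With a gate of this shape, enabling the test on a feature branch temporarily (as is done here so the tests run) would only require relaxing the pattern check, and restoring it before merge.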



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

