vinothchandar commented on a change in pull request #4955:
URL: https://github.com/apache/hudi/pull/4955#discussion_r831563020



##########
File path: .github/workflows/bot.yml
##########
@@ -16,18 +16,36 @@ jobs:
     strategy:
       matrix:
         include:
-          - scala: "scala-2.11"
-            spark: "spark2"
-          - scala: "scala-2.11"
-            spark: "spark2,spark-shade-unbundle-avro"
-          - scala: "scala-2.12"
-            spark: "spark3.1.x"
-          - scala: "scala-2.12"
-            spark: "spark3.1.x,spark-shade-unbundle-avro"
-          - scala: "scala-2.12"
-            spark: "spark3"
-          - scala: "scala-2.12"
-            spark: "spark3,spark-shade-unbundle-avro"
+          - scalaProfile: "scala-2.11"
+            sparkProfile: "spark2"
+            sparkVersion: "2.4.4"
+
+          # Spark 3.1.x
+          - scalaProfile: "scala-2.12"
+            sparkProfile: "spark3.1.x"
+            sparkVersion: "3.1.0"
+
+          - scalaProfile: "scala-2.12"
+            sparkProfile: "spark3.1.x"
+            sparkVersion: "3.1.1"
+
+          - scalaProfile: "scala-2.12"
+            sparkProfile: "spark3.1.x"
+            sparkVersion: "3.1.2"
+
+          - scalaProfile: "scala-2.12"
+            sparkProfile: "spark3.1.x"
+            sparkVersion: "3.1.3"
+
+          # Spark 3.2.x
+          - scalaProfile: "scala-2.12"
+            sparkProfile: "spark3"
+            sparkVersion: "3.2.0"
+
+          - scalaProfile: "scala-2.12"

Review comment:
       Would this be okay with our GitHub Actions minutes? @xushiyan

##########
File path: hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/hudi/SparkAdapter.scala
##########
@@ -54,6 +59,11 @@ trait SparkAdapter extends Serializable {
    */
   def createAvroDeserializer(rootAvroType: Schema, rootCatalystType: DataType): HoodieAvroDeserializer
 
+  /**
+   * TODO

Review comment:
       Docs? Could we replace this TODO with an actual description?

##########
File path: README.md
##########
@@ -90,21 +90,14 @@ mvn clean package -DskipTests -Dspark3
 mvn clean package -DskipTests -Dspark3.1.x
 ```
 
-### Build without spark-avro module
+### What about "spark-avro" module?
 
-The default hudi-jar bundles spark-avro module. To build without spark-avro module, build using `spark-shade-unbundle-avro` profile
+Previously, Hudi bundles were packaging (and shading) "spark-avro" module internally. However, due to multiple occasion
+of it being broken b/w patch versions (most recent was, b/w 3.2.0 and 3.2.1) of Spark after substantial deliberation
+we took a decision to let go such dependency and instead simply clone the structures we're relying on to better control

Review comment:
       Can we shorten this a bit and have the README list just the actual steps to follow?
   
   ```
   ### What about "spark-avro" module?
   
   Hudi versions 0.11 and later no longer require `spark-avro` to be specified using `--packages`.
   
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
