LuciferYang opened a new pull request, #41466:
URL: https://github.com/apache/spark/pull/41466

   ### What changes were proposed in this pull request?
   Before this PR, the Maven tests of the connect server module fail. To reproduce, run:
   
   ```
   build/mvn clean install -DskipTests
   build/mvn test -pl connector/connect/server
   ```
   
   Two tests fail as follows:
   
   
   ```
   - from_protobuf_messageClassName *** FAILED ***
     org.apache.spark.sql.AnalysisException: [CANNOT_LOAD_PROTOBUF_CLASS] Could 
not load Protobuf class with name org.apache.spark.connect.proto.StorageLevel. 
org.apache.spark.connect.proto.StorageLevel does not extend shaded Protobuf 
Message class org.sparkproject.spark_protobuf.protobuf.Message. The jar with 
Protobuf classes needs to be shaded (com.google.protobuf.* --> 
org.sparkproject.spark_protobuf.protobuf.*).
     at 
org.apache.spark.sql.errors.QueryCompilationErrors$.protobufClassLoadError(QueryCompilationErrors.scala:3417)
     at 
org.apache.spark.sql.protobuf.utils.ProtobufUtils$.buildDescriptorFromJavaClass(ProtobufUtils.scala:193)
     at 
org.apache.spark.sql.protobuf.utils.ProtobufUtils$.buildDescriptor(ProtobufUtils.scala:151)
     at 
org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.messageDescriptor$lzycompute(ProtobufDataToCatalyst.scala:58)
     at 
org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.messageDescriptor(ProtobufDataToCatalyst.scala:57)
     at 
org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.dataType$lzycompute(ProtobufDataToCatalyst.scala:43)
     at 
org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.dataType(ProtobufDataToCatalyst.scala:42)
     at 
org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:194)
     at 
org.apache.spark.sql.catalyst.plans.logical.Project.$anonfun$output$1(basicLogicalOperators.scala:72)
     at 
scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
   
   
   - from_protobuf_messageClassName_options *** FAILED ***
     org.apache.spark.sql.AnalysisException: [CANNOT_LOAD_PROTOBUF_CLASS] Could 
not load Protobuf class with name org.apache.spark.connect.proto.StorageLevel. 
org.apache.spark.connect.proto.StorageLevel does not extend shaded Protobuf 
Message class org.sparkproject.spark_protobuf.protobuf.Message. The jar with 
Protobuf classes needs to be shaded (com.google.protobuf.* --> 
org.sparkproject.spark_protobuf.protobuf.*).
     at 
org.apache.spark.sql.errors.QueryCompilationErrors$.protobufClassLoadError(QueryCompilationErrors.scala:3417)
     at 
org.apache.spark.sql.protobuf.utils.ProtobufUtils$.buildDescriptorFromJavaClass(ProtobufUtils.scala:193)
     at 
org.apache.spark.sql.protobuf.utils.ProtobufUtils$.buildDescriptor(ProtobufUtils.scala:151)
     at 
org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.messageDescriptor$lzycompute(ProtobufDataToCatalyst.scala:58)
     at 
org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.messageDescriptor(ProtobufDataToCatalyst.scala:57)
     at 
org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.dataType$lzycompute(ProtobufDataToCatalyst.scala:43)
     at 
org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.dataType(ProtobufDataToCatalyst.scala:42)
     at 
org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:194)
     at 
org.apache.spark.sql.catalyst.plans.logical.Project.$anonfun$output$1(basicLogicalOperators.scala:72)
     at 
scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286) 
   ```
   
   The tests fail because the Maven test run uses the shaded `spark-protobuf` 
jar, while `StorageLevel`, the `Message` type used for testing, extends 
`com.google.protobuf.Message` rather than 
`org.sparkproject.spark_protobuf.protobuf.Message`.
   
   GitHub Actions passes because the sbt test run always uses the non-assembly 
`spark-protobuf` jar, so the `com.google.protobuf.Message` class has not been 
relocated.
   
   So this PR follows the pattern of `kafka-0-10/kafka-0-10-assembly` and 
`kinesis-asl/kinesis-asl-assembly`: it splits the `spark-protobuf` module into 
`spark-protobuf` and `spark-protobuf-assembly`, and makes the connect server 
module always use `spark-protobuf` for testing, matching the behavior of sbt 
testing.
   
   With this PR, the Maven test commands above pass.
   
   
   ### Why are the changes needed?
   To make the connect server module's tests pass when run with Maven.
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, users need to use the `spark-protobuf-assembly` jar instead of the 
previous `spark-protobuf` jar.
   
   
   ### How was this patch tested?
   - Pass GitHub Actions
   - Manual check:
   
   ```
   build/mvn clean install -DskipTests
   build/mvn test -pl connector/connect/server
   ```
   
   All tests pass after this PR.
   
   - Manually checked the contents of the jars: the contents of 
`spark-protobuf-assembly` and `spark-protobuf` (without this PR) are the same, 
and `spark-protobuf` (with this PR) is no longer shaded.
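
A jar-content check like the one described above could be scripted roughly as follows. This is only a sketch, not the procedure actually used for this PR: the function name `protobuf_shading_status` and the jar path in the usage comment are hypothetical, and the relocated package path is taken from the relocation pattern quoted in the error message (`com.google.protobuf.*` --> `org.sparkproject.spark_protobuf.protobuf.*`).

```shell
# Hypothetical helper: classify a jar listing read from stdin
# (one entry per line, in the format printed by `jar tf`).
# Prints "shaded" if any entry lives under the relocated protobuf
# package, and "unshaded" otherwise.
protobuf_shading_status() {
  if grep -q '^org/sparkproject/spark_protobuf/protobuf/'; then
    echo "shaded"
  else
    echo "unshaded"
  fi
}

# Usage (the jar path is a placeholder, not taken from the PR):
#   jar tf path/to/spark-protobuf-assembly.jar | protobuf_shading_status
```

Under these assumptions, the assembly jar should report `shaded`, and the `spark-protobuf` jar built with this PR should report `unshaded`.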


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

