LuciferYang opened a new pull request, #41466:
URL: https://github.com/apache/spark/pull/41466
### What changes were proposed in this pull request?
Before this PR, the Maven tests of the connect server module fail. To reproduce, run:
```
build/mvn clean install -DskipTests
build/mvn test -pl connector/connect/server
```
Two tests fail as follows:
```
- from_protobuf_messageClassName *** FAILED ***
  org.apache.spark.sql.AnalysisException: [CANNOT_LOAD_PROTOBUF_CLASS] Could not load Protobuf class with name org.apache.spark.connect.proto.StorageLevel. org.apache.spark.connect.proto.StorageLevel does not extend shaded Protobuf Message class org.sparkproject.spark_protobuf.protobuf.Message. The jar with Protobuf classes needs to be shaded (com.google.protobuf.* --> org.sparkproject.spark_protobuf.protobuf.*).
  at org.apache.spark.sql.errors.QueryCompilationErrors$.protobufClassLoadError(QueryCompilationErrors.scala:3417)
  at org.apache.spark.sql.protobuf.utils.ProtobufUtils$.buildDescriptorFromJavaClass(ProtobufUtils.scala:193)
  at org.apache.spark.sql.protobuf.utils.ProtobufUtils$.buildDescriptor(ProtobufUtils.scala:151)
  at org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.messageDescriptor$lzycompute(ProtobufDataToCatalyst.scala:58)
  at org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.messageDescriptor(ProtobufDataToCatalyst.scala:57)
  at org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.dataType$lzycompute(ProtobufDataToCatalyst.scala:43)
  at org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.dataType(ProtobufDataToCatalyst.scala:42)
  at org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:194)
  at org.apache.spark.sql.catalyst.plans.logical.Project.$anonfun$output$1(basicLogicalOperators.scala:72)
  at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
- from_protobuf_messageClassName_options *** FAILED ***
  org.apache.spark.sql.AnalysisException: [CANNOT_LOAD_PROTOBUF_CLASS] Could not load Protobuf class with name org.apache.spark.connect.proto.StorageLevel. org.apache.spark.connect.proto.StorageLevel does not extend shaded Protobuf Message class org.sparkproject.spark_protobuf.protobuf.Message. The jar with Protobuf classes needs to be shaded (com.google.protobuf.* --> org.sparkproject.spark_protobuf.protobuf.*).
  at org.apache.spark.sql.errors.QueryCompilationErrors$.protobufClassLoadError(QueryCompilationErrors.scala:3417)
  at org.apache.spark.sql.protobuf.utils.ProtobufUtils$.buildDescriptorFromJavaClass(ProtobufUtils.scala:193)
  at org.apache.spark.sql.protobuf.utils.ProtobufUtils$.buildDescriptor(ProtobufUtils.scala:151)
  at org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.messageDescriptor$lzycompute(ProtobufDataToCatalyst.scala:58)
  at org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.messageDescriptor(ProtobufDataToCatalyst.scala:57)
  at org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.dataType$lzycompute(ProtobufDataToCatalyst.scala:43)
  at org.apache.spark.sql.protobuf.ProtobufDataToCatalyst.dataType(ProtobufDataToCatalyst.scala:42)
  at org.apache.spark.sql.catalyst.expressions.Alias.toAttribute(namedExpressions.scala:194)
  at org.apache.spark.sql.catalyst.plans.logical.Project.$anonfun$output$1(basicLogicalOperators.scala:72)
  at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
```
The tests fail because the Maven test uses the shaded `spark-protobuf` jar,
whereas the `Message` type `StorageLevel` used for testing inherits from
`com.google.protobuf.Message` rather than the relocated
`org.sparkproject.spark_protobuf.protobuf.Message`.
GitHub Actions passes because the sbt test always uses the non-assembly
`spark-protobuf` jar, in which the `com.google.protobuf.Message` class has not
been relocated.
So this PR follows the pattern of `kafka-0-10/kafka-0-10-assembly` and
`kinesis-asl/kinesis-asl-assembly`: it splits the `spark-protobuf` module into
`spark-protobuf` and `spark-protobuf-assembly`, and makes the connect server
module always use `spark-protobuf` for testing, matching the behavior of the
sbt tests.
With this PR, the Maven test commands above pass.
### Why are the changes needed?
Make the connect server module's tests pass when run with Maven.
### Does this PR introduce _any_ user-facing change?
Yes. Users need to use the `spark-protobuf-assembly` jar instead of the previous
`spark-protobuf` jar.
### How was this patch tested?
- Pass GitHub Actions
- Manual check:
```
build/mvn clean install -DskipTests
build/mvn test -pl connector/connect/server
```
All tests passed after this PR.
- Manually checked the contents of the jars: the contents of
`spark-protobuf-assembly` and `spark-protobuf` (without this PR) are the same,
and `spark-protobuf` (with this PR) is no longer shaded.
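
The jar-content check in the last bullet could be sketched as a small shell helper. This is only an illustration, not the procedure actually used for this PR: the function name `classify_protobuf_entries` and the jar path in the comment are hypothetical, and the two prefixes are taken from the relocation rule quoted in the error message above (`com.google.protobuf.*` --> `org.sparkproject.spark_protobuf.protobuf.*`).

```shell
# Hypothetical helper: reads a jar listing (one entry per line) on stdin and
# reports where the bundled protobuf runtime classes live:
#   shaded   -> only relocated org/sparkproject/spark_protobuf/protobuf/ classes
#   unshaded -> only original com/google/protobuf/ classes
#   mixed    -> classes under both prefixes
#   none     -> no protobuf runtime classes bundled at all
classify_protobuf_entries() {
  relocated=0
  original=0
  # The `|| [ -n "$entry" ]` part keeps a final line without a trailing newline.
  while IFS= read -r entry || [ -n "$entry" ]; do
    case "$entry" in
      org/sparkproject/spark_protobuf/protobuf/*.class) relocated=$((relocated + 1)) ;;
      com/google/protobuf/*.class) original=$((original + 1)) ;;
    esac
  done
  if [ "$relocated" -gt 0 ] && [ "$original" -gt 0 ]; then
    echo "mixed"
  elif [ "$relocated" -gt 0 ]; then
    echo "shaded"
  elif [ "$original" -gt 0 ]; then
    echo "unshaded"
  else
    echo "none"
  fi
}

# Example (the jar path is a placeholder; `jar tf` lists the entries of a jar):
#   jar tf path/to/spark-protobuf-assembly.jar | classify_protobuf_entries
```

If the description above holds, the assembly jar would be expected to report `shaded`. What the non-assembly jar reports depends on whether it bundles protobuf classes at all, which the PR text does not state.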
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]