yihua opened a new pull request, #6735:
URL: https://github.com/apache/hudi/pull/6735

   ### Change Logs
   
   This PR fixes the packaging of hudi-spark3-bundle. Before this PR, reading a Hudi table through the Spark datasource in a Spark 3.3 shell with hudi-spark3-bundle threw the following exception because some classes were not packaged into the spark3 bundle.
   
   ```
   scala> val df = spark.read.format("hudi").load("<table_path>")
   java.util.ServiceConfigurationError: org.apache.spark.sql.sources.DataSourceRegister: Provider org.apache.hudi.Spark32PlusDefaultSource not found
     at java.util.ServiceLoader.fail(ServiceLoader.java:239)
     at java.util.ServiceLoader.access$300(ServiceLoader.java:185)
     at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:372)
     at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
     at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
     at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:46)
     at scala.collection.Iterator.foreach(Iterator.scala:943)
     at scala.collection.Iterator.foreach$(Iterator.scala:943)
     at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
     at scala.collection.IterableLike.foreach(IterableLike.scala:74)
     at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
     at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
     at scala.collection.TraversableLike.filterImpl(TraversableLike.scala:303)
     at scala.collection.TraversableLike.filterImpl$(TraversableLike.scala:297)
     at scala.collection.AbstractTraversable.filterImpl(Traversable.scala:108)
     at scala.collection.TraversableLike.filter(TraversableLike.scala:395)
     at scala.collection.TraversableLike.filter$(TraversableLike.scala:395)
     at scala.collection.AbstractTraversable.filter(Traversable.scala:108)
     at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:657)
     at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:725)
     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:207)
     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:185)
     ... 47 elided 
   ```
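
The trace above is what `java.util.ServiceLoader` reports when a provider listed in a `META-INF/services` file cannot be loaded. As a rough way to spot this kind of packaging gap before running Spark, the sketch below lists providers that a jar registers but does not contain. `BundleServiceCheck` and `missingProviders` are hypothetical names, not part of Hudi or Spark, and the check assumes the bundle is meant to be self-contained (a provider class that lives in another jar on the classpath would also be reported).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not part of Hudi or Spark.
final class BundleServiceCheck {

    // Returns the provider class names registered for `service` in the jar's
    // META-INF/services file that have no matching .class entry in the same jar.
    static List<String> missingProviders(String jarPath, String service) throws IOException {
        List<String> missing = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            JarEntry registration = jar.getJarEntry("META-INF/services/" + service);
            if (registration == null) {
                return missing; // the jar registers no providers for this service
            }
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(jar.getInputStream(registration), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Provider-configuration files allow '#' comments and blank lines.
                    int hash = line.indexOf('#');
                    String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
                    if (name.isEmpty()) {
                        continue;
                    }
                    String classEntry = name.replace('.', '/') + ".class";
                    if (jar.getJarEntry(classEntry) == null) {
                        missing.add(name);
                    }
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: BundleServiceCheck <path-to-bundle-jar>");
            return;
        }
        // Service interface named in the exception message above.
        String service = "org.apache.spark.sql.sources.DataSourceRegister";
        for (String name : missingProviders(args[0], service)) {
            System.out.println("registered but not packaged: " + name);
        }
    }
}
```

Run against a bundle jar, an empty output would suggest that every registered `DataSourceRegister` provider is packaged; it does not confirm that the classes those providers depend on are present as well.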
   
   ### Impact
   
   **Risk level: low**
   
   This change only fixes the hudi-spark3-bundle packaging so that the class-not-found error no longer occurs.
   
   Verified locally and on EMR that the hudi-spark3-bundle works after the fix.
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   

