vinothchandar commented on pull request #1760: URL: https://github.com/apache/hudi/pull/1760#issuecomment-685998284
@bschell here's a path forward.

- We can create an abstraction for deserialization, `RowDeserializer`, which implements row deserialization differently for Spark 2 and Spark 3.
- The Spark 2 impl lives in `hudi-spark`, and we pick the deser impl based on the Spark version within `hudi-spark` itself, using reflection. Note that we create only the instance this way, i.e. reflection is used only for initializing the deserializer object.
- Let's create a new module, `hudi-spark3`, which contains just the one class that implements the deserialization using Spark 3 APIs. We will override the Spark version to 3 for that module alone, i.e. this class alone gets compiled against Spark 3. (This needs to be confirmed.)
- We include `hudi-spark3` in the spark bundle. Spark is a provided dependency anyway, so as long as we invoke the Spark 3 deserializer only with the Spark 3 jars around, things should work well.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
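
The proposal in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Hudi code: the `deserialize` signature, the nested `Spark2RowDeserializer` / `Spark3RowDeserializer` classes, and the `create` factory are all hypothetical names. Both implementations sit in one file so the sketch runs without Spark on the classpath; in the proposal, the Spark 3 implementation would instead live in `hudi-spark3` and be compiled against Spark 3. The point shown is that reflection is used only to instantiate the implementation, and every later call goes through the interface.

```java
public class RowDeserializerSketch {

    /** Hypothetical abstraction; a real version would likely take Spark's InternalRow and return a Row. */
    public interface RowDeserializer {
        String deserialize(Object internalRow);
    }

    /** Stand-in for the Spark 2 implementation that would live in hudi-spark. */
    public static class Spark2RowDeserializer implements RowDeserializer {
        public Spark2RowDeserializer() {}

        @Override
        public String deserialize(Object internalRow) {
            return "spark2:" + internalRow;
        }
    }

    /** Stand-in for the single class that would live in hudi-spark3. */
    public static class Spark3RowDeserializer implements RowDeserializer {
        public Spark3RowDeserializer() {}

        @Override
        public String deserialize(Object internalRow) {
            return "spark3:" + internalRow;
        }
    }

    /** Maps a Spark version string to an implementation class name (assumed naming scheme). */
    static String implClassNameFor(String sparkVersion) {
        String simpleName = sparkVersion.startsWith("3.")
                ? "Spark3RowDeserializer"
                : "Spark2RowDeserializer";
        // Nested classes are named Outer$Inner at the binary level.
        return RowDeserializerSketch.class.getName() + "$" + simpleName;
    }

    /** Reflection is used only here, to create the instance. */
    public static RowDeserializer create(String sparkVersion) {
        String className = implClassNameFor(sparkVersion);
        try {
            return (RowDeserializer) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load deserializer " + className, e);
        }
    }

    public static void main(String[] args) {
        // In a real job the version string would come from the running Spark installation.
        RowDeserializer deser = create("3.0.0");
        System.out.println(deser.deserialize("row-1"));
    }
}
```

Because the Spark 3 class is referenced only by name, nothing in the Spark 2 code path links against it, which is what would allow both implementations to ship in the same bundle while Spark itself stays a provided dependency.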
