wypoon opened a new pull request #2512: URL: https://github.com/apache/iceberg/pull/2512
Code changes that allow spark3 and spark3-extensions to be built against both Spark 3.0 and Spark 3.1:

- A method from `org.apache.spark.sql.catalyst.util.DateTimeUtils` that has changed its name is copied to `org.apache.iceberg.util.DateTimeUtil`, and the copied method is used instead.
- The trait `org.apache.spark.sql.catalyst.plans.logical.V2WriteCommand` has 3 additional methods that need to be implemented. They are implemented in `ReplaceData`, but without `override`.
- The trait `org.apache.spark.sql.catalyst.parser.ParserInterface` no longer has the `parseRawDataType` method, so `IcebergSparkSqlExtensionsParser` implements it without `override`, and the delegation checks whether the method is defined on the delegate (a rough sketch is included at the end of this description).
- The main constructor of `org.apache.spark.sql.catalyst.expressions.SortOrder` has changed its signature. Reflection is used to determine how to create instances of `SortOrder` (sketched at the end of this description).
- The constructor of `org.apache.spark.sql.execution.datasources.v2.DataSourceV2ScanRelation` has changed its signature. Reflection is used to determine how to create instances of `DataSourceV2ScanRelation`.
- `org.apache.spark.sql.catalyst.SQLConfHelper` was introduced in Spark 3.1, and a number of classes and traits now extend it, including `org.apache.spark.sql.catalyst.analysis.CastSupport`, which has it as a self type. This is the trickiest part. I move the `CastSupport` mixin from `AssignmentAlignmentSupport` to the rule `AlignRowLevelOperations`, since `org.apache.spark.sql.catalyst.rules.Rule` implements `SQLConfHelper`. I define a `conf` method in the traits `AssignmentAlignmentSupport` and `RewriteRowLevelOperationHelper` so that it can be overridden in the classes that extend them, which also extend `Rule[LogicalPlan]` (and thus `SQLConfHelper` in Spark 3.1). When compiling with Spark 3.0, the `conf` in the Iceberg traits is overridden; when compiling with Spark 3.1, the `conf` in `SQLConfHelper` is overridden (sketched at the end of this description).

I have not changed the Spark 3 version here. I am open to suggestions on how we want to do this. I am unfamiliar with Gradle; with Maven, I would define profiles so that the Spark 3 support can be built with either Spark 3.0 or Spark 3.1.

I have tested this change locally by building against both Spark 3.0 and Spark 3.1 and running the unit tests in the spark3 and spark3-extensions modules in both cases. I have also tried using the Spark 3 runtime jar built against Spark 3.0 with a Spark 3.1 cluster, but only ran a couple of Spark 3 procedures, so the testing is far from comprehensive. I am not sure if we need a Spark 3.1 runtime jar for Spark 3.1 clusters for it to be safe.
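To make the `parseRawDataType` item above concrete, here is a minimal sketch of the kind of delegation check described; the class name, error handling, and exact lookup are simplified illustrations, not the code in this PR:

```scala
import org.apache.spark.sql.catalyst.parser.ParserInterface
import org.apache.spark.sql.types.DataType

class ExtensionsParserSketch(delegate: ParserInterface) {
  // Declared without `override` because ParserInterface no longer has this
  // method in Spark 3.1; the delegate is inspected at runtime, so the call
  // only succeeds when the underlying parser (Spark 3.0) still defines it.
  def parseRawDataType(sqlText: String): DataType = {
    val method = delegate.getClass.getMethods
      .find(m => m.getName == "parseRawDataType" && m.getParameterCount == 1)
      .getOrElse(throw new UnsupportedOperationException(
        "parseRawDataType is not defined by the delegate parser"))
    method.invoke(delegate, sqlText).asInstanceOf[DataType]
  }
}
```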
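Similarly, a rough illustration of the reflection-based construction for `SortOrder`; the assumption that only the type of the last (same-order-expressions) parameter changed between 3.0 and 3.1, and the choice of default direction and null ordering, are illustrative only. `DataSourceV2ScanRelation` is handled with the same general approach:

```scala
import org.apache.spark.sql.catalyst.expressions.{Ascending, Expression, NullsFirst, SortOrder}

object SortOrderUtilSketch {
  // Pick the primary SortOrder constructor once; its last parameter type is
  // assumed to be the part that differs between Spark 3.0 and Spark 3.1.
  private lazy val ctor = classOf[SortOrder].getConstructors
    .maxBy(_.getParameterCount)

  def ascendingNullsFirst(child: Expression): SortOrder = {
    val sameOrderExprs: AnyRef =
      if (classOf[Seq[_]].isAssignableFrom(ctor.getParameterTypes.last)) {
        Seq.empty[Expression] // Spark 3.1-style parameter
      } else {
        Set.empty[Expression] // Spark 3.0-style parameter
      }
    ctor.newInstance(child, Ascending, NullsFirst, sameOrderExprs)
      .asInstanceOf[SortOrder]
  }
}
```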
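Finally, a stripped-down sketch of the `conf` arrangement for the `SQLConfHelper` item; the trait and rule names here are placeholders rather than the actual Iceberg classes:

```scala
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.internal.SQLConf

trait RowLevelHelperSketch {
  // Declared here so that concrete rules provide it. Against Spark 3.0 the
  // definition in the rule overrides only this declaration; against Spark 3.1
  // the same definition also overrides SQLConfHelper.conf inherited via Rule.
  def conf: SQLConf
}

object AlignRowLevelOperationsSketch extends Rule[LogicalPlan] with RowLevelHelperSketch {
  override def conf: SQLConf = SQLConf.get

  override def apply(plan: LogicalPlan): LogicalPlan = plan
}
```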
