jackye1995 commented on pull request #3256: URL: https://github.com/apache/iceberg/pull/3256#issuecomment-940264386
Thanks for the PR! There are a few aspects I'd like to discuss:

> 1. should we continue to keep the existing iceberg-spark module as a common module across all versions

My original thought in #3237 was that we could define a rule: shared code continues to live in the common module, and any code that cannot build against all versions is refactored into the version-specific modules. The intention is to minimize duplicated code across version modules, but I don't have enough context yet to know if this is worth the effort; maybe after refactoring the current code to build across Spark 2 and 3, we won't have much shared code anyway.

> 2. should we have v3.1 if v3.0 and v3.1 can all be built against v3.0

In #3237 I have the concepts of a build version and a source version. The build version is the actual Spark version we build against, and the source version is the smallest version that stays forward compatible until we create the next source version. So in this case we would have source version v2.4 that can build Spark 2.4.x, and source version v3.0 that can build Spark 3.0.x and 3.1.x. We would then add a v3.2 source version for build versions 3.2.x and onward, until we can no longer build against the v3.2 source. The intention, again, is to minimize duplicated code by minimizing the number of version folders.

> 3. How would we handle Scala version in this architecture?

In #3237 I had another `scalaVersion` system property to allow setting a flexible Scala version. Do you think that is something worth adding? It doesn't need to be refactored in this PR; I just want to know what you think about Scala versioning.

> 4. What would be the naming scheme for modules after v3.0?

This is just something not yet shown in this PR. For example, for v3.2, would the module names be something like `:iceberg-spark:spark32-runtime`, or `:iceberg-spark:spark3-2-runtime`, or something else? Also, combining this with the Scala version consideration: if we support that, should we also publish artifacts for all Scala versions we support, like Spark does? (See the sketch below for roughly what I'm imagining.)
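To make questions 3 and 4 a bit more concrete, here is a rough sketch of the kind of `settings.gradle` wiring I have in mind. This is only an illustration, not a concrete proposal: the property names (`sparkVersions`, `scalaVersion`), the defaults, the directory layout (`spark/v3.0`, ...), and the module naming are all placeholders.

```groovy
// settings.gradle -- sketch only, not the actual Iceberg build.
// "sparkVersions" and "scalaVersion" are hypothetical system properties;
// each entry in sparkVersions corresponds to one source-version folder.
def sparkVersions = System.getProperty("sparkVersions", "2.4,3.0,3.2").split(",")
def scalaVersion = System.getProperty("scalaVersion", "2.12")

sparkVersions.each { sparkVersion ->
  // e.g. iceberg-spark-3.2_2.12, built from the spark/v3.2 source folder
  def projectName = "iceberg-spark-${sparkVersion}_${scalaVersion}"
  include "iceberg-spark:${projectName}"
  project(":iceberg-spark:${projectName}").projectDir =
      new File(rootDir, "spark/v${sparkVersion}")
}
```

With something like this, a CI matrix could publish one artifact per supported Spark source version and Scala version (similar to how Spark publishes `_2.12`/`_2.13` artifacts), while developers build a single combination locally by passing `-DsparkVersions=3.2 -DscalaVersion=2.13`.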
