Hi folks, I'm planning to upgrade the Iceberg version in the upcoming 0.9.0 release. There are several key points we need to discuss before finalizing the upgrade plan.

💡 Key Discussion Points:
1. *Should we upgrade to the latest Iceberg version (1.9.x)?*
   Iceberg 1.9.x brings new features and improvements, but also introduces breaking changes (see below).
2. *Can we drop support for Hadoop 2?*
   Iceberg 1.9.x officially removes support for Hadoop 2. Dropping it could simplify our upgrade path, but may impact legacy users.
3. *Is it acceptable to allow the mixed format modules (especially Spark) to use different Iceberg versions?*
   This includes options like using Iceberg 1.9.x in the core modules while retaining older versions (e.g., 1.6.x or 1.8.x) in the Spark/Flink mixed format modules to maintain compatibility.

------------------------------

🧭 Proposed Options:

*Option 1: Full Upgrade*
- Upgrade to Iceberg *1.9.x* across the board
- Drop support for *Hadoop 2* and *Spark ≤ 3.3*
- Standardize on *Spark 3.4+*
- Flink mixed format keeps using Iceberg 1.4.3
- ✅ Pros: Clean and future-proof.
- ❌ Cons: Breaks compatibility for older environments.

*Option 2: Hybrid Compatibility*
- Upgrade core to Iceberg *1.9.x*.
- For *Hadoop 2* environments, fall back to Iceberg *1.8.x*.
- For the *Spark mixed format* module, either:
  - Drop support for Spark ≤ 3.3, *or*
  - Use Iceberg 1.8.x specifically in the Spark mixed format module.
- ✅ Pros: Balances new features with backward compatibility.
- ❌ Cons: More complex build and dependency management (a rough build sketch follows at the end).

*Option 3: Conservative Upgrade*
- Upgrade to Iceberg *1.8.x* as the maximum version.
- In the Flink mixed format (e.g., Flink 1.17), keep using Iceberg *1.6.x*.
- ✅ Pros: Minimal compatibility risk.
- ❌ Cons: Misses improvements in newer Iceberg versions.

*Also consider downgrading to Iceberg 1.6.x when compiling with JDK 8.*
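For illustration only, here is a minimal sketch of what the per-module pinning in Option 2 could look like, assuming a Gradle Kotlin DSL build. The build tool, the module name "mixed-format-spark", and the exact patch versions are my assumptions, not decided values; the same idea can be expressed with Maven properties per module.

```kotlin
// Root build.gradle.kts -- a minimal sketch, not the project's actual build.
// Assumptions: module name "mixed-format-spark", patch versions 1.9.0 / 1.8.1.

val icebergVersion = "1.9.0"            // default for core modules
val icebergSparkMixedVersion = "1.8.1"  // Option 2: Spark mixed format stays on 1.8.x

subprojects {
    // Pick the Iceberg version per module: only the Spark mixed format
    // module keeps the older line, everything else moves to 1.9.x.
    val pinnedIceberg =
        if (project.name == "mixed-format-spark") icebergSparkMixedVersion
        else icebergVersion

    // Only add the dependency to modules that actually apply the Java plugin.
    plugins.withType<JavaPlugin> {
        dependencies {
            "implementation"("org.apache.iceberg:iceberg-core:$pinnedIceberg")
        }
    }
}
```

Whatever the build tool, the cost is the same: two Iceberg lines to track and test until the Spark mixed format module can catch up, which is the main source of the extra complexity noted in Option 2's cons.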