linliu-code commented on PR #18770: URL: https://github.com/apache/hudi/pull/18770#issuecomment-4483839958
Fixed the CI compile error.

Root cause: `ColumnVectorUtils.populate` has incompatible signatures across the Spark versions hudi-spark-common compiles against:

| Spark version | populate signature |
|---|---|
| 3.3.x | `populate(WritableColumnVector, InternalRow, int)` |
| 3.4.x | `populate(ConstantColumnVector, InternalRow, int)` |
| 3.5.x | `populate(ConstantColumnVector, InternalRow, int)` |

No single overload works for all three. Replaced the call with a small private helper that switches on the partition column's `DataType` and uses `ConstantColumnVector`'s primitive setters directly (those have been stable across 3.3-3.5). Unsupported partition types fall through to `setNull()`, which is safe for count(*) since partition predicates are applied at planning by the FileIndex, not by reading these vectors at execution.

Verified locally:

- `mvn compile -Dspark3.3 -Dscala-2.12` passes
- `mvn compile -Dspark3.4 -Dscala-2.12` passes
- `mvn compile -Dspark3.5 -Dscala-2.12` passes
- runtime: `count=10,000` (scale S) and `count=1,000,000` (scale L) correct; wall ratio 1.26× / 1.20× vs raw parquet.

Pushed as fixup commit on the same branch.
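For reviewers, the shape of the helper described above is roughly the following. This is a self-contained sketch, not the code in the PR: `Kind` and `ConstVector` are hypothetical stand-ins for Spark's `DataType` and `ConstantColumnVector` so that it runs without Spark on the classpath, and the set of handled types is an assumption.

```java
// Sketch only: Kind and ConstVector are hypothetical stand-ins for Spark's
// DataType and ConstantColumnVector; the PR's helper works on the real classes.
final class PartitionVectorSketch {

    /** Stand-in for the partition column types the helper dispatches on (assumed set). */
    enum Kind { BOOLEAN, INT, LONG, DOUBLE, STRING, UNSUPPORTED }

    /** Stand-in for a constant vector: a single value shared by every row in the batch. */
    static final class ConstVector {
        private Object value;
        private boolean isNull = true;

        void setNull() { value = null; isNull = true; }
        void setBoolean(boolean v) { set(v); }
        void setInt(int v) { set(v); }
        void setLong(long v) { set(v); }
        void setDouble(double v) { set(v); }
        void setString(String v) { set(v); }

        private void set(Object v) { value = v; isNull = false; }

        boolean isNull() { return isNull; }
        Object value() { return value; }
    }

    /**
     * Fills the vector from one partition value by switching on the column type
     * and calling the matching primitive setter. Null values and unsupported
     * types both end up as setNull().
     */
    static void populate(ConstVector vector, Kind kind, Object partitionValue) {
        if (partitionValue == null) {
            vector.setNull();
            return;
        }
        switch (kind) {
            case BOOLEAN: vector.setBoolean((Boolean) partitionValue); break;
            case INT:     vector.setInt((Integer) partitionValue); break;
            case LONG:    vector.setLong((Long) partitionValue); break;
            case DOUBLE:  vector.setDouble((Double) partitionValue); break;
            case STRING:  vector.setString((String) partitionValue); break;
            default:      vector.setNull(); break; // unsupported type falls through to null
        }
    }

    public static void main(String[] args) {
        ConstVector year = new ConstVector();
        populate(year, Kind.INT, 2024);
        System.out.println("int partition: null=" + year.isNull() + " value=" + year.value());

        ConstVector other = new ConstVector();
        populate(other, Kind.UNSUPPORTED, "anything");
        System.out.println("unsupported partition: null=" + other.isNull());
    }
}
```

Dispatching on the type and calling the setters directly avoids depending on whichever `populate` overload a given Spark version ships, at the cost of maintaining the type list by hand.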
