MaxGekk commented on code in PR #56422:
URL: https://github.com/apache/spark/pull/56422#discussion_r3473235637
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetUtils.scala:
##########
@@ -526,6 +526,9 @@ object ParquetUtils extends Logging {
DataSourceUtils.setConfIfAbsent(conf,
SQLConf.PARQUET_ANNOTATE_VARIANT_LOGICAL_TYPE.key,
sqlConf.parquetAnnotateVariantLogicalType.toString)
+ DataSourceUtils.setConfIfAbsent(conf,
+ SQLConf.PARQUET_TIME_TYPE_ALLOW_IS_ADJUSTED_TO_UTC_READ.key,
Review Comment:
`prepareWrite` is a write-path method: it sets only write flags
(`PARQUET_WRITE_LEGACY_FORMAT`, `PARQUET_OUTPUT_TIMESTAMP_TYPE`, ...) into the
job conf consumed by `ParquetWriteSupport`.
`PARQUET_TIME_TYPE_ALLOW_IS_ADJUSTED_TO_UTC_READ` is a read-only config —
nothing on the write path reads it back, so this line is dead, and it's
inconsistent with the analogue read config `respectUnknownTypeAnnotation`,
which is not set in `prepareWrite`. The read path is already fully served by
`ParquetFileFormat.populateConf` plus the converter constructor param. It also
slightly contradicts the PR's "write path unchanged" intent. Suggest removing
this addition (non-blocking).
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/types/ops/TimeTypeParquetOps.scala:
##########
@@ -83,13 +86,28 @@ case class TimeTypeParquetOps(t: TimeType) extends
ParquetTypeOps {
// ==================== Row-Based Read ====================
override def newConverter(
+ parquetType: Type,
+ updater: ParentContainerUpdater): Converter with
HasParentContainerUpdater =
+ newConverterInternal(parquetType, updater)
+
+ override def newConverter(
Review Comment:
This 6-arg `newConverter` override is functionally identical to the
inherited `ParquetTypeOps` default, which already delegates the 6-arg form to
the 2-arg `newConverter` (`ParquetTypeOps.scala:159`). Both reach
`newConverterInternal`, so removing this override changes nothing. Keeping it
risks misleading a reader into thinking
`schemaConverter`/`convertTz`/`datetimeRebaseSpec`/`int96RebaseSpec` are
handled here — they're silently discarded. Suggest dropping the override and
keeping just `newConverterInternal` + the 2-arg override (non-blocking).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]