MaxGekk commented on PR #56622: URL: https://github.com/apache/spark/pull/56622#issuecomment-4756785111
Thanks for the careful review, @shrirangmhalgi!

On the Avro `time-micros` design note: you're right that the value is mislabeled. Spark writes the internal nanoseconds-since-midnight `Long` but annotates the column with the `time-micros` logical type, so a consumer that honors that annotation would misread it. Two clarifications on scope:

- It isn't specific to TIME(7-9). The value is stored as nanoseconds under `time-micros` for **all** precisions (even TIME(6)), so external readers are already affected today. This PR doesn't change the Avro path (`SchemaConverters`/`AvroSerializer`/`AvroDeserializer` are untouched); Spark-to-Spark round-trips stay correct because both sides treat the `Long` as raw nanos and recover precision from the `spark.sql.catalyst.type` property.
- A `time-nanos`-for-7-9-only bifurcation would make 7-9 externally correct but leave 0-6 still mislabeled. Making the encoding fully unit-correct is also a format change with a backward-compatibility angle: a new reader would misread Avro files already written by current Spark (nanos stored under `time-micros`), so legacy files need a detection/migration story.

Since it's pre-existing, broader than this change, and needs that migration decision, I've filed a follow-up to do the full unit-correct Avro encoding (all precisions): [SPARK-57581](https://issues.apache.org/jira/browse/SPARK-57581). I'd prefer to keep this PR focused on the precision-cap extension and tackle Avro there. Does that work for you?
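For a concrete sense of the mislabeling, here is a minimal Python sketch of the arithmetic only. It is not Spark or Avro code: `nanos_since_midnight` and `decode_as_time_micros` are hypothetical helpers that stand in for "what Spark stores" and "what a reader honoring `time-micros` would do with that number".

```python
NANOS_PER_SECOND = 1_000_000_000
MICROS_PER_SECOND = 1_000_000


def nanos_since_midnight(hours, minutes, seconds, nanos=0):
    """Hypothetical stand-in for the raw Long that gets written."""
    return ((hours * 60 + minutes) * 60 + seconds) * NANOS_PER_SECOND + nanos


def decode_as_time_micros(value):
    """Interpret a long as microseconds since midnight (the time-micros unit)."""
    total_seconds, micros = divmod(value, MICROS_PER_SECOND)
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return hours, minutes, seconds, micros


# 00:00:01 stored as nanoseconds
stored = nanos_since_midnight(0, 0, 1)

# A reader that trusts the time-micros annotation is off by a factor of 1000.
misread = decode_as_time_micros(stored)

# Converting nanos to micros first gives the intended value.
correct = decode_as_time_micros(stored // 1000)

print(stored)   # 1000000000
print(misread)  # (0, 16, 40, 0)
print(correct)  # (0, 0, 1, 0)
```

In this sketch one second after midnight comes back as 00:16:40 for the annotation-honoring reader, and the same factor-of-1000 error applies whatever the declared TIME precision is, since the stored unit does not depend on it.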
