MaxGekk commented on PR #56622:
URL: https://github.com/apache/spark/pull/56622#issuecomment-4756785111

   Thanks for the careful review, @shrirangmhalgi!
   
   On the Avro `time-micros` design note: you're right that the value is 
mislabeled - Spark writes the internal nanoseconds-since-midnight `Long` but 
annotates the column with the `time-micros` logical type, so a consumer that 
honors that annotation would misread it.
   
   Two clarifications on scope:
   
   - It isn't specific to TIME(7-9). The value is stored as nanoseconds under 
`time-micros` for **all** precisions (even TIME(6)), so external readers are 
already affected today. This PR doesn't change the Avro path 
(`SchemaConverters`/`AvroSerializer`/`AvroDeserializer` are untouched); 
Spark-to-Spark round-trips stay correct because both sides treat the `Long` as 
raw nanos and recover precision from the `spark.sql.catalyst.type` property.
   - Switching to `time-nanos` only for TIME(7-9) would make those precisions 
externally correct but leave TIME(0-6) mislabeled. Making the encoding fully 
unit-correct is also a format change with a backward-compatibility angle: a 
new reader would misread Avro files already written by current Spark (nanos 
stored under `time-micros`), so legacy files need a detection/migration story.
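   
   A rough illustration of both points, as a sketch in plain Python rather than 
Spark or Avro code. The helper names (`to_nanos_of_day`, `read_as_time_micros`, 
`cannot_be_micros`) are made up for this example, and the strict range check is 
an assumption about how an external reader might behave:

```python
from datetime import time

MICROS_PER_SECOND = 1_000_000
NANOS_PER_SECOND = 1_000_000_000
MICROS_PER_DAY = 86_400 * MICROS_PER_SECOND


def to_nanos_of_day(hour, minute, second, nanos=0):
    """Nanoseconds since midnight: the raw Long described above."""
    return (hour * 3600 + minute * 60 + second) * NANOS_PER_SECOND + nanos


def read_as_time_micros(raw):
    """Interpret `raw` the way a reader honoring `time-micros` would.

    Returns a datetime.time, or None when the value does not fit in one day
    (hypothetical behavior; a real reader might raise or wrap instead).
    """
    if not 0 <= raw < MICROS_PER_DAY:
        return None
    seconds, micros = divmod(raw, MICROS_PER_SECOND)
    minutes, second = divmod(seconds, 60)
    hour, minute = divmod(minutes, 60)
    return time(hour, minute, second, micros)


def cannot_be_micros(raw):
    """True when `raw` is too large to be micros-of-day, so it must be nanos."""
    return raw >= MICROS_PER_DAY


# 12:34:56.123456789 stored as nanos is far outside the micros-of-day range.
stored = to_nanos_of_day(12, 34, 56, 123_456_789)
print(stored, read_as_time_micros(stored), cannot_be_micros(stored))

# 00:00:30 stored as nanos is in range, so it is silently misread.
early = to_nanos_of_day(0, 0, 30)
print(early, read_as_time_micros(early), cannot_be_micros(early))
```

   The second case is why a value-range check alone is not enough for legacy 
detection: any stored time before 00:01:26.4 yields a Long below 
86,400,000,000, which is also a valid micros-of-day value.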
   
   Since the mislabeling is pre-existing, broader than this change, and needs that migration 
decision, I've filed a follow-up to do the full unit-correct Avro encoding (all 
precisions): [SPARK-57581](https://issues.apache.org/jira/browse/SPARK-57581). 
I'd prefer to keep this PR focused on the precision-cap extension and tackle 
Avro there - does that work for you?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]