MaxGekk commented on PR #56778:
URL: https://github.com/apache/spark/pull/56778#issuecomment-4802193523

   > AFAIK, `python/pyspark/sql/pandas/types.py` has the similar logic. Please double-check them too in this PR.
   
   Thanks @dongjoon-hyun, you're right. `python/pyspark/sql/pandas/types.py` 
has the same gap: `to_arrow_type` maps `TimeType` to `pa.time64("ns")` (no 
precision carried), and `from_arrow_type` maps `is_time64(at)` back to 
`TimeType()`, which defaults to precision 6. So a `TIME(p)` round-trip through 
the PySpark Arrow/pandas path also collapses to `TIME(6)`, same as the JVM 
`ArrowUtils` path this PR fixes.
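
   To make the collapse concrete, here is a minimal stand-alone model of that round trip. It is only a sketch: `TimeTypeModel`, `to_arrow_model`, and `from_arrow_model` are hypothetical stand-ins for `TimeType`, `to_arrow_type`, and `from_arrow_type`, and the Arrow type is represented by a plain string rather than a real pyarrow object.

   ```python
   from dataclasses import dataclass


   @dataclass(frozen=True)
   class TimeTypeModel:
       """Hypothetical stand-in for Spark's TimeType; precision defaults to 6."""
       precision: int = 6


   def to_arrow_model(spark_type: TimeTypeModel) -> str:
       # Mirrors the gap described above: every precision maps to the same
       # Arrow type, so the precision is dropped at this point.
       return "time64[ns]"


   def from_arrow_model(arrow_type: str) -> TimeTypeModel:
       # Mirrors the reverse mapping: a time64 type comes back as the
       # default-constructed type, i.e. precision 6.
       if arrow_type == "time64[ns]":
           return TimeTypeModel()
       raise ValueError(f"unsupported type in this sketch: {arrow_type}")


   original = TimeTypeModel(precision=3)
   round_tripped = from_arrow_model(to_arrow_model(original))
   ```

   In this model `round_tripped.precision` is 6 even though `original.precision` is 3, which is the same loss a `TIME(3)` column sees on the current path.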
   
   Since the PySpark mapping carries precision through a different channel (the 
pyarrow field metadata in `to_arrow_type`/`from_arrow_type`) and needs its own 
tests, I filed it as a sibling sub-task under SPARK-57550: 
[SPARK-57696](https://issues.apache.org/jira/browse/SPARK-57696) (Preserve TIME 
precision in the PySpark Arrow/pandas type mapping). I'll handle it as a 
follow-up so this PR stays scoped to the JVM `ArrowUtils` mapping. Happy to 
fold it in here instead if you'd prefer -- let me know.
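
   For reference, a rough sketch of what carrying precision through pyarrow field metadata could look like. This is not the SPARK-57696 implementation: the metadata key and both helper functions are assumptions made up for illustration, and only `pa.field`, `pa.time64`, and `Field.metadata` are real pyarrow APIs.

   ```python
   import pyarrow as pa

   # Hypothetical metadata key; the actual follow-up may pick a different one.
   PRECISION_KEY = b"time_precision"


   def time_field(name: str, precision: int) -> pa.Field:
       """Build a time64("ns") field that records TIME precision in its metadata."""
       return pa.field(
           name,
           pa.time64("ns"),
           metadata={PRECISION_KEY: str(precision).encode()},
       )


   def precision_of(field: pa.Field, default: int = 6) -> int:
       """Read the precision back, falling back to the default when it is absent."""
       meta = field.metadata or {}  # metadata is None when nothing was attached
       raw = meta.get(PRECISION_KEY)
       return default if raw is None else int(raw.decode())


   with_precision = time_field("t", 3)
   without_precision = pa.field("t", pa.time64("ns"))
   ```

   Here `precision_of(with_precision)` gives 3, while `precision_of(without_precision)` falls back to 6, so fields written without the metadata keep today's behavior.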


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


