jorisvandenbossche commented on issue #39539:
URL: https://github.com/apache/arrow/issues/39539#issuecomment-2017751525

   I would still prefer someone to first do a PR to the spec to add this. If it 
is just clarifying that the existing `DATETIME` dtype kind can also be used for 
other Arrow date and time dtypes, that should be relatively easy.
   
   
   
   > I see that libraries are working around this by defining date and time 
types as protocol DATETIME data type with Apache Arrow C Data Interface format 
string (example `tdD` for `date32`, `tdm` for `date64` etc, see [Polars 
code](https://github.com/pola-rs/polars/blob/53f55367d1428b6d4ab51a7b17a8dbf4c003ac43/py-polars/polars/interchange/utils.py#L48-L50)
 and [pandas 
code](https://github.com/pandas-dev/pandas/blob/4f145b3a04ac2e9167545a8a2a09d30856d9ce42/pandas/core/interchange/utils.py#L84-L92)).
   
   AFAIK pandas doesn't actually support this for duration, at least not for 
the default timedelta dtype (from testing with pandas main):
   
   ```
   In [7]: from pyarrow.interchange import from_dataframe
   
   In [8]: from_dataframe(pd.DataFrame({'a': pd.timedelta_range(0, "1 days", freq='s')}))
   ...
   File ~/scipy/repos/pandas/pandas/core/interchange/utils.py:147, in dtype_to_arrow_c_fmt(dtype)
       144 elif isinstance(dtype, DatetimeTZDtype):
       145     return ArrowCTypes.TIMESTAMP.format(resolution=dtype.unit[0], tz=dtype.tz)
   --> 147 raise NotImplementedError(
       148     f"Conversion of {dtype} to Arrow C format string is not implemented."
       149 )
   
   NotImplementedError: Conversion of timedelta64[ns] to Arrow C format string is not implemented.
   ```
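
To illustrate what is missing there, here is a rough sketch of how a duration branch could map a `timedelta64` dtype string to an Arrow C Data Interface format string. The format strings themselves come from the Arrow C Data Interface spec; the helper name `timedelta_dtype_to_arrow_c_fmt` and the dict are hypothetical and are not pandas or pyarrow code. It reuses the same "first letter of the unit" trick as the `dtype.unit[0]` line in the traceback above.

```python
import re

# Arrow C Data Interface format strings for the fixed temporal types.
# The keys are from the Arrow spec; the dict itself is only illustrative.
ARROW_C_TEMPORAL_FORMATS = {
    "tdD": "date32 [days]",
    "tdm": "date64 [milliseconds]",
    "tts": "time32 [seconds]",
    "ttm": "time32 [milliseconds]",
    "ttu": "time64 [microseconds]",
    "ttn": "time64 [nanoseconds]",
    "tDs": "duration [seconds]",
    "tDm": "duration [milliseconds]",
    "tDu": "duration [microseconds]",
    "tDn": "duration [nanoseconds]",
}


def timedelta_dtype_to_arrow_c_fmt(dtype_str: str) -> str:
    """Hypothetical helper: map e.g. 'timedelta64[ns]' to 'tDn'."""
    match = re.fullmatch(r"timedelta64\[(s|ms|us|ns)\]", dtype_str)
    if match is None:
        raise NotImplementedError(
            f"Conversion of {dtype_str} to Arrow C format string is not implemented."
        )
    unit = match.group(1)
    # 's' -> 's', 'ms' -> 'm', 'us' -> 'u', 'ns' -> 'n'
    return "tD" + unit[0]


fmt = timedelta_dtype_to_arrow_c_fmt("timedelta64[ns]")
print(fmt, "->", ARROW_C_TEMPORAL_FORMATS[fmt])
```

Whether the interchange spec allows these format strings under the `DATETIME` dtype kind is exactly the open question, so this only shows the mapping and not sanctioned behaviour.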
   
   FWIW, my proposal to add support for the Arrow PyCapsule protocol to the 
interchange standard (https://github.com/data-apis/dataframe-api/pull/342) 
would also solve this for the case of polars and pyarrow, as both are 
Arrow-memory based and could easily interchange those data types.
   (although that of course requires polars to implement it, and based on 
https://github.com/pola-rs/polars/issues/12530 that is still WIP I think)
   
   We _could_ start checking for that protocol in 
`pyarrow.interchange.from_dataframe`, although that would also be an extension 
not covered by the official spec.
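
As a rough idea of what such a check could look like: the dunder names `__arrow_c_stream__`, `__arrow_c_array__` and `__dataframe__` are the ones defined by the Arrow PyCapsule interface and the interchange protocol, but the helper `pick_import_path` and its return values are hypothetical and only show the dispatch order, not how `from_dataframe` is actually implemented.

```python
def pick_import_path(obj) -> str:
    """Hypothetical helper: decide which protocol to use for `obj`.

    Prefers the Arrow PyCapsule protocol when the object exposes it and
    falls back to the dataframe interchange protocol otherwise.
    """
    # Table-like objects expose __arrow_c_stream__; a single batch or
    # array-like object exposes __arrow_c_array__.
    if hasattr(obj, "__arrow_c_stream__") or hasattr(obj, "__arrow_c_array__"):
        return "arrow_pycapsule"
    # The interchange protocol entry point.
    if hasattr(obj, "__dataframe__"):
        return "interchange"
    raise TypeError(
        f"{type(obj).__name__} supports neither the Arrow PyCapsule protocol "
        "nor the dataframe interchange protocol"
    )
```

Checking for the PyCapsule protocol first means that Arrow-backed producers would skip the interchange dtype mapping entirely, which is why the date, time and duration types would no longer be a problem for them.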
   
   
   
   

