AlenkaF commented on PR #14804: URL: https://github.com/apache/arrow/pull/14804#issuecomment-1359168605
@jorisvandenbossche I think I have addressed all the comments and topics we talked about. Some information about the recent changes:

- The large string dtype is working correctly. Pandas doesn't support this dtype, so I changed the adaptation and added a test that checks for the error raised for the unsupported dtype. The PyArrow roundtrip works correctly.
- I had to construct the array with `pyarrow.LargeStringArray.from_buffers` rather than `pyarrow.Array.from_buffers` for the large string dtype.
- The string dtype is removed from the pandas roundtrip tests: pandas defines `.size()` as a method in [column.py](https://github.com/pandas-dev/pandas/blob/91111fd99898d9dcaa6bf6bedb662db4108da6e6/pandas/core/interchange/column.py#L84-L88) but accesses it as a property in [from_dataframe.py](https://github.com/pandas-dev/pandas/blob/5c66e65d7b9fef47ccb585ce2fd0b3ea18dc82ea/pandas/core/interchange/from_dataframe.py#L247), so the roundtrip with pandas errors for string dtypes.
- I have improved test coverage by parametrizing some of the tests and by using strategies for one of the tests in `test_interchange_spec.py`.
- An error is now raised if `nan_as_null=True`; I also added a test for it.
- I have exposed `from_dataframe` in `interchange/__init__.py`; I hope I have done it correctly.

So I think this PR is ready for another round of review 🙏

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
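
For reference, the `.size()` method-versus-property mismatch described in the comment above can be sketched in plain Python. This is only an illustration and not the pandas code: `MethodColumn`, `PropertyColumn`, and `allocate_mask` are hypothetical names, and the consumer is assumed to use the size in a simple allocation.

```python
class MethodColumn:
    """Hypothetical stand-in for a column that defines size() as a method."""

    def __init__(self, values):
        self._values = list(values)

    def size(self):
        return len(self._values)


class PropertyColumn:
    """Hypothetical stand-in for a column that exposes size as a property."""

    def __init__(self, values):
        self._values = list(values)

    @property
    def size(self):
        return len(self._values)


def allocate_mask(col):
    """Hypothetical consumer that reads size as an attribute, without calling it."""
    # With MethodColumn, col.size is a bound method rather than an int,
    # so the multiplication below raises TypeError.
    return [False] * col.size


mask = allocate_mask(PropertyColumn(["a", "b", "c"]))

try:
    allocate_mask(MethodColumn(["a", "b", "c"]))
    method_failed = False
except TypeError:
    method_failed = True
```

The producer and the consumer have to agree on whether `size` is called or read as an attribute; when they disagree, the consumer receives a bound method where it expects an integer, which is the kind of failure that would make a roundtrip error out.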
