wjones1 commented on pull request #6979:
URL: https://github.com/apache/arrow/pull/6979#issuecomment-619463693


   I found the cause of the test failure: If the `batch_size` isn't aligned 
with the `chunk_size`, categorical columns will fail with the error:
   ```
   pyarrow.lib.ArrowNotImplementedError: This class cannot yet iterate chunked arrays
   ```
   
   I think this means categorical (DictionaryArray) columns aren't supported 
by this method for now, unless you are able to align the `batch_size` with 
the `chunk_size`. 
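
To make "aligned" concrete, here is a minimal sketch in plain Python (no pyarrow calls). It assumes "aligned" means that no fixed-size batch spans more than one chunk, which is my reading of the failure rather than something confirmed in the reader code. The function name `batches_straddle_chunks` and its parameters are hypothetical, for illustration only.

```python
from itertools import accumulate


def batches_straddle_chunks(chunk_sizes, batch_size):
    """Return True if any fixed-size batch would span more than one chunk.

    Hypothetical helper: `chunk_sizes` is the number of rows in each chunk,
    in file order, and `batch_size` is the requested rows per batch.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Row offsets at which a new chunk starts (excluding 0 and the total).
    boundaries = list(accumulate(chunk_sizes))[:-1]
    # Batch k covers rows [k * batch_size, (k + 1) * batch_size). A chunk
    # boundary that is not a multiple of batch_size falls strictly inside
    # some batch, so that batch would have to draw rows from two chunks.
    return any(b % batch_size != 0 for b in boundaries)


# Uniform chunks of 100 rows: batches of 50 line up, batches of 30 do not.
print(batches_straddle_chunks([100, 100, 100], 50))
print(batches_straddle_chunks([100, 100, 100], 30))
# Variable chunk sizes can break alignment even for a "nice" batch size.
print(batches_straddle_chunks([100, 60, 100], 50))
```

Under this reading, a `batch_size` larger than the `chunk_size` is never aligned unless the file has a single chunk, and variable chunk sizes make it hard to choose any safe `batch_size` other than 1.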
   
   Is it possible or even common that `chunk_size` might be variable within a 
file?
   
   (The reason we were seeing the error in Python 3.5 and not in later Python 
versions is that I was selecting a subset of columns using indices, and the 
ordering of columns changed between Python versions, I think because of the 
change in dictionary ordering in 3.6+. I've instead changed the offending test 
to run on all columns.)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

