jorisvandenbossche commented on PR #13033:
URL: https://github.com/apache/arrow/pull/13033#issuecomment-1116180259

   To get this finalized:
   
   - `test_parquet_dataset_factory_fsspec` is failing on Windows 
(Appveyor). It seems that the dataset's file listing uses file paths 
relative to the root of the dataset folder (where the `_metadata` file 
lives). Reading those files then raises a FileNotFound error, since 
they are located in a temporary directory and the relative paths don't 
resolve. 
     However, when I test this locally, I correctly get absolute paths (also 
when using fsspec). The Windows tests were also passing earlier on this PR 
when using S3 instead of the local filesystem. 
     We are already normalizing the metadata path inside 
`ds.parquet_dataset(..)` (and inside `FileFromRowGroup` in C++, where we 
combine the path defined in the `_metadata` file with the root path), so my 
_assumption_ is that this is some issue with the fsspec filesystem on Windows. 
     So maybe we can skip this test on Windows for now, and open a follow-up 
JIRA to investigate it further (and potentially open an upstream issue)? In 
practice, users shouldn't run into this failure, since internally we translate 
a local fsspec filesystem to a native one.
   - The dask integration tests are still failing because of the `test_s3.py` 
tests I added (the moto server timed out). I verified locally that those are 
passing now, so this can probably also be handled as a follow-up.
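
For reference, the path handling described in the first item (combining a file path stored in `_metadata` with the dataset root) can be sketched roughly as below. This is only an illustration using the standard library, not the actual code in `ds.parquet_dataset(..)` or `FileFromRowGroup`; the helper name `resolve_data_file` and the example paths are hypothetical, and it assumes forward-slash paths as fsspec typically uses.

```python
import posixpath


def resolve_data_file(metadata_path: str, relative_path: str) -> str:
    """Hypothetical helper: resolve a data file path recorded in `_metadata`
    against the directory that contains the `_metadata` file."""
    # Assumption: normalize Windows separators so both inputs use "/".
    metadata_path = metadata_path.replace("\\", "/")
    relative_path = relative_path.replace("\\", "/")
    # The dataset root is the folder where the `_metadata` file lives.
    root = posixpath.dirname(metadata_path)
    # Join root and relative path, collapsing any redundant separators.
    return posixpath.normpath(posixpath.join(root, relative_path))


# Example with made-up paths: a relative entry becomes an absolute one.
resolved = resolve_data_file("C:/tmp/dataset/_metadata", "part=0/data.parquet")
```

If the listing instead returned only `part=0/data.parquet`, opening it would depend on the current working directory, which would match the FileNotFound error seen on Appveyor.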

