0x26res commented on issue #32067:
URL: https://github.com/apache/arrow/issues/32067#issuecomment-1415814636

   Now that this change is effective in 11.0, we get this warning when loading 
data with `use_legacy_dataset=True`.
   
   ```
   FutureWarning: Passing 'use_legacy_dataset=True' to get the legacy behaviour 
is deprecated as of pyarrow 11.0.0, and the legacy implementation will be 
removed in a future version.
   ```
   
   I'm in the process of migrating to `use_legacy_dataset=False`, but I was 
wondering what differences to expect between the two implementations. Is this 
documented somewhere?
   
   I have noticed one significant difference in behaviour. The legacy 
implementation would complain if the Parquet schemas are heterogeneous. The new 
implementation tries to convert all files to the schema of the first file it 
finds (or to the `schema` argument when provided).
   
   Are there other differences to expect?

