0x26res commented on issue #32067: URL: https://github.com/apache/arrow/issues/32067#issuecomment-1415814636
Now that this change is in effect as of 11.0, we get this warning when loading data with `use_legacy_dataset=True`:

```
FutureWarning: Passing 'use_legacy_dataset=True' to get the legacy behaviour is deprecated as of pyarrow 11.0.0, and the legacy implementation will be removed in a future version.
```

I'm in the process of migrating to `use_legacy_dataset=False`, and was wondering what differences to expect between the two implementations. Is this documented somewhere?

I have noticed one significant difference in behaviour. The legacy implementation would complain if the Parquet schemas were heterogeneous. The new implementation tries to convert all files to the schema of the first file it finds (or to the `schema` argument when provided).

Are there other differences to expect?
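For anyone doing the same migration, one possible way to track down the remaining `use_legacy_dataset=True` call sites is to record the `FutureWarning`, or escalate it to an error, with the standard `warnings` module. This is only a minimal sketch: `read_parquet_legacy_stub` is a hypothetical stand-in (not a pyarrow function) that emits the same message quoted above, so the snippet runs without pyarrow installed, and the helper names are made up for illustration.

```python
import warnings


def read_parquet_legacy_stub(path, use_legacy_dataset=False):
    """Hypothetical stand-in for a reader that deprecates use_legacy_dataset=True."""
    if use_legacy_dataset:
        warnings.warn(
            "Passing 'use_legacy_dataset=True' to get the legacy behaviour is "
            "deprecated as of pyarrow 11.0.0, and the legacy implementation "
            "will be removed in a future version.",
            FutureWarning,
            stacklevel=2,
        )
    return f"table read from {path}"


def find_legacy_calls(func, *args, **kwargs):
    """Call func and return (result, matching deprecation messages)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones already shown once in this process.
        warnings.simplefilter("always")
        result = func(*args, **kwargs)
    messages = [
        str(w.message)
        for w in caught
        if issubclass(w.category, FutureWarning)
        and "use_legacy_dataset" in str(w.message)
    ]
    return result, messages


def strict_call(func, *args, **kwargs):
    """Call func with the matching FutureWarning turned into an exception."""
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "error", message=".*use_legacy_dataset.*", category=FutureWarning
        )
        return func(*args, **kwargs)
```

Wrapping a real read call in `strict_call` inside a test suite should make any leftover legacy usage fail loudly instead of only logging a warning, assuming the library's message still contains the text `use_legacy_dataset`.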