[GitHub] [arrow] jorisvandenbossche commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-18 Thread GitBox
jorisvandenbossche commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-645976313 @bkietz did you already open some follow-up JIRAs? (eg for https://github.com/apache/arrow/pull/7156#discussion_r439503475) I will handle my comment at https://g

[GitHub] [arrow] jorisvandenbossche commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-18 Thread GitBox
jorisvandenbossche commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-645971102 The travis failure is an unrelated Flight failure.

[GitHub] [arrow] jorisvandenbossche commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-16 Thread GitBox
jorisvandenbossche commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-644966741 Note that it is not *only* for testing. We certainly use it for testing in pyarrow, but in pandas 1.0.4, we accidentally broke reading parquet files from file-like objec

[GitHub] [arrow] jorisvandenbossche commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-16 Thread GitBox
jorisvandenbossche commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-644966084 Taking a step back: wouldn't it be possible to, e.g., "just" allow creating a Fragment from a buffer instead of from a file? In practice, I think we only need to suppor

[GitHub] [arrow] jorisvandenbossche commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-06-02 Thread GitBox
jorisvandenbossche commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-637743541 Regarding my comments on `dataset.py`, since that file is in need of a general clean-up regarding "input" handling (the handling of single file path / directory path

[GitHub] [arrow] jorisvandenbossche commented on pull request #7156: ARROW-8074: [C++][Dataset][Python] FileFragments from buffers and NativeFiles

2020-05-14 Thread GitBox
jorisvandenbossche commented on pull request #7156: URL: https://github.com/apache/arrow/pull/7156#issuecomment-628657581 I am testing the other parquet tests that are also skipped, and that has already turned up one issue: https://issues.apache.org/jira/browse/ARROW-8799 With a few smal