[GitHub] [arrow] jorisvandenbossche edited a comment on pull request #8188: ARROW-9924: [C++][Dataset] Enable per-column parallelism for single ParquetFileFragment scans

2020-09-22 Thread GitBox


jorisvandenbossche edited a comment on pull request #8188:
URL: https://github.com/apache/arrow/pull/8188#issuecomment-696688370


   It seems the crashing test is:
   
   
https://github.com/apache/arrow/blob/40d64756dc3b2c51489b48362d0f04ee3e2a7388/python/pyarrow/tests/test_parquet.py#L3389-L3414
   
   It passes with `use_legacy_dataset=True`, but then crashes in the next 
test, which is normally the same test run with `use_legacy_dataset=False`.
   
   I don't immediately see how this would be related to the changes here, but 
maybe something in the decimal conversion code is not thread-safe? Then again, 
there are multiple columns, so the legacy code should also run in parallel 
...



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



