[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-06-04 Thread GitBox


wesm commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-639178716


   +1. I'm going to go ahead and merge this and then I'll rebase #7352 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-06-01 Thread GitBox


wesm commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-637180635


   I'll kick the tires on this re: performance. Is this ready to merge 
otherwise?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-05-01 Thread GitBox


wesm commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-622493938


   > I might lean towards macros around FMV for clang/GCC that could enable 
fallback to a slow version for MSVC
   
   FTR, it would seem unfortunate to do the work of SIMD-ifying code and then 
have it not be availabe on Windows (e.g. Python and R have a lot of Windows 
users). I'm not sure how much additional work is needed to get things to work 
on MSVC but it would be be good to have an OS-independent approach



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-05-01 Thread GitBox


wesm commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-622409707


   The 32-bit R failure seems like it could be real



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-05-01 Thread GitBox


wesm commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-622373323


   Take a look at how this is currently being handled in NumPy
   
   * https://numpy.org/neps/nep-0038-SIMD-optimizations.html
   * https://github.com/numpy/numpy/pull/13516
   
   I assume a lot of thought was put into this, and since it was done in 2019 
it should provide a good guideline for what we should do



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-05-01 Thread GitBox


wesm commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-622371728


   What I've seen other projects do (have to dig for some examples) is to have 
files like
   
   ```
   functionality_nosimd.cc
   functionality_sse42.cc
   functionality_avx2.cc
   ```
   
   The appropriate compiler flags are set on the respective compilation unit. 
   
   Then query the CPU at runtime to determine its features and choose which 
variant to invoke. @cyb70289 expressed an interest in this recently, I think



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-04-30 Thread GitBox


wesm commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-622155576


   I think this is fine to merge once most of the typos in the comments are 
fixed. A rebase will probably fix the Rust lint error



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-04-30 Thread GitBox


wesm commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-621993208


   Yep sorry thanks for the nudge, will look today 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-04-26 Thread GitBox


wesm commented on pull request #6985:
URL: https://github.com/apache/arrow/pull/6985#issuecomment-619653776


   I'll try to have a closer look tomorrow or Tuesday



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org