[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-639178716 +1. I'm going to go ahead and merge this and then I'll rebase #7352 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-637180635 I'll kick the tires on this re: performance. Is this ready to merge otherwise? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622493938 > I might lean towards macros around FMV for clang/GCC that could enable fallback to a slow version for MSVC FTR, it would seem unfortunate to do the work of SIMD-ifying code and then have it not be availabe on Windows (e.g. Python and R have a lot of Windows users). I'm not sure how much additional work is needed to get things to work on MSVC but it would be be good to have an OS-independent approach This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622409707 The 32-bit R failure seems like it could be real This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622373323 Take a look at how this is currently being handled in NumPy * https://numpy.org/neps/nep-0038-SIMD-optimizations.html * https://github.com/numpy/numpy/pull/13516 I assume a lot of thought was put into this, and since it was done in 2019 it should provide a good guideline for what we should do This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622371728 What I've seen other projects do (have to dig for some examples) is to have files like ``` functionality_nosimd.cc functionality_sse42.cc functionality_avx2.cc ``` The appropriate compiler flags are set on the respective compilation unit. Then query the CPU at runtime to determine its features and choose which variant to invoke. @cyb70289 expressed an interest in this recently, I think This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622155576 I think this is fine to merge once most of the typos in the comments are fixed. A rebase will probably fix the Rust lint error This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-621993208 Yep sorry thanks for the nudge, will look today This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-619653776 I'll try to have a closer look tomorrow or Tuesday This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org