samkumar opened a new issue, #37830:
URL: https://github.com/apache/arrow/issues/37830

   ### Describe the usage question you have. Please include as many useful details as possible.
   
   
   When I run a program that uses Apache Arrow under Valgrind (which I want 
for its profiling tools), the program crashes because Arrow executes AVX-512 
instructions that Valgrind does not support. Even when I add compiler flags to 
the build to disable AVX-512, the Arrow code still appears to call AVX-512 
intrinsics that trigger these errors. Here is sample output from Valgrind:
   
   ```
   vex amd64->IR: unhandled instruction bytes: 0x62 0xF2 0x7D 0x48 0x58 0x45 0xD3 0x48 0x8D 0x85
   vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
   vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
   vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
   ==76387== valgrind: Unrecognised instruction at address 0x683acec.
   ==76387==    at 0x683ACEC: _mm512_set1_epi32 (avx512fintrin.h:4130)
   ==76387==    by 0x683ACEC: xsimd::batch<unsigned int, xsimd::avx512bw> xsimd::kernel::broadcast<xsimd::avx512bw, unsigned int, void>(unsigned int, xsimd::avx512f const&) (xsimd_avx512f.hpp:629)
   ==76387==    by 0x683A8BF: xsimd::batch<unsigned int, xsimd::avx512bw>::batch(unsigned int) (xsimd_batch.hpp:437)
   ==76387==    by 0x682BDD8: arrow::internal::(anonymous namespace)::UnpackBits512<(arrow::internal::DispatchLevel)3>::unpack1_32(unsigned int const*, unsigned int*) (bpacking_simd512_generated.h:51)
   ==76387==    by 0x682B3C5: int arrow::internal::unpack32_specialized<arrow::internal::(anonymous namespace)::UnpackBits512<(arrow::internal::DispatchLevel)3> >(unsigned int const*, unsigned int*, int, int) (bpacking_simd_internal.h:35)
   ==76387==    by 0x682B296: arrow::internal::unpack32_avx512(unsigned int const*, unsigned int*, int, int) (bpacking_avx512.cc:26)
   ==76387==    by 0x672B7B0: arrow::internal::unpack32(unsigned int const*, unsigned int*, int, int) (bpacking.cc:176)
   ==76387==    by 0x772D540: int arrow::bit_util::BitReader::GetBatch<short>(int, short*, int) (bit_stream_utils.h:367)
   ==76387==    by 0x772D2B7: int arrow::util::RleDecoder::GetBatch<short>(short*, int) (rle_encoding.h:320)
   ==76387==    by 0x76E3888: parquet::LevelDecoder::Decode(int, short*) (column_reader.cc:183)
   ==76387==    by 0x771113E: parquet::(anonymous namespace)::ColumnReaderImplBase<parquet::PhysicalType<(parquet::Type::type)1> >::ReadDefinitionLevels(long, short*) (column_reader.cc:688)
   ==76387==    by 0x7704807: parquet::internal::(anonymous namespace)::TypedRecordReader<parquet::PhysicalType<(parquet::Type::type)1> >::ReadRecords(long) (column_reader.cc:1421)
   ==76387==    by 0x75FD8B2: parquet::arrow::(anonymous namespace)::LeafReader::LoadBatch(long) (reader.cc:493)
   ==76387== Your program just tried to execute an instruction that Valgrind
   ==76387== did not recognise.  There are two possible reasons for this.
   ==76387== 1. Your program has a bug and erroneously jumped to a non-code
   ==76387==    location.  If you are running Memcheck and you just saw a
   ==76387==    warning about a bad jump, it's probably your program's fault.
   ==76387== 2. The instruction is legitimate but Valgrind doesn't handle it,
   ==76387==    i.e. it's Valgrind's fault.  If you think this is the case or
   ==76387==    you are not sure, please let us know and we'll try to fix it.
   ==76387== Either way, Valgrind will now raise a SIGILL signal which will
   ==76387== probably kill your program.
   ==76387== 
   ==76387== Process terminating with default action of signal 4 (SIGILL)
   ==76387==  Illegal opcode at address 0x683ACEC
   ==76387==    at 0x683ACEC: _mm512_set1_epi32 (avx512fintrin.h:4130)
   ==76387==    by 0x683ACEC: xsimd::batch<unsigned int, xsimd::avx512bw> xsimd::kernel::broadcast<xsimd::avx512bw, unsigned int, void>(unsigned int, xsimd::avx512f const&) (xsimd_avx512f.hpp:629)
   ==76387==    by 0x683A8BF: xsimd::batch<unsigned int, xsimd::avx512bw>::batch(unsigned int) (xsimd_batch.hpp:437)
   ==76387==    by 0x682BDD8: arrow::internal::(anonymous namespace)::UnpackBits512<(arrow::internal::DispatchLevel)3>::unpack1_32(unsigned int const*, unsigned int*) (bpacking_simd512_generated.h:51)
   ==76387==    by 0x682B3C5: int arrow::internal::unpack32_specialized<arrow::internal::(anonymous namespace)::UnpackBits512<(arrow::internal::DispatchLevel)3> >(unsigned int const*, unsigned int*, int, int) (bpacking_simd_internal.h:35)
   ==76387==    by 0x682B296: arrow::internal::unpack32_avx512(unsigned int const*, unsigned int*, int, int) (bpacking_avx512.cc:26)
   ==76387==    by 0x672B7B0: arrow::internal::unpack32(unsigned int const*, unsigned int*, int, int) (bpacking.cc:176)
   ==76387==    by 0x772D540: int arrow::bit_util::BitReader::GetBatch<short>(int, short*, int) (bit_stream_utils.h:367)
   ==76387==    by 0x772D2B7: int arrow::util::RleDecoder::GetBatch<short>(short*, int) (rle_encoding.h:320)
   ==76387==    by 0x76E3888: parquet::LevelDecoder::Decode(int, short*) (column_reader.cc:183)
   ==76387==    by 0x771113E: parquet::(anonymous namespace)::ColumnReaderImplBase<parquet::PhysicalType<(parquet::Type::type)1> >::ReadDefinitionLevels(long, short*) (column_reader.cc:688)
   ==76387==    by 0x7704807: parquet::internal::(anonymous namespace)::TypedRecordReader<parquet::PhysicalType<(parquet::Type::type)1> >::ReadRecords(long) (column_reader.cc:1421)
   ==76387==    by 0x75FD8B2: parquet::arrow::(anonymous namespace)::LeafReader::LoadBatch(long) (reader.cc:493)
   ==76387== 
   ==76387== HEAP SUMMARY:
   ==76387==     in use at exit: 265,988 bytes in 1,932 blocks
   ==76387==   total heap usage: 3,598 allocs, 1,666 frees, 526,331 bytes allocated
   ==76387== 
   ==76387== LEAK SUMMARY:
   ==76387==    definitely lost: 0 bytes in 0 blocks
   ==76387==    indirectly lost: 0 bytes in 0 blocks
   ==76387==      possibly lost: 336 bytes in 1 blocks
   ==76387==    still reachable: 265,652 bytes in 1,931 blocks
   ==76387==                       of which reachable via heuristic:
   ==76387==                         multipleinheritance: 1,096 bytes in 24 blocks
   ==76387==         suppressed: 0 bytes in 0 blocks
   ==76387== Rerun with --leak-check=full to see details of leaked memory
   ==76387== 
   ==76387== For lists of detected and suppressed errors, rerun with: -s
   ==76387== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
   Illegal instruction (core dumped)
   ```
   
   Looking at the other issues here, it is clear that folks have indeed 
successfully run `valgrind` with Apache Arrow. Were these with a release build? 
If so, what am I missing?
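
For context on why build-wide compiler flags may not be enough: the trace shows `arrow::internal::unpack32` (bpacking.cc:176) calling `unpack32_avx512`, which suggests the AVX-512 kernel is compiled separately and chosen at run time from the CPU's reported features. Below is a minimal C++ sketch of that kind of dispatch with a user-supplied cap. It is not Arrow's code: `SimdLevel`, `ParseCap`, `EffectiveLevel`, and `KernelFor` are hypothetical names, and apart from `unpack32_avx512` the kernel names are placeholders. Arrow's documentation does list an `ARROW_USER_SIMD_LEVEL` environment variable for limiting the SIMD level selected at run time; whether it avoids this particular crash is not verified here.

```cpp
#include <algorithm>
#include <string>

// Hypothetical SIMD levels, ordered from least to most capable.
enum class SimdLevel { None = 0, Sse42 = 1, Avx2 = 2, Avx512 = 3 };

// Parse a user-supplied cap (for example, the value of an environment
// variable). A null or unrecognised value means "no cap".
SimdLevel ParseCap(const char* value) {
  if (value == nullptr) return SimdLevel::Avx512;
  const std::string v(value);
  if (v == "NONE") return SimdLevel::None;
  if (v == "SSE4_2") return SimdLevel::Sse42;
  if (v == "AVX2") return SimdLevel::Avx2;
  return SimdLevel::Avx512;
}

// The level actually used is the lower of what the CPU reports and what
// the user allows.
SimdLevel EffectiveLevel(SimdLevel detected, SimdLevel cap) {
  return std::min(detected, cap);
}

// Pick a kernel name for the given level. Only "unpack32_avx512" appears
// in the trace; the other two names are placeholders for illustration.
const char* KernelFor(SimdLevel level) {
  switch (level) {
    case SimdLevel::Avx512:
      return "unpack32_avx512";
    case SimdLevel::Avx2:
      return "unpack32_avx2";
    default:
      return "unpack32_default";
  }
}
```

In this model, a CPU that reports `SimdLevel::Avx512` combined with a cap parsed from `"AVX2"` resolves to the AVX2 placeholder kernel, so the AVX-512 path is never entered. The compiler flags used for the rest of the build play no part in that choice, which would match what I am seeing.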
   
   ### Component(s)
   
   C++

