jonkeane commented on pull request #11614: URL: https://github.com/apache/arrow/pull/11614#issuecomment-961482886
Turns out this exposes a sanitizer [issue in snappy](https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=15095&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=4006):

```
/tmp/RtmpR2qogf/file50379aa3da/snappy_ep-prefix/src/snappy_ep/snappy.cc:95:19: runtime error: load of misaligned address 0x60a0000052c1 for type 'const uint32', which requires 4 byte alignment
0x60a0000052c1: note: pointer points here
 00 00 13 00 00 00 00 00 00 35 40 66 66 66 66 66 66 35 40 9a 99 99 99 99 19 32 40 33 33 33 33 33
 ^
    #0 0x7f376be87331 in snappy::internal::CompressFragment(char const*, unsigned long, char*, unsigned short*, int) (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x30bc2331)
    #1 0x7f376be8bc82 in snappy::Compress(snappy::Source*, snappy::Sink*) (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x30bc6c82)
    #2 0x7f376be91439 in snappy::RawCompress(char const*, unsigned long, char*, unsigned long*) (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x30bcc439)
    #3 0x7f3760bc6e29 in arrow::util::internal::(anonymous namespace)::SnappyCodec::Compress(long, unsigned char const*, long, unsigned char*) (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x25901e29)
    #4 0x7f375c2b7176 in parquet::SerializedPageWriter::Compress(arrow::Buffer const&, arrow::ResizableBuffer*) (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x20ff2176)
    #5 0x7f375c356565 in parquet::SerializedPageWriter::WriteDictionaryPage(parquet::DictionaryPage const&) (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x21091565)
    #6 0x7f375c2a8560 in parquet::TypedColumnWriterImpl<parquet::PhysicalType<(parquet::Type::type)5> >::WriteDictionaryPage() (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x20fe3560)
    #7 0x7f375c25e021 in parquet::ColumnWriterImpl::Close() (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x20f99021)
    #8 0x7f375bce64ad in parquet::arrow::FileWriterImpl::WriteColumnChunk(std::shared_ptr<arrow::ChunkedArray> const&, long, long) (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x20a214ad)
    #9 0x7f375bcb6f52 in parquet::arrow::FileWriterImpl::WriteTable(arrow::Table const&, long) (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x209f1f52)
    #10 0x7f375b5f19f2 in arrow::dataset::ParquetFileWriter::Write(std::shared_ptr<arrow::RecordBatch> const&) (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x2032c9f2)
    #11 0x7f375b743040 in arrow::internal::FnOnce<void ()>::FnImpl<std::_Bind<arrow::detail::ContinueFuture (arrow::Future<unsigned long>, arrow::dataset::internal::(anonymous namespace)::DatasetWriterFileQueue::WriteNext()::{lambda()#1})> >::invoke() (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x2047e040)
    #12 0x7f37608b3ad0 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::{lambda()#1}> > >::_M_run() (/usr/local/RDsan/lib/R/site-library/arrow/libs/arrow.so+0x255eead0)
    #13 0x7f379537ade3 (/usr/lib/x86_64-linux-gnu/libstdc++.so.6+0xd6de3)
    #14 0x7f3795e12608 in start_thread /build/glibc-eX1tMB/glibc-2.31/nptl/pthread_create.c:477
    #15 0x7f3795d37292 in __clone (/usr/lib/x86_64-linux-gnu/libc.so.6+0x122292)
```

https://dev.azure.com/ursacomputing/crossbow/_build/latest?definitionId=1&branchName=actions-1093-azure-test-ubuntu-18.04-r-sanitizer
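
For context on what the sanitizer is complaining about: the report says a `const uint32` was read from an address that is not a multiple of 4 (`0x60a0000052c1` is odd). A minimal sketch of the usual way to do such a read without tripping the alignment check is to copy the bytes with `memcpy` rather than dereferencing a cast pointer. `LoadU32Unaligned` is a hypothetical name for illustration only; it is not taken from snappy or Arrow, and I have not checked how snappy performs the load at `snappy.cc:95`.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not snappy's code: read 32 bits from an address that
// may not be 4-byte aligned. Dereferencing
//   *reinterpret_cast<const std::uint32_t*>(p)
// on such an address is the kind of load UBSan flags; memcpy has no
// alignment requirement on its source.
inline std::uint32_t LoadU32Unaligned(const void* p) {
  std::uint32_t value;
  std::memcpy(&value, p, sizeof(value));
  return value;
}
```

Calling `LoadU32Unaligned(buf + 1)` on a byte buffer returns the same bits a direct load would on platforms that tolerate misaligned access, so the change is about satisfying the language rules (and the sanitizer), not about changing results.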