[ 
https://issues.apache.org/jira/browse/ARROW-13812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407629#comment-17407629
 ] 

David Li commented on ARROW-13812:
----------------------------------

So what's going on here took me a while to unwind, but:
 * We're generating the array of unique values for a bool.
 * Ultimately this happens in BitUtil::bytes_to_bits. This reads 8 bytes and 
packs them into one byte.
 * The buffer it reads is generated by EncoderInteger::Decode. This writes data 
into a buffer given to it.
 * The buffer given to it ultimately stems from a TempVectorStack, whose memory 
is not initialized.

In this test case, we're generating an array of 3 bools. So from Valgrind's 
perspective, we're taking 3 initialized bytes and 5 garbage bytes and packing 
them into one garbage byte. I think the solution should be to always round up 
to 8 bytes.
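
To make the failure mode concrete, here is a minimal sketch of the packing step 
and of one possible reading of "round up to 8 bytes": zero-fill the byte buffer 
up to the next multiple of 8 before packing. This is not the actual 
BitUtil::bytes_to_bits or TempVectorStack code; PackBytesToBits, 
RoundUpToMultipleOf8 and PadToMultipleOf8 are hypothetical names used only for 
illustration, and the least-significant-bit-first order is an assumption of the 
sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the packing step (NOT Arrow's implementation):
// takes one byte per boolean and packs each group of 8 bytes into one byte,
// least-significant bit first. Requiring a multiple of 8 guarantees that
// every byte read here belongs to the buffer.
std::vector<uint8_t> PackBytesToBits(const std::vector<uint8_t>& bytes) {
  assert(bytes.size() % 8 == 0);
  std::vector<uint8_t> bits(bytes.size() / 8, 0);
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    if (bytes[i] != 0) {
      bits[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return bits;
}

// Rounds n up to the next multiple of 8 (0 stays 0, 3 becomes 8, 9 becomes 16).
std::size_t RoundUpToMultipleOf8(std::size_t n) { return (n + 7) / 8 * 8; }

// Hypothetical helper: copies `length` values into a zero-filled buffer whose
// size is rounded up to a multiple of 8, so the padding bytes are initialized
// before they reach the packing step.
std::vector<uint8_t> PadToMultipleOf8(const uint8_t* values, std::size_t length) {
  std::vector<uint8_t> padded(RoundUpToMultipleOf8(length), 0);
  for (std::size_t i = 0; i < length; ++i) {
    padded[i] = values[i];
  }
  return padded;
}
```

With 3 bools, PadToMultipleOf8 yields 8 defined bytes (3 values plus 5 zeros), 
so the single packed byte no longer depends on uninitialized memory.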

> [C++] Valgrind failure in Grouper.BooleanKey (uninitialized values)
> -------------------------------------------------------------------
>
>                 Key: ARROW-13812
>                 URL: https://issues.apache.org/jira/browse/ARROW-13812
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: David Li
>            Assignee: David Li
>            Priority: Major
>              Labels: query-engine
>             Fix For: 6.0.0
>
>
> From the [nightlies|https://dev.azure.com/ursacomputing/crossbow/_build/results?buildId=10785&view=logs&j=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb&t=d9b15392-e4ce-5e4c-0c8c-b69645229181&l=4978].
> {noformat}
> [ RUN      ] Grouper.BooleanKey
> ==11849== Conditional jump or move depends on uninitialised value(s)
> ==11849==    at 0x4122555: arrow::TestInitialized(arrow::ArrayData const&) (gtest_util.cc:675)
> ==11849==    by 0x431604: arrow::compute::(anonymous namespace)::ValidateOutput(arrow::ArrayData const&) (test_util.cc:202)
> ==11849==    by 0x431F94: arrow::compute::ValidateOutput(arrow::Datum const&) (test_util.cc:235)
> ==11849==    by 0x40010B: arrow::compute::TestGrouper::ValidateConsume(arrow::compute::ExecBatch const&, arrow::Datum const&) (hash_aggregate_test.cc:380)
> ==11849==    by 0x400C03: arrow::compute::TestGrouper::ConsumeAndValidate(arrow::compute::ExecBatch const&, arrow::Datum*) (hash_aggregate_test.cc:364)
> ==11849==    by 0x410BD5: ExpectConsume (hash_aggregate_test.cc:318)
> ==11849==    by 0x410BD5: arrow::compute::TestGrouper::ExpectConsume(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (hash_aggregate_test.cc:307)
> ==11849==    by 0x410F74: arrow::compute::Grouper_BooleanKey_Test::TestBody() (hash_aggregate_test.cc:415)
> ==11849==    by 0x5D7398D: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (in /opt/conda/envs/arrow/lib/libgtest.so)
> ==11849==    by 0x5D73BE0: testing::Test::Run() (in /opt/conda/envs/arrow/lib/libgtest.so)
> ==11849==    by 0x5D73F0E: testing::TestInfo::Run() (in /opt/conda/envs/arrow/lib/libgtest.so)
> ==11849==    by 0x5D74035: testing::TestSuite::Run() (in /opt/conda/envs/arrow/lib/libgtest.so)
> ==11849==    by 0x5D745EB: testing::internal::UnitTestImpl::RunAllTests() (in /opt/conda/envs/arrow/lib/libgtest.so)
> ==11849==    by 0x5D74858: testing::UnitTest::Run() (in /opt/conda/envs/arrow/lib/libgtest.so)
> ==11849==    by 0x421207E: main (in /opt/conda/envs/arrow/lib/libgtest_main.so)
> ==11849== 
> {
>    <insert_a_suppression_name_here>
>    Memcheck:Cond
>    fun:_ZN5arrow15TestInitializedERKNS_9ArrayDataE
>    fun:_ZN5arrow7compute12_GLOBAL__N_114ValidateOutputERKNS_9ArrayDataE
>    fun:_ZN5arrow7compute14ValidateOutputERKNS_5DatumE
>    fun:_ZN5arrow7compute11TestGrouper15ValidateConsumeERKNS0_9ExecBatchERKNS_5DatumE
>    fun:_ZN5arrow7compute11TestGrouper18ConsumeAndValidateERKNS0_9ExecBatchEPNS_5DatumE
>    fun:ExpectConsume
>    fun:_ZN5arrow7compute11TestGrouper13ExpectConsumeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES9_
>    fun:_ZN5arrow7compute23Grouper_BooleanKey_Test8TestBodyEv
>    fun:_ZN7testing8internal35HandleExceptionsInMethodIfSupportedINS_4TestEvEET0_PT_MS4_FS3_vEPKc
>    fun:_ZN7testing4Test3RunEv
>    fun:_ZN7testing8TestInfo3RunEv
>    fun:_ZN7testing9TestSuite3RunEv
>    fun:_ZN7testing8internal12UnitTestImpl11RunAllTestsEv
>    fun:_ZN7testing8UnitTest3RunEv
>    fun:main
> }
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
