[ https://issues.apache.org/jira/browse/PARQUET-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15552702#comment-15552702 ]
Florian Scheibner commented on PARQUET-739:
-------------------------------------------
Valgrind found this; it seems two decoders are unpacking into the same buffer:
==56675== Possible data race during write of size 4 at 0x1EB3AAC by thread #10
==56675== Locks held: none
==56675== at 0x172E420: parquet::unpack14_32(unsigned int const*, unsigned
int*) (bpacking.h:1103)
==56675== by 0x1731386: parquet::unpack32(unsigned int const*, unsigned
int*, int, int) (bpacking.h:3242)
==56675== by 0x173333F: int parquet::BitReader::GetBatch<int>(int, int*,
int) (bit-stream-utils.inline.h:147)
==56675== by 0x1734ADF: int
parquet::RleDecoder::GetBatchWithDict<parquet::ByteArray>(parquet::Vector<parquet::ByteArray>
const&, parquet::ByteArray*, int) (rle-encoding.h:366)
==56675== by 0x1734C48:
parquet::DictionaryDecoder<parquet::DataType<(parquet::Type::type)6>
>::Decode(parquet::ByteArray*, int) (dictionary-encoding.h:67)
==56675== by 0x1731972:
parquet::TypedColumnReader<parquet::DataType<(parquet::Type::type)6>
>::ReadValues(long, parquet::ByteArray*) (reader.h:159)
==56675== by 0x173414B:
parquet::TypedColumnReader<parquet::DataType<(parquet::Type::type)6>
>::ReadBatch(int, short*, short*, parquet::ByteArray*, long*) (reader.h:202)
==56675==
==56675== This conflicts with a previous write of size 4 by thread #9
==56675== Locks held: none
==56675== at 0x172E522: parquet::unpack14_32(unsigned int const*, unsigned
int*) (bpacking.h:1146)
==56675== by 0x1731386: parquet::unpack32(unsigned int const*, unsigned
int*, int, int) (bpacking.h:3242)
==56675== by 0x173333F: int parquet::BitReader::GetBatch<int>(int, int*,
int) (bit-stream-utils.inline.h:147)
==56675== by 0x1734ADF: int
parquet::RleDecoder::GetBatchWithDict<parquet::ByteArray>(parquet::Vector<parquet::ByteArray>
const&, parquet::ByteArray*, int) (rle-encoding.h:366)
==56675== by 0x1734C48:
parquet::DictionaryDecoder<parquet::DataType<(parquet::Type::type)6>
>::Decode(parquet::ByteArray*, int) (dictionary-encoding.h:67)
==56675== by 0x1731972:
parquet::TypedColumnReader<parquet::DataType<(parquet::Type::type)6>
>::ReadValues(long, parquet::ByteArray*) (reader.h:159)
==56675== by 0x173414B:
parquet::TypedColumnReader<parquet::DataType<(parquet::Type::type)6>
>::ReadBatch(int, short*, short*, parquet::ByteArray*, long*) (reader.h:202)
==56675==
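
If the report above is right and two decoders really do unpack into one shared buffer, the usual way out is to give each decoder its own scratch space. The following is only a minimal sketch of that idea. ToyDecoder, its 16-bit packing, and DecodeTwoStreamsInParallel are hypothetical names made up for illustration and are not parquet-cpp's API or its actual fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dictionary-index decoder (not parquet-cpp code).
// Each instance owns its scratch buffer, so two instances never write to the
// same memory even when they run on different threads.
class ToyDecoder {
 public:
  explicit ToyDecoder(std::vector<uint32_t> packed)
      : packed_(std::move(packed)), scratch_(2 * packed_.size()) {}

  // Unpacks two 16-bit dictionary indices per 32-bit word into the
  // per-instance scratch buffer, then looks each index up in `dict`.
  std::vector<std::string> Decode(const std::vector<std::string>& dict) {
    for (std::size_t i = 0; i < packed_.size(); ++i) {
      scratch_[2 * i] = packed_[i] & 0xFFFFu;  // low half
      scratch_[2 * i + 1] = packed_[i] >> 16;  // high half
    }
    std::vector<std::string> out;
    out.reserve(scratch_.size());
    for (uint32_t index : scratch_) {
      assert(index < dict.size());  // a garbage index would trip this
      out.push_back(dict[index]);
    }
    return out;
  }

 private:
  std::vector<uint32_t> packed_;   // declared first: scratch_ is sized from it
  std::vector<uint32_t> scratch_;  // owned by this decoder only
};

// Decodes two independent streams on two threads and checks both results.
bool DecodeTwoStreamsInParallel() {
  const std::vector<std::string> dict = {"a", "b", "c"};
  ToyDecoder first(std::vector<uint32_t>{0x00010000u, 0x00020001u});
  ToyDecoder second(std::vector<uint32_t>{0x00020002u, 0x00000000u});

  std::vector<std::string> first_out, second_out;
  std::thread t1([&] { first_out = first.Decode(dict); });
  std::thread t2([&] { second_out = second.Decode(dict); });
  t1.join();
  t2.join();

  return first_out == std::vector<std::string>{"a", "b", "b", "c"} &&
         second_out == std::vector<std::string>{"c", "c", "a", "a"};
}
```

The dictionary is only read concurrently, which is safe. The only writes go to each decoder's own scratch_ and to its own output vector, so a race detector should have nothing to flag in this sketch.
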
> Read after free with uncompressed page
> --------------------------------------
>
> Key: PARQUET-739
> URL: https://issues.apache.org/jira/browse/PARQUET-739
> Project: Parquet
> Issue Type: Bug
> Components: parquet-cpp
> Reporter: Florian Scheibner
> Assignee: Florian Scheibner
>
> Reading two Parquet files in parallel led to memory corruption that caused
> a crash. The columns are RLE dictionary-encoded strings in an uncompressed
> page, created with parquet-mr. AddressSanitizer (-fsanitize=address) traced
> the issue to a use-after-free:
> {code}
> =================================================================
> ==81678==ERROR: AddressSanitizer: heap-use-after-free on address
> 0x6060001088c0 at pc 0x000003dbd42b bp 0x7fffe30fbe00 sp 0x7fffe30fbdf8
> READ of size 16 at 0x6060001088c0 thread T8
> #0 0x3dbd42a in int
> parquet::RleDecoder::GetBatchWithDict<parquet::ByteArray>(parquet::Vector<parquet::ByteArray>
> const&, parquet::ByteArray*, int)
> (/home/fscheibner/Snowflake/ExecPlatform/bin/snowflake+0x3dbd42a)
> #1 0x3db8efa in
> parquet::DictionaryDecoder<parquet::DataType<(parquet::Type::type)6>
> >::Decode(parquet::ByteArray*, int)
> (/home/fscheibner/Snowflake/ExecPlatform/bin/snowflake+0x3db8efa)
> #2 0x3d84767 in
> parquet::TypedColumnReader<parquet::DataType<(parquet::Type::type)6>
> >::ReadValues(long, parquet::ByteArray*)
> (/home/fscheibner/Snowflake/ExecPlatform/bin/snowflake+0x3d84767)
> #3 0x3d83497 in
> parquet::TypedColumnReader<parquet::DataType<(parquet::Type::type)6>
> >::ReadBatch(int, short*, short*, parquet::ByteArray*, long*)
> (/home/fscheibner/Snowflake/ExecPlatform/bin/snowflake+0x3d83497)
> {code}
> Initial debugging showed that the dictionary indices returned by the RLE
> decoder are garbage, so the data page appears to have been corrupted in
> memory. Reading the files in a single thread works.
> I have a ColumnReader for each column and read one element from each column
> to get a complete row.
> My guess is that some data buffer is freed and then still used later for
> reading. I haven't tracked down the source yet. Any ideas, [~wesmckinn]?
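
The guess in the description, that a buffer is freed while a decoder still reads from it, is commonly guarded against by having the decoder share ownership of the page bytes instead of holding a bare pointer. Below is a minimal sketch of that pattern only. PageBuffer, ToyPageDecoder, and SumAfterReaderReleasesPage are hypothetical names for illustration, and nothing here is taken from parquet-cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical page payload type (not parquet-cpp's buffer class).
using PageBuffer = std::vector<uint8_t>;

// Hypothetical decoder that co-owns the page it reads from. Because it holds
// a shared_ptr, the bytes stay alive until the decoder itself lets go, even
// if whoever produced the page has already dropped its handle.
class ToyPageDecoder {
 public:
  void SetData(std::shared_ptr<const PageBuffer> page) {
    assert(page != nullptr);
    page_ = std::move(page);
    pos_ = 0;
  }

  // Writes the next byte to *out; returns false once the page is exhausted.
  bool Next(uint8_t* out) {
    if (!page_ || pos_ >= page_->size()) return false;
    *out = (*page_)[pos_++];
    return true;
  }

 private:
  std::shared_ptr<const PageBuffer> page_;
  std::size_t pos_ = 0;
};

// Simulates a reader that hands a page to the decoder and then releases its
// own reference before the decoder has consumed the data.
int SumAfterReaderReleasesPage() {
  ToyPageDecoder decoder;
  {
    std::shared_ptr<const PageBuffer> page =
        std::make_shared<PageBuffer>(PageBuffer{1, 2, 3});
    decoder.SetData(page);
  }  // the reader's handle goes out of scope here

  int sum = 0;
  uint8_t value = 0;
  while (decoder.Next(&value)) sum += value;
  return sum;
}
```

With a raw pointer in place of the shared_ptr, the loop would read freed memory once the inner scope ends, which is the kind of access AddressSanitizer reports as heap-use-after-free.
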