This is an automated email from the ASF dual-hosted git repository.
mdeepak pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-cpp.git
The following commit(s) were added to refs/heads/master by this push:
new d9c262a PARQUET-1333: [C++] Reading of files with dictionary size 0
fails on Windows with bad_alloc
d9c262a is described below
commit d9c262a00f512699b64472cf58ecff7642853efc
Author: Philipp Hoch <[email protected]>
AuthorDate: Thu Jun 28 11:27:39 2018 -0400
PARQUET-1333: [C++] Reading of files with dictionary size 0 fails on
Windows with bad_alloc
A call with size 0 ends up in Arrow's memory_pool,
https://github.com/apache/arrow/blob/884474ca5ca1b8da55c0b23eb7cb784c2cd9bdb4/cpp/src/arrow/memory_pool.cc#L50,
where the corresponding allocation fails. See the _aligned_malloc documentation,
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/aligned-malloc.
This happens only on Windows, as posix_memalign appears to handle a size of 0
on Unix-like environments.
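The guard described above can be sketched in isolation. This is a minimal
illustration, not the parquet-cpp or Arrow code: AllocateAligned, FreeAligned,
and PackValues are hypothetical names, and the 64-byte alignment is an
assumption. The idea is to skip the platform allocator entirely when the total
size is 0, so the result no longer depends on how _aligned_malloc or
posix_memalign treats a zero-size request.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

#ifdef _WIN32
#include <malloc.h>
#endif

// Hypothetical helper (not Arrow's API): never forwards a zero-size request
// to the platform allocator. Returns nullptr for size 0 or on failure.
inline uint8_t* AllocateAligned(std::size_t size, std::size_t alignment = 64) {
  if (size == 0) return nullptr;
#ifdef _WIN32
  return static_cast<uint8_t*>(_aligned_malloc(size, alignment));
#else
  void* out = nullptr;
  if (posix_memalign(&out, alignment, size) != 0) return nullptr;
  return static_cast<uint8_t*>(out);
#endif
}

// Releases memory obtained from AllocateAligned; nullptr is accepted.
inline void FreeAligned(uint8_t* ptr) {
#ifdef _WIN32
  _aligned_free(ptr);
#else
  std::free(ptr);
#endif
}

// Hypothetical analogue of the SetDict loop: packs all values into one
// contiguous block. Returns false only if a non-empty allocation fails.
inline bool PackValues(const std::vector<std::string>& values, uint8_t** out,
                       std::size_t* total) {
  assert(out != nullptr && total != nullptr);
  std::size_t total_size = 0;
  for (const auto& v : values) total_size += v.size();

  *total = total_size;
  *out = AllocateAligned(total_size);
  if (total_size == 0) return true;  // empty dictionary: nothing to copy
  if (*out == nullptr) return false;

  std::size_t offset = 0;
  for (const auto& v : values) {
    if (v.empty()) continue;
    std::memcpy(*out + offset, v.data(), v.size());
    offset += v.size();
  }
  return true;
}
```

With this shape, an empty dictionary (or one holding only empty values)
produces a null data pointer and a total of 0 on every platform, and callers
only need to call FreeAligned on whatever pointer they received.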
Author: Philipp Hoch <[email protected]>
Closes #472 from philhoch/bugfix-cover-empty-dicitionary-size-on-windows
and squashes the following commits:
0be10bc [Philipp Hoch] account for total_size being 0, as _aligned_malloc
with size 0 raises an error on Windows
---
src/parquet/encoding-internal.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/parquet/encoding-internal.h b/src/parquet/encoding-internal.h
index 894410f..e22edd0 100644
--- a/src/parquet/encoding-internal.h
+++ b/src/parquet/encoding-internal.h
@@ -398,9 +398,11 @@ inline void DictionaryDecoder<ByteArrayType>::SetDict(
for (int i = 0; i < num_dictionary_values; ++i) {
total_size += dictionary_[i].len;
}
- PARQUET_THROW_NOT_OK(byte_array_data_->Resize(total_size, false));
- int offset = 0;
+ if (total_size > 0) {
+ PARQUET_THROW_NOT_OK(byte_array_data_->Resize(total_size, false));
+ }
+ int offset = 0;
uint8_t* bytes_data = byte_array_data_->mutable_data();
for (int i = 0; i < num_dictionary_values; ++i) {
memcpy(bytes_data + offset, dictionary_[i].ptr, dictionary_[i].len);