[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x
Key: ARROW-6417
URL: https://issues.apache.org/jira/browse/ARROW-6417
Project: Apache Arrow
Issue Type: Improvement
Components: C++, Python
Reporter: Wes McKinney
Priority: Major
Labels: pull-request-available
Attachments: 20190903_parquet_benchmark.py, 20190903_parquet_read_perf.png

Description: In doing some benchmarking, I have found that binary reads seem to be slower from Arrow 0.11.1 to the master branch. It would be a good idea to do some basic profiling to see where we might improve our memory allocation strategy (or whatever the bottleneck turns out to be).

[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923778#comment-16923778 ] Wes McKinney commented on ARROW-6417:

After reverting the jemalloc version, the benchmarks show that master is faster than v0.12.1, which is certainly what I was *hoping for* after all the refactoring work we have put in over the last few months. So the SafeLoadAs issue is no longer a concern.
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923639#comment-16923639 ] Antoine Pitrou commented on ARROW-6417:

FTR, similar issues with jemalloc seem to have happened in the past; I wonder if it's a regression:

https://github.com/jemalloc/jemalloc/issues/335
https://github.com/jemalloc/jemalloc/issues/126
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923630#comment-16923630 ] Antoine Pitrou commented on ARROW-6417:

Wow, that's massive. Re {{SafeLoadAs}}, I'm with Micah: it shouldn't make a difference on an x86 CPU with a decent compiler. Worth checking anyway.
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923612#comment-16923612 ] Wes McKinney commented on ARROW-6417:

I opened an issue with jemalloc to see if we're doing something wrong: https://github.com/jemalloc/jemalloc/issues/1621
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923607#comment-16923607 ] Wes McKinney commented on ARROW-6417:

The benchmark results in arrow-builder-benchmark are pretty damning...

master

{code}
---------------------------------------------------------------------------------------
Benchmark                                    Time           CPU  Iterations
---------------------------------------------------------------------------------------
BufferBuilderTinyWrites/real_time     264925407 ns  264925189 ns           2   966.31MB/s
BufferBuilderSmallWrites/real_time    178721490 ns  178720664 ns           4  1.39882GB/s
BufferBuilderLargeWrites/real_time    192722520 ns  192720335 ns           4  1.29027GB/s
BuildBooleanArrayNoNulls               61622618 ns   61620052 ns          11  4.05712GB/s
BuildIntArrayNoNulls                  159926782 ns  159919611 ns           4  1.56329GB/s
BuildAdaptiveIntNoNulls                34141484 ns   34141072 ns          20  7.32256GB/s
BuildAdaptiveIntNoNullsScalarAppend   118671966 ns  118669726 ns           6  2.10669GB/s
BuildBinaryArray                      646172067 ns  646165509 ns           1  396.183MB/s
BuildChunkedBinaryArray               629538527 ns  629517882 ns           1   406.66MB/s
BuildFixedSizeBinaryArray             319843478 ns  319421997 ns           2  801.448MB/s
BuildDecimalArray                     613258571 ns  613249404 ns           1  834.897MB/s
BuildInt64DictionaryArrayRandom       265489567 ns  265479003 ns           3  964.295MB/s
BuildInt64DictionaryArraySequential   256461735 ns  256454103 ns           3  998.229MB/s
BuildInt64DictionaryArraySimilar      436497455 ns  436496161 ns           2  586.489MB/s
BuildStringDictionaryArray            737468427 ns  737429710 ns           1  463.142MB/s
ArrayDataConstructDestruct                38895 ns      38895 ns       18067
{code}

master with the older jemalloc

{code}
---------------------------------------------------------------------------------------
Benchmark                                    Time           CPU  Iterations
---------------------------------------------------------------------------------------
BufferBuilderTinyWrites/real_time     139816022 ns  139814056 ns           5  1.78806GB/s
BufferBuilderSmallWrites/real_time     35215592 ns   35214766 ns          19  7.09912GB/s
BufferBuilderLargeWrites/real_time     32460612 ns   32456001 ns          21  7.66046GB/s
BuildBooleanArrayNoNulls               33690068 ns   33688611 ns          21  7.42091GB/s
BuildIntArrayNoNulls                   49988970 ns   49987507 ns          14  5.00125GB/s
BuildAdaptiveIntNoNulls                23878665 ns   23876703 ns          29  10.4705GB/s
BuildAdaptiveIntNoNullsScalarAppend   116140426 ns  116137665 ns           6  2.15262GB/s
BuildBinaryArray                      593711307 ns  593699295 ns           1  431.195MB/s
BuildChunkedBinaryArray               538185849 ns  538185876 ns           1  475.672MB/s
BuildFixedSizeBinaryArray             218638403 ns  218631191 ns           3  1.14348GB/s
BuildDecimalArray                     294477232 ns  294474155 ns           2  1.69794GB/s
BuildInt64DictionaryArrayRandom       248790745 ns  248788395 ns           3  1028.99MB/s
BuildInt64DictionaryArraySequential   238954386 ns  238949356 ns           3  1071.36MB/s
BuildInt64DictionaryArraySimilar      422484600 ns  422471016 ns           2  605.959MB/s
BuildStringDictionaryArray            716507144 ns  716487471 ns           1   476.68MB/s
ArrayDataConstructDestruct                38406 ns      38406 ns       18229
{code}

So it seems that performance in realloc-heavy workloads is degraded.
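For readers unfamiliar with those benchmarks, the sketch below shows the kind of realloc-heavy tiny-write loop that {{BufferBuilderTinyWrites}} exercises. It is not the actual arrow-builder-benchmark source; it only assumes the {{arrow::BufferBuilder}} Append/Finish API reachable through {{arrow/api.h}}. Every small append can eventually force the underlying buffer to grow through the memory pool, i.e. through the allocator's realloc path.

{code}
// Minimal sketch of a realloc-heavy "tiny writes" loop; not the actual
// arrow-builder-benchmark source. Assumes the arrow::BufferBuilder
// Append/Finish API available via arrow/api.h.
#include <arrow/api.h>

#include <chrono>
#include <cstdint>
#include <iostream>

int main() {
  constexpr int64_t kNumValues = 32 * 1024 * 1024;  // 256 MiB of int64 appends
  arrow::BufferBuilder builder;  // backed by the default memory pool (jemalloc in these builds)

  auto start = std::chrono::steady_clock::now();
  for (int64_t i = 0; i < kNumValues; ++i) {
    // Each tiny append performs a capacity check and, when the buffer is
    // full, a Reallocate call into the active allocator.
    if (!builder.Append(&i, sizeof(i)).ok()) return 1;
  }
  std::shared_ptr<arrow::Buffer> buffer;
  if (!builder.Finish(&buffer).ok()) return 1;

  std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
  std::cout << "appended " << buffer->size() << " bytes in " << elapsed.count()
            << " s" << std::endl;
  return 0;
}
{code}

Built against the two jemalloc versions, the same loop goes through two different realloc implementations, which appears to be where the gap in the numbers above comes from.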
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923566#comment-16923566 ] Wes McKinney commented on ARROW-6417:

OK, it appears that the jemalloc version is causing the perf difference.

Current master branch with the vendored jemalloc version (4.something with patches):

{code}
$ python 20190903_parquet_benchmark.py dense-random 10
({'case': 'read-dense-random-single-thread'}, 0.6065331888198853)
{code}

master with jemalloc 5.2.0:

{code}
$ python 20190903_parquet_benchmark.py dense-random 10
({'case': 'read-dense-random-single-thread'}, 1.2143790817260742)
{code}

To reproduce these results yourself:

* Get the old jemalloc tarball from here: https://github.com/apache/arrow/tree/maint-0.12.x/cpp/thirdparty/jemalloc
* Set {{$ARROW_JEMALLOC_URL}} to the path of that before building
* Use this branch, which has the old EP configuration: https://github.com/wesm/arrow/tree/use-old-jemalloc

Here's the benchmark script that I'm running above: https://gist.github.com/wesm/7e5ae1d41981cfdd20415faf71e5f57e

I'm interested in whether other benchmarks are affected or if this is a peculiarity of this particular benchmark.
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923534#comment-16923534 ] Micah Kornfield commented on ARROW-6417:

For SafeLoadAs, you could try changing the implementation to dereference instead of memcpy, which should be equivalent to the old code (assuming it is getting inlined correctly). IIRC, we saw very comparable numbers for the existing Parquet benchmarks when I made those changes.
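To make the two variants concrete, here is a standalone sketch of the pattern under discussion. It is not the actual {{arrow::util::SafeLoadAs}} definition, just the memcpy-based load next to the raw-dereference load it would be swapped for.

{code}
// Standalone sketch of the two load strategies under discussion; not the
// actual arrow::util::SafeLoadAs implementation.
#include <cstdint>
#include <cstring>

// memcpy-based load: well-defined for unaligned reads and strict aliasing;
// with a decent optimizer on x86 it should still compile to a single mov.
template <typename T>
T LoadViaMemcpy(const uint8_t* unaligned) {
  T value;
  std::memcpy(&value, unaligned, sizeof(T));
  return value;
}

// Direct dereference, roughly what the older code did: usually identical
// machine code once inlined, but it relies on the compiler tolerating the
// type pun and on the pointer being suitably aligned.
template <typename T>
T LoadViaDereference(const uint8_t* data) {
  return *reinterpret_cast<const T*>(data);
}

int main() {
  uint8_t bytes[8] = {42, 0, 0, 0, 0, 0, 0, 0};
  return LoadViaMemcpy<uint32_t>(bytes) == LoadViaDereference<uint32_t>(bytes) ? 0 : 1;
}
{code}

As the newer comments above note, this turned out not to be the bottleneck once the jemalloc change was reverted.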
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923474#comment-16923474 ] Wes McKinney commented on ARROW-6417:

I will try that next. I'm going to merge my current patch in the meantime and leave this JIRA open.
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923360#comment-16923360 ] Antoine Pitrou commented on ARROW-6417:

Have you tried to measure the same jemalloc version for the two Arrow versions (or, conversely, the two jemalloc versions for the same Arrow version)?
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922791#comment-16922791 ] Wes McKinney commented on ARROW-6417:

Further down the rabbit hole.

0.12.1 perf profile:

{code}
- parquet::arrow::FileReader::Impl::ReadSchemaField
   - 66.24% parquet::arrow::ColumnReader::NextBatch
      - parquet::arrow::PrimitiveImpl::NextBatch
         - 66.23% parquet::internal::RecordReader::ReadRecords
            - 41.51% parquet::internal::TypedRecordReader<...>::ReadRecordData
               - 38.62% parquet::internal::TypedRecordReader<...>::ReadValuesSpaced
                  - 26.97% arrow::internal::ChunkedBinaryBuilder::Append
                     - 24.06% arrow::BinaryBuilder::Append
                        + 12.78% arrow::BufferBuilder::Append
                          1.99% arrow::ArrayBuilder::Reserve
                          1.16% arrow::BufferBuilder::Append@plt
                          0.52% arrow::ArrayBuilder::Reserve@plt
                       0.57% arrow::BinaryBuilder::Append@plt
                  + 8.34% parquet::Decoder<...>::DecodeSpaced
                    0.53% arrow::internal::ChunkedBinaryBuilder::Append@plt
                 2.02% parquet::internal::DefinitionLevelsToBitmap
               + 0.86% parquet::internal::RecordReader::RecordReaderImpl::ReserveValues
            + 24.31% parquet::internal::TypedRecordReader<...>::ReadNewPage
{code}

master / my ARROW-6417 branch:

{code}
- 74.04% parquet::internal::TypedRecordReader<...>::ReadRecords
   - 49.00% parquet::internal::TypedRecordReader<...>::ReadRecordData
      - 45.82% parquet::internal::ByteArrayChunkedRecordReader::ReadValuesSpaced
         - 45.19% parquet::PlainByteArrayDecoder::DecodeArrow
            + 20.92% arrow::BaseBinaryBuilder::ReserveData
              7.61% __memmove_avx_unaligned_erms
            + 2.59% arrow::BaseBinaryBuilder::Resize
              0.77% memcpy@plt
         + 0.63% parquet::DictByteArrayDecoderImpl::DecodeArrow
        2.09% parquet::internal::DefinitionLevelsToBitmap
      + 1.07% parquet::internal::TypedRecordReader<...>::ReserveValues
   + 24.32% parquet::SerializedPageReader::NextPage
{code}

Furthermore, jemalloc is showing up as taking a lot more time on master.
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922758#comment-16922758 ] Wes McKinney commented on ARROW-6417:

So on closer inspection, in v0.11.1 we weren't yet handling chunked binary reads at all, so the comparison is not really apples to apples. v0.12.x was the first release series to include chunking support, so it could be the more appropriate comparison. This performance issue is really vexing. We also changed jemalloc versions between 0.12.x and 0.15.x, so I wonder if the allocator version could be impacting performance.
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921168#comment-16921168 ] Wes McKinney commented on ARROW-6417:

OK, I think to make things faster we need to be more careful about pre-allocating with {{BinaryBuilder}} and calling {{BaseBinaryBuilder::UnsafeAppend}} instead of {{Append}}. It's a bit tricky because we have {{ChunkedBinaryBuilder}} in the mix, so we may have to manage the creation of chunks in the Parquet value decoder. I think this is worth the effort given how much of a hot path this is for reading Parquet files. I'll spend a little time on it tomorrow.
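As a rough illustration of that idea, here is a sketch only, assuming the {{Reserve}}/{{ReserveData}}/{{UnsafeAppend}} methods on {{arrow::BinaryBuilder}}; the real decoder change also has to manage chunk rollover in {{ChunkedBinaryBuilder}}.

{code}
// Rough sketch of pre-allocating before unsafe appends; assumes the
// Reserve/ReserveData/UnsafeAppend methods on arrow::BinaryBuilder. The real
// Parquet decoder change also needs to handle ChunkedBinaryBuilder rollover.
#include <arrow/api.h>

#include <string>
#include <vector>

arrow::Status AppendValuesPreallocated(const std::vector<std::string>& values,
                                       std::shared_ptr<arrow::Array>* out) {
  arrow::BinaryBuilder builder;

  int64_t total_bytes = 0;
  for (const auto& v : values) total_bytes += static_cast<int64_t>(v.size());

  // Size the offsets/validity buffers and the value-data buffer once...
  ARROW_RETURN_NOT_OK(builder.Reserve(static_cast<int64_t>(values.size())));
  ARROW_RETURN_NOT_OK(builder.ReserveData(total_bytes));

  // ...so the per-value appends can skip the capacity checks that Append()
  // performs on every call.
  for (const auto& v : values) {
    builder.UnsafeAppend(reinterpret_cast<const uint8_t*>(v.data()),
                         static_cast<int32_t>(v.size()));
  }
  return builder.Finish(out);
}
{code}

The design point is simply to pay for buffer growth once per batch (when the total byte count is known from the decoded page) rather than once per value.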
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921164#comment-16921164 ] Wes McKinney commented on ARROW-6417:

The dreaded {{__memmove_avx_unaligned_erms}} has shown up again. I'll have a poke at this to see what could be done.

{code}
+   60.85%  0.01%  python  libparquet.so.15.0.0  [.] parquet::internal::TypedRecordReader::Append
+   30.62%  9.64%  python  libarrow.so.15.0.0    [.] arrow::BufferBuilder::Append
+   23.51% 10.91%  python  libc-2.27.so          [.] __memmove_avx_unaligned_erms
+   21.23%  0.00%  python  [unknown]             [.] 0x
+   18.58%  0.01%  python  libparquet.so.15.0.0  [.] parquet::ColumnReaderImplBase<...>
+   18.45% 14.80%  python  libsnappy.so.1.1.7    [.] snappy::RawUncompress
+   18.42%  0.02%  python  libparquet.so.15.0.0  [.] parquet::SerializedPageReader::NextPage
+   18.27%  0.00%  python  libarrow.so.15.0.0    [.] arrow::util::SnappyCodec::Decompress
+   18.27%  0.00%  python  libarrow.so.15.0.0    [.] arrow::util::SnappyCodec::Decompress
+   18.27%  0.00%  python  libsnappy.so.1.1.7    [.] snappy::RawUncompress
+   14.99%  0.00%  python  libarrow.so.15.0.0    [.] arrow::PoolBuffer::Resize
+   14.99%  0.00%  python  libarrow.so.15.0.0    [.] arrow::PoolBuffer::Reserve
+   14.99%  0.00%  python  libarrow.so.15.0.0    [.] arrow::DefaultMemoryPool::Reallocate
+   14.98%  0.01%  python  libarrow.so.15.0.0    [.] je_arrow_rallocx
+   14.97%  0.00%  python  libarrow.so.15.0.0    [.] je_arrow_private_je_arena_ralloc
+   14.96%  0.00%  python  libarrow.so.15.0.0    [.] je_arrow_private_je_large_ralloc
+   14.64%  0.00%  python  libarrow.so.15.0.0    [.] arrow::BufferBuilder::Resize
+   12.82% 12.82%  python  [unknown]             [k] 0x98e00a67
+   11.74%  3.73%  python  libarrow.so.15.0.0    [.] arrow::BaseBinaryBuilder::AppendNextOffset
{code}
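For a sense of why the memmove cost tracks the realloc path, here is a standalone back-of-the-envelope model (not Arrow code): if every growing realloc has to move the live bytes, i.e. the allocator cannot extend the allocation in place, then capacity-doubling growth copies roughly as many bytes as were appended in the first place.

{code}
// Back-of-the-envelope model, not Arrow code: append `total` bytes into a
// buffer that doubles its capacity when full, and count the bytes a copying
// realloc would move in the worst case (no in-place growth).
#include <algorithm>
#include <cstdint>
#include <iostream>

int main() {
  const int64_t total = int64_t(1) << 30;  // ~1 GiB of appended value bytes
  int64_t capacity = 4096;
  int64_t size = 0;
  int64_t copied = 0;
  int64_t reallocs = 0;

  while (size < total) {
    if (size == capacity) {
      copied += size;  // a realloc that cannot grow in place copies the live bytes
      capacity *= 2;
      ++reallocs;
    }
    size += std::min<int64_t>(capacity - size, total - size);
  }

  std::cout << reallocs << " reallocs, ~" << copied << " bytes copied for "
            << total << " bytes appended" << std::endl;
  return 0;
}
{code}

Whether a given realloc can actually avoid the copy is up to the allocator, which is where the jemalloc version differences discussed in the comments above come in.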
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921131#comment-16921131 ] Wes McKinney commented on ARROW-6417:

I updated the results plot to use gcc 8.3 for both v0.11.1 and the master branch as of 9/2/2019.