[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-05 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923778#comment-16923778 ] Wes McKinney commented on ARROW-6417: - After reverting the jemalloc version, the benchmarks show that

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-05 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923639#comment-16923639 ] Antoine Pitrou commented on ARROW-6417: --- FTR, similar issues with jemalloc seem to have happened in

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-05 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923630#comment-16923630 ] Antoine Pitrou commented on ARROW-6417: --- Wow that's massive. Re {{SafeLoadAs}}, I'm with Micah: it

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-05 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923612#comment-16923612 ] Wes McKinney commented on ARROW-6417: - I opened an issue with jemalloc to see if we're doing

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-05 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923607#comment-16923607 ] Wes McKinney commented on ARROW-6417: - The benchmark results in arrow-builder-benchmark are pretty

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-05 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923566#comment-16923566 ] Wes McKinney commented on ARROW-6417: - OK, it appears that the jemalloc version is causing the perf

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-05 Thread Micah Kornfield (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923534#comment-16923534 ] Micah Kornfield commented on ARROW-6417: For SafeLoadAs, you could try changing the

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-05 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923474#comment-16923474 ] Wes McKinney commented on ARROW-6417: - I will try that next. I'm going to merge my current patch in

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-05 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923360#comment-16923360 ] Antoine Pitrou commented on ARROW-6417: --- Have you tried to measure the same jemalloc version for

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-04 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922791#comment-16922791 ] Wes McKinney commented on ARROW-6417: - Further down the rabbit hole 0.12.1 perf profile {code} -

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-04 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922758#comment-16922758 ] Wes McKinney commented on ARROW-6417: - So on closer inspection, in v0.11.1 we weren't yet handling

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921168#comment-16921168 ] Wes McKinney commented on ARROW-6417: - OK, I think to make things faster we need to be more careful

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921164#comment-16921164 ] Wes McKinney commented on ARROW-6417: - The dreaded {{\_\_memmove_avx_unaligned_erms}} has shown up

[jira] [Commented] (ARROW-6417) [C++][Parquet] Non-dictionary BinaryArray reads from Parquet format have slowed down since 0.11.x

2019-09-02 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921131#comment-16921131 ] Wes McKinney commented on ARROW-6417: - I updated the results plot to use gcc 8.3 in both v0.11.1 and