Hi all,

A week ago, Wes and I had a discussion about the performance of the Arrow/Java implementation on the Apache Crail (Incubating) mailing list (http://mail-archives.apache.org/mod_mbox/crail-dev/201809.mbox/browser). In a nutshell: I am investigating the performance of various file formats (including Arrow) on high-performance NVMe and RDMA/100Gbps/RoCE setups. I benchmarked how long it takes to materialize values (ints, longs, doubles) of the store_sales table, the largest table in the TPC-DS dataset, stored in different file formats. Here is a write-up: https://crail.incubator.apache.org/blog/2018/08/sql-p1.html

I found that between a pair of machines connected over a 100 Gbps link, Arrow (used as a file format on HDFS) delivered close to 30 Gbps of bandwidth with all 16 cores engaged.

Wes pointed out that (i) Arrow is an in-memory IPC format and has not been optimized for storage interfaces/APIs like HDFS; (ii) the performance I am measuring is for the Java implementation.
Wes, I hope I summarized our discussion correctly.

That brings us to this email, where I promised to follow up on the Arrow mailing list to understand and optimize the performance of the Arrow/Java implementation on high-performance devices. I wrote a small stand-alone benchmark (https://github.com/animeshtrivedi/benchmarking-arrow) with three implementations of the WritableByteChannel and SeekableByteChannel interfaces:

1. Arrow data is stored in HDFS/tmpfs - this gives me ~30 Gbps performance
2. Arrow data is stored in Crail/DRAM - this gives me ~35-36 Gbps performance
3. Arrow data is stored in an on-heap byte[] - this gives me ~39 Gbps performance

I think the order makes sense. To better understand the performance of Arrow/Java, we can focus on option 3.

The key question I am trying to answer is: "What would it take for Arrow/Java to deliver 100+ Gbps of performance?" Is it possible? If yes, then what am I missing or misinterpreting? If not, then where is the performance lost? Does anyone have performance measurements for the C++ implementation, and have they seen (or do they expect) better numbers?

As a next step, I will profile the read path of Arrow/Java for option 3 and report my findings. Any thoughts and feedback on this investigation are very welcome.

Cheers,
--
Animesh

PS~ Cross-posting on the [email protected] list as well.
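
For anyone who wants to picture what option 3 looks like, here is a minimal sketch of a read-only SeekableByteChannel backed by an on-heap byte[]. It is not the code from the benchmark repository: the class name HeapSeekableByteChannel and the gbps helper are hypothetical names chosen for illustration, and it uses only the standard java.nio interfaces. The helper shows how a Gbps figure can be derived from bytes read and elapsed nanoseconds (bits per nanosecond equals Gbit/s).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.NonWritableChannelException;
import java.nio.channels.SeekableByteChannel;

/** Hypothetical read-only channel over an on-heap byte[] (illustration only). */
class HeapSeekableByteChannel implements SeekableByteChannel {
    private final byte[] data;
    private int position = 0;
    private boolean open = true;

    HeapSeekableByteChannel(byte[] data) {
        this.data = data;
    }

    private void ensureOpen() throws ClosedChannelException {
        if (!open) {
            throw new ClosedChannelException();
        }
    }

    @Override
    public int read(ByteBuffer dst) throws IOException {
        ensureOpen();
        if (position >= data.length) {
            return -1; // end of stream
        }
        // Copy as many bytes as fit in dst, starting at the current position.
        int n = Math.min(dst.remaining(), data.length - position);
        dst.put(data, position, n);
        position += n;
        return n;
    }

    @Override
    public int write(ByteBuffer src) throws IOException {
        throw new NonWritableChannelException(); // this sketch is read-only
    }

    @Override
    public long position() throws IOException {
        ensureOpen();
        return position;
    }

    @Override
    public SeekableByteChannel position(long newPosition) throws IOException {
        ensureOpen();
        if (newPosition < 0 || newPosition > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("position out of range: " + newPosition);
        }
        position = (int) newPosition; // seeking past the end is allowed; reads then return -1
        return this;
    }

    @Override
    public long size() throws IOException {
        ensureOpen();
        return data.length;
    }

    @Override
    public SeekableByteChannel truncate(long size) throws IOException {
        throw new NonWritableChannelException();
    }

    @Override
    public boolean isOpen() {
        return open;
    }

    @Override
    public void close() {
        open = false;
    }

    /** Bandwidth in Gbit/s: bits divided by nanoseconds. */
    static double gbps(long bytes, long nanos) {
        return (bytes * 8.0) / nanos;
    }
}
```

A reader would be handed such a channel in place of a file-backed one, and the time around the read loop (for example via System.nanoTime()) would be passed to gbps together with the number of bytes consumed.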
