Hello,

It sounds like you are talking about the C++ implementation in https://github.com/apache/parquet-cpp/blob/master/src/parquet/arrow/reader.cc, is that right?
Which data types are you benchmarking? My understanding is that we are not appending one cell at a time. Let us know.

Thanks
Wes

On Fri, Mar 9, 2018 at 9:55 AM, <[email protected]> wrote:
> Hi,
>
> I am testing parquet->arrow performance and find it's really slow to read
> a parquet file into an arrow table. When I checked the parquet source code,
> it seems parquet needs to check for null values and use the arrow Append
> method to insert the cells one by one. We can use multithreading to speed up
> reading when we have several columns in a fragment, but the I/O performance
> is still far from its limit. Is there a reason for this, and can parquet
> reach better read performance?
