Hello,

It sounds like you are talking about the C++ implementation in
https://github.com/apache/parquet-cpp/blob/master/src/parquet/arrow/reader.cc,
is that right?

Which data types are you benchmarking? My understanding is that we are
not appending one cell at a time. Let us know.

Thanks
Wes

On Fri, Mar 9, 2018 at 9:55 AM,  <[email protected]> wrote:
> Hi, I am testing Parquet-to-Arrow performance and find that reading a
> Parquet file into an Arrow table is really slow. From the Parquet source
> code, it seems that the reader has to check each value for null and use the
> Arrow Append method to insert cells one by one. We can use multiple threads
> to speed up reading when a fragment has several columns, but the I/O
> performance is still far from its limit. Is there a reason for this, and
> can Parquet reach better read performance?
