Hi, I am testing Parquet-to-Arrow read performance and find that reading a Parquet file into an Arrow table is really slow. From the Parquet source code, it seems the reader has to check each value for null and then uses Arrow's Append method to insert the cells one by one. We can use multiple threads to speed up reading when a fragment has several columns, but the I/O throughput is still far from its limit. Is there a reason for this, and can Parquet reach better read performance?
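
To make the per-cell pattern described above concrete, here is a minimal toy sketch in plain Python. It is not the actual parquet-cpp or Arrow code. It assumes a flat optional int64 column whose maximum definition level is 1, so a level of 1 means "value present" and 0 means "null". The function names `append_one_by_one` and `append_bulk` are hypothetical and exist only for illustration.

```python
from array import array


def append_one_by_one(def_levels, dense_values):
    """Per-cell path: check each definition level, then append a value or a null slot."""
    values = array("q")   # int64 value buffer, one slot per row (nulls included)
    validity = []         # True where the row holds a real value
    dense = iter(dense_values)  # the encoded page stores non-null values only
    for level in def_levels:
        if level == 1:
            values.append(next(dense))
            validity.append(True)
        else:
            values.append(0)  # placeholder slot for a null row
            validity.append(False)
    return values, validity


def append_bulk(def_levels, dense_values):
    """Fast path (assumption): with no nulls, copy the whole page in one call."""
    if len(dense_values) == len(def_levels):
        return array("q", dense_values), [True] * len(def_levels)
    # Nulls present: fall back to the per-cell path.
    return append_one_by_one(def_levels, dense_values)


values, validity = append_one_by_one([1, 0, 1, 1], [10, 20, 30])
print(list(values), validity)
```

The point of the sketch is only that the per-cell path does a branch and an append for every row, while a page without nulls could in principle be copied as one block.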
- parquet performance mildwolf_jh
- Re: parquet performance Wes McKinney
