Hey all,

We (the Spark team) have been considering upgrading parquet-mr in Spark to 1.8.1 to fix PARQUET-251 <https://issues.apache.org/jira/browse/PARQUET-251>. However, my micro-benchmark shows that 1.8.1 seems to suffer a slight performance regression (5% to 10%) compared to 1.7.0 (the version we currently use). I'm not sure whether this is a known issue. I did a quick search on JIRA using the following query:

  project = parquet and affectedVersion in ("1.8.0", "1.8.1")

That search didn't turn up any related tickets. The micro-benchmark simply reads the whole TPC-DS store_sales table (scale factor 15). The good news is that 1.8.2-SNAPSHOT looks fine, so upgrading directly to 1.8.2 seems like the better option. Could anybody share some details about the 1.8.2 release schedule? Thanks in advance!

Cheng
