Thanks for the information, Ryan. Let me see whether I can locate the cause this weekend.

Cheng

On 10/7/15 9:51 AM, Ryan Blue wrote:
Sorry for the delay, I was at Hadoop World NY last week and I think a lot of other people were at conferences.

I think a 1.8.2 release sounds like a good idea to fix a performance regression. I'd really like to find out what caused it and what fixed it, though. Is it possible for you to bisect the Parquet tree and run the test?

rb

On 10/06/2015 10:09 AM, Cheng Lian wrote:
Could anybody help elaborate on the 1.8.2 release plan? Thanks :)

Cheng

On 9/30/15 2:42 PM, Cheng Lian wrote:
Hey all,

We (the Spark team) have been considering upgrading parquet-mr in
Spark to 1.8.1 to fix PARQUET-251
<https://issues.apache.org/jira/browse/PARQUET-251>. However, my
micro-benchmark shows that 1.8.1 seems to be suffering a slight
performance regression (5% ~ 10%) compared to 1.7.0 (the version we
are currently using). Not sure whether this is a known issue. Did a
quick search on JIRA using

  project = parquet and affectedVersion in ("1.8.0", "1.8.1")

but didn't find any related tickets. What I did in the micro-benchmark
was simply reading the whole TPC-DS store_sales table (scale factor
15). The good news is that 1.8.2-SNAPSHOT looks fine. So directly
upgrading to 1.8.2 seems to be a better idea. Could anybody provide
some details about the 1.8.2 release schedule? Thanks in advance!

Cheng




