Hi,
We previously ran some experiments on builds from the 3.5 branch and
noticed that Hadoop had a regression
(https://issues.apache.org/jira/browse/HADOOP-18568) in their s3a
committer affecting 3.3.5 and 3.3.6 (Spark 3.4 uses Hadoop 3.3.4). The
fix has been merged into Hadoop and will be part of the next Hadoop release.
In our testing, the regression when writing data to S3 with a large
number of tasks is severe enough that we would need to revert to
Hadoop 3.3.4 in order to use the Spark 3.5 release.
Since it only affects S3, I am not sure it warrants changes in Spark
(e.g. rolling back Hadoop to 3.3.4). But it is probably something people
testing the RC against S3 should be aware of.
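For anyone scripting their RC checks, a minimal sketch of a version guard follows. The set of versions is taken only from what is stated in this thread, and the names AFFECTED_HADOOP_VERSIONS and is_affected_hadoop are hypothetical, not part of Spark or Hadoop. How the version string is obtained from a running build is left to the reader.

```python
# Hadoop versions named in this thread as carrying the HADOOP-18568 regression.
# Assumption: the list reflects only what is stated here, not later releases.
AFFECTED_HADOOP_VERSIONS = {"3.3.5", "3.3.6"}


def is_affected_hadoop(version: str) -> bool:
    """Return True if the given Hadoop version string is one named above."""
    # Hypothetical helper: exact match after trimming surrounding whitespace.
    return version.strip() in AFFECTED_HADOOP_VERSIONS


if __name__ == "__main__":
    for v in ("3.3.4", "3.3.5", "3.3.6"):
        status = "affected" if is_affected_hadoop(v) else "not listed as affected"
        print(v, status)
```

A test job writing to S3 could call this guard first and print a warning before running large writes.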
Best,
Emil
On 29/07/2023 10:29, Yuanjian Li wrote:
Hi everyone,
Following the release timeline, I will cut the RC on *Tuesday, Aug 1st at
1 pm PST* as scheduled.
Date             Event
July 17th 2023   Code freeze. Release branch cut.
Late July 2023   QA period. Focus on bug fixes, tests, stability and docs.
                 Generally, no new features merged.
August 2023      Release candidates (RC), voting, etc. until final release passes
Best,
Yuanjian