Hi, All.
Since the Apache Spark 3.2.1 tag was created (Jan 19), 197 new patches,
including 11 correctness patches, have arrived at branch-3.2.
Shall we make a new release, Apache Spark 3.2.2, as the third release
in the 3.2 line? I'd like to volunteer as the release manager for Apache
Spark 3.2.2. I'm thinking about starting the first RC next week.
$ git log --oneline v3.2.1..HEAD | wc -l
197

# Correctness issues
SPARK-38075 Hive script transform with order by and limit will
return fake rows
SPARK-38204 All state operators are at a risk of inconsistency
between state partitioning and operator partitioning
SPARK-38309 SHS has incorrect percentiles for shuffle read bytes
and shuffle total blocks metrics
SPARK-38320 (flat)MapGroupsWithState can timeout groups which just
received inputs in the same microbatch
SPARK-38614 After Spark update, df.show() shows incorrect
F.percent_rank results
SPARK-38655 OffsetWindowFunctionFrameBase cannot find the offset
row whose input is not null
SPARK-38684 Stream-stream outer join has a possible correctness
issue due to weakly read consistent on outer iterators
SPARK-39061 Incorrect results or NPE when using Inline function
against an array of dynamically created structs
SPARK-39107 Silent change in regexp_replace's handling of empty strings
SPARK-39259 Timestamps returned by now() and equivalent functions
are not consistent in subqueries
SPARK-39293 The accumulator of ArrayAggregate should copy the
intermediate result if string, struct, array, or map
Best,
Dongjoon.