+1; looks great to me, especially in terms of gathering more user feedback.

Bests,
Takeshi
On Tue, Dec 10, 2019 at 3:14 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> Thank you, All.
>
> +1 for another `3.0-preview`.
>
> Also, thank you Yuming for volunteering for that!
>
> Bests,
> Dongjoon.
>
> On Mon, Dec 9, 2019 at 9:39 AM Xiao Li <lix...@databricks.com> wrote:
>
>> When entering the official release candidates, new features have to be
>> disabled, or even reverted if no conf is available to disable them, when
>> the fixes are not trivial; otherwise, we might need 10+ RCs to make the
>> final release. Based on the previous discussions, new features should
>> not block the release.
>>
>> I agree we should have a code freeze at the beginning of 2020. The
>> preview releases should not block the official releases; the preview is
>> just to collect more feedback about these new features and behavior
>> changes.
>>
>> Also, for the release of Spark 3.0, we still need the Hive community to
>> do us a favor and release 2.3.7, so that we can pick up HIVE-22190
>> <https://issues.apache.org/jira/browse/HIVE-22190>. Before asking the
>> Hive community for a 2.3.7 release, if possible, we want our Spark
>> community to try things out more, especially the support of JDK 11 on
>> Hadoop 2.7 and 3.2, which is based on the Hive 2.3 execution JAR. During
>> the preview stage, we might find more issues that are not covered by our
>> test cases.
>>
>> On Mon, Dec 9, 2019 at 4:55 AM Sean Owen <sro...@gmail.com> wrote:
>>
>>> Seems fine to me, of course. Honestly, that wouldn't be a bad result
>>> for a release candidate, though we would probably roll another one now.
>>> How about simply moving to a release candidate? If not now, then at
>>> least move to code freeze from the start of 2020. There is also some
>>> downside in pushing the 3.0 release out further with previews.
>>>
>>> On Mon, Dec 9, 2019 at 12:32 AM Xiao Li <gatorsm...@gmail.com> wrote:
>>>
>>>> I got a lot of great feedback from the community about the recent 3.0
>>>> preview release. Since the last 3.0 preview release, we already have
>>>> 353 commits
>>>> [https://github.com/apache/spark/compare/v3.0.0-preview...master].
>>>> There are various important features and behavior changes we want the
>>>> community to try before entering the official release candidates of
>>>> Spark 3.0.
>>>>
>>>> Below are the items I selected that are not part of the last 3.0
>>>> preview but are already available in the upstream master branch:
>>>>
>>>> - Support JDK 11 with Hadoop 2.7
>>>> - Spark SQL will respect its own default format (i.e., Parquet) when
>>>>   users do CREATE TABLE without USING or STORED AS clauses
>>>> - Enable Parquet nested schema pruning and nested pruning on
>>>>   expressions by default
>>>> - Add observable metrics for streaming queries
>>>> - Column pruning through nondeterministic expressions
>>>> - RecordBinaryComparator should check endianness when comparing longs
>>>> - Improve parallelism for the local shuffle reader in adaptive query
>>>>   execution
>>>> - Upgrade Apache Arrow to version 0.15.1
>>>> - Various interval-related SQL support
>>>> - Add a mode to pin the Python thread to the JVM's thread
>>>> - Provide an option to clean up completed files in streaming queries
>>>>
>>>> I am wondering if we can have another preview release for Spark 3.0.
>>>> This can help us find design/API defects as early as possible and
>>>> avoid a significant delay of the upcoming Spark 3.0 release.
>>>>
>>>> Also, is any committer willing to volunteer as the release manager of
>>>> the next preview release of Spark 3.0, if we have such a release?
>>>>
>>>> Cheers,
>>>>
>>>> Xiao
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

--
---
Takeshi Yamamuro