Ya, it sounds like that. Could you link those items to the following JIRA?

https://issues.apache.org/jira/browse/SPARK-44111 Prepare Apache Spark 4.0.0

Dongjoon.



On Tue, Jun 20, 2023 at 12:45 PM Holden Karau <hol...@pigscanfly.ca> wrote:

> That seems like a really good reason for a major version change given the
> % of PySpark users and the fact we are (effectively) tied to pandas APIs.
>
> On Tue, Jun 20, 2023 at 12:24 PM Bjørn Jørgensen <bjornjorgen...@gmail.com>
> wrote:
>
>> One big thing for 4.0 will be that the pandas API on Spark will support
>> pandas version 2.0.
>>
>> With the major release of pandas 2.0.0 on April 3, 2023, numerous
>> breaking changes have been introduced. We have therefore decided to
>> postpone addressing these breaking changes until the next major release of
>> Spark, version 4.0.0, to minimize disruption for our users and provide a
>> more seamless upgrade experience.
>>
>> The pandas 2.0.0 release includes a significant number of updates, such
>> as API removals, changes in API behavior, parameter removals, parameter
>> behavior changes, and bug fixes. We have planned the following approach for
>> each item:
>>
>> - *API Removals*: Removed APIs will remain deprecated in Spark 3.5.0,
>> provide appropriate warnings, and will be removed in Spark 4.0.0.
>>
>> - *API Behavior Changes*: APIs with changed behavior will retain their
>> existing behavior in Spark 3.5.0, provide appropriate warnings, and will
>> align their behavior with pandas in Spark 4.0.0.
>>
>> - *Parameter Removals*: Removed parameters will remain deprecated in
>> Spark 3.5.0, provide appropriate warnings, and will be removed in Spark
>> 4.0.0.
>>
>> - *Parameter Behavior Changes*: Parameters with changed behavior will
>> retain their existing behavior in Spark 3.5.0, provide appropriate
>> warnings, and will align their behavior with pandas in Spark 4.0.0.
>>
>> - *Bug Fixes*: Bug fixes mainly related to correctness issues will be
>> fixed in Spark 3.5.0.
>>
>> *To recap, all breaking changes related to pandas 2.0.0 will be supported
>> in Spark 4.0.0,* *and will remain deprecated with appropriate warnings in
>> Spark 3.5.0.*
>>
>>
>>
>> https://issues.apache.org/jira/browse/SPARK-43291?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel
>>
>> On Tue, Jun 20, 2023 at 6:18 AM Dongjoon Hyun <dongj...@apache.org> wrote:
>>
>>> Hi, Herman.
>>>
>>> This is part of a series of discussions, as I re-summarized here.
>>>
>>> You can find some context in the previous timeline thread.
>>>
>>> 2023-05-30 Apache Spark 4.0 Timeframe?
>>> https://lists.apache.org/thread/xhkgj60j361gdpywoxxz7qspp2w80ry6
>>>
>>> Could you reply there to collect your timeline suggestions? We can
>>> discuss more there.
>>>
>>> Dongjoon.
>>>
>>>
>>>
>>> On Mon, Jun 19, 2023 at 1:58 PM Herman van Hovell <her...@databricks.com>
>>> wrote:
>>>
>>>> Dongjoon, I am not sure if I follow the line of
>>>> thought here.
>>>>
>>>> Multiple people have asked for clarification on what Spark 4.0 would
>>>> mean (Holden, Mridul, Jia & Xiao). You can - for the record - also add me
>>>> to this list. However, you choose to single out Xiao for asking this
>>>> question and wanting to do a preview release as well? So again, what does
>>>> Spark 4 mean, and why does it need to take almost a year? Historically,
>>>> major Spark releases tend to break APIs, but if it only entails changing to
>>>> Scala 2.13 and dropping support for JDK 8, then we could also just release
>>>> a month after 3.5.
>>>>
>>>> How about we do this? We get 3.5 released, and afterwards we do a
>>>> couple of meetings where we build this roadmap. Using that, we can -
>>>> hopefully - have a grounded discussion.
>>>>
>>>> Cheers,
>>>> Herman
>>>>
>>>> On Mon, Jun 19, 2023 at 4:01 PM Dongjoon Hyun <dongj...@apache.org>
>>>> wrote:
>>>>
>>>>> Thank you. I reviewed the threads, vote and result once more.
>>>>>
>>>>> I found that I missed the binding vote mark on Holden in the vote
>>>>> result email. The following should be "-0: Holden Karau *". Sorry for this
>>>>> mistake, Holden and all.
>>>>>
>>>>> > -0: Holden Karau
>>>>>
>>>>> To Hyukjin, I disagree with you on the following point because the
>>>>> thread started clearly with your and Sean's Apache Spark 4.0 requirement
>>>>> in order to move away from Scala 2.12. In addition, we also discussed
>>>>> another item (dropping Java 8) from another current dev thread. The vote
>>>>> scope and goal are clear and specific.
>>>>>
>>>>> > we're unclear on the picture of Spark 4.0.0.
>>>>>
>>>>> Instead of the vote scope and result, what is really unclear is what
>>>>> you propose here. If Xiao wants a preview, Xiao can propose the preview
>>>>> plan in more detail. It's welcome. If you have many 4.0 dev ideas which
>>>>> are not exposed to the community yet, please share them with the
>>>>> community. It's welcome, too. Apache Spark is an open source community.
>>>>> If you don't share them, there is no way for us to know what you want.
>>>>>
>>>>> Dongjoon
>>>>>
>>>>> On 2023/06/19 04:31:46 Hyukjin Kwon wrote:
>>>>> > The major concerns raised in the thread were that we should initiate the
>>>>> > discussion for the below first:
>>>>> > - Apache Spark 4.0.0 Preview (and Dates)
>>>>> > - Apache Spark 4.0.0 Items
>>>>> > - Apache Spark 4.0.0 Plan Adjustment
>>>>> >
>>>>> > before setting the timeline for Spark 4.0.0 because we're unclear on the
>>>>> > picture of Spark 4.0.0. So discussing the timeline for 4.0.0 first is the
>>>>> > opposite order procedurally.
>>>>> > The vote passed as a procedural issue, but I would prefer to consider this
>>>>> > as a tentative date, and we will probably need another vote to adjust the
>>>>> > date considering the plans, preview dates, and items we aim for in 4.0.0.
>>>>> >
>>>>> >
>>>>> > On Sat, 17 Jun 2023 at 04:33, Dongjoon Hyun <dongj...@apache.org> wrote:
>>>>> >
>>>>> > > This was a part of the following on-going discussions.
>>>>> > >
>>>>> > > 2023-05-28  Apache Spark 3.5.0 Expectations (?)
>>>>> > > https://lists.apache.org/thread/3x6dh17bmy20n3frtt3crgxjydnxh2o0
>>>>> > >
>>>>> > > 2023-05-30 Apache Spark 4.0 Timeframe?
>>>>> > > https://lists.apache.org/thread/xhkgj60j361gdpywoxxz7qspp2w80ry6
>>>>> > >
>>>>> > > 2023-06-05 ASF policy violation and Scala version issues
>>>>> > > https://lists.apache.org/thread/k7gr65wt0fwtldc7hp7bd0vkg1k93rrb
>>>>> > >
>>>>> > > 2023-06-12 [VOTE] Release Plan for Apache Spark 4.0.0 (June 2024)
>>>>> > > https://lists.apache.org/thread/r0zn6rd8y25yn2dg59ktw3ttrwxzqrfb
>>>>> > >
>>>>> > > I'm looking forward to seeing the upcoming detailed discussions including
>>>>> > > the following
>>>>> > > - Apache Spark 4.0.0 Preview (and Dates)
>>>>> > > - Apache Spark 4.0.0 Items
>>>>> > > - Apache Spark 4.0.0 Plan Adjustment
>>>>> > >
>>>>> > > Please initiate the discussion.
>>>>> > >
>>>>> > > Thanks,
>>>>> > > Dongjoon.
>>>>> > >
>>>>> > >
>>>>> > > On 2023/06/16 19:30:42 Dongjoon Hyun wrote:
>>>>> > > > The vote passes with 6 +1s (4 binding +1s), one -0, and one -1.
>>>>> > > > Thank you all for your participation and
>>>>> > > > especially your additional comments during this voting,
>>>>> > > > Mridul, Hyukjin, and Jungtaek.
>>>>> > > >
>>>>> > > > (* = binding)
>>>>> > > > +1:
>>>>> > > > - Dongjoon Hyun *
>>>>> > > > - Huaxin Gao *
>>>>> > > > - Liang-Chi Hsieh *
>>>>> > > > - Kazuyuki Tanimura
>>>>> > > > - Chao Sun *
>>>>> > > > - Jia Fan
>>>>> > > >
>>>>> > > > -0: Holden Karau
>>>>> > > >
>>>>> > > > -1: Xiao Li *
>>>>> > > >
>>>>> > >
>>>>> > >
>>>>> > > ---------------------------------------------------------------------
>>>>> > > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>> > >
>>>>> > >
>>>>> >
>>
>> --
>> Bjørn Jørgensen
>> Vestre Aspehaug 4, 6010 Ålesund
>> Norge
>>
>> +47 480 94 297
>>
>
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
