Reply: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
Okay, let me double-check it carefully.

Thank you very much for your help!


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Jungtaek Lim
Yeah, the approach seems OK to me - please double-check that the doc
generation in the Spark repo won't fail after the move of the js file. Other
than that, it would probably just be a matter of updating the release
process.


Reply: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
Okay, I see.

Perhaps we can solve this confusion by sharing the same `version.json` file
across `all versions` in the `Spark website repo`? That way, every version of
the documentation would display the `same` data in the dropdown menu.
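
For illustration, a minimal sketch of what such a shared `version.json` could
contain, assuming the dropdown follows the pydata-sphinx-theme switcher
convention that the PySpark docs are built on (the entries and URLs here are
illustrative, not the actual file):

    [
      {"name": "3.5.1", "version": "3.5.1",
       "url": "https://spark.apache.org/docs/3.5.1/api/python/"},
      {"name": "3.4.2", "version": "3.4.2",
       "url": "https://spark.apache.org/docs/3.4.2/api/python/"}
    ]

Since every published doc would fetch this single file from the website repo,
adding a new release to it would update the dropdown of all versions at once.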



Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Jungtaek Lim
Let me be more specific.

We have two active release version lines, 3.4.x and 3.5.x. We just released
Spark 3.5.1, whose dropdown lists 3.5.1 and 3.4.2, given that the latest
version of the 3.4.x line is 3.4.2. A month later we release Spark 3.4.3. The
dropdown of Spark 3.4.3 will list 3.5.1 and 3.4.3. But if we call this done,
3.5.1 (still the latest) won't show 3.4.3 in its dropdown, giving the
impression that 3.4.3 was never released.

This is just the case of two active release version lines, keeping only the
latest version of each line. If you expand this to EOLed version lines and to
versions which aren't the latest in their line, the problem gets much more
complicated.


Reply: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
Based on my understanding, we should not update versions that have already
been released.

Regarding the situation you mentioned (`But what about the dropdown of
version D? Should we add E in the dropdown?`): we only need to record the
latest `version.json` file that has already been published at the time of
each new document release.

Of course, if we need to keep every document's dropdown current, I think that
is also possible - but only by sharing the same `version.json` file across
all versions.



Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Jungtaek Lim
But this does not answer my question about updating the dropdown for the docs
of "already released versions", right?

Let's say we just released version D, and its dropdown has versions A, B, C.
We have another release tomorrow, version E, and it's probably easy to add A,
B, C, D to the dropdown of E. But what about the dropdown of version D?
Should we add E to it? How do we maintain this if we have 10 more releases
afterwards?


Reply: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
According to my understanding, the original intention of this feature is that
when a user lands on the PySpark documentation and finds it is not the
version they want, they can easily jump to the desired version by clicking
the dropdown box. Additionally, the PR that would have automated this update
was not merged:

https://github.com/apache/spark/pull/42881

So we need to update this file manually. I can submit a manual update first
to get this feature working.
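
As a sketch only - a hypothetical helper, not something that exists in the
Spark repo, and with an illustrative file path - the manual update could be
as small as prepending an entry to the shared JSON file:

    import json

    def add_release(path, name, version, url):
        # Prepend the new release so it appears first in the dropdown.
        with open(path) as f:
            entries = json.load(f)
        entries.insert(0, {"name": name, "version": version, "url": url})
        with open(path, "w") as f:
            json.dump(entries, f, indent=2)

    add_release("static/versions.json", "3.5.1", "3.5.1",
                "https://spark.apache.org/docs/3.5.1/api/python/")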



Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-04 Thread Yang Jie
Hmm... I guess this was meant to cc @Bingkun Pan?


On 2024/03/05 02:16:12 Hyukjin Kwon wrote:
> Is this related to https://github.com/apache/spark/pull/42428?
> 
> cc @Yang,Jie(INF) 


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-04 Thread yangjie01
That sounds like a great suggestion.

From: Jungtaek Lim
Date: Tuesday, March 5, 2024 10:46
To: Hyukjin Kwon
Cc: yangjie01, Dongjoon Hyun, dev, user
Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released

Yes, it's relevant to that PR. I wonder, though - if we want to expose a
version switcher, it should live in the versionless doc (spark-website)
rather than in a doc pinned to a specific version.


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-04 Thread Jungtaek Lim
Yes, it's relevant to that PR. I wonder, though - if we want to expose a
version switcher, it should live in the versionless doc (spark-website)
rather than in a doc pinned to a specific version.
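
As a sketch of that idea - assuming the dropdown is the pydata-sphinx-theme
version switcher that the PySpark Sphinx build uses, and with an illustrative
json_url - each versioned build would point at one versionless file served
from spark-website:

    # In the Sphinx conf.py of each versioned doc build (sketch only):
    release = "3.5.1"  # the version this particular build documents

    html_theme_options = {
        "switcher": {
            # One shared, versionless file hosted by the website repo:
            "json_url": "https://spark.apache.org/static/versions.json",
            # Which entry to highlight as the current version:
            "version_match": release,
        },
        "navbar_end": ["version-switcher", "navbar-icon-links"],
    }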



Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-04 Thread Hyukjin Kwon
Is this related to https://github.com/apache/spark/pull/42428?

cc @Yang,Jie(INF) 



Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-03 Thread Jungtaek Lim
Shall we revisit this functionality? The API docs are built per individual
version, yet each individual version's doc depends on the set of other
released versions. That does not seem right to me. Also, the functionality
exists only in the PySpark API docs, which is inconsistent as well.

I don't think this is manageable with the current approach (listing versions
in a version-dependent doc). Let's say we release 3.4.3 after 3.5.1. Should
we update 3.5.1 to add 3.4.3 to its version switcher? What about when we
release a new version after ten more releases? What is the criterion for
pruning versions?

Unless we have a good answer to these questions, I think it's better to
revert the functionality - it missed various considerations.



Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Peter Toth
Congratulations and thanks Jungtaek for driving this!

Xinrong Meng wrote (on Friday, March 1, 2024, at 5:24):

> Congratulations!
>
> Thanks,
> Xinrong
>
> On Thu, Feb 29, 2024 at 11:16 AM Dongjoon Hyun 
> wrote:
>
>> Congratulations!
>>
>> Bests,
>> Dongjoon.
>>
>> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>>
>>> Congratulations!
>>>
>>>
>>>
>>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>>> wrote:
>>>
>>> Hi everyone,
>>>
>>> We are happy to announce the availability of Spark 3.5.1!
>>>
>>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>>> release is based on the branch-3.5 maintenance branch of Spark. We
>>> strongly
>>> recommend all 3.5 users to upgrade to this stable release.
>>>
>>> To download Spark 3.5.1, head over to the download page:
>>> https://spark.apache.org/downloads.html
>>>
>>> To view the release notes:
>>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>>
>>> We would like to acknowledge all community members for contributing to
>>> this
>>> release. This release would not have been possible without you.
>>>
>>> Jungtaek Lim
>>>
>>> ps. Yikun is helping us through releasing the official docker image for
>>> Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.
>>>
>>>


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Jungtaek Lim
Thanks for reporting - this is odd - the dropdown did not exist in other
recent releases.

https://spark.apache.org/docs/3.5.0/api/python/index.html
https://spark.apache.org/docs/3.4.2/api/python/index.html
https://spark.apache.org/docs/3.3.4/api/python/index.html

Looks like the dropdown feature was recently introduced but only partially
done. The dropdown itself was added, but the procedure for bumping the
version was never documented.
The contributor proposed a way to update the version "automatically", but the
PR wasn't merged. As a result, we have neither an instruction for bumping the
version manually nor an automatic bump.

* PR for addition of dropdown: https://github.com/apache/spark/pull/42428
* PR for automatically bumping version:
https://github.com/apache/spark/pull/42881

We will probably need to add an instruction to the release process for
updating the version. (For automatic bumping I don't have a good idea yet.)
I'll look into it - please expect some delay during the holiday weekend
in S. Korea. A rough sketch of what the manual bump could involve is below.
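
For illustration only: assuming the dropdown is driven by a single shared
JSON file, the manual bump could be as small as the sketch below. The file
name, schema, and script here are assumptions for discussion, not the
actual Spark docs setup.

# update_switcher.py - hypothetical helper that prepends a new release to a
# shared version-switcher JSON (file name and schema are assumed).
import json
from pathlib import Path

SWITCHER = Path("static/versions.json")  # assumed location in the site repo

def add_release(version: str) -> None:
    entries = json.loads(SWITCHER.read_text())
    # Re-runs are harmless: skip if the release is already listed.
    if any(e["version"] == version for e in entries):
        return
    entries.insert(0, {
        "name": version,
        "version": version,
        "url": f"https://spark.apache.org/docs/{version}/api/python/",
    })
    SWITCHER.write_text(json.dumps(entries, indent=2) + "\n")

if __name__ == "__main__":
    add_release("3.5.1")

Once the instruction is documented, a step like this could be invoked from
the release process.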

Thanks again.
Jungtaek Lim (HeartSaVioR)


On Fri, Mar 1, 2024 at 2:14 PM Dongjoon Hyun 
wrote:

> BTW, Jungtaek.
>
> PySpark document seems to show a wrong branch. At this time, `master`.
>
> https://spark.apache.org/docs/3.5.1/api/python/index.html
>
> PySpark Overview
> <https://spark.apache.org/docs/3.5.1/api/python/index.html#pyspark-overview>
>
>Date: Feb 24, 2024 Version: master
>
> [image: Screenshot 2024-02-29 at 21.12.24.png]
>
>
> Could you do the follow-up, please?
>
> Thank you in advance.
>
> Dongjoon.
>
>
> On Thu, Feb 29, 2024 at 2:48 PM John Zhuge  wrote:
>
>> Excellent work, congratulations!
>>
>> On Wed, Feb 28, 2024 at 10:12 PM Dongjoon Hyun 
>> wrote:
>>
>>> Congratulations!
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>>>
>>>> Congratulations!
>>>>
>>>>
>>>>
>>>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>>>> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> We are happy to announce the availability of Spark 3.5.1!
>>>>
>>>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>>>> release is based on the branch-3.5 maintenance branch of Spark. We
>>>> strongly
>>>> recommend all 3.5 users to upgrade to this stable release.
>>>>
>>>> To download Spark 3.5.1, head over to the download page:
>>>> https://spark.apache.org/downloads.html
>>>>
>>>> To view the release notes:
>>>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>>>
>>>> We would like to acknowledge all community members for contributing to
>>>> this
>>>> release. This release would not have been possible without you.
>>>>
>>>> Jungtaek Lim
>>>>
>>>> ps. Yikun is helping us through releasing the official docker image for
>>>> Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally 
>>>> available.
>>>>
>>>>
>>
>> --
>> John Zhuge
>>
>


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Dongjoon Hyun
BTW, Jungtaek.

The PySpark document seems to show the wrong branch - at this time, `master`.

https://spark.apache.org/docs/3.5.1/api/python/index.html

PySpark Overview
<https://spark.apache.org/docs/3.5.1/api/python/index.html#pyspark-overview>

   Date: Feb 24, 2024 Version: master

[image: Screenshot 2024-02-29 at 21.12.24.png]


Could you do the follow-up, please?

Thank you in advance.

Dongjoon.


On Thu, Feb 29, 2024 at 2:48 PM John Zhuge  wrote:

> Excellent work, congratulations!
>
> On Wed, Feb 28, 2024 at 10:12 PM Dongjoon Hyun 
> wrote:
>
>> Congratulations!
>>
>> Bests,
>> Dongjoon.
>>
>> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>>
>>> Congratulations!
>>>
>>>
>>>
>>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>>> wrote:
>>>
>>> Hi everyone,
>>>
>>> We are happy to announce the availability of Spark 3.5.1!
>>>
>>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>>> release is based on the branch-3.5 maintenance branch of Spark. We
>>> strongly
>>> recommend all 3.5 users to upgrade to this stable release.
>>>
>>> To download Spark 3.5.1, head over to the download page:
>>> https://spark.apache.org/downloads.html
>>>
>>> To view the release notes:
>>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>>
>>> We would like to acknowledge all community members for contributing to
>>> this
>>> release. This release would not have been possible without you.
>>>
>>> Jungtaek Lim
>>>
>>> ps. Yikun is helping us through releasing the official docker image for
>>> Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.
>>>
>>>
>
> --
> John Zhuge
>


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread John Zhuge
Excellent work, congratulations!

On Wed, Feb 28, 2024 at 10:12 PM Dongjoon Hyun 
wrote:

> Congratulations!
>
> Bests,
> Dongjoon.
>
> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>
>> Congratulations!
>>
>>
>>
>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>> wrote:
>>
>> Hi everyone,
>>
>> We are happy to announce the availability of Spark 3.5.1!
>>
>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>> release is based on the branch-3.5 maintenance branch of Spark. We
>> strongly
>> recommend all 3.5 users to upgrade to this stable release.
>>
>> To download Spark 3.5.1, head over to the download page:
>> https://spark.apache.org/downloads.html
>>
>> To view the release notes:
>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>
>> We would like to acknowledge all community members for contributing to
>> this
>> release. This release would not have been possible without you.
>>
>> Jungtaek Lim
>>
>> ps. Yikun is helping us through releasing the official docker image for
>> Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.
>>
>>

-- 
John Zhuge


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Prem Sahoo
Congratulations Sent from my iPhoneOn Feb 29, 2024, at 4:54 PM, Xinrong Meng  wrote:Congratulations!Thanks,XinrongOn Thu, Feb 29, 2024 at 11:16 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:Congratulations!Bests,Dongjoon.On Wed, Feb 28, 2024 at 11:43 AM beliefer <belie...@163.com> wrote:Congratulations!At 2024-02-28 17:43:25, "Jungtaek Lim" <kabhwan.opensou...@gmail.com> wrote:Hi everyone,We are happy to announce the availability of Spark 3.5.1!Spark 3.5.1 is a maintenance release containing stability fixes. Thisrelease is based on the branch-3.5 maintenance branch of Spark. We stronglyrecommend all 3.5 users to upgrade to this stable release.To download Spark 3.5.1, head over to the download page:https://spark.apache.org/downloads.htmlTo view the release notes:https://spark.apache.org/releases/spark-release-3-5-1.htmlWe would like to acknowledge all community members for contributing to thisrelease. This release would not have been possible without you.Jungtaek Limps. Yikun is helping us through releasing the official docker image for Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.




Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-29 Thread Xinrong Meng
Congratulations!

Thanks,
Xinrong

On Thu, Feb 29, 2024 at 11:16 AM Dongjoon Hyun 
wrote:

> Congratulations!
>
> Bests,
> Dongjoon.
>
> On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:
>
>> Congratulations!
>>
>>
>>
>> At 2024-02-28 17:43:25, "Jungtaek Lim" 
>> wrote:
>>
>> Hi everyone,
>>
>> We are happy to announce the availability of Spark 3.5.1!
>>
>> Spark 3.5.1 is a maintenance release containing stability fixes. This
>> release is based on the branch-3.5 maintenance branch of Spark. We
>> strongly
>> recommend all 3.5 users to upgrade to this stable release.
>>
>> To download Spark 3.5.1, head over to the download page:
>> https://spark.apache.org/downloads.html
>>
>> To view the release notes:
>> https://spark.apache.org/releases/spark-release-3-5-1.html
>>
>> We would like to acknowledge all community members for contributing to
>> this
>> release. This release would not have been possible without you.
>>
>> Jungtaek Lim
>>
>> ps. Yikun is helping us through releasing the official docker image for
>> Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.
>>
>>


Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-02-28 Thread Dongjoon Hyun
Congratulations!

Bests,
Dongjoon.

On Wed, Feb 28, 2024 at 11:43 AM beliefer  wrote:

> Congratulations!
>
>
>
> At 2024-02-28 17:43:25, "Jungtaek Lim" 
> wrote:
>
> Hi everyone,
>
> We are happy to announce the availability of Spark 3.5.1!
>
> Spark 3.5.1 is a maintenance release containing stability fixes. This
> release is based on the branch-3.5 maintenance branch of Spark. We strongly
> recommend all 3.5 users to upgrade to this stable release.
>
> To download Spark 3.5.1, head over to the download page:
> https://spark.apache.org/downloads.html
>
> To view the release notes:
> https://spark.apache.org/releases/spark-release-3-5-1.html
>
> We would like to acknowledge all community members for contributing to this
> release. This release would not have been possible without you.
>
> Jungtaek Lim
>
> ps. Yikun is helping us through releasing the official docker image for
> Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.
>
>


Re:[ANNOUNCE] Apache Spark 3.5.1 released

2024-02-28 Thread beliefer
Congratulations!







At 2024-02-28 17:43:25, "Jungtaek Lim"  wrote:

Hi everyone,


We are happy to announce the availability of Spark 3.5.1!

Spark 3.5.1 is a maintenance release containing stability fixes. This
release is based on the branch-3.5 maintenance branch of Spark. We strongly
recommend all 3.5 users to upgrade to this stable release.

To download Spark 3.5.1, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-5-1.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Jungtaek Lim



ps. Yikun is helping us through releasing the official docker image for Spark 
3.5.1 (Thanks Yikun!) It may take some time to be generally available.



[ANNOUNCE] Apache Spark 3.5.1 released

2024-02-28 Thread Jungtaek Lim
Hi everyone,

We are happy to announce the availability of Spark 3.5.1!

Spark 3.5.1 is a maintenance release containing stability fixes. This
release is based on the branch-3.5 maintenance branch of Spark. We strongly
recommend all 3.5 users to upgrade to this stable release.

To download Spark 3.5.1, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-5-1.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Jungtaek Lim

ps. Yikun is helping us through releasing the official docker image for
Spark 3.5.1 (Thanks Yikun!) It may take some time to be generally available.


Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-23 Thread Jungtaek Lim
Thanks for figuring this out - that is my bad. To clarify, the 3.5.1 RC2
doc was generated correctly for the VOTE; the breakage happened during the
finalization step.

I lost the build artifact for the docs (I followed the steps and removed the
docs from the dev dist before realizing I shouldn't have), and I accidentally
rebuilt them from the branch I had been using to debug an issue in the RC.

I'll rebuild the docs from the tag and submit a PR again.
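
As a rough sketch of that rebuild step - the commands are illustrative
assumptions, since the real build goes through the release tooling in the
Spark repo - the key point is pinning the checkout to the immutable tag so
a debugging branch can't leak in again:

# rebuild_docs_from_tag.py - illustrative sketch; shows only the checkout
# guard, not the actual docs build.
import subprocess

TAG = "v3.5.1"

def run(*cmd: str) -> str:
    res = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return res.stdout.strip()

run("git", "fetch", "origin", "--tags")
run("git", "checkout", f"tags/{TAG}")
# Fail fast if HEAD is not exactly the release tag.
head = run("git", "describe", "--tags", "--exact-match")
assert head == TAG, f"expected {TAG}, got {head}"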

On Sat, Feb 24, 2024 at 7:16 AM Dongjoon Hyun 
wrote:

> Hi, All.
>
> Unfortunately, the Apache Spark `3.5.1 RC2` document artifact seems to be
> generated from unknown source code instead of the correct source code of
> the tag, `3.5.1`.
>
> https://spark.apache.org/docs/3.5.1/
>
> [image: Screenshot 2024-02-23 at 14.13.07.png]
>
> Dongjoon.
>
>
>
> On Wed, Feb 21, 2024 at 7:15 AM Jungtaek Lim 
> wrote:
>
>> Thanks everyone for participating the vote! The vote passed.
>> I'll send out the vote result and proceed to the next steps.
>>
>> On Wed, Feb 21, 2024 at 4:36 PM Maxim Gekk 
>> wrote:
>>
>>> +1
>>>
>>> On Wed, Feb 21, 2024 at 9:50 AM Hyukjin Kwon 
>>> wrote:
>>>
>>>> +1
>>>>
>>>> On Tue, 20 Feb 2024 at 22:00, Cheng Pan  wrote:
>>>>
>>>>> +1 (non-binding)
>>>>>
>>>>> - Build successfully from source code.
>>>>> - Pass integration tests with Spark ClickHouse Connector[1]
>>>>>
>>>>> [1] https://github.com/housepower/spark-clickhouse-connector/pull/299
>>>>>
>>>>> Thanks,
>>>>> Cheng Pan
>>>>>
>>>>>
>>>>> > On Feb 20, 2024, at 10:56, Jungtaek Lim <
>>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>> >
>>>>> > Thanks Sean, let's continue the process for this RC.
>>>>> >
>>>>> > +1 (non-binding)
>>>>> >
>>>>> > - downloaded all files from URL
>>>>> > - checked signature
>>>>> > - extracted all archives
>>>>> > - ran all tests from source files in source archive file, via
>>>>> running "sbt clean test package" - Ubuntu 20.04.4 LTS, OpenJDK 17.0.9.
>>>>> >
>>>>> > Also bump to dev@ to encourage participation - looks like the
>>>>> timing is not good for US folks but let's see more days.
>>>>> >
>>>>> >
>>>>> > On Sat, Feb 17, 2024 at 1:49 AM Sean Owen  wrote:
>>>>> > Yeah let's get that fix in, but it seems to be a minor test only
>>>>> issue so should not block release.
>>>>> >
>>>>> > On Fri, Feb 16, 2024, 9:30 AM yangjie01  wrote:
>>>>> > Very sorry. When I was fixing `SPARK-45242 (
>>>>> https://github.com/apache/spark/pull/43594)`
>>>>> <https://github.com/apache/spark/pull/43594)>, I noticed that its
>>>>> `Affects Version` and `Fix Version` of SPARK-45242 were both 4.0, and I
>>>>> didn't realize that it had also been merged into branch-3.5, so I didn't
>>>>> advocate for SPARK-45357 to be backported to branch-3.5.
>>>>> >  As far as I know, the condition to trigger this test failure is:
>>>>> when using Maven to test the `connect` module, if  `sparkTestRelation` in
>>>>> `SparkConnectProtoSuite` is not the first `DataFrame` to be initialized,
>>>>> then the `id` of `sparkTestRelation` will no longer be 0. So, I think this
>>>>> is indeed related to the order in which Maven executes the test cases in
>>>>> the `connect` module.
>>>>> >  I have submitted a backport PR to branch-3.5, and if necessary, we
>>>>> can merge it to fix this test issue.
>>>>> >  Jie Yang
>>>>> >   发件人: Jungtaek Lim 
>>>>> > 日期: 2024年2月16日 星期五 22:15
>>>>> > 收件人: Sean Owen , Rui Wang 
>>>>> > 抄送: dev 
>>>>> > 主题: Re: [VOTE] Release Apache Spark 3.5.1 (RC2)
>>>>> >   I traced back relevant changes and got a sense of what happened.
>>>>> >   Yangjie figured out the issue via link. It's a tricky issue
>>>>> according to the comments from Yangjie - the test is dependent on ordering
>>>>> of execution for test suites. He said it does not fail in sbt, hence CI
>>>>> build couldn't catch it.
>>>>> > He fixed it via link, but we missed that the offending commit was
>>>>> also ported back to 3.5 as well, hence the fix wasn't ported back to 3.5.

Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-23 Thread Dongjoon Hyun
Hi, All.

Unfortunately, the Apache Spark `3.5.1 RC2` document artifact seems to be
generated from unknown source code instead of the correct source code of
the tag, `3.5.1`.

https://spark.apache.org/docs/3.5.1/

[image: Screenshot 2024-02-23 at 14.13.07.png]

Dongjoon.



On Wed, Feb 21, 2024 at 7:15 AM Jungtaek Lim 
wrote:

> Thanks everyone for participating the vote! The vote passed.
> I'll send out the vote result and proceed to the next steps.
>
> On Wed, Feb 21, 2024 at 4:36 PM Maxim Gekk 
> wrote:
>
>> +1
>>
>> On Wed, Feb 21, 2024 at 9:50 AM Hyukjin Kwon 
>> wrote:
>>
>>> +1
>>>
>>> On Tue, 20 Feb 2024 at 22:00, Cheng Pan  wrote:
>>>
>>>> +1 (non-binding)
>>>>
>>>> - Build successfully from source code.
>>>> - Pass integration tests with Spark ClickHouse Connector[1]
>>>>
>>>> [1] https://github.com/housepower/spark-clickhouse-connector/pull/299
>>>>
>>>> Thanks,
>>>> Cheng Pan
>>>>
>>>>
>>>> > On Feb 20, 2024, at 10:56, Jungtaek Lim 
>>>> wrote:
>>>> >
>>>> > Thanks Sean, let's continue the process for this RC.
>>>> >
>>>> > +1 (non-binding)
>>>> >
>>>> > - downloaded all files from URL
>>>> > - checked signature
>>>> > - extracted all archives
>>>> > - ran all tests from source files in source archive file, via running
>>>> "sbt clean test package" - Ubuntu 20.04.4 LTS, OpenJDK 17.0.9.
>>>> >
>>>> > Also bump to dev@ to encourage participation - looks like the timing
>>>> is not good for US folks but let's see more days.
>>>> >
>>>> >
>>>> > On Sat, Feb 17, 2024 at 1:49 AM Sean Owen  wrote:
>>>> > Yeah let's get that fix in, but it seems to be a minor test only
>>>> issue so should not block release.
>>>> >
>>>> > On Fri, Feb 16, 2024, 9:30 AM yangjie01  wrote:
>>>> > Very sorry. When I was fixing `SPARK-45242 (
>>>> https://github.com/apache/spark/pull/43594)`
>>>> <https://github.com/apache/spark/pull/43594)>, I noticed that its
>>>> `Affects Version` and `Fix Version` of SPARK-45242 were both 4.0, and I
>>>> didn't realize that it had also been merged into branch-3.5, so I didn't
>>>> advocate for SPARK-45357 to be backported to branch-3.5.
>>>> >  As far as I know, the condition to trigger this test failure is:
>>>> when using Maven to test the `connect` module, if  `sparkTestRelation` in
>>>> `SparkConnectProtoSuite` is not the first `DataFrame` to be initialized,
>>>> then the `id` of `sparkTestRelation` will no longer be 0. So, I think this
>>>> is indeed related to the order in which Maven executes the test cases in
>>>> the `connect` module.
>>>> >  I have submitted a backport PR to branch-3.5, and if necessary, we
>>>> can merge it to fix this test issue.
>>>> >  Jie Yang
>>>> >   发件人: Jungtaek Lim 
>>>> > 日期: 2024年2月16日 星期五 22:15
>>>> > 收件人: Sean Owen , Rui Wang 
>>>> > 抄送: dev 
>>>> > 主题: Re: [VOTE] Release Apache Spark 3.5.1 (RC2)
>>>> >   I traced back relevant changes and got a sense of what happened.
>>>> >   Yangjie figured out the issue via link. It's a tricky issue
>>>> according to the comments from Yangjie - the test is dependent on ordering
>>>> of execution for test suites. He said it does not fail in sbt, hence CI
>>>> build couldn't catch it.
>>>> > He fixed it via link, but we missed that the offending commit was
>>>> also ported back to 3.5 as well, hence the fix wasn't ported back to 3.5.
>>>> >   Surprisingly, I can't reproduce locally even with maven. In my
>>>> attempt to reproduce, SparkConnectProtoSuite was executed at third,
>>>> SparkConnectStreamingQueryCacheSuite, and ExecuteEventsManagerSuite, and
>>>> then SparkConnectProtoSuite. Maybe very specific to the environment, not
>>>> just maven? My env: MBP M1 pro chip, MacOS 14.3.1, Openjdk 17.0.9. I used
>>>> build/mvn (Maven 3.8.8).
>>>> >   I'm not 100% sure this is something we should fail the release as
>>>> it's a test only and sounds very environment dependent, but I'll respect
>>>> your call on vote.
>>>> >   Btw, looks like Rui also made a relevant fix via link (not to fix
>>>> the failing test but to fix other issues), but this also wasn't ported
>>>> back to 3.5. @Rui Wang Do you think this is a regression issue and
>>>> warrants a new RC?

[VOTE][RESULT] Release Apache Spark 3.5.1 (RC2)

2024-02-21 Thread Jungtaek Lim
The vote passes with 6 +1s (4 binding +1s).
Thanks to all who helped with the release!

(* = binding)
+1:
Jungtaek Lim
Wenchen Fan (*)
Cheng Pan
Xiao Li (*)
Hyukjin Kwon (*)
Maxim Gekk (*)

+0: None

-1: None


Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-21 Thread Jungtaek Lim
Thanks everyone for participating in the vote! The vote passed.
I'll send out the vote result and proceed to the next steps.

On Wed, Feb 21, 2024 at 4:36 PM Maxim Gekk 
wrote:

> +1
>
> On Wed, Feb 21, 2024 at 9:50 AM Hyukjin Kwon  wrote:
>
>> +1
>>
>> On Tue, 20 Feb 2024 at 22:00, Cheng Pan  wrote:
>>
>>> +1 (non-binding)
>>>
>>> - Build successfully from source code.
>>> - Pass integration tests with Spark ClickHouse Connector[1]
>>>
>>> [1] https://github.com/housepower/spark-clickhouse-connector/pull/299
>>>
>>> Thanks,
>>> Cheng Pan
>>>
>>>
>>> > On Feb 20, 2024, at 10:56, Jungtaek Lim 
>>> wrote:
>>> >
>>> > Thanks Sean, let's continue the process for this RC.
>>> >
>>> > +1 (non-binding)
>>> >
>>> > - downloaded all files from URL
>>> > - checked signature
>>> > - extracted all archives
>>> > - ran all tests from source files in source archive file, via running
>>> "sbt clean test package" - Ubuntu 20.04.4 LTS, OpenJDK 17.0.9.
>>> >
>>> > Also bump to dev@ to encourage participation - looks like the timing
>>> is not good for US folks but let's see more days.
>>> >
>>> >
>>> > On Sat, Feb 17, 2024 at 1:49 AM Sean Owen  wrote:
>>> > Yeah let's get that fix in, but it seems to be a minor test only issue
>>> so should not block release.
>>> >
>>> > On Fri, Feb 16, 2024, 9:30 AM yangjie01  wrote:
>>> > Very sorry. When I was fixing `SPARK-45242 (
>>> https://github.com/apache/spark/pull/43594)`
>>> <https://github.com/apache/spark/pull/43594)>, I noticed that its
>>> `Affects Version` and `Fix Version` of SPARK-45242 were both 4.0, and I
>>> didn't realize that it had also been merged into branch-3.5, so I didn't
>>> advocate for SPARK-45357 to be backported to branch-3.5.
>>> >  As far as I know, the condition to trigger this test failure is: when
>>> using Maven to test the `connect` module, if  `sparkTestRelation` in
>>> `SparkConnectProtoSuite` is not the first `DataFrame` to be initialized,
>>> then the `id` of `sparkTestRelation` will no longer be 0. So, I think this
>>> is indeed related to the order in which Maven executes the test cases in
>>> the `connect` module.
>>> >  I have submitted a backport PR to branch-3.5, and if necessary, we
>>> can merge it to fix this test issue.
>>> >  Jie Yang
>>> >   发件人: Jungtaek Lim 
>>> > 日期: 2024年2月16日 星期五 22:15
>>> > 收件人: Sean Owen , Rui Wang 
>>> > 抄送: dev 
>>> > 主题: Re: [VOTE] Release Apache Spark 3.5.1 (RC2)
>>> >   I traced back relevant changes and got a sense of what happened.
>>> >   Yangjie figured out the issue via link. It's a tricky issue
>>> according to the comments from Yangjie - the test is dependent on ordering
>>> of execution for test suites. He said it does not fail in sbt, hence CI
>>> build couldn't catch it.
>>> > He fixed it via link, but we missed that the offending commit was also
>>> ported back to 3.5 as well, hence the fix wasn't ported back to 3.5.
>>> >   Surprisingly, I can't reproduce locally even with maven. In my
>>> attempt to reproduce, SparkConnectProtoSuite was executed at third,
>>> SparkConnectStreamingQueryCacheSuite, and ExecuteEventsManagerSuite, and
>>> then SparkConnectProtoSuite. Maybe very specific to the environment, not
>>> just maven? My env: MBP M1 pro chip, MacOS 14.3.1, Openjdk 17.0.9. I used
>>> build/mvn (Maven 3.8.8).
>>> >   I'm not 100% sure this is something we should fail the release as
>>> it's a test only and sounds very environment dependent, but I'll respect
>>> your call on vote.
>>> >   Btw, looks like Rui also made a relevant fix via link (not to fix
>>> the failing test but to fix other issues), but this also wasn't ported back
>>> to 3.5. @Rui Wang Do you think this is a regression issue and warrants a
>>> new RC?
>>> >     On Fri, Feb 16, 2024 at 11:38 AM Sean Owen 
>>> wrote:
>>> > Is anyone seeing this Spark Connect test failure? then again, I have
>>> some weird issue with this env that always fails 1 or 2 tests that nobody
>>> else can replicate.
>>> >   - Test observe *** FAILED ***
>>> >   == FAIL: Plans do not match ===
>>> >   !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS
>>> max_val#0, sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric,
>>> [min(id#0) AS min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L],
>>> 44

Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-20 Thread Maxim Gekk
+1

On Wed, Feb 21, 2024 at 9:50 AM Hyukjin Kwon  wrote:

> +1
>
> On Tue, 20 Feb 2024 at 22:00, Cheng Pan  wrote:
>
>> +1 (non-binding)
>>
>> - Build successfully from source code.
>> - Pass integration tests with Spark ClickHouse Connector[1]
>>
>> [1] https://github.com/housepower/spark-clickhouse-connector/pull/299
>>
>> Thanks,
>> Cheng Pan
>>
>>
>> > On Feb 20, 2024, at 10:56, Jungtaek Lim 
>> wrote:
>> >
>> > Thanks Sean, let's continue the process for this RC.
>> >
>> > +1 (non-binding)
>> >
>> > - downloaded all files from URL
>> > - checked signature
>> > - extracted all archives
>> > - ran all tests from source files in source archive file, via running
>> "sbt clean test package" - Ubuntu 20.04.4 LTS, OpenJDK 17.0.9.
>> >
>> > Also bump to dev@ to encourage participation - looks like the timing
>> is not good for US folks but let's see more days.
>> >
>> >
>> > On Sat, Feb 17, 2024 at 1:49 AM Sean Owen  wrote:
>> > Yeah let's get that fix in, but it seems to be a minor test only issue
>> so should not block release.
>> >
>> > On Fri, Feb 16, 2024, 9:30 AM yangjie01  wrote:
>> > Very sorry. When I was fixing `SPARK-45242 (
>> https://github.com/apache/spark/pull/43594)`
>> <https://github.com/apache/spark/pull/43594)>, I noticed that its
>> `Affects Version` and `Fix Version` of SPARK-45242 were both 4.0, and I
>> didn't realize that it had also been merged into branch-3.5, so I didn't
>> advocate for SPARK-45357 to be backported to branch-3.5.
>> >  As far as I know, the condition to trigger this test failure is: when
>> using Maven to test the `connect` module, if  `sparkTestRelation` in
>> `SparkConnectProtoSuite` is not the first `DataFrame` to be initialized,
>> then the `id` of `sparkTestRelation` will no longer be 0. So, I think this
>> is indeed related to the order in which Maven executes the test cases in
>> the `connect` module.
>> >  I have submitted a backport PR to branch-3.5, and if necessary, we can
>> merge it to fix this test issue.
>> >  Jie Yang
>> >   发件人: Jungtaek Lim 
>> > 日期: 2024年2月16日 星期五 22:15
>> > 收件人: Sean Owen , Rui Wang 
>> > 抄送: dev 
>> > 主题: Re: [VOTE] Release Apache Spark 3.5.1 (RC2)
>> >   I traced back relevant changes and got a sense of what happened.
>> >   Yangjie figured out the issue via link. It's a tricky issue according
>> to the comments from Yangjie - the test is dependent on ordering of
>> execution for test suites. He said it does not fail in sbt, hence CI build
>> couldn't catch it.
>> > He fixed it via link, but we missed that the offending commit was also
>> ported back to 3.5 as well, hence the fix wasn't ported back to 3.5.
>> >   Surprisingly, I can't reproduce locally even with maven. In my
>> attempt to reproduce, SparkConnectProtoSuite was executed at third,
>> SparkConnectStreamingQueryCacheSuite, and ExecuteEventsManagerSuite, and
>> then SparkConnectProtoSuite. Maybe very specific to the environment, not
>> just maven? My env: MBP M1 pro chip, MacOS 14.3.1, Openjdk 17.0.9. I used
>> build/mvn (Maven 3.8.8).
>> >   I'm not 100% sure this is something we should fail the release as
>> it's a test only and sounds very environment dependent, but I'll respect
>> your call on vote.
>> >   Btw, looks like Rui also made a relevant fix via link (not to fix the
>> failing test but to fix other issues), but this also wasn't ported back to
>> 3.5. @Rui Wang Do you think this is a regression issue and warrants a new
>> RC?
>> > On Fri, Feb 16, 2024 at 11:38 AM Sean Owen 
>> wrote:
>> > Is anyone seeing this Spark Connect test failure? then again, I have
>> some weird issue with this env that always fails 1 or 2 tests that nobody
>> else can replicate.
>> >   - Test observe *** FAILED ***
>> >   == FAIL: Plans do not match ===
>> >   !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS
>> max_val#0, sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric,
>> [min(id#0) AS min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L],
>> 44
>> >+- LocalRelation , [id#0, name#0]
>>  +- LocalRelation , [id#0,
>> name#0] (PlanTest.scala:179)
>> >   On Thu, Feb 15, 2024 at 1:34 PM Jungtaek Lim <
>> kabhwan.opensou...@gmail.com> wrote:
>> > DISCLAIMER: RC for Apache Spark 3.5.1 starts with RC2 as I lately
>> figured out doc generation issue after tagging RC1.

Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-20 Thread Hyukjin Kwon
+1

On Tue, 20 Feb 2024 at 22:00, Cheng Pan  wrote:

> +1 (non-binding)
>
> - Build successfully from source code.
> - Pass integration tests with Spark ClickHouse Connector[1]
>
> [1] https://github.com/housepower/spark-clickhouse-connector/pull/299
>
> Thanks,
> Cheng Pan
>
>
> > On Feb 20, 2024, at 10:56, Jungtaek Lim 
> wrote:
> >
> > Thanks Sean, let's continue the process for this RC.
> >
> > +1 (non-binding)
> >
> > - downloaded all files from URL
> > - checked signature
> > - extracted all archives
> > - ran all tests from source files in source archive file, via running
> "sbt clean test package" - Ubuntu 20.04.4 LTS, OpenJDK 17.0.9.
> >
> > Also bump to dev@ to encourage participation - looks like the timing is
> not good for US folks but let's see more days.
> >
> >
> > On Sat, Feb 17, 2024 at 1:49 AM Sean Owen  wrote:
> > Yeah let's get that fix in, but it seems to be a minor test only issue
> so should not block release.
> >
> > On Fri, Feb 16, 2024, 9:30 AM yangjie01  wrote:
> > Very sorry. When I was fixing `SPARK-45242 (
> https://github.com/apache/spark/pull/43594)`
> <https://github.com/apache/spark/pull/43594)>, I noticed that its
> `Affects Version` and `Fix Version` of SPARK-45242 were both 4.0, and I
> didn't realize that it had also been merged into branch-3.5, so I didn't
> advocate for SPARK-45357 to be backported to branch-3.5.
> >  As far as I know, the condition to trigger this test failure is: when
> using Maven to test the `connect` module, if  `sparkTestRelation` in
> `SparkConnectProtoSuite` is not the first `DataFrame` to be initialized,
> then the `id` of `sparkTestRelation` will no longer be 0. So, I think this
> is indeed related to the order in which Maven executes the test cases in
> the `connect` module.
> >  I have submitted a backport PR to branch-3.5, and if necessary, we can
> merge it to fix this test issue.
> >  Jie Yang
> >   发件人: Jungtaek Lim 
> > 日期: 2024年2月16日 星期五 22:15
> > 收件人: Sean Owen , Rui Wang 
> > 抄送: dev 
> > 主题: Re: [VOTE] Release Apache Spark 3.5.1 (RC2)
> >   I traced back relevant changes and got a sense of what happened.
> >   Yangjie figured out the issue via link. It's a tricky issue according
> to the comments from Yangjie - the test is dependent on ordering of
> execution for test suites. He said it does not fail in sbt, hence CI build
> couldn't catch it.
> > He fixed it via link, but we missed that the offending commit was also
> ported back to 3.5 as well, hence the fix wasn't ported back to 3.5.
> >   Surprisingly, I can't reproduce locally even with maven. In my attempt
> to reproduce, SparkConnectProtoSuite was executed at third,
> SparkConnectStreamingQueryCacheSuite, and ExecuteEventsManagerSuite, and
> then SparkConnectProtoSuite. Maybe very specific to the environment, not
> just maven? My env: MBP M1 pro chip, MacOS 14.3.1, Openjdk 17.0.9. I used
> build/mvn (Maven 3.8.8).
> >   I'm not 100% sure this is something we should fail the release as it's
> a test only and sounds very environment dependent, but I'll respect your
> call on vote.
> >   Btw, looks like Rui also made a relevant fix via link (not to fix the
> failing test but to fix other issues), but this also wasn't ported back to
> 3.5. @Rui Wang Do you think this is a regression issue and warrants a new
> RC?
> > On Fri, Feb 16, 2024 at 11:38 AM Sean Owen  wrote:
> > Is anyone seeing this Spark Connect test failure? then again, I have
> some weird issue with this env that always fails 1 or 2 tests that nobody
> else can replicate.
> >   - Test observe *** FAILED ***
> >   == FAIL: Plans do not match ===
> >   !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS
> max_val#0, sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric,
> [min(id#0) AS min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L],
> 44
> >+- LocalRelation , [id#0, name#0]
>+- LocalRelation , [id#0, name#0]
> (PlanTest.scala:179)
> >   On Thu, Feb 15, 2024 at 1:34 PM Jungtaek Lim <
> kabhwan.opensou...@gmail.com> wrote:
> > DISCLAIMER: RC for Apache Spark 3.5.1 starts with RC2 as I lately
> figured out doc generation issue after tagging RC1.
> >   Please vote on releasing the following candidate as Apache Spark
> version 3.5.1.
> >
> > The vote is open until February 18th 9AM (PST) and passes if a majority
> +1 PMC votes are cast, with
> > a minimum of 3 +1 votes.
> >
> > [ ] +1 Release this package as Apache Spark 3.5.1
> > [ ] -1 Do not release this package because ...
> >
> >

Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-20 Thread Xiao Li
+1

Xiao

Cheng Pan  于2024年2月20日周二 04:59写道:

> +1 (non-binding)
>
> - Build successfully from source code.
> - Pass integration tests with Spark ClickHouse Connector[1]
>
> [1] https://github.com/housepower/spark-clickhouse-connector/pull/299
>
> Thanks,
> Cheng Pan
>
>
> > On Feb 20, 2024, at 10:56, Jungtaek Lim 
> wrote:
> >
> > Thanks Sean, let's continue the process for this RC.
> >
> > +1 (non-binding)
> >
> > - downloaded all files from URL
> > - checked signature
> > - extracted all archives
> > - ran all tests from source files in source archive file, via running
> "sbt clean test package" - Ubuntu 20.04.4 LTS, OpenJDK 17.0.9.
> >
> > Also bump to dev@ to encourage participation - looks like the timing is
> not good for US folks but let's see more days.
> >
> >
> > On Sat, Feb 17, 2024 at 1:49 AM Sean Owen  wrote:
> > Yeah let's get that fix in, but it seems to be a minor test only issue
> so should not block release.
> >
> > On Fri, Feb 16, 2024, 9:30 AM yangjie01  wrote:
> > Very sorry. When I was fixing `SPARK-45242 (
> https://github.com/apache/spark/pull/43594)`
> <https://github.com/apache/spark/pull/43594)>, I noticed that its
> `Affects Version` and `Fix Version` of SPARK-45242 were both 4.0, and I
> didn't realize that it had also been merged into branch-3.5, so I didn't
> advocate for SPARK-45357 to be backported to branch-3.5.
> >  As far as I know, the condition to trigger this test failure is: when
> using Maven to test the `connect` module, if  `sparkTestRelation` in
> `SparkConnectProtoSuite` is not the first `DataFrame` to be initialized,
> then the `id` of `sparkTestRelation` will no longer be 0. So, I think this
> is indeed related to the order in which Maven executes the test cases in
> the `connect` module.
> >  I have submitted a backport PR to branch-3.5, and if necessary, we can
> merge it to fix this test issue.
> >  Jie Yang
> >   发件人: Jungtaek Lim 
> > 日期: 2024年2月16日 星期五 22:15
> > 收件人: Sean Owen , Rui Wang 
> > 抄送: dev 
> > 主题: Re: [VOTE] Release Apache Spark 3.5.1 (RC2)
> >   I traced back relevant changes and got a sense of what happened.
> >   Yangjie figured out the issue via link. It's a tricky issue according
> to the comments from Yangjie - the test is dependent on ordering of
> execution for test suites. He said it does not fail in sbt, hence CI build
> couldn't catch it.
> > He fixed it via link, but we missed that the offending commit was also
> ported back to 3.5 as well, hence the fix wasn't ported back to 3.5.
> >   Surprisingly, I can't reproduce locally even with maven. In my attempt
> to reproduce, SparkConnectProtoSuite was executed at third,
> SparkConnectStreamingQueryCacheSuite, and ExecuteEventsManagerSuite, and
> then SparkConnectProtoSuite. Maybe very specific to the environment, not
> just maven? My env: MBP M1 pro chip, MacOS 14.3.1, Openjdk 17.0.9. I used
> build/mvn (Maven 3.8.8).
> >   I'm not 100% sure this is something we should fail the release as it's
> a test only and sounds very environment dependent, but I'll respect your
> call on vote.
> >   Btw, looks like Rui also made a relevant fix via link (not to fix the
> failing test but to fix other issues), but this also wasn't ported back to
> 3.5. @Rui Wang Do you think this is a regression issue and warrants a new
> RC?
> > On Fri, Feb 16, 2024 at 11:38 AM Sean Owen  wrote:
> > Is anyone seeing this Spark Connect test failure? then again, I have
> some weird issue with this env that always fails 1 or 2 tests that nobody
> else can replicate.
> >   - Test observe *** FAILED ***
> >   == FAIL: Plans do not match ===
> >   !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS
> max_val#0, sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric,
> [min(id#0) AS min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L],
> 44
> >+- LocalRelation , [id#0, name#0]
>+- LocalRelation , [id#0, name#0]
> (PlanTest.scala:179)
> >   On Thu, Feb 15, 2024 at 1:34 PM Jungtaek Lim <
> kabhwan.opensou...@gmail.com> wrote:
> > DISCLAIMER: RC for Apache Spark 3.5.1 starts with RC2 as I lately
> figured out doc generation issue after tagging RC1.
> >   Please vote on releasing the following candidate as Apache Spark
> version 3.5.1.
> >
> > The vote is open until February 18th 9AM (PST) and passes if a majority
> +1 PMC votes are cast, with
> > a minimum of 3 +1 votes.
> >
> > [ ] +1 Release this package as Apache Spark 3.5.1
> > [ ] -1 Do not release this package because ...
> >
> > To learn more about Apache Spark, please see https://spark.apache.org/

Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-20 Thread Cheng Pan
+1 (non-binding)

- Build successfully from source code.
- Pass integration tests with Spark ClickHouse Connector[1]

[1] https://github.com/housepower/spark-clickhouse-connector/pull/299

Thanks,
Cheng Pan


> On Feb 20, 2024, at 10:56, Jungtaek Lim  wrote:
> 
> Thanks Sean, let's continue the process for this RC.
> 
> +1 (non-binding)
> 
> - downloaded all files from URL
> - checked signature
> - extracted all archives
> - ran all tests from source files in source archive file, via running "sbt 
> clean test package" - Ubuntu 20.04.4 LTS, OpenJDK 17.0.9.
> 
> Also bump to dev@ to encourage participation - looks like the timing is not 
> good for US folks but let's see more days.
> 
> 
> On Sat, Feb 17, 2024 at 1:49 AM Sean Owen  wrote:
> Yeah let's get that fix in, but it seems to be a minor test only issue so 
> should not block release.
> 
> On Fri, Feb 16, 2024, 9:30 AM yangjie01  wrote:
> Very sorry. When I was fixing `SPARK-45242 
> (https://github.com/apache/spark/pull/43594)`, I noticed that its `Affects 
> Version` and `Fix Version` of SPARK-45242 were both 4.0, and I didn't realize 
> that it had also been merged into branch-3.5, so I didn't advocate for 
> SPARK-45357 to be backported to branch-3.5.
>  As far as I know, the condition to trigger this test failure is: when using 
> Maven to test the `connect` module, if  `sparkTestRelation` in 
> `SparkConnectProtoSuite` is not the first `DataFrame` to be initialized, then 
> the `id` of `sparkTestRelation` will no longer be 0. So, I think this is 
> indeed related to the order in which Maven executes the test cases in the 
> `connect` module.
>  I have submitted a backport PR to branch-3.5, and if necessary, we can merge 
> it to fix this test issue.
>  Jie Yang
>   发件人: Jungtaek Lim 
> 日期: 2024年2月16日 星期五 22:15
> 收件人: Sean Owen , Rui Wang 
> 抄送: dev 
> 主题: Re: [VOTE] Release Apache Spark 3.5.1 (RC2)
>   I traced back relevant changes and got a sense of what happened.
>   Yangjie figured out the issue via link. It's a tricky issue according to 
> the comments from Yangjie - the test is dependent on ordering of execution 
> for test suites. He said it does not fail in sbt, hence CI build couldn't 
> catch it.
> He fixed it via link, but we missed that the offending commit was also ported 
> back to 3.5 as well, hence the fix wasn't ported back to 3.5.
>   Surprisingly, I can't reproduce locally even with maven. In my attempt to 
> reproduce, SparkConnectProtoSuite was executed at third, 
> SparkConnectStreamingQueryCacheSuite, and ExecuteEventsManagerSuite, and then 
> SparkConnectProtoSuite. Maybe very specific to the environment, not just 
> maven? My env: MBP M1 pro chip, MacOS 14.3.1, Openjdk 17.0.9. I used 
> build/mvn (Maven 3.8.8).
>   I'm not 100% sure this is something we should fail the release as it's a 
> test only and sounds very environment dependent, but I'll respect your call 
> on vote.
>   Btw, looks like Rui also made a relevant fix via link (not to fix the 
> failing test but to fix other issues), but this also wasn't ported back to 
> 3.5. @Rui Wang Do you think this is a regression issue and warrants a new RC?
> On Fri, Feb 16, 2024 at 11:38 AM Sean Owen  wrote:
> Is anyone seeing this Spark Connect test failure? then again, I have some 
> weird issue with this env that always fails 1 or 2 tests that nobody else can 
> replicate. 
>   - Test observe *** FAILED ***
>   == FAIL: Plans do not match ===
>   !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS max_val#0, 
> sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric, [min(id#0) AS 
> min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L], 44
>+- LocalRelation , [id#0, name#0]       
>   +- LocalRelation , [id#0, name#0] 
> (PlanTest.scala:179)
>   On Thu, Feb 15, 2024 at 1:34 PM Jungtaek Lim  
> wrote:
> DISCLAIMER: RC for Apache Spark 3.5.1 starts with RC2 as I lately figured out 
> doc generation issue after tagging RC1.
>   Please vote on releasing the following candidate as Apache Spark version 
> 3.5.1.
> 
> The vote is open until February 18th 9AM (PST) and passes if a majority +1 
> PMC votes are cast, with
> a minimum of 3 +1 votes.
> 
> [ ] +1 Release this package as Apache Spark 3.5.1
> [ ] -1 Do not release this package because ...
> 
> To learn more about Apache Spark, please see https://spark.apache.org/
> 
> The tag to be voted on is v3.5.1-rc2 (commit 
> fd86f85e181fc2dc0f50a096855acf83a6cc5d9c):
> https://github.com/apache/spark/tree/v3.5.1-rc2
> 
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/

Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-19 Thread Wenchen Fan
+1, thanks for making the release!

On Sat, Feb 17, 2024 at 3:54 AM Sean Owen  wrote:

> Yeah let's get that fix in, but it seems to be a minor test only issue so
> should not block release.
>
> On Fri, Feb 16, 2024, 9:30 AM yangjie01  wrote:
>
>> Very sorry. When I was fixing `SPARK-45242 (
>> https://github.com/apache/spark/pull/43594)`
>> <https://github.com/apache/spark/pull/43594)>, I noticed that its
>> `Affects Version` and `Fix Version` of SPARK-45242 were both 4.0, and I
>> didn't realize that it had also been merged into branch-3.5, so I didn't
>> advocate for SPARK-45357 to be backported to branch-3.5.
>>
>>
>>
>> As far as I know, the condition to trigger this test failure is: when
>> using Maven to test the `connect` module, if  `sparkTestRelation` in
>> `SparkConnectProtoSuite` is not the first `DataFrame` to be initialized,
>> then the `id` of `sparkTestRelation` will no longer be 0. So, I think this
>> is indeed related to the order in which Maven executes the test cases in
>> the `connect` module.
>>
>>
>>
>> I have submitted a backport PR
>> <https://github.com/apache/spark/pull/45141> to branch-3.5, and if
>> necessary, we can merge it to fix this test issue.
>>
>>
>>
>> Jie Yang
>>
>>
>>
>> *发件人**: *Jungtaek Lim 
>> *日期**: *2024年2月16日 星期五 22:15
>> *收件人**: *Sean Owen , Rui Wang 
>> *抄送**: *dev 
>> *主题**: *Re: [VOTE] Release Apache Spark 3.5.1 (RC2)
>>
>>
>>
>> I traced back relevant changes and got a sense of what happened.
>>
>>
>>
>> Yangjie figured out the issue via link
>> <https://github.com/apache/spark/pull/43010#discussion_r1338737506>.
>> It's a tricky issue according to the comments from Yangjie - the test is
>> dependent on ordering of execution for test suites. He said it does not
>> fail in sbt, hence CI build couldn't catch it.
>>
>> He fixed it via link
>> <https://github.com/apache/spark/pull/43155>,
>> but we missed that the offending commit was also ported back to 3.5 as
>> well, hence the fix wasn't ported back to 3.5.
>>
>>
>>
>> Surprisingly, I can't reproduce locally even with maven. In my attempt to
>> reproduce, SparkConnectProtoSuite was executed at
>> third, SparkConnectStreamingQueryCacheSuite, and ExecuteEventsManagerSuite,
>> and then SparkConnectProtoSuite. Maybe very specific to the environment,
>> not just maven? My env: MBP M1 pro chip, MacOS 14.3.1, Openjdk 17.0.9. I
>> used build/mvn (Maven 3.8.8).
>>
>>
>>
>> I'm not 100% sure this is something we should fail the release as it's a
>> test only and sounds very environment dependent, but I'll respect your call
>> on vote.
>>
>>
>>
>> Btw, looks like Rui also made a relevant fix via link
>>  (not
>> to fix the failing test but to fix other issues), but this also wasn't
>> ported back to 3.5. @Rui Wang  Do you think this
>> is a regression issue and warrants a new RC?
>>
>>
>>
>>
>>
>> On Fri, Feb 16, 2024 at 11:38 AM Sean Owen  wrote:
>>
>> Is anyone seeing this Spark Connect test failure? then again, I have some
>> weird issue with this env that always fails 1 or 2 tests that nobody else
>> can replicate.
>>
>>
>>
>> - Test observe *** FAILED ***
>>   == FAIL: Plans do not match ===
>>   !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS
>> max_val#0, sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric,
>> [min(id#0) AS min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L],
>> 44
>>+- LocalRelation , [id#0, name#0]
>>   +- LocalRelation , [id#0, name#0]
>> (PlanTest.scala:179)
>>
>>
>>
>> On Thu, Feb 15, 2024 at 1:34 PM Jungtaek Lim <
>> kabhwan.opensou...@gmail.com> wrote:
>>
>> DISCLAIMER: RC for Apache Spark 3.5.1 starts with RC2 as I lately figured
>> out doc generation issue after tagging RC1.
>>
>>
>>
>> Please vote on releasing the following candidate as Apache Spark version
>> 3.5.1.
>>
>> The vote is open until February 18th 9AM (PST) and passes if a majority
>> +1 PMC votes are cast, with
>> a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 3.5.1
>> [ ] -1 Do not release this package because ...

Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-19 Thread Jungtaek Lim
Thanks Sean, let's continue the process for this RC.

+1 (non-binding)

- downloaded all files from URL
- checked signature
- extracted all archives
- ran all tests from source files in source archive file, via running "sbt
clean test package" - Ubuntu 20.04.4 LTS, OpenJDK 17.0.9.

Also bump to dev@ to encourage participation - looks like the timing is not
good for US folks but let's see more days.
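
For anyone who wants to repeat the signature/checksum part of the checks
above, here is a minimal sketch. The dist and KEYS URLs appear elsewhere in
this thread, but the archive name is an assumption, and curl/gpg/shasum
must be on PATH.

# verify_rc.py - sketch of RC artifact verification.
import subprocess

BASE = "https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/"
KEYS = "https://dist.apache.org/repos/dist/dev/spark/KEYS"
ARTIFACT = "spark-3.5.1-bin-hadoop3.tgz"  # assumed archive name

def run(*cmd: str) -> None:
    subprocess.run(cmd, check=True)

for url in (KEYS, BASE + ARTIFACT, BASE + ARTIFACT + ".asc",
            BASE + ARTIFACT + ".sha512"):
    run("curl", "-fsSLO", url)

run("gpg", "--import", "KEYS")
# A good signature ties the archive to a key published in the KEYS file.
run("gpg", "--verify", ARTIFACT + ".asc", ARTIFACT)
# If the .sha512 file isn't in shasum's format, compare the digest manually.
run("shasum", "-a", "512", "-c", ARTIFACT + ".sha512")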


On Sat, Feb 17, 2024 at 1:49 AM Sean Owen  wrote:

> Yeah let's get that fix in, but it seems to be a minor test only issue so
> should not block release.
>
> On Fri, Feb 16, 2024, 9:30 AM yangjie01  wrote:
>
>> Very sorry. When I was fixing `SPARK-45242 (
>> https://github.com/apache/spark/pull/43594)`
>> <https://github.com/apache/spark/pull/43594)>, I noticed that its
>> `Affects Version` and `Fix Version` of SPARK-45242 were both 4.0, and I
>> didn't realize that it had also been merged into branch-3.5, so I didn't
>> advocate for SPARK-45357 to be backported to branch-3.5.
>>
>>
>>
>> As far as I know, the condition to trigger this test failure is: when
>> using Maven to test the `connect` module, if  `sparkTestRelation` in
>> `SparkConnectProtoSuite` is not the first `DataFrame` to be initialized,
>> then the `id` of `sparkTestRelation` will no longer be 0. So, I think this
>> is indeed related to the order in which Maven executes the test cases in
>> the `connect` module.
>>
>>
>>
>> I have submitted a backport PR
>> <https://github.com/apache/spark/pull/45141> to branch-3.5, and if
>> necessary, we can merge it to fix this test issue.
>>
>>
>>
>> Jie Yang
>>
>>
>>
>> *发件人**: *Jungtaek Lim 
>> *日期**: *2024年2月16日 星期五 22:15
>> *收件人**: *Sean Owen , Rui Wang 
>> *抄送**: *dev 
>> *主题**: *Re: [VOTE] Release Apache Spark 3.5.1 (RC2)
>>
>>
>>
>> I traced back relevant changes and got a sense of what happened.
>>
>>
>>
>> Yangjie figured out the issue via link
>> <https://github.com/apache/spark/pull/43010#discussion_r1338737506>.
>> It's a tricky issue according to the comments from Yangjie - the test is
>> dependent on ordering of execution for test suites. He said it does not
>> fail in sbt, hence CI build couldn't catch it.
>>
>> He fixed it via link
>> <https://github.com/apache/spark/pull/43155>,
>> but we missed that the offending commit was also ported back to 3.5 as
>> well, hence the fix wasn't ported back to 3.5.
>>
>>
>>
>> Surprisingly, I can't reproduce locally even with maven. In my attempt to
>> reproduce, SparkConnectProtoSuite was executed at
>> third, SparkConnectStreamingQueryCacheSuite, and ExecuteEventsManagerSuite,
>> and then SparkConnectProtoSuite. Maybe very specific to the environment,
>> not just maven? My env: MBP M1 pro chip, MacOS 14.3.1, Openjdk 17.0.9. I
>> used build/mvn (Maven 3.8.8).
>>
>>
>>
>> I'm not 100% sure this is something we should fail the release as it's a
>> test only and sounds very environment dependent, but I'll respect your call
>> on vote.
>>
>>
>>
>> Btw, looks like Rui also made a relevant fix via link
>>  (not
>> to fix the failing test but to fix other issues), but this also wasn't
>> ported back to 3.5. @Rui Wang  Do you think this
>> is a regression issue and warrants a new RC?
>>
>>
>>
>>
>>
>> On Fri, Feb 16, 2024 at 11:38 AM Sean Owen  wrote:
>>
>> Is anyone seeing this Spark Connect test failure? then again, I have some
>> weird issue with this env that always fails 1 or 2 tests that nobody else
>> can replicate.
>>
>>
>>
>> - Test observe *** FAILED ***
>>   == FAIL: Plans do not match ===
>>   !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS
>> max_val#0, sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric,
>> [min(id#0) AS min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L],
>> 44
>>+- LocalRelation , [id#0, name#0]
>>   +- LocalRelation , [id#0, name#0]
>> (PlanTest.scala:179)
>>
>>
>>
>> On Thu, Feb 15, 2024 at 1:34 PM Jungtaek Lim <
>> kabhwan.opensou...@gmail.com> wrote:
>>
>> DISCLAIMER: RC for Apache Spark 3.5.1 starts with RC2 as I lately figured
>> out doc generation issue after tagging RC1.

Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-16 Thread Sean Owen
Yeah let's get that fix in, but it seems to be a minor test only issue so
should not block release.

On Fri, Feb 16, 2024, 9:30 AM yangjie01  wrote:

> Very sorry. When I was fixing `SPARK-45242 (
> https://github.com/apache/spark/pull/43594)`
> <https://github.com/apache/spark/pull/43594)>, I noticed that its
> `Affects Version` and `Fix Version` of SPARK-45242 were both 4.0, and I
> didn't realize that it had also been merged into branch-3.5, so I didn't
> advocate for SPARK-45357 to be backported to branch-3.5.
>
>
>
> As far as I know, the condition to trigger this test failure is: when
> using Maven to test the `connect` module, if  `sparkTestRelation` in
> `SparkConnectProtoSuite` is not the first `DataFrame` to be initialized,
> then the `id` of `sparkTestRelation` will no longer be 0. So, I think this
> is indeed related to the order in which Maven executes the test cases in
> the `connect` module.
>
>
>
> I have submitted a backport PR
> <https://github.com/apache/spark/pull/45141> to branch-3.5, and if
> necessary, we can merge it to fix this test issue.
>
>
>
> Jie Yang
>
>
>
> *发件人**: *Jungtaek Lim 
> *日期**: *2024年2月16日 星期五 22:15
> *收件人**: *Sean Owen , Rui Wang 
> *抄送**: *dev 
> *主题**: *Re: [VOTE] Release Apache Spark 3.5.1 (RC2)
>
>
>
> I traced back relevant changes and got a sense of what happened.
>
>
>
> Yangjie figured out the issue via link
> <https://github.com/apache/spark/pull/43010#discussion_r1338737506>.
> It's a tricky issue according to the comments from Yangjie - the test is
> dependent on ordering of execution for test suites. He said it does not
> fail in sbt, hence CI build couldn't catch it.
>
> He fixed it via link
> <https://github.com/apache/spark/pull/43155>,
> but we missed that the offending commit was also ported back to 3.5 as
> well, hence the fix wasn't ported back to 3.5.
>
>
>
> Surprisingly, I can't reproduce locally even with maven. In my attempt to
> reproduce, SparkConnectProtoSuite was executed at
> third, SparkConnectStreamingQueryCacheSuite, and ExecuteEventsManagerSuite,
> and then SparkConnectProtoSuite. Maybe very specific to the environment,
> not just maven? My env: MBP M1 pro chip, MacOS 14.3.1, Openjdk 17.0.9. I
> used build/mvn (Maven 3.8.8).
>
>
>
> I'm not 100% sure this is something we should fail the release as it's a
> test only and sounds very environment dependent, but I'll respect your call
> on vote.
>
>
>
> Btw, looks like Rui also made a relevant fix via link
>  (not
> to fix the failing test but to fix other issues), but this also wasn't
> ported back to 3.5. @Rui Wang  Do you think this is
> a regression issue and warrants a new RC?
>
>
>
>
>
> On Fri, Feb 16, 2024 at 11:38 AM Sean Owen  wrote:
>
> Is anyone seeing this Spark Connect test failure? then again, I have some
> weird issue with this env that always fails 1 or 2 tests that nobody else
> can replicate.
>
>
>
> - Test observe *** FAILED ***
>   == FAIL: Plans do not match ===
>   !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS
> max_val#0, sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric,
> [min(id#0) AS min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L],
> 44
>+- LocalRelation , [id#0, name#0]
>   +- LocalRelation , [id#0, name#0]
> (PlanTest.scala:179)
>
>
>
> On Thu, Feb 15, 2024 at 1:34 PM Jungtaek Lim 
> wrote:
>
> DISCLAIMER: RC for Apache Spark 3.5.1 starts with RC2 as I lately figured
> out doc generation issue after tagging RC1.
>
>
>
> Please vote on releasing the following candidate as Apache Spark version
> 3.5.1.
>
> The vote is open until February 18th 9AM (PST) and passes if a majority +1
> PMC votes are cast, with
> a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.5.1
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see https://spark.apache.org/
>
> The tag to be voted on is v3.5.1-rc2 (commit
> fd86f85e181fc2dc0f50a096855acf83a6cc5d9c):
> https://github.com/apache/spark/tree/v3.5.1-rc2
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/

Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-16 Thread yangjie01
Very sorry. When I was fixing `SPARK-45242
(https://github.com/apache/spark/pull/43594)`, I noticed that the `Affects
Version` and `Fix Version` of SPARK-45242 were both 4.0, and I didn't realize
that it had also been merged into branch-3.5, so I didn't advocate for
SPARK-45357 to be backported to branch-3.5.

As far as I know, the condition to trigger this test failure is: when using 
Maven to test the `connect` module, if  `sparkTestRelation` in 
`SparkConnectProtoSuite` is not the first `DataFrame` to be initialized, then 
the `id` of `sparkTestRelation` will no longer be 0. So, I think this is indeed 
related to the order in which Maven executes the test cases in the `connect` 
module.
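
To make the ordering dependence concrete, here is a tiny self-contained
analogue (plain Python, not Spark code): an object that takes its id from a
process-wide counter is only id 0 when it is constructed first, so a test
that hard-codes id 0 passes or fails depending on which suite ran earlier.
Running this script reproduces the "expected 0, got 44" shape of the failure.

import itertools

# Process-wide counter, analogous to the plan-id counter behind DataFrame
# creation in Spark Connect.
_next_id = itertools.count()

class Relation:
    def __init__(self) -> None:
        self.id = next(_next_id)

# An earlier suite consumes some ids...
warmup = [Relation() for _ in range(44)]

# ...so a later test written under the "I run first" assumption fails.
spark_test_relation = Relation()
assert spark_test_relation.id == 0, f"expected 0, got {spark_test_relation.id}"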

I have submitted a backport PR<https://github.com/apache/spark/pull/45141> to 
branch-3.5, and if necessary, we can merge it to fix this test issue.

Jie Yang

发件人: Jungtaek Lim 
日期: 2024年2月16日 星期五 22:15
收件人: Sean Owen , Rui Wang 
抄送: dev 
主题: Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

I traced back relevant changes and got a sense of what happened.

Yangjie figured out the issue via 
link<https://github.com/apache/spark/pull/43010#discussion_r1338737506>.
 It's a tricky issue according to the comments from Yangjie - the test is 
dependent on ordering of execution for test suites. He said it does not fail in 
sbt, hence CI build couldn't catch it.
He fixed it via 
link<https://github.com/apache/spark/pull/43155>,
 but we missed that the offending commit was also ported back to 3.5 as well, 
hence the fix wasn't ported back to 3.5.

Surprisingly, I can't reproduce locally even with maven. In my attempt to 
reproduce, SparkConnectProtoSuite was executed at third, 
SparkConnectStreamingQueryCacheSuite, and ExecuteEventsManagerSuite, and then 
SparkConnectProtoSuite. Maybe very specific to the environment, not just maven? 
My env: MBP M1 pro chip, MacOS 14.3.1, Openjdk 17.0.9. I used build/mvn (Maven 
3.8.8).

I'm not 100% sure this is something we should fail the release as it's a test 
only and sounds very environment dependent, but I'll respect your call on vote.

Btw, looks like Rui also made a relevant fix via 
link
 (not to fix the failing test but to fix other issues), but this also wasn't 
ported back to 3.5. @Rui Wang<mailto:amaliu...@apache.org> Do you think this is 
a regression issue and warrants a new RC?


On Fri, Feb 16, 2024 at 11:38 AM Sean Owen 
<sro...@gmail.com> wrote:
Is anyone seeing this Spark Connect test failure? Then again, I have some weird 
issue with this env that always fails 1 or 2 tests that nobody else can 
replicate.

- Test observe *** FAILED ***
  == FAIL: Plans do not match ===
  !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS max_val#0, 
sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric, [min(id#0) AS 
min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L], 44
   +- LocalRelation , [id#0, name#0] 
+- LocalRelation , [id#0, name#0] 
(PlanTest.scala:179)

On Thu, Feb 15, 2024 at 1:34 PM Jungtaek Lim 
<kabhwan.opensou...@gmail.com> wrote:
DISCLAIMER: RC for Apache Spark 3.5.1 starts with RC2 as I belatedly figured out 
a doc generation issue after tagging RC1.

Please vote on releasing the following candidate as Apache Spark version 3.5.1.

The vote is open until February 18th 9AM (PST) and passes if a majority of +1 
PMC votes are cast, with
a minimum of 3 +1 votes.

[ ] +1 Release this package as Apache Spark 3.5.1
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see 
https://spark.apache.org/

The tag to be voted on is v3.5.1-rc2 (commit 
fd86f85e181fc2dc0f50a096855acf83a6cc5d9c):
https://github.com/apache/spark/tree/v3.5.1-rc2

The release files, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/

Signatures used for Spark RCs can be found in this file:
https://dist.apache.org/repos/dist/dev/spark/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1452/

Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-16 Thread Jungtaek Lim
I traced back relevant changes and got a sense of what happened.

Yangjie figured out the issue via link
<https://github.com/apache/spark/pull/43010#discussion_r1338737506>. It's a
tricky issue according to the comments from Yangjie - the test depends on
the order in which the test suites are executed. He said it does not fail in
sbt, hence the CI build couldn't catch it.
He fixed it via link <https://github.com/apache/spark/pull/43155>, but we
missed that the offending commit had also been ported back to 3.5, hence
the fix wasn't ported back to 3.5.

Surprisingly, I can't reproduce it locally even with Maven. In my attempt to
reproduce, SparkConnectProtoSuite was executed third: first
SparkConnectStreamingQueryCacheSuite, then ExecuteEventsManagerSuite, and
then SparkConnectProtoSuite. Maybe it's very specific to the environment,
not just Maven? My env: MBP with M1 Pro chip, macOS 14.3.1, OpenJDK 17.0.9. I
used build/mvn (Maven 3.8.8).

I'm not 100% sure this is something we should fail the release for, as it's
test-only and sounds very environment-dependent, but I'll respect your call
on the vote.

Btw, it looks like Rui also made a relevant fix via link
<https://github.com/apache/spark/pull/43594> (not to fix the failing test
but to fix other issues), but this also wasn't ported back to 3.5. @Rui Wang
 Do you think this is a regression issue that warrants
a new RC?


On Fri, Feb 16, 2024 at 11:38 AM Sean Owen  wrote:

> Is anyone seeing this Spark Connect test failure? Then again, I have some
> weird issue with this env that always fails 1 or 2 tests that nobody else
> can replicate.
>
> - Test observe *** FAILED ***
>   == FAIL: Plans do not match ===
>   !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS
> max_val#0, sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric,
> [min(id#0) AS min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L],
> 44
>+- LocalRelation , [id#0, name#0]
>   +- LocalRelation , [id#0, name#0]
> (PlanTest.scala:179)
>
> On Thu, Feb 15, 2024 at 1:34 PM Jungtaek Lim 
> wrote:
>
>> DISCLAIMER: RC for Apache Spark 3.5.1 starts with RC2 as I belatedly figured
>> out a doc generation issue after tagging RC1.
>>
>> Please vote on releasing the following candidate as Apache Spark version
>> 3.5.1.
>>
>> The vote is open until February 18th 9AM (PST) and passes if a majority of
>> +1 PMC votes are cast, with
>> a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 3.5.1
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see https://spark.apache.org/
>>
>> The tag to be voted on is v3.5.1-rc2 (commit
>> fd86f85e181fc2dc0f50a096855acf83a6cc5d9c):
>> https://github.com/apache/spark/tree/v3.5.1-rc2
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1452/
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-docs/
>>
>> The list of bug fixes going into 3.5.1 can be found at the following URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12353495
>>
>> FAQ
>>
>> =
>> How can I help test this release?
>> =
>>
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> If you're working in PySpark you can set up a virtual env and install
>> the current RC via "pip install
>> https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/pyspark-3.5.1.tar.gz
>> "
>> and see if anything important breaks.
>> In Java/Scala, you can add the staging repository to your project's
>> resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with an out-of-date RC going forward).
>>
>> ===
>> What should happen to JIRA tickets still targeting 3.5.1?
>> ===
>>
>> The current list of open tickets targeted at 3.5.1 can be found at:
>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>> Version/s" = 3.5.1
>>
>> Committers should look at those and triage.

Re: [VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-15 Thread Sean Owen
Is anyone seeing this Spark Connect test failure? Then again, I have some
weird issue with this env that always fails 1 or 2 tests that nobody else
can replicate.

- Test observe *** FAILED ***
  == FAIL: Plans do not match ===
  !CollectMetrics my_metric, [min(id#0) AS min_val#0, max(id#0) AS
max_val#0, sum(id#0) AS sum(id)#0L], 0   CollectMetrics my_metric,
[min(id#0) AS min_val#0, max(id#0) AS max_val#0, sum(id#0) AS sum(id)#0L],
44
   +- LocalRelation , [id#0, name#0]
+- LocalRelation , [id#0, name#0]
(PlanTest.scala:179)

On Thu, Feb 15, 2024 at 1:34 PM Jungtaek Lim 
wrote:

> DISCLAIMER: RC for Apache Spark 3.5.1 starts with RC2 as I belatedly figured
> out a doc generation issue after tagging RC1.
>
> Please vote on releasing the following candidate as Apache Spark version
> 3.5.1.
>
> The vote is open until February 18th 9AM (PST) and passes if a majority of
> +1 PMC votes are cast, with
> a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.5.1
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see https://spark.apache.org/
>
> The tag to be voted on is v3.5.1-rc2 (commit
> fd86f85e181fc2dc0f50a096855acf83a6cc5d9c):
> https://github.com/apache/spark/tree/v3.5.1-rc2
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1452/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-docs/
>
> The list of bug fixes going into 3.5.1 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12353495
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark you can set up a virtual env and install
> the current RC via "pip install
> https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/pyspark-3.5.1.tar.gz
> "
> and see if anything important breaks.
> In Java/Scala, you can add the staging repository to your project's
> resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 3.5.1?
> ===
>
> The current list of open tickets targeted at 3.5.1 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 3.5.1
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>


Re: Re: [DISCUSS] Release Spark 3.5.1?

2024-02-15 Thread Jungtaek Lim
UPDATE: The vote thread is up now.
https://lists.apache.org/thread/f28h0brncmkoyv5mtsqtxx38hx309c2j


On Tue, Feb 6, 2024 at 11:30 AM Jungtaek Lim 
wrote:

> Thanks all for the positive feedback! Will figure out time to go through
> the RC process. Stay tuned!
>
> On Mon, Feb 5, 2024 at 7:46 AM Gengliang Wang  wrote:
>
>> +1
>>
>> On Sun, Feb 4, 2024 at 1:57 PM Hussein Awala  wrote:
>>
>>> +1
>>>
>>> On Sun, Feb 4, 2024 at 10:13 PM John Zhuge  wrote:
>>>
 +1

 John Zhuge


 On Sun, Feb 4, 2024 at 11:23 AM Santosh Pingale
  wrote:

> +1
>
> On Sun, Feb 4, 2024, 8:18 PM Xiao Li 
> wrote:
>
>> +1
>>
>> On Sun, Feb 4, 2024 at 6:07 AM beliefer  wrote:
>>
>>> +1
>>>
>>>
>>>
>>> On 2024-02-04 15:26:13, "Dongjoon Hyun"  wrote:
>>>
>>> +1
>>>
>>> On Sat, Feb 3, 2024 at 9:18 PM yangjie01 
>>> wrote:
>>>
 +1

On 2024/2/4 13:13, "Kent Yao" <y...@apache.org> wrote:


 +1


On Sat, Feb 3, 2024 at 21:14, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
 >
 > Hi dev,
 >
 > looks like there are a huge number of commits being pushed to
 branch-3.5 after 3.5.0 was released, 200+ commits.
 >
 > $ git log --oneline v3.5.0..HEAD | wc -l
 > 202
 >
 > Also, there are 180 JIRA tickets containing 3.5.1 as fixed
 version, and 10 resolved issues are either marked as blocker (even
 correctness issues) or critical, which justifies the release.
> https://issues.apache.org/jira/projects/SPARK/versions/12353495
 >
 > What do you think about releasing 3.5.1 with the current head of
 branch-3.5? I'm happy to volunteer as the release manager.
 >
 > Thanks,
 > Jungtaek Lim (HeartSaVioR)



 -
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org







 -
 To unsubscribe e-mail: dev-unsubscr...@spark.apache.org


>>
>> --
>>
>>


Re: Heads-up: Update on Spark 3.5.1 RC

2024-02-15 Thread Jungtaek Lim
UPDATE: Now the vote thread is up for RC2.
https://lists.apache.org/thread/f28h0brncmkoyv5mtsqtxx38hx309c2j

On Wed, Feb 14, 2024 at 2:59 AM Dongjoon Hyun 
wrote:

> Thank you for the update, Jungtaek.
>
> Dongjoon.
>
> On Tue, Feb 13, 2024 at 7:29 AM Jungtaek Lim 
> wrote:
>
>> Hi,
>>
>> Just a heads-up, since I didn't give an update for a week after the last
>> update from the discussion thread.
>>
>> I've been following the automated release process and encountered several
>> issues. Maybe I will file JIRA tickets and follow up with PRs.
>>
>> Issues I have figured out so far are 1) a Python library version issue in
>> the release docker image, and 2) a doc build failure in PySpark ML for
>> Spark Connect. I'm deferring submitting fixes until I see a dry run succeed.
>>
>> Btw, I optimistically ran the process without a dry run as GA had passed
>> (my bad), and the tag for RC1 was created before I saw the issues. Maybe
>> I'll need to start with RC2 after things are sorted out and the necessary
>> fixes have landed in branch-3.5.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>>


[VOTE] Release Apache Spark 3.5.1 (RC2)

2024-02-15 Thread Jungtaek Lim
DISCLAIMER: RC for Apache Spark 3.5.1 starts with RC2 as I belatedly figured
out a doc generation issue after tagging RC1.

Please vote on releasing the following candidate as Apache Spark version
3.5.1.

The vote is open until February 18th 9AM (PST) and passes if a majority of
+1 PMC votes are cast, with
a minimum of 3 +1 votes.

[ ] +1 Release this package as Apache Spark 3.5.1
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see https://spark.apache.org/

The tag to be voted on is v3.5.1-rc2 (commit
fd86f85e181fc2dc0f50a096855acf83a6cc5d9c):
https://github.com/apache/spark/tree/v3.5.1-rc2

The release files, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/

Signatures used for Spark RCs can be found in this file:
https://dist.apache.org/repos/dist/dev/spark/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1452/

The documentation corresponding to this release can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-docs/

The list of bug fixes going into 3.5.1 can be found at the following URL:
https://issues.apache.org/jira/projects/SPARK/versions/12353495

FAQ

=
How can I help test this release?
=

If you are a Spark user, you can help us test this release by taking
an existing Spark workload and running on this release candidate, then
reporting any regressions.

If you're working in PySpark you can set up a virtual env and install
the current RC via "pip install
https://dist.apache.org/repos/dist/dev/spark/v3.5.1-rc2-bin/pyspark-3.5.1.tar.gz
"
and see if anything important breaks.
In Java/Scala, you can add the staging repository to your project's
resolvers and test
with the RC (make sure to clean up the artifact cache before/after so
you don't end up building with an out-of-date RC going forward).
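
As an example, a minimal sbt setup for testing against the staging repository
might look like the following sketch (the Scala version and the spark-sql
module are assumptions - adapt them to your own project):

  // build.sbt - resolve the 3.5.1 RC2 artifacts from the staging repository.
  ThisBuild / scalaVersion := "2.12.18"

  resolvers += "Apache Spark 3.5.1 RC2 staging" at
    "https://repository.apache.org/content/repositories/orgapachespark-1452/"

  libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.5.1"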

===
What should happen to JIRA tickets still targeting 3.5.1?
===

The current list of open tickets targeted at 3.5.1 can be found at:
https://issues.apache.org/jira/projects/SPARK and search for "Target
Version/s" = 3.5.1

Committers should look at those and triage. Extremely important bug
fixes, documentation, and API tweaks that impact compatibility should
be worked on immediately. Everything else please retarget to an
appropriate release.

==
But my bug isn't fixed?
==

In order to make timely releases, we will typically not hold the
release unless the bug in question is a regression from the previous
release. That being said, if there is something which is a regression
that has not been correctly targeted please ping me or a committer to
help target the issue.


Re: Heads-up: Update on Spark 3.5.1 RC

2024-02-13 Thread Dongjoon Hyun
Thank you for the update, Jungtaek.

Dongjoon.

On Tue, Feb 13, 2024 at 7:29 AM Jungtaek Lim 
wrote:

> Hi,
>
> Just a heads-up, since I didn't give an update for a week after the last
> update from the discussion thread.
>
> I've been following the automated release process and encountered several
> issues. Maybe I will file JIRA tickets and follow up with PRs.
>
> Issues I have figured out so far are 1) a Python library version issue in
> the release docker image, and 2) a doc build failure in PySpark ML for
> Spark Connect. I'm deferring submitting fixes until I see a dry run succeed.
>
> Btw, I optimistically ran the process without a dry run as GA had passed
> (my bad), and the tag for RC1 was created before I saw the issues. Maybe
> I'll need to start with RC2 after things are sorted out and the necessary
> fixes have landed in branch-3.5.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
>


Heads-up: Update on Spark 3.5.1 RC

2024-02-13 Thread Jungtaek Lim
Hi,

Just a heads-up, since I didn't give an update for a week after the last
update from the discussion thread.

I've been following the automated release process and encountered several
issues. Maybe I will file JIRA tickets and follow up with PRs.

Issues I have figured out so far are 1) a Python library version issue in the
release docker image, and 2) a doc build failure in PySpark ML for Spark
Connect. I'm deferring submitting fixes until I see a dry run succeed.

Btw, I optimistically ran the process without a dry run as GA had passed (my
bad), and the tag for RC1 was created before I saw the issues. Maybe I'll need
to start with RC2 after things are sorted out and the necessary fixes have
landed in branch-3.5.

Thanks,
Jungtaek Lim (HeartSaVioR)


Re: Re: [DISCUSS] Release Spark 3.5.1?

2024-02-05 Thread Jungtaek Lim
Thanks all for the positive feedback! Will figure out time to go through
the RC process. Stay tuned!

On Mon, Feb 5, 2024 at 7:46 AM Gengliang Wang  wrote:

> +1
>
> On Sun, Feb 4, 2024 at 1:57 PM Hussein Awala  wrote:
>
>> +1
>>
>> On Sun, Feb 4, 2024 at 10:13 PM John Zhuge  wrote:
>>
>>> +1
>>>
>>> John Zhuge
>>>
>>>
>>> On Sun, Feb 4, 2024 at 11:23 AM Santosh Pingale
>>>  wrote:
>>>
 +1

 On Sun, Feb 4, 2024, 8:18 PM Xiao Li 
 wrote:

> +1
>
> On Sun, Feb 4, 2024 at 6:07 AM beliefer  wrote:
>
>> +1
>>
>>
>>
>> On 2024-02-04 15:26:13, "Dongjoon Hyun"  wrote:
>>
>> +1
>>
>> On Sat, Feb 3, 2024 at 9:18 PM yangjie01 
>> wrote:
>>
>>> +1
>>>
>>> On 2024/2/4 13:13, "Kent Yao" <y...@apache.org> wrote:
>>>
>>>
>>> +1
>>>
>>>
>>> On Sat, Feb 3, 2024 at 21:14, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>> >
>>> > Hi dev,
>>> >
>>> > looks like there are a huge number of commits being pushed to
>>> branch-3.5 after 3.5.0 was released, 200+ commits.
>>> >
>>> > $ git log --oneline v3.5.0..HEAD | wc -l
>>> > 202
>>> >
>>> > Also, there are 180 JIRA tickets containing 3.5.1 as fixed
>>> version, and 10 resolved issues are either marked as blocker (even
>>> correctness issues) or critical, which justifies the release.
>>> > https://issues.apache.org/jira/projects/SPARK/versions/12353495
>>> >
>>> > What do you think about releasing 3.5.1 with the current head of
>>> branch-3.5? I'm happy to volunteer as the release manager.
>>> >
>>> > Thanks,
>>> > Jungtaek Lim (HeartSaVioR)
>>>
>>>
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>
>>>
>>>
>>>
>>>
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>
>
> --
>
>


Re: Re: [DISCUSS] Release Spark 3.5.1?

2024-02-04 Thread Gengliang Wang
+1

On Sun, Feb 4, 2024 at 1:57 PM Hussein Awala  wrote:

> +1
>
> On Sun, Feb 4, 2024 at 10:13 PM John Zhuge  wrote:
>
>> +1
>>
>> John Zhuge
>>
>>
>> On Sun, Feb 4, 2024 at 11:23 AM Santosh Pingale
>>  wrote:
>>
>>> +1
>>>
>>> On Sun, Feb 4, 2024, 8:18 PM Xiao Li 
>>> wrote:
>>>
 +1

 On Sun, Feb 4, 2024 at 6:07 AM beliefer  wrote:

> +1
>
>
>
> On 2024-02-04 15:26:13, "Dongjoon Hyun"  wrote:
>
> +1
>
> On Sat, Feb 3, 2024 at 9:18 PM yangjie01 
> wrote:
>
>> +1
>>
>> On 2024/2/4 13:13, "Kent Yao" <y...@apache.org> wrote:
>>
>>
>> +1
>>
>>
>> On Sat, Feb 3, 2024 at 21:14, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>> >
>> > Hi dev,
>> >
>> > looks like there are a huge number of commits being pushed to
>> branch-3.5 after 3.5.0 was released, 200+ commits.
>> >
>> > $ git log --oneline v3.5.0..HEAD | wc -l
>> > 202
>> >
>> > Also, there are 180 JIRA tickets containing 3.5.1 as fixed version,
>> and 10 resolved issues are either marked as blocker (even correctness
>> issues) or critical, which justifies the release.
>> > https://issues.apache.org/jira/projects/SPARK/versions/12353495
>> >
>> > What do you think about releasing 3.5.1 with the current head of
>> branch-3.5? I'm happy to volunteer as the release manager.
>> >
>> > Thanks,
>> > Jungtaek Lim (HeartSaVioR)
>>
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>>
>>
>>
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>

 --




Re: Re: [DISCUSS] Release Spark 3.5.1?

2024-02-04 Thread Hussein Awala
+1

On Sun, Feb 4, 2024 at 10:13 PM John Zhuge  wrote:

> +1
>
> John Zhuge
>
>
> On Sun, Feb 4, 2024 at 11:23 AM Santosh Pingale
>  wrote:
>
>> +1
>>
>> On Sun, Feb 4, 2024, 8:18 PM Xiao Li 
>> wrote:
>>
>>> +1
>>>
>>> On Sun, Feb 4, 2024 at 6:07 AM beliefer  wrote:
>>>
 +1



On 2024-02-04 15:26:13, "Dongjoon Hyun"  wrote:

 +1

 On Sat, Feb 3, 2024 at 9:18 PM yangjie01 
 wrote:

> +1
>
> On 2024/2/4 13:13, "Kent Yao" <y...@apache.org> wrote:
>
>
> +1
>
>
> On Sat, Feb 3, 2024 at 21:14, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
> >
> > Hi dev,
> >
> > looks like there are a huge number of commits being pushed to
> branch-3.5 after 3.5.0 was released, 200+ commits.
> >
> > $ git log --oneline v3.5.0..HEAD | wc -l
> > 202
> >
> > Also, there are 180 JIRA tickets containing 3.5.1 as fixed version,
> and 10 resolved issues are either marked as blocker (even correctness
> issues) or critical, which justifies the release.
> > https://issues.apache.org/jira/projects/SPARK/versions/12353495
> >
> > What do you think about releasing 3.5.1 with the current head of
> branch-3.5? I'm happy to volunteer as the release manager.
> >
> > Thanks,
> > Jungtaek Lim (HeartSaVioR)
>
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>
>
>
>
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>
>>>
>>> --
>>>
>>>


Re: Re: [DISCUSS] Release Spark 3.5.1?

2024-02-04 Thread John Zhuge
+1

John Zhuge


On Sun, Feb 4, 2024 at 11:23 AM Santosh Pingale
 wrote:

> +1
>
> On Sun, Feb 4, 2024, 8:18 PM Xiao Li 
> wrote:
>
>> +1
>>
>> On Sun, Feb 4, 2024 at 6:07 AM beliefer  wrote:
>>
>>> +1
>>>
>>>
>>>
>>> On 2024-02-04 15:26:13, "Dongjoon Hyun"  wrote:
>>>
>>> +1
>>>
>>> On Sat, Feb 3, 2024 at 9:18 PM yangjie01 
>>> wrote:
>>>
 +1

On 2024/2/4 13:13, "Kent Yao" <y...@apache.org> wrote:


 +1


On Sat, Feb 3, 2024 at 21:14, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
 >
 > Hi dev,
 >
 > looks like there are a huge number of commits being pushed to
 branch-3.5 after 3.5.0 was released, 200+ commits.
 >
 > $ git log --oneline v3.5.0..HEAD | wc -l
 > 202
 >
 > Also, there are 180 JIRA tickets containing 3.5.1 as fixed version,
 and 10 resolved issues are either marked as blocker (even correctness
 issues) or critical, which justifies the release.
> https://issues.apache.org/jira/projects/SPARK/versions/12353495
 >
 > What do you think about releasing 3.5.1 with the current head of
 branch-3.5? I'm happy to volunteer as the release manager.
 >
 > Thanks,
 > Jungtaek Lim (HeartSaVioR)


 -
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org






 -
 To unsubscribe e-mail: dev-unsubscr...@spark.apache.org


>>
>> --
>>
>>


Re: Re: [DISCUSS] Release Spark 3.5.1?

2024-02-04 Thread Santosh Pingale
+1

On Sun, Feb 4, 2024, 8:18 PM Xiao Li  wrote:

> +1
>
> On Sun, Feb 4, 2024 at 6:07 AM beliefer  wrote:
>
>> +1
>>
>>
>>
>> On 2024-02-04 15:26:13, "Dongjoon Hyun"  wrote:
>>
>> +1
>>
>> On Sat, Feb 3, 2024 at 9:18 PM yangjie01 
>> wrote:
>>
>>> +1
>>>
>>> On 2024/2/4 13:13, "Kent Yao" <y...@apache.org> wrote:
>>>
>>>
>>> +1
>>>
>>>
>>> On Sat, Feb 3, 2024 at 21:14, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>>> >
>>> > Hi dev,
>>> >
>>> > looks like there are a huge number of commits being pushed to
>>> branch-3.5 after 3.5.0 was released, 200+ commits.
>>> >
>>> > $ git log --oneline v3.5.0..HEAD | wc -l
>>> > 202
>>> >
>>> > Also, there are 180 JIRA tickets containing 3.5.1 as fixed version,
>>> and 10 resolved issues are either marked as blocker (even correctness
>>> issues) or critical, which justifies the release.
>>> > https://issues.apache.org/jira/projects/SPARK/versions/12353495
>>> >
>>> > What do you think about releasing 3.5.1 with the current head of
>>> branch-3.5? I'm happy to volunteer as the release manager.
>>> >
>>> > Thanks,
>>> > Jungtaek Lim (HeartSaVioR)
>>>
>>>
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>
>>>
>>>
>>>
>>>
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>
>
> --
>
>


Re: Re: [DISCUSS] Release Spark 3.5.1?

2024-02-04 Thread Xiao Li
+1

On Sun, Feb 4, 2024 at 6:07 AM beliefer  wrote:

> +1
>
>
>
> On 2024-02-04 15:26:13, "Dongjoon Hyun"  wrote:
>
> +1
>
> On Sat, Feb 3, 2024 at 9:18 PM yangjie01 
> wrote:
>
>> +1
>>
>> On 2024/2/4 13:13, "Kent Yao" <y...@apache.org> wrote:
>>
>>
>> +1
>>
>>
>> On Sat, Feb 3, 2024 at 21:14, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>> >
>> > Hi dev,
>> >
>> > looks like there are a huge number of commits being pushed to
>> branch-3.5 after 3.5.0 was released, 200+ commits.
>> >
>> > $ git log --oneline v3.5.0..HEAD | wc -l
>> > 202
>> >
>> > Also, there are 180 JIRA tickets containing 3.5.1 as fixed version, and
>> 10 resolved issues are either marked as blocker (even correctness issues)
>> or critical, which justifies the release.
>> > https://issues.apache.org/jira/projects/SPARK/versions/12353495
>> >
>> > What do you think about releasing 3.5.1 with the current head of
>> branch-3.5? I'm happy to volunteer as the release manager.
>> >
>> > Thanks,
>> > Jungtaek Lim (HeartSaVioR)
>>
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>>
>>
>>
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>

--


Re: Re: [DISCUSS] Release Spark 3.5.1?

2024-02-04 Thread beliefer
+1







On 2024-02-04 15:26:13, "Dongjoon Hyun"  wrote:

+1



On Sat, Feb 3, 2024 at 9:18 PM yangjie01  wrote:

+1

On 2024/2/4 13:13, "Kent Yao" <y...@apache.org> wrote:


+1


On Sat, Feb 3, 2024 at 21:14, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>
> Hi dev,
>
> looks like there are a huge number of commits being pushed to branch-3.5 
> after 3.5.0 was released, 200+ commits.
>
> $ git log --oneline v3.5.0..HEAD | wc -l
> 202
>
> Also, there are 180 JIRA tickets containing 3.5.1 as fixed version, and 10 
> resolved issues are either marked as blocker (even correctness issues) or 
> critical, which justifies the release.
> https://issues.apache.org/jira/projects/SPARK/versions/12353495
>
> What do you think about releasing 3.5.1 with the current head of branch-3.5? 
> I'm happy to volunteer as the release manager.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)


-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org 







-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [DISCUSS] Release Spark 3.5.1?

2024-02-03 Thread Dongjoon Hyun
+1

On Sat, Feb 3, 2024 at 9:18 PM yangjie01 
wrote:

> +1
>
> On 2024/2/4 13:13, "Kent Yao" <y...@apache.org> wrote:
>
>
> +1
>
>
> On Sat, Feb 3, 2024 at 21:14, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
> >
> > Hi dev,
> >
> > looks like there are a huge number of commits being pushed to branch-3.5
> after 3.5.0 was released, 200+ commits.
> >
> > $ git log --oneline v3.5.0..HEAD | wc -l
> > 202
> >
> > Also, there are 180 JIRA tickets containing 3.5.1 as fixed version, and
> 10 resolved issues are either marked as blocker (even correctness issues)
> or critical, which justifies the release.
> > https://issues.apache.org/jira/projects/SPARK/versions/12353495
> >
> > What do you think about releasing 3.5.1 with the current head of
> branch-3.5? I'm happy to volunteer as the release manager.
> >
> > Thanks,
> > Jungtaek Lim (HeartSaVioR)
>
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>
>
>
>
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: [DISCUSS] Release Spark 3.5.1?

2024-02-03 Thread yangjie01
+1

On 2024/2/4 13:13, "Kent Yao" <y...@apache.org> wrote:


+1


On Sat, Feb 3, 2024 at 21:14, Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote:
>
> Hi dev,
>
> looks like there are a huge number of commits being pushed to branch-3.5 
> after 3.5.0 was released, 200+ commits.
>
> $ git log --oneline v3.5.0..HEAD | wc -l
> 202
>
> Also, there are 180 JIRA tickets containing 3.5.1 as fixed version, and 10 
> resolved issues are either marked as blocker (even correctness issues) or 
> critical, which justifies the release.
> https://issues.apache.org/jira/projects/SPARK/versions/12353495
>
> What do you think about releasing 3.5.1 with the current head of branch-3.5? 
> I'm happy to volunteer as the release manager.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)


-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org 







-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [DISCUSS] Release Spark 3.5.1?

2024-02-03 Thread Kent Yao
+1

On Sat, Feb 3, 2024 at 21:14, Jungtaek Lim  wrote:
>
> Hi dev,
>
> looks like there are a huge number of commits being pushed to branch-3.5 
> after 3.5.0 was released, 200+ commits.
>
> $ git log --oneline v3.5.0..HEAD | wc -l
> 202
>
> Also, there are 180 JIRA tickets containing 3.5.1 as fixed version, and 10 
> resolved issues are either marked as blocker (even correctness issues) or 
> critical, which justifies the release.
> https://issues.apache.org/jira/projects/SPARK/versions/12353495
>
> What do you think about releasing 3.5.1 with the current head of branch-3.5? 
> I'm happy to volunteer as the release manager.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



[DISCUSS] Release Spark 3.5.1?

2024-02-03 Thread Jungtaek Lim
Hi dev,

looks like there are a huge number of commits being pushed to branch-3.5
after 3.5.0 was released, 200+ commits.

$ git log --oneline v3.5.0..HEAD | wc -l
202

Also, there are 180 JIRA tickets containing 3.5.1 as fixed version, and 10
resolved issues are either marked as blocker (even correctness issues) or
critical, which justifies the release.
https://issues.apache.org/jira/projects/SPARK/versions/12353495

What do you think about releasing 3.5.1 with the current head of
branch-3.5? I'm happy to volunteer as the release manager.

Thanks,
Jungtaek Lim (HeartSaVioR)


Re: Spark 3.5.1

2024-01-31 Thread Jungtaek Lim
Hi,

I agree it's time to release 3.5.1. 10 resolved issues are either marked
as blocker (even correctness issues) or critical, which justifies the
release.

I had been trying to find the time to take a step, but had no luck with it.
I'll give it another try this week (it needs some time as I'm not familiar
with the Spark project's release process), and will seek another volunteer if I
can't make any progress.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Tue, Jan 30, 2024 at 7:15 PM Santosh Pingale
 wrote:

> Hey there
>
> The Spark 3.5 branch has accumulated 199 commits, with quite a few bug
> fixes related to correctness. Are there any plans for releasing 3.5.1?
>
> Kind regards
> Santosh
>


Spark 3.5.1

2024-01-30 Thread Santosh Pingale
Hey there

The Spark 3.5 branch has accumulated 199 commits, with quite a few bug
fixes related to correctness. Are there any plans for releasing 3.5.1?

Kind regards
Santosh