Re: Time for Spark 3.4.0 release?

2023-01-17 Thread Hyukjin Kwon
Yeah, these look more like something we should discuss around RC timing.
See "Spark 3.4 release window" in
https://spark.apache.org/versioning-policy.html

On Wed, 18 Jan 2023 at 16:28, Enrico Minack  wrote:

> You are saying the RCs are cut from that branch at a later point? What is
> the estimated deadline for that?
>
> Enrico
>
>
> On 18.01.23 at 07:59, Hyukjin Kwon wrote:
>
> It looks like we can fix these after the branch cut, so it should be fine.
>
> On Wed, 18 Jan 2023 at 15:57, Enrico Minack 
> wrote:
>
>> Hi Xinrong,
>>
>> what about regression issue
>> https://issues.apache.org/jira/browse/SPARK-40819
>> and correctness issue https://issues.apache.org/jira/browse/SPARK-40885?
>>
>> The latter gets fixed by either
>> https://issues.apache.org/jira/browse/SPARK-41959 or
>> https://issues.apache.org/jira/browse/SPARK-42049.
>>
>> Are those considered important?
>>
>> Cheers,
>> Enrico
>>
>>
>> On 18.01.23 at 04:29, Xinrong Meng wrote:
>>
>> Hi All,
>>
>> Considering there are still important unresolved issues (some are shown
>> below), I would suggest that, to be conservative, we delay the branch-3.4
>> cut by one week.
>>
>> https://issues.apache.org/jira/browse/SPARK-39375
>> https://issues.apache.org/jira/browse/SPARK-41589
>> https://issues.apache.org/jira/browse/SPARK-42075
>> https://issues.apache.org/jira/browse/SPARK-25299
>> https://issues.apache.org/jira/browse/SPARK-41053
>>
>> I plan to cut *branch-3.4* at *18:30 PT, January 24, 2023*. Please
>> ensure your changes for Apache Spark 3.4 are ready by that time.
>>
>> Feel free to reply to the email if you have other ongoing big items for
>> Spark 3.4.
>>
>> Thanks,
>>
>> Xinrong Meng
>>
>> On Sat, Jan 7, 2023 at 9:16 AM Hyukjin Kwon  wrote:
>>
>>> Thanks Xinrong.
>>>
>>> On Sat, Jan 7, 2023 at 9:18 AM Xinrong Meng 
>>> wrote:
>>>
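>>>> The release window for Apache Spark 3.4.0 is updated per
>>>> https://github.com/apache/spark-website/pull/430.
>>>>
>>>> Thank you all!
>>>>
>>>> On Thu, Jan 5, 2023 at 2:10 PM Maxim Gekk 
>>>> wrote:
>>>>
>>>>> +1
>>>>>
>>>>> On Thu, Jan 5, 2023 at 12:25 AM huaxin gao 
>>>>> wrote:
>>>>>
>>>>>> +1 Thanks!
>>>>>>
>>>>>> On Wed, Jan 4, 2023 at 10:19 AM L. C. Hsieh  wrote:
>>>>>>
>>>>>>> +1
>>>>>>>
>>>>>>> Thank you!
>>>>>>>
>>>>>>> On Wed, Jan 4, 2023 at 9:13 AM Chao Sun  wrote:
>>>>>>>
>>>>>>>> +1, thanks!
>>>>>>>>
>>>>>>>> Chao
>>>>>>>>
>>>>>>>> On Wed, Jan 4, 2023 at 1:56 AM Mridul Muralidharan <
>>>>>>>> mri...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> +1, Thanks !
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Mridul
>>>>>>>>>
>>>>>>>>> On Wed, Jan 4, 2023 at 2:20 AM Gengliang Wang 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> +1, thanks for driving the release!
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Gengliang
>>>>>>>>>>
>>>>>>>>>> On Tue, Jan 3, 2023 at 10:55 PM Dongjoon Hyun <
>>>>>>>>>> dongjoon.h...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> +1
>>>>>>>>>>>
>>>>>>>>>>> Thank you!
>>>>>>>>>>>
>>>>>>>>>>> Dongjoon
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jan 3, 2023 at 9:44 PM Rui Wang 
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1 to cut the branch starting from a workday!
>>>>>>>>>>>>
>>>>>>>>>>>> Great to see this is happening!
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks Xinrong!
>>>>>>>>>>>>
>>>>>>>>>>>> -Rui
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jan 3, 2023 at 9:21 PM 416161...@qq.com <
>>>>>>>>>>>> ruife...@foxmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> +1, thank you Xinrong for driving this release!
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Ruifeng Zheng
>>>>>>>>>>>>> ruife...@foxmail.com
>>>>>>>>>>>>>
>>>>>>>>>>>>> -- Original --
>>>>>>>>>>>>> *From:* "Hyukjin Kwon" ;
>>>>>>>>>>>>> *Date:* Wed, Jan 4, 2023 01:15 PM
>>>>>>>>>>>>> *To:* "Xinrong Meng";
>>>>>>>>>>>>> *Cc:* "dev";
>>>>>>>>>>>>> *Subject:* Re: Time for Spark 3.4.0 release?
>>>>>>>>>>>>>
>>>>>>>>>>>>> SGTM +1
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jan 4, 2023 at 2:13 PM Xinrong Meng <
>>>>>>>>>>>>> xinrong.apa...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Shall we cut *branch-3.4* on *January 16th, 2023*? We
>>>>>>>>>>>>>> proposed January 15th per
>>>>>>>>>>>>>> https://spark.apache.org/versioning-policy.html, but I would
>>>>>>>>>>>>>> suggest we postpone one day since January 15th is a Sunday.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would like to volunteer as the release manager for *Apache
>>>>>>>>>>>>>> Spark 3.4.0*.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Xinrong Meng
>>>>>>>>>>>>>>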


Re: Time for Spark 3.4.0 release?

2023-01-17 Thread Enrico Minack
You are saying the RCs are cut from that branch at a later point? What 
is the estimated deadline for that?


Enrico



On 18.01.23 at 07:59, Hyukjin Kwon wrote:

It looks like we can fix these after the branch cut, so it should be fine.


Re: Time for Spark 3.4.0 release?

2023-01-17 Thread Hyukjin Kwon
It looks like we can fix these after the branch cut, so it should be fine.

On Wed, 18 Jan 2023 at 15:57, Enrico Minack  wrote:

> Hi Xinrong,
>
> what about regression issue
> https://issues.apache.org/jira/browse/SPARK-40819
> and correctness issue https://issues.apache.org/jira/browse/SPARK-40885?
>
> The latter gets fixed by either
> https://issues.apache.org/jira/browse/SPARK-41959 or
> https://issues.apache.org/jira/browse/SPARK-42049.
>
> Are those considered important?
>
> Cheers,
> Enrico


Re: Time for Spark 3.4.0 release?

2023-01-17 Thread Enrico Minack

Hi Xinrong,

what about regression issue 
https://issues.apache.org/jira/browse/SPARK-40819

and correctness issue https://issues.apache.org/jira/browse/SPARK-40885?

The latter gets fixed by either 
https://issues.apache.org/jira/browse/SPARK-41959 or 
https://issues.apache.org/jira/browse/SPARK-42049.


Are those considered important?

Cheers,
Enrico



Re: Time for Spark 3.4.0 release?

2023-01-17 Thread Jungtaek Lim
+1 on delaying. I see there's a JIRA ticket about DStream deprecation; we
are working on this. Thanks for taking this into account!

On Wed, Jan 18, 2023 at 12:43 PM, Hyukjin Kwon wrote:

> +1. Thanks for driving this, Xinrong.


Re: Time for Spark 3.4.0 release?

2023-01-17 Thread Hyukjin Kwon
+1. Thanks for driving this, Xinrong.



Re: Time for Spark 3.4.0 release?

2023-01-17 Thread Xinrong Meng
Hi All,

Considering there are still important unresolved issues (some are shown
below), I would suggest that, to be conservative, we delay the branch-3.4 cut
by one week.

https://issues.apache.org/jira/browse/SPARK-39375
https://issues.apache.org/jira/browse/SPARK-41589
https://issues.apache.org/jira/browse/SPARK-42075
https://issues.apache.org/jira/browse/SPARK-25299
https://issues.apache.org/jira/browse/SPARK-41053

I plan to cut *branch-3.4* at *18:30 PT, January 24, 2023*. Please ensure
your changes for Apache Spark 3.4 are ready by that time.

Feel free to reply to the email if you have other ongoing big items for
Spark 3.4.

Thanks,

Xinrong Meng




Re: [Suggest] Add geo function to core

2023-01-17 Thread Bjørn Jørgensen
Mosaic by Databricks Labs 



On Tue, Jan 17, 2023 at 15:53, Grigory Pomadchin wrote:

> Hey Mo,
>
> That is awesome, great to hear!
>
> Best,
>
> Grigory

Re: [Suggest] Add geo function to core

2023-01-17 Thread Grigory Pomadchin
Hey Mo,

That is awesome, great to hear!

Best,

Grigory


Re: [Suggest] Add geo function to core

2023-01-17 Thread Mo Sarwat
Grigory,

Thanks a lot for chiming in - I really like the PostGIS to PostgreSQL analogy.
That is exactly what Sedona (an Apache project) is to Spark. Spark core should
remain light / generic enough (similar to PostgreSQL) and all spatial
functionalities should be pluggable extensions (Sedona). Otherwise, the core
will be unnecessarily heavy to maintain, release, and integrate.

Sedona already supports geo-hashing among many other standard geospatial
functions, which work seamlessly with Spark without any issues for the end
user. If there is something missing, I would highly recommend that we bring it
to the Sedona community, and that will directly feed into the benefit of Spark
users who are doing geo.

Implementing geospatial functionality in the core Spark will be a replication 
of work done already. Databricks for instance already uses Sedona internally 
with their geospatial capabilities.

Finally, I would like to mention that I am totally willing to be corrected on 
that. Especially, if you tried Sedona with Spark and figured that it does not 
serve the purpose at all. But, please try it first and let's come up with a few 
capabilities it cannot provide unless it is implemented in Spark core. And, 
then we can suggest those capabilities to the Spark community.

Thanks,
-Mo
 

On 2023/01/17 03:09:06 Grigory Pomadchin wrote:
> Hey folks,
> 
> Traditionally GIS functionality is distributed a bit separately - i.e.
> PostGIS is a great example; and indeed for GIS needs Sedona / GeoMesa /
> GeoWave may work out; I think GeoMesa implements GeoHash (see
> https://www.geomesa.org/documentation/stable/user/spark/sparksql_functions.html
> -
> could be used as an inspiration at least);
> 
> I'm pretty sure Databricks provides some GIS functions (H3) at this point.
> Could be an argument for having something in the core / officially supported by
> Spark community?
> 
> I'd really love to see some relatively lightweight (JTS + Proj4j / SIS)
> library with basic expressions and optimization rules in the wild that is
> usable in the Spark native interfaces primarily; so there is no need to
> figure out the API / way to set it up and / or resolve peculiar
> dependencies. Could be a step towards Spark GIS types standardization.
> 
> Best,
> 
> Grigory
> 
> On Mon, Jan 16, 2023 at 6:21 PM Mo Sarwat  wrote:
> 
> > Martin, thanks for chiming in and mentioning Apache SIS. However, Mark was
> > asking about Geo in Spark, which Sedona already supports.
> >
> > Yet, I like the idea of making all dependencies within the Apache family.
> > I believe a good solution would be for you (or the SIS community at large)
> > to include Apache SIS in Sedona to replace libs like GeoTools. The Sedona
> > community would definitely welcome your contribution :)
> >
> > Regards,
> > -Mo
> >
> > On 2023/01/16 22:24:14 Martin Desruisseaux wrote:
> > > Hello Mark
> > >
> > > Indeed Sedona is surely a serious candidate. Maybe one aspect to take into
> > consideration, depending how "core" the geospatial services would be, is
> > that Sedona depends on a LGPL library (GeoTools, bundled separately) for
> > map projections, Shapefile and GeoTIFF support. So those features could not
> > be in core since category X dependencies shall be optional.
> > >
> > > Regarding referencing by coordinates (including map projections), I'm
> > aware of 3 libraries having a license compatible with Apache:
> > >
> > > * Apache SIS (Apache License)
> > > * PROJ4J (Apache license)
> > > * PROJ-JNI (MIT license)
> > >
> > > PROJ-JNI is a binding to PROJ native library using Java Native Interface
> > (JNI). PROJ is the most well known map projection library, but it is
> > difficult to bundle native code in a Java application.
> > >
> > > I'm not in a neutral position to say that, but I believe that Apache
> > SIS is the most powerful open source pure-Java referencing library. But it
> > is relatively big, about 4 MB for the referencing module with its
> > dependencies, not counting the optional EPSG geodetic dataset (because not
> > compatible with Apache license). Apache SIS is not the library with the
> > largest number of map projections (PROJ4J has more), but it handles some
> > difficult problems and scales well with three- or four-dimensional data (or
> > more).
> > >
> > > PROJ4J is a lightweight library which may be sufficient if data are
> > mostly two-dimensional (limited 3D support seems also possible) and if
> > uncertainty of a few metres in coordinate transformations (depending how
> > datum shifts are specified) is acceptable.
> > >
> > > It is possible to write some code in an implementation-independent way
> > using GeoAPI interfaces, which aim to do what JDBC interfaces do for
> > databases. Apache SIS and PROJ-JNI are implementations of GeoAPI
> > interfaces, so by using those interfaces you can let users choose among
> > those two implementations. I think that GeoAPI wrappers could easily be
> > contributed to PROJ4J as well if there is a desire