Re: Contributions needed: 4 higher order functions

2022-12-01 Thread Hyukjin Kwon
Yes, I can. Please send me an email that includes your preferred id and
your full name. See also https://infra.apache.org/jira-guidelines.html#who

On Fri, 2 Dec 2022 at 08:00, jason carlson  wrote:

> Can someone make me a jira account?
>
> On Nov 30, 2022, at 5:35 AM, Hyukjin Kwon  wrote:
>
> 
> Hi all,
>
> There are four higher order functions in our backlog:
>
> - https://issues.apache.org/jira/browse/SPARK-41235
> - https://issues.apache.org/jira/browse/SPARK-41234
> - https://issues.apache.org/jira/browse/SPARK-41233
> - https://issues.apache.org/jira/browse/SPARK-41232
>
> Would be a great chance for new contributors to understand and get into
> Catalyst optimizer and Spark SQL.
>
> Any help on these tickets would be much appreciated.
>
>


Re: Contributions needed: 4 higher order functions

2022-12-01 Thread jason carlson
Can someone make me a jira account?

> On Nov 30, 2022, at 5:35 AM, Hyukjin Kwon  wrote:
> 
> 
> Hi all,
> 
> There are four higher order functions in our backlog:
> 
> - https://issues.apache.org/jira/browse/SPARK-41235
> - https://issues.apache.org/jira/browse/SPARK-41234
> - https://issues.apache.org/jira/browse/SPARK-41233
> - https://issues.apache.org/jira/browse/SPARK-41232
> 
> Would be a great chance for new contributors to understand and get into 
> Catalyst optimizer and Spark SQL.
> 
> Any help on these tickets would be much appreciated.


Re: Contributions needed: 4 higher order functions

2022-12-01 Thread Ankit Gupta
Hey Hyukjin / Devs

Please review the changes; the JIRA ticket and the pull request for it
are below.
https://issues.apache.org/jira/browse/SPARK-41232

https://github.com/apache/spark/pull/38865

Thanks and Regards

Ankit Prakash Gupta

On Wed, Nov 30, 2022 at 4:04 PM Hyukjin Kwon  wrote:

> Hi all,
>
> There are four higher order functions in our backlog:
>
> - https://issues.apache.org/jira/browse/SPARK-41235
> - https://issues.apache.org/jira/browse/SPARK-41234
> - https://issues.apache.org/jira/browse/SPARK-41233
> - https://issues.apache.org/jira/browse/SPARK-41232
>
> Would be a great chance for new contributors to understand and get into
> Catalyst optimizer and Spark SQL.
>
> Any help on these tickets would be much appreciated.
>


Re: Syndicate Apache Spark Twitter to Mastodon?

2022-12-01 Thread Holden Karau
The main negative I can think of is an additional account for the PMC
to maintain, so if we as a community don’t have many people on Mastodon yet
it might not be worth it. It would probably need about 20 minutes of setup
work to make the sync (most of it probably finding someone with the
Twitter credentials to enable the sync). The other tricky part is picking a
server (there is no default ASF server that I know of).

On Thu, Dec 1, 2022 at 8:03 AM Russell Spitzer 
wrote:

> Since this is just syndication I don't think arguments on the benefits of
> Twitter vs Mastodon are that important, it's really just what are the costs
> of additionally posting to Mastodon. I'm assuming those costs are basically
> 0 since this can be done by a bot? So I don't think there is any strong
> reason not to do so.
>
>
> On Nov 30, 2022, at 5:51 PM, Dmitry  wrote:
>
> My personal opinion: one of the main features of Twitter is that it is not
> federated, which makes it a good platform for announcements and so on. So "it
> would be good to reach our users where they are" means staying on Twitter
> (most companies that use Spark/Databricks are on Twitter).
> For federated features, I think Slack would be a better platform; a lot
> of Apache big data projects use Slack for that.
>
> On Thu, Dec 1, 2022 at 02:33, Holden Karau wrote:
>
>> I agree that there is probably a majority still on twitter, but it would
>> be a syndication (e.g. we'd keep both).
>>
>> As to the # of devs it's hard to say since:
>> 1) It's a federated service
>> 2) Figuring out if an account is a dev or not is hard
>>
>> But, for example,
>>
>> There seems to be roughly an aggregate 6 million users (
>> https://observablehq.com/@simonw/mastodon-users-and-statuses-over-time
>> ), which seems to be only about ~1% of Twitter's size.
>>
>> Nova's (large K8s focused I believe) has ~29k, tech.lgbt has ~6k, The BSD
>> mastodon has ~1k ( https://bsd.network/about )
>>
>> It's hard to say, but I've noticed a larger number of my tech affiliated
>> friends moving to Mastodon (personally I now do both).
>>
>> On Wed, Nov 30, 2022 at 3:17 PM Dmitry  wrote:
>>
>>> Hello,
>>> Are there any long-term statistics on the number of developers who have
>>> moved to Mastodon and on how active they are there?
>>>
>>> I believe most devs are still using Twitter.
>>>
>>>
>>> On Thu, Dec 1, 2022 at 01:35, Holden Karau wrote:
>>>
>>>> Do we want to start syndicating Apache Spark Twitter to a Mastodon
>>>> instance? It seems like a lot of software dev folks are moving over there
>>>> and it would be good to reach our users where they are.
>>>>
>>>> Any objections / concerns? Any thoughts on which server we should pick
>>>> if we do this?
>>>> --
>>>> Twitter: https://twitter.com/holdenkarau
>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>> https://amzn.to/2MaRAG9
>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau

>>>
>>
>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>
> --
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: Syndicate Apache Spark Twitter to Mastodon?

2022-12-01 Thread Russell Spitzer
Since this is just syndication I don't think arguments on the benefits of 
Twitter vs Mastodon are that important, it's really just what are the costs of 
additionally posting to Mastodon. I'm assuming those costs are basically 0 
since this can be done by a bot? So I don't think there is any strong reason 
not to do so.

> On Nov 30, 2022, at 5:51 PM, Dmitry  wrote:
> 
> My personal opinion: one of the main features of Twitter is that it is not 
> federated, which makes it a good platform for announcements and so on. So "it 
> would be good to reach our users where they are" means staying on Twitter 
> (most companies that use Spark/Databricks are on Twitter).
> For federated features, I think Slack would be a better platform; a lot of 
> Apache big data projects use Slack for that.
> 
> On Thu, Dec 1, 2022 at 02:33, Holden Karau wrote:
> I agree that there is probably a majority still on twitter, but it would be a 
> syndication (e.g. we'd keep both).
> 
> As to the # of devs it's hard to say since:
> 1) It's a federated service
> 2) Figuring out if an account is a dev or not is hard
> 
> But, for example,
> 
> There seems to be roughly an aggregate 6 million users ( 
> https://observablehq.com/@simonw/mastodon-users-and-statuses-over-time ), 
> which seems to be only about ~1% of Twitter's size.
> 
> Nova's (large K8s focused I believe) has ~29k, tech.lgbt has ~6k, The BSD 
> mastodon has ~1k ( https://bsd.network/about  ) 
> 
> It's hard to say, but I've noticed a larger number of my tech affiliated 
> friends moving to Mastodon (personally I now do both).
> 
> On Wed, Nov 30, 2022 at 3:17 PM Dmitry wrote:
> Hello, 
> Are there any long-term statistics on the number of developers who have 
> moved to Mastodon and on how active they are there?
> 
> I believe most devs are still using Twitter.
> 
> 
> On Thu, Dec 1, 2022 at 01:35, Holden Karau wrote:
> Do we want to start syndicating Apache Spark Twitter to a Mastodon instance? 
> It seems like a lot of software dev folks are moving over there and it would 
> be good to reach our users where they are.
> 
> Any objections / concerns? Any thoughts on which server we should pick if we 
> do this?
> -- 
> Twitter: https://twitter.com/holdenkarau 
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau 
> 
> 
> -- 
> Twitter: https://twitter.com/holdenkarau 
> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau 
> 


Re: [VOTE][SPIP] Asynchronous Offset Management in Structured Streaming

2022-12-01 Thread Wenchen Fan
+1

On Thu, Dec 1, 2022 at 12:31 PM Shixiong Zhu  wrote:

> +1
>
>
> On Wed, Nov 30, 2022 at 8:04 PM Hyukjin Kwon  wrote:
>
>> +1
>>
>> On Thu, 1 Dec 2022 at 12:39, Mridul Muralidharan 
>> wrote:
>>
>>>
>>> +1
>>>
>>> Regards,
>>> Mridul
>>>
>>> On Wed, Nov 30, 2022 at 8:55 PM Xingbo Jiang 
>>> wrote:
>>>
>>>> +1
>>>>
>>>> On Wed, Nov 30, 2022 at 5:59 PM Jungtaek Lim <
>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>
>>>>> Starting with +1 from me.
>>>>>
>>>>> On Thu, Dec 1, 2022 at 10:54 AM Jungtaek Lim <
>>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I'd like to start the vote for SPIP: Asynchronous Offset Management
>>>>>> in Structured Streaming.
>>>>>>
>>>>>> The high level summary of the SPIP is that we propose a couple of
>>>>>> improvements on offset management in microbatch execution to lower
>>>>>> processing latency, which would help for certain types of workloads.
>>>>>>
>>>>>> References:
>>>>>>
>>>>>>    - JIRA ticket
>>>>>>    - SPIP doc
>>>>>>    - Discussion thread
>>>>>>
>>>>>> Please vote on the SPIP for the next 72 hours:
>>>>>>
>>>>>> [ ] +1: Accept the proposal as an official SPIP
>>>>>> [ ] +0
>>>>>> [ ] -1: I don’t think this is a good idea because …
>>>>>>
>>>>>> Thanks!
>>>>>> Jungtaek Lim (HeartSaVioR)
>>>>>>
>>>>>