Thanks for the reply.  I am not fussed how the situation is addressed really, 
but I am just trying to keep the initiative alive.  This isn’t the first time I 
have tried to rescue it.
The feature would deliver great cost savings and possibly greater performance 
for my use case.
After the disappointment of seeing the GitHub PR closed due to inactivity, I was
unsure how to re-ignite things, and it struck me that the ASF board report might
be a way to highlight the issue.
I understand that Structured Streaming may not be the most common use case for
Spark and that Spark is, in and of itself, more of a batch-centric technology.
However, I strongly believe that DRA in the long-lived streaming context is
possibly even more important than DRA in the batch context.  Running a large
Hadoop/Spark cluster 24x7 is expensive and could really benefit from the
functionality that proper streaming-workload-based DRA could bring.
Also, knowing that the PR author has been running this DRA code successfully in
his own environment for quite some time now makes it more frustrating.  The code
had essentially been tested externally before the PR was even raised.  It seems
to be more than just a theoretical improvement to the codebase.

Regards,

Adam Hobbs



C2 - Internal Use

From: Jungtaek Lim <kabhwan.opensou...@gmail.com>
Sent: Tuesday, 11 February 2025 8:49 AM
To: adam.ho...@bendigoadelaide.com.au.invalid
Cc: Matei Zaharia <matei.zaha...@gmail.com>; Spark dev list 
<dev@spark.apache.org>
Subject: Re: ASF board report draft for February 2025

CAUTION: This email originated from outside of the organisation. Do not click 
links or open attachments unless you recognise the sender's full email address 
and know the content is safe.

Thanks Adam for your email.

I started to look at these changes when they were proposed, but I am not
familiar with DRA. It needed non-trivial context building for me to be
effective, which I could not prioritize. I also asked my team members to review
and they were involved, but even they lacked context on how DRA works and on its
long-term supportability and maintainability.

When possible I shepherd other initiatives (SPIP), such as the Arbitrary state
processing API. If there are folks in the community who understand DRA and its
implications in terms of maintenance, it would be nice for them to share the
load and shepherd the project.

In any case, this seems to be a prioritization conversation that can perhaps be 
taken in another thread and not block this ASF board report. Is that ok for you?

On Thu, Feb 6, 2025 at 2:30 PM Adam Hobbs
<adam.ho...@bendigoadelaide.com.au.invalid> wrote:
I'd like to add something around the failure to get any traction on shepherding
of the Structured Streaming DRA PR.  Multiple times now there have been calls
for help to get this initiative over the line and the response has been
disappointing.  The GitHub PR has been closed due to inaction
(https://github.com/apache/spark/pull/42352).

This seems like a bit of a failure in the process.

Regards,

Adam Hobbs


-----Original Message-----
From: Matei Zaharia <matei.zaha...@gmail.com>
Sent: Thursday, 6 February 2025 2:57 PM
To: Spark dev list <dev@spark.apache.org>
Cc: priv...@spark.apache.org
Subject: ASF board report draft for February 2025


It’s time to send our next ASF board report again on February 12th. Here’s an 
initial draft — feel free to suggest changes:

=====================


Description:

Apache Spark is a fast and general purpose engine for large-scale data 
processing. It offers high-level APIs in Java, Scala, Python, R and SQL as well 
as a rich set of libraries including stream processing, machine learning, and 
graph analytics.

Issues for the board:

- None

Project status:

- The Spark 4.0 branch has been cut and has entered the QA stage. We encourage 
the community to test it out!
- We released Spark 3.5.4 on December 20th, 2024.
- The PMC voted to add one new committer (Bingkun Pan) and one new PMC member 
(Jie Yang) to the project.
- The proposal to "Use plain text logs by default" was successfully passed.

Trademarks:

- No changes since last report.

Latest releases:

- Spark 3.5.4 was released on Dec 20, 2024
- Spark 3.4.4 was released on Oct 27, 2024
- Spark 4.0 Preview 2 was released on Sept 26, 2024

Committers and PMC:

- The latest committer was added on Nov 13, 2024 (Bingkun Pan).
- The latest PMC member was added on Jan 21st, 2025 (Jie Yang).

=====================
---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

********************************************************************************

This communication is intended only for use of the addressee and may contain 
legally privileged and confidential information.
If you are not the addressee or intended recipient, you are notified that any 
dissemination, copying or use of any of the information is unauthorised.

The legal privilege and confidentiality attached to this e-mail is not waived, 
lost or destroyed by reason of a mistaken delivery to you.
If you have received this message in error, we would appreciate an immediate
notification via e-mail to contac...@bendigoadelaide.com.au or by phoning 1300
BENDIGO (1300 236 344), and ask that the e-mail be permanently deleted from
your system.

Bendigo and Adelaide Bank Limited ABN 11 068 049 178

********************************************************************************

