Re: ASF board report draft for February

2024-02-18 Thread Mich Talebzadeh
Np, thanks for addressing the point promptly

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom





 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, one test result is worth one thousand expert
opinions (Wernher von Braun).


On Sun, 18 Feb 2024 at 17:22, Matei Zaharia wrote:

> Thanks for the clarification. I updated it to say Comet is in the process
> of being open sourced.
>
> On Feb 18, 2024, at 1:55 AM, Mich Talebzadeh 
> wrote:
>
> Hi Matei,
>
> With regard to your last point
>
> "- Project Comet, a plugin designed to accelerate Spark query execution by
> leveraging DataFusion and Arrow, has been open-sourced under the Apache
> Arrow project. For more information, visit
> https://github.com/apache/arrow-datafusion-comet."
>
> If my understanding is correct (as of 15th February), I don't think the
> full project is open sourced yet, and I quote a response from the thread
> owner, Chao Sun:
>
> "Note that we haven't open sourced several features yet including shuffle
> support, which the aggregate operation depends on. Please stay tuned!"
>
> I would be inclined to leave that line out for now. The rest is fine.
>
> HTH
>
> Mich Talebzadeh,
> Dad | Technologist | Solutions Architect | Engineer
> London
> United Kingdom
>
> 
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed. It is essential to note
> that, as with any advice, one verified and tested result holds more weight
> than a thousand expert opinions.
>
>
> On Sat, 17 Feb 2024 at 19:23, Matei Zaharia 
> wrote:
>
>> Hi all,
>>
>> I missed some reminder emails about our board report this month, but here
>> is my draft. I’ll submit it tomorrow if that’s ok.
>>
>> ==
>>
>> Issues for the board:
>>
>> - None
>>
>> Project status:
>>
>> - We made two patch releases: Spark 3.3.4 (EOL release) on December 16,
>> 2023, and Spark 3.4.2 on November 30, 2023.
>> - We have begun voting for a Spark 3.5.1 maintenance release.
>> - The vote on "SPIP: Structured Streaming - Arbitrary State API v2" has
>> passed.
>> - We transitioned to an ASF-hosted analytics service, Matomo. For
>> details, visit
>> https://analytics.apache.org/index.php?module=CoreHome=index=yesterday=day=40
>> .
>> - Project Comet, a plugin designed to accelerate Spark query execution by
>> leveraging DataFusion and Arrow, has been open-sourced under the Apache
>> Arrow project. For more information, visit
>> https://github.com/apache/arrow-datafusion-comet.
>>
>> Trademarks:
>>
>> - No changes since the last report.
>>
>> Latest releases:
>>
>> - Spark 3.3.4 was released on December 16, 2023
>> - Spark 3.4.2 was released on November 30, 2023
>> - Spark 3.5.0 was released on September 13, 2023
>>
>> Committers and PMC:
>>
>> - The latest committer was added on Oct 2nd, 2023 (Jiaan Geng).
>> - The latest PMC members were added on Oct 2nd, 2023 (Yuanjian Li and
>> Yikun Jiang).
>>
>> ==
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>


Re: ASF board report draft for February

2024-02-18 Thread Matei Zaharia
Thanks for the clarification. I updated it to say Comet is in the process of 
being open sourced.

> On Feb 18, 2024, at 1:55 AM, Mich Talebzadeh  
> wrote:
> 
> Hi Matei,
> 
> With regard to your last point
> 
> "- Project Comet, a plugin designed to accelerate Spark query execution by 
> leveraging DataFusion and Arrow, has been open-sourced under the Apache Arrow 
> project. For more information, visit 
> https://github.com/apache/arrow-datafusion-comet."
> 
> If my understanding is correct (as of 15th February), I don't think the full
> project is open sourced yet, and I quote a response from the thread owner,
> Chao Sun:
> 
> "Note that we haven't open sourced several features yet including shuffle 
> support, which the aggregate operation depends on. Please stay tuned!" 
> 
> I would be inclined to leave that line out for now. The rest is fine.
> 
> HTH
> 
> Mich Talebzadeh,
> Dad | Technologist | Solutions Architect | Engineer
> London
> United Kingdom
> 
> 
> 
>  https://en.everybodywiki.com/Mich_Talebzadeh
> 
>  
> Disclaimer: The information provided is correct to the best of my knowledge
> but of course cannot be guaranteed. It is essential to note that, as with
> any advice, one verified and tested result holds more weight than a thousand
> expert opinions.
> 
> 
> On Sat, 17 Feb 2024 at 19:23, Matei Zaharia wrote:
>> Hi all,
>> 
>> I missed some reminder emails about our board report this month, but here is 
>> my draft. I’ll submit it tomorrow if that’s ok. 
>> 
>> ==
>> 
>> Issues for the board:
>> 
>> - None
>> 
>> Project status:
>> 
>> - We made two patch releases: Spark 3.3.4 (EOL release) on December 16, 
>> 2023, and Spark 3.4.2 on November 30, 2023.
>> - We have begun voting for a Spark 3.5.1 maintenance release.
>> - The vote on "SPIP: Structured Streaming - Arbitrary State API v2" has 
>> passed.
>> - We transitioned to an ASF-hosted analytics service, Matomo. For details, 
>> visit 
>> https://analytics.apache.org/index.php?module=CoreHome=index=yesterday=day=40.
>> - Project Comet, a plugin designed to accelerate Spark query execution by 
>> leveraging DataFusion and Arrow, has been open-sourced under the Apache 
>> Arrow project. For more information, visit 
>> https://github.com/apache/arrow-datafusion-comet.
>> 
>> Trademarks:
>> 
>> - No changes since the last report.
>> 
>> Latest releases:
>> 
>> - Spark 3.3.4 was released on December 16, 2023
>> - Spark 3.4.2 was released on November 30, 2023
>> - Spark 3.5.0 was released on September 13, 2023
>> 
>> Committers and PMC:
>> 
>> - The latest committer was added on Oct 2nd, 2023 (Jiaan Geng).
>> - The latest PMC members were added on Oct 2nd, 2023 (Yuanjian Li and Yikun 
>> Jiang).
>> 
>> ==
>> 
>> 



Re: ASF board report draft for February

2024-02-18 Thread Mich Talebzadeh
Hi Matei,

With regard to your last point

"- Project Comet, a plugin designed to accelerate Spark query execution by
leveraging DataFusion and Arrow, has been open-sourced under the Apache
Arrow project. For more information, visit
https://github.com/apache/arrow-datafusion-comet."

If my understanding is correct (as of 15th February), I don't think the
full project is open sourced yet, and I quote a response from the thread
owner, Chao Sun:

"Note that we haven't open sourced several features yet including shuffle
support, which the aggregate operation depends on. Please stay tuned!"

I would be inclined to leave that line out for now. The rest is fine.

HTH

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom





 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, one verified and tested result holds more weight
than a thousand expert opinions.


On Sat, 17 Feb 2024 at 19:23, Matei Zaharia wrote:

> Hi all,
>
> I missed some reminder emails about our board report this month, but here
> is my draft. I’ll submit it tomorrow if that’s ok.
>
> ==
>
> Issues for the board:
>
> - None
>
> Project status:
>
> - We made two patch releases: Spark 3.3.4 (EOL release) on December 16,
> 2023, and Spark 3.4.2 on November 30, 2023.
> - We have begun voting for a Spark 3.5.1 maintenance release.
> - The vote on "SPIP: Structured Streaming - Arbitrary State API v2" has
> passed.
> - We transitioned to an ASF-hosted analytics service, Matomo. For details,
> visit
> https://analytics.apache.org/index.php?module=CoreHome=index=yesterday=day=40
> .
> - Project Comet, a plugin designed to accelerate Spark query execution by
> leveraging DataFusion and Arrow, has been open-sourced under the Apache
> Arrow project. For more information, visit
> https://github.com/apache/arrow-datafusion-comet.
>
> Trademarks:
>
> - No changes since the last report.
>
> Latest releases:
>
> - Spark 3.3.4 was released on December 16, 2023
> - Spark 3.4.2 was released on November 30, 2023
> - Spark 3.5.0 was released on September 13, 2023
>
> Committers and PMC:
>
> - The latest committer was added on Oct 2nd, 2023 (Jiaan Geng).
> - The latest PMC members were added on Oct 2nd, 2023 (Yuanjian Li and
> Yikun Jiang).
>
> ==
>
>


Re: ASF board report draft for February

2024-02-18 Thread Dongjoon Hyun
+1, it looks good to me.

Thank you, Matei.

Dongjoon

On Sat, Feb 17, 2024 at 11:21 AM Matei Zaharia 
wrote:

> Hi all,
>
> I missed some reminder emails about our board report this month, but here
> is my draft. I’ll submit it tomorrow if that’s ok.
>
> ==
>
> Issues for the board:
>
> - None
>
> Project status:
>
> - We made two patch releases: Spark 3.3.4 (EOL release) on December 16,
> 2023, and Spark 3.4.2 on November 30, 2023.
> - We have begun voting for a Spark 3.5.1 maintenance release.
> - The vote on "SPIP: Structured Streaming - Arbitrary State API v2" has
> passed.
> - We transitioned to an ASF-hosted analytics service, Matomo. For details,
> visit
> https://analytics.apache.org/index.php?module=CoreHome=index=yesterday=day=40
> .
> - Project Comet, a plugin designed to accelerate Spark query execution by
> leveraging DataFusion and Arrow, has been open-sourced under the Apache
> Arrow project. For more information, visit
> https://github.com/apache/arrow-datafusion-comet.
>
> Trademarks:
>
> - No changes since the last report.
>
> Latest releases:
>
> - Spark 3.3.4 was released on December 16, 2023
> - Spark 3.4.2 was released on November 30, 2023
> - Spark 3.5.0 was released on September 13, 2023
>
> Committers and PMC:
>
> - The latest committer was added on Oct 2nd, 2023 (Jiaan Geng).
> - The latest PMC members were added on Oct 2nd, 2023 (Yuanjian Li and
> Yikun Jiang).
>
> ==
>
>


Re: ASF board report draft for February 2022

2022-02-09 Thread Matei Zaharia
Thanks, good idea.

> On Feb 8, 2022, at 12:25 PM, Mich Talebzadeh  
> wrote:
> 
> Hi,
> 
> I believe it would be beneficial to provide links to the SPIPs mentioned in
> the report:
> 
> - Two Spark Project Improvement Proposals (SPIPs) were recently accepted by
> the community, namely 1) Support for Customized Kubernetes Schedulers and
> 2) Storage Partitioned Join for Data Source V2.
> 
> HTH
> 
> 
>  
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
> damage or destruction of data or any other property which may arise from 
> relying on this email's technical content is explicitly disclaimed. The 
> author will in no case be liable for any monetary damages arising from such 
> loss, damage or destruction.
>  
> 
> 
> On Tue, 8 Feb 2022 at 09:06, Matei Zaharia wrote:
> It’s time to send our quarterly report to the ASF board again this Wednesday. 
> I’ve written the following draft for it — let me know if you want to add or 
> change anything.
> 
> ==
> 
> Description:
> 
> Apache Spark is a fast and general purpose engine for large-scale data
> processing. It offers high-level APIs in Java, Scala, Python, R and SQL as
> well as a rich set of libraries including stream processing, machine learning,
> and graph analytics.
> 
> Issues for the board:
> 
> - None
> 
> Project status:
> 
> - We released Apache Spark 3.2.1, a bug fix release for the 3.2 line, in 
> January.
> 
> - Two Spark Project Improvement Proposals (SPIPs) were recently accepted by 
> the community: Support for Customized Kubernetes Schedulers and Storage 
> Partitioned Join for Data Source V2.
> 
> - We’ve migrated away from Spark’s original Jenkins CI/CD infrastructure, 
> which was graciously hosted by UC Berkeley on their clusters since 2013, to 
> GitHub Actions. Thanks to the Berkeley CS department for hosting this for so 
> long!
> 
> - We added a new committer, Yuanjian Li, in December 2021.
> 
> - We added a new PMC member, Maciej Szymkiewicz, in January 2022.
> 
> Trademarks:
> 
> - No changes since the last report.
> 
> Latest releases:
> 
> - Spark 3.2.1 was released on January 26, 2022.
> - Spark 3.2.0 was released on October 13, 2021.
> - Spark 3.1.2 was released on June 23rd, 2021.
> 
> Committers and PMC:
> - The latest committer was added on Dec 20th, 2021 (Yuanjian Li).
> - The latest PMC member was added on Jan 19th, 2022 (Maciej Szymkiewicz).
> 
> ==
> 
> 



Re: ASF board report draft for February 2022

2022-02-08 Thread Mich Talebzadeh
Hi,

I believe it would be beneficial to provide links to the SPIPs mentioned in
the report:

- Two Spark Project Improvement Proposals (SPIPs) were recently accepted by
the community, namely 1) Support for Customized Kubernetes Schedulers and
2) Storage Partitioned Join for Data Source V2.


HTH






*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Tue, 8 Feb 2022 at 09:06, Matei Zaharia wrote:

> It’s time to send our quarterly report to the ASF board again this
> Wednesday. I’ve written the following draft for it — let me know if you
> want to add or change anything.
>
> ==
>
> Description:
>
> Apache Spark is a fast and general purpose engine for large-scale data
> processing. It offers high-level APIs in Java, Scala, Python, R and SQL as
> well as a rich set of libraries including stream processing, machine
> learning,
> and graph analytics.
>
> Issues for the board:
>
> - None
>
> Project status:
>
> - We released Apache Spark 3.2.1, a bug fix release for the 3.2 line, in
> January.
>
> - Two Spark Project Improvement Proposals (SPIPs) were recently accepted
> by the community: Support for Customized Kubernetes Schedulers and Storage
> Partitioned Join for Data Source V2.
>
> - We’ve migrated away from Spark’s original Jenkins CI/CD infrastructure,
> which was graciously hosted by UC Berkeley on their clusters since 2013, to
> GitHub Actions. Thanks to the Berkeley CS department for hosting this for
> so long!
>
> - We added a new committer, Yuanjian Li, in December 2021.
>
> - We added a new PMC member, Maciej Szymkiewicz, in January 2022.
>
> Trademarks:
>
> - No changes since the last report.
>
> Latest releases:
>
> - Spark 3.2.1 was released on January 26, 2022.
> - Spark 3.2.0 was released on October 13, 2021.
> - Spark 3.1.2 was released on June 23rd, 2021.
>
> Committers and PMC:
> - The latest committer was added on Dec 20th, 2021 (Yuanjian Li).
> - The latest PMC member was added on Jan 19th, 2022 (Maciej Szymkiewicz).
>
> ==
>
>