Actually, I forgot to add one more item. I want to mention that the community started a large effort to improve Structured Streaming performance, usability, APIs, and connectors (https://issues.apache.org/jira/browse/SPARK-40025), and we’d love to get feedback and contributions on that.
> On Aug 10, 2022, at 11:16 AM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
>
> It’s time to submit our quarterly report to the ASF board on Friday. Here is
> a draft, lmk if you have suggestions:
>
> =======================
>
> Description:
>
> Apache Spark is a fast and general-purpose engine for large-scale data
> processing. It offers high-level APIs in Java, Scala, Python, R and SQL as
> well as a rich set of libraries including stream processing, machine learning,
> and graph analytics.
>
> Issues for the board:
>
> - None
>
> Project status:
>
> - Apache Spark was honored to receive the SIGMOD System Award this year,
> given by SIGMOD (the ACM’s data management research organization) to
> impactful real-world and research systems.
>
> - We recently released Apache Spark 3.3.0, a feature release that improves
> join query performance via Bloom filters, increases the Pandas API coverage
> with the support of popular Pandas features such as datetime.timedelta and
> merge_asof, simplifies the migration from traditional data warehouses by
> improving ANSI SQL compliance and supporting dozens of new built-in
> functions, and boosts development productivity with better error handling,
> autocompletion, performance, and profiling.
>
> - We released Apache Spark 3.2.2, a bug fix release for the 3.2 line, on July
> 17th.
>
> - A Spark Project Improvement Proposal (SPIP) for Spark Connect was voted on
> and accepted. Spark Connect introduces a lightweight client/server API for
> Spark (https://issues.apache.org/jira/browse/SPARK-39375) that will allow
> applications to submit work to a remote Spark cluster without running the
> heavyweight query planner in the client, and will also decouple the client
> version from the server version, making it possible to update Spark without
> updating all the applications.
>
> - We added three new PMC members, Huaxin Gao, Gengliang Wang and Maxim Gekk,
> in June 2022.
>
> - We added a new committer, Xinrong Meng, in July 2022.
>
> Trademarks:
>
> - No changes since the last report.
>
> Latest releases:
>
> - Spark 3.3.0 was released on June 16, 2022.
> - Spark 3.2.2 was released on July 17, 2022.
> - Spark 3.1.3 was released on February 18, 2022.
>
> Committers and PMC:
>
> - The latest committer was added on July 13th, 2022 (Xinrong Meng).
> - The latest PMC member was added on June 28th, 2022 (Huaxin Gao).
>
> =======================