Hi all,

This is Danyaal Malik from Scala Teams. My team and I are interested in
creating a Scala client and maintaining it long term, the way the other
language clients are maintained. My question is: how do we submit our
project under the Apache Software Foundation? I went through
https://spark.apache.org/contributing.html, but it is more about
contributing to the Spark project itself.
Regards,
Danyaal Malik
Scala Teams

> On 05-Nov-2025, at 7:31 PM, Nimrod Ofek <[email protected]> wrote:
>
> Hi all,
>
> I wanted to offer a slightly different perspective on the project's
> long-term health. I see a compelling argument for prioritizing codebase
> simplification before investing heavily in a major language upgrade,
> especially given the Spark Connect option for users and developers.
>
> My main point centers on the value proposition of this significant change:
>
> Spark Connect as an alternative: For many users, the primary benefits of
> a major language upgrade, such as access to new features and APIs, are
> now substantially covered by Spark Connect. It already provides a
> comparable experience across many use cases, which reduces the urgency
> of a full internal transition.
>
> Impact on long-term maintainability: My primary concern is the
> cumulative impact of these changes on the project's technical debt. The
> codebase already carries complexities (e.g., the parallel support for
> DataSource V1 and V2, the mix of Java and Scala APIs, and, until
> recently, support for multiple Scala versions) that challenge
> readability and maintenance.
>
> Risk of further fragmentation: Layering on support for a new major
> language version (Scala 3), which necessarily differs from previous
> versions, risks further complicating the build matrix, the internal
> logic, and the project structure. I worry this could make the project
> even harder for new contributors to onboard into and for maintainers to
> manage future patches in.
>
> I propose we launch a focused initiative to tighten and consolidate the
> existing codebase. This would involve:
>
> API simplification: Creating a roadmap for the eventual deprecation and
> removal of older systems such as DataSource V1.
>
> Consolidation: Reducing the remaining areas of language or version
> fragmentation to make the existing code more straightforward.
>
> Project high-level design doc: a few pages, or a video, explaining the
> general flow and the most important classes, so that new contributors
> have a starting point.
>
> By investing in internal cleanup and simplification first, we ensure
> that any future feature or bug fix will be significantly less disruptive
> and more cost-effective, while support for new languages can be handled
> in a separate repo based on Spark Connect, so it won't impact the core
> project.
>
> Any thoughts about that?
>
> Best regards,
> Nimrod
>
> On Wed, Nov 5, 2025 at 9:55 AM Norbert Schultz
> <[email protected] <mailto:[email protected]>> wrote:
>> Hi Tanveer,
>>
>> The approach with Spark Connect from Dongjoon Hyun seems like a good
>> start if we want to run Scala 3 applications against a Spark backend.
>>
>> However, I would also like to see a Scala 3 build of Spark itself, as
>> it would make migrating existing applications easier.
>>
>> For that, it may be a good idea to start with a small fork to gather
>> more information:
>>
>> - Update https://github.com/apache/spark/pull/50474
>> - There don't seem to be many Scala macros in the codebase, and there
>>   is no Shapeless. Good.
>> - UDFs, Dataset, Encoders, ScalaReflection, etc. use TypeTag to build
>>   encoders. TypeTag should be replaced with a Spark-owned typeclass
>>   that can be implemented in Scala 2- and Scala 3-specific ways; the
>>   Scala 2 code can then still rely on TypeTags.
>> - Enable Scala 3.3.x on the code and see what breaks. Scala with sbt
>>   supports Scala-version-specific code paths (e.g. src/main/scala-3,
>>   src/main/scala-2); I am sure Maven can do this too. Scala 2-specific
>>   code goes to scala-2, and stubs should make it possible to compile
>>   under Scala 3.
>> - Implement the stubs for Scala 3 and see how it goes. TypeTags should
>>   possibly be replaceable by a combination of ClassTag and
>>   Mirror.ProductOf (guessing).
>>
>> This could also be done in a sub-project-wise fashion.
>>
>> The Scala 3 code style should stay as close to the existing Scala 2
>> style as possible, in order not to make things more complicated: brace
>> style, and no unnecessary new features.
>>
>> Note: I am not deep into the Spark source code.
>>
>> Kind regards,
>> Norbert
>>
>>> On 04.11.2025 at 12:10, Tanveer Zia <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>> Hi everyone,
>>>
>>> I'm Tanveer from Scala Teams. We're interested in contributing to the
>>> Scala 3 migration of Apache Spark, as referenced in SPARK-54150
>>> <https://issues.apache.org/jira/browse/SPARK-54150>.
>>>
>>> Could you please share the current status or any existing roadmap for
>>> this migration? We'd also appreciate guidance on how external
>>> contributors can best get involved or coordinate with the core team
>>> on next steps.
>>>
>>> Best regards,
>>> Tanveer Zia
>>> Scala Teams
>>
>> Reactive Core GmbH | Paul-Lincke-Ufer 8b | 10999 Berlin
>> Fon: +49 30 9832 4666 | Web: www.reactivecore.de
>> <http://www.reactivecore.de/>
>> Handelsregister: Amtsgericht Charlottenburg HRB 156696 B
>> Sitz: Berlin | Geschäftsführer: Norbert Schultz
>>
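
To make Norbert's TypeTag-replacement idea concrete, here is a minimal
Scala 3 sketch of a Spark-owned typeclass derived via Mirror.ProductOf.
All names here (FieldSchema, Person, the rendered schema string) are
hypothetical, not actual Spark APIs; it only illustrates the shape such
a typeclass could take. Brace style is used deliberately, following the
style suggestion above.

```scala
import scala.compiletime.{constValueTuple, erasedValue, summonInline}
import scala.deriving.Mirror

// Hypothetical Spark-owned typeclass: describes how a type maps to a
// schema string, with no dependency on scala.reflect TypeTags.
trait FieldSchema[T] {
  def schema: String
}

object FieldSchema {
  given intSchema: FieldSchema[Int] =
    new FieldSchema[Int] { def schema = "int" }
  given stringSchema: FieldSchema[String] =
    new FieldSchema[String] { def schema = "string" }

  // Collect the schema of every field type in a tuple of types.
  private inline def fieldSchemas[Elems <: Tuple]: List[String] =
    inline erasedValue[Elems] match {
      case _: (h *: t)   => summonInline[FieldSchema[h]].schema :: fieldSchemas[t]
      case _: EmptyTuple => Nil
    }

  // Derive an instance for any case class from its compiler-provided Mirror.
  inline given derived[P <: Product](using m: Mirror.ProductOf[P]): FieldSchema[P] = {
    val names = constValueTuple[m.MirroredElemLabels].toList.map(_.toString)
    val types = fieldSchemas[m.MirroredElemTypes]
    val ddl   = names.zip(types).map { case (n, t) => s"$n: $t" }
                     .mkString("struct<", ", ", ">")
    new FieldSchema[P] { def schema = ddl }
  }
}

case class Person(name: String, age: Int)

@main def demo(): Unit = {
  println(summon[FieldSchema[Person]].schema)  // struct<name: string, age: int>
}
```

A Scala 2 counterpart could provide an equivalent `derived` instance
backed by TypeTag (via a macro or runtime reflection), keeping call
sites identical across both Scala versions, which is the point of
routing everything through one Spark-owned typeclass.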
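
For reference, the version-specific source-directory layout mentioned
above can be sketched in sbt as follows. This is illustrative only:
Spark's real build is Maven-based, and the module name and version
numbers here are made up. sbt automatically compiles src/main/scala-2 or
src/main/scala-3 in addition to src/main/scala, depending on the Scala
version currently selected:

```
// build.sbt (sketch, hypothetical module): cross-compile for 2.13 and 3.3.
ThisBuild / crossScalaVersions := Seq("2.13.16", "3.3.6")
ThisBuild / scalaVersion       := "2.13.16"

lazy val core = (project in file("core"))
  .settings(
    name := "spark-scala3-experiment",
    // Shared code in src/main/scala; stubs and version-specific code in
    // src/main/scala-2 and src/main/scala-3 (default sbt convention).
    scalacOptions ++= {
      CrossVersion.partialVersion(scalaVersion.value) match {
        case Some((3, _)) => Seq.empty[String]  // Scala 3 flags go here
        case _            => Seq("-Xsource:3")  // ease the 2.13 -> 3 migration
      }
    }
  )
```

Running `sbt +compile` would then build the module once per Scala
version, with the scala-2/scala-3 directories supplying the stubs
Norbert describes.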
