Hi all,

This is Danyaal Malik from Scala Teams. My team and I are interested in 
creating a Scala client and maintaining it long term, just as the other 
language clients are maintained. How would we go about submitting our project 
under the Apache foundation? I went through 
https://spark.apache.org/contributing.html, but it mostly covers 
contributions to the Spark project itself. 

Regards
Danyaal Malik
Scala Teams

> On 05-Nov-2025, at 7:31 PM, Nimrod Ofek <[email protected]> wrote:
> 
> Hi all,
> 
> I wanted to offer a slightly different perspective regarding the project's 
> long-term health. 
> I see a compelling argument for prioritizing efforts that address codebase 
> simplification before investing heavily in a major language upgrade, 
> especially given the Spark Connect option for users and developers.
> 
> My main point centers on the value proposition of this significant change:
> 
> Spark Connect as an Alternative: For many users, the primary benefits of a 
> major language upgrade, such as access to new features and APIs, are now 
> substantially covered by Spark Connect. It already provides a powerful, 
> comparable experience across many use cases, which suggests that the urgency 
> of a full internal transition is lower than it might appear.
> 
> Impact on Long-Term Maintainability: My primary concern is the cumulative 
> impact of these changes on the project’s technical debt. As the codebase 
> currently stands, there are existing complexities (e.g., the parallel support 
> for Datasource V1 and V2, the mix of Java and Scala APIs, and, until 
> recently, the support of multiple Scala versions) that already challenge 
> readability and maintenance.
> 
> Risk of Further Fragmentation: Layering on support for a new major language 
> version (Scala 3), which necessarily differs from previous versions, risks 
> further complicating the build matrix, the internal logic, and the project 
> structure. I worry this could make it even harder to onboard new 
> contributors and to manage future patches.
> 
> 
> I propose we launch a focused initiative to tighten and consolidate the 
> existing codebase. This would involve:
> 
> API Simplification: Creating a roadmap for the eventual deprecation and 
> removal of older systems like Datasource V1.
> 
> Consolidation: Reducing the remaining areas of language or version 
> fragmentation to make the existing code more straightforward.
> 
> High-Level Design Doc: a few pages, or a video, that explains the general 
> flow and some of the most important classes, giving new contributors a 
> starting point. 
> 
> By investing in internal cleanup and simplification first, we ensure that any 
> future feature or bug fix will be significantly less disruptive and more 
> cost-effective, while support for new languages is handled in separate 
> repositories based on Spark Connect, so it won’t impact the core project.
> Any thoughts about that?
> 
> 
> 
> Best regards, 
> Nimrod
> 
> 
> 
> On Wed, Nov 5, 2025 at 9:55 AM Norbert Schultz 
> <[email protected] <mailto:[email protected]>> 
> wrote:
>> Hi Tanveer,
>> 
>> The approach with Spark Connect from Dongjoon Hyun seems like a good start 
>> if we want to run Scala 3 applications with a Spark backend.
>> 
>> However, I would also like to see a Scala 3 build of Spark itself, as it 
>> would make migrating existing applications easier.
>> 
>> For that, it might be a good idea to start with a small fork to gather 
>> more information:
>> 
>> - Update https://github.com/apache/spark/pull/50474  
>> - There don’t seem to be many Scala macros in the codebase, and there is 
>> no Shapeless. Good.
>> - UDFs, Dataset, Encoders, ScalaReflection, etc. use TypeTag to derive 
>> encoders. This should be replaced with a Spark-owned typeclass, which can 
>> then describe the Scala 2- and Scala 3-specific derivations. The Scala 2 
>> code can then still rely on TypeTags.
>> - Enable Scala 3.3.x on the code and see what breaks. Scala with sbt 
>> supports Scala-version-specific code paths (e.g. src/main/scala-3, 
>> src/main/scala-2), and I am sure Maven can do this too. Scala-2-specific 
>> code goes to scala-2; stubs should make it possible to compile under 
>> Scala 3.
>> - Implement the stubs for Scala 3 and see how it goes. TypeTags should 
>> possibly be replaceable by a combination of ClassTag and Mirror.ProductOf 
>> (guessing).
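A Spark-owned typeclass along the lines suggested above could look roughly like the sketch below. All names here (`SparkTypeInfo`, `FieldInfo`, `encoderFields`) are hypothetical, not existing Spark APIs. In a real split, the Scala 2 instance would be derived from a `TypeTag` in `src/main/scala-2`, and the Scala 3 instance from `ClassTag` plus `Mirror.ProductOf` in `src/main/scala-3`; a hand-written instance stands in for either derivation here:

```scala
// Minimal description of a field, enough for an encoder to be built from.
final case class FieldInfo(name: String, tpe: String)

final case class Person(name: String, age: Int)

// Hypothetical Spark-owned typeclass replacing direct TypeTag usage.
// Version-specific derivations would live in src/main/scala-2 and
// src/main/scala-3 respectively; callers depend only on this trait.
trait SparkTypeInfo[T] {
  def runtimeClass: Class[_]
  def fields: Seq[FieldInfo]
}

object SparkTypeInfo {
  def apply[T](implicit ti: SparkTypeInfo[T]): SparkTypeInfo[T] = ti

  // Stand-in instance: in scala-2 sources this would be derived from a
  // TypeTag, in scala-3 sources from Mirror.ProductOf + ClassTag.
  implicit val personInfo: SparkTypeInfo[Person] = new SparkTypeInfo[Person] {
    def runtimeClass: Class[_] = classOf[Person]
    def fields: Seq[FieldInfo] =
      Seq(FieldInfo("name", "String"), FieldInfo("age", "Int"))
  }
}

// An encoder entry point would then take the typeclass instead of a TypeTag.
def encoderFields[T](implicit ti: SparkTypeInfo[T]): Seq[String] =
  ti.fields.map(f => s"${f.name}: ${f.tpe}")
```

The point of the indirection is that callers such as Encoders or ScalaReflection would depend only on the typeclass, so TypeTag becomes an implementation detail of the Scala 2 source tree.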
>> 
>> This could also be done sub-project by sub-project.
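The version-specific source layout mentioned above can be wired up in sbt roughly as follows (a sketch only; sbt's cross-building already does something equivalent by default, and in Maven the build-helper-maven-plugin can add extra source directories per profile in a similar way):

```scala
// build.sbt fragment: compile src/main/scala-2 or src/main/scala-3
// in addition to src/main/scala, depending on the Scala version.
Compile / unmanagedSourceDirectories ++= {
  val base = (Compile / sourceDirectory).value
  CrossVersion.partialVersion(scalaVersion.value) match {
    case Some((2, _)) => Seq(base / "scala-2")
    case Some((3, _)) => Seq(base / "scala-3")
    case _            => Seq.empty
  }
}
```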
>> 
>> The Scala 3 code style should stay as close as possible to the existing 
>> Scala 2 style, in order not to make things more complicated: keep the 
>> brace style and avoid unnecessary new features.
>> 
>> Note: I am not deep in the Spark source code.
>> 
>> Kind Regards,
>> Norbert
>> 
>> 
>> 
>>> On 04.11.2025, at 12:10, Tanveer Zia <[email protected] 
>>> <mailto:[email protected]>> wrote:
>>> 
>>> Hi everyone,
>>> 
>>> I’m Tanveer from Scala Teams. We’re interested in contributing to the Scala 
>>> 3 migration of Apache Spark, as referenced in SPARK-54150 
>>> <https://issues.apache.org/jira/browse/SPARK-54150>.
>>> 
>>> Could you please share the current status or any existing roadmap for this 
>>> migration? We’d also appreciate guidance on how external contributors can 
>>> best get involved or coordinate with the core team on next steps.
>>> 
>>> Best regards,
>>> Tanveer Zia
>>> Scala Teams
>>> 
>> 
>> 
>> Reactive Core GmbH | Paul-Lincke-Ufer 8b | 10999 Berlin
>> Fon: +49 30 9832 4666 | Web: www.reactivecore.de 
>> <http://www.reactivecore.de/>
>> Handelsregister: Amtsgericht Charlottenburg HRB 156696 B
>> Sitz: Berlin | Geschäftsführer: Norbert Schultz
>> 
