Are you running in YARN mode, and do you want to put these jar files into HDFS on a distributed cluster? HTH
Dr Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR

view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


On Sat, 31 May 2025 at 19:47, Nimrod Ofek <ofek.nim...@gmail.com> wrote:

> Hi everyone,
>
> Apologies if this is a basic question; I've searched around but haven't
> found a clear answer.
>
> I'm currently developing a Spark application in Scala, and I'm looking
> for a way to include all the JARs typically bundled with a standard Spark
> installation as a single "provided" dependency.
>
> From what I've seen, most examples add each Spark module individually
> (spark-core, spark-sql, spark-mllib, etc.) as a separate provided
> dependency. Since these are all included in the Spark runtime environment,
> I'm wondering why there isn't a more aggregated dependency, something like
> a parent project or BOM (Bill of Materials) that pulls in all the commonly
> bundled Spark libraries (along with compatible versions of Log4j, Guava,
> Jackson, and so on) for projects to use.
>
> Is there a particular reason this approach isn't commonly used? Does it
> cause issues with transitive dependencies or version mismatches? If so,
> I'm sure those can be addressed as well...
>
> Thanks in advance for any insights!
>
> Best regards,
>
> Nimrod
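
For reference, the per-module "provided" pattern the quoted question describes usually looks something like the following in an sbt build. This is only a minimal sketch: the module list is illustrative, and the sparkVersion value is an assumption that should be matched to the Spark runtime actually installed on the cluster.

// build.sbt (sketch): declare the Spark modules the application compiles
// against, but mark them Provided so they are not bundled into the
// application jar, since the cluster's Spark installation supplies them.

val sparkVersion = "3.5.1"  // illustrative; match your cluster's Spark version

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % sparkVersion % Provided,
  "org.apache.spark" %% "spark-sql"   % sparkVersion % Provided,
  "org.apache.spark" %% "spark-mllib" % sparkVersion % Provided
)

With this layout, each module and its version is repeated per dependency, which is exactly the duplication the question is asking whether an aggregated parent/BOM-style artifact could avoid.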