Thanks a lot Jey! That fixes things. For reference I had to add the following line to build.sbt:

case m if m.toLowerCase.matches("meta-inf/services.*$") => MergeStrategy.concat

Should we also add this to Spark's assembly build?

Thanks
Shivaram

On Mon, Feb 17, 2014 at 6:27 PM, Jey Kottalam <j...@cs.berkeley.edu> wrote:
> We ran into this issue with ADAM, and it came down to not merging the
> "META-INF/services" files correctly. Here's the change we made to our
> Maven build files to fix it; you can probably do something similar under
> SBT too:
> https://github.com/bigdatagenomics/adam/commit/b0997760b23c4284efe32eeb968ef2744af8be82
>
> -Jey
>
> On Mon, Feb 17, 2014 at 6:15 PM, Shivaram Venkataraman
> <shiva...@eecs.berkeley.edu> wrote:
>>
>> I ran into a weird bug today where trying to read a file from an HDFS
>> cluster built using Hadoop 2 gives an error saying "No FileSystem for
>> scheme: hdfs". This only seems to happen when building an assembly jar
>> for the application, not when using sbt's run-main.
>>
>> The project's setup [0] is pretty simple and is only a slight
>> modification of the project used by the release audit tool. The sbt
>> assembly instructions [1] are mostly copied from Spark's sbt build
>> files.
>>
>> We run into this in SparkR as well, so it would be great if anybody has
>> an idea of how to debug this. To reproduce, you can do the following:
>>
>> 1. Launch a Spark EC2 cluster with 0.9.0 and --hadoop-major-version=2
>> 2. Clone https://github.com/shivaram/spark-utils
>> 3. Run release-audits/sbt_app_core/run-hdfs-test.sh
>>
>> Thanks
>> Shivaram
>>
>> [0] https://github.com/shivaram/spark-utils/blob/master/release-audits/sbt_app_core/src/main/scala/SparkHdfsApp.scala
>> [1] https://github.com/shivaram/spark-utils/blob/master/release-audits/sbt_app_core/build.sbt
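
For anyone who hits this later: below is a minimal sketch of how that case might fit into a complete merge-strategy block in build.sbt, assuming the older sbt-assembly key names (mergeStrategy in assembly) in use at the time. The surrounding cases are illustrative defaults, not copied from the actual project's build.sbt.

// build.sbt -- illustrative sbt-assembly merge strategy
mergeStrategy in assembly := {
  // Concatenate ServiceLoader registration files so the entry that
  // hadoop-hdfs ships for org.apache.hadoop.fs.FileSystem survives the merge.
  case m if m.toLowerCase.matches("meta-inf/services.*$") => MergeStrategy.concat
  // Manifests and signature files from dependency jars can be dropped.
  case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
  case m if m.toLowerCase.matches("meta-inf.*\\.(sf|dsa|rsa)$") => MergeStrategy.discard
  // Keep the first copy of anything else.
  case _ => MergeStrategy.first
}

The concat matters because Hadoop 2 discovers FileSystem implementations via java.util.ServiceLoader, i.e. the META-INF/services/org.apache.hadoop.fs.FileSystem files inside each jar. If the assembly keeps only one of those files, the hdfs entry from hadoop-hdfs can be lost, which is exactly what produces "No FileSystem for scheme: hdfs".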