That's right, I'm looking to depend on Spark in general and change only the Hadoop client deps. The Spark master and slaves use the spark-1.0.1-bin-hadoop1 binaries from the downloads page. The relevant snippet from the app's Maven POM is as follows:
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.0.1</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>0.20.2-cdh3u5</version>
    <type>jar</type>
  </dependency>
</dependencies>
<repositories>
  <repository>
    <id>Cloudera repository</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
  </repository>
  <repository>
    <id>Akka repository</id>
    <url>http://repo.akka.io/releases</url>
  </repository>
</repositories>

Thanks,
Bharath

On Fri, Jul 25, 2014 at 10:29 PM, Sean Owen <so...@cloudera.com> wrote:
> If you link against the pre-built binary, that's for Hadoop 1.0.4. Can
> you show your deps to clarify what you are depending on? Building
> custom Spark and depending on it is a different thing from depending
> on plain Spark and changing its deps. I think you want the latter.
>
> On Fri, Jul 25, 2014 at 5:46 PM, Bharath Ravi Kumar <reachb...@gmail.com> wrote:
> > Thanks for responding. I used the pre-built Spark binaries meant for
> > hadoop1/cdh3u5. I do not intend to build Spark against a specific
> > distribution. Irrespective of whether I build my app with the explicit
> > cdh hadoop client dependency, I get the same error message. I also
> > verified that my app's uber jar had pulled in the cdh hadoop client
> > dependencies.
> >
> > On 25-Jul-2014 9:26 pm, "Sean Owen" <so...@cloudera.com> wrote:
> >>
> >> This indicates your app is not actually using the version of the HDFS
> >> client you think. You built Spark from source with the right deps it
> >> seems, but are you sure you linked to your build in your app?
> >>
> >> On Fri, Jul 25, 2014 at 4:32 PM, Bharath Ravi Kumar <reachb...@gmail.com> wrote:
> >> > Any suggestions to work around this issue? The pre-built Spark
> >> > binaries don't appear to work against cdh as documented, unless
> >> > there's a build issue, which seems unlikely.
> >> >
> >> > On 25-Jul-2014 3:42 pm, "Bharath Ravi Kumar" <reachb...@gmail.com> wrote:
> >> >>
> >> >> I'm encountering a Hadoop client protocol mismatch trying to read
> >> >> from HDFS (cdh3u5) using the pre-built Spark from the downloads
> >> >> page (linked under "For Hadoop 1 (HDP1, CDH3)"). I've also followed
> >> >> the instructions at
> >> >> http://spark.apache.org/docs/latest/hadoop-third-party-distributions.html
> >> >> (i.e. building the app against hadoop-client 0.20.2-cdh3u5), but I
> >> >> continue to see the following error regardless of whether I link
> >> >> the app with the cdh client:
> >> >>
> >> >> 14/07/25 09:53:43 INFO client.AppClient$ClientActor: Executor updated:
> >> >> app-20140725095343-0016/1 is now RUNNING
> >> >> 14/07/25 09:53:43 WARN util.NativeCodeLoader: Unable to load
> >> >> native-hadoop library for your platform... using builtin-java
> >> >> classes where applicable
> >> >> 14/07/25 09:53:43 WARN snappy.LoadSnappy: Snappy native library not loaded
> >> >> Exception in thread "main" org.apache.hadoop.ipc.RPC$VersionMismatch:
> >> >> Protocol org.apache.hadoop.hdfs.protocol.ClientProtocol version mismatch.
> >> >> (client = 61, server = 63)
> >> >>         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:401)
> >> >>         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
> >> >>
> >> >> While I can build Spark against the exact Hadoop distro version,
> >> >> I'd rather work with the standard prebuilt binaries, making
> >> >> additional changes while building the app if necessary. Any
> >> >> workarounds/recommendations?
> >> >>
> >> >> Thanks,
> >> >> Bharath
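[Editor's note] For readers hitting the same mismatch: a common way to make the app's classpath unambiguous is to exclude the hadoop-client that spark-core pulls in transitively and pin the CDH client explicitly. This is a sketch only; the group/artifact ids match what Spark 1.0.1 publishes, but verify the resolved tree in your own build with `mvn dependency:tree`.

    <!-- Sketch: force the CDH3u5 HDFS client in the app's own dependency tree. -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.0.1</version>
      <scope>provided</scope>
      <exclusions>
        <!-- Drop the Hadoop 1.0.4 client that spark-core brings in transitively. -->
        <exclusion>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-client</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>0.20.2-cdh3u5</version>
    </dependency>

Caveat: because spark-core is `provided`, the classes actually loaded at runtime come from the cluster's spark-1.0.1-bin-hadoop1 assembly, which bundles the Hadoop 1.0.4 client; so an exclusion in the app POM alone may not cure the `client = 61, server = 63` mismatch, and a Spark build against the CDH version (per the hadoop-third-party-distributions page linked above) may still be required.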