Hi, Gary,
Thank you very much.
This afternoon I tried to compile Spark with my customized Hadoop, and it finally
works.
For those who have the same problem:
1. add the following line to SparkBuild.scala
resolvers ++= Seq("Local Hadoop Repo" at "file:///Users/nanzhu/.m2/repository"),
2. install your customized jars
mvn install:install-file \
-Dfile=/Users/nanzhu/code/hadoop-1.2.1/build/hadoop-client-1.2.2-SNAPSHOT.jar \
-DgroupId=org.apache.hadoop -DartifactId=hadoop-client -Dversion=1.2.2-SNAPSHOT \
-Dpackaging=jar -DgeneratePom=true
mvn install:install-file \
-Dfile=/Users/nanzhu/code/hadoop-1.2.1/build/hadoop-core-1.2.2-SNAPSHOT.jar \
-DgroupId=org.apache.hadoop -DartifactId=hadoop-core -Dversion=1.2.2-SNAPSHOT \
-Dpackaging=jar -DgeneratePom=true
3. set SPARK_HADOOP_VERSION to 1.2.2-SNAPSHOT
4. add a dependency on hadoop-core:
search for "org.apache.hadoop" % "hadoop-client" % hadoopVersion
excludeAll(excludeJackson, excludeNetty, excludeAsm, excludeCglib) in
project/SparkBuild.scala, then
add "org.apache.hadoop" % "hadoop-core" % hadoopVersion
excludeAll(excludeJackson, excludeNetty, excludeAsm, excludeCglib), below it
5. compile spark
Notes:
a. In 1, I don’t know why resolvers ++= Seq(Resolver.file("Local Maven Repo",
file(Path.userHome + "/.m2/repository"))) cannot resolve my directory, so I
have to manually add resolvers ++= Seq("Local Hadoop Repo" at
"file:///Users/nanzhu/.m2/repository"). It is still weird that Seq("Local
Hadoop Repo", file("Users/nanzhu/.m2/repository")) doesn’t work.
b. In 4, the client.jar dependency does not pull in core.jar automatically
(why?), so I have to add an explicit dependency on core.jar.
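
Putting steps 1 and 4 together, the relevant settings in project/SparkBuild.scala
might look roughly like the fragment below. It is only a sketch under a few
assumptions: the resolver is written with Path.userHome instead of the hard-coded
/Users/nanzhu path (the form shown in the sbt documentation, but not tested against
this build), the resolver name "Local Maven Repo" is arbitrary, and hadoopVersion
and the exclude* rules are the ones already defined in SparkBuild.scala.

```scala
// Sketch of the two settings touched in project/SparkBuild.scala (steps 1 and 4).
// hadoopVersion and the exclude* rules are the values already defined in that
// file; they are not redefined here, so this fragment does not compile alone.

// Step 1: point sbt at the local Maven repository that holds the custom jars.
// Assumption: Path.userHome stands in for the hard-coded /Users/nanzhu path.
resolvers ++= Seq(
  "Local Maven Repo" at "file://" + Path.userHome.absolutePath + "/.m2/repository"
)

// Step 4: keep the existing hadoop-client line and add hadoop-core below it.
libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-client" % hadoopVersion
    excludeAll(excludeJackson, excludeNetty, excludeAsm, excludeCglib),
  "org.apache.hadoop" % "hadoop-core" % hadoopVersion
    excludeAll(excludeJackson, excludeNetty, excludeAsm, excludeCglib)
)
```

Inside the settings Seq of SparkBuild.scala each entry needs a trailing comma, and
SPARK_HADOOP_VERSION still has to be set as in step 3 so that hadoopVersion matches
the version the jars were installed under.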
Best,
--
Nan Zhu
On Monday, December 16, 2013 at 2:41 PM, Gary Malouf wrote:
> Check out the dependencies for the version of hadoop-client you are using - I
> think you will find that hadoop-core is present there.
>
>
>
>
> On Mon, Dec 16, 2013 at 1:28 PM, Nan Zhu <[email protected]
> (mailto:[email protected])> wrote:
> > Hi, Gary,
> >
> > The page says Spark uses hadoop-client.jar to interact with HDFS, but why
> > does it also download hadoop-core?
> >
> > Do I just need to change the dependency on hadoop-client to my local repo?
> >
> > Best,
> >
> > --
> > Nan Zhu
> > School of Computer Science,
> > McGill University
> >
> >
> >
> >
> > On Monday, December 16, 2013 at 9:05 AM, Gary Malouf wrote:
> >
> > > Hi Nan, check out the 'Note about Hadoop Versions' on
> > > http://spark.incubator.apache.org/docs/latest/
> > >
> > > Let us know if this does not solve your problem.
> > >
> > > Gary
> > >
> > >
> > > On Mon, Dec 16, 2013 at 8:19 AM, Nan Zhu <[email protected]
> > > (mailto:[email protected])> wrote:
> > > > Hi, Azuryy
> > > >
> > > > Thank you for the reply
> > > >
> > > > So you compiled Spark with mvn?
> > > >
> > > > I’m looking at the pom.xml, and I think it does the same work as
> > > > SparkBuild.scala.
> > > >
> > > > I’m still confused: some classes in Spark use classes
> > > > like InputFormat, which I assume are included in
> > > > hadoop-core.jar,
> > > >
> > > > but I didn’t find any line specifying hadoop-core-1.0.4.jar in pom.xml
> > > > or SparkBuild.scala.
> > > >
> > > > Can you explain a bit to me?
> > > >
> > > > Best,
> > > >
> > > > --
> > > > Nan Zhu
> > > > School of Computer Science,
> > > > McGill University
> > > >
> > > >
> > > >
> > > > On Monday, December 16, 2013 at 3:58 AM, Azuryy Yu wrote:
> > > >
> > > > > Hi Nan,
> > > > > I am also using our customized hadoop, so you need to modify the
> > > > > pom.xml, but before this change, you should install your customized
> > > > > hadoop-* jar in the local maven repo.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Sun, Dec 15, 2013 at 2:45 AM, Nan Zhu <[email protected]
> > > > > (mailto:[email protected])> wrote:
> > > > > > Hi, all
> > > > > >
> > > > > > I’m trying to compile Spark with a customized version of hadoop,
> > > > > > where I modify the implementation of DFSInputStream,
> > > > > >
> > > > > > I would like to modify SparkBuild.scala to make Spark compile with my
> > > > > > hadoop-core.xxx.jar instead of downloading the original one.
> > > > > >
> > > > > > I only found hadoop-client-xxx.jar and some lines about yarn jars
> > > > > > in SparkBuild.scala,
> > > > > >
> > > > > > Can you tell me which line I should modify to achieve the goal?
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > --
> > > > > > Nan Zhu
> > > > > > School of Computer Science,
> > > > > > McGill University
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>