On 8 Mar 2016, at 07:23, Lu, Yingqi <yingqi...@intel.com> wrote:
Thank you for the quick reply. I am very new to Maven and always use the default settings. Can you please be a little more specific in the instructions? I think all the JAR files from the Hadoop build are located at Hadoop-3.0.0-SNAPSHOT/share/hadoop. Which ones do I need to use to compile Spark, and how can I change the pom.xml?

Thanks,
Lucy

It's simple to do this locally; there's no need for a remote server. You do need to redo it every morning, though, and do not try to run a build over midnight, as that confuses Maven. Just bear in mind that on the first build Maven does each day, it will try to fetch snapshots remotely if they aren't available locally.

1. In hadoop-trunk:

   mvn install -DskipTests

   This will publish the 3.0.0-SNAPSHOT JARs into ~/.m2/repository, where they will be picked up by dependent builds for the rest of the day.

2. In spark:

   mvn install -DskipTests -Phadoop-2.6 -Dhadoop.version=3.0.0-SNAPSHOT

   That turns on the Hadoop 2.6+ profile, but sets the Hadoop version to build against to the 3.0.0-SNAPSHOT you built in step (1).

3. Go and have a coffee; wait for the Spark build to finish.

That's all you need to do to get a version of Spark built with your Hadoop version. It may be that Spark fails to compile against Hadoop 2.9.0-SNAPSHOT or 3.0.0-SNAPSHOT. If that happens, consider it a regression in Hadoop and file a bug there.

I've been working with Hadoop 2.8.0-SNAPSHOT without problems, except where the split of HDFS into client and server JARs/POMs (hadoop-hdfs-client and hadoop-hdfs) left some classes I expected out of the client JAR. That's been fixed, but don't be afraid to complain yourself if you find a problem: it's in the nightly build phase that regressions can be fixed within 24h.
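A few extra tips. If that daily remote snapshot check gets in the way (say, Maven tries to pull a 3.0.0-SNAPSHOT from a remote repository instead of using the one you just installed), you can skip it; these are standard Maven command-line flags, though whether you need them depends on your settings:

   mvn install -DskipTests -nsu    # --no-snapshot-updates: don't look for newer remote snapshots
   mvn install -DskipTests -o      # --offline: don't touch remote repositories at all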
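To confirm that step (1) published what you expect, list the local repository; the path below assumes the default ~/.m2 location, with hadoop-common as one example artifact:

   ls ~/.m2/repository/org/apache/hadoop/hadoop-common/3.0.0-SNAPSHOT/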
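Similarly, to check that the Spark build in step (2) is resolving your locally built Hadoop rather than a published release, dump the dependency tree filtered to the Hadoop group (dependency:tree and its -Dincludes filter are standard maven-dependency-plugin options); every org.apache.hadoop entry should come back at 3.0.0-SNAPSHOT:

   mvn dependency:tree -Phadoop-2.6 -Dhadoop.version=3.0.0-SNAPSHOT -Dincludes=org.apache.hadoop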
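And if you hit something like the hadoop-hdfs-client gap I mentioned, the quick check is to list the JAR's contents and grep for the class you're missing; the class name here is only an illustration, not the one that was actually absent:

   jar tf ~/.m2/repository/org/apache/hadoop/hadoop-hdfs-client/3.0.0-SNAPSHOT/hadoop-hdfs-client-3.0.0-SNAPSHOT.jar | grep HdfsConstants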