On 8 Mar 2016, at 07:23, Lu, Yingqi <yingqi...@intel.com> wrote:
Thank you for the quick reply. I am very new to Maven and always use the default settings. Can you please be a little more specific in the instructions? I think all the JAR files from the Hadoop build are located at Hadoop-3.0.0-SNAPSHOT/share/hadoop. Which ones do I need to use to compile Spark, and how can I change the pom.xml?

Thanks,
Lucy

It's simple to do this locally; there's no need for a remote server. You do need to redo it every morning, though, and do not try to run a build over midnight, as that confuses Maven. Just bear in mind that on the first build Maven does each day, it will try to fetch snapshots remotely if they aren't available locally.

1. In hadoop-trunk:

   mvn install -DskipTests

   This will publish the 3.0.0-SNAPSHOT JARs into ~/.m2/repository, where they will be picked up by dependent builds for the rest of the day.

2. In spark:

   mvn install -DskipTests -Phadoop-2.6 -Dhadoop.version=3.0.0-SNAPSHOT

   That turns on the Hadoop 2.6+ profile, but sets the Hadoop version to build against to the 3.0.0-SNAPSHOT you built in step (1).

3. Go and have a coffee; wait for the Spark build to finish.

That's all you need to do to get a version of Spark built with your Hadoop version. It may be that Spark fails to compile against Hadoop 2.9.0-SNAPSHOT or 3.0.0-SNAPSHOT. If that happens, consider it a regression in Hadoop and file a bug there.

I've been working with Hadoop 2.8.0-SNAPSHOT without problems, except where the split of HDFS into client and server JARs/POMs (hadoop-hdfs-client and hadoop-hdfs) left some classes I expected out of the client JAR. That's been fixed, but don't be afraid to complain yourself if you find a problem: it's in the nightly build phase that regressions can be fixed within 24h.
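A few extra tips. If that daily remote snapshot check gets in the way (say, Maven tries to pull a 3.0.0-SNAPSHOT from a remote repository instead of using the one you just installed), you can skip it; these are standard Maven command-line flags, though whether you need them depends on your settings:

   mvn install -DskipTests -nsu    # --no-snapshot-updates: don't look for newer remote snapshots
   mvn install -DskipTests -o      # --offline: don't touch remote repositories at all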
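To confirm that step (1) published what you expect, list the local repository; the path below assumes the default ~/.m2 location, with hadoop-common as one example artifact:

   ls ~/.m2/repository/org/apache/hadoop/hadoop-common/3.0.0-SNAPSHOT/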
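Similarly, to check that the Spark build in step (2) is resolving your locally built Hadoop rather than a published release, dump the dependency tree filtered to the Hadoop group (dependency:tree and its -Dincludes filter are standard maven-dependency-plugin options); every org.apache.hadoop entry should come back at 3.0.0-SNAPSHOT:

   mvn dependency:tree -Phadoop-2.6 -Dhadoop.version=3.0.0-SNAPSHOT -Dincludes=org.apache.hadoop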
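And if you hit something like the hadoop-hdfs-client gap I mentioned, the quick check is to list the JAR's contents and grep for the class you're missing; the class name here is only an illustration, not the one that was actually absent:

   jar tf ~/.m2/repository/org/apache/hadoop/hadoop-hdfs-client/3.0.0-SNAPSHOT/hadoop-hdfs-client-3.0.0-SNAPSHOT.jar | grep HdfsConstants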