Re: How to compile Spark with customized Hadoop?

2015-10-14 Thread Dogtail L
Hi,

When I publish my version of Hadoop, it is installed under
/HOME_DIRECTORY/.m2/repository/org/apache/hadoop, but when I compile Spark,
it fetches the Hadoop libraries from
https://repo1.maven.org/maven2/org/apache/hadoop. How can I make Spark fetch
the Hadoop libraries from my local M2 cache instead?
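
In other words, after something like the following (2.7.0a is just an
example version number):

# in the Hadoop tree: install the rebuilt artifacts into ~/.m2/repository
mvn install -DskipTests

# in the Spark tree: request exactly that version
build/mvn -Dhadoop.version=2.7.0a -DskipTests clean package

should the Spark build resolve 2.7.0a from the local repository before it
tries repo1? Great thanks!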

On Fri, Oct 9, 2015 at 5:31 PM, Matei Zaharia wrote:

> You can publish your version of Hadoop to your local Maven cache with mvn
> install (just give it a different version number, e.g. 2.7.0a) and then
> pass that as the Hadoop version to Spark's build (see
> http://spark.apache.org/docs/latest/building-spark.html).
>
> Matei
>
> On Oct 9, 2015, at 3:10 PM, Dogtail L wrote:
>
> Hi all,
>
> I have modified the Hadoop source code, and I want to compile Spark
> against my modified Hadoop. Do you know how to do that? Great thanks!
>
>
>


Re: How to compile Spark with customized Hadoop?

2015-10-10 Thread Raghavendra Pandey
There is a "without Hadoop" build of Spark. You can use that to link against
any custom Hadoop version.
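
For example (paths and version are illustrative), download the
spark-x.y.z-bin-without-hadoop package and point it at your own Hadoop via
SPARK_DIST_CLASSPATH:

# the hadoop-free build takes its Hadoop jars from this classpath
export SPARK_DIST_CLASSPATH=$(/path/to/my/hadoop/bin/hadoop classpath)
/path/to/spark-x.y.z-bin-without-hadoop/bin/spark-shell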

Raghav
On Oct 10, 2015 5:34 PM, "Steve Loughran" wrote:

>
> During development, I'd recommend giving Hadoop a version ending with
> -SNAPSHOT and building Spark with Maven, as mvn knows to refresh the
> snapshot every day.
>
> You can do this in Hadoop with
>
> mvn versions:set -DnewVersion=2.7.0.stevel-SNAPSHOT
>
> If you are working on Hadoop branch-2 or trunk directly, they come with
> -SNAPSHOT versions anyway, but unless you build Hadoop every morning, you
> may find Maven pulls in the latest nightly builds from the Apache snapshot
> repository, which will cause chaos and confusion. This is also why you
> must never have a Maven build which spans midnight in your time zone.
>
>
> On 9 Oct 2015, at 22:31, Matei Zaharia wrote:
>
> You can publish your version of Hadoop to your local Maven cache with mvn
> install (just give it a different version number, e.g. 2.7.0a) and then
> pass that as the Hadoop version to Spark's build (see
> http://spark.apache.org/docs/latest/building-spark.html).
>
> Matei
>
> On Oct 9, 2015, at 3:10 PM, Dogtail L wrote:
>
> Hi all,
>
> I have modified the Hadoop source code, and I want to compile Spark
> against my modified Hadoop. Do you know how to do that? Great thanks!
>
>
>
>


Re: How to compile Spark with customized Hadoop?

2015-10-10 Thread Steve Loughran

During development, I'd recommend giving Hadoop a version ending with
-SNAPSHOT and building Spark with Maven, as mvn knows to refresh the snapshot
every day.

You can do this in Hadoop with

mvn versions:set -DnewVersion=2.7.0.stevel-SNAPSHOT

If you are working on Hadoop branch-2 or trunk directly, they come with
-SNAPSHOT versions anyway, but unless you build Hadoop every morning, you may
find Maven pulls in the latest nightly builds from the Apache snapshot
repository, which will cause chaos and confusion. This is also why you must
never have a Maven build which spans midnight in your time zone.
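
A minimal loop, using the example version above:

# in the hadoop tree: tag the build and install it locally
mvn versions:set -DnewVersion=2.7.0.stevel-SNAPSHOT
mvn install -DskipTests

# in the spark tree: build against the local snapshot
build/mvn -Dhadoop.version=2.7.0.stevel-SNAPSHOT -DskipTests clean package

Because the version ends in -SNAPSHOT, rerunning mvn install in Hadoop is
enough for the next Spark build to pick up your changes; no need to bump the
version number each time.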


On 9 Oct 2015, at 22:31, Matei Zaharia wrote:

You can publish your version of Hadoop to your local Maven cache with mvn
install (just give it a different version number, e.g. 2.7.0a) and then pass
that as the Hadoop version to Spark's build (see
http://spark.apache.org/docs/latest/building-spark.html).

Matei

On Oct 9, 2015, at 3:10 PM, Dogtail L wrote:

Hi all,

I have modified the Hadoop source code, and I want to compile Spark against
my modified Hadoop. Do you know how to do that? Great thanks!




Re: How to compile Spark with customized Hadoop?

2015-10-09 Thread Matei Zaharia
You can publish your version of Hadoop to your local Maven cache with mvn
install (just give it a different version number, e.g. 2.7.0a) and then pass
that as the Hadoop version to Spark's build (see
http://spark.apache.org/docs/latest/building-spark.html).
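
Concretely, something like this (2.7.0a being the made-up version number):

# in the Hadoop tree
mvn versions:set -DnewVersion=2.7.0a
mvn install -DskipTests

# in the Spark tree
build/mvn -Dhadoop.version=2.7.0a -DskipTests clean package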

Matei

> On Oct 9, 2015, at 3:10 PM, Dogtail L wrote:
> 
> Hi all,
> 
> I have modified the Hadoop source code, and I want to compile Spark
> against my modified Hadoop. Do you know how to do that? Great thanks!