Re: Speeding up Spark build during development

2015-05-05 Thread Pramod Biligiri


Re: Speeding up Spark build during development

2015-05-05 Thread Iulian Dragoș



--
Iulian Dragoș
Reactive Apps on the JVM
www.typesafe.com


Re: Speeding up Spark build during development

2015-05-04 Thread Tathagata Das


Re: Speeding up Spark build during development

2015-05-04 Thread Michael Armbrust
FWIW, my Spark SQL development workflow is usually to run "build/sbt
sparkShell" or "build/sbt 'sql/test-only '". These commands start in as
little as 30s on my laptop, automatically figure out which subprojects need
to be rebuilt, and don't require the expensive assembly creation.
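The sbt workflow described above can be sketched as follows. The commands are taken from the message; the concrete suite name is an assumption added for illustration, and task names reflect the sbt build of that era:

```shell
# Run from the Spark source root. sbt recompiles only the subprojects
# whose sources changed, so iteration is fast after the first build.

# Launch a Spark shell straight from compiled classes:
build/sbt sparkShell

# Run a single Spark SQL test suite instead of the full test target
# (suite name here is an example -- substitute the one you work on):
build/sbt 'sql/test-only org.apache.spark.sql.SQLQuerySuite'

# Keeping an interactive sbt session open also avoids JVM startup cost;
# prefixing a task with ~ re-runs it on every file save:
build/sbt
# > ~sql/test-only org.apache.spark.sql.SQLQuerySuite
```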

On Mon, May 4, 2015 at 5:48 AM, Meethu Mathew 
wrote:

> Hi,
>
> Is it really necessary to run mvn --projects assembly/ -DskipTests
> install? Could you please explain why this is needed?
> I got the changes after running "mvn --projects streaming/ -DskipTests
> package".
>
> Regards,
> Meethu
>
>
> On Monday 04 May 2015 02:20 PM, Emre Sevinc wrote:
>
>> Just to give you an example:
>>
>> When I was trying to make a small change only to the Streaming component
>> of
>> Spark, first I built and installed the whole Spark project (this took
>> about
>> 15 minutes on my 4-core, 4 GB RAM laptop). Then, after having changed
>> files
>> only in Streaming, I ran something like (in the top-level directory):
>>
>> mvn --projects streaming/ -DskipTests package
>>
>> and then
>>
>> mvn --projects assembly/ -DskipTests install
>>
>>
>> This was much faster than trying to build the whole Spark from scratch,
>> because Maven was only building one component, in my case the Streaming
>> component, of Spark. I think you can use a very similar approach.
>>
>> --
>> Emre Sevinç
>>
>>
>>
>> On Mon, May 4, 2015 at 10:44 AM, Pramod Biligiri <
>> pramodbilig...@gmail.com>
>> wrote:
>>
>>  No, I just need to build one project at a time. Right now SparkSql.
>>>
>>> Pramod
>>>
>>> On Mon, May 4, 2015 at 12:09 AM, Emre Sevinc 
>>> wrote:
>>>
>>>  Hello Pramod,
>>>>
>>>> Do you need to build the whole project every time? Generally you don't,
>>>> e.g., when I was changing some files that belong only to Spark
>>>> Streaming, I
>>>> was building only the streaming (of course after having built and
>>>> installed
>>>> the whole project, but that was done only once), and then the assembly.
>>>> This was much faster than trying to build the whole Spark every time.
>>>>
>>>> --
>>>> Emre Sevinç
>>>>
>>>> On Mon, May 4, 2015 at 9:01 AM, Pramod Biligiri <
>>>> pramodbilig...@gmail.com
>>>>
>>>>> wrote:
>>>>> Using the inbuilt maven and zinc it takes around 10 minutes for each
>>>>> build.
>>>>> Is that reasonable?
>>>>> My maven opts looks like this:
>>>>> $ echo $MAVEN_OPTS
>>>>> -Xmx12000m -XX:MaxPermSize=2048m
>>>>>
>>>>> I'm running it as build/mvn -DskipTests package
>>>>>
>>>>> Should I be tweaking my Zinc/Nailgun config?
>>>>>
>>>>> Pramod
>>>>>
>>>>> On Sun, May 3, 2015 at 3:40 PM, Mark Hamstra 
>>>>> wrote:
>>>>>
>>>>>
>>>>>>
>>>>> https://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn
>>>>>
>>>>>> On Sun, May 3, 2015 at 2:54 PM, Pramod Biligiri <
>>>>>>
>>>>> pramodbilig...@gmail.com>
>>>>>
>>>>>> wrote:
>>>>>>
>>>>>>  This is great. I didn't know about the mvn script in the build
>>>>>>>
>>>>>> directory.
>>>>>
>>>>>> Pramod
>>>>>>>
>>>>>>> On Fri, May 1, 2015 at 9:51 AM, York, Brennon <
>>>>>>> brennon.y...@capitalone.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>  Following what Ted said, if you leverage the `mvn` from within the
>>>>>>>> `build/` directory of Spark you'll get zinc for free which should
>>>>>>>>
>>>>>>> help
>>>>>
>>>>>> speed up build times.
>>>>>>>>
>>>>>>>> On 5/1/15, 9:45 AM, "Ted Yu"  wrote:
>>>>>>>>
>>>>>>>>  Pramod:
>>>>>>>>> Please remember to run Zinc so that the build is faster.
>>>>>>>>>
>>>>>>>>> Cheers

Re: Speeding up Spark build during development

2015-05-04 Thread Meethu Mathew

Hi,

Is it really necessary to run mvn --projects assembly/ -DskipTests
install? Could you please explain why this is needed?
I got the changes after running "mvn --projects streaming/ -DskipTests
package".


Regards,
Meethu

On Monday 04 May 2015 02:20 PM, Emre Sevinc wrote:

Just to give you an example:

When I was trying to make a small change only to the Streaming component of
Spark, first I built and installed the whole Spark project (this took about
15 minutes on my 4-core, 4 GB RAM laptop). Then, after having changed files
only in Streaming, I ran something like (in the top-level directory):

mvn --projects streaming/ -DskipTests package

and then

mvn --projects assembly/ -DskipTests install


This was much faster than trying to build the whole Spark from scratch,
because Maven was only building one component, in my case the Streaming
component, of Spark. I think you can use a very similar approach.

--
Emre Sevinç



On Mon, May 4, 2015 at 10:44 AM, Pramod Biligiri 
wrote:


No, I just need to build one project at a time. Right now SparkSql.

Pramod

On Mon, May 4, 2015 at 12:09 AM, Emre Sevinc 
wrote:


Hello Pramod,

Do you need to build the whole project every time? Generally you don't,
e.g., when I was changing some files that belong only to Spark Streaming, I
was building only the streaming (of course after having built and installed
the whole project, but that was done only once), and then the assembly.
This was much faster than trying to build the whole Spark every time.

--
Emre Sevinç

On Mon, May 4, 2015 at 9:01 AM, Pramod Biligiri 
wrote:
Using the inbuilt maven and zinc it takes around 10 minutes for each
build.
Is that reasonable?
My maven opts looks like this:
$ echo $MAVEN_OPTS
-Xmx12000m -XX:MaxPermSize=2048m

I'm running it as build/mvn -DskipTests package

Should I be tweaking my Zinc/Nailgun config?

Pramod

On Sun, May 3, 2015 at 3:40 PM, Mark Hamstra 
wrote:




https://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn

On Sun, May 3, 2015 at 2:54 PM, Pramod Biligiri <

pramodbilig...@gmail.com>

wrote:


This is great. I didn't know about the mvn script in the build

directory.

Pramod

On Fri, May 1, 2015 at 9:51 AM, York, Brennon <
brennon.y...@capitalone.com>
wrote:


Following what Ted said, if you leverage the `mvn` from within the
`build/` directory of Spark you'll get zinc for free which should

help

speed up build times.

On 5/1/15, 9:45 AM, "Ted Yu"  wrote:


Pramod:
Please remember to run Zinc so that the build is faster.

Cheers

On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander

wrote:


Hi Pramod,

For cluster-like tests you might want to use the same code as in

mllib's

LocalClusterSparkContext. You can rebuild only the package that

you

change
and then run this main class.

Best regards, Alexander

-Original Message-
From: Pramod Biligiri [mailto:pramodbilig...@gmail.com]
Sent: Friday, May 01, 2015 1:46 AM
To: dev@spark.apache.org
Subject: Speeding up Spark build during development

Hi,
I'm making some small changes to the Spark codebase and trying

it out

on a
cluster. I was wondering if there's a faster way to build than

running

the
package target each time.
Currently I'm using: mvn -DskipTests  package

All the nodes have the same filesystem mounted at the same mount

point.

Pramod




The information contained in this e-mail is confidential and/or
proprietary to Capital One and/or its affiliates. The information
transmitted herewith is intended only for use by the individual or

entity

to which it is addressed.  If the reader of this message is not the
intended recipient, you are hereby notified that any review,
retransmission, dissemination, distribution, copying or other use

of, or

taking of any action in reliance upon this information is strictly
prohibited. If you have received this communication in error, please
contact the sender and delete the material from your computer.







--
Emre Sevinc









Re: Speeding up Spark build during development

2015-05-04 Thread Emre Sevinc
Just to give you an example:

When I was trying to make a small change only to the Streaming component of
Spark, first I built and installed the whole Spark project (this took about
15 minutes on my 4-core, 4 GB RAM laptop). Then, after having changed files
only in Streaming, I ran something like (in the top-level directory):

   mvn --projects streaming/ -DskipTests package

and then

   mvn --projects assembly/ -DskipTests install


This was much faster than trying to build the whole Spark from scratch,
because Maven was only building one component, in my case the Streaming
component, of Spark. I think you can use a very similar approach.
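The two invocations above can also be combined into a single Maven reactor run; a sketch, assuming stock Maven multi-module options and that the whole project was installed once beforehand:

```shell
# Build the streaming module and the assembly module in one reactor
# invocation instead of two separate mvn runs (run from the Spark
# source root):
build/mvn --projects streaming/,assembly/ -DskipTests install
```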

--
Emre Sevinç



On Mon, May 4, 2015 at 10:44 AM, Pramod Biligiri 
wrote:

> No, I just need to build one project at a time. Right now SparkSql.
>
> Pramod
>
> On Mon, May 4, 2015 at 12:09 AM, Emre Sevinc 
> wrote:
>
>> Hello Pramod,
>>
>> Do you need to build the whole project every time? Generally you don't,
>> e.g., when I was changing some files that belong only to Spark Streaming, I
>> was building only the streaming (of course after having built and installed
>> the whole project, but that was done only once), and then the assembly.
>> This was much faster than trying to build the whole Spark every time.
>>
>> --
>> Emre Sevinç
>>
>> On Mon, May 4, 2015 at 9:01 AM, Pramod Biligiri > > wrote:
>>
>>> Using the inbuilt maven and zinc it takes around 10 minutes for each
>>> build.
>>> Is that reasonable?
>>> My maven opts looks like this:
>>> $ echo $MAVEN_OPTS
>>> -Xmx12000m -XX:MaxPermSize=2048m
>>>
>>> I'm running it as build/mvn -DskipTests package
>>>
>>> Should I be tweaking my Zinc/Nailgun config?
>>>
>>> Pramod
>>>
>>> On Sun, May 3, 2015 at 3:40 PM, Mark Hamstra 
>>> wrote:
>>>
>>> >
>>> >
>>> https://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn
>>> >
>>> > On Sun, May 3, 2015 at 2:54 PM, Pramod Biligiri <
>>> pramodbilig...@gmail.com>
>>> > wrote:
>>> >
>>> >> This is great. I didn't know about the mvn script in the build
>>> directory.
>>> >>
>>> >> Pramod
>>> >>
>>> >> On Fri, May 1, 2015 at 9:51 AM, York, Brennon <
>>> >> brennon.y...@capitalone.com>
>>> >> wrote:
>>> >>
>>> >> > Following what Ted said, if you leverage the `mvn` from within the
>>> >> > `build/` directory of Spark you'll get zinc for free which should
>>> help
>>> >> > speed up build times.
>>> >> >
>>> >> > On 5/1/15, 9:45 AM, "Ted Yu"  wrote:
>>> >> >
>>> >> > >Pramod:
>>> >> > >Please remember to run Zinc so that the build is faster.
>>> >> > >
>>> >> > >Cheers
>>> >> > >
>>> >> > >On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander
>>> >> > >
>>> >> > >wrote:
>>> >> > >
>>> >> > >> Hi Pramod,
>>> >> > >>
>>> >> > >> For cluster-like tests you might want to use the same code as in
>>> >> mllib's
>>> >> > >> LocalClusterSparkContext. You can rebuild only the package that
>>> you
>>> >> > >>change
>>> >> > >> and then run this main class.
>>> >> > >>
>>> >> > >> Best regards, Alexander
>>> >> > >>
>>> >> > >> -Original Message-
>>> >> > >> From: Pramod Biligiri [mailto:pramodbilig...@gmail.com]
>>> >> > >> Sent: Friday, May 01, 2015 1:46 AM
>>> >> > >> To: dev@spark.apache.org
>>> >> > >> Subject: Speeding up Spark build during development
>>> >> > >>
>>> >> > >> Hi,
>>> >> > >> I'm making some small changes to the Spark codebase and trying
>>> it out
>>> >> > >>on a
>>> >> > >> cluster. I was wondering if there's a faster way to build than
>>> >> running
>>> >> > >>the
>>> >> > >> package target each time.
>>> >> > >> Currently I'm using: mvn -DskipTests  package
>>> >> > >>
>>> >> > >> All the nodes have the same filesystem mounted at the same mount
>>> >> point.
>>> >> > >>
>>> >> > >> Pramod
>>> >> > >>
>>> >> >
>>> >> > 
>>> >> >
>>> >> >
>>> >> >
>>> >>
>>> >
>>> >
>>>
>>
>>
>>
>> --
>> Emre Sevinc
>>
>
>


-- 
Emre Sevinc


Re: Speeding up Spark build during development

2015-05-04 Thread Pramod Biligiri
No, I just need to build one project at a time. Right now Spark SQL.

Pramod

On Mon, May 4, 2015 at 12:09 AM, Emre Sevinc  wrote:

> Hello Pramod,
>
> Do you need to build the whole project every time? Generally you don't,
> e.g., when I was changing some files that belong only to Spark Streaming, I
> was building only the streaming (of course after having built and installed
> the whole project, but that was done only once), and then the assembly.
> This was much faster than trying to build the whole Spark every time.
>
> --
> Emre Sevinç
>
> On Mon, May 4, 2015 at 9:01 AM, Pramod Biligiri 
> wrote:
>
>> Using the inbuilt maven and zinc it takes around 10 minutes for each
>> build.
>> Is that reasonable?
>> My maven opts looks like this:
>> $ echo $MAVEN_OPTS
>> -Xmx12000m -XX:MaxPermSize=2048m
>>
>> I'm running it as build/mvn -DskipTests package
>>
>> Should I be tweaking my Zinc/Nailgun config?
>>
>> Pramod
>>
>> On Sun, May 3, 2015 at 3:40 PM, Mark Hamstra 
>> wrote:
>>
>> >
>> >
>> https://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn
>> >
>> > On Sun, May 3, 2015 at 2:54 PM, Pramod Biligiri <
>> pramodbilig...@gmail.com>
>> > wrote:
>> >
>> >> This is great. I didn't know about the mvn script in the build
>> directory.
>> >>
>> >> Pramod
>> >>
>> >> On Fri, May 1, 2015 at 9:51 AM, York, Brennon <
>> >> brennon.y...@capitalone.com>
>> >> wrote:
>> >>
>> >> > Following what Ted said, if you leverage the `mvn` from within the
>> >> > `build/` directory of Spark you'll get zinc for free which should
>> help
>> >> > speed up build times.
>> >> >
>> >> > On 5/1/15, 9:45 AM, "Ted Yu"  wrote:
>> >> >
>> >> > >Pramod:
>> >> > >Please remember to run Zinc so that the build is faster.
>> >> > >
>> >> > >Cheers
>> >> > >
>> >> > >On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander
>> >> > >
>> >> > >wrote:
>> >> > >
>> >> > >> Hi Pramod,
>> >> > >>
>> >> > >> For cluster-like tests you might want to use the same code as in
>> >> mllib's
>> >> > >> LocalClusterSparkContext. You can rebuild only the package that
>> you
>> >> > >>change
>> >> > >> and then run this main class.
>> >> > >>
>> >> > >> Best regards, Alexander
>> >> > >>
>> >> > >> -Original Message-
>> >> > >> From: Pramod Biligiri [mailto:pramodbilig...@gmail.com]
>> >> > >> Sent: Friday, May 01, 2015 1:46 AM
>> >> > >> To: dev@spark.apache.org
>> >> > >> Subject: Speeding up Spark build during development
>> >> > >>
>> >> > >> Hi,
>> >> > >> I'm making some small changes to the Spark codebase and trying it
>> out
>> >> > >>on a
>> >> > >> cluster. I was wondering if there's a faster way to build than
>> >> running
>> >> > >>the
>> >> > >> package target each time.
>> >> > >> Currently I'm using: mvn -DskipTests  package
>> >> > >>
>> >> > >> All the nodes have the same filesystem mounted at the same mount
>> >> point.
>> >> > >>
>> >> > >> Pramod
>> >> > >>
>> >> >
>> >> > 
>> >> >
>> >> >
>> >> >
>> >>
>> >
>> >
>>
>
>
>
> --
> Emre Sevinc
>


Re: Speeding up Spark build during development

2015-05-04 Thread Emre Sevinc
Hello Pramod,

Do you need to build the whole project every time? Generally you don't,
e.g., when I was changing some files that belong only to Spark Streaming, I
was building only the streaming (of course after having built and installed
the whole project, but that was done only once), and then the assembly.
This was much faster than trying to build the whole Spark every time.

--
Emre Sevinç

On Mon, May 4, 2015 at 9:01 AM, Pramod Biligiri 
wrote:

> Using the inbuilt maven and zinc it takes around 10 minutes for each build.
> Is that reasonable?
> My maven opts looks like this:
> $ echo $MAVEN_OPTS
> -Xmx12000m -XX:MaxPermSize=2048m
>
> I'm running it as build/mvn -DskipTests package
>
> Should I be tweaking my Zinc/Nailgun config?
>
> Pramod
>
> On Sun, May 3, 2015 at 3:40 PM, Mark Hamstra 
> wrote:
>
> >
> >
> https://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn
> >
> > On Sun, May 3, 2015 at 2:54 PM, Pramod Biligiri <
> pramodbilig...@gmail.com>
> > wrote:
> >
> >> This is great. I didn't know about the mvn script in the build
> directory.
> >>
> >> Pramod
> >>
> >> On Fri, May 1, 2015 at 9:51 AM, York, Brennon <
> >> brennon.y...@capitalone.com>
> >> wrote:
> >>
> >> > Following what Ted said, if you leverage the `mvn` from within the
> >> > `build/` directory of Spark you'll get zinc for free which should help
> >> > speed up build times.
> >> >
> >> > On 5/1/15, 9:45 AM, "Ted Yu"  wrote:
> >> >
> >> > >Pramod:
> >> > >Please remember to run Zinc so that the build is faster.
> >> > >
> >> > >Cheers
> >> > >
> >> > >On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander
> >> > >
> >> > >wrote:
> >> > >
> >> > >> Hi Pramod,
> >> > >>
> >> > >> For cluster-like tests you might want to use the same code as in
> >> mllib's
> >> > >> LocalClusterSparkContext. You can rebuild only the package that you
> >> > >>change
> >> > >> and then run this main class.
> >> > >>
> >> > >> Best regards, Alexander
> >> > >>
> >> > >> -Original Message-
> >> > >> From: Pramod Biligiri [mailto:pramodbilig...@gmail.com]
> >> > >> Sent: Friday, May 01, 2015 1:46 AM
> >> > >> To: dev@spark.apache.org
> >> > >> Subject: Speeding up Spark build during development
> >> > >>
> >> > >> Hi,
> >> > >> I'm making some small changes to the Spark codebase and trying it
> out
> >> > >>on a
> >> > >> cluster. I was wondering if there's a faster way to build than
> >> running
> >> > >>the
> >> > >> package target each time.
> >> > >> Currently I'm using: mvn -DskipTests  package
> >> > >>
> >> > >> All the nodes have the same filesystem mounted at the same mount
> >> point.
> >> > >>
> >> > >> Pramod
> >> > >>
> >> >
> >> > 
> >> >
> >> >
> >> >
> >>
> >
> >
>



-- 
Emre Sevinc


Re: Speeding up Spark build during development

2015-05-04 Thread Pramod Biligiri
Using the built-in Maven and Zinc it takes around 10 minutes for each build.
Is that reasonable?
My Maven opts look like this:
$ echo $MAVEN_OPTS
-Xmx12000m -XX:MaxPermSize=2048m

I'm running it as build/mvn -DskipTests package

Should I be tweaking my Zinc/Nailgun config?

Pramod
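The options quoted above can be set and verified before invoking the build; a sketch (the memory sizes here are illustrative, not recommendations, and -XX:MaxPermSize only applies on JDK 7 and earlier -- later JDKs warn and ignore it):

```shell
# Set JVM options that build/mvn will pick up from the environment:
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512m"

# Confirm what the build will see:
echo "$MAVEN_OPTS"
# -> -Xmx2g -XX:MaxPermSize=512m

# Then invoke the build as before:
# build/mvn -DskipTests package
```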

On Sun, May 3, 2015 at 3:40 PM, Mark Hamstra 
wrote:

>
> https://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn
>
> On Sun, May 3, 2015 at 2:54 PM, Pramod Biligiri 
> wrote:
>
>> This is great. I didn't know about the mvn script in the build directory.
>>
>> Pramod
>>
>> On Fri, May 1, 2015 at 9:51 AM, York, Brennon <
>> brennon.y...@capitalone.com>
>> wrote:
>>
>> > Following what Ted said, if you leverage the `mvn` from within the
>> > `build/` directory of Spark you'll get zinc for free which should help
>> > speed up build times.
>> >
>> > On 5/1/15, 9:45 AM, "Ted Yu"  wrote:
>> >
>> > >Pramod:
>> > >Please remember to run Zinc so that the build is faster.
>> > >
>> > >Cheers
>> > >
>> > >On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander
>> > >
>> > >wrote:
>> > >
>> > >> Hi Pramod,
>> > >>
>> > >> For cluster-like tests you might want to use the same code as in
>> mllib's
>> > >> LocalClusterSparkContext. You can rebuild only the package that you
>> > >>change
>> > >> and then run this main class.
>> > >>
>> > >> Best regards, Alexander
>> > >>
>> > >> -Original Message-
>> > >> From: Pramod Biligiri [mailto:pramodbilig...@gmail.com]
>> > >> Sent: Friday, May 01, 2015 1:46 AM
>> > >> To: dev@spark.apache.org
>> > >> Subject: Speeding up Spark build during development
>> > >>
>> > >> Hi,
>> > >> I'm making some small changes to the Spark codebase and trying it out
>> > >>on a
>> > >> cluster. I was wondering if there's a faster way to build than
>> running
>> > >>the
>> > >> package target each time.
>> > >> Currently I'm using: mvn -DskipTests  package
>> > >>
>> > >> All the nodes have the same filesystem mounted at the same mount
>> point.
>> > >>
>> > >> Pramod
>> > >>
>> >
>> >
>>
>
>


Re: Speeding up Spark build during development

2015-05-03 Thread Mark Hamstra
https://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn

On Sun, May 3, 2015 at 2:54 PM, Pramod Biligiri 
wrote:

> This is great. I didn't know about the mvn script in the build directory.
>
> Pramod
>
> On Fri, May 1, 2015 at 9:51 AM, York, Brennon  >
> wrote:
>
> > Following what Ted said, if you leverage the `mvn` from within the
> > `build/` directory of Spark you'll get zinc for free which should help
> > speed up build times.
> >
> > On 5/1/15, 9:45 AM, "Ted Yu"  wrote:
> >
> > >Pramod:
> > >Please remember to run Zinc so that the build is faster.
> > >
> > >Cheers
> > >
> > >On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander
> > >
> > >wrote:
> > >
> > >> Hi Pramod,
> > >>
> > >> For cluster-like tests you might want to use the same code as in
> mllib's
> > >> LocalClusterSparkContext. You can rebuild only the package that you
> > >>change
> > >> and then run this main class.
> > >>
> > >> Best regards, Alexander
> > >>
> > >> -Original Message-
> > >> From: Pramod Biligiri [mailto:pramodbilig...@gmail.com]
> > >> Sent: Friday, May 01, 2015 1:46 AM
> > >> To: dev@spark.apache.org
> > >> Subject: Speeding up Spark build during development
> > >>
> > >> Hi,
> > >> I'm making some small changes to the Spark codebase and trying it out
> > >>on a
> > >> cluster. I was wondering if there's a faster way to build than running
> > >>the
> > >> package target each time.
> > >> Currently I'm using: mvn -DskipTests  package
> > >>
> > >> All the nodes have the same filesystem mounted at the same mount
> point.
> > >>
> > >> Pramod
> > >>
> >
> >
> >
>


Re: Speeding up Spark build during development

2015-05-03 Thread Pramod Biligiri
This is great. I didn't know about the mvn script in the build directory.

Pramod

On Fri, May 1, 2015 at 9:51 AM, York, Brennon 
wrote:

> Following what Ted said, if you leverage the `mvn` from within the
> `build/` directory of Spark you'll get zinc for free which should help
> speed up build times.
>
> On 5/1/15, 9:45 AM, "Ted Yu"  wrote:
>
> >Pramod:
> >Please remember to run Zinc so that the build is faster.
> >
> >Cheers
> >
> >On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander
> >
> >wrote:
> >
> >> Hi Pramod,
> >>
> >> For cluster-like tests you might want to use the same code as in mllib's
> >> LocalClusterSparkContext. You can rebuild only the package that you
> >>change
> >> and then run this main class.
> >>
> >> Best regards, Alexander
> >>
> >> -----Original Message-
> >> From: Pramod Biligiri [mailto:pramodbilig...@gmail.com]
> >> Sent: Friday, May 01, 2015 1:46 AM
> >> To: dev@spark.apache.org
> >> Subject: Speeding up Spark build during development
> >>
> >> Hi,
> >> I'm making some small changes to the Spark codebase and trying it out
> >>on a
> >> cluster. I was wondering if there's a faster way to build than running
> >>the
> >> package target each time.
> >> Currently I'm using: mvn -DskipTests  package
> >>
> >> All the nodes have the same filesystem mounted at the same mount point.
> >>
> >> Pramod
> >>
>
>
>


Re: Speeding up Spark build during development

2015-05-01 Thread York, Brennon
Following what Ted said, if you leverage the `mvn` from within the
`build/` directory of Spark you'll get zinc for free which should help
speed up build times.
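
Concretely, the workflow described above might look like the following sketch (paths reflect the Spark source tree of that era; the wrapper downloads and manages Zinc itself):

```shell
# From the Spark source root, use the bundled Maven wrapper instead of
# a system-wide mvn. It launches a Zinc incremental-compile server
# automatically, so repeated builds reuse warm compiler state.
cd spark

# First build: Zinc server is started as a side effect.
build/mvn -DskipTests package

# Subsequent builds talk to the already-running Zinc/Nailgun server
# and should be noticeably faster than cold Maven builds.
build/mvn -DskipTests package
```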

On 5/1/15, 9:45 AM, "Ted Yu"  wrote:

>Pramod:
>Please remember to run Zinc so that the build is faster.
>
>Cheers
>
>On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander
>
>wrote:
>
>> Hi Pramod,
>>
>> For cluster-like tests you might want to use the same code as in mllib's
>> LocalClusterSparkContext. You can rebuild only the package that you
>>change
>> and then run this main class.
>>
>> Best regards, Alexander
>>
>> -Original Message-
>> From: Pramod Biligiri [mailto:pramodbilig...@gmail.com]
>> Sent: Friday, May 01, 2015 1:46 AM
>> To: dev@spark.apache.org
>> Subject: Speeding up Spark build during development
>>
>> Hi,
>> I'm making some small changes to the Spark codebase and trying it out
>>on a
>> cluster. I was wondering if there's a faster way to build than running
>>the
>> package target each time.
>> Currently I'm using: mvn -DskipTests  package
>>
>> All the nodes have the same filesystem mounted at the same mount point.
>>
>> Pramod
>>








Re: Speeding up Spark build during development

2015-05-01 Thread Ted Yu
Pramod:
Please remember to run Zinc so that the build is faster.

Cheers

On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander 
wrote:

> Hi Pramod,
>
> For cluster-like tests you might want to use the same code as in mllib's
> LocalClusterSparkContext. You can rebuild only the package that you change
> and then run this main class.
>
> Best regards, Alexander
>
> -Original Message-
> From: Pramod Biligiri [mailto:pramodbilig...@gmail.com]
> Sent: Friday, May 01, 2015 1:46 AM
> To: dev@spark.apache.org
> Subject: Speeding up Spark build during development
>
> Hi,
> I'm making some small changes to the Spark codebase and trying it out on a
> cluster. I was wondering if there's a faster way to build than running the
> package target each time.
> Currently I'm using: mvn -DskipTests  package
>
> All the nodes have the same filesystem mounted at the same mount point.
>
> Pramod
>


RE: Speeding up Spark build during development

2015-05-01 Thread Ulanov, Alexander
Hi Pramod,

For cluster-like tests you might want to use the same code as in mllib's 
LocalClusterSparkContext. You can rebuild only the package that you change and 
then run this main class.
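
A minimal sketch of the module-only rebuild suggested above, using standard Maven reactor flags (`-pl`, `--also-make-dependents`); the `mllib` module name is an assumption, substitute whichever module directory you are actually editing:

```shell
# Rebuild only the changed module instead of the whole tree.
# -pl selects the module by its directory name (hypothetical: mllib).
build/mvn -DskipTests -pl mllib package

# If other modules depend on your change, also rebuild its dependents:
build/mvn -DskipTests -pl mllib --also-make-dependents package
```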

Best regards, Alexander

-Original Message-
From: Pramod Biligiri [mailto:pramodbilig...@gmail.com] 
Sent: Friday, May 01, 2015 1:46 AM
To: dev@spark.apache.org
Subject: Speeding up Spark build during development

Hi,
I'm making some small changes to the Spark codebase and trying it out on a 
cluster. I was wondering if there's a faster way to build than running the 
package target each time.
Currently I'm using: mvn -DskipTests package

All the nodes have the same filesystem mounted at the same mount point.

Pramod


Re: Speeding up Spark build during development

2015-05-01 Thread Prashant Sharma
Hi Pramod,

If you are using sbt as your build, you need to run sbt assembly once
and then use sbt ~compile. Also export SPARK_PREPEND_CLASSES=1 in your
shell on all nodes.
Maybe you can try this out?
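
The sbt workflow above, sketched end to end (assuming the bundled `build/sbt` wrapper; invoking a system-wide `sbt` from the source root works the same way):

```shell
# One-time full assembly build of Spark:
build/sbt assembly

# Tell Spark's launch scripts to prepend freshly compiled classes to
# the classpath, so the assembly jar need not be rebuilt on each change.
# Set this on the driver shell and on all worker nodes.
export SPARK_PREPEND_CLASSES=1

# Leave sbt's triggered-execution loop running while you edit;
# it recompiles incrementally on every file save.
build/sbt ~compile
```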

Thanks,

Prashant Sharma



On Fri, May 1, 2015 at 2:16 PM, Pramod Biligiri 
wrote:

> Hi,
> I'm making some small changes to the Spark codebase and trying it out on a
> cluster. I was wondering if there's a faster way to build than running the
> package target each time.
> Currently I'm using: mvn -DskipTests  package
>
> All the nodes have the same filesystem mounted at the same mount point.
>
> Pramod
>


Speeding up Spark build during development

2015-05-01 Thread Pramod Biligiri
Hi,
I'm making some small changes to the Spark codebase and trying it out on a
cluster. I was wondering if there's a faster way to build than running the
package target each time.
Currently I'm using: mvn -DskipTests package

All the nodes have the same filesystem mounted at the same mount point.

Pramod