Re: Speeding up Spark build during development
[snip: quoted thread]

--
Iulian Dragos

--
Reactive Apps on the JVM
www.typesafe.com
Re: Speeding up Spark build during development
FWIW, my Spark SQL development workflow is usually to run `build/sbt sparkShell` or `build/sbt 'sql/test-only '`. These commands start in as little as 30s on my laptop, automatically figure out which subprojects need to be rebuilt, and don't require the expensive assembly creation.

On Mon, May 4, 2015 at 5:48 AM, Meethu Mathew wrote:

> Hi,
>
> Is it really necessary to run `mvn --projects assembly/ -DskipTests install`?
> Could you please explain why this is needed?
> I got the changes after running `mvn --projects streaming/ -DskipTests package`.
>
> Regards,
> Meethu
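The sbt loop described above can be sketched as a few commands. This is a sketch, assuming a Spark 1.x checkout; the suite glob is a hypothetical placeholder, since the argument of the original `sql/test-only` command was lost in the archive:

```shell
# Incremental dev loop via the bundled sbt launcher, run from the top of a
# Spark checkout. Only changed subprojects get recompiled, and no assembly
# jar is produced. The suite name below is a hypothetical example.
build/sbt sparkShell                       # Spark shell built straight from source
build/sbt "sql/test-only *SQLQuerySuite"   # run one suite in the sql subproject

# Keeping an sbt session open is faster still: the JVM and the incremental
# compiler stay warm between commands, e.g.
#   $ build/sbt
#   > ~sql/test-only *SQLQuerySuite        # re-runs the suite on every file save
```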
Re: Speeding up Spark build during development
Hi,

Is it really necessary to run `mvn --projects assembly/ -DskipTests install`? Could you please explain why this is needed? I got the changes after running `mvn --projects streaming/ -DskipTests package`.

Regards,
Meethu

On Monday 04 May 2015 02:20 PM, Emre Sevinc wrote:

> Just to give you an example:
>
> When I was trying to make a small change only to the Streaming component of
> Spark, first I built and installed the whole Spark project (this took about
> 15 minutes on my 4-core, 4 GB RAM laptop). Then, after having changed files
> only in Streaming, I ran something like (in the top-level directory):
>
> mvn --projects streaming/ -DskipTests package
>
> and then
>
> mvn --projects assembly/ -DskipTests install
>
> This was much faster than trying to build the whole Spark from scratch,
> because Maven was only building one component, in my case the Streaming
> component, of Spark. I think you can use a very similar approach.
>
> --
> Emre Sevinç
Re: Speeding up Spark build during development
Just to give you an example:

When I was trying to make a small change only to the Streaming component of Spark, first I built and installed the whole Spark project (this took about 15 minutes on my 4-core, 4 GB RAM laptop). Then, after having changed files only in Streaming, I ran something like (in the top-level directory):

mvn --projects streaming/ -DskipTests package

and then

mvn --projects assembly/ -DskipTests install

This was much faster than trying to build the whole Spark from scratch, because Maven was only building one component, in my case the Streaming component, of Spark. I think you can use a very similar approach.

--
Emre Sevinç

On Mon, May 4, 2015 at 10:44 AM, Pramod Biligiri wrote:

> No, I just need to build one project at a time. Right now SparkSql.
>
> Pramod
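Emre's two-step flow can be written down as a small script. This is a sketch, not an official recipe: the `run` helper and its dry-run mode are additions of mine so the commands can be previewed without a Spark checkout, and `streaming/` is just the module used as the example in this thread.

```shell
#!/bin/sh
# Incremental Maven flow from this thread, run at the top of a Spark checkout.
# DRY_RUN=1 (the default here) only prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"       # dry run: show the command
  else
    "$@"              # real run: execute it
  fi
}

# One-time: build and install every module into the local ~/.m2 repository.
run build/mvn -DskipTests install

# After that, rebuild only the module you touched...
run build/mvn --projects streaming/ -DskipTests package

# ...and regenerate the assembly jar that actually ships to the cluster.
# Skipping this step is why a rebuilt module alone never reaches the cluster.
run build/mvn --projects assembly/ -DskipTests install
```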
Re: Speeding up Spark build during development
No, I just need to build one project at a time. Right now SparkSql.

Pramod

On Mon, May 4, 2015 at 12:09 AM, Emre Sevinc wrote:

> Hello Pramod,
>
> Do you need to build the whole project every time? Generally you don't,
> e.g., when I was changing some files that belong only to Spark Streaming, I
> was building only the streaming (of course after having built and installed
> the whole project, but that was done only once), and then the assembly.
> This was much faster than trying to build the whole Spark every time.
>
> --
> Emre Sevinç
Re: Speeding up Spark build during development
Hello Pramod,

Do you need to build the whole project every time? Generally you don't, e.g., when I was changing some files that belong only to Spark Streaming, I was building only the streaming (of course after having built and installed the whole project, but that was done only once), and then the assembly. This was much faster than trying to build the whole Spark every time.

--
Emre Sevinç

On Mon, May 4, 2015 at 9:01 AM, Pramod Biligiri wrote:

> Using the inbuilt maven and zinc it takes around 10 minutes for each build.
> Is that reasonable?
> My MAVEN_OPTS looks like this:
> $ echo $MAVEN_OPTS
> -Xmx12000m -XX:MaxPermSize=2048m
>
> I'm running it as build/mvn -DskipTests package
>
> Should I be tweaking my Zinc/Nailgun config?
>
> Pramod
Re: Speeding up Spark build during development
Using the built-in Maven and Zinc, it takes around 10 minutes for each build. Is that reasonable?

My MAVEN_OPTS looks like this:
$ echo $MAVEN_OPTS
-Xmx12000m -XX:MaxPermSize=2048m

I'm running it as: build/mvn -DskipTests package

Should I be tweaking my Zinc/Nailgun config?

Pramod
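For comparison, a minimal setup along these lines (the heap sizes here are illustrative examples, not recommendations):

```shell
# Give Maven enough heap; tune these values for your machine.
export MAVEN_OPTS="-Xmx4g -XX:MaxPermSize=1g"

# build/mvn downloads its own Maven and Zinc on first use, so subsequent
# compiles go through the long-lived Zinc compile server.
build/mvn -DskipTests package
```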
Re: Speeding up Spark build during development
https://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn
Re: Speeding up Spark build during development
This is great. I didn't know about the mvn script in the build directory.

Pramod
Re: Speeding up Spark build during development
Following what Ted said, if you leverage the `mvn` from within the `build/` directory of Spark you'll get zinc for free, which should help speed up build times.
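Once you are on build/mvn, you can also trim the work with standard Maven flags; a sketch, assuming all dependencies are already in your local cache:

```shell
# Offline mode (-o) skips dependency re-resolution once everything is cached.
build/mvn -o -DskipTests package

# Compile only, without re-assembling jars, when you just want to typecheck:
build/mvn -o -DskipTests compile
```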
Re: Speeding up Spark build during development
Pramod:
Please remember to run Zinc so that the build is faster.

Cheers
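If the server isn't picked up automatically, the standalone zinc that build/mvn downloads can be driven by hand; a sketch (the version in the directory name is an example, check what was actually fetched into build/):

```shell
# Start the long-lived compile server (directory name is an example):
build/zinc-0.3.5.3/bin/zinc -start

# Check that it is up, and shut it down when done:
build/zinc-0.3.5.3/bin/zinc -status
build/zinc-0.3.5.3/bin/zinc -shutdown
```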
RE: Speeding up Spark build during development
Hi Pramod,

For cluster-like tests you might want to use the same code as in mllib's LocalClusterSparkContext. You can rebuild only the package that you change and then run this main class.

Best regards,
Alexander
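The "rebuild only the package that you change" step can be done with Maven's standard reactor flags; the module below is an example:

```shell
# Rebuild just the mllib module, assuming its dependencies were built before:
build/mvn -DskipTests -pl mllib package

# Or also rebuild the modules mllib depends on, in case they changed too:
build/mvn -DskipTests -pl mllib -am package
```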
Re: Speeding up Spark build during development
Hi Pramod,

If you are using sbt as your build, you need to run sbt assembly once and then use sbt ~compile. Also export SPARK_PREPEND_CLASSES=1 in your shell, on all nodes. Maybe you can try this out?

Thanks,
Prashant Sharma
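A sketch of that sbt loop, assuming a typical Spark checkout with the bundled build/sbt launcher:

```shell
# One-time: build the full assembly so the cluster has a complete jar.
build/sbt assembly

# Let the prepended classes be picked up ahead of the assembly jar;
# this needs to be set on the worker nodes as well.
export SPARK_PREPEND_CLASSES=1

# Then let sbt recompile incrementally on every source change:
build/sbt ~compile
```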
Speeding up Spark build during development
Hi,

I'm making some small changes to the Spark codebase and trying it out on a cluster. I was wondering if there's a faster way to build than running the package target each time. Currently I'm using: mvn -DskipTests package

All the nodes have the same filesystem mounted at the same mount point.

Pramod