Re: Speeding up Spark build during development
I had to make a small change to Emre's suggestion above in order for my changes to get picked up. This worked for me:

mvn --projects sql/core -DskipTests install    # note: install, not package
mvn --projects assembly/ -DskipTests install

Pramod
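The likely reason "install" matters here (touching on Meethu's question further down): building only the assembly module resolves the other Spark modules from the local Maven repository rather than from the sibling target/ directories, so a module that was merely packaged never reaches the assembly. A sketch of the loop with that in mind:

# "install" copies the rebuilt spark-sql artifact into ~/.m2, which is where a
# --projects assembly/ build resolves its module dependencies from; "package"
# alone would leave the new jar stranded in sql/core/target/
mvn --projects sql/core -DskipTests install

# now the assembly build picks up the freshly installed artifact
mvn --projects assembly/ -DskipTests install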
On Tue, May 5, 2015 at 2:36 AM, Iulian Dragoș wrote:
Re: Speeding up Spark build during development
I'm probably the only Eclipse user here, but it seems I have the best workflow :) At least for me things work as they should: once I imported the projects into the workspace I can build and run/debug tests from the IDE. I only go to sbt when I need to re-create the projects or I want to run the full test suite.

iulian

On Tue, May 5, 2015 at 7:35 AM, Tathagata Das wrote:
Re: Speeding up Spark build during development
In addition to Michael's suggestion, in my SBT workflow I also use "~" to automatically kick off builds and unit tests. For example:

sbt/sbt "~streaming/test-only *BasicOperationsSuite*"

It will automatically detect any file changes in the project and start the compilation and testing. So my full workflow involves changing code in IntelliJ and then continuously running unit tests in the background on the command line using this "~".

TD
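The same watch-mode trick should carry over to other modules; for Spark SQL work it might look like the following (the suite name is only a placeholder, not one mentioned in this thread):

# re-runs the matching Spark SQL suite every time a source file changes;
# substitute *MyChangedSuite* with whichever suite exercises your change
build/sbt "~sql/test-only *MyChangedSuite*"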
On Mon, May 4, 2015 at 2:49 PM, Michael Armbrust wrote:
Re: Speeding up Spark build during development
FWIW... My Spark SQL development workflow is usually to run "build/sbt sparkShell" or "build/sbt 'sql/test-only '". These commands start in as little as 30s on my laptop, automatically figure out which subprojects need to be rebuilt, and don't require the expensive assembly creation.
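A sketch of that loop (the suite name in the original message was lost in archiving, so a placeholder is used here):

# one-shot run of a single SQL test suite, no assembly build required
build/sbt "sql/test-only *MyChangedSuite*"

# or start an sbt-launched Spark shell to try the change out interactively
build/sbt sparkShell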
On Mon, May 4, 2015 at 5:48 AM, Meethu Mathew wrote:
Re: Speeding up Spark build during development
Hi,

Is it really necessary to run "mvn --projects assembly/ -DskipTests install"? Could you please explain why this is needed?
I got the changes after running "mvn --projects streaming/ -DskipTests package".

Regards,
Meethu

On Monday 04 May 2015 02:20 PM, Emre Sevinc wrote:
Re: Speeding up Spark build during development
Just to give you an example:

When I was trying to make a small change only to the Streaming component of Spark, first I built and installed the whole Spark project (this took about 15 minutes on my 4-core, 4 GB RAM laptop). Then, after having changed files only in Streaming, I ran something like (in the top-level directory):

mvn --projects streaming/ -DskipTests package

and then

mvn --projects assembly/ -DskipTests install

This was much faster than trying to build the whole Spark from scratch, because Maven was only building one component, in my case the Streaming component, of Spark. I think you can use a very similar approach.

--
Emre Sevinç
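As an aside (these are standard Maven flags, not something recommended in this thread): -pl is the short form of --projects, and adding -am ("also make") rebuilds the modules the listed project depends on as well, which can help when a change spans more than one module:

# rebuild streaming plus anything it depends on, then refresh the assembly
mvn -pl streaming -am -DskipTests package
mvn -pl assembly -DskipTests install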
On Mon, May 4, 2015 at 10:44 AM, Pramod Biligiri wrote:
Re: Speeding up Spark build during development
No, I just need to build one project at a time. Right now SparkSql.

Pramod

On Mon, May 4, 2015 at 12:09 AM, Emre Sevinc wrote:
Re: Speeding up Spark build during development
Hello Pramod,

Do you need to build the whole project every time? Generally you don't; e.g., when I was changing some files that belong only to Spark Streaming, I was building only the streaming module (of course after having built and installed the whole project, but that was done only once), and then the assembly. This was much faster than trying to build the whole Spark every time.

--
Emre Sevinç

On Mon, May 4, 2015 at 9:01 AM, Pramod Biligiri wrote:
Re: Speeding up Spark build during development
Using the inbuilt maven and zinc it takes around 10 minutes for each build. Is that reasonable?

My maven opts look like this:

$ echo $MAVEN_OPTS
-Xmx12000m -XX:MaxPermSize=2048m

I'm running it as build/mvn -DskipTests package

Should I be tweaking my Zinc/Nailgun config?

Pramod

On Sun, May 3, 2015 at 3:40 PM, Mark Hamstra wrote:
Re: Speeding up Spark build during development
https://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn

On Sun, May 3, 2015 at 2:54 PM, Pramod Biligiri wrote:
Re: Speeding up Spark build during development
This is great. I didn't know about the mvn script in the build directory.

Pramod

On Fri, May 1, 2015 at 9:51 AM, York, Brennon wrote:
Re: Speeding up Spark build during development
Following what Ted said, if you leverage the `mvn` from within the `build/` directory of Spark you'll get zinc for free, which should help speed up build times.

On 5/1/15, 9:45 AM, "Ted Yu" wrote:
Re: Speeding up Spark build during development
Pramod:
Please remember to run Zinc so that the build is faster.

Cheers

On Fri, May 1, 2015 at 9:36 AM, Ulanov, Alexander wrote:
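For reference, if zinc is installed standalone (for example via a package manager, which is an assumption and not something stated in this thread) rather than through build/mvn, the compile server is typically managed like this:

# start the long-running zinc/nailgun compile server once per session
zinc -start

# check that it is up, and shut it down when finished
zinc -status
zinc -shutdown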
RE: Speeding up Spark build during development
Hi Pramod,

For cluster-like tests you might want to use the same code as in mllib's LocalClusterSparkContext. You can rebuild only the package that you change and then run this main class.

Best regards, Alexander

-----Original Message-----
From: Pramod Biligiri [mailto:pramodbilig...@gmail.com]
Sent: Friday, May 01, 2015 1:46 AM
To: dev@spark.apache.org
Subject: Speeding up Spark build during development

Hi,
I'm making some small changes to the Spark codebase and trying it out on a cluster. I was wondering if there's a faster way to build than running the package target each time.
Currently I'm using: mvn -DskipTests package

All the nodes have the same filesystem mounted at the same mount point.

Pramod
Re: Speeding up Spark build during development
Hi Pramod,

If you are using sbt as your build, then you need to do sbt assembly once and use sbt ~compile. Also export SPARK_PREPEND_CLASSES=1 in your shell and on all nodes. Maybe you can try this out?

Thanks,
Prashant Sharma
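A minimal sketch of the workflow Prashant describes, assuming an sbt-based checkout; SPARK_PREPEND_CLASSES makes Spark's launch scripts put freshly compiled classes ahead of the assembly jar on the classpath:

# one-time: build the full assembly so the scripts have complete jars to fall back on
build/sbt assembly

# make bin/spark-shell, bin/spark-submit, etc. prefer the compiled target/classes directories
export SPARK_PREPEND_CLASSES=1

# keep an incremental compile running while editing; no assembly rebuild per change
build/sbt ~compile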