Great, Chandeep. I also have the Eclipse Scala IDE, as below:

Scala IDE build of Eclipse SDK
Build id: 4.3.0-vfinal-2015-12-01T15:55:22Z-Typesafe

I am no expert on Eclipse, so if I create a project called ImportCSV, where do I need to put the pom file, and how do I reference it, please? My Eclipse runs on a Linux host, so it can access all the directories that the sbt project accesses. I also believe there will be no need for external jar files in the build path?
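My guess from the Maven docs is that the pom.xml simply sits at the top of the project, next to src, with the same source tree that sbt uses. Something along these lines (a rough sketch on my part, assuming Maven's standard directory conventions, so please correct me if I have it wrong):

    ImportCSV/
        pom.xml
        src/
            main/
                scala/
                    ImportCSV.scala
            test/
                scala/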
Thanks,

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com


On 15 March 2016 at 12:15, Chandeep Singh <c...@chandeep.com> wrote:

> Btw, just to add to the confusion ;) I use Maven as well since I moved
> from Java to Scala, but everyone I talk to has been recommending SBT for
> Scala.
>
> I use the Eclipse Scala IDE to build. http://scala-ide.org/
>
> Here is my sample POM. You can add dependencies based on your requirements.
>
> <project xmlns="http://maven.apache.org/POM/4.0.0"
>          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>          xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
>          http://maven.apache.org/maven-v4_0_0.xsd">
>   <modelVersion>4.0.0</modelVersion>
>   <groupId>spark</groupId>
>   <artifactId>scala</artifactId>
>   <version>1.0</version>
>   <name>${project.artifactId}</name>
>
>   <properties>
>     <maven.compiler.source>1.7</maven.compiler.source>
>     <maven.compiler.target>1.7</maven.compiler.target>
>     <encoding>UTF-8</encoding>
>     <scala.version>2.10.4</scala.version>
>     <maven-scala-plugin.version>2.15.2</maven-scala-plugin.version>
>   </properties>
>
>   <repositories>
>     <repository>
>       <id>cloudera-repo-releases</id>
>       <url>https://repository.cloudera.com/artifactory/repo/</url>
>     </repository>
>   </repositories>
>
>   <dependencies>
>     <dependency>
>       <groupId>org.scala-lang</groupId>
>       <artifactId>scala-library</artifactId>
>       <version>${scala.version}</version>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.spark</groupId>
>       <artifactId>spark-core_2.10</artifactId>
>       <version>1.5.0-cdh5.5.1</version>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.spark</groupId>
>       <artifactId>spark-mllib_2.10</artifactId>
>       <version>1.5.0-cdh5.5.1</version>
>     </dependency>
>     <dependency>
>       <groupId>org.apache.spark</groupId>
>       <artifactId>spark-hive_2.10</artifactId>
>       <version>1.5.0</version>
>     </dependency>
>   </dependencies>
>
>   <build>
>     <sourceDirectory>src/main/scala</sourceDirectory>
>     <testSourceDirectory>src/test/scala</testSourceDirectory>
>     <plugins>
>       <plugin>
>         <groupId>org.scala-tools</groupId>
>         <artifactId>maven-scala-plugin</artifactId>
>         <version>${maven-scala-plugin.version}</version>
>         <executions>
>           <execution>
>             <goals>
>               <goal>compile</goal>
>               <goal>testCompile</goal>
>             </goals>
>           </execution>
>         </executions>
>         <configuration>
>           <jvmArgs>
>             <jvmArg>-Xms64m</jvmArg>
>             <jvmArg>-Xmx1024m</jvmArg>
>           </jvmArgs>
>         </configuration>
>       </plugin>
>       <plugin>
>         <groupId>org.apache.maven.plugins</groupId>
>         <artifactId>maven-shade-plugin</artifactId>
>         <version>1.6</version>
>         <executions>
>           <execution>
>             <phase>package</phase>
>             <goals>
>               <goal>shade</goal>
>             </goals>
>             <configuration>
>               <filters>
>                 <filter>
>                   <artifact>*:*</artifact>
>                   <excludes>
>                     <exclude>META-INF/*.SF</exclude>
>                     <exclude>META-INF/*.DSA</exclude>
>                     <exclude>META-INF/*.RSA</exclude>
>                   </excludes>
>                 </filter>
>               </filters>
>               <transformers>
>                 <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
>                   <mainClass>com.group.id.Launcher1</mainClass>
>                 </transformer>
>               </transformers>
>             </configuration>
>           </execution>
>         </executions>
>       </plugin>
>     </plugins>
>   </build>
> </project>
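> The pom.xml itself lives at the root of the project, alongside src. With
> that in place, building and submitting is roughly (just a sketch; the main
> class com.group.id.Launcher1 above is a placeholder you would replace with
> your own, and the jar name follows from the artifactId and version):
>
> mvn clean package
> ${SPARK_HOME}/bin/spark-submit --class com.group.id.Launcher1 target/scala-1.0.jar
>
> The shade plugin replaces the plain jar with a fat jar, so target/scala-1.0.jar
> already bundles the dependencies. In practice you may want to mark the Spark
> dependencies as provided so they are not packed into it.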
>
> On Mar 15, 2016, at 12:09 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
> Ok.
>
> Sounds like opinion is divided :)
>
> I will try to build a Scala app with Maven.
>
> When I build with SBT I follow this directory structure:
>
> The top-level directory is the application name, e.g.
>
> ImportCSV
>
> Under ImportCSV I have a directory src and the sbt file ImportCSV.sbt.
>
> In directory src I have main and scala subdirectories. My Scala file is in
>
> ImportCSV/src/main/scala
>
> called ImportCSV.scala.
>
> I then have a shell script that runs everything under the ImportCSV directory:
>
> cat generic.ksh
> #!/bin/ksh
> #--------------------------------------------------------------------------------
> #
> # Procedure:    generic.ksh
> #
> # Description:  Compiles and runs a Scala app using sbt and spark-submit
> #
> # Parameters:   none
> #
> #--------------------------------------------------------------------------------
> # Vers| Date   | Who | DA | Description
> #-----+--------+-----+----+-----------------------------------------------------
> # 1.0 |04/03/15| MT  |    | Initial Version
> #--------------------------------------------------------------------------------
> #
> function F_USAGE
> {
>   echo "USAGE: ${1##*/} -A '<Application>'"
>   echo "USAGE: ${1##*/} -H '<HELP>' -h '<HELP>'"
>   exit 10
> }
> #
> # Main Section
> #
> if [[ "${1}" = "-h" || "${1}" = "-H" ]]; then
>   F_USAGE $0
> fi
> ## MAP INPUT TO VARIABLES
> while getopts A: opt
> do
>   case $opt in
>     (A) APPLICATION="$OPTARG" ;;
>     (*) F_USAGE $0 ;;
>   esac
> done
> [[ -z ${APPLICATION} ]] && print "You must specify an application value" && F_USAGE $0
> ENVFILE=/home/hduser/dba/bin/environment.ksh
> if [[ -f $ENVFILE ]]
> then
>   . $ENVFILE
>   . ~/spark_1.5.2_bin-hadoop2.6.kshrc
> else
>   echo "Abort: $0 failed. No environment file ( $ENVFILE ) found"
>   exit 1
> fi
> ##FILE_NAME=`basename $0 .ksh`
> FILE_NAME=${APPLICATION}
> CLASS=`echo ${FILE_NAME}|tr "[:upper:]" "[:lower:]"`
> NOW="`date +%Y%m%d_%H%M`"
> LOG_FILE=${LOGDIR}/${FILE_NAME}.log
> [ -f ${LOG_FILE} ] && rm -f ${LOG_FILE}
> print "\n" `date` ", Started $0" | tee -a ${LOG_FILE}
> cd ../${FILE_NAME}
> print "Compiling ${FILE_NAME}" | tee -a ${LOG_FILE}
> sbt package
> print "Submitting the job" | tee -a ${LOG_FILE}
> ${SPARK_HOME}/bin/spark-submit \
>   --packages com.databricks:spark-csv_2.10:1.3.0 \
>   --class "${FILE_NAME}" \
>   --master spark://50.140.197.217:7077 \
>   --executor-memory=12G \
>   --executor-cores=12 \
>   --num-executors=2 \
>   target/scala-2.10/${CLASS}_2.10-1.0.jar
> print `date` ", Finished $0" | tee -a ${LOG_FILE}
> exit
>
> So to run it for ImportCSV, all I need to do is:
>
> ./generic.ksh -A ImportCSV
>
> Now can anyone kindly give me a rough guideline on the directory structure
> and the location of pom.xml to make this work using Maven?
>
> Thanks
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
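> For reference, the ImportCSV.sbt file mentioned above is only a few lines,
> roughly along these lines (simplified here; the exact Spark version and the
> provided scope would need to match your own setup):
>
> name := "ImportCSV"
>
> version := "1.0"
>
> scalaVersion := "2.10.4"
>
> libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.2" % "provided"
>
> With that name and version, sbt package writes
> target/scala-2.10/importcsv_2.10-1.0.jar (sbt lower-cases the project name),
> which is exactly the jar path the script submits.
>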
> On 15 March 2016 at 10:50, Sean Owen <so...@cloudera.com> wrote:
>
>> FWIW, I strongly prefer Maven over SBT even for Scala projects. The
>> Spark build of reference is Maven.
>>
>> On Tue, Mar 15, 2016 at 10:45 AM, Chandeep Singh <c...@chandeep.com> wrote:
>> > For Scala, SBT is recommended.
>> >
>> > On Mar 15, 2016, at 10:42 AM, Mich Talebzadeh <mich.talebza...@gmail.com>
>> > wrote:
>> >
>> > Hi,
>> >
>> > I build my Spark/Scala packages using SBT, which works fine. I have
>> > created generic shell scripts to build and submit them.
>> >
>> > Yesterday I noticed that some use Maven and a POM for this purpose.
>> >
>> > Which approach is recommended?
>> >
>> > Thanks,
>> >
>> > Dr Mich Talebzadeh
>> >
>> > LinkedIn:
>> > https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>> >
>> > http://talebzadehmich.wordpress.com