Btw, just to add to the confusion ;) I've been using Maven as well since I moved from Java to Scala, but everyone I talk to has been recommending SBT for Scala.
I use the Eclipse Scala IDE to build: http://scala-ide.org/

Here is my sample POM. You can add dependencies based on your requirements.

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>spark</groupId>
  <artifactId>scala</artifactId>
  <version>1.0</version>
  <name>${project.artifactId}</name>

  <properties>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
    <encoding>UTF-8</encoding>
    <scala.version>2.10.4</scala.version>
    <maven-scala-plugin.version>2.15.2</maven-scala-plugin.version>
  </properties>

  <repositories>
    <repository>
      <id>cloudera-repo-releases</id>
      <url>https://repository.cloudera.com/artifactory/repo/</url>
    </repository>
  </repositories>

  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.5.0-cdh5.5.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-mllib_2.10</artifactId>
      <version>1.5.0-cdh5.5.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_2.10</artifactId>
      <version>1.5.0</version>
    </dependency>
  </dependencies>

  <build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <version>${maven-scala-plugin.version}</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <jvmArgs>
            <jvmArg>-Xms64m</jvmArg>
            <jvmArg>-Xmx1024m</jvmArg>
          </jvmArgs>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>1.6</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <filters>
                <filter>
                  <artifact>*:*</artifact>
                  <excludes>
                    <exclude>META-INF/*.SF</exclude>
                    <exclude>META-INF/*.DSA</exclude>
                    <exclude>META-INF/*.RSA</exclude>
                  </excludes>
                </filter>
              </filters>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                  <mainClass>com.group.id.Launcher1</mainClass>
                </transformer>
              </transformers>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

> On Mar 15, 2016, at 12:09 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
> Ok.
>
> Sounds like opinion is divided :)
>
> I will try to build a scala app with Maven.
>
> When I build with SBT I follow this directory structure:
>
> the high-level directory is the package name, like
>
> ImportCSV
>
> Under ImportCSV I have a directory src and the sbt file ImportCSV.sbt;
> in directory src I have main and scala subdirectories.
> My scala file is in
>
> ImportCSV/src/main/scala
>
> called ImportCSV.scala.
>
> I then have a shell script that runs everything under the ImportCSV directory:
>
> cat generic.ksh
> #!/bin/ksh
> #--------------------------------------------------------------------------------
> #
> # Procedure:    generic.ksh
> #
> # Description:  Compiles and runs a scala app using sbt and spark-submit
> #
> # Parameters:   none
> #
> #--------------------------------------------------------------------------------
> # Vers| Date   | Who | DA | Description
> #-----+--------+-----+----+-----------------------------------------------------
> # 1.0 |04/03/15| MT  |    | Initial Version
> #--------------------------------------------------------------------------------
> #
> function F_USAGE
> {
>   echo "USAGE: ${1##*/} -A '<Application>'"
>   echo "USAGE: ${1##*/} -H '<HELP>' -h '<HELP>'"
>   exit 10
> }
> #
> # Main Section
> #
> if [[ "${1}" = "-h" || "${1}" = "-H" ]]; then
>   F_USAGE $0
> fi
> ## MAP INPUT TO VARIABLES
> while getopts A: opt
> do
>   case $opt in
>     (A) APPLICATION="$OPTARG" ;;
>     (*) F_USAGE $0 ;;
>   esac
> done
> [[ -z ${APPLICATION} ]] && print "You must specify an application value " && F_USAGE $0
> ENVFILE=/home/hduser/dba/bin/environment.ksh
> if [[ -f $ENVFILE ]]
> then
>   . $ENVFILE
>   . ~/spark_1.5.2_bin-hadoop2.6.kshrc
> else
>   echo "Abort: $0 failed. No environment file ( $ENVFILE ) found"
>   exit 1
> fi
> ##FILE_NAME=`basename $0 .ksh`
> FILE_NAME=${APPLICATION}
> CLASS=`echo ${FILE_NAME}|tr "[:upper:]" "[:lower:]"`
> NOW="`date +%Y%m%d_%H%M`"
> LOG_FILE=${LOGDIR}/${FILE_NAME}.log
> [ -f ${LOG_FILE} ] && rm -f ${LOG_FILE}
> print "\n" `date` ", Started $0" | tee -a ${LOG_FILE}
> cd ../${FILE_NAME}
> print "Compiling ${FILE_NAME}" | tee -a ${LOG_FILE}
> sbt package
> print "Submitting the job" | tee -a ${LOG_FILE}
>
> ${SPARK_HOME}/bin/spark-submit \
>   --packages com.databricks:spark-csv_2.11:1.3.0 \
>   --class "${FILE_NAME}" \
>   --master spark://50.140.197.217:7077 \
>   --executor-memory=12G \
>   --executor-cores=12 \
>   --num-executors=2 \
>   target/scala-2.10/${CLASS}_2.10-1.0.jar
> print `date` ", Finished $0" | tee -a ${LOG_FILE}
> exit
>
> So to run it for ImportCSV, all I need to do is:
>
> ./generic.ksh -A ImportCSV
>
> Now can anyone kindly give me a rough guideline on the directory layout and location of pom.xml to make this work using Maven?
>
> Thanks
>
> Dr Mich Talebzadeh
>
> LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
>
> On 15 March 2016 at 10:50, Sean Owen <so...@cloudera.com> wrote:
> FWIW, I strongly prefer Maven over SBT even for Scala projects. The
> Spark build of reference is Maven.
>
> On Tue, Mar 15, 2016 at 10:45 AM, Chandeep Singh <c...@chandeep.com> wrote:
> > For Scala, SBT is recommended.
> >
> > On Mar 15, 2016, at 10:42 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> >
> > Hi,
> >
> > I build my Spark/Scala packages using SBT and that works fine. I have created
> > generic shell scripts to build and submit it.
> >
> > Yesterday I noticed that some use Maven and a POM for this purpose.
> >
> > Which approach is recommended?
> >
> > Thanks,
> >
> > Dr Mich Talebzadeh
> >
> > LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> >
> > http://talebzadehmich.wordpress.com
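On the directory-layout question quoted above: Maven expects the same src/main/scala layout the SBT project already uses, with pom.xml sitting at the project root next to src (roughly where ImportCSV.sbt is now). A minimal sketch, reusing the ImportCSV name from the thread (the exact jar name produced will depend on the POM's artifactId and version, so treat those parts as assumptions):

```shell
# Maven's standard layout matches what the SBT project already has,
# so the only addition is a pom.xml at the project root:
mkdir -p ImportCSV/src/main/scala ImportCSV/src/test/scala

# pom.xml lives alongside src/, where ImportCSV.sbt used to be:
touch ImportCSV/pom.xml

# Resulting layout:
#   ImportCSV/
#   ├── pom.xml
#   └── src/
#       ├── main/scala/ImportCSV.scala
#       └── test/scala/

# In generic.ksh, "sbt package" would then become "mvn package"
# (run from inside ImportCSV/), and spark-submit would be pointed
# at the jar Maven writes under target/ (named from the POM's
# artifactId and version) instead of
# target/scala-2.10/${CLASS}_2.10-1.0.jar.
```

With the shade plugin from the POM above, `mvn package` also produces an uber-jar, which can simplify the `--packages` handling in the submit step.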