Ok. Sounds like opinion is divided :)
I will try to build a Scala app with Maven. When I build with SBT I follow this directory structure: the top-level directory carries the package name, e.g. ImportCSV. Under ImportCSV I have a src directory and the sbt build file ImportCSV.sbt. Under src I have main and scala subdirectories, so my Scala source lives at ImportCSV/src/main/scala/ImportCSV.scala.

I then have a shell script that compiles and runs everything under the ImportCSV directory:

cat generic.ksh

#!/bin/ksh
#--------------------------------------------------------------------------------
#
# Procedure:    generic.ksh
#
# Description:  Compiles and runs a Scala app using sbt and spark-submit
#
# Parameters:   none
#
#--------------------------------------------------------------------------------
# Vers|  Date  | Who | DA | Description
#-----+--------+-----+----+-----------------------------------------------------
# 1.0 |04/03/15| MT  |    | Initial Version
#--------------------------------------------------------------------------------
#
function F_USAGE
{
  echo "USAGE: ${1##*/} -A '<Application>'"
  echo "USAGE: ${1##*/} -H '<HELP>' -h '<HELP>'"
  exit 10
}
#
# Main Section
#
if [[ "${1}" = "-h" || "${1}" = "-H" ]]; then
  F_USAGE $0
fi

## MAP INPUT TO VARIABLES
while getopts A: opt
do
  case $opt in
    (A) APPLICATION="$OPTARG" ;;
    (*) F_USAGE $0 ;;
  esac
done
[[ -z ${APPLICATION} ]] && print "You must specify an application value" && F_USAGE $0

ENVFILE=/home/hduser/dba/bin/environment.ksh
if [[ -f $ENVFILE ]]
then
  . $ENVFILE
  . ~/spark_1.5.2_bin-hadoop2.6.kshrc
else
  echo "Abort: $0 failed. No environment file ( $ENVFILE ) found"
  exit 1
fi

##FILE_NAME=`basename $0 .ksh`
FILE_NAME=${APPLICATION}
CLASS=`echo ${FILE_NAME} | tr "[:upper:]" "[:lower:]"`
NOW="`date +%Y%m%d_%H%M`"
LOG_FILE=${LOGDIR}/${FILE_NAME}.log
[ -f ${LOG_FILE} ] && rm -f ${LOG_FILE}

print "\n" `date` ", Started $0" | tee -a ${LOG_FILE}
cd ../${FILE_NAME}
print "Compiling ${FILE_NAME}" | tee -a ${LOG_FILE}
sbt package
print "Submitting the job" | tee -a ${LOG_FILE}
${SPARK_HOME}/bin/spark-submit \
                --packages com.databricks:spark-csv_2.10:1.3.0 \
                --class "${FILE_NAME}" \
                --master spark://50.140.197.217:7077 \
                --executor-memory=12G \
                --executor-cores=12 \
                --num-executors=2 \
                target/scala-2.10/${CLASS}_2.10-1.0.jar
print `date` ", Finished $0" | tee -a ${LOG_FILE}
exit

(Note I have corrected the spark-csv package to _2.10 so that it matches the Scala 2.10 jar the script submits.)

So to run it for ImportCSV, all I need to do is:

./generic.ksh -A ImportCSV

Now can anyone kindly give me a rough guideline on the directory layout and the location of pom.xml to make this work with Maven? (I have put a sketch of my current understanding in a P.S. below the quoted thread; corrections welcome.)

Thanks,

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com


On 15 March 2016 at 10:50, Sean Owen <so...@cloudera.com> wrote:

> FWIW, I strongly prefer Maven over SBT even for Scala projects. The
> Spark build of reference is Maven.
>
> On Tue, Mar 15, 2016 at 10:45 AM, Chandeep Singh <c...@chandeep.com> wrote:
> > For Scala, SBT is recommended.
> >
> > On Mar 15, 2016, at 10:42 AM, Mich Talebzadeh <mich.talebza...@gmail.com>
> > wrote:
> >
> > Hi,
> >
> > I build my Spark/Scala packages using SBT, and that works fine. I have
> > created generic shell scripts to build and submit them.
> >
> > Yesterday I noticed that some use Maven and a POM for this purpose.
> >
> > Which approach is recommended?
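P.S. Here is my current, unverified understanding of the Maven layout, for anyone kind enough to correct it. As far as I know Maven uses the same src/main/scala convention that sbt does, so only the build file changes: pom.xml sits at the project root, where ImportCSV.sbt is now:

ImportCSV/
    pom.xml                   <- replaces ImportCSV.sbt at the project root
    src/
        main/
            scala/
                ImportCSV.scala   <- unchanged; same src/main/scala convention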
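And here is a minimal pom.xml that I think would build it. The groupId is a placeholder, and the versions are my guesses chosen to match the Spark 1.5.2 / Scala 2.10 build the script uses; vanilla Maven only compiles Java, so the scala-maven-plugin is needed to pick up the Scala sources:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- groupId is a placeholder; artifactId-version becomes the jar name -->
  <groupId>com.example</groupId>
  <artifactId>importcsv</artifactId>
  <version>1.0</version>

  <dependencies>
    <!-- Scala version matching the Spark 1.5.2 build; supplied by the cluster -->
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.10.5</version>
      <scope>provided</scope>
    </dependency>
    <!-- Spark is on the cluster, so scope "provided" keeps it out of the jar -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.5.2</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.10</artifactId>
      <version>1.5.2</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <!-- Compiles src/main/scala; without this Maven only looks for Java sources -->
      <plugin>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <version>3.2.2</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

If that is right, generic.ksh would only need two changes: "sbt package" becomes "mvn package", and the jar path becomes target/${CLASS}-1.0.jar, since Maven names the artifact artifactId-version.jar rather than adding the Scala-version suffix that sbt does.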