mapPartitionsWithSplit was removed in Spark 2.0.0. You can use mapPartitionsWithIndex instead.
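For reference, the fix in utils/OptUtils.scala is mechanical, since mapPartitionsWithIndex takes the same (partition index, iterator) pair that mapPartitionsWithSplit did. A minimal sketch, assuming `data` is the RDD[String] from the compile error quoted below:

----------------------------------------------------------------
// Only the method name changes; the arguments stay the same.
val sizes = data.mapPartitionsWithIndex { (i, lines) =>
  // i is the partition index; lines iterates over that partition.
  // Iterator.length consumes the iterator, which is fine here since
  // only the per-partition line count is needed.
  Iterator(i -> lines.length)
}
----------------------------------------------------------------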
On Tue, Mar 28, 2017 at 3:52 PM, Anahita Talebi <anahita.t.am...@gmail.com> wrote:
> Thanks.
> I tried this one, as well. Unfortunately I still get the same error.
>
> On Wednesday, March 29, 2017, Marco Mistroni <mmistr...@gmail.com> wrote:
>
>> 1.7.5
>>
>> On 28 Mar 2017 10:10 pm, "Anahita Talebi" <anahita.t.am...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Thanks for your answer.
>>> What is the version of "org.slf4j" % "slf4j-api" in your sbt file?
>>> I think the problem might come from this part.
>>>
>>> On Tue, Mar 28, 2017 at 11:02 PM, Marco Mistroni <mmistr...@gmail.com> wrote:
>>>
>>>> Hello
>>>> uhm i have a project whose build.sbt is closest to yours, where i am
>>>> using spark 2.1, scala 2.11 and scalatest (i upgraded to 3.0.0) and it
>>>> works fine
>>>> in my projects though i don't have any of the following libraries that
>>>> you mention
>>>> - breeze
>>>> - netlib.all
>>>> - scopt
>>>>
>>>> hth
>>>>
>>>> On Tue, Mar 28, 2017 at 9:10 PM, Anahita Talebi <anahita.t.am...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Thanks for your answer.
>>>>>
>>>>> I first changed the scala version to 2.11.8 and kept the spark version
>>>>> 1.5.2 (old version). Then I changed the scalatest version to "3.0.1".
>>>>> With this configuration, I could compile the code and generate the
>>>>> .jar file.
>>>>>
>>>>> When I changed the spark version to 2.1.0, I get the same error as
>>>>> before. So I imagine the problem should be somehow related to the
>>>>> version of spark.
>>>>>
>>>>> Cheers,
>>>>> Anahita
>>>>>
>>>>> ----------------------------------------------------------------
>>>>> import AssemblyKeys._
>>>>>
>>>>> assemblySettings
>>>>>
>>>>> name := "proxcocoa"
>>>>>
>>>>> version := "0.1"
>>>>>
>>>>> organization := "edu.berkeley.cs.amplab"
>>>>>
>>>>> scalaVersion := "2.11.8"
>>>>>
>>>>> parallelExecution in Test := false
>>>>>
>>>>> {
>>>>>   val excludeHadoop = ExclusionRule(organization = "org.apache.hadoop")
>>>>>   libraryDependencies ++= Seq(
>>>>>     "org.slf4j" % "slf4j-api" % "1.7.2",
>>>>>     "org.slf4j" % "slf4j-log4j12" % "1.7.2",
>>>>>     "org.scalatest" %% "scalatest" % "3.0.1" % "test",
>>>>>     "org.apache.spark" %% "spark-core" % "2.1.0" excludeAll(excludeHadoop),
>>>>>     "org.apache.spark" %% "spark-mllib" % "2.1.0" excludeAll(excludeHadoop),
>>>>>     "org.apache.spark" %% "spark-sql" % "2.1.0" excludeAll(excludeHadoop),
>>>>>     "org.apache.commons" % "commons-compress" % "1.7",
>>>>>     "commons-io" % "commons-io" % "2.4",
>>>>>     "org.scalanlp" % "breeze_2.11" % "0.11.2",
>>>>>     "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly(),
>>>>>     "com.github.scopt" %% "scopt" % "3.3.0"
>>>>>   )
>>>>> }
>>>>>
>>>>> {
>>>>>   val defaultHadoopVersion = "1.0.4"
>>>>>   val hadoopVersion =
>>>>>     scala.util.Properties.envOrElse("SPARK_HADOOP_VERSION", defaultHadoopVersion)
>>>>>   libraryDependencies += "org.apache.hadoop" % "hadoop-client" % hadoopVersion
>>>>> }
>>>>>
>>>>> libraryDependencies += "org.apache.spark" %% "spark-streaming" % "2.1.0"
>>>>>
>>>>> resolvers ++= Seq(
>>>>>   "Local Maven Repository" at Path.userHome.asFile.toURI.toURL + ".m2/repository",
>>>>>   "Typesafe" at "http://repo.typesafe.com/typesafe/releases",
>>>>>   "Spray" at "http://repo.spray.cc"
>>>>> )
>>>>>
>>>>> mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
>>>>>   {
>>>>>     case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
>>>>>     case PathList(ps @ _*) if ps.last endsWith ".html" => MergeStrategy.first
>>>>>     case "application.conf" => MergeStrategy.concat
>>>>>     case "reference.conf" => MergeStrategy.concat
>>>>>     case "log4j.properties" => MergeStrategy.discard
>>>>>     case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
>>>>>     case m if m.toLowerCase.matches("meta-inf.*\\.sf$") => MergeStrategy.discard
>>>>>     case _ => MergeStrategy.first
>>>>>   }
>>>>> }
>>>>>
>>>>> test in assembly := {}
>>>>> ----------------------------------------------------------------
>>>>>
>>>>> On Tue, Mar 28, 2017 at 9:33 PM, Marco Mistroni <mmistr...@gmail.com> wrote:
>>>>>
>>>>>> Hello
>>>>>> that looks to me like there's something dodgy with your Scala
>>>>>> installation
>>>>>> Though Spark 2.0 is built on Scala 2.11, it still supports 2.10... i
>>>>>> suggest you change one thing at a time in your sbt
>>>>>> First the Spark version. run it and see if it works
>>>>>> Then amend the scala version
>>>>>>
>>>>>> hth
>>>>>> marco
>>>>>>
>>>>>> On Tue, Mar 28, 2017 at 5:20 PM, Anahita Talebi <anahita.t.am...@gmail.com> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Thank you all for your informative answers.
>>>>>>> I actually changed the scala version to 2.11.8 and the spark version
>>>>>>> to 2.1.0 in the build.sbt.
>>>>>>>
>>>>>>> Except for these two (the scala and spark versions), I kept the same
>>>>>>> values for the rest in the build.sbt file.
>>>>>>> ----------------------------------------------------------------
>>>>>>> import AssemblyKeys._
>>>>>>>
>>>>>>> assemblySettings
>>>>>>>
>>>>>>> name := "proxcocoa"
>>>>>>>
>>>>>>> version := "0.1"
>>>>>>>
>>>>>>> scalaVersion := "2.11.8"
>>>>>>>
>>>>>>> parallelExecution in Test := false
>>>>>>>
>>>>>>> {
>>>>>>>   val excludeHadoop = ExclusionRule(organization = "org.apache.hadoop")
>>>>>>>   libraryDependencies ++= Seq(
>>>>>>>     "org.slf4j" % "slf4j-api" % "1.7.2",
>>>>>>>     "org.slf4j" % "slf4j-log4j12" % "1.7.2",
>>>>>>>     "org.scalatest" %% "scalatest" % "1.9.1" % "test",
>>>>>>>     "org.apache.spark" % "spark-core_2.11" % "2.1.0" excludeAll(excludeHadoop),
>>>>>>>     "org.apache.spark" % "spark-mllib_2.11" % "2.1.0" excludeAll(excludeHadoop),
>>>>>>>     "org.apache.spark" % "spark-sql_2.11" % "2.1.0" excludeAll(excludeHadoop),
>>>>>>>     "org.apache.commons" % "commons-compress" % "1.7",
>>>>>>>     "commons-io" % "commons-io" % "2.4",
>>>>>>>     "org.scalanlp" % "breeze_2.11" % "0.11.2",
>>>>>>>     "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly(),
>>>>>>>     "com.github.scopt" %% "scopt" % "3.3.0"
>>>>>>>   )
>>>>>>> }
>>>>>>>
>>>>>>> {
>>>>>>>   val defaultHadoopVersion = "1.0.4"
>>>>>>>   val hadoopVersion =
>>>>>>>     scala.util.Properties.envOrElse("SPARK_HADOOP_VERSION", defaultHadoopVersion)
>>>>>>>   libraryDependencies += "org.apache.hadoop" % "hadoop-client" % hadoopVersion
>>>>>>> }
>>>>>>>
>>>>>>> libraryDependencies += "org.apache.spark" % "spark-streaming_2.11" % "2.1.0"
>>>>>>>
>>>>>>> resolvers ++= Seq(
>>>>>>>   "Local Maven Repository" at Path.userHome.asFile.toURI.toURL + ".m2/repository",
>>>>>>>   "Typesafe" at "http://repo.typesafe.com/typesafe/releases",
>>>>>>>   "Spray" at "http://repo.spray.cc"
>>>>>>> )
>>>>>>>
>>>>>>> mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
>>>>>>>   {
>>>>>>>     case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
>>>>>>>     case PathList(ps @ _*) if ps.last endsWith ".html" => MergeStrategy.first
>>>>>>>     case "application.conf" => MergeStrategy.concat
>>>>>>>     case "reference.conf" => MergeStrategy.concat
>>>>>>>     case "log4j.properties" => MergeStrategy.discard
>>>>>>>     case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
>>>>>>>     case m if m.toLowerCase.matches("meta-inf.*\\.sf$") => MergeStrategy.discard
>>>>>>>     case _ => MergeStrategy.first
>>>>>>>   }
>>>>>>> }
>>>>>>>
>>>>>>> test in assembly := {}
>>>>>>> ----------------------------------------------------------------
>>>>>>>
>>>>>>> When I compile the code, I get the following error:
>>>>>>>
>>>>>>> [info] Compiling 4 Scala sources to /Users/atalebi/Desktop/new_version_proxcocoa-master/target/scala-2.11/classes...
>>>>>>> [error] /Users/atalebi/Desktop/new_version_proxcocoa-master/src/main/scala/utils/OptUtils.scala:40: value mapPartitionsWithSplit is not a member of org.apache.spark.rdd.RDD[String]
>>>>>>> [error]     val sizes = data.mapPartitionsWithSplit{ case(i,lines) =>
>>>>>>> [error]                      ^
>>>>>>> [error] /Users/atalebi/Desktop/new_version_proxcocoa-master/src/main/scala/utils/OptUtils.scala:41: value length is not a member of Any
>>>>>>> [error]       Iterator(i -> lines.length)
>>>>>>> [error]                           ^
>>>>>>> ----------------------------------------------------------------
>>>>>>> The error is in the code itself. Does it mean that for the different
>>>>>>> versions of spark and scala, I need to change the main code?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Anahita
>>>>>>>
>>>>>>> On Tue, Mar 28, 2017 at 10:28 AM, Dinko Srkoč <dinko.sr...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Adding to the advice given by others ... Spark 2.1.0 works with
>>>>>>>> Scala 2.11, so set:
>>>>>>>>
>>>>>>>>     scalaVersion := "2.11.8"
>>>>>>>>
>>>>>>>> When you see something like:
>>>>>>>>
>>>>>>>>     "org.apache.spark" % "spark-core_2.10" % "1.5.2"
>>>>>>>>
>>>>>>>> that means that the library `spark-core` is compiled against Scala
>>>>>>>> 2.10, so you would have to change that to 2.11:
>>>>>>>>
>>>>>>>>     "org.apache.spark" % "spark-core_2.11" % "2.1.0"
>>>>>>>>
>>>>>>>> better yet, let SBT worry about libraries built against particular
>>>>>>>> Scala versions:
>>>>>>>>
>>>>>>>>     "org.apache.spark" %% "spark-core" % "2.1.0"
>>>>>>>>
>>>>>>>> The `%%` will instruct SBT to choose the library appropriate for the
>>>>>>>> version of Scala that is set in `scalaVersion`.
>>>>>>>>
>>>>>>>> It may be worth mentioning that the `%%` thing works only with Scala
>>>>>>>> libraries, as they are compiled against a certain Scala version. Java
>>>>>>>> libraries are unaffected (they have nothing to do with Scala), e.g.
>>>>>>>> for `slf4j` one only uses a single `%`:
>>>>>>>>
>>>>>>>>     "org.slf4j" % "slf4j-api" % "1.7.2"
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Dinko
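To make Dinko's `%%` point concrete, a small sketch (not taken from the thread's build.sbt): with scalaVersion set to 2.11.8, sbt appends the Scala binary version to the artifact name, so the two declarations below resolve to the same artifact, and the `%%` form survives a later Scala bump unchanged.

----------------------------------------------------------------
scalaVersion := "2.11.8"

// Equivalent under Scala 2.11: %% appends "_2.11" automatically.
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.0"
----------------------------------------------------------------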
>>>>>>>> On 27 March 2017 at 23:30, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>>>> > check these versions
>>>>>>>> >
>>>>>>>> > function create_build_sbt_file {
>>>>>>>> >   BUILD_SBT_FILE=${GEN_APPSDIR}/scala/${APPLICATION}/build.sbt
>>>>>>>> >   [ -f ${BUILD_SBT_FILE} ] && rm -f ${BUILD_SBT_FILE}
>>>>>>>> >   cat >> $BUILD_SBT_FILE << !
>>>>>>>> > lazy val root = (project in file(".")).
>>>>>>>> >   settings(
>>>>>>>> >     name := "${APPLICATION}",
>>>>>>>> >     version := "1.0",
>>>>>>>> >     scalaVersion := "2.11.8",
>>>>>>>> >     mainClass in Compile := Some("myPackage.${APPLICATION}")
>>>>>>>> >   )
>>>>>>>> > libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0" % "provided"
>>>>>>>> > libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0" % "provided"
>>>>>>>> > libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.0.0" % "provided"
>>>>>>>> > libraryDependencies += "org.apache.spark" %% "spark-streaming" % "2.0.0" % "provided"
>>>>>>>> > libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka" % "1.6.1" % "provided"
>>>>>>>> > libraryDependencies += "com.google.code.gson" % "gson" % "2.6.2"
>>>>>>>> > libraryDependencies += "org.apache.phoenix" % "phoenix-spark" % "4.6.0-HBase-1.0"
>>>>>>>> > libraryDependencies += "org.apache.hbase" % "hbase" % "1.2.3"
>>>>>>>> > libraryDependencies += "org.apache.hbase" % "hbase-client" % "1.2.3"
>>>>>>>> > libraryDependencies += "org.apache.hbase" % "hbase-common" % "1.2.3"
>>>>>>>> > libraryDependencies += "org.apache.hbase" % "hbase-server" % "1.2.3"
>>>>>>>> > // META-INF discarding
>>>>>>>> > mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
>>>>>>>> >   {
>>>>>>>> >     case PathList("META-INF", xs @ _*) => MergeStrategy.discard
>>>>>>>> >     case x => MergeStrategy.first
>>>>>>>> >   }
>>>>>>>> > }
>>>>>>>> > !
>>>>>>>> > }
>>>>>>>> >
>>>>>>>> > HTH
>>>>>>>> >
>>>>>>>> > Dr Mich Talebzadeh
>>>>>>>> >
>>>>>>>> > LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>>>>> >
>>>>>>>> > http://talebzadehmich.wordpress.com
>>>>>>>> >
>>>>>>>> > Disclaimer: Use it at your own risk. Any and all responsibility for any
>>>>>>>> > loss, damage or destruction of data or any other property which may arise
>>>>>>>> > from relying on this email's technical content is explicitly disclaimed.
>>>>>>>> > The author will in no case be liable for any monetary damages arising
>>>>>>>> > from such loss, damage or destruction.
>>>>>>>> >
>>>>>>>> > On 27 March 2017 at 21:45, Jörn Franke <jornfra...@gmail.com> wrote:
>>>>>>>> >>
>>>>>>>> >> Usually you define the dependencies on the Spark libraries as provided.
>>>>>>>> >> You also seem to mix different Spark versions, which should be avoided.
>>>>>>>> >> The Hadoop library seems to be outdated and should also only be provided.
>>>>>>>> >>
>>>>>>>> >> The other dependencies you could assemble in a fat jar.
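Jörn's "provided" advice, sketched against the versions used in this thread (illustrative, not the poster's actual build.sbt): marking the Spark modules as provided keeps them out of the assembled fat jar, since spark-submit supplies those classes on the cluster at runtime.

----------------------------------------------------------------
// Provided scope: compiled against, but excluded from the fat jar.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "2.1.0" % "provided",
  "org.apache.spark" %% "spark-mllib" % "2.1.0" % "provided",
  "org.apache.spark" %% "spark-sql"   % "2.1.0" % "provided"
)
----------------------------------------------------------------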
>>>>>>>> >>
>>>>>>>> >> On 27 Mar 2017, at 21:25, Anahita Talebi <anahita.t.am...@gmail.com> wrote:
>>>>>>>> >>
>>>>>>>> >> Hi friends,
>>>>>>>> >>
>>>>>>>> >> I have code which is written in Scala. The scala version 2.10.4 and
>>>>>>>> >> Spark version 1.5.2 are used to run the code.
>>>>>>>> >>
>>>>>>>> >> I would like to upgrade the code to the most recent version of spark,
>>>>>>>> >> meaning 2.1.0.
>>>>>>>> >>
>>>>>>>> >> Here is the build.sbt:
>>>>>>>> >>
>>>>>>>> >> import AssemblyKeys._
>>>>>>>> >>
>>>>>>>> >> assemblySettings
>>>>>>>> >>
>>>>>>>> >> name := "proxcocoa"
>>>>>>>> >>
>>>>>>>> >> version := "0.1"
>>>>>>>> >>
>>>>>>>> >> scalaVersion := "2.10.4"
>>>>>>>> >>
>>>>>>>> >> parallelExecution in Test := false
>>>>>>>> >>
>>>>>>>> >> {
>>>>>>>> >>   val excludeHadoop = ExclusionRule(organization = "org.apache.hadoop")
>>>>>>>> >>   libraryDependencies ++= Seq(
>>>>>>>> >>     "org.slf4j" % "slf4j-api" % "1.7.2",
>>>>>>>> >>     "org.slf4j" % "slf4j-log4j12" % "1.7.2",
>>>>>>>> >>     "org.scalatest" %% "scalatest" % "1.9.1" % "test",
>>>>>>>> >>     "org.apache.spark" % "spark-core_2.10" % "1.5.2" excludeAll(excludeHadoop),
>>>>>>>> >>     "org.apache.spark" % "spark-mllib_2.10" % "1.5.2" excludeAll(excludeHadoop),
>>>>>>>> >>     "org.apache.spark" % "spark-sql_2.10" % "1.5.2" excludeAll(excludeHadoop),
>>>>>>>> >>     "org.apache.commons" % "commons-compress" % "1.7",
>>>>>>>> >>     "commons-io" % "commons-io" % "2.4",
>>>>>>>> >>     "org.scalanlp" % "breeze_2.10" % "0.11.2",
>>>>>>>> >>     "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly(),
>>>>>>>> >>     "com.github.scopt" %% "scopt" % "3.3.0"
>>>>>>>> >>   )
>>>>>>>> >> }
>>>>>>>> >>
>>>>>>>> >> {
>>>>>>>> >>   val defaultHadoopVersion = "1.0.4"
>>>>>>>> >>   val hadoopVersion =
>>>>>>>> >>     scala.util.Properties.envOrElse("SPARK_HADOOP_VERSION", defaultHadoopVersion)
>>>>>>>> >>   libraryDependencies += "org.apache.hadoop" % "hadoop-client" % hadoopVersion
>>>>>>>> >> }
>>>>>>>> >>
>>>>>>>> >> libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.5.0"
>>>>>>>> >>
>>>>>>>> >> resolvers ++= Seq(
>>>>>>>> >>   "Local Maven Repository" at Path.userHome.asFile.toURI.toURL + ".m2/repository",
>>>>>>>> >>   "Typesafe" at "http://repo.typesafe.com/typesafe/releases",
>>>>>>>> >>   "Spray" at "http://repo.spray.cc"
>>>>>>>> >> )
>>>>>>>> >>
>>>>>>>> >> mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
>>>>>>>> >>   {
>>>>>>>> >>     case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
>>>>>>>> >>     case PathList(ps @ _*) if ps.last endsWith ".html" => MergeStrategy.first
>>>>>>>> >>     case "application.conf" => MergeStrategy.concat
>>>>>>>> >>     case "reference.conf" => MergeStrategy.concat
>>>>>>>> >>     case "log4j.properties" => MergeStrategy.discard
>>>>>>>> >>     case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
>>>>>>>> >>     case m if m.toLowerCase.matches("meta-inf.*\\.sf$") => MergeStrategy.discard
>>>>>>>> >>     case _ => MergeStrategy.first
>>>>>>>> >>   }
>>>>>>>> >> }
>>>>>>>> >>
>>>>>>>> >> test in assembly := {}
>>>>>>>> >>
>>>>>>>> >> -----------------------------------------------------------
>>>>>>>> >> I downloaded spark 2.1.0 and changed the spark version and
>>>>>>>> >> scalaVersion in the build.sbt. But unfortunately, I failed to run
>>>>>>>> >> the code.
>>>>>>>> >>
>>>>>>>> >> Does anybody know how I can upgrade the code to the most recent
>>>>>>>> >> spark version by changing the build.sbt file?
>>>>>>>> >>
>>>>>>>> >> Or do you have any other suggestion?
>>>>>>>> >>
>>>>>>>> >> Thanks a lot,
>>>>>>>> >> Anahita