sbin/start-master.sh
sbin/start-slave.sh spark://192.168.0.38:7077

Backbutton.co.uk
¯\_(ツ)_/¯
♡۶Java♡۶RMI ♡۶
Make Use Method {MUM}
makeuse.org
<http://www.backbutton.co.uk>
On Fri, 27 Mar 2020 at 05:59, Wenchen Fan <cloud0...@gmail.com> wrote:

> Your Spark cluster, spark://192.168.0.38:7077 — how is it deployed if you
> just include the Spark dependency in IntelliJ?
>
> On Fri, Mar 27, 2020 at 1:54 PM Zahid Rahman <zahidr1...@gmail.com> wrote:
>
>> I have configured it in IntelliJ as external jars:
>>
>>   spark-3.0.0-preview2-bin-hadoop2.7/jar
>>
>> I am not pulling anything from Maven.
>>
>> On Fri, 27 Mar 2020 at 05:45, Wenchen Fan <cloud0...@gmail.com> wrote:
>>
>>> Which Spark/Scala version do you use?
>>>
>>> On Fri, Mar 27, 2020 at 1:24 PM Zahid Rahman <zahidr1...@gmail.com>
>>> wrote:
>>>
>>>> With the following SparkSession configuration, this line works:
>>>>
>>>>   val spark = SparkSession.builder().master("local[*]")
>>>>     .appName("Spark Session take").getOrCreate();
>>>>
>>>>   flights.filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>>     .map(flight_row => flight_row).take(5)
>>>>
>>>> However, if I change the master URL to the IP address as below, then
>>>> the following error is produced, depending on the position of .take(5):
>>>>
>>>>   val spark = SparkSession.builder().master("spark://192.168.0.38:7077")
>>>>     .appName("Spark Session take").getOrCreate();
>>>>
>>>>   20/03/27 05:15:20 WARN TaskSetManager: Lost task 0.0 in stage 1.0
>>>>   (TID 1, 192.168.0.38, executor 0): java.lang.ClassCastException:
>>>>   cannot assign instance of java.lang.invoke.SerializedLambda to field
>>>>   org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in
>>>>   instance of org.apache.spark.rdd.MapPartitionsRDD
>>>>
>>>> BUT if I remove take(5), change the position of take(5), or insert an
>>>> extra take(5) as illustrated in the code below, then it works. I don't
>>>> see why the position of take(5) should cause such an error, or why the
>>>> error should be triggered by changing the master URL.
>>>>
>>>>   flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>>     .map(flight_row => flight_row).take(5)
>>>>
>>>>   flights.take(5)
>>>>
>>>>   flights
>>>>     .take(5)
>>>>     .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>>     .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME, fr.count + 5))
>>>>   flights.show(5)
>>>>
>>>> Complete code, if you wish to replicate it:
>>>>
>>>>   import org.apache.spark.sql.SparkSession
>>>>
>>>>   object sessiontest {
>>>>
>>>>     // Define a specific data type class, then manipulate it using the
>>>>     // filter and map functions. This is also known as an Encoder.
>>>>     case class flight(DEST_COUNTRY_NAME: String,
>>>>                       ORIGIN_COUNTRY_NAME: String,
>>>>                       count: BigInt)
>>>>
>>>>     def main(args: Array[String]): Unit = {
>>>>
>>>>       val spark = SparkSession.builder()
>>>>         .master("spark://192.168.0.38:7077")
>>>>         .appName("Spark Session take")
>>>>         .getOrCreate();
>>>>
>>>>       import spark.implicits._
>>>>       val flightDf =
>>>>         spark.read.parquet("/data/flight-data/parquet/2010-summary.parquet/")
>>>>       val flights = flightDf.as[flight]
>>>>
>>>>       flights.take(5).filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>>         .map(flight_row => flight_row).take(5)
>>>>
>>>>       flights.take(5)
>>>>
>>>>       flights
>>>>         .take(5)
>>>>         .filter(flight_row => flight_row.ORIGIN_COUNTRY_NAME != "Canada")
>>>>         .map(fr => flight(fr.DEST_COUNTRY_NAME, fr.ORIGIN_COUNTRY_NAME, fr.count + 5))
>>>>       flights.show(5)
>>>>
>>>>     } // main
>>>>   }
>>>
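[Editor's note, not part of the original thread: the behavior described above is consistent with how Dataset.take works. Dataset.take(n) collects rows to the driver as a local Array, so any .filter/.map chained after it are plain Scala collection operations that never leave the driver and need no lambda serialization. A .filter/.map before take(5) is a Dataset transformation whose lambda must be shipped to the executors; if the application's classes are not on the executors' classpath — as can happen when connecting to a standalone master directly from an IDE — deserialization fails with exactly the SerializedLambda ClassCastException shown. A minimal sketch of one common remedy, shipping the application jar via the spark.jars configuration (the jar path and object name here are hypothetical, not from the thread):

```scala
import org.apache.spark.sql.SparkSession

object SessionTakeSketch {
  case class Flight(DEST_COUNTRY_NAME: String,
                    ORIGIN_COUNTRY_NAME: String,
                    count: BigInt)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("spark://192.168.0.38:7077")
      .appName("Spark Session take")
      // Hypothetical path: build the project jar first, then point
      // spark.jars at it so executors can load the application classes.
      .config("spark.jars", "target/scala-2.12/sessiontest.jar")
      .getOrCreate()

    import spark.implicits._
    val flights = spark.read
      .parquet("/data/flight-data/parquet/2010-summary.parquet/")
      .as[Flight]

    // Dataset transformation: this lambda is serialized to the executors,
    // so it only works once the application jar is on their classpath.
    flights.filter(f => f.ORIGIN_COUNTRY_NAME != "Canada").take(5)

    // take(5) first yields a local Array[Flight]; the filter below is a
    // driver-side Scala collection operation, so no lambda is shipped and
    // no ClassCastException can occur regardless of the master URL.
    flights.take(5).filter(f => f.ORIGIN_COUNTRY_NAME != "Canada")

    spark.stop()
  }
}
```

The same effect can be achieved by packaging the application and launching it with spark-submit, which distributes the jar for you.]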