I've noticed that after I use a Window function over a DataFrame, if I call
map() with a function, Spark throws a "Task not serializable" exception.
This is my code:
val hc = new org.apache.spark.sql.hive.HiveContext(sc)
import hc.implicits._
import org.apache.spark.sql.expressions.Window
import
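Without seeing the rest of the snippet I can only guess, but a common cause of this exception is the map() closure capturing a non-serializable driver-side object, such as the HiveContext itself. Here is a minimal, Spark-free sketch of the effect; the Context class is a hypothetical stand-in for the non-serializable object, and plain JDK serialization plays the role of Spark's task serialization:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a non-serializable driver-side object
// (HiveContext is likewise not serializable).
class Context { def lookup(x: Int): Int = x * 2 }

object ClosureDemo {
  // Returns true if `obj` survives JDK serialization, false on
  // NotSerializableException -- the same failure Spark reports as
  // "Task not serializable".
  def trySerialize(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch { case _: NotSerializableException => false }

  def demo(): (Boolean, Boolean) = {
    val ctx = new Context

    // Captures `ctx`, so the whole closure fails to serialize --
    // the same shape as a map() body that touches the HiveContext.
    val bad: Int => Int = x => ctx.lookup(x)

    // Fix: pull only the plain data the closure needs into a local val,
    // so the closure captures a serializable value instead of the context.
    val factor = 2
    val good: Int => Int = x => x * factor

    (trySerialize(bad), trySerialize(good))
  }

  def main(args: Array[String]): Unit = {
    val (badOk, goodOk) = demo()
    println(s"bad closure serializes:  $badOk")
    println(s"good closure serializes: $goodOk")
  }
}
```

If the map() body in your code references `hc` (directly or through an enclosing class), extracting the needed values into local vals before the map() is the usual fix.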
Hi,
I'm looking for a way to improve my Spark cluster's performance. I have
read in http://spark.apache.org/docs/latest/hardware-provisioning.html:
"We recommend having 4-8 disks per node". I have tried with both one and two
disks, but I have seen that with 2 disks the execution time is
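For what it's worth, that same hardware-provisioning page notes that Spark spreads shuffle and spill data across the paths listed in spark.local.dir (a comma-separated list), so a second disk only helps if it is actually listed there. A sketch, with example mount points that you would adjust to your own layout:

```
# conf/spark-defaults.conf -- example paths, adjust to your mounts
spark.local.dir    /disk1/spark-tmp,/disk2/spark-tmp
```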
I don't think it's a "malformed IP address" issue, because I used a URI
rather than an IP.
One more detail: the master, driver, and workers are all hosted on the same
machine, so I use "localhost" as the driver's host.
This is the master's log file:
15/12/22 03:23:05 ERROR FileAppender: Error writing stream to file
/disco1/spark-1.5.1/work/app-20151222032252-0010/0/stderr
java.io.IOException: Stream closed
        at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162)
        at
Hi MegaLearn,
thanks for the reply! It's a placeholder; in my real application I use the
actual master's hostname.
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Problem-with-Spark-Standalone-tp25750p25752.html
Sent from the Apache Spark User List
Hi,
I'm trying to submit a streaming application on my standalone Spark cluster;
this is my code:
import akka.actor.{Props, ActorSystem, Actor}
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.HttpRequest
import akka.http.scaladsl.model.Uri
import akka.stream.ActorMaterializer