Hi All,
I've been using Spark standalone for a while and now it's time for me to
install HDFS. If a Spark worker goes down, the Spark master restarts the
worker. Similarly, if a DataNode process goes down, it looks like it is not
the NameNode's job to restart the DataNode. If so, 1) should I use
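A note on that premise: the NameNode only tracks DataNode liveness through heartbeats; restarting the process is left to something outside HDFS, typically the init system. Below is a minimal systemd sketch of that approach; the unit name, path, and user are assumptions and will differ per installation:

    # /etc/systemd/system/hdfs-datanode.service -- sketch only;
    # path and user are assumptions, adjust to your install
    [Unit]
    Description=HDFS DataNode
    After=network.target

    [Service]
    User=hdfs
    # "hdfs datanode" keeps the daemon in the foreground,
    # so systemd can supervise and restart it directly
    ExecStart=/opt/hadoop/bin/hdfs datanode
    Restart=on-failure
    RestartSec=10

    [Install]
    WantedBy=multi-user.target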
Hi,
I was wondering if folks have ideas or recommendations for how to fix
this error (full stack trace included below).
We're on Kafka 0.10.0.0 and spark-streaming_2.11 v2.0.0.
We've tried a few things as suggested in these sources:
I have put more details and stack traces here:
http://stackoverflow.com/questions/43462638/emr-spark-2-1-0-process-get-stuck-at-at-org-apache-spark-unsafe-platform-copymem
Any suggestions would be very much appreciated.
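One thing worth double-checking, though only a guess without the trace inline: that the Kafka connector artifact matches both the broker version and the Spark version. A sketch of sbt coordinates that are meant to go together, assuming Spark 2.0.0 and Kafka 0.10.0.0 as stated above:

    // build.sbt -- a sketch; versions assume Spark 2.0.0 / Kafka 0.10.x
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-streaming" % "2.0.0" % "provided",
      // the 0-10 connector targets 0.10.x brokers; the older
      // spark-streaming-kafka-0-8 artifact also runs against 0.10
      // brokers but mixing the two in one build causes trouble
      "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.0.0"
    )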
I wrote up a simple metric sink for Spark that publishes metrics to a Kafka
broker.
Each metric is published as a message (in JSON format), with the metric name as
the message key.
https://github.com/erikerlandson/spark-kafka-sink
Build with "(x)sbt assembly" and make sure the resulting jar
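For anyone wanting to try it: sinks are registered through Spark's metrics configuration file. A sketch of what conf/metrics.properties might look like; the class name and property keys here are assumptions, so verify them against the repository's README:

    # conf/metrics.properties -- sketch; class name and keys are
    # assumptions, check the spark-kafka-sink README for the real ones
    *.sink.kafka.class=org.apache.spark.metrics.sink.KafkaSink
    *.sink.kafka.broker=kafka-host:9092
    *.sink.kafka.topic=spark-metrics
    *.sink.kafka.period=10
    *.sink.kafka.unit=seconds

The assembly jar also has to be visible to the driver and executors, e.g. via --jars or by placing it on the classpath.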
CALL FOR PAPERS
12th Workshop on Virtualization in High-Performance Cloud Computing (VHPC
'17)
held in conjunction with the International Supercomputing Conference - High
Performance,
June 18-22, 2017, Frankfurt, Germany.
Hi there,
I'm using PySpark on a Hadoop cluster and I could not find information about
the executor memory model with Python.
I know that the Python memory (spark.python.worker.memory) does not overlap
with the JVM heap (spark.executor.memory).
However, does the Python memory overlap with the executor
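For reference, on YARN the Python worker processes run outside the executor JVM, so their memory is generally accounted for by the overhead allotment rather than the heap. A sketch of the relevant knobs; the values are illustrative, not recommendations:

    # spark-defaults.conf sketch -- values illustrative, not tuned
    # executor JVM heap:
    spark.executor.memory                4g
    # off-heap headroom per executor, which is where the Python
    # worker processes live on YARN (in MB):
    spark.yarn.executor.memoryOverhead   1024
    # per-Python-worker aggregation buffer before spilling to disk
    # (a spill threshold, not a hard cap):
    spark.python.worker.memory           512m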
Hi,
I am running an application on Spark v1.6.2 (in standalone mode) over 100 GB
of data. Given below are my configurations:
Job configuration
spark.driver.memory=5g
spark.executor.memory=5g
spark.cores.max=4
spark-env.sh
export SPARK_WORKER_INSTANCES=3;
export SPARK_WORKER_MEMORY=5g;
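A quick sanity check on those numbers, assuming all three worker instances share one machine: SPARK_WORKER_INSTANCES=3 at SPARK_WORKER_MEMORY=5g reserves 3 x 5 GB = 15 GB for executors, plus 5 GB for the driver if it runs on the same node, i.e. roughly 20 GB before OS overhead. Note also that in standalone mode spark.cores.max=4 caps the application at four cores total across all three workers combined.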