There is a workaround:

sc.parallelize(items, items.size / 2)

This way each executor will get a batch of 2 items at a time, simulating a
producer-consumer pattern. With items.size / 4 it will get 4 items.
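What the suggested numSlices does can be sketched in pure Scala, no cluster needed (hypothetical `items` list; with a real SparkContext the call would be `sc.parallelize(items, numSlices)`):

```scala
// Hypothetical collection standing in for the poster's items.
val items = (1 to 512).toList

// items.size / 2 partitions => ~2 items per partition, so executors pull
// small batches rather than one big chunk each.
val numSlices = items.size / 2
val slices = items.grouped(items.size / numSlices).toList

println(slices.length)      // 256 slices
println(slices.head.length) // 2 items in each
```

Smaller partitions mean more, shorter tasks, which lets the scheduler keep feeding idle executors.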
--
Hi, I do an sc.parallelize with a list of 512k items. But sometimes not all
executors are used, i.e. they don't have work to do and nothing is logged
after:
15/07/29 16:35:22 WARN internal.ThreadLocalRandom: Failed to generate a seed
from SecureRandom within 3 seconds. Not enough entrophy?
Hi, I am using sc.parallelize(...32k of items) several times for 1 job. Each
executor takes x amount of time to process its items, but this results in
some executors finishing quickly and staying idle till the others catch up.
Only after all executors complete the first 32k batch does the next batch start.
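One way to avoid this lock-step behaviour is to submit the batches as concurrent Spark jobs from separate threads, so the scheduler can interleave tasks of a later batch onto executors that finished early. A sketch using plain Futures, with the Spark action stubbed out by a local function (with Spark, `processBatch` would be something like `sc.parallelize(batch).map(work).collect()`):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Stand-in for a Spark action over one batch.
def processBatch(batch: Seq[Int]): Int = batch.sum

// Hypothetical batches: 4 batches of 25 items.
val batches = (1 to 100).grouped(25).toSeq

// Submitting each batch from its own thread means the (Spark) scheduler
// is never starved: idle executors can pick up tasks of the next batch.
val futures = batches.map(b => Future(processBatch(b)))
val results = Await.result(Future.sequence(futures), 30.seconds)

println(results.sum) // 5050
```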
I am still receiving these weird SIGTERMs on the executors. The driver claims
it lost the executor, and the executor receives a SIGTERM (from whom???)
It doesn't seem to be a memory-related issue, though increasing memory takes the
job a bit further or completes it. But why? There is no memory pressure on
it seems it is hardcoded in ExecutorRunnable.scala:

val commands = prefixEnv ++ Seq(
  YarnSparkHadoopUtil.expandEnvironment(Environment.JAVA_HOME) + "/bin/java",
  "-server",
  // Kill if OOM is raised - leverage yarn's failure handling to cause
  // rescheduling.
  // Not killing the
I get a suspicious SIGTERM on the executors that doesn't seem to be from the
driver. The other thing that might send a SIGTERM is the
-XX:OnOutOfMemoryError=kill %p java arg that the executor starts with. Now
my tasks don't seem to run out of memory, so how can I disable this param to
debug them?
--
I've recompiled spark deleting the -XX:OnOutOfMemoryError=kill declaration,
but still I am getting a SIGTERM!
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/is-it-possible-to-disable-XX-OnOutOfMemoryError-kill-p-for-the-executors-tp23680p23687.html
Is this normal?
15/07/07 15:27:04 INFO Remoting: Remoting started; listening on addresses
:[akka.tcp://sparkExecutor@cruncher02.stratified:50063]
15/07/07 15:27:04 INFO util.Utils: Successfully started service
'sparkExecutor' on port 50063.
15/07/07 15:27:04 INFO
I have this problem with a job. A random executor gets this
ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
Almost always at the same point in the processing of the data. I am
processing 1 mil files with sc.wholeTextFiles. At around the 600,000th file, a
container receives
Once again I am trying to read a directory tree using binary files.
My directory tree has a root dir ROOTDIR and subdirs where the files are
located, i.e.
ROOTDIR/1
ROOTDIR/2
ROOTDIR/..
ROOTDIR/100
A total of 1 mil files split into 100 subdirs.
Using binaryFiles requires too much memory on the
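One way to cap the memory needed here is to walk the subdirs one at a time instead of globbing all 1 mil files in a single binaryFiles call. A sketch assuming the ROOTDIR/1..100 layout above (the sc.binaryFiles call is left commented since it needs a live SparkContext):

```scala
// Assumed layout from the post: ROOTDIR/1 .. ROOTDIR/100
val subdirs = (1 to 100).map(i => s"ROOTDIR/$i")

// Processing one subdir (~10k files) per pass keeps the driver-side file
// listing and job bookkeeping bounded. With a SparkContext, e.g.:
// for (dir <- subdirs) {
//   sc.binaryFiles(dir).map { case (path, stream) => /* parse */ path }
//     .saveAsObjectFile(dir + ".out")  // hypothetical output path
// }

println(subdirs.size) // 100
println(subdirs.head) // ROOTDIR/1
```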
Hi, I got a cluster of 4 machines and I run:
sc.wholeTextFiles("/x/*/*.txt")
Folder x contains subfolders and each subfolder contains thousands of files,
with a total of ~1 million matching the path expression.
My Spark task starts processing the files, but single-threaded; I can see
that in the Spark UI, which shows it creating 2 stages. I tried
wholeTextFiles().repartition(32) but got the same threading results.
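wholeTextFiles takes a second minPartitions argument; with many small files the default can collapse the read into very few partitions, which looks single-threaded. A later .repartition(32) only reshuffles data *after* the already-serial read stage, so the hint has to be given at load time. Sketch (the Spark call is commented since it needs a cluster; values below just restate the post's glob and a hypothetical hint):

```scala
// With a SparkContext, request read-stage parallelism up front:
//   val xs = sc.wholeTextFiles("/x/*/*.txt", minPartitions = 32)
// rather than: sc.wholeTextFiles("/x/*/*.txt").repartition(32)

val path = "/x/*/*.txt"  // glob from the post
val minPartitions = 32   // hypothetical hint; tune to cluster cores
println(s"wholeTextFiles($path, minPartitions = $minPartitions)")
```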
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/wholeTextFiles-x-txt-runs-single-threaded-tp23591p23593.html
Both the driver (ApplicationMaster running on hadoop) and container
(CoarseGrainedExecutorBackend) end up exceeding my 25GB allocation.
my code is something like
sc.binaryFiles(... 1mil xml files).flatMap( ... extract some domain classes,
not many though, as each xml usually has zero
I am profiling the driver. It currently has 564MB of strings, which might be
the 1 mil file names. But it also has 2.34 GB of long[]! That's so far; it
is still running. What are those long[] used for?
--
After some time the driver accumulated 6.67GB of long[]. The executor memory
usage so far is low.
--
I am reading millions of xml files via
val xmls = sc.binaryFiles(xmlDir)
The operation runs fine locally but on yarn it fails with:
client token: N/A
diagnostics: Application application_1433491939773_0012 failed 2 times due
to ApplicationMaster for attempt