Re: sc.parallelize to work more like a producer/consumer?

2015-07-30 Thread Kostas Kougios
There is a workaround: sc.parallelize(items, items.size / 2). This way each executor will get a batch of 2 items at a time, simulating a producer/consumer; with / 4 it will get 4 items at a time.
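
A minimal sketch of this workaround, assuming Scala and a plain SparkContext; the item list and the per-item work are placeholders, not part of the original message:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ParallelizeAsQueue {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("parallelize-as-queue"))

    // Stand-in for the real work items.
    val items: Seq[String] = (1 to 32000).map(i => s"item-$i")

    // One partition per 2 items: executors pick up these small tasks as they
    // free up, approximating a producer/consumer queue rather than being handed
    // a few large slices up front. Use items.size / 4 for 4-item batches.
    val rdd = sc.parallelize(items, numSlices = items.size / 2)

    rdd.foreach(item => println(s"processing $item"))  // placeholder per-item work
    sc.stop()
  }
}
```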

sc.parallelize(512k items) doesn't always use 64 executors

2015-07-29 Thread Kostas Kougios
Hi, I call sc.parallelize with a list of 512k items, but sometimes not all executors are used, i.e. they have no work to do and nothing is logged after: 15/07/29 16:35:22 WARN internal.ThreadLocalRandom: Failed to generate a seed from SecureRandom within 3 seconds. Not enough entrophy?

sc.parallelize to work more like a producer/consumer?

2015-07-28 Thread Kostas Kougios
Hi, I am using sc.parallelize(... 32k items) several times for one job. Each executor takes a different amount of time to process its items, which results in some executors finishing quickly and staying idle until the others catch up. Only after all executors complete the first 32k batch, the next batch
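
For illustration, a minimal sketch of the pattern described in this question (the batch contents and the per-item work are made up); because each parallelize is processed as its own job, every task in a batch must finish before the next batch is submitted:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BatchedParallelize {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("batched-parallelize"))

    // Several 32k batches submitted one after another (contents are made up).
    val batches: Seq[Seq[Int]] = Seq.fill(4)((1 to 32000).toVector)

    for (batch <- batches) {
      // Every task of this batch must complete before the next parallelize is
      // submitted, so executors that finish their partitions early sit idle
      // until the stragglers catch up.
      sc.parallelize(batch).foreach(i => Thread.sleep(i % 10))
    }
    sc.stop()
  }
}
```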

RECEIVED SIGNAL 15: SIGTERM

2015-07-07 Thread Kostas Kougios
I am still receiving these weird SIGTERMs on the executors. The driver claims it lost the executor, and the executor receives a SIGTERM (from whom???). It doesn't seem to be a memory-related issue, though increasing memory takes the job a bit further or completes it. But why? There is no memory pressure on

Re: is it possible to disable -XX:OnOutOfMemoryError=kill %p for the executors?

2015-07-07 Thread Kostas Kougios
It seems it is hardcoded in ExecutorRunnable.scala: val commands = prefixEnv ++ Seq(YarnSparkHadoopUtil.expandEnvironment(Environment.JAVA_HOME) + /bin/java, -server, // Kill if OOM is raised - leverage yarn's failure handling to cause rescheduling. // Not killing the
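
The quoted fragment, reflowed for readability. This is only a best-effort reconstruction of the truncated snippet from Spark's YARN ExecutorRunnable.scala: prefixEnv, javaOpts, and the exact quoting of the OOM flag belong to that file and are assumed, not confirmed by the message.

```scala
val commands = prefixEnv ++ Seq(
  YarnSparkHadoopUtil.expandEnvironment(Environment.JAVA_HOME) + "/bin/java",
  "-server",
  // Kill if OOM is raised - leverage yarn's failure handling to cause rescheduling.
  // Not killing the ...                 (comment truncated in the quoted message)
  "-XX:OnOutOfMemoryError='kill %p'")    // the hardcoded flag the question asks about
```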

is it possible to disable -XX:OnOutOfMemoryError=kill %p for the executors?

2015-07-07 Thread Kostas Kougios
I get a suspicious SIGTERM on the executors that doesn't seem to come from the driver. The other thing that might send a SIGTERM is the -XX:OnOutOfMemoryError=kill %p Java arg that the executor starts with. Now my tasks don't seem to run out of memory, so how can I disable this param to debug them?
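
Per the reply listed above, the flag is hardcoded in ExecutorRunnable.scala, so it cannot simply be switched off via configuration. As a hedged debugging sketch (not part of the original thread), extra executor JVM options can be appended so that a real OOM, if one happens, leaves evidence behind; the option values are illustrative:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object OomDebugConf {
  def main(args: Array[String]): Unit = {
    // The kill-on-OOM flag is added by Spark's YARN launcher, but additional
    // JVM options for the executors can be set via spark.executor.extraJavaOptions
    // (heap dump path and GC logging here are illustrative choices).
    val conf = new SparkConf()
      .setAppName("oom-debug")
      .set("spark.executor.extraJavaOptions",
        "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -verbose:gc")

    val sc = new SparkContext(conf)
    // ... job code ...
    sc.stop()
  }
}
```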

Re: is it possible to disable -XX:OnOutOfMemoryError=kill %p for the executors?

2015-07-07 Thread Kostas Kougios
I've recompiled Spark, deleting the -XX:OnOutOfMemoryError=kill declaration, but I am still getting a SIGTERM!

Remoting started followed by a Remoting shut down straight away

2015-07-07 Thread Kostas Kougios
Is this normal? 15/07/07 15:27:04 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor@cruncher02.stratified:50063] 15/07/07 15:27:04 INFO util.Utils: Successfully started service 'sparkExecutor' on port 50063. 15/07/07 15:27:04 INFO

ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM

2015-07-03 Thread Kostas Kougios
I have this problem with a job: a random executor gets this ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM, almost always at the same point in the processing of the data. I am processing 1 million files with sc.wholeTextFiles. At around the 600,000th file, a container receives

binaryFiles() for 1 million files, too much memory required

2015-07-02 Thread Kostas Kougios
Once again I am trying to read a directory tree using binaryFiles. My directory tree has a root dir ROOTDIR and subdirs where the files are located, i.e. ROOTDIR/1, ROOTDIR/2, ..., ROOTDIR/100. A total of 1 million files split into 100 subdirs. Using binaryFiles requires too much memory on the
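
A minimal sketch of the read described above, assuming the ROOTDIR layout from the question; the HDFS URI, the partition hint, and the byte-counting action are all illustrative:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ReadBinaryTree {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("read-binary-tree"))

    // ROOTDIR/1 ... ROOTDIR/100, ~1 million files in total.
    val files = sc.binaryFiles("hdfs:///ROOTDIR/*", minPartitions = 1000)

    // Touch only the sizes here; a real job would decode each PortableDataStream.
    val totalBytes = files
      .map { case (_, stream) => stream.toArray().length.toLong }
      .reduce(_ + _)

    println(s"read $totalBytes bytes")
    sc.stop()
  }
}
```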

wholeTextFiles(/x/*/*.txt) runs single threaded

2015-07-02 Thread Kostas Kougios
Hi, I have a cluster of 4 machines and I call sc.wholeTextFiles(/x/*/*.txt). Folder x contains subfolders, and each subfolder contains thousands of files, with a total of ~1 million matching the path expression. My Spark task starts processing the files, but single-threaded. I can see that in the Spark UI,
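
For reference, a sketch of the call with an explicit minPartitions hint (the value 128 is arbitrary and not from the thread); the follow-up below reports that repartition(32) did not change the threading, so this only shows the API shape, not a confirmed fix:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WholeTextFilesSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("whole-text-files"))

    // The glob from the question; minPartitions is only a hint, and
    // wholeTextFiles may still pack many small files into few splits.
    val texts = sc.wholeTextFiles("/x/*/*.txt", minPartitions = 128)

    println(s"partitions: ${texts.partitions.length}")
    val totalChars = texts.map { case (_, content) => content.length.toLong }.reduce(_ + _)
    println(s"total characters: $totalChars")
    sc.stop()
  }
}
```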

Re: wholeTextFiles(/x/*/*.txt) runs single threaded

2015-07-02 Thread Kostas Kougios
In the Spark UI I can see it creating 2 stages. I tried wholeTextFiles().repartition(32), but the threading results are the same.

spark uses too much memory maybe (binaryFiles() with more than 1 million files in HDFS), groupBy or reduceByKey()

2015-06-10 Thread Kostas Kougios
Both the driver (ApplicationMaster running on Hadoop) and the container (CoarseGrainedExecutorBackend) end up exceeding my 25GB allocation. My code is something like sc.binaryFiles(... 1 mil xml files).flatMap(... extract some domain classes, not many though, as each xml usually has zero
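
A minimal sketch of the pipeline described above; the path, the XML handling (just element names via a regex), and the reduceByKey aggregation are placeholders standing in for the "extract some domain classes" and the "groupBy or reduceByKey" steps named in the subject:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BinaryXmlPipeline {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("binary-xml-pipeline"))

    val countsByTag = sc.binaryFiles("hdfs:///xmls/*")    // ~1 million small XML files
      .flatMap { case (_, stream) =>
        // Placeholder for "extract some domain classes": just the element names.
        val xml = new String(stream.toArray(), "UTF-8")
        "<([A-Za-z][\\w.-]*)".r.findAllMatchIn(xml).map(_.group(1))
      }
      .map(tag => (tag, 1L))
      .reduceByKey(_ + _)   // combines map-side instead of shipping whole groups

    countsByTag.take(20).foreach(println)
    sc.stop()
  }
}
```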

Re: spark uses too much memory maybe (binaryFiles() with more than 1 million files in HDFS), groupBy or reduceByKey()

2015-06-10 Thread Kostas Kougios
I am profiling the driver. It currently has 564MB of strings, which might be the 1 mil file names. But it also has 2.34 GB of long[]! That's so far; it is still running. What are those long[] used for?

Re: spark uses too much memory maybe (binaryFiles() with more than 1 million files in HDFS), groupBy or reduceByKey()

2015-06-10 Thread Kostas Kougios
After some time the driver accumulated 6.67GB of long[]. The executor memory usage so far is low.

spark times out maybe due to binaryFiles() with more than 1 million files in HDFS

2015-06-08 Thread Kostas Kougios
I am reading millions of XML files via val xmls = sc.binaryFiles(xmlDir). The operation runs fine locally, but on YARN it fails with: client token: N/A diagnostics: Application application_1433491939773_0012 failed 2 times due to ApplicationMaster for attempt