Got it. Thanks, that clarifies.
On Thu, Nov 7, 2013 at 3:34 PM, Shangyu Luo lsy...@gmail.com wrote:
I am not sure, but their RDD paper mentions the use of broadcast
variables. Sometimes you need the same local variable in many
map-reduce jobs and do not want to copy it to all worker nodes
multiple times. In that case a broadcast variable is a good choice.
2013/11/7 Walrus theCat
I ran into the 'Too many open files' problem before. One solution is to add
'ulimit -n 10' to the spark-env.sh file.
Basically, I think the local variable may not be the problem, as I have
written programs that pass local variables as parameters to functions, and
those programs work.
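For the 'Too many open files' workaround above, here is a minimal sketch of
what the lines in spark-env.sh could look like, assuming the file is sourced
by bash. The function name raise_nofile_limit and the target of 4096 are
placeholders of mine, not values from this thread; pick a number that suits
your job. The sketch caps the request at the hard limit, since an
unprivileged user cannot raise the soft limit beyond it.

```shell
# Hypothetical target for the open-file limit; adjust for your workload.
NOFILE_TARGET=4096

# Raise the soft open-file limit to the target, without exceeding the hard limit.
raise_nofile_limit() {
  local target="$1"
  local hard current
  hard="$(ulimit -H -n)"
  current="$(ulimit -S -n)"

  # Clamp the request to the hard limit when one is set.
  if [ "$hard" != "unlimited" ] && [ "$target" -gt "$hard" ]; then
    target="$hard"
  fi

  # Nothing to do if the soft limit is already at least the target.
  if [ "$current" = "unlimited" ] || [ "$current" -ge "$target" ]; then
    return 0
  fi

  ulimit -S -n "$target"
}

raise_nofile_limit "$NOFILE_TARGET"
```

Running 'ulimit -n' in the same shell afterwards shows the limit actually in
effect, which is worth checking on each worker node.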
2013/11/3 Walrus
Are there heuristics to check when the scheduler says it is missing
parents and just hangs?
On Thu, Oct 31, 2013 at 4:56 PM, Walrus theCat walrusthe...@gmail.com wrote:
Hi,
I'm not sure what's going on here. My code seems to be working thus far
(map at SparkLR:90 completed.) What can I do to help the scheduler out
here?
Thanks
13/10/31 02:10:13 INFO scheduler.DAGScheduler: Completed ShuffleMapTask(10, 211)
13/10/31 02:10:13 INFO scheduler.DAGScheduler: Stage