Hello Mike,
No problem, the logs are useful to us anyway. Thank you for all the pointers.
We started off by examining only a single RDD but later on added a few
more. The persist-count-unpersist sequence is the dummy stage
that you suggested we use to avoid the initial scheduler delay.
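For readers following the thread, the dummy stage described above can be sketched roughly as follows (a sketch only; the RDD contents and `numPartitions` value are placeholders, and `sc` is the usual SparkContext):

```scala
// Dummy stage to absorb the initial scheduler delay: materialize a
// throwaway RDD so executors have time to register before the real
// job's partitioning decisions are made.
val numPartitions = 8
val dummy = sc.parallelize(1 to 1000, numPartitions).persist()
dummy.count()     // forces a job; executors register while it runs
dummy.unpersist() // free the cached blocks before the real work starts
```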
Hi Raghava,
I'm terribly sorry about the end of my last email; that garbled
sentence wasn't meant to exist. I wrote it on my phone, realized I
wouldn't realistically have time to look into another set of logs
deeply enough, and mistakenly thought I had deleted it. Agai
Maybe this is due to the config spark.scheduler.minRegisteredResourcesRatio;
you can try setting it to 1 to see the behavior.
// Submit tasks only after (registered resources / total expected resources)
// is equal to at least this value, that is double between 0 and 1.
var minRegisteredRatio =
  math.min(1, conf.getDouble("spark.scheduler.minRegisteredResourcesRatio", 0))
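One way to try this setting is to pass it at submit time (the class and jar names below are placeholders, not from the thread):

```shell
# Wait until all expected executor resources have registered before the
# first tasks are submitted (the default ratio is 0 for standalone mode).
spark-submit \
  --conf spark.scheduler.minRegisteredResourcesRatio=1 \
  --class com.example.MyApp \
  myapp.jar
```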
Could you change numPartitions to {16, 32, 64} and run your program for
each to see how many partitions are allocated to each worker? Let's see if
you experience an all-or-nothing imbalance that way; if so, my guess is that
something else is odd in your program logic or Spark runtime environment,
but
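One way to see how partitions are spread across workers is to record the hostname inside each partition (a sketch, assuming an existing RDD `rdd`; run it once per `numPartitions` setting):

```scala
import java.net.InetAddress

// Emit one hostname per partition, then tally partitions per host
// on the driver.
val hosts = rdd
  .mapPartitions(_ => Iterator(InetAddress.getLocalHost.getHostName))
  .collect()

hosts.groupBy(identity).foreach { case (host, ps) =>
  println(s"$host: ${ps.length} partitions")
}
```

If one host reports nearly all the partitions while the others report none, that is the all-or-nothing imbalance described above.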
Mike, All,
It turns out that the second time we encountered the uneven-partition issue,
it was not due to spark-submit. It was resolved by changing the placement of
count().
Case-1:
val numPartitions = 8
// read uAxioms from HDFS, use hash partitioner on it and persist it
// read type1Axioms from
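The snippet above is truncated in the thread, but its general shape can be sketched as follows (the HDFS path, parsing, and key choice are placeholders; the point is the placement of count() immediately after persist(), which the thread reports as the fix):

```scala
import org.apache.spark.HashPartitioner

val numPartitions = 8

// Read uAxioms from HDFS, hash-partition it, and persist it.
val uAxioms = sc.textFile("hdfs://.../uAxioms")       // placeholder path
  .map(line => (line.hashCode, line))                 // placeholder keying
  .partitionBy(new HashPartitioner(numPartitions))
  .persist()
uAxioms.count() // materialize here so partitioning happens eagerly
```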
Thank you. For now we plan to use spark-shell to submit jobs.
Regards,
Raghava.
On Fri, Apr 22, 2016 at 7:40 PM, Mike Hynes <91m...@gmail.com> wrote:
Glad to hear that the problem was solvable! I have not seen delays of this
type for later stages in jobs run by spark-submit, but I do not think it
impossible if your stage has no lineage dependence on other RDDs.
I'm CC'ing the dev list to report of other users observing load imbalance
caused by
Mike,
It turns out the executor delay, as you mentioned, is the cause. After we
introduced a dummy stage, partitioning worked fine. Does this delay
happen during later stages as well? We noticed the same behavior
(partitioning happens with spark-shell but not through spark-submit) at a
later stage.