>> So increasing Executors without increasing physical resources

If I have a system with 16 GB of RAM, allocate 1 GB to each executor, and set the number of executors to 8, then I am increasing the resources, right? How do you explain this case?

Thank You
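For concreteness, a minimal sketch of the configuration being asked about, assuming standalone mode (the app name is illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch of the setup described above: a 16 GB machine running
    // 8 executors of 1 GB each. In standalone mode an application gets
    // one executor per worker instance, so "8 executors on one machine"
    // means starting 8 worker instances (e.g. SPARK_WORKER_INSTANCES=8
    // in conf/spark-env.sh).
    val conf = new SparkConf()
      .setAppName("ExecutorSizingSketch") // illustrative name
      .set("spark.executor.memory", "1g") // per-executor heap: 8 x 1 GB = 8 GB of the 16 GB
    val sc = new SparkContext(conf)

Either way, the machine still has only 16 GB of physical RAM; splitting the same 8 GB of heap across eight JVMs adds inter-executor communication (Aaron's point below) without adding physical resources.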
On Sun, Feb 22, 2015 at 6:12 AM, Aaron Davidson <ilike...@gmail.com> wrote:

> Note that the parallelism (i.e., the number of partitions) is just an upper bound on how much of the work can be done in parallel. If you have 200 partitions, then you can divide the work among between 1 and 200 cores and all resources will remain utilized. If you have more than 200 cores, though, then some will not be used, so you would want to increase parallelism further. (There are other rules of thumb -- for instance, it's generally good to have at least 2x more partitions than cores for straggler mitigation -- but these are essentially just optimizations.)
>
> Further note that when you increase the number of Executors for the same set of resources (i.e., starting 10 Executors on a single machine instead of 1), you make Spark's job harder. Spark has to communicate in an all-to-all manner across Executors for shuffle operations, and it uses TCP sockets to do so whether or not the Executors happen to be on the same machine. So increasing Executors without increasing physical resources means Spark has to do more communication to do the same work.
>
> We expect that increasing the number of Executors by a factor of 10, given an increase in physical resources by the same factor, would also improve performance by 10x. This is not always the case, for precisely the reason above (increased communication overhead), but typically we can get close. The actual observed improvement is very algorithm-dependent, though; for instance, some ML algorithms become hard to scale out past a certain point because the increase in communication overhead outweighs the increase in parallelism.
>
> On Sat, Feb 21, 2015 at 8:19 AM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
>
>> So, if I keep the number of instances constant and increase the degree of parallelism in steps, can I expect the performance to increase?
>>
>> Thank You
>>
>> On Sat, Feb 21, 2015 at 9:07 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
>>
>>> So, with the increase in the number of worker instances, if I also increase the degree of parallelism, will it make any difference?
>>> I can use this model the other way round too, right? That is, I can predict the deterioration in an app's performance as the number of worker instances increases, right?
>>>
>>> Thank You
>>>
>>> On Sat, Feb 21, 2015 at 8:52 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
>>>
>>>> Yes, I have decreased the executor memory.
>>>> But if I have to do this, then I have to tweak the code for each configuration, right?
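Tweaking the code per configuration shouldn't be necessary: parallelism can usually be varied per run from outside the algorithm. A minimal sketch, following the rules of thumb above (the path and numbers are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("ParallelismSketch") // illustrative name
    // Default partition count for shuffles; can also be supplied per run
    // via spark-submit --conf spark.default.parallelism=16, with no code change.
    conf.set("spark.default.parallelism", "16")
    val sc = new SparkContext(conf)

    // Ask for a minimum number of partitions when loading...
    val data = sc.textFile("hdfs:///path/to/data.txt", 16) // illustrative path

    // ...or reshuffle an existing RDD explicitly. Rule of thumb from above:
    // at least 2x more partitions than total cores, for straggler mitigation.
    val repartitioned = data.repartition(16)
    println(s"partitions = ${repartitioned.partitions.length}")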
>>>> On Sat, Feb 21, 2015 at 8:47 PM, Sean Owen <so...@cloudera.com> wrote:
>>>>
>>>>> "Workers" has a specific meaning in Spark. You are running many on one machine? That's possible, but not usual.
>>>>>
>>>>> Each worker's executors then have access to only a fraction of your machine's resources. If you're not increasing parallelism, maybe you're not actually using the additional workers, and so are using less resource for your problem.
>>>>>
>>>>> Or, because the resulting executors are smaller, maybe you're hitting GC thrashing in these executors with smaller heaps.
>>>>>
>>>>> Or, if you're not actually configuring the executors to use less memory, maybe you're over-committing your RAM and swapping?
>>>>>
>>>>> Bottom line: you wouldn't use multiple workers on one small standalone node. This isn't a good way to estimate performance on a distributed cluster, either.
>>>>>
>>>>> On Sat, Feb 21, 2015 at 3:11 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
>>>>> > No, I just have a single-node standalone cluster.
>>>>> >
>>>>> > I am not tweaking the code to increase parallelism. I am just running the SparkKMeans example that ships with Spark 1.0.0.
>>>>> > I just wanted to know whether this behavior is natural, and if so, what causes it.
>>>>> >
>>>>> > Thank you
>>>>> >
>>>>> > On Sat, Feb 21, 2015 at 8:32 PM, Sean Owen <so...@cloudera.com> wrote:
>>>>> >>
>>>>> >> What's your storage like? Are you adding worker machines that are remote from where the data lives? I wonder if it just means you are spending more and more time sending the data over the network as you try to ship more of it to more remote workers.
>>>>> >>
>>>>> >> To answer your question: no, in general more workers means more parallelism and therefore faster execution. But that depends on a lot of things. For example, if your process isn't parallelized to use all available execution slots, adding more slots doesn't do anything.
>>>>> >>
>>>>> >> On Sat, Feb 21, 2015 at 2:51 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
>>>>> >> > Yes, I am talking about a standalone single-node cluster.
>>>>> >> >
>>>>> >> > No, I am not increasing parallelism. I just wanted to know if this is natural.
>>>>> >> > Does message passing across the workers account for what is happening?
>>>>> >> >
>>>>> >> > I am running SparkKMeans, just to validate one prediction model. I am using several data sets, in standalone mode, varying the workers from 1 to 16.
>>>>> >> >
>>>>> >> > On Sat, Feb 21, 2015 at 8:14 PM, Sean Owen <so...@cloudera.com> wrote:
>>>>> >> >>
>>>>> >> >> I can imagine a few reasons. Adding workers might cause fewer tasks to execute locally (?), so you may be executing more of them remotely.
>>>>> >> >>
>>>>> >> >> Are you increasing parallelism? For trivial jobs, chopping them up further may cause you to pay more overhead in managing so many small tasks, for no speedup in execution time.
>>>>> >> >>
>>>>> >> >> Can you provide any more specifics, though? You haven't said what you're running, in what mode, with how many workers, how long it takes, etc.
>>>>> >> >>
>>>>> >> >> On Sat, Feb 21, 2015 at 2:37 PM, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
>>>>> >> >> > Hi,
>>>>> >> >> > I have been running some jobs on my local single-node standalone cluster. I am varying the number of worker instances for the same job, and the time taken for the job to complete increases with the number of workers. I repeated some experiments varying the number of nodes in a cluster too, and the same behavior is seen.
>>>>> >> >> > Can the idea of worker instances be extrapolated to the nodes in a cluster?
>>>>> >> >> >
>>>>> >> >> > Thank You
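To make the experiment discussed in this thread concrete: a minimal sketch that times a k-means job at increasing partition counts, using MLlib's KMeans rather than the SparkKMeans example itself (the path, k, and iteration count are illustrative). On a fixed single machine, the expectation from the discussion above is that past the number of physical cores, more partitions or more worker instances mostly add scheduling and communication overhead:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    val sc = new SparkContext(new SparkConf().setAppName("KMeansScalingSketch"))

    // Parse whitespace-separated numeric vectors, as the k-means examples do.
    val points = sc.textFile("hdfs:///path/to/kmeans_data.txt") // illustrative path
      .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))

    // Time the same job at increasing partition counts.
    for (numPartitions <- Seq(1, 2, 4, 8, 16)) {
      val rdd = points.repartition(numPartitions).cache()
      rdd.count() // materialize before timing
      val start = System.nanoTime()
      KMeans.train(rdd, 2, 20) // k = 2, 20 iterations (illustrative)
      val elapsedMs = (System.nanoTime() - start) / 1e6
      println(s"partitions=$numPartitions time=$elapsedMs ms")
      rdd.unpersist()
    }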