subject:"Task Parallelism in a Cluster"

Re: Task Parallelism in a Cluster

2015-12-14 Thread Kashmar, Ali

Hi Stephan, I figured it out. The problem was that the date/time was different on all 3 nodes. Zookeeper thought that it hadn’t heard from the other nodes for longer than the allowed period and dropped them, therefore causing the other two task managers in the cluster to fail. I synchronized the t

Re: Task Parallelism in a Cluster

2015-12-11 Thread Kashmar, Ali

Hi Stephan, I’m using DataStream.writeAsText(String path, WriteMode writemode) for my sink. The data is written to disk and there’s plenty of space available. I looked deeper into the logs and found out that the jobs on 174 and 175 are not actually stuck, but they’re moving extremely slowly. This

Re: Task Parallelism in a Cluster

2015-12-11 Thread Stephan Ewen

Hi Ali! I see, so the tasks on 192.168.200.174 and 192.168.200.175 apparently do not make progress, and do not even recognize the end-of-stream point. I expect that the streams on 192.168.200.174 and 192.168.200.175 are back-pressured to a standstill. Since no network is involved, the reason for the ba

Re: Task Parallelism in a Cluster

2015-12-11 Thread Kashmar, Ali

Hi Stephan, I got a request to share the image with someone and I assume it was you. You should be able to see it now. This seems to be the main issue I have at this time. I've tried running the job on the cluster with a parallelism of 16, 24, 36, and even went up to 48. I see all the parallel pip

Re: Task Parallelism in a Cluster

2015-12-10 Thread Stephan Ewen

Hi Ali! Seems like the Google Doc has restricted access, it tells me I have no permission to view it... Stephan On Wed, Dec 9, 2015 at 8:49 PM, Kashmar, Ali wrote: > Hi Stephan, > > Here’s a link to the screenshot I tried to attach earlier: > > https://drive.google.com/open?id=0B0_jTR8-IvUcMEd

Re: Task Parallelism in a Cluster

2015-12-09 Thread Kashmar, Ali

Hi Stephan, Here’s a link to the screenshot I tried to attach earlier: https://drive.google.com/open?id=0B0_jTR8-IvUcMEdjWGFmYXJYS28 It looks to me like the distribution is fairly skewed across the nodes, even though they’re executing the same pipeline. Thanks, Ali On 2015-12-09, 12:36 PM, "S

Re: Task Parallelism in a Cluster

2015-12-09 Thread Stephan Ewen

Hi! The parallel socket source looks good. I think you forgot to attach the screenshot, or the mailing list dropped the attachment... Not sure if I can diagnose that without more details. The sources all do the same. Assuming that the server distributes the data evenly across all connected socket

Re: Task Parallelism in a Cluster

2015-12-09 Thread Kashmar, Ali

Hi Stephan, That was my original understanding, until I realized that I was not using a parallel socket source. I had a custom source that extended SourceFunction which always runs with parallelism = 1. I looked through the API and found the ParallelSourceFunction interface so I implemented that a

Re: Task Parallelism in a Cluster

2015-12-08 Thread Stephan Ewen

Hi Ali! In the case you have, the sequence of source-map-filter ... forms a pipeline. You mentioned that you set the parallelism to 16, so there should be 16 pipelines. These pipelines should be completely independent. Looking at the way the scheduler is implemented, independent pipelines should

Re: Task Parallelism in a Cluster

2015-12-02 Thread Kashmar, Ali

There is no shuffle operation in my flow. Mine actually looks like this: Source: Custom Source -> Flat Map -> (Filter -> Flat Map -> Map -> Map -> Map, Filter) Maybe it’s treating this whole flow as one pipeline and assigning it to a slot. What I really wanted was to have the custom source I bui

Re: Task Parallelism in a Cluster

2015-12-02 Thread Till Rohrmann

If I'm not mistaken, the scheduler already has a preference to spread independent pipelines out across the cluster. At least it uses a queue of instances from which it pops the first element when it allocates a new slot. This instance is then appended to the queue again, if it has some resources
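
The queue behaviour described in this message can be sketched in plain Java. This is a toy model for illustration only, not Flink's scheduler code: the class `RoundRobinSlots`, the `Instance` type, and the names `tm-1` to `tm-3` are all hypothetical, and the only assumption taken from the thread is the setup of 3 nodes with 16 slots each.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of the slot allocation described above; NOT Flink's scheduler code.
class RoundRobinSlots {

    // Hypothetical stand-in for a task manager with a number of free slots.
    static final class Instance {
        final String name;
        int freeSlots;

        Instance(String name, int freeSlots) {
            this.name = name;
            this.freeSlots = freeSlots;
        }
    }

    // Allocates one slot per parallel pipeline: pop the first instance from the
    // queue, take one of its slots, and append it to the back of the queue again
    // if it still has free slots. Every queued instance must have >= 1 free slot.
    static List<String> allocate(Deque<Instance> queue, int parallelism) {
        List<String> assignment = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            Instance head = queue.pollFirst();
            if (head == null) {
                throw new IllegalStateException("not enough free slots");
            }
            head.freeSlots--;
            assignment.add(head.name);
            if (head.freeSlots > 0) {
                queue.addLast(head);
            }
        }
        return assignment;
    }

    // The setup from this thread: 3 nodes offering 16 slots each.
    static List<String> threeNodeExample(int parallelism) {
        Deque<Instance> queue = new ArrayDeque<>();
        queue.addLast(new Instance("tm-1", 16));
        queue.addLast(new Instance("tm-2", 16));
        queue.addLast(new Instance("tm-3", 16));
        return allocate(queue, parallelism);
    }

    public static void main(String[] args) {
        // Prints the instance chosen for each of the 16 parallel pipelines.
        System.out.println(threeNodeExample(16));
    }
}
```

Under this model, independent pipelines rotate across the instances rather than filling the first one, which is the preference the message refers to.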

Re: Task Parallelism in a Cluster

2015-12-01 Thread Stephan Ewen

Slots are like "resource groups" which execute entire pipelines. They frequently have more than one operator. What you can try as a workaround is to decrease the number of slots per machine to cause the operators to be spread across more machines. If this is a crucial issue for your use case, it sho
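
The arithmetic behind this workaround can be sketched as follows. It is a minimal illustration assuming identical machines and one pipeline per slot; the class `SlotSpread` and the method `minMachines` are hypothetical names, not part of Flink. In Flink the slots per machine are set with `taskmanager.numberOfTaskSlots` in `flink-conf.yaml`.

```java
// Minimal sketch of why fewer slots per machine forces a wider spread.
// Assumes identical machines and one parallel pipeline per slot.
class SlotSpread {

    // Smallest number of machines whose slots can hold `parallelism` pipelines.
    static int minMachines(int parallelism, int slotsPerMachine) {
        if (parallelism <= 0 || slotsPerMachine <= 0) {
            throw new IllegalArgumentException("arguments must be positive");
        }
        // Ceiling division without floating point.
        return (parallelism + slotsPerMachine - 1) / slotsPerMachine;
    }

    public static void main(String[] args) {
        int parallelism = 16; // the parallelism used in this thread
        for (int slots : new int[] {16, 8, 6}) {
            System.out.println(slots + " slots per machine -> at least "
                    + minMachines(parallelism, slots) + " machine(s)");
        }
    }
}
```

With 16 slots per machine, a job with parallelism 16 can fit on a single machine; with 6 slots per machine, it needs at least 3.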

Re: Task Parallelism in a Cluster

2015-12-01 Thread Ufuk Celebi

> On 01 Dec 2015, at 15:26, Kashmar, Ali wrote: > > Is there a way to make a task cluster-parallelizable? I.e. Make sure the > parallel instances of the task are distributed across the cluster. When I > run my flink job with a parallelism of 16, all the parallel tasks are > assigned to the first

Re: Task Parallelism in a Cluster

2015-12-01 Thread Kashmar, Ali

Is there a way to make a task cluster-parallelizable? I.e. Make sure the parallel instances of the task are distributed across the cluster. When I run my flink job with a parallelism of 16, all the parallel tasks are assigned to the first task manager. - Ali On 2015-11-30, 2:18 PM, "Ufuk Celebi"

Re: Task Parallelism in a Cluster

2015-11-30 Thread Ufuk Celebi

> On 30 Nov 2015, at 17:47, Kashmar, Ali wrote: > Do the parallel instances of each task get distributed across the cluster or > is it possible that they all run on the same node? Yes, slots are requested from all nodes of the cluster. But keep in mind that multiple tasks (forming a local pipe

Task Parallelism in a Cluster

2015-11-30 Thread Kashmar, Ali

Hello, I’m trying to wrap my head around task parallelism in a Flink cluster. Let’s say I have a cluster of 3 nodes, each node offering 16 task slots, so in total I’d have 48 slots for processing. Do the parallel instances of each task get distributed across the cluster or is it possible that t


16 matches
