Re: Using Custom Partitioner in Streaming with parallelism 1 adding latency

Aljoscha Krettek Thu, 15 Jun 2017 06:06:46 -0700

Hi,
Before adding the partitioner all your functions are chained together. That is, 
everything is executed in one Thread and sending elements from one function to 
the next is essentially just a method call. By introducing a partitioner you 
break this chain and therefore your job now has to send data (at least) across 
between different Threads. I think you should see this if you look at the 
execution graph.


Best,
Aljoscha

> On 15. Jun 2017, at 14:35, sohimankotia <sohimanko...@gmail.com> wrote:
> 
> Hi,
> 
> I have streaming job which is running with parallelism 1 as of now .  (This
> job will run with parallelism > 1 in future )
> 
> So I have added custom partitioner to partition the data based on one tuple
> field .
> 
> The flow is  :
> 
> source -> map -> partitioner -> flatmap -> sink
> 
> The partitioner is adding around 40 sec or more delay for 40% of data (from
> map to flatmap ).
> 
> I am not able to understand if parallelism is one how this delay is getting
> added .
> 
> 
> 
> --
> View this message in context: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Using-Custom-Partitioner-in-Streaming-with-parallelism-1-adding-latency-tp13766.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive at 
> Nabble.com.

Re: Using Custom Partitioner in Streaming with parallelism 1 adding latency

Reply via email to