[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2019-02-23 Thread Thomas Weise (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776082#comment-16776082
 ] 

Thomas Weise commented on FLINK-10887:
--

There is a workaround available, so I don't see this as a pressing issue.

It may be good to discuss what a better solution would look like. Ideally the 
user class loader would be used for RPC deserialization, but I'm not sure 
that is feasible.

If, as for aggregateFunction, we would require double (de)serialization (RPC + 
RpcGlobalAggregateManager), then it might be better to leave it to the user 
instead of assuming another layer of Java serialization.

Probably best to create a new JIRA. 
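
On the idea of deserializing with the user class loader: the sketch below shows 
the general JDK technique, assuming plain Java serialization. It is not Flink's 
RPC code; `UserCodeDeserializer` and `LoaderAwareStream` are hypothetical names 
introduced only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.UncheckedIOException;

/** Hypothetical helper, not Flink code: Java deserialization against a chosen class loader. */
final class UserCodeDeserializer {

    /** Resolves classes through the supplied loader before the default lookup. */
    private static final class LoaderAwareStream extends ObjectInputStream {
        private final ClassLoader loader;

        LoaderAwareStream(InputStream in, ClassLoader loader) throws IOException {
            super(in);
            this.loader = loader;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            try {
                return Class.forName(desc.getName(), false, loader);
            } catch (ClassNotFoundException e) {
                // The default lookup also covers primitive type names such as "long".
                return super.resolveClass(desc);
            }
        }
    }

    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data, ClassLoader userLoader) {
        try (ObjectInputStream in =
                new LoaderAwareStream(new ByteArrayInputStream(data), userLoader)) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class not visible to the given loader", e);
        }
    }
}
```

Whether the RPC layer could be handed such a loader at the point where it 
deserializes is exactly the open feasibility question raised above.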

> Add source watermark tracking to the JobMaster
> --
>
> Key: FLINK-10887
> URL: https://issues.apache.org/jira/browse/FLINK-10887
> Project: Flink
>  Issue Type: Sub-task
>  Components: JobManager
>Reporter: Jamie Grier
>Assignee: Jamie Grier
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>   Original Estimate: 24h
>  Time Spent: 50m
>  Remaining Estimate: 23h 10m
>
> We need to add a new RPC to the JobMaster such that the current watermark for 
> every source sub-task can be reported and the current global minimum/maximum 
> watermark can be retrieved so that each source can adjust their partition 
> read rates in an attempt to keep sources roughly aligned in event time.
>  
>  
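
The mechanism described in the issue can be pictured roughly as follows. This 
is a minimal in-memory sketch, not the code from the pull request: 
`SourceWatermarkTracker`, `report` and `shouldThrottle` are hypothetical names, 
and only the global minimum is shown (a maximum would be tracked the same way).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch, not the code from the pull request. */
final class SourceWatermarkTracker {

    // Latest watermark reported by each source sub-task, keyed by sub-task index.
    private final Map<Integer, Long> watermarks = new ConcurrentHashMap<>();

    /** Records a sub-task's watermark and returns the current global minimum. */
    long report(int subtaskIndex, long watermark) {
        // Watermarks are assumed to be monotonic, so keep the larger value.
        watermarks.merge(subtaskIndex, watermark, Long::max);
        return globalMinimum();
    }

    long globalMinimum() {
        return watermarks.values().stream()
                .mapToLong(Long::longValue)
                .min()
                .orElse(Long.MIN_VALUE);
    }

    /** A source could slow its reads while it is too far ahead of the slowest one. */
    static boolean shouldThrottle(long localWatermark, long globalMinimum, long maxLeadMillis) {
        return localWatermark - globalMinimum > maxLeadMillis;
    }
}
```

A source sub-task would call `report` periodically and compare its own 
watermark with the returned minimum to decide whether to reduce its read rate.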



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2019-02-21 Thread Jamie Grier (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774612#comment-16774612
 ] 

Jamie Grier commented on FLINK-10887:
-

[~thw] I'll update the PR with a solution for the aggregand and result that 
works similarly to the aggregateFunction.  This was as designed, but I think 
it's an oversight.  Thanks.



[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2019-02-16 Thread Thomas Weise (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770265#comment-16770265
 ] 

Thomas Weise commented on FLINK-10887:
--

For verification, I used a synthetic source with a timer that would update an 
aggregate every second, with parallelism 128. The overhead is negligible.

Note that while aggregators can be custom classes available only through the 
user code class loader, aggregands and results currently cannot. If they are 
custom classes, then (de)serialization needs to be performed in user code and 
byte arrays used with updateGlobalAggregate.
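
A minimal sketch of that workaround, assuming a fixed-width binary encoding 
chosen by the user. `WatermarkCodec` and `SubtaskWatermark` are hypothetical 
names; the call to updateGlobalAggregate itself is not shown, only the user-side 
encoding and an aggregation step that works on byte arrays.

```java
import java.nio.ByteBuffer;

/** Hypothetical user-side codec; the class and method names are illustrative only. */
final class WatermarkCodec {

    /** A custom value type that lives only in the user jar. */
    static final class SubtaskWatermark {
        final int subtask;
        final long watermark;

        SubtaskWatermark(int subtask, long watermark) {
            this.subtask = subtask;
            this.watermark = watermark;
        }
    }

    /** Encodes the value as 4 bytes of sub-task index followed by 8 bytes of watermark. */
    static byte[] encode(SubtaskWatermark value) {
        return ByteBuffer.allocate(Integer.BYTES + Long.BYTES)
                .putInt(value.subtask)
                .putLong(value.watermark)
                .array();
    }

    static SubtaskWatermark decode(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        return new SubtaskWatermark(buffer.getInt(), buffer.getLong());
    }

    /** Aggregation step over encoded values: keeps whichever watermark is lower. */
    static byte[] min(byte[] accumulator, byte[] update) {
        if (accumulator == null) {
            return update;
        }
        return decode(update).watermark < decode(accumulator).watermark ? update : accumulator;
    }
}
```

Because only byte arrays cross the RPC boundary in this scheme, the custom class 
never has to be resolved outside the user code class loader.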

 



[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-12-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713157#comment-16713157
 ] 

ASF GitHub Bot commented on FLINK-10887:


tweise commented on issue #7099: [FLINK-10887] [jobmaster] Add source watermark 
tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#issuecomment-445318396
 
 
   > @aljoscha @tweise Can you guys comment on the above generic aggregator 
proposal? I'd like to keep this moving forward.
   
   @jgrier +1 as suggested earlier. We just need to confirm that the class of 
`aggregationFunction` is available. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-12-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16713143#comment-16713143
 ] 

ASF GitHub Bot commented on FLINK-10887:


jgrier commented on issue #7099: [FLINK-10887] [jobmaster] Add source watermark 
tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#issuecomment-445316366
 
 
   > Can we always assume that the user-jar/class loader will be available 
where the `AggregateFunction` is needed? If yes, I think this is a nice 
approach! (We can probably use concrete types in the interface, though)
   
   @aljoscha I'm actually not sure if the user code classloader is available 
from the JobMaster but I would think that's reasonable since there's a 1:1 
relationship between the JobMaster and a single job.
   
   WRT concrete types in the RPC interface I'm not sure what you're thinking 
there.  The concrete types are not known in this approach.  The types are up to 
the user/client and can be different for each named aggregate.
   
   
   




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-12-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712618#comment-16712618
 ] 

ASF GitHub Bot commented on FLINK-10887:


aljoscha commented on issue #7099: [FLINK-10887] [jobmaster] Add source 
watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#issuecomment-445189806
 
 
   Also, congratulations, I guess!  




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-12-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712616#comment-16712616
 ] 

ASF GitHub Bot commented on FLINK-10887:


aljoscha commented on issue #7099: [FLINK-10887] [jobmaster] Add source 
watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#issuecomment-445189727
 
 
   Can we always assume that the user-jar/class loader will be available where 
the `AggregateFunction` is needed? If yes, I think this is a nice approach! (We 
can probably use concrete types in the interface, though)




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-12-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711581#comment-16711581
 ] 

ASF GitHub Bot commented on FLINK-10887:


jgrier edited a comment on issue #7099: [FLINK-10887] [jobmaster] Add source 
watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#issuecomment-444907282
 
 
   @aljoscha @tweise Can you guys comment on the above generic aggregator 
proposal?  I'd like to keep this moving forward.




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-12-06 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711578#comment-16711578
 ] 

ASF GitHub Bot commented on FLINK-10887:


jgrier commented on issue #7099: [FLINK-10887] [jobmaster] Add source watermark 
tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#issuecomment-444907282
 
 
   @aljoscha @tweise Can you guys comment on the above proposal?  I'd like to 
keep this moving forward.




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707319#comment-16707319
 ] 

ASF GitHub Bot commented on FLINK-10887:


jgrier commented on issue #7099: [FLINK-10887] [jobmaster] Add source watermark 
tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#issuecomment-443739811
 
 
   Sorry I haven't responded to this.  We had a baby boy this week so that has 
kept me pretty busy ;)
   
   Okay, so I'm on board with generifying this further.  @StephanEwen if we're 
to do a generic transient aggregator do you mean to allow the client to provide 
the aggregation function?  In this case the API would look something like this:
   
   `CompletableFuture updateAggregate(
 String aggregateName,
 T aggregand,
 BiFunction aggregationFunction);`
   
   Is this what you had in mind?
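
   To make the proposed shape concrete, here is a minimal in-memory sketch. The 
generic parameters in the signature quoted above appear to have been lost in 
transit, so the use of `T` throughout is an assumption, and `TransientAggregates` 
is a hypothetical stand-in rather than the JobMaster implementation.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

/** Hypothetical in-memory stand-in for the proposed RPC; not the PR's implementation. */
final class TransientAggregates {

    // One current value per aggregate name; value types may differ between names.
    private final Map<String, Object> aggregates = new ConcurrentHashMap<>();

    /** Folds the aggregand into the named aggregate and returns the new value. */
    @SuppressWarnings("unchecked")
    <T> CompletableFuture<T> updateAggregate(
            String aggregateName, T aggregand, BiFunction<T, T, T> aggregationFunction) {
        Object updated = aggregates.merge(
                aggregateName,
                aggregand,
                (current, incoming) -> aggregationFunction.apply((T) current, (T) incoming));
        return CompletableFuture.completedFuture((T) updated);
    }
}
```

   A caller tracking the minimum watermark might use 
`updateAggregate("minWatermark", 500L, Long::min)`, while another named 
aggregate could use a different type and function entirely.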
   




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707342#comment-16707342
 ] 

ASF GitHub Bot commented on FLINK-10887:


jgrier edited a comment on issue #7099: [FLINK-10887] [jobmaster] Add source 
watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#issuecomment-443739811
 
 
   Sorry I haven't responded to this.  We had a baby boy this week so that has 
kept me pretty busy ;)
   
   Okay, so I'm on board with generifying this further.  @StephanEwen if we're 
to do a generic transient aggregator do you mean to allow the client to provide 
the aggregation function?  In this case the API would look something like this:
   
   ```
   /**
    * Update the aggregate and return the new value.
    *
    * @param aggregateName The name of the aggregate to update
    * @param aggregand The value to add to the aggregate
    * @param aggregationFunction The function to apply to the current aggregate
    *     and aggregand to obtain the new aggregate value
    * @return The updated aggregate
    */
   CompletableFuture updateAggregate(
       String aggregateName,
       Object aggregand,
       AggregateFunction aggregationFunction);
   ```
   
   Is something like this what you had in mind?
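
   One way such an accumulator-based variant could behave is sketched below. 
`SimpleAggregateFunction` is a simplified, hypothetical stand-in for Flink's 
`AggregateFunction` (its `merge` method is left out), and `NamedAccumulators` is 
an illustrative in-memory store, not the JobMaster code.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a per-name accumulator store driven by a user-supplied function. */
final class NamedAccumulators {

    /** Simplified stand-in for Flink's AggregateFunction (merge() is left out). */
    interface SimpleAggregateFunction<IN, ACC, OUT> extends Serializable {
        ACC createAccumulator();

        ACC add(IN value, ACC accumulator);

        OUT getResult(ACC accumulator);
    }

    private final Map<String, Object> accumulators = new HashMap<>();

    /** Applies the function to the named accumulator and returns the new result. */
    @SuppressWarnings("unchecked")
    synchronized <IN, ACC, OUT> OUT updateAggregate(
            String aggregateName, IN aggregand, SimpleAggregateFunction<IN, ACC, OUT> function) {
        ACC accumulator = (ACC) accumulators.get(aggregateName);
        if (accumulator == null) {
            accumulator = function.createAccumulator();
        }
        accumulator = function.add(aggregand, accumulator);
        accumulators.put(aggregateName, accumulator);
        return function.getResult(accumulator);
    }

    /** Example function: tracks the maximum of all reported values. */
    static final class MaxLong implements SimpleAggregateFunction<Long, Long, Long> {
        @Override
        public Long createAccumulator() {
            return Long.MIN_VALUE;
        }

        @Override
        public Long add(Long value, Long accumulator) {
            return Math.max(value, accumulator);
        }

        @Override
        public Long getResult(Long accumulator) {
            return accumulator;
        }
    }
}
```

   The function travels with each call, so the receiving side needs its class 
on the class path, which is the availability concern raised elsewhere in this 
thread.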
   




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707326#comment-16707326
 ] 

ASF GitHub Bot commented on FLINK-10887:


jgrier edited a comment on issue #7099: [FLINK-10887] [jobmaster] Add source 
watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#issuecomment-443739811
 
 
   Sorry I haven't responded to this.  We had a baby boy this week so that has 
kept me pretty busy ;)
   
   Okay, so I'm on board with generifying this further.  @StephanEwen if we're 
to do a generic transient aggregator do you mean to allow the client to provide 
the aggregation function?  In this case the API would look something like this:
   
   `CompletableFuture updateAggregate(
 String aggregateName,
 Object aggregand,
 Function aggregationFunction);`
   
   Is this what you had in mind?
   




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707324#comment-16707324
 ] 

ASF GitHub Bot commented on FLINK-10887:


jgrier edited a comment on issue #7099: [FLINK-10887] [jobmaster] Add source 
watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#issuecomment-443739811
 
 
   Sorry I haven't responded to this.  We had a baby boy this week so that has 
kept me pretty busy ;)
   
   Okay, so I'm on board with generifying this further.  @StephanEwen if we're 
to do a generic transient aggregator do you mean to allow the client to provide 
the aggregation function?  In this case the API would look something like this:
   
   `CompletableFuture updateAggregate(
 String aggregateName,
 Object aggregand,
 BiFunction aggregationFunction);`
   
   Is this what you had in mind?
   




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-12-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707328#comment-16707328
 ] 

ASF GitHub Bot commented on FLINK-10887:


jgrier edited a comment on issue #7099: [FLINK-10887] [jobmaster] Add source 
watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#issuecomment-443739811
 
 
   Sorry I haven't responded to this.  We had a baby boy this week so that has 
kept me pretty busy ;)
   
   Okay, so I'm on board with generifying this further.  @StephanEwen if we're 
to do a generic transient aggregator do you mean to allow the client to provide 
the aggregation function?  In this case the API would look something like this:
   
   ```
   CompletableFuture updateAggregate(
 String aggregateName,
 Object aggregand,
 Function aggregationFunction);
   ```
   
   Is this what you had in mind?
   




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-11-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16698339#comment-16698339
 ] 

ASF GitHub Bot commented on FLINK-10887:


StephanEwen commented on issue #7099: [FLINK-10887] [jobmaster] Add source 
watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#issuecomment-441478603
 
 
   I think this is a very nice feature, +1 to have this.
   
   We have seen other use cases that need a similar mechanism, so I am 
wondering if we can generify this to some transient aggregator. One of those 
use cases would need the max across all values and is otherwise almost the same.




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-11-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695332#comment-16695332
 ] 

ASF GitHub Bot commented on FLINK-10887:


jgrier commented on a change in pull request #7099: [FLINK-10887] [jobmaster] 
Add source watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#discussion_r235559380
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/watermark/SourceWatermark.java
 ##
 @@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.watermark;
+
+import java.io.Serializable;
+
+/**
+ * This represents the watermark for a single source partition.
+ */
+public class SourceWatermark implements Serializable {
 
 Review comment:
   Sounds good.  Will update shortly.




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-11-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16694119#comment-16694119
 ] 

ASF GitHub Bot commented on FLINK-10887:


tweise commented on a change in pull request #7099: [FLINK-10887] [jobmaster] 
Add source watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#discussion_r235234375
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/watermark/SourceWatermark.java
 ##
 @@ -0,0 +1,68 @@
+package org.apache.flink.runtime.watermark;
+
+import java.io.Serializable;
+
+/**
+ * This represents the watermark for a single source partition.
+ */
+public class SourceWatermark implements Serializable {
 
 Review comment:
   I could imagine scenarios where different sources have different 
synchronization. That could be supported with a grouping mechanism for the 
tasks that participate in the watermark sync. The RPC would pass the 
group/namespace identifier as an additional parameter and only get back the 
watermark for that group (the hash table would remain internal).
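
   The grouping idea could look roughly like this. It is a sketch under the 
assumption that groups are identified by a plain string; `GroupedWatermarkTracker` 
and its `report` method are hypothetical names and not part of the pull request.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of per-group watermark tracking; names are illustrative. */
final class GroupedWatermarkTracker {

    // group id -> (sub-task index -> latest watermark); never exposed to callers.
    private final Map<String, Map<Integer, Long>> groups = new ConcurrentHashMap<>();

    /** Reports a watermark within one group and returns that group's minimum only. */
    long report(String group, int subtaskIndex, long watermark) {
        Map<Integer, Long> members =
                groups.computeIfAbsent(group, id -> new ConcurrentHashMap<>());
        // Watermarks are assumed to be monotonic, so keep the larger value.
        members.merge(subtaskIndex, watermark, Long::max);
        return members.values().stream()
                .mapToLong(Long::longValue)
                .min()
                .orElse(watermark);
    }
}
```

   Sources in different groups then never influence each other's read rates, 
and the caller sees a single value rather than the internal table.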




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-11-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688147#comment-16688147
 ] 

ASF GitHub Bot commented on FLINK-10887:


jgrier commented on a change in pull request #7099: [FLINK-10887] [jobmaster] 
Add source watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#discussion_r233862737
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/watermark/SourceWatermark.java
 ##
 @@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.watermark;
+
+import java.io.Serializable;
+
+/**
+ * This represents the watermark for a single source partition.
+ */
+public class SourceWatermark implements Serializable {
 
 Review comment:
   Yeah, my intention was to keep this very focused on the exact use case at 
hand -- to provide simple state sharing for watermarks in the service of the 
source synchronization effort.  That is why the naming is so specific and why 
there are no additional features like namespaces, etc.
   
   If we were to generalize this more it would be good to understand some other 
specific use cases -- and also to consider whether it's important to tackle 
that here or just go with the simplest interface we need for the task at hand.
   
   @tweise @aljoscha If we do something more general what are you thinking?  
Something more like a hash table or a collection of namespaced hashtables?  
Would we need to make the key and value types generic, etc?  Would we want to 
then distribute the entire hashtable to every sub-task?
   
   




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-11-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688127#comment-16688127
 ] 

ASF GitHub Bot commented on FLINK-10887:


jgrier commented on a change in pull request #7099: [FLINK-10887] [jobmaster] 
Add source watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#discussion_r233857870
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/watermark/SourceWatermark.java
 ##
 @@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.watermark;
+
+import java.io.Serializable;
+
+/**
+ * This represents the watermark for a single source partition.
+ */
+public class SourceWatermark implements Serializable {
+
+   private static final long serialVersionUID = 1L;
+   private long timestamp;
 
 Review comment:
   The timestamp here is meant to represent the watermark itself -- the current 
low watermark for the sub-task that sent it.
   
   I do agree, however, that we will also need to know at what time the 
watermark was sent so that we can ignore it if it hasn't been updated in some 
configurable amount of time.
   
   Very good point.
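One way to handle the staleness concern is sketched below, under stated assumptions: the class name `StalenessAwareWatermarkTracker`, the `maxAgeMillis` setting, and passing the clock value in as a parameter are all illustrative choices, not part of the PR. Each report stores the watermark together with the time it was received, and the global minimum skips reports older than the configured age.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical sketch: a global minimum that ignores subtasks that have not reported recently. */
class StalenessAwareWatermarkTracker {

    /** The watermark value plus the wall-clock time at which it was reported. */
    private static final class Report {
        final long watermark;
        final long reportedAtMillis;

        Report(long watermark, long reportedAtMillis) {
            this.watermark = watermark;
            this.reportedAtMillis = reportedAtMillis;
        }
    }

    private final Map<Integer, Report> reports = new HashMap<>();
    private final long maxAgeMillis;

    StalenessAwareWatermarkTracker(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Stores the latest watermark for a subtask along with the time of the report. */
    synchronized void report(int subtaskIndex, long watermark, long nowMillis) {
        reports.put(subtaskIndex, new Report(watermark, nowMillis));
    }

    /** Minimum over reports no older than maxAgeMillis; empty if every report is stale. */
    synchronized OptionalLong globalMin(long nowMillis) {
        return reports.values().stream()
            .filter(r -> nowMillis - r.reportedAtMillis <= maxAgeMillis)
            .mapToLong(r -> r.watermark)
            .min();
    }
}
```

Passing `nowMillis` explicitly keeps the sketch deterministic to test; a real implementation would more likely read a clock on the receiving side.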




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-11-15 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687911#comment-16687911
 ] 

ASF GitHub Bot commented on FLINK-10887:


aljoscha commented on a change in pull request #7099: [FLINK-10887] [jobmaster] 
Add source watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#discussion_r233811969
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/watermark/SourceWatermark.java
 ##
 @@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.watermark;
+
+import java.io.Serializable;
+
+/**
+ * This represents the watermark for a single source partition.
+ */
+public class SourceWatermark implements Serializable {
 
 Review comment:
   One problem is that there is already a `Watermark` class, but I agree with 
Thomas' comment. In the future, not all "sources" might be actual physical 
sources in the pipeline.




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-11-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687500#comment-16687500
 ] 

ASF GitHub Bot commented on FLINK-10887:


tweise commented on a change in pull request #7099: [FLINK-10887] [jobmaster] 
Add source watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#discussion_r233714795
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/watermark/SourceWatermark.java
 ##
 @@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.watermark;
+
+import java.io.Serializable;
+
+/**
+ * This represents the watermark for a single source partition.
+ */
+public class SourceWatermark implements Serializable {
 
 Review comment:
   I wonder if it should be qualified as `SourceWatermark` vs. just 
`Watermark`? Perhaps there are use cases for exchanging watermarks across 
subtasks that don't necessarily belong to a source. One such example could be 
operators that perform asynchronous operations. Related, do we want to allow 
for an identifier for the watermark so that within an application multiple 
independent groupings could be formed? 




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-11-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687502#comment-16687502
 ] 

ASF GitHub Bot commented on FLINK-10887:


tweise commented on a change in pull request #7099: [FLINK-10887] [jobmaster] 
Add source watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#discussion_r233715375
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/watermark/SourceWatermark.java
 ##
 @@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.watermark;
+
+import java.io.Serializable;
+
+/**
+ * This represents the watermark for a single source partition.
+ */
+public class SourceWatermark implements Serializable {
 
 Review comment:
   This may be a bit far-fetched, but can it be generalized further to 
something like a named counter/metric? Currently there isn't anything 
watermark-specific here.
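As a rough illustration of what such a generalization might look like, the sketch below keeps one long value per name and lets the caller supply the combine function. The class `NamedAggregates` and its `update` method are hypothetical names chosen for this example, not an existing Flink API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongBinaryOperator;

/** Hypothetical sketch: named long-valued aggregates with a caller-supplied combine function. */
class NamedAggregates {

    private final Map<String, Long> values = new HashMap<>();

    /**
     * Folds the update into the aggregate stored under the given name and
     * returns the new value. The first update for a name is stored as-is.
     */
    synchronized long update(String name, long update, LongBinaryOperator combine) {
        Long current = values.get(name);
        long next = (current == null) ? update : combine.applyAsLong(current, update);
        values.put(name, next);
        return next;
    }
}
```

Note that a plain fold like this covers counters and running maxima, but not the minimum over each subtask's current watermark: that needs per-subtask state, so the stored value would have to be a map rather than a single long.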




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-11-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687496#comment-16687496
 ] 

ASF GitHub Bot commented on FLINK-10887:


tweise commented on a change in pull request #7099: [FLINK-10887] [jobmaster] 
Add source watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099#discussion_r233714283
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/watermark/SourceWatermark.java
 ##
 @@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.watermark;
+
+import java.io.Serializable;
+
+/**
+ * This represents the watermark for a single source partition.
+ */
+public class SourceWatermark implements Serializable {
+
+   private static final long serialVersionUID = 1L;
+   private long timestamp;
 
 Review comment:
   What does the timestamp represent? Is it when the watermark last changed or 
when it was last communicated by the subtask (even if it did not change, for 
example because the subtask is just reading a lot of data under the same 
watermark)? We will need a way to detect that a source subtask is idle so we 
can avoid waiting for it (similar to how we have to identify idleness within a 
subtask).




[jira] [Commented] (FLINK-10887) Add source watermark tracking to the JobMaster

2018-11-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687311#comment-16687311
 ] 

ASF GitHub Bot commented on FLINK-10887:


jgrier opened a new pull request #7099: [FLINK-10887] [jobmaster] Add source 
watermark tracking to the JobMaster
URL: https://github.com/apache/flink/pull/7099
 
 
   ## What is the purpose of the change
   
   This commit adds a JobMaster RPC endpoint that is used to share information 
across source subtasks regarding event time progress.
   
   This will be used to implement event time source synchronization across sources.
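For readers who want a picture of the idea, the following is a minimal sketch and not the code in this PR. It assumes a hypothetical `SourceWatermarkStats` class on the receiving side: each source sub-task reports its current watermark and gets back the global minimum and maximum over the latest report from every sub-task.

```java
import java.util.HashMap;
import java.util.LongSummaryStatistics;
import java.util.Map;

/** Hypothetical sketch: latest watermark per source sub-task with global min/max. */
class SourceWatermarkStats {

    // subtask index -> most recently reported watermark
    private final Map<Integer, Long> latest = new HashMap<>();

    /**
     * Records the sub-task's current watermark and returns statistics
     * (including min and max) over the latest report of every sub-task.
     */
    synchronized LongSummaryStatistics report(int subtaskIndex, long watermark) {
        latest.put(subtaskIndex, watermark);
        return latest.values().stream()
            .mapToLong(Long::longValue)
            .summaryStatistics();
    }
}
```

A source that finds its own watermark far ahead of the returned minimum could slow its reads until the slower sub-tasks catch up.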
   
   ## Brief change log
 - New RPC endpoint on JobMaster to track event time progress across source 
sub-tasks
 - Updated JobMaster Tests
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
 - Added a test case to JobMasterTest to ensure watermark stats are computed 
correctly
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): No
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: No
 - The serializers: No
 - The runtime per-record code paths (performance sensitive): No
 - Anything that affects deployment or recovery: No
 - The S3 file system connector: No
   
   ## Documentation
   
 - Does this pull request introduce a new feature? No
 - If yes, how is the feature documented? not applicable
   

