[jira] [Commented] (FLINK-4020) Remove shard list querying from Kinesis consumer constructor

2016-06-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338614#comment-15338614
 ] 

ASF GitHub Bot commented on FLINK-4020:
---

Github user tzulitai closed the pull request at:

https://github.com/apache/flink/pull/2081


> Remove shard list querying from Kinesis consumer constructor
> 
>
> Key: FLINK-4020
> URL: https://issues.apache.org/jira/browse/FLINK-4020
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kinesis Connector, Streaming Connectors
>Reporter: Tzu-Li (Gordon) Tai
>Assignee: Tzu-Li (Gordon) Tai
> Fix For: 1.1.0
>
>
> Currently FlinkKinesisConsumer queries the whole list of shards in the 
> constructor, which forces the client to be able to access Kinesis as well. 
> This is also a drawback for handling Kinesis-side resharding, since we'd want 
> all shard listing / shard-to-task assignment / shard end (result of 
> resharding) handling logic to run independently within task life cycle 
> methods, with well-defined results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4020) Remove shard list querying from Kinesis consumer constructor

2016-06-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338613#comment-15338613
 ] 

ASF GitHub Bot commented on FLINK-4020:
---

Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/2081
  
Hi @rmetzger,
Update: I'm closing this PR now. The new PR with FLINK-4020 & FLINK-3231 is 
at https://github.com/apache/flink/pull/2131.




[jira] [Commented] (FLINK-4020) Remove shard list querying from Kinesis consumer constructor

2016-06-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329401#comment-15329401
 ] 

ASF GitHub Bot commented on FLINK-4020:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2081
  
Okay, thank you. I'll wait then.




[jira] [Commented] (FLINK-4020) Remove shard list querying from Kinesis consumer constructor

2016-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328943#comment-15328943
 ] 

ASF GitHub Bot commented on FLINK-4020:
---

Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/2081
  
Hi @rmetzger,
Thanks for letting me know. However, I'd like to close this PR for now for 
the following reasons:

1. The new shard-to-subtask assignment logic introduced with this change 
will actually need to be moved again to run() as part of implementing Kinesis 
reshard handling [FLINK-3231](https://issues.apache.org/jira/browse/FLINK-3231).
2. I've tested this change a bit more on Kinesis streams with high shard 
counts, and it seems the implementation needs a stronger guarantee that all 
subtasks will be able to get the shard list without failing with Amazon's 
LimitExceededException even after 3 retries. Since the implementation for 
FLINK-3231 will have a separate thread that polls for changes in the shard 
list, I'd like to strengthen this guarantee as part of FLINK-3231's PR.

I'm almost done with FLINK-3231, and will reopen a PR to resolve FLINK-3231 
and FLINK-4020 together. I'll keep you updated!
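
One common way to make the shard list call more robust against throttling is to retry with exponential backoff plus random jitter, so that many subtasks do not retry in lockstep. The following is only a minimal sketch under stated assumptions, not the approach taken in the PR: `RateLimitedException` and `ShardListRetry` are hypothetical names that stand in for a throttling error such as LimitExceededException and for whatever retry helper the connector actually uses.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

/** Hypothetical stand-in for a throttling error such as LimitExceededException. */
class RateLimitedException extends RuntimeException {
    RateLimitedException(String message) {
        super(message);
    }
}

/** Hypothetical helper: retries a call with exponential backoff and jitter. */
final class ShardListRetry {
    private ShardListRetry() {}

    static <T> T callWithBackoff(Supplier<T> call, int maxAttempts, long baseDelayMillis) {
        if (maxAttempts < 1 || baseDelayMillis < 0) {
            throw new IllegalArgumentException("maxAttempts must be >= 1 and baseDelayMillis >= 0");
        }
        RateLimitedException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RateLimitedException e) {
                last = e;
                if (attempt == maxAttempts - 1) {
                    break; // no point sleeping after the final attempt
                }
                // Delay doubles on every attempt; jitter spreads out concurrent callers.
                long backoff = baseDelayMillis << attempt;
                long jitter = ThreadLocalRandom.current().nextLong(baseDelayMillis + 1);
                try {
                    Thread.sleep(backoff + jitter);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while backing off", ie);
                }
            }
        }
        throw last; // all attempts were throttled
    }
}
```

A caller would wrap the shard listing call in a `Supplier` and pass an attempt limit and base delay suited to the stream's shard count; only the throttling error is retried, and any other exception propagates immediately.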




[jira] [Commented] (FLINK-4020) Remove shard list querying from Kinesis consumer constructor

2016-06-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322207#comment-15322207
 ] 

ASF GitHub Bot commented on FLINK-4020:
---

Github user rmetzger commented on the issue:

https://github.com/apache/flink/pull/2081
  
I'll try to review this change soon.




[jira] [Commented] (FLINK-4020) Remove shard list querying from Kinesis consumer constructor

2016-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15320388#comment-15320388
 ] 

ASF GitHub Bot commented on FLINK-4020:
---

GitHub user tzulitai opened a pull request:

https://github.com/apache/flink/pull/2081

[FLINK-4020][streaming-connectors] Move shard list querying to open() for 
Kinesis consumer

Remove shard list querying from the constructor, and let each subtask 
independently discover which shards it should consume from in open(). This 
change is a prerequisite for 
[FLINK-3231](https://issues.apache.org/jira/browse/FLINK-3231).
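
The idea of deferring shard discovery can be illustrated with a small, self-contained sketch. It is not the connector's actual code: `ShardLister`, `LazyShardConsumer`, and the modulo-based selection are hypothetical, chosen only to show a constructor that makes no remote calls and an open() step in which each subtask picks its own shards.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a client call that lists the shard IDs of a stream. */
interface ShardLister {
    List<String> listShards(String streamName);
}

/** Sketch of a consumer that defers shard discovery from the constructor to open(). */
class LazyShardConsumer {
    private final String streamName;
    private final ShardLister lister;
    private List<String> discoveredShards = new ArrayList<>();

    /** Only stores configuration; the stream is not contacted here. */
    LazyShardConsumer(String streamName, ShardLister lister) {
        this.streamName = streamName;
        this.lister = lister;
    }

    /**
     * Called once per parallel subtask. Each subtask lists all shards and keeps
     * those whose position maps to its own index (an assumed assignment rule).
     */
    void open(int subtaskIndex, int parallelism) {
        List<String> allShards = lister.listShards(streamName);
        List<String> mine = new ArrayList<>();
        for (int i = 0; i < allShards.size(); i++) {
            if (i % parallelism == subtaskIndex) {
                mine.add(allShards.get(i));
            }
        }
        discoveredShards = mine;
    }

    List<String> getDiscoveredShards() {
        return discoveredShards;
    }
}
```

With this shape, constructing the consumer on the client needs no access to the stream; the listing call only happens when a subtask runs open().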

Explanation for some changes that might seem irrelevant:
1. Renamed some variables / methods: Since the behaviour of shard 
assignment to subtasks is now (and will continue to be after FLINK-3231) 
more like "discovering shards for consuming" than "being assigned shards", 
I've renamed the "assignedShards"-related identifiers to 
"discoveredShards".
2. I've removed some tests, because the corresponding parts of the code 
will change substantially with the upcoming work on 
[FLINK-3231](https://issues.apache.org/jira/browse/FLINK-3231). Tests will be 
added back with FLINK-3231.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tzulitai/flink FLINK-4020

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2081.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2081


commit 1db426be73f572aec2041cb1a9da6ad49425f392
Author: Gordon Tai 
Date:   2016-06-08T10:46:02Z

[FLINK-4020] Move shard list querying to open() for Kinesis consumer



