Hello Thomas,

Thank you for your vote and feedback on the FLIP.

Q: Do you see the polling (non-EFO) mode as a permanent option going forward?
A: I have forwarded the question and will follow up. Generally speaking, AWS
does not usually deprecate APIs, so KDS (Kinesis Data Streams) will likely
always support the polling mechanism. The open question is whether we want to
keep supporting it within the Flink connector. I will get back to you.

Q: Perhaps elaborate a bit more on limitations and reasoning for the different 
registration options?
A: I was planning to elaborate on this in the updated documentation that will
be published to the Flink website. Would you like me to update the FLIP to
include this information in advance?

Q: That won't cause an issue because a stale registration will be 
overridden/removed by a new job with the same name?
A: Yes, exactly. When the consumer name is already registered, either by the
user or left over from a previous shutdown that ended in error, the first
ListStreamConsumers call will find the consumer in an ACTIVE state and retrieve
the ConsumerARN. Tasks will subsequently use that ARN to obtain a subscription
(this new subscription will invalidate any existing ones).
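
To make the lookup order concrete, here is a minimal Python sketch. It is not
the connector's implementation. StubKinesis is a hypothetical in-memory
stand-in whose method names and response keys are assumed to mirror the boto3
Kinesis client, and resolve_consumer_arn is an illustrative name. As a
simplification, the stub returns a newly registered consumer as ACTIVE straight
away and ignores pagination.

```python
class StubKinesis:
    """Hypothetical in-memory stand-in for a Kinesis client (not the real SDK)."""

    def __init__(self, consumers=None):
        self.consumers = list(consumers or [])
        self.calls = []  # records which API operations were invoked, in order

    def list_stream_consumers(self, StreamARN):
        self.calls.append("ListStreamConsumers")
        return {"Consumers": list(self.consumers)}

    def register_stream_consumer(self, StreamARN, ConsumerName):
        self.calls.append("RegisterStreamConsumer")
        consumer = {
            "ConsumerName": ConsumerName,
            "ConsumerARN": f"{StreamARN}/consumer/{ConsumerName}",
            # Simplification: reported as ACTIVE immediately in this stub.
            "ConsumerStatus": "ACTIVE",
        }
        self.consumers.append(consumer)
        return {"Consumer": consumer}


def resolve_consumer_arn(client, stream_arn, consumer_name):
    """Reuse an ACTIVE registration with the same name; register only if absent."""
    listed = client.list_stream_consumers(StreamARN=stream_arn)
    for consumer in listed["Consumers"]:
        same_name = consumer["ConsumerName"] == consumer_name
        if same_name and consumer["ConsumerStatus"] == "ACTIVE":
            # A stale registration from a previous job is simply reused.
            return consumer["ConsumerARN"]
    registered = client.register_stream_consumer(
        StreamARN=stream_arn, ConsumerName=consumer_name
    )
    return registered["Consumer"]["ConsumerARN"]
```

With a stale registration present, only the list call is made and the existing
ARN is returned. With no registration, the sketch falls through to a register
call.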

Thanks,
Danny

On 06/07/2020, 20:36, "Thomas Weise" <t...@apache.org> wrote:

    Thanks for the excellent proposal!

    Big +1 for introducing EFO as an incremental feature while retaining
    backward compatibility! This will make it easier for users to adopt.

    Thanks for mentioning the reasons why one might not want to use EFO.
    Regarding "Streams with a single consumer would not benefit from the
    dedicated throughput (they already have the full quota)": Do you see the
    polling (non-EFO) mode as a permanent option going forward?

    Regarding "Registration/De-registration Configuration":

    The limit for "ListStreamConsumers" is 5 TPS per [1], which is even lower
    than that for "DescribeStream". That limit could cause significant issues
    during large scale job startup and the only solution was to switch to
    ListShards. Perhaps elaborate a bit more on limitations and reasoning for
    the different registration options?

    De-registration may never happen when task managers go into a bad state and
    are forcefully terminated. That won't cause an issue because a stale
    registration will be overridden/removed by a new job with the same name?

    Thanks,
    Thomas


    [1]
    https://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html

    On Mon, Jun 22, 2020 at 2:42 AM Cranmer, Danny <cranm...@amazon.com.invalid>
    wrote:

    > Hello everyone,
    > This is a discussion thread for the FLIP [1] regarding Enhanced Fan Out
    > for AWS Kinesis Consumers.
    >
    > Enhanced Fan Out (EFO) allows AWS Kinesis Data Stream (KDS) consumers to
    > utilise a dedicated read throughput, rather than a shared quota. HTTP/2
    > reduces latency and typically gives a 65% performance boost [2]. EFO is not
    > currently supported by the Flink Kinesis Consumer. Adding EFO support would
    > allow Flink applications to reap the benefits, widening Flink adoption.
    > Existing applications will be able to optionally perform a backwards
    > compatible library upgrade and configuration tweak to inherit the
    > performance benefits.
    > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-128%3A+Enhanced+Fan+Out+for+AWS+Kinesis+Consumers
    > [2] https://aws.amazon.com/blogs/aws/kds-enhanced-fanout/
    > I look forward to your feedback,
    > Thanks,
    >
