[ 
https://issues.apache.org/jira/browse/FLINK-11617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16768330#comment-16768330
 ] 

Jamie Grier commented on FLINK-11617:
-------------------------------------

Here's an example:

The stack trace is:

{{java.lang.RuntimeException: Rate Exceeded for getRecords operation - all 3 retry attempts returned ProvisionedThroughputExceededException.}}
{{  at org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getRecords(KinesisProxy.java:234)}}
{{  at org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.getRecords(ShardConsumer.java:373)}}
{{  at org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.run(ShardConsumer.java:216)}}
{{  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)}}
{{  at java.util.concurrent.FutureTask.run(FutureTask.java:266)}}
{{  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)}}
{{  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)}}
{{  at java.lang.Thread.run(Thread.java:748)}}

But the root cause is actually given by this log line:

{{Got recoverable SdkClientException. Backing off for 140 millis (null (Service: AmazonKinesis; Status Code: 500; Error Code: InternalFailure; Request ID: c49c8e5b-a068-9733-9043-b215d51b0aa1))}}
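
One way the final exception could surface the real cause is sketched below. This is only an illustration under stated assumptions: the class RetrySketch and the method callWithRetries are hypothetical names and are not the actual KinesisProxy code. The idea is to remember the last failure seen during the retries, name it in the message, and attach it as the cause instead of hardcoding ProvisionedThroughputExceededException.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration only; not the actual KinesisProxy code. */
final class RetrySketch {

    private RetrySketch() {}

    /**
     * Calls the operation up to maxAttempts times. If every attempt fails,
     * the thrown exception names the last failure and keeps it as the cause,
     * so the stack trace shows what actually went wrong.
     */
    static <T> T callWithRetries(String operationName, int maxAttempts, Callable<T> operation) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                // Remember the most recent failure instead of assuming its type.
                // A real implementation would also back off here and retry
                // only errors it considers recoverable.
                lastFailure = e;
            }
        }
        throw new RuntimeException(
                "Retries exceeded for " + operationName + " operation - all " + maxAttempts
                        + " retry attempts failed. Last failure: " + lastFailure,
                lastFailure);
    }
}
```

With something along these lines, a failure like the InternalFailure above would show up both in the exception message and as a "Caused by" section of the stack trace.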

> Kinesis Connector getRecords() failure logging is misleading
> ------------------------------------------------------------
>
>                 Key: FLINK-11617
>                 URL: https://issues.apache.org/jira/browse/FLINK-11617
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kinesis Connector
>    Affects Versions: 1.5.6, 1.6.3, 1.7.1
>            Reporter: Jamie Grier
>            Assignee: Jamie Grier
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current logging does not provide enough information to diagnose a 
> getRecords() failure.  In addition, a hardcoded string states that the 
> failure cause was always ProvisionedThroughputExceededException, which is 
> not true: a failure can have many possible causes, so the message is misleading.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
