Hi,
Here is my current setup:

Kinesis stream 1 -----> Kinesis Analytics Flink -----> Kinesis stream 2
                                   |
                                   ----> Firehose Delivery stream

Curl error:
org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.producer.LogInputStreamReader
 - [2020-07-02 15:22:32.203053] [0x000007f4][0x00007ffbced15700] [error]
[AWS Log: ERROR](CurlHttpClient)Curl returned error code 28

But I am still seeing tons of the curl error 28s. I use a parallelism of 80
for the sink to the Kinesis Data Stream (KDS), which seems to point to KDS
being pounded with too many requests: 80 (parallelism) * 10 (ThreadPool size)
= 800 concurrent requests. Is my understanding correct? If so, maybe I should
reduce the parallelism of 80 for just the sink (sketch below)?
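
For reference, a minimal sketch (not my actual app) of capping only the
sink's parallelism independently of the job-wide 80; the region, stream
name, and variable names are placeholders:

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer;
import org.apache.flink.streaming.connectors.kinesis.config.AWSConfigConstants;

import java.util.Properties;

public class KdsSinkParallelismSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Stand-in for the processed stream that the real pipeline produces.
        DataStream<String> processed = env.fromElements("record-1", "record-2");

        Properties producerConfig = new Properties();
        producerConfig.put(AWSConfigConstants.AWS_REGION, "us-east-1"); // placeholder region

        FlinkKinesisProducer<String> kdsSink =
                new FlinkKinesisProducer<>(new SimpleStringSchema(), producerConfig);
        kdsSink.setDefaultStream("kinesis-stream-2"); // placeholder output stream name
        kdsSink.setDefaultPartition("0");

        // Cap only the sink's parallelism: e.g. 20 subtasks * 10 KPL threads
        // = 200 concurrent requests instead of 80 * 10 = 800.
        processed.addSink(kdsSink)
                 .name("kds-sink")
                 .setParallelism(20);

        env.execute("kds-sink-parallelism-sketch");
    }
}
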
I still don't understand why the logs are stuck on just FlinkKinesisProducer
for around 4 s (blocking calls???) while the rest of the Flink Analytics
application produces no logs. In Kibana I noticed gaps of about 3.785 s,
3.984 s, and 4.223 s around the FlinkKinesisProducer entries, between other
application logs, when the Kinesis GetIterator Age peaked. It looked like
FlinkKinesisProducer was blocking for that long, and during that time the
Flink app was not able to generate any other logs.

Looked at this:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kinesis.html#backpressure
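
One thing from that page I could try is bounding the producer's internal
queue so the sink backpressures the upstream operators instead of letting
records pile up inside the KPL. A sketch, assuming kdsSink is the
FlinkKinesisProducer instance (the 1000 is an arbitrary starting point,
not a tested value):

// Bound the number of in-flight records buffered per sink subtask so the
// sink exerts backpressure instead of queueing inside the KPL.
kdsSink.setQueueLimit(1000);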

I could use this:
producerConfig.put("RequestTimeout", "10000"); // up from the default of 6000 ms

But that doesn't really solve the problem when I'm trying to maintain a
real-time processing system.
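
For completeness, other KPL knobs that the same producerConfig could carry,
as a sketch only (the values are illustrative guesses, not tested
recommendations):

// Extra KPL settings passed through the FlinkKinesisProducer config Properties.
producerConfig.put("ThreadPoolSize", "10");       // KPL thread pool per sink subtask
producerConfig.put("AggregationEnabled", "true"); // aggregate user records to cut the request count against KDS
producerConfig.put("RecordTtl", "30000");         // give up on records older than 30 s instead of retrying indefinitely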

TIA
