Hi Harrison,
Really sorry for the late reply.
Do you have any insight into whether the missing records were read by
the consumer and the StreamingFileSink simply failed to write the records
at those offsets, or whether the Kafka consumer never read them or dropped
them for some reason? I'm asking this in order to
Thank you for your reply,
Some clarification:
We have configured the BucketAssigner to use the *Kafka record timestamp*.
The exact bucketing behavior is as follows:
private static final DateTimeFormatter formatter = DateTimeFormatter
        .ofPattern("yyyy-MM-dd'T'HH");
@Override
public String
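For reference, a self-contained assigner of that shape might look roughly like the sketch below. This is only a sketch: the class name is illustrative, and it assumes the Kafka record timestamp reaches the assigner as the element timestamp via context.timestamp().

import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import org.apache.flink.core.io.SimpleVersionedSerializer;
import org.apache.flink.streaming.api.functions.sink.filesystem.BucketAssigner;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.SimpleVersionedStringSerializer;

// Illustrative name; buckets records by the hour of their attached timestamp.
public class RecordTimestampBucketAssigner<IN> implements BucketAssigner<IN, String> {

    private static final DateTimeFormatter formatter =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH").withZone(ZoneOffset.UTC);

    @Override
    public String getBucketId(IN element, BucketAssigner.Context context) {
        // Assumes the Kafka record timestamp was attached to the element by the
        // source and is surfaced here as context.timestamp(); it can be null if
        // no timestamp was attached.
        return formatter.format(Instant.ofEpochMilli(context.timestamp()));
    }

    @Override
    public SimpleVersionedSerializer<String> getSerializer() {
        return SimpleVersionedStringSerializer.INSTANCE;
    }
}

With an assigner like this, a bucket directory such as 2019-11-24T01 only comes into existence once at least one record's timestamp maps to that hour.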
Hi Harrison,
One thing to keep in mind is that Flink will only write files if there
is data to write. If, for example, your partition is not active for a
period of time, then no files will be written.
Could this explain the fact that dt 2019-11-24T01 and 2019-11-24T02
are entirely skipped?
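One way to check that hypothesis would be to ask the broker whether the partition even contains records whose timestamps fall inside those hours, for example via KafkaConsumer#offsetsForTimes (available since the 0.10.1 client). A rough sketch, with placeholder bootstrap servers, topic, and partition, and assuming the buckets are formatted in UTC:

import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;

public class BucketGapCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");   // placeholder
        props.put("group.id", "bucket-gap-check");       // placeholder
        props.put("key.deserializer", ByteArrayDeserializer.class.getName());
        props.put("value.deserializer", ByteArrayDeserializer.class.getName());

        // 2019-11-24T01:00:00Z, assuming UTC bucket times.
        long hourStart = 1_574_557_200_000L;
        long hourEnd = hourStart + 3_600_000L;

        TopicPartition tp = new TopicPartition("my-topic", 0); // placeholder topic/partition

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            // Earliest offset whose record timestamp is >= hourStart, per partition.
            Map<TopicPartition, OffsetAndTimestamp> result =
                    consumer.offsetsForTimes(Collections.singletonMap(tp, hourStart));
            OffsetAndTimestamp oat = result.get(tp);
            if (oat == null || oat.timestamp() >= hourEnd) {
                System.out.println("partition has no records with timestamps in that hour");
            } else {
                System.out.println("first record in that hour: offset " + oat.offset()
                        + ", timestamp " + oat.timestamp());
            }
        }
    }
}

If this finds no record timestamped inside 2019-11-24T01 on any partition, the "no data for that hour" explanation fits; if it does find one, the records existed in Kafka and went missing somewhere downstream.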
Hello,
We're seeing some strange behavior with Flink's KafkaConnector010 (Kafka
0.10.1.1) arbitrarily skipping data.
*Context*
KafkaConnector010 is used as the source and StreamingFileSink/BulkPartWriter
(S3) as the sink, with no intermediate operators. Recently, we noticed that
millions of Kafka