Fixing it in https://github.com/apache/beam/pull/3461.
Thanks for reporting the issue.

On Wed, Jun 28, 2017 at 8:37 AM, Raghu Angadi <rang...@google.com> wrote:
> Hi Elmar,
>
> You are right. We should not log this at all when the gaps are expected, as
> you pointed out. I don't think the client can check whether compaction is
> enabled for a topic through the Consumer API.
>
> I think we should remove the log. The user can't really act on it other
> than reporting it. I will send a PR.
>
> As a temporary workaround, you can disable logging for a particular class
> on the worker with the --workerLogLevelOverrides
> <https://cloud.google.com/dataflow/pipelines/logging> option. But this
> would suppress the rest of the logging from the reader.
>
> Raghu
>
>
> On Wed, Jun 28, 2017 at 4:12 AM, Elmar Weber <i...@elmarweber.org> wrote:
>
>> Hello,
>>
>> I'm testing KafkaIO with Google Cloud Dataflow and getting warnings
>> when working with compacted logs. In the code there is a relevant check:
>>
>> https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L1158
>>
>> // sanity check
>> if (offset != expected) {
>>   LOG.warn("{}: gap in offsets for {} at {}. {} records missing.",
>>       this, pState.topicPartition, expected, offset - expected);
>> }
>>
>> From what I understand, this can happen when log compaction is enabled,
>> because the relevant entry can get cleaned up by Kafka in favour of a
>> newer one. In that case, shouldn't this be an info log, and/or warn only
>> when log compaction is disabled for the topic?
>>
>> I'm still debugging some things, because the pipeline also stops reading
>> on compacted logs. I'm not sure whether this is related or could also be
>> an issue with my Kafka test installation, but as far as I understand,
>> the gaps are expected behaviour with log compaction enabled.
>>
>> Thanks,
>> Elmar
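
For anyone following along, the behaviour under discussion can be sketched without a broker: on a compacted topic, consecutive record offsets are not contiguous, so an "expected next offset" check like the one quoted above fires even though nothing is wrong. This is a self-contained, hypothetical simulation (the class name and offset values are illustrative, not Beam internals):

```java
import java.util.Arrays;
import java.util.List;

public class OffsetGapDemo {
    public static void main(String[] args) {
        // Offsets as they might arrive from a compacted partition:
        // records at offsets 3, 4, and 6 were removed by compaction.
        List<Long> offsets = Arrays.asList(0L, 1L, 2L, 5L, 7L);
        long expected = 0L;
        for (long offset : offsets) {
            if (offset != expected) {
                // Beam logs this at WARN; with compaction enabled the gap
                // is expected, which is why the thread proposes removing
                // or downgrading the message.
                System.out.printf("gap at %d: %d records missing%n",
                        expected, offset - expected);
            }
            expected = offset + 1; // next offset we expect to see
        }
    }
}
```

Running it reports a gap of 2 records at offset 3 and a gap of 1 record at offset 6, exactly the kind of "records missing" warning Elmar saw.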
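
For the temporary workaround Raghu mentions, the override is passed as a pipeline option when launching the job. A hedged sketch, in which the main class is a placeholder and the logger name is an assumption (use whichever class actually emits the warning); note that, as Raghu says, this silences all logging from that class, not just the gap message:

```shell
# Raise the logging threshold for the KafkaIO logger on Dataflow workers
# so WARN-level messages from it are dropped.
mvn compile exec:java \
  -Dexec.mainClass=com.example.MyPipeline \
  -Dexec.args="--runner=DataflowRunner \
    --workerLogLevelOverrides={\"org.apache.beam.sdk.io.kafka.KafkaIO\":\"ERROR\"}"
```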