Hi Raghu,

You are right, I got confused by the consumer configs in the logs. When I
searched for "auto commit enabled" I was indeed able to find the correct
config for the main consumer.
Thanks for clearing this up for me!

Cheers,
Kamil.


On Mon, 22 Jan 2018 at 19:40 Raghu Angadi <[email protected]> wrote:

> Kamil,
>
> The code at line 1153
> <https://github.com/apache/beam/blob/v2.2.0/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L1153>
> you are referring to does not correspond to the main Kafka consumer used to
> fetch the records. The KafkaIO source creates an internal consumer to fetch
> offsets, and auto commit is disabled only for that internal one at line 1153.
> The worker log contains Kafka logging for both consumers, so you should see
> multiple consumer configs printed in the log.
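>
> To make the distinction concrete, here is a rough sketch of the idea (this
> is not the actual Beam source; the broker address and property values below
> are made up): the internal offset consumer reuses the user's config but
> always overrides auto commit, while the main consumer that fetches records
> keeps whatever you passed in.
>
>   import java.util.HashMap;
>   import java.util.Map;
>   import org.apache.kafka.clients.consumer.Consumer;
>   import org.apache.kafka.clients.consumer.ConsumerConfig;
>   import org.apache.kafka.clients.consumer.KafkaConsumer;
>   import org.apache.kafka.common.serialization.ByteArrayDeserializer;
>
>   // user-supplied properties (e.g. what you pass via updateConsumerProperties)
>   Map<String, Object> userConfig = new HashMap<>();
>   userConfig.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");
>   userConfig.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
>   userConfig.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ByteArrayDeserializer.class);
>   userConfig.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
>
>   // main consumer: fetches the records and keeps the user's auto commit setting
>   Consumer<byte[], byte[]> mainConsumer = new KafkaConsumer<>(userConfig);
>
>   // internal consumer: only used to look up offsets, auto commit forced off
>   Map<String, Object> offsetConfig = new HashMap<>(userConfig);
>   offsetConfig.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
>   Consumer<byte[], byte[]> offsetConsumer = new KafkaConsumer<>(offsetConfig);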
>
> On Mon, Jan 22, 2018 at 6:42 AM, Kamil Dziublinski <
> [email protected]> wrote:
>
>> Hi guys,
>>
>> I am using Apache Beam 2.2.0 in production, reading from Kafka and writing
>> to BigQuery.
>>
>> In your KafkaIO documentation you mention that one can enable auto commit
>> for offsets by setting ENABLE_AUTO_COMMIT_CONFIG to true.
>> But I noticed in the Dataflow logs that my consumer config always has this
>> value set to false, even though I set it to true in my code (via the
>> updateConsumerProperties method).
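>>
>> For reference, this is roughly how I set it (simplified; the broker, topic
>> and group id here are placeholders, not my real values, and "pipeline" is
>> my Pipeline object):
>>
>>   import com.google.common.collect.ImmutableMap;
>>   import org.apache.beam.sdk.io.kafka.KafkaIO;
>>   import org.apache.kafka.clients.consumer.ConsumerConfig;
>>   import org.apache.kafka.common.serialization.StringDeserializer;
>>
>>   pipeline.apply(KafkaIO.<String, String>read()
>>       .withBootstrapServers("broker:9092")             // placeholder
>>       .withTopic("events")                             // placeholder
>>       .withKeyDeserializer(StringDeserializer.class)
>>       .withValueDeserializer(StringDeserializer.class)
>>       .updateConsumerProperties(ImmutableMap.<String, Object>of(
>>           ConsumerConfig.GROUP_ID_CONFIG, "my-consumer-group",  // placeholder
>>           ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true)));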
>>
>> I had a look at the source code and noticed that in the KafkaIO class, at
>> line 1153, you are hardcoding auto commit to false, which means the
>> documentation is not in line with the implementation.
>>
>> I view this as a problem, because I have already had situations where I
>> couldn't redeploy my pipeline with the --update flag due to incompatible
>> changes, but I still didn't want to reprocess old data. In that case I have
>> to use the earliest offset reset, which forces me to reprocess all the
>> messages in my topic; I cannot use the latest offset reset, because I would
>> lose data. KafkaIO also does not commit those offsets manually on successful
>> checkpoints, which would definitely be the optimal solution and is how it is
>> handled in Flink.
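>>
>> To spell out the two resets (a minimal sketch; "props" here is just a
>> stand-in for the consumer properties passed to KafkaIO):
>>
>>   import java.util.HashMap;
>>   import java.util.Map;
>>   import org.apache.kafka.clients.consumer.ConsumerConfig;
>>
>>   Map<String, Object> props = new HashMap<>();
>>   // "earliest": with no committed offset, start from the beginning of the
>>   // topic, i.e. reprocess everything
>>   props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
>>   // "latest": with no committed offset, start from the end of the topic,
>>   // i.e. lose anything published while the job was down
>>   // props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");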
>>
>> Do you have any suggestions on how to solve this problem?
>>
>> Thanks,
>> Kamil.
>>
>>
>
>

-- 
Kamil Dziubliński
Big Data Engineer

Azimo
173 Upper Street
London, N1 1RG

azimo.com
facebook.com/azimomoney
twitter.com/azimo
