Hello Raf,

If I understand it correctly, there are two topics, let's say topic A and topic B, and the flow looks like this:

1. ProduceKafka -> topic A
2. An external process consumes from topic A, processes the data, and pushes the results to topic B
3. topic B -> ConsumeKafka

I tested a flow like the one above, and the last ConsumeKafka received the expected number of flow files, and its status looks fine. Details and results are available here:
https://gist.github.com/ijokarumawak/29a568f760cb24fb94ae3384037ab9e7

One question, to rule out a possible cause: had the ConsumeKafka processor at step 3 already started when the external process started sending messages to topic B? If it started after that, ConsumeKafka will start receiving messages from the 'latest' offset at that point by default (this can be configured with the 'offset reset' property). For example, if ConsumeKafka started after the external process had published 807 of the 1507 messages, then it can only consume the 700 messages that were sent after it connected to the topic.

Otherwise, if it's reproducible, please let us know your environment details, such as whether NiFi is clustered or standalone, the number of Kafka partitions, etc.

Thanks,
Koji

On Wed, Jan 11, 2017 at 10:17 PM, Raf Huys <[email protected]> wrote:
> With the Produce_Kafka_0_10 processor 1507 flowfiles are pushed onto a
> topic.
>
> A separate process mangles a bit with these records and pushes them back on
> a separate topic. With the Consume_Kafka_0_10 processor I'm ingesting this
> second topic again.
>
> According to Kafka, this Nifi consumer is at offset 1507 (so there's no
> lag). The total amount of processing time was around 3 minutes (less than 5
> for sure).
>
> Weird thing is, I cannot correlate these number in the Nifi consumer. The
> status history of the processor returns an aggregated sum of 700 flowfiles
> out (all flowfiles are transferred to a logging processor). Data provenance
> has 700 messages as well.
>
> `Flowfiles in` is 0 (I suppose because there's no ingoing arrow on the
> consumer) so that's not of any use.
>
> `Message Demarcator` is not set, `Max Poll Records` 1000, `Max Uncommitted
> Time` 1 s have default values, so every incoming message should result in 1
> flowFile to my understanding.
>
> Not really sure what to think of this. Are there any other places i can look
> for the total amount of flowfiles (in/out) during a certain amount of time?
>
>
> --
> Mvg,
>
> Raf Huys
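
For reference, the offset arithmetic from the reply at the top of this message (807 published before the consumer connected, 700 received) can be sketched in plain Python. This is only an illustration, not a Kafka client: the function name and its parameters are hypothetical, and it assumes a single consumer whose group has no previously committed offsets, so the reset policy alone decides where consumption starts.

```python
def consumable_messages(total_published, published_before_connect,
                        offset_reset="latest"):
    """Return how many messages a brand-new consumer group would receive.

    Hypothetical helper for illustration only. It models just the starting
    offset picked by the reset policy, assuming no committed offsets exist.
    """
    if not 0 <= published_before_connect <= total_published:
        raise ValueError("published_before_connect must be within 0..total_published")

    if offset_reset == "latest":
        # Start at the end of the log as it was when the consumer connected.
        start_offset = published_before_connect
    elif offset_reset == "earliest":
        # Start at the beginning of the log, so nothing is skipped.
        start_offset = 0
    else:
        raise ValueError("offset_reset must be 'latest' or 'earliest'")

    return total_published - start_offset


if __name__ == "__main__":
    # Numbers taken from this thread: 1507 messages in total, 807 of them
    # published before ConsumeKafka connected to topic B.
    for policy in ("latest", "earliest"):
        print(policy, consumable_messages(1507, 807, policy))
```

With 'latest' the sketch gives the 700 flowfiles seen in the status history, while 'earliest' gives all 1507, which is why the start order of the consumer and the external process matters here.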
