Hello Raf,

If I understand correctly, there are two topics, let's say topic A
and topic B, and the flow looks like this:

1. ProduceKafka -> topic A
2. External process consumes from topic A, processes the data and
pushes it to topic B
3. topic B -> ConsumeKafka

I tested a flow like the one above, and the last ConsumeKafka received
the expected number of flow files, and its status looks fine.
Details and result are available here:
https://gist.github.com/ijokarumawak/29a568f760cb24fb94ae3384037ab9e7

One question to rule out a possible cause:
Had the ConsumeKafka processor at step 3 already been started when the
external process started sending messages to topic B?

If it was started after that, ConsumeKafka starts receiving messages
from the 'latest' offset at that point by default (this can be
configured with the 'Offset Reset' property). For example, if
ConsumeKafka started after the external process had already published
807 of the 1507 messages, then ConsumeKafka can only consume the 700
messages that were sent after it connected to the topic.
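
To make the arithmetic concrete, here is a minimal sketch in Python. It
is only a toy model of the offset reset behaviour, not Kafka or NiFi
client code: the function name `messages_seen` and its parameters are
made up for illustration, and it assumes a single partition and a
consumer group that has no committed offset yet.

```python
def messages_seen(total_published, published_before_connect, offset_reset="latest"):
    """Toy model: how many messages a brand-new consumer group reads.

    Assumes one partition and no previously committed offset, so the
    starting position is decided purely by the offset reset policy.
    """
    if offset_reset == "latest":
        # Start at the end of the log as it was when the consumer connected.
        start_offset = published_before_connect
    elif offset_reset == "earliest":
        # Start at the beginning of the log.
        start_offset = 0
    else:
        raise ValueError("offset_reset must be 'latest' or 'earliest'")
    return total_published - start_offset


print(messages_seen(1507, 807, "latest"))    # 700
print(messages_seen(1507, 807, "earliest"))  # 1507
```

In both cases the consumer ends up at offset 1507 with no lag, so the
offset reported by Kafka alone would not show the difference.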

Otherwise, if it's reproducible, please let us know the details of
your environment, such as whether NiFi is clustered or standalone, the
number of Kafka partitions, etc.

Thanks,
Koji

On Wed, Jan 11, 2017 at 10:17 PM, Raf Huys <[email protected]> wrote:
> With the Produce_Kafka_0_10 processor 1507 flowfiles are pushed onto a
> topic.
>
> A separate process mangles a bit with these records and pushes them back on
> a separate topic. With the Consume_Kafka_0_10 processor I'm ingesting this
> second topic again.
>
> According to Kafka, this Nifi consumer is at offset 1507 (so there's no
> lag). The total amount of processing time was around 3 minutes (less than 5
> for sure).
>
> Weird thing is, I cannot correlate these number in the Nifi consumer. The
> status history of the processor returns an aggregated sum of 700 flowfiles
> out (all flowfiles are transferred to a logging processor). Data provenance
> has 700 messages as well.
>
> `Flowfiles in` is 0 (I suppose because there's no ingoing arrow on the
> consumer) so that's not of any use.
>
> `Message Demarcator` is not set, `Max Poll Records` 1000, `Max Uncommitted
> Time` 1 s have default values, so every incoming message should result in 1
> flowFile to my understanding.
>
> Not really sure what to think of this. Are there any other places i can look
> for the total amount of flowfiles (in/out) during a certain amount of time?
>
>
> --
> Mvg,
>
> Raf Huys
