I missed the kafka.offset attribute. That is what I needed. Thanks Mark.

*Jeremiah Adams*

Senior Software Developer
Pearson

2154 East Commons Ave.
Suite 400
Centennial, CO 80122


Always Learning
Learn more at www.pearson.com

On Tue, Jul 28, 2015 at 9:28 AM, Mark Payne <[email protected]> wrote:

> Jeremiah,
>
> Totally understand now. We can certainly add a property that indicates
> whether or not to commit the offsets.
> We should probably also document (at a very high level) the use-case that
> you are describing as an example
> of why you may want to not commit the offsets. I will update the ticket to
> include this.
>
> Regarding the separate enhancement: when you say "the last written offset"
> are you referring to when GetKafka
> writes the offset to ZooKeeper? I do not believe that information is
> exposed by their "High-level consumer."
> It's probably possible if we were to change to the "simple consumer" API,
> but that interface is extremely different
> so it unfortunately isn't a simple change.
>
> The FlowFiles that are received, though, do have a "kafka.offset"
> attribute, which indicates the offset of that individual
> message, if that helps?
>
> Thanks
> -Mark
>
>
> ----------------------------------------
> > Date: Tue, 28 Jul 2015 08:56:21 -0600
> > Subject: Re: GetKafka Processor and Hardcoded Kafka Consumer Configs
> > From: [email protected]
> > To: [email protected]
> >
> > In the case of auto.commit.enable - we had a scenario during our last
> > deploy in which we did not commit the offsets we read at all. This is
> > atypical. This is in the case of a Lambda-like architecture in which we
> > use S3 to provide historical data to repopulate the near real-time
> > datastore during a deploy.
> >
> > Mostly, I think that the user experience would be better if we had
> > complete control over the GetKafka Processor config here:
> > http://kafka.apache.org/documentation.html#consumerconfigs.
> > There may be implementation details that make it impossible, but it would
> > be the best case. I think it is probably safe to say the same about the
> > Kafka Producer - but I have not run into any blockers as-is. I have added
> > this to the jira ticket.
> >
> > Also, a separate enhancement:
> >
> > I see a need to pass along the last written offset to subsequent
> > Processors in a flow. I don't know if this is even possible; I didn't
> > look that closely at the code. It could be useful if it were possible to
> > have the option to pass the last Offset along the flow as metadata. We
> > could then pass around FlowFile data indexed by last Offset. Dunno if
> > this is worth exploring as it may be unique to our architecture.
> >
> >
> > On Mon, Jul 27, 2015 at 6:14 PM, Mark Payne <[email protected]> wrote:
> >
> >> Jeremiah,
> >>
> >> We can certainly enable the "auto.offset.reset" to be configurable. Not
> >> sure how making the "auto.commit.enable" configurable would work.
> >> Are you thinking that another property would be added to indicate how
> >> often to commit? Or would it work completely differently? Just need that
> >> fleshed out a bit more.
> >>
> >> I do like the suggestion of exposing the config properties as
> >> user-defined properties.
> >>
> >> I have created a ticket to track this information:
> >> https://issues.apache.org/jira/browse/NIFI-791
> >>
> >> Please feel free to update the ticket with any relevant information as
> >> you think of it.
> >>
> >> Thanks!
> >> -Mark
> >>
> >> ----------------------------------------
> >>> Date: Mon, 27 Jul 2015 15:42:37 -0600
> >>> Subject: GetKafka Processor and Hardcoded Kafka Consumer Configs
> >>> From: [email protected]
> >>> To: [email protected]
> >>>
> >>> The GetKafka processor has a couple of Kafka Consumer Config values
> >>> that are hard-coded.
> >>>
> >>> props.setProperty("auto.commit.enable", "true"); // just be explicit
> >>> props.setProperty("auto.offset.reset", "smallest");
> >>>
> >>> These should be configurable property values in the Processor. Most
> >>> notable for me is the "auto.offset.reset". Smallest vs. Largest has
> >>> some implications concerning fault tolerance strategies.
> >>>
> >>> It would be best to expose all of the available Kafka Consumer Config
> >>> properties. If these change between Kafka versions, though, it would
> >>> create maintenance work for the Processors.
> >>>
> >>> Another option would be to allow ad-hoc property values and have the
> >>> end user supply just the Kafka config values they want to override.
> >>>
> >>>
> >>
>
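
For illustration, the "ad-hoc override" option discussed in the thread could look roughly like the sketch below. It starts from the two values quoted as hard-coded in GetKafka and lets user-supplied entries replace them. The class name ConsumerConfigSketch, the method buildConsumerProps, and the userOverrides parameter are hypothetical and not part of NiFi; how the processor would actually collect user-defined properties is not shown here.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of NiFi: illustrates the "ad-hoc override" idea.
final class ConsumerConfigSketch {

    // Start from the two values quoted as hard-coded in GetKafka, then let
    // user-supplied entries replace them or add further consumer configs.
    static Properties buildConsumerProps(Map<String, String> userOverrides) {
        Properties props = new Properties();
        props.setProperty("auto.commit.enable", "true");
        props.setProperty("auto.offset.reset", "smallest");
        for (Map.Entry<String, String> entry : userOverrides.entrySet()) {
            props.setProperty(entry.getKey(), entry.getValue());
        }
        return props;
    }
}
```

With an override map containing only "auto.offset.reset" set to "largest", the result keeps "auto.commit.enable" at "true" and switches the reset policy to "largest". Passing overrides through by key, rather than declaring each config explicitly, is one way to avoid the maintenance concern raised about configs changing between Kafka versions.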
