Re: [DISCUSS] KIP-28 - Add a transform client for data processing

Jiangjie Qin Fri, 31 Jul 2015 21:22:33 -0700

I think the processor abstraction would be useful. It is not yet clear to
me, though, which cell in the following API analysis chart this processor
is trying to satisfy.


https://cwiki.apache.org/confluence/display/KAFKA/New+consumer+API+change+proposal

For example, in the current proposal it looks like the user will only be
able to commit offsets for the last seen message. What if the user has
interleaved groups of messages, where each group forms one complete logical
unit? In that case, the user will not have a safe boundary at which to
commit an offset.
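
To make the concern concrete, here is a minimal sketch of the bookkeeping
a user would need. It is not part of the KIP-28 proposal or of any Kafka
API; SafeCommitTracker, on_message and safe_commit_offset are hypothetical
names, and I am assuming each message carries a group id and a flag marking
the end of its group. The safe commit position is then the first offset of
the oldest group that is still open, not the last seen message.

```python
from typing import Dict


class SafeCommitTracker:
    """Hypothetical helper: finds a safe commit offset for interleaved groups."""

    def __init__(self) -> None:
        # group id -> offset of the first message of a group that is still open
        self._open_groups: Dict[str, int] = {}
        # offset that follows the last seen message
        self._next_offset: int = 0

    def on_message(self, offset: int, group: str, completes_group: bool) -> None:
        # Remember where a group started the first time we see it.
        self._open_groups.setdefault(group, offset)
        if completes_group:
            # The group is fully processed, so it no longer holds back commits.
            del self._open_groups[group]
        self._next_offset = offset + 1

    def safe_commit_offset(self) -> int:
        """Return the offset from which consumption can safely resume."""
        if self._open_groups:
            # Resume at the start of the oldest incomplete group.
            return min(self._open_groups.values())
        return self._next_offset


tracker = SafeCommitTracker()
tracker.on_message(0, "A", False)
tracker.on_message(1, "B", False)
tracker.on_message(2, "A", True)   # group A is complete, group B is still open
safe_after_a = tracker.safe_commit_offset()
tracker.on_message(3, "B", False)
tracker.on_message(4, "B", True)   # group B is complete
safe_after_b = tracker.safe_commit_offset()
```

After offset 2 the last seen message would suggest committing 3, but group
B started at offset 1 and is not finished, so the sketch holds the commit
at 1. Resuming from there replays the already completed end of group A, so
this only gives at-least-once behavior.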


Is the processor client only intended to address the static topic data
stream with semi-auto offset commit (which means the user can only commit
the last seen message)?

Jiangjie (Becket) Qin

On Thu, Jul 30, 2015 at 2:32 PM, James Cheng <[email protected]> wrote:

> I agree with Sriram and Martin. Kafka is already about providing streams
> of data, and so Kafka Streams or anything like that is confusing to me.
>
> This new library is about making it easier to process the data.
>
> -James
>
> On Jul 30, 2015, at 9:38 AM, Aditya Auradkar
> <[email protected]> wrote:
>
> > Personally, I prefer KafkaStreams just because it sounds nicer. For the
> > reasons identified above, KafkaProcessor or KProcessor is more apt but
> > sounds less catchy (IMO). I also think we should prefix with Kafka (rather
> > than K) because we will then have 3 clients: KafkaProducer, KafkaConsumer
> > and KafkaProcessor which is very nice and consistent.
> >
> > Aditya
> >
> > On Thu, Jul 30, 2015 at 9:17 AM, Gwen Shapira <[email protected]> wrote:
> >
> >> I think it's also a matter of intent. If we see it as "yet another
> >> client library", then Processor (to match Producer and Consumer) will
> >> work great.
> >> If we see it as a stream processing framework, the name has to start
> >> with S to follow existing convention.
> >>
> >> Speaking of naming conventions:
> >> You know how people have stack names for technologies that are usually
> >> used in tandem? ELK, LAMP, etc.
> >> The pattern of Kafka -> Stream Processor -> NoSQL Store is super
> >> common. KSN stack doesn't sound right, though. Maybe while we are
> >> bikeshedding, someone has ideas in that direction :)
> >>
> >> On Thu, Jul 30, 2015 at 2:01 AM, Sriram Subramanian
> >> <[email protected]> wrote:
> >>> I had the same thought. Kafka processor, KProcessor or even Kafka
> >>> stream processor is more relevant.
> >>>
> >>>
> >>>
> >>>> On Jul 30, 2015, at 2:09 PM, Martin Kleppmann <[email protected]> wrote:
> >>>>
> >>>> I'm with Sriram -- Kafka is all about streams already (or topics, to be
> >>>> precise, but we're calling it "stream processing" not "topic processing"),
> >>>> so I find "Kafka Streams", "KStream" and "Kafka Streaming" all confusing,
> >>>> since they seem to imply that other bits of Kafka are not about streams.
> >>>>
> >>>> I would prefer "The Processor API" or "Kafka Processors" or "Kafka
> >>>> Processing Client" or "KProcessor", or something along those lines.
> >>>>
> >>>>> On 30 Jul 2015, at 15:07, Guozhang Wang <[email protected]> wrote:
> >>>>>
> >>>>> I would vote for KStream as it sounds sexier (is it only me??), second to
> >>>>> that would be Kafka Streaming.
> >>>>>
> >>>>>> On Wed, Jul 29, 2015 at 6:08 PM, Jay Kreps <[email protected]> wrote:
> >>>>>>
> >>>>>> Also, the most important part of any prototype, we should have a name for
> >>>>>> this producing-consumer-thingamgigy:
> >>>>>>
> >>>>>> Various ideas:
> >>>>>> - Kafka Streams
> >>>>>> - KStream
> >>>>>> - Kafka Streaming
> >>>>>> - The Processor API
> >>>>>> - Metamorphosis
> >>>>>> - Transformer API
> >>>>>> - Verwandlung
> >>>>>>
> >>>>>> For my part I think what people are trying to do is stream processing with
> >>>>>> Kafka so I think something that evokes Kafka and stream processing is
> >>>>>> preferable. I like Kafka Streams or Kafka Streaming followed by KStream.
> >>>>>>
> >>>>>> Transformer kind of makes me think of the shape-shifting cars.
> >>>>>>
> >>>>>> Metamorphosis is cool and hilarious but since we are kind of envisioning
> >>>>>> this as a more limited-scope thing rather than a massive framework in its
> >>>>>> own right I actually think it should have a descriptive name rather than a
> >>>>>> personality of its own.
> >>>>>>
> >>>>>> Anyhow let the bikeshedding commence.
> >>>>>>
> >>>>>> -Jay
> >>>>>>
> >>>>>>
> >>>>>>> On Thu, Jul 23, 2015 at 5:59 PM, Guozhang Wang <[email protected]> wrote:
> >>>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> I just posted KIP-28: Add a transform client for data processing
> >>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-28+-+Add+a+transform+client+for+data+processing>.
> >>>>>>>
> >>>>>>> The wiki page does not yet have the full design / implementation details,
> >>>>>>> and this email is to kick-off the conversation on whether we should add
> >>>>>>> this new client with the described motivations, and if yes what features /
> >>>>>>> functionalities should be included.
> >>>>>>>
> >>>>>>> Looking forward to your feedback!
> >>>>>>>
> >>>>>>> -- Guozhang
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> -- Guozhang
> >>>>
> >>
>
>
