Jay, Jun,
Thank you both for explaining. I understand this is important enough such
that it must be done, and if so, the sooner the better.
How will the change be released? a beta-2 or release candidate? I think
that if possible, it should not overrun the already released version.
Thank you guys for the hard work.
Shlomi

On Tue, Nov 25, 2014 at 7:37 PM, Jun Rao <jun...@gmail.com> wrote:

> Bhavesh,
>
> This api change doesn't mean you need to change the format of the encoded
> data. It simply moves the serialization logic from the application to a
> pluggable serializer. As long as you preserve the serialization logic, the
> consumer should still see the same bytes.
>
> If you are talking about how to evolve the data schema over time, that's a
> separate story. Serialization libraries like Avro have better support on
> schema evolution.
>
> Thanks,
>
> Jun
>
> On Tue, Nov 25, 2014 at 8:41 AM, Bhavesh Mistry <
> mistry.p.bhav...@gmail.com>
> wrote:
>
> > How will mix bag will work with Consumer side ?  Entire site can not be
> > rolled at once so Consumer will have to deals with New and Old Serialize
> > Bytes ?  This could be app team responsibility.  Are you guys targeting
> > 0.8.2 release, which may break customer who are already using new
> producer
> > API (beta version).
> >
> > Thanks,
> >
> > Bhavesh
> >
> > On Tue, Nov 25, 2014 at 8:29 AM, Manikumar Reddy <ku...@nmsworks.co.in>
> > wrote:
> >
> > > +1 for this change.
> > >
> > > what about de-serializer  class in 0.8.2?  Say i am using new producer
> > with
> > > Avro and old consumer combination.
> > > then i need to give custom Decoder implementation for Avro right?.
> > >
> > > On Tue, Nov 25, 2014 at 9:19 PM, Joe Stein <joe.st...@stealth.ly>
> wrote:
> > >
> > > > The serializer is an expected use of the producer/consumer now and
> > think
> > > we
> > > > should continue that support in the new client. As far as breaking
> the
> > > API
> > > > it is why we released the 0.8.2-beta to help get through just these
> > type
> > > of
> > > > blocking issues in a way that the community at large could be
> involved
> > in
> > > > easier with a build/binaries to download and use from maven also.
> > > >
> > > > +1 on the change now prior to the 0.8.2 release.
> > > >
> > > > - Joe Stein
> > > >
> > > >
> > > > On Mon, Nov 24, 2014 at 11:43 PM, Sriram Subramanian <
> > > > srsubraman...@linkedin.com.invalid> wrote:
> > > >
> > > > > Looked at the patch. +1 from me.
> > > > >
> > > > > On 11/24/14 8:29 PM, "Gwen Shapira" <gshap...@cloudera.com> wrote:
> > > > >
> > > > > >As one of the people who spent too much time building Avro
> > > repositories,
> > > > > >+1
> > > > > >on bringing serializer API back.
> > > > > >
> > > > > >I think it will make the new producer easier to work with.
> > > > > >
> > > > > >Gwen
> > > > > >
> > > > > >On Mon, Nov 24, 2014 at 6:13 PM, Jay Kreps <jay.kr...@gmail.com>
> > > wrote:
> > > > > >
> > > > > >> This is admittedly late in the release cycle to make a change.
> To
> > > add
> > > > to
> > > > > >> Jun's description the motivation was that we felt it would be
> > better
> > > > to
> > > > > >> change that interface now rather than after the release if it
> > needed
> > > > to
> > > > > >> change.
> > > > > >>
> > > > > >> The motivation for wanting to make a change was the ability to
> > > really
> > > > be
> > > > > >> able to develop support for Avro and other serialization
> formats.
> > > The
> > > > > >> current status is pretty scattered--there is a schema repository
> > on
> > > an
> > > > > >>Avro
> > > > > >> JIRA and another fork of that on github, and a bunch of people
> we
> > > have
> > > > > >> talked to have done similar things for other serialization
> > systems.
> > > It
> > > > > >> would be nice if these things could be packaged in such a way
> that
> > > it
> > > > > >>was
> > > > > >> possible to just change a few configs in the producer and get
> rich
> > > > > >>metadata
> > > > > >> support for messages.
> > > > > >>
> > > > > >> As we were thinking this through we realized that the new api we
> > > were
> > > > > >>about
> > > > > >> to introduce was kind of not very compatable with this since it
> > was
> > > > just
> > > > > >> byte[] oriented.
> > > > > >>
> > > > > >> You can always do this by adding some kind of wrapper api that
> > wraps
> > > > the
> > > > > >> producer. But this puts us back in the position of trying to
> > > document
> > > > > >>and
> > > > > >> support multiple interfaces.
> > > > > >>
> > > > > >> This also opens up the possibility of adding a MessageValidator
> or
> > > > > >> MessageInterceptor plug-in transparently so that you can do
> other
> > > > custom
> > > > > >> validation on the messages you are sending which obviously
> > requires
> > > > > >>access
> > > > > >> to the original object not the byte array.
> > > > > >>
> > > > > >> This api doesn't prevent using byte[] by configuring the
> > > > > >> ByteArraySerializer it works as it currently does.
> > > > > >>
> > > > > >> -Jay
> > > > > >>
> > > > > >> On Mon, Nov 24, 2014 at 5:58 PM, Jun Rao <jun...@gmail.com>
> > wrote:
> > > > > >>
> > > > > >> > Hi, Everyone,
> > > > > >> >
> > > > > >> > I'd like to start a discussion on whether it makes sense to
> add
> > > the
> > > > > >> > serializer api back to the new java producer. Currently, the
> new
> > > > java
> > > > > >> > producer takes a byte array for both the key and the value.
> > While
> > > > this
> > > > > >> api
> > > > > >> > is simple, it pushes the serialization logic into the
> > application.
> > > > > >>This
> > > > > >> > makes it hard to reason about what type of data is being sent
> to
> > > > Kafka
> > > > > >> and
> > > > > >> > also makes it hard to share an implementation of the
> serializer.
> > > For
> > > > > >> > example, to support Avro, the serialization logic could be
> quite
> > > > > >>involved
> > > > > >> > since it might need to register the Avro schema in some remote
> > > > > >>registry
> > > > > >> and
> > > > > >> > maintain a schema cache locally, etc. Without a serialization
> > api,
> > > > > >>it's
> > > > > >> > impossible to share such an implementation so that people can
> > > easily
> > > > > >> reuse.
> > > > > >> > We sort of overlooked this implication during the initial
> > > discussion
> > > > > >>of
> > > > > >> the
> > > > > >> > producer api.
> > > > > >> >
> > > > > >> > So, I'd like to propose an api change to the new producer by
> > > adding
> > > > > >>back
> > > > > >> > the serializer api similar to what we had in the old producer.
> > > > > >>Specially,
> > > > > >> > the proposed api changes are the following.
> > > > > >> >
> > > > > >> > First, we change KafkaProducer to take generic types K and V
> for
> > > the
> > > > > >>key
> > > > > >> > and the value, respectively.
> > > > > >> >
> > > > > >> > public class KafkaProducer<K,V> implements Producer<K,V> {
> > > > > >> >
> > > > > >> >     public Future<RecordMetadata> send(ProducerRecord<K,V>
> > record,
> > > > > >> Callback
> > > > > >> > callback);
> > > > > >> >
> > > > > >> >     public Future<RecordMetadata> send(ProducerRecord<K,V>
> > > record);
> > > > > >> > }
> > > > > >> >
> > > > > >> > Second, we add two new configs, one for the key serializer and
> > > > another
> > > > > >> for
> > > > > >> > the value serializer. Both serializers will default to the
> byte
> > > > array
> > > > > >> > implementation.
> > > > > >> >
> > > > > >> > public class ProducerConfig extends AbstractConfig {
> > > > > >> >
> > > > > >> >     .define(KEY_SERIALIZER_CLASS_CONFIG, Type.CLASS,
> > > > > >> > "org.apache.kafka.clients.producer.ByteArraySerializer",
> > > > > >>Importance.HIGH,
> > > > > >> > KEY_SERIALIZER_CLASS_DOC)
> > > > > >> >     .define(VALUE_SERIALIZER_CLASS_CONFIG, Type.CLASS,
> > > > > >> > "org.apache.kafka.clients.producer.ByteArraySerializer",
> > > > > >>Importance.HIGH,
> > > > > >> > VALUE_SERIALIZER_CLASS_DOC);
> > > > > >> > }
> > > > > >> >
> > > > > >> > Both serializers will implement the following interface.
> > > > > >> >
> > > > > >> > public interface Serializer<T> extends Configurable {
> > > > > >> >     public byte[] serialize(String topic, T data, boolean
> > isKey);
> > > > > >> >
> > > > > >> >     public void close();
> > > > > >> > }
> > > > > >> >
> > > > > >> > This is more or less the same as what's in the old producer.
> The
> > > > > >>slight
> > > > > >> > differences are (1) the serializer now only requires a
> > > > parameter-less
> > > > > >> > constructor; (2) the serializer has a configure() and a
> close()
> > > > method
> > > > > >> for
> > > > > >> > initialization and cleanup, respectively; (3) the serialize()
> > > method
> > > > > >> > additionally takes the topic and an isKey indicator, both of
> > which
> > > > are
> > > > > >> > useful for things like schema registration.
> > > > > >> >
> > > > > >> > The detailed changes are included in KAFKA-1797. For
> > > completeness, I
> > > > > >>also
> > > > > >> > made the corresponding changes for the new java consumer api
> as
> > > > well.
> > > > > >> >
> > > > > >> > Note that the proposed api changes are incompatible with
> what's
> > in
> > > > the
> > > > > >> > 0.8.2 branch. However, if those api changes are beneficial,
> it's
> > > > > >>probably
> > > > > >> > better to include them now in the 0.8.2 release, rather than
> > > later.
> > > > > >> >
> > > > > >> > I'd like to discuss mainly two things in this thread.
> > > > > >> > 1. Do people feel that the proposed api changes are
> reasonable?
> > > > > >> > 2. Are there any concerns of including the api changes in the
> > > 0.8.2
> > > > > >>final
> > > > > >> > release?
> > > > > >> >
> > > > > >> > Thanks,
> > > > > >> >
> > > > > >> > Jun
> > > > > >> >
> > > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to