I just updated the draft implementation[^1], rebasing it against the latest
trunk and implementing the error routine (i.e., error code 74 for
UnsupportedCompressionTypeException). Since we decided to disallow all
fetch requests below version 2.1.0 for topics specifying ZStandard, I
added only the error logic.
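
For anyone skimming the thread, the rule can be sketched roughly as below. This is a minimal illustration and not the code in the PR: the class `ZstdFetchCheck`, the method `errorFor`, and the `minZstdFetchVersion` parameter are hypothetical names, and the threshold of 10 used in `main` is an assumed placeholder for the first fetch request version that understands zstd.

```java
/**
 * Illustrative sketch only; hypothetical names, not the code in the PR.
 */
final class ZstdFetchCheck {
    /** Error code for UnsupportedCompressionTypeException, as stated in this thread. */
    static final short UNSUPPORTED_COMPRESSION_TYPE = 74;
    /** Code meaning "no error". */
    static final short NONE = 0;

    /**
     * Decides the error using only what the broker knows cheaply: the fetch
     * request version and the topic's 'compression.type' configuration.
     *
     * @param fetchVersion         version of the client's fetch request
     * @param topicCompressionType value of the topic's 'compression.type'
     * @param minZstdFetchVersion  assumed first fetch version that supports zstd
     */
    static short errorFor(int fetchVersion, String topicCompressionType, int minZstdFetchVersion) {
        boolean explicitZstd = "zstd".equals(topicCompressionType);
        if (explicitZstd && fetchVersion < minZstdFetchVersion) {
            // The topic is explicitly zstd and the client is too old: fail fast.
            return UNSUPPORTED_COMPRESSION_TYPE;
        }
        // 'producer' (implicit zstd) is not checked here: the data is returned
        // as-is and a client without zstd support is expected to fail parsing.
        return NONE;
    }

    public static void main(String[] args) {
        int assumedMinVersion = 10; // assumption for illustration only
        System.out.println(errorFor(4, "zstd", assumedMinVersion));     // old client, zstd topic
        System.out.println(errorFor(10, "zstd", assumedMinVersion));    // new client, zstd topic
        System.out.println(errorFor(4, "producer", assumedMinVersion)); // old client, 'producer' topic
    }
}
```

Topics configured with 'producer' deliberately fall through here, in line with the earlier conclusion in this thread that the broker should not inspect record batches just to find the codec.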

Please have a look when you are free.

Thanks,
Dongjin

[^1]: Please check the last commit here:
https://github.com/apache/kafka/pull/2267

On Thu, Aug 23, 2018, 8:55 AM Dongjin Lee <dong...@apache.org> wrote:

> Jason,
>
> Great. +1 for UNSUPPORTED_COMPRESSION_TYPE.
>
> Best,
> Dongjin
>
> On Thu, Aug 23, 2018 at 8:19 AM Jason Gustafson <ja...@confluent.io>
> wrote:
>
>> Hey Dongjin,
>>
>> Yeah that's right. For what it's worth, librdkafka also appears to handle
>> unexpected error codes. I expect that most client implementations would
>> either pass through the raw type or convert to an enum using something
>> like
>> what the java client does. Since we're expecting the client to fail
>> anyway,
>> I'm probably in favor of using the UNSUPPORTED_COMPRESSION_TYPE error
>> code.
>>
>> -Jason
>>
>> On Wed, Aug 22, 2018 at 1:46 AM, Dongjin Lee <dong...@apache.org> wrote:
>>
>> > Jason and Ismael,
>> >
>> > It seems like the only thing we need to consider if we define a new error
>> > code (i.e., UNSUPPORTED_COMPRESSION_TYPE) would be the implementation of
>> > the other language clients, right? At least, this strategy does not cause
>> > any problem for the Java client. Do I understand correctly?
>> >
>> > Thanks,
>> > Dongjin
>> >
>> > On Wed, Aug 22, 2018 at 5:43 PM Dongjin Lee <dong...@apache.org> wrote:
>> >
>> > > Jason,
>> > >
>> > > > I think we would only use this error code when we /know/ that zstd
>> > > > was in use and the client doesn't support it? This is true if either
>> > > > 1) the message needs down-conversion and we encounter a zstd
>> > > > compressed message, or 2) if the topic is explicitly configured to
>> > > > use zstd.
>> > >
>> > > Yes, that is right. Note that case 1 includes 1.a) old clients
>> > > requesting v0 or v1 records, and 1.b) implicit zstd, i.e., a
>> > > compression type of "producer" with zstd-compressed data.
>> > >
>> > > > However, if the compression type is set to "producer," then the
>> > > > fetched data may or may not be compressed with zstd. In this case, we
>> > > > return the data to the client and expect it to fail parsing. Is that
>> > > > correct?
>> > >
>> > > Exactly.
>> > >
>> > > Following your message, I reviewed the implementation of
>> > > `KafkaApis#handleFetchRequest`, which handles the fetch request, and
>> > > found that the information we can use is the following:
>> > >
>> > > 1. Client's fetch request version. (`versionId` variable)
>> > > 2. Log's compression type. (`logConfig` variable)
>> > >
>> > > We can't detect the actual compression type of the data unless we
>> > > inspect the `RecordBatch` included in the `Records` instance (i.e., the
>> > > `unconvertedRecords` variable). Since that inspection carries a
>> > > performance cost, it is not an option for us. In short, we can't be
>> > > sure whether given chunks of data are compressed with zstd or not.
>> > >
>> > > So, in conclusion: we can easily return an error in cases 1.a and 2
>> > > with the information above. In case 1.b (implicit zstd), we can just
>> > > return the data without doing anything special and expect it to fail
>> > > parsing.
>> > >
>> > > Thanks,
>> > > Dongjin
>> > >
>> > > On Wed, Aug 22, 2018 at 12:02 PM Ismael Juma <isma...@gmail.com>
>> wrote:
>> > >
>> > >> Jason, that's an interesting point regarding the Java client. Do we
>> know
>> > >> what clients in other languages do in these cases?
>> > >>
>> > >> Ismael
>> > >>
>> > >> On Tue, 21 Aug 2018, 17:30 Jason Gustafson, <ja...@confluent.io>
>> wrote:
>> > >>
>> > >> > Hi Dongjin,
>> > >> >
>> > >> > One of the complications is that old versions of the API will not
>> > >> > expect a new error code. However, since we expect this to be a fatal
>> > >> > error anyway for old clients, it may still be more useful to return
>> > >> > the correct error code. For example, the Kafka clients use the
>> > >> > following code to convert the error code:
>> > >> >
>> > >> >     public static Errors forCode(short code) {
>> > >> >         Errors error = codeToError.get(code);
>> > >> >         if (error != null) {
>> > >> >             return error;
>> > >> >         } else {
>> > >> >             log.warn("Unexpected error code: {}.", code);
>> > >> >             return UNKNOWN_SERVER_ERROR;
>> > >> >         }
>> > >> >     }
>> > >> >
>> > >> > If we return an unsupported error code, it will be converted to an
>> > >> > UNKNOWN error, but at least we will get the message in the log with
>> > >> > the correct code. That seems preferable to returning a misleading
>> > >> > error code. So I wonder if we can use the new
>> > >> > UNSUPPORTED_COMPRESSION_TYPE error even for older versions.
>> > >> >
>> > >> > Also, one question just to check my understanding. I think we would
>> > >> > only use this error code when we /know/ that zstd was in use and the
>> > >> > client doesn't support it? This is true if either 1) the message
>> > >> > needs down-conversion and we encounter a zstd compressed message, or
>> > >> > 2) if the topic is explicitly configured to use zstd. However, if the
>> > >> > compression type is set to "producer," then the fetched data may or
>> > >> > may not be compressed with zstd. In this case, we return the data to
>> > >> > the client and expect it to fail parsing. Is that correct?
>> > >> >
>> > >> > Thanks,
>> > >> > Jason
>> > >> >
>> > >> >
>> > >> >
>> > >> > On Tue, Aug 21, 2018 at 9:08 AM, Dongjin Lee <dong...@apache.org>
>> > >> wrote:
>> > >> >
>> > >> > > Ismael, Jason and all,
>> > >> > >
>> > >> > > I rewrote the backward compatibility strategy and its alternatives
>> > >> > > as follows, based on Ismael's and Jason's comments. Since the wiki
>> > >> > > is not updated yet, don't hesitate to send me a message if you
>> > >> > > have any opinion on it.
>> > >> > >
>> > >> > > ```
>> > >> > > *Backward Compatibility*
>> > >> > >
>> > >> > > We need to establish a backward-compatibility strategy for the
>> > >> > > case where an old client subscribes to a topic that uses ZStandard
>> > >> > > implicitly (i.e., the 'compression.type' configuration of the
>> > >> > > topic is 'producer' and the producer compressed the records with
>> > >> > > ZStandard). We have the following options for this situation:
>> > >> > >
>> > >> > > *A. Support ZStandard for the old clients which can understand
>> > >> > > v0, v1 messages only.*
>> > >> > >
>> > >> > > This strategy necessarily requires down-converting v2 messages
>> > >> > > compressed with ZStandard into v0 or v1 messages, which means a
>> > >> > > considerable performance degradation. So we rejected this strategy.
>> > >> > >
>> > >> > > *B. Bump the API version and support only v2-available clients*
>> > >> > >
>> > >> > > With this approach, we can tell the old clients that they are
>> > >> > > outdated and should be upgraded. However, there are still several
>> > >> > > options for the error code.
>> > >> > >
>> > >> > > *B.1. INVALID_REQUEST (42)*
>> > >> > >
>> > >> > > This option gives the client too little information; the user can
>> > >> > > be confused about why a client that worked correctly in the past
>> > >> > > suddenly encounters a problem. So we rejected this strategy.
>> > >> > >
>> > >> > > *B.2. CORRUPT_MESSAGE (2)*
>> > >> > >
>> > >> > > This option gives inaccurate information; the user can be
>> > >> > > surprised and mistakenly conclude that the log files are broken in
>> > >> > > some way. So we rejected this strategy.
>> > >> > >
>> > >> > > *B.3. UNSUPPORTED_FOR_MESSAGE_FORMAT (43)*
>> > >> > >
>> > >> > > The advantage of this approach is that we don't need to define a
>> > >> > > new error code; we can reuse an existing one and that's all.
>> > >> > >
>> > >> > > The disadvantage of this approach is that it is also a little bit
>> > >> > > vague; this error code was defined as part of the work for
>> > >> > > KIP-98[^1] and is now returned for transaction errors.
>> > >> > >
>> > >> > > *B.4. UNSUPPORTED_COMPRESSION_TYPE (new)*
>> > >> > >
>> > >> > > The advantage of this approach is that it is clear and provides an
>> > >> > > exact description. The disadvantage is that we need to add a new
>> > >> > > error code.
>> > >> > > ```
>> > >> > >
>> > >> > > *It seems like the choice is now clear:
>> > >> > > UNSUPPORTED_FOR_MESSAGE_FORMAT (B.3) or
>> > >> > > UNSUPPORTED_COMPRESSION_TYPE (B.4).*
>> > >> > > The first one doesn't need a new error code but the latter is more
>> > >> > > explicit. Which one do you prefer? Since all of you have much more
>> > >> > > experience and knowledge than me, I will follow your decision. The
>> > >> > > wiki page will also be updated following the decision.
>> > >> > >
>> > >> > > Best,
>> > >> > > Dongjin
>> > >> > >
>> > >> > > [^1]: https://issues.apache.org/jira/browse/KAFKA-4990
>> > >> > >
>> > >> > > On Sun, Aug 19, 2018 at 4:58 AM Ismael Juma <isma...@gmail.com>
>> > >> wrote:
>> > >> > >
>> > >> > > > Sounds reasonable to me.
>> > >> > > >
>> > >> > > > Ismael
>> > >> > > >
>> > >> > > > On Sat, 18 Aug 2018, 12:20 Jason Gustafson, <
>> ja...@confluent.io>
>> > >> > wrote:
>> > >> > > >
>> > >> > > > > Hey Ismael,
>> > >> > > > >
>> > >> > > > > Your summary looks good to me. I think it might also be a good
>> > >> > > > > idea to add a new UNSUPPORTED_COMPRESSION_TYPE error code to go
>> > >> > > > > along with the version bumps. We won't be able to use it for old
>> > >> > > > > api versions since the clients will not understand it, but we
>> > >> > > > > can use it going forward so that we're not stuck in a similar
>> > >> > > > > situation with a new message format and a new codec to support.
>> > >> > > > > Another option is to use UNSUPPORTED_FOR_MESSAGE_FORMAT, but it
>> > >> > > > > is not as explicit.
>> > >> > > > >
>> > >> > > > > -Jason
>> > >> > > > >
>> > >> > > > > On Fri, Aug 17, 2018 at 5:19 PM, Ismael Juma <
>> ism...@juma.me.uk
>> > >
>> > >> > > wrote:
>> > >> > > > >
>> > >> > > > > > Hi Dongjin and Jason,
>> > >> > > > > >
>> > >> > > > > > I would agree. My summary:
>> > >> > > > > >
>> > >> > > > > > 1. Support zstd with message format 2 only.
>> > >> > > > > > 2. Bump produce and fetch request versions.
>> > >> > > > > > 3. Provide broker errors whenever possible based on the request
>> > >> > > > > > version and rely on clients for the cases where the broker can't
>> > >> > > > > > validate efficiently (example message format 2 consumer that
>> > >> > > > > > supports the latest fetch version but doesn't support zstd).
>> > >> > > > > >
>> > >> > > > > > If there's general agreement on this, I suggest we update the
>> > >> > > > > > KIP to state the proposal and to move the rejected options to
>> > >> > > > > > its own section. And then start a vote!
>> > >> > > > > >
>> > >> > > > > > Ismael
>> > >> > > > > >
>> > >> > > > > > On Fri, Aug 17, 2018 at 4:00 PM Jason Gustafson <
>> > >> > ja...@confluent.io>
>> > >> > > > > > wrote:
>> > >> > > > > >
>> > >> > > > > > > Hi Dongjin,
>> > >> > > > > > >
>> > >> > > > > > > Yes, that's a good summary. For clients which support v2, the
>> > >> > > > > > > client can parse the message format and hopefully raise a
>> > >> > > > > > > useful error message indicating the unsupported compression
>> > >> > > > > > > type. For older clients, our options are probably (1) to
>> > >> > > > > > > down-convert to the old format using no compression type, or
>> > >> > > > > > > (2) to return an error code. I'm leaning toward the latter as
>> > >> > > > > > > the simpler solution, but the challenge is finding a good
>> > >> > > > > > > error code. Two possibilities might be INVALID_REQUEST or
>> > >> > > > > > > CORRUPT_MESSAGE. The downside is that old clients probably
>> > >> > > > > > > won't get a helpful message. However, at least the behavior
>> > >> > > > > > > will be consistent in the sense that all clients will fail if
>> > >> > > > > > > they do not support zstandard.
>> > >> > > > > > >
>> > >> > > > > > > What do you think?
>> > >> > > > > > >
>> > >> > > > > > > Thanks,
>> > >> > > > > > > Jason
>> > >> > > > > > >
>> > >> > > > > > > On Fri, Aug 17, 2018 at 8:08 AM, Dongjin Lee <
>> > >> dong...@apache.org
>> > >> > >
>> > >> > > > > wrote:
>> > >> > > > > > >
>> > >> > > > > > > > Thanks Jason, I reviewed the down-converting logic following
>> > >> > > > > > > > your explanation.[^1] You mean the following routines, right?
>> > >> > > > > > > >
>> > >> > > > > > > > - https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaApis.scala#L534
>> > >> > > > > > > > - https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/record/LazyDownConversionRecords.java#L165
>> > >> > > > > > > > - https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/record/RecordsUtil.java#L40
>> > >> > > > > > > >
>> > >> > > > > > > > It seems like your stance is as follows:
>> > >> > > > > > > >
>> > >> > > > > > > > 1. In principle, Kafka does not change the compression codec
>> > >> > > > > > > > when down-converting, since it requires inspecting the fetched
>> > >> > > > > > > > data, which is expensive.
>> > >> > > > > > > > 2. However, there are some cases where the fetched data is
>> > >> > > > > > > > inspected anyway. In these cases, we can provide compression
>> > >> > > > > > > > conversion from ZStandard to the classical ones[^2].
>> > >> > > > > > > >
>> > >> > > > > > > > And from what I understand, the cases where a client without
>> > >> > > > > > > > ZStandard support receives ZStandard compressed records can be
>> > >> > > > > > > > organized into two cases:
>> > >> > > > > > > >
>> > >> > > > > > > > a. The 'compression.type' configuration of the given topic is
>> > >> > > > > > > > 'producer' and the producer compressed the records with
>> > >> > > > > > > > ZStandard (that is, using ZStandard implicitly).
>> > >> > > > > > > > b. The 'compression.type' configuration of the given topic is
>> > >> > > > > > > > 'zstd'; that is, using ZStandard explicitly.
>> > >> > > > > > > >
>> > >> > > > > > > > As you stated, we don't have to handle case b specially. So,
>> > >> > > > > > > > it seems like we can narrow the focus of the problem by
>> > >> > > > > > > > joining case 1 and case b like the following:
>> > >> > > > > > > >
>> > >> > > > > > > > > Given a topic with 'producer' as its 'compression.type'
>> > >> > > > > > > > > configuration, ZStandard compressed records, and an old
>> > >> > > > > > > > > client without ZStandard, is there any case where we need
>> > >> > > > > > > > > to inspect the records and can change the compression
>> > >> > > > > > > > > type? If so, can we provide compression type conversion?
>> > >> > > > > > > >
>> > >> > > > > > > > Do I understand correctly?
>> > >> > > > > > > >
>> > >> > > > > > > > Best,
>> > >> > > > > > > > Dongjin
>> > >> > > > > > > >
>> > >> > > > > > > > [^1]: I'm sorry, I found that I had slightly misunderstood how
>> > >> > > > > > > > the API version works, after reviewing the down-conversion
>> > >> > > > > > > > logic and the protocol documentation
>> > >> > > > > > > > <https://kafka.apache.org/protocol>.
>> > >> > > > > > > > [^2]: None, Gzip, Snappy, Lz4.
>> > >> > > > > > > >
>> > >> > > > > > > > On Tue, Aug 14, 2018 at 2:16 AM Jason Gustafson <
>> > >> > > > ja...@confluent.io>
>> > >> > > > > > > > wrote:
>> > >> > > > > > > >
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > But in my opinion, since the client will fail on the API
>> > >> > > > > > > > > > version anyway, we don't need to down-convert the messages
>> > >> > > > > > > > > > at all, do we? So I think we don't need to care about this
>> > >> > > > > > > > > > case. (I'm sorry, I am not familiar with the
>> > >> > > > > > > > > > down-conversion logic.)
>> > >> > > > > > > > >
>> > >> > > > > > > > >
>> > >> > > > > > > > > Currently the broker down-converts automatically when it
>> > >> > > > > > > > > receives an old version of the fetch request (a version which
>> > >> > > > > > > > > is known to predate the message format in use). Typically when
>> > >> > > > > > > > > down-converting the message format, we use the same
>> > >> > > > > > > > > compression type, but there is not much point in doing so when
>> > >> > > > > > > > > we know the client doesn't support it. So if zstandard is in
>> > >> > > > > > > > > use, and we have to down-convert anyway, then we can choose to
>> > >> > > > > > > > > use a different compression type or no compression type.
>> > >> > > > > > > > >
>> > >> > > > > > > > > From my perspective, there is no significant downside to
>> > >> > > > > > > > > bumping the protocol version and it has several potential
>> > >> > > > > > > > > benefits. Version bumps are cheap. The main question mark in
>> > >> > > > > > > > > my mind is about down-conversion. Figuring out whether
>> > >> > > > > > > > > down-conversion is needed is hard generally without
>> > >> > > > > > > > > inspecting the fetched data, which is expensive. I think we
>> > >> > > > > > > > > agree in principle that we do not want to have to pay this
>> > >> > > > > > > > > cost generally and prefer the clients to fail when they see an
>> > >> > > > > > > > > unhandled compression type. The point I was making is that
>> > >> > > > > > > > > there are some cases where we are either inspecting the data
>> > >> > > > > > > > > anyway (because we have to down-convert the message format),
>> > >> > > > > > > > > or we have an easy way to tell whether zstandard is in use
>> > >> > > > > > > > > (the topic has it configured explicitly). In the latter case,
>> > >> > > > > > > > > we don't have to handle it specially. But we do have to decide
>> > >> > > > > > > > > how we will handle down-conversion to older formats.
>> > >> > > > > > > > >
>> > >> > > > > > > > > -Jason
>> > >> > > > > > > > >
>> > >> > > > > > > > > On Sun, Aug 12, 2018 at 5:15 PM, Dongjin Lee <
>> > >> > > dong...@apache.org
>> > >> > > > >
>> > >> > > > > > > wrote:
>> > >> > > > > > > > >
>> > >> > > > > > > > > > Colin and Jason,
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Thanks for your opinions. In summary, the pros and cons of
>> > >> > > > > > > > > > bumping the fetch API version are:
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Cons:
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > - The broker can't know whether a given message batch is
>> > >> > > > > > > > > > compressed with zstd or not.
>> > >> > > > > > > > > > - Some additional logic is needed for topics explicitly
>> > >> > > > > > > > > > configured to use zstd.
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Pros:
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > - The broker doesn't need to conduct expensive
>> > >> > > > > > > > > > down-conversion.
>> > >> > > > > > > > > > - We can tell the users to update their client.
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > So, the opinions on the backward-compatibility policy so far:
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > - A: bump the API version - +2 (Colin, Jason)
>> > >> > > > > > > > > > - B: leave unchanged - +1 (Viktor)
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Here are my additional comments:
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > @Colin
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > I greatly appreciate your response. In the case of the
>> > >> > > > > > > > > > dictionary support, of course, this issue should be addressed
>> > >> > > > > > > > > > later, so we don't need it in the first version. You are
>> > >> > > > > > > > > > right - it is not too late to try it after some benchmarks.
>> > >> > > > > > > > > > What I mean is, we should keep that potential feature in
>> > >> > > > > > > > > > mind.
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > @Jason
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > You wrote,
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > > Similarly, if we have to down-convert anyway because the
>> > >> > > > > > > > > > > client does not understand the message format, then we
>> > >> > > > > > > > > > > could also use a different compression type.
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > But in my opinion, since the client will fail on the API
>> > >> > > > > > > > > > version anyway, we don't need to down-convert the messages
>> > >> > > > > > > > > > at all, do we? So I think we don't need to care about this
>> > >> > > > > > > > > > case. (I'm sorry, I am not familiar with the
>> > >> > > > > > > > > > down-conversion logic.)
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Please give more opinions. Thanks!
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > - Dongjin
>> > >> > > > > > > > > >
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > On Wed, Aug 8, 2018 at 6:41 AM Jason Gustafson <
>> > >> > > > > ja...@confluent.io
>> > >> > > > > > >
>> > >> > > > > > > > > wrote:
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > > Hey Colin,
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > The problem for the fetch API is that the broker does not
>> > >> > > > > > > > > > > generally know if a batch was compressed with zstd unless it
>> > >> > > > > > > > > > > parses it. I think the goal here is to avoid the expensive
>> > >> > > > > > > > > > > down-conversion that is needed to ensure compatibility
>> > >> > > > > > > > > > > because it is only necessary if zstd is actually in use. But
>> > >> > > > > > > > > > > as long as old clients can parse the message format, they
>> > >> > > > > > > > > > > should get a reasonable error if they see an unsupported
>> > >> > > > > > > > > > > compression type in the attributes. Basically the onus is on
>> > >> > > > > > > > > > > users to ensure that their consumers have been updated prior
>> > >> > > > > > > > > > > to using zstd. It seems like a reasonable tradeoff to me.
>> > >> > > > > > > > > > > There are a couple cases that might be worth thinking
>> > >> > > > > > > > > > > through:
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > 1. If a topic is explicitly configured to use zstd, then we
>> > >> > > > > > > > > > > don't need to check the fetched data for the compression
>> > >> > > > > > > > > > > type to know if we need down-conversion. If we did bump the
>> > >> > > > > > > > > > > Fetch API version, then we could handle this case by either
>> > >> > > > > > > > > > > down-converting using a different compression type or
>> > >> > > > > > > > > > > returning an error.
>> > >> > > > > > > > > > > 2. Similarly, if we have to down-convert anyway because the
>> > >> > > > > > > > > > > client does not understand the message format, then we could
>> > >> > > > > > > > > > > also use a different compression type.
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > For the produce API, I think it's reasonable to bump the api
>> > >> > > > > > > > > > > version. This can be used by clients to check whether a
>> > >> > > > > > > > > > > broker supports zstd. For example, we might support a list
>> > >> > > > > > > > > > > of preferred compression types in the producer and we could
>> > >> > > > > > > > > > > use the broker to detect which version to use.
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > -Jason
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > On Tue, Aug 7, 2018 at 1:32 PM, Colin McCabe <
>> > >> > > > > cmcc...@apache.org
>> > >> > > > > > >
>> > >> > > > > > > > > wrote:
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > > Thanks for bumping this, Dongjin.  ZStd is a good
>> > >> > > > > > > > > > > > compression codec and I hope we can get this support in
>> > >> > > > > > > > > > > > soon!
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > I would say we can just bump the API version to indicate
>> > >> > > > > > > > > > > > that ZStd support is expected in new clients.  We probably
>> > >> > > > > > > > > > > > need some way of indicating to the older clients that they
>> > >> > > > > > > > > > > > can't consume the partitions, as well.  Perhaps we can use
>> > >> > > > > > > > > > > > the UNSUPPORTED_FOR_MESSAGE_FORMAT error?
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > The license thing seems straightforward -- it's just a
>> > >> > > > > > > > > > > > matter of adding the text to the right files as per ASF
>> > >> > > > > > > > > > > > guidelines.
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > With regard to the dictionary support, do we really need
>> > >> > > > > > > > > > > > that in the first version?  Hopefully message batches are
>> > >> > > > > > > > > > > > big enough that this isn't needed.  Some benchmarks might
>> > >> > > > > > > > > > > > help here.
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > best,
>> > >> > > > > > > > > > > > Colin
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > On Tue, Aug 7, 2018, at 08:02, Dongjin Lee
>> wrote:
>> > >> > > > > > > > > > > > > As Kafka 2.0.0 was released, let's reboot this issue,
>> > >> > > > > > > > > > > > > KIP-110
>> > >> > > > > > > > > > > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-110%3A+Add+Codec+for+ZStandard+Compression>.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > For newcomers, here is a summary of the history: KIP-110
>> > >> > > > > > > > > > > > > was originally written for the issue KAFKA-4514, but it
>> > >> > > > > > > > > > > > > lacked benchmark results to get the agreement of the
>> > >> > > > > > > > > > > > > community. Later, Ivan Babrou and some other users who
>> > >> > > > > > > > > > > > > adopted the patch provided their excellent performance
>> > >> > > > > > > > > > > > > report, which is now included in the KIP, but it was
>> > >> > > > > > > > > > > > > postponed again because the community was busy with the
>> > >> > > > > > > > > > > > > 2.0.0 release. That is why I am now rebooting this issue.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > The following is the current status of the feature: you
>> > >> > > > > > > > > > > > > can check the current draft implementation here
>> > >> > > > > > > > > > > > > <https://github.com/apache/kafka/pull/2267>. It is based
>> > >> > > > > > > > > > > > > on zstd 1.3.5 and periodically rebased onto the latest
>> > >> > > > > > > > > > > > > trunk[^1].
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > The issues that should be addressed are as follows:
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > *1. Backward Compatibility*
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > To support old consumers, we need to choose a strategy
>> > >> > > > > > > > > > > > > for handling them. Current candidates are:
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > - Bump API version
>> > >> > > > > > > > > > > > > - Leave unchanged: let the old clients fail.
>> > >> > > > > > > > > > > > > - Improve the error messages:
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > *2. Dictionary Support*
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > To support zstd's dictionary feature in the
>> > future
>> > >> > (if
>> > >> > > > > > needed),
>> > >> > > > > > > > we
>> > >> > > > > > > > > > need
>> > >> > > > > > > > > > > > to
>> > >> > > > > > > > > > > > > sketch how it should be and leave some room
>> for
>> > >> it.
>> > >> > As
>> > >> > > of
>> > >> > > > > > now,
>> > >> > > > > > > > > there
>> > >> > > > > > > > > > > has
>> > >> > > > > > > > > > > > > been no discussion on this topic yet.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > *3. License*
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > To use this feature, we need to add the licenses
>> of
>> > >> zstd
>> > >> > and
>> > >> > > > > > > zstd-jni
>> > >> > > > > > > > to
>> > >> > > > > > > > > > the
>> > >> > > > > > > > > > > > > project. (Thanks to Viktor Somogyi for
>> raising
>> > >> this
>> > >> > > > issue!)
>> > >> > > > > > It
>> > >> > > > > > > > > seems
>> > >> > > > > > > > > > > like
>> > >> > > > > > > > > > > > > what Apache Spark did would be a good example
>> > but
>> > >> > there
>> > >> > > > has
>> > >> > > > > > > been
>> > >> > > > > > > > no
>> > >> > > > > > > > > > > > > discussion yet.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > You can find the details of the above issues
>> in
>> > >> the
>> > >> > KIP
>> > >> > > > > > > document.
>> > >> > > > > > > > > > > Please
>> > >> > > > > > > > > > > > > have a look when you are free, and give me
>> > >> feedback.
>> > >> > > All
>> > >> > > > > > kinds
>> > >> > > > > > > of
>> > >> > > > > > > > > > > > > participating are welcome.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > Best,
>> > >> > > > > > > > > > > > > Dongjin
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > [^1]: At the time of writing, commit
>> 6b4fb8152.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > On Sat, Jul 14, 2018 at 10:45 PM Dongjin Lee
>> <
>> > >> > > > > > > dong...@apache.org
>> > >> > > > > > > > >
>> > >> > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > Sorry for the late reply.
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > In short, I could not submit the updated
>> KIP
>> > by
>> > >> the
>> > >> > > > > feature
>> > >> > > > > > > > > freeze
>> > >> > > > > > > > > > > > > > deadline of 2.0.0. For this reason, it will
>> > not
>> > >> be
>> > >> > > > > included
>> > >> > > > > > > in
>> > >> > > > > > > > > the
>> > >> > > > > > > > > > > > 2.0.0
>> > >> > > > > > > > > > > > > > release and all discussion for this issue
>> was
>> > >> > > > postponed until
>> > >> > > > > > > after
>> > >> > > > > > > > > the
>> > >> > > > > > > > > > > > release
>> > >> > > > > > > > > > > > > > of 2.0.0.
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > I have been updating the PR following
>> recent
>> > >> > updates.
>> > >> > > > > Just
>> > >> > > > > > > > now, I
>> > >> > > > > > > > > > > > rebased
>> > >> > > > > > > > > > > > > > it against the latest trunk and updated the
>> > zstd
>> > >> > > > version
>> > >> > > > > > into
>> > >> > > > > > > > > > 1.3.5.
>> > >> > > > > > > > > > > > If you
>> > >> > > > > > > > > > > > > > need some request, don't hesitate to notify
>> > me.
>> > >> > (But
>> > >> > > > not
>> > >> > > > > > this
>> > >> > > > > > > > > > thread
>> > >> > > > > > > > > > > -
>> > >> > > > > > > > > > > > just
>> > >> > > > > > > > > > > > > > send me the message directly.)
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > Best,
>> > >> > > > > > > > > > > > > > Dongjin
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > On Tue, Jul 10, 2018 at 11:57 PM Bobby
>> Evans <
>> > >> > > > > > > bo...@apache.org
>> > >> > > > > > > > >
>> > >> > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > >> Is there any update on this?  The
>> performance
>> > >> > > > > improvements
>> > >> > > > > > > are
>> > >> > > > > > > > > > quite
>> > >> > > > > > > > > > > > > >> impressive and I really would like to stop
>> > >> forking
>> > >> > > > kafka
>> > >> > > > > > > just
>> > >> > > > > > > > to
>> > >> > > > > > > > > > get
>> > >> > > > > > > > > > > > this
>> > >> > > > > > > > > > > > > >> in.
>> > >> > > > > > > > > > > > > >>
>> > >> > > > > > > > > > > > > >> Thanks,
>> > >> > > > > > > > > > > > > >>
>> > >> > > > > > > > > > > > > >> Bobby
>> > >> > > > > > > > > > > > > >>
>> > >> > > > > > > > > > > > > >> On Wed, Jun 13, 2018 at 8:56 PM Dongjin
>> Lee <
>> > >> > > > > > > > dong...@apache.org
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > > > >>
>> > >> > > > > > > > > > > > > >> > Ismael,
>> > >> > > > > > > > > > > > > >> >
>> > >> > > > > > > > > > > > > > > >> > Oh, I forgot all of you are in a working
>> > frenzy
>> > >> > for
>> > >> > > > 2.0!
>> > >> > > > > > No
>> > >> > > > > > > > > > problem,
>> > >> > > > > > > > > > > > take
>> > >> > > > > > > > > > > > > > > >> > your time. I am also working on another
>> > issue
>> > >> > now.
>> > >> > > > > Thank
>> > >> > > > > > > you
>> > >> > > > > > > > > for
>> > >> > > > > > > > > > > > > >> letting me
>> > >> > > > > > > > > > > > > >> > know.
>> > >> > > > > > > > > > > > > >> >
>> > >> > > > > > > > > > > > > >> > Best,
>> > >> > > > > > > > > > > > > >> > Dongjin
>> > >> > > > > > > > > > > > > >> >
>> > >> > > > > > > > > > > > > >> > On Wed, Jun 13, 2018, 11:44 PM Ismael
>> Juma
>> > <
>> > >> > > > > > > > isma...@gmail.com
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > > > >> >
>> > >> > > > > > > > > > > > > >> > > Sorry for the delay Dongjin. Everyone
>> is
>> > >> busy
>> > >> > > > > > finalising
>> > >> > > > > > > > > > 2.0.0.
>> > >> > > > > > > > > > > > This
>> > >> > > > > > > > > > > > > >> KIP
>> > >> > > > > > > > > > > > > >> > > seems like a great candidate for 2.1.0
>> > and
>> > >> > > > hopefully
>> > >> > > > > > > there
>> > >> > > > > > > > > > will
>> > >> > > > > > > > > > > be
>> > >> > > > > > > > > > > > > >> more
>> > >> > > > > > > > > > > > > >> > of
>> > >> > > > > > > > > > > > > >> > > a discussion next week. :)
>> > >> > > > > > > > > > > > > >> > >
>> > >> > > > > > > > > > > > > >> > > Ismael
>> > >> > > > > > > > > > > > > >> > >
>> > >> > > > > > > > > > > > > >> > > On Wed, 13 Jun 2018, 05:17 Dongjin
>> Lee, <
>> > >> > > > > > > > dong...@apache.org
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > > > >> > >
>> > >> > > > > > > > > > > > > >> > > > Hello. I just updated my draft
>> > >> > implementation:
>> > >> > > > > > > > > > > > > >> > > >
>> > >> > > > > > > > > > > > > >> > > > 1. Rebased to latest trunk (commit
>> > >> 5145d6b)
>> > >> > > > > > > > > > > > > > > >> > > > 2. Applied zstd 1.3.4
>> > >> > > > > > > > > > > > > >> > > >
>> > >> > > > > > > > > > > > > >> > > > You can check out the implementation
>> > from
>> > >> > here
>> > >> > > > > > > > > > > > > >> > > > <
>> > >> https://github.com/apache/kafka/pull/2267
>> > >> > >.
>> > >> > > If
>> > >> > > > > you
>> > >> > > > > > > > > > > experience
>> > >> > > > > > > > > > > > any
>> > >> > > > > > > > > > > > > >> > > problem
>> > >> > > > > > > > > > > > > >> > > > running it, don't hesitate to give
>> me a
>> > >> > > mention.
>> > >> > > > > > > > > > > > > >> > > >
>> > >> > > > > > > > > > > > > >> > > > Best,
>> > >> > > > > > > > > > > > > >> > > > Dongjin
>> > >> > > > > > > > > > > > > >> > > >
>> > >> > > > > > > > > > > > > >> > > > On Tue, Jun 12, 2018 at 6:50 PM
>> Dongjin
>> > >> Lee
>> > >> > <
>> > >> > > > > > > > > > > dong...@apache.org
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > >> > wrote:
>> > >> > > > > > > > > > > > > >> > > >
>> > >> > > > > > > > > > > > > >> > > > > Here is the short conclusion about
>> > the
>> > >> > > license
>> > >> > > > > > > > problem:
>> > >> > > > > > > > > > *We
>> > >> > > > > > > > > > > > can
>> > >> > > > > > > > > > > > > >> use
>> > >> > > > > > > > > > > > > >> > > zstd
>> > >> > > > > > > > > > > > > >> > > > > and zstd-jni without any problem,
>> but
>> > >> we
>> > >> > > need
>> > >> > > > to
>> > >> > > > > > > > include
>> > >> > > > > > > > > > > their
>> > >> > > > > > > > > > > > > >> > license,
>> > >> > > > > > > > > > > > > >> > > > > e.g., BSD license.*
>> > >> > > > > > > > > > > > > >> > > > >
>> > >> > > > > > > > > > > > > >> > > > > Both of BSD 2 Clause License & 3
>> > Clause
>> > >> > > > License
>> > >> > > > > > > > require us
>> > >> > > > > > > > > > to
>> > >> > > > > > > > > > > > > >> include
>> > >> > > > > > > > > > > > > >> > the
>> > >> > > > > > > > > > > > > >> > > > > license used, and BSD 3 Clause
>> > License
>> > >> > > > requires
>> > >> > > > > > that
>> > >> > > > > > > > the
>> > >> > > > > > > > > > > name
>> > >> > > > > > > > > > > > of
>> > >> > > > > > > > > > > > > >> the
>> > >> > > > > > > > > > > > > >> > > > > contributor can't be used to
>> endorse
>> > or
>> > >> > > > promote
>> > >> > > > > > the
>> > >> > > > > > > > > > product.
>> > >> > > > > > > > > > > > > >> That's
>> > >> > > > > > > > > > > > > >> > it
>> > >> > > > > > > > > > > > > >> > > > > <
>> > >> > > > > > > > > > > > > >> > > >
>> > >> > > > > > > > > > > > > >> > >
>> > >> > > > > > > > > > > > > >> >
>> > >> > > > > > > > > > > > > >>
>> > >> > http://www.mikestratton.net/2011/12/is-bsd-license-
>> > >> > > > > > > > > > > > compatible-with-apache-2-0-license/
>> > >> > > > > > > > > > > > > >> > > > >
>> > >> > > > > > > > > > > > > >> > > > > - They are not listed in the list
>> of
>> > >> > > > prohibited
>> > >> > > > > > > > licenses
>> > >> > > > > > > > > > > > > > > >> > > > > <https://www.apache.org/legal/resolved.html#category-x>
>> > >> > > > > > > > > > > also.
>> > >> > > > > > > > > > > > > >> > > > >
>> > >> > > > > > > > > > > > > > > >> > > > > Here is how Spark handled it
>> > >> > > > > > > > > > > > > > > >> > > > > <https://issues.apache.org/jira/browse/SPARK-19112>:
>> > >> > > > > > > > > > > > > >> > > > >
>> > >> > > > > > > > > > > > > >> > > > > - They made a directory dedicated
>> to
>> > >> the
>> > >> > > > > > dependency
>> > >> > > > > > > > > > license
>> > >> > > > > > > > > > > > files
>> > >> > > > > > > > > > > > > >> > > > > <
>> > >> > > > > > > https://github.com/apache/spark/tree/master/licenses
>> > >> > > > > > > > >
>> > >> > > > > > > > > > and
>> > >> > > > > > > > > > > > added
>> > >> > > > > > > > > > > > > >> > > > licenses
>> > >> > > > > > > > > > > > > >> > > > > for Zstd
>> > >> > > > > > > > > > > > > >> > > > > <
>> > >> > > > > > > > > > > > > >> >
>> > >> > > > https://github.com/apache/spark/blob/master/licenses/
>> > >> > > > > > > > > > > > LICENSE-zstd.txt
>> > >> > > > > > > > > > > > > >> > > >
>> > >> > > > > > > > > > > > > >> > > > &
>> > >> > > > > > > > > > > > > >> > > > > Zstd-jni
>> > >> > > > > > > > > > > > > >> > > > > <
>> > >> > > > > > > > > > > > > >> > > >
>> > >> > > > > > > > > > > > > >> > >
>> > >> > > > > > > > > > > > > >> >
>> > >> > > > > > > > > > > > > >> https://github.com/apache/
>> > >> > > spark/blob/master/licenses/
>> > >> > > > > > > > > > > > LICENSE-zstd-jni.txt
>> > >> > > > > > > > > > > > > >> >
>> > >> > > > > > > > > > > > > >> > > > > .
>> > >> > > > > > > > > > > > > >> > > > > - Added a link to the original
>> > license
>> > >> > files
>> > >> > > > in
>> > >> > > > > > > > LICENSE.
>> > >> > > > > > > > > > > > > >> > > > > <
>> > >> > > > > https://github.com/apache/spark/pull/18805/files
>> > >> > > > > > >
>> > >> > > > > > > > > > > > > >> > > > >
>> > >> > > > > > > > > > > > > >> > > > > If needed, I can make a similar
>> > update.
>> > >> > > > > > > > > > > > > >> > > > >
>> > >> > > > > > > > > > > > > >> > > > > Thanks for pointing out this
>> problem,
>> > >> > > Viktor!
>> > >> > > > > Nice
>> > >> > > > > > > > > catch!
>> > >> > > > > > > > > > > > > >> > > > >
>> > >> > > > > > > > > > > > > >> > > > > Best,
>> > >> > > > > > > > > > > > > >> > > > > Dongjin
>> > >> > > > > > > > > > > > > >> > > > >
>> > >> > > > > > > > > > > > > >> > > > >
>> > >> > > > > > > > > > > > > >> > > > >
>> > >> > > > > > > > > > > > > >> > > > > On Mon, Jun 11, 2018 at 11:50 PM
>> > >> Dongjin
>> > >> > > Lee <
>> > >> > > > > > > > > > > > dong...@apache.org>
>> > >> > > > > > > > > > > > > >> > > wrote:
>> > >> > > > > > > > > > > > > >> > > > >
>> > >> > > > > > > > > > > > > >> > > > >> I greatly appreciate your
>> > >> comprehensive
>> > >> > > > > > reasoning.
>> > >> > > > > > > > so:
>> > >> > > > > > > > > +1
>> > >> > > > > > > > > > > > for b
>> > >> > > > > > > > > > > > > >> > until
>> > >> > > > > > > > > > > > > >> > > > now.
>> > >> > > > > > > > > > > > > >> > > > >>
>> > >> > > > > > > > > > > > > >> > > > >> For the license issues, I will
>> have
>> > a
>> > >> > check
>> > >> > > > on
>> > >> > > > > > how
>> > >> > > > > > > > the
>> > >> > > > > > > > > > other
>> > >> > > > > > > > > > > > > >> projects
>> > >> > > > > > > > > > > > > >> > > are
>> > >> > > > > > > > > > > > > >> > > > >> doing and share the results.
>> > >> > > > > > > > > > > > > >> > > > >>
>> > >> > > > > > > > > > > > > >> > > > >> Best,
>> > >> > > > > > > > > > > > > >> > > > >> Dongjin
>> > >> > > > > > > > > > > > > >> > > > >>
>> > >> > > > > > > > > > > > > >> > > > >> On Mon, Jun 11, 2018 at 10:08 PM
>> > >> Viktor
>> > >> > > > > Somogyi <
>> > >> > > > > > > > > > > > > >> > > > viktorsomo...@gmail.com>
>> > >> > > > > > > > > > > > > >> > > > >> wrote:
>> > >> > > > > > > > > > > > > >> > > > >>
>> > >> > > > > > > > > > > > > >> > > > >>> Hi Dongjin,
>> > >> > > > > > > > > > > > > >> > > > >>>
>> > >> > > > > > > > > > > > > >> > > > >>> A couple of comments:
>> > >> > > > > > > > > > > > > >> > > > >>> I would vote for option b. in
>> the
>> > >> > > "backward
>> > >> > > > > > > > > > compatibility"
>> > >> > > > > > > > > > > > > >> section.
>> > >> > > > > > > > > > > > > >> > > My
>> > >> > > > > > > > > > > > > >> > > > >>> reasoning for this is that users
>> > >> > upgrading
>> > >> > > > to
>> > >> > > > > a
>> > >> > > > > > > zstd
>> > >> > > > > > > > > > > > compatible
>> > >> > > > > > > > > > > > > >> > > version
>> > >> > > > > > > > > > > > > >> > > > >>> won't start to use it
>> > automatically,
>> > >> so
>> > >> > > > manual
>> > >> > > > > > > > > > > > reconfiguration
>> > >> > > > > > > > > > > > > >> is
>> > >> > > > > > > > > > > > > >> > > > >>> required.
>> > >> > > > > > > > > > > > > >> > > > >>> Therefore an upgrade won't mess
>> up
>> > >> the
>> > >> > > > > cluster.
>> > >> > > > > > If
>> > >> > > > > > > > not
>> > >> > > > > > > > > > all
>> > >> > > > > > > > > > > > the
>> > >> > > > > > > > > > > > > >> > > clients
>> > >> > > > > > > > > > > > > >> > > > >>> are
>> > >> > > > > > > > > > > > > >> > > > >>> upgraded but just some of them
>> and
>> > >> > they'd
>> > >> > > > > start
>> > >> > > > > > to
>> > >> > > > > > > > use
>> > >> > > > > > > > > > > zstd
>> > >> > > > > > > > > > > > > >> then it
>> > >> > > > > > > > > > > > > >> > > > would
>> > >> > > > > > > > > > > > > >> > > > >>> cause errors in the cluster. I'd
>> > >> like to
>> > >> > > > > presume
>> > >> > > > > > > > > though
>> > >> > > > > > > > > > > that
>> > >> > > > > > > > > > > > > >> this
>> > >> > > > > > > > > > > > > >> > is
>> > >> > > > > > > > > > > > > >> > > a
>> > >> > > > > > > > > > > > > >> > > > >>> very
>> > >> > > > > > > > > > > > > >> > > > >>> obvious failure case and nobody
>> > >> should
>> > >> > be
>> > >> > > > > > > surprised
>> > >> > > > > > > > if
>> > >> > > > > > > > > > it
>> > >> > > > > > > > > > > > didn't
>> > >> > > > > > > > > > > > > >> > > work.
>> > >> > > > > > > > > > > > > >> > > > >>> I wouldn't choose a. as I think
>> we
>> > >> > should
>> > >> > > > bump
>> > >> > > > > > the
>> > >> > > > > > > > > fetch
>> > >> > > > > > > > > > > and
>> > >> > > > > > > > > > > > > >> > produce
>> > >> > > > > > > > > > > > > >> > > > >>> requests if it's a change in the
>> > >> message
>> > >> > > > > format.
>> > >> > > > > > > > > > Moreover
>> > >> > > > > > > > > > > if
>> > >> > > > > > > > > > > > > >> some
>> > >> > > > > > > > > > > > > >> > of
>> > >> > > > > > > > > > > > > >> > > > the
>> > >> > > > > > > > > > > > > >> > > > >>> producers and the brokers are
>> > >> upgraded
>> > >> > but
>> > >> > > > > some
>> > >> > > > > > of
>> > >> > > > > > > > the
>> > >> > > > > > > > > > > > consumers
>> > >> > > > > > > > > > > > > >> > are
>> > >> > > > > > > > > > > > > >> > > > not,
>> > >> > > > > > > > > > > > > >> > > > >>> then we wouldn't prevent the
>> error
>> > >> when
>> > >> > > the
>> > >> > > > > old
>> > >> > > > > > > > > consumer
>> > >> > > > > > > > > > > > tries
>> > >> > > > > > > > > > > > > >> to
>> > >> > > > > > > > > > > > > >> > > > consume
>> > >> > > > > > > > > > > > > >> > > > >>> the zstd compressed messages.
>> > >> > > > > > > > > > > > > >> > > > >>> I wouldn't choose c. either as I
>> > >> think
>> > >> > > > binding
>> > >> > > > > > the
>> > >> > > > > > > > > > > > compression
>> > >> > > > > > > > > > > > > >> type
>> > >> > > > > > > > > > > > > >> > > to
>> > >> > > > > > > > > > > > > >> > > > an
>> > >> > > > > > > > > > > > > >> > > > >>> API is not so obvious from the
>> > >> > developer's
>> > >> > > > > > > > > perspective.
>> > >> > > > > > > > > > > > > >> > > > >>>
>> > >> > > > > > > > > > > > > >> > > > >>> I would also prefer to use the
>> > >> existing
>> > >> > > > > binding,
>> > >> > > > > > > > > however
>> > >> > > > > > > > > > > we
>> > >> > > > > > > > > > > > must
>> > >> > > > > > > > > > > > > >> > > > respect
>> > >> > > > > > > > > > > > > >> > > > >>> the licenses:
>> > >> > > > > > > > > > > > > >> > > > >>> "The code for these JNI
>> bindings is
>> > >> > > licenced
>> > >> > > > > > under
>> > >> > > > > > > > > > > 2-clause
>> > >> > > > > > > > > > > > BSD
>> > >> > > > > > > > > > > > > >> > > > license.
>> > >> > > > > > > > > > > > > >> > > > >>> The native Zstd library is
>> licensed
>> > >> > under
>> > >> > > > > > 3-clause
>> > >> > > > > > > > BSD
>> > >> > > > > > > > > > > > license
>> > >> > > > > > > > > > > > > >> and
>> > >> > > > > > > > > > > > > >> > > > GPL2"
>> > >> > > > > > > > > > > > > >> > > > >>> Based on the FAQ page
>> > >> > > > > > > > > > > > > > > >> > > > >>> https://www.apache.org/legal/resolved.html#category-a
>> > >> > > > > > > > > > > > > >> > > > >>> we may use 2- and 3-clause BSD
>> > >> licenses
>> > >> > > but
>> > >> > > > > the
>> > >> > > > > > > > Apache
>> > >> > > > > > > > > > > > license
>> > >> > > > > > > > > > > > > >> is
>> > >> > > > > > > > > > > > > >> > not
>> > >> > > > > > > > > > > > > >> > > > >>> compatible with GPL2. I'm hoping
>> > that
>> > >> > the
>> > >> > > > > > > "3-clause
>> > >> > > > > > > > > BSD
>> > >> > > > > > > > > > > > license
>> > >> > > > > > > > > > > > > >> and
>> > >> > > > > > > > > > > > > >> > > > GPL2"
>> > >> > > > > > > > > > > > > >> > > > >>> is really not an AND but an OR
>> in
>> > >> this
>> > >> > > case,
>> > >> > > > > but
>> > >> > > > > > > I'm
>> > >> > > > > > > > > no
>> > >> > > > > > > > > > > > lawyer,
>> > >> > > > > > > > > > > > > >> > just
>> > >> > > > > > > > > > > > > >> > > > >>> wanted
>> > >> > > > > > > > > > > > > >> > > > >>> to make the point that we should
>> > >> watch
>> > >> > out
>> > >> > > > for
>> > >> > > > > > > > > licenses.
>> > >> > > > > > > > > > > :)
>> > >> > > > > > > > > > > > > >> > > > >>>
>> > >> > > > > > > > > > > > > >> > > > >>> Regards,
>> > >> > > > > > > > > > > > > >> > > > >>> Viktor
>> > >> > > > > > > > > > > > > >> > > > >>>
>> > >> > > > > > > > > > > > > >> > > > >>>
>> > >> > > > > > > > > > > > > >> > > > >>> On Sun, Jun 10, 2018 at 3:02 AM
>> > Ivan
>> > >> > > Babrou
>> > >> > > > <
>> > >> > > > > > > > > > > > ibob...@gmail.com>
>> > >> > > > > > > > > > > > > >> > > wrote:
>> > >> > > > > > > > > > > > > >> > > > >>>
>> > >> > > > > > > > > > > > > >> > > > >>> > Hello,
>> > >> > > > > > > > > > > > > >> > > > >>> >
>> > >> > > > > > > > > > > > > >> > > > >>> > This is Ivan and I still very
>> > much
>> > >> > > support
>> > >> > > > > the
>> > >> > > > > > > > fact
>> > >> > > > > > > > > > that
>> > >> > > > > > > > > > > > zstd
>> > >> > > > > > > > > > > > > >> > > > >>> compression
>> > >> > > > > > > > > > > > > >> > > > >>> > should be included out of the
>> > box.
>> > >> > > > > > > > > > > > > >> > > > >>> >
>> > >> > > > > > > > > > > > > >> > > > >>> > Please think about the
>> > environment,
>> > >> > you
>> > >> > > > can
>> > >> > > > > > save
>> > >> > > > > > > > > > quite a
>> > >> > > > > > > > > > > > lot
>> > >> > > > > > > > > > > > > >> of
>> > >> > > > > > > > > > > > > >> > > > >>> hardware
>> > >> > > > > > > > > > > > > >> > > > >>> > with it.
>> > >> > > > > > > > > > > > > >> > > > >>> >
>> > >> > > > > > > > > > > > > >> > > > >>> > Thank you.
>> > >> > > > > > > > > > > > > >> > > > >>> >
>> > >> > > > > > > > > > > > > >> > > > >>> > On Sat, Jun 9, 2018 at 14:14
>> > >> Dongjin
>> > >> > > Lee <
>> > >> > > > > > > > > > > > dong...@apache.org>
>> > >> > > > > > > > > > > > > >> > > wrote:
>> > >> > > > > > > > > > > > > >> > > > >>> >
>> > >> > > > > > > > > > > > > >> > > > >>> > > Since there are no responses
>> > for
>> > >> a
>> > >> > > > week, I
>> > >> > > > > > > > decided
>> > >> > > > > > > > > > to
>> > >> > > > > > > > > > > > > >> > reinitiate
>> > >> > > > > > > > > > > > > >> > > > the
>> > >> > > > > > > > > > > > > >> > > > >>> > > discussion thread.
>> > >> > > > > > > > > > > > > >> > > > >>> > >
>> > >> > > > > > > > > > > > > >> > > > >>> > >
>> > >> > > > > > > > > > > > > >> > > > >>> > >
>> > >> > > > > > > > > > > > > >> > > > >>> >
>> > >> > > > > > > > > > > > > >> > > > >>>
>> > >> > > > > > > > > > > > > >> > > >
>> > >> >
>>
>> --
>> *Dongjin Lee*
>>
>> *A hitchhiker in the mathematical world.*
>>
>> *github: github.com/dongjinleekr <http://github.com/dongjinleekr>
>> linkedin: kr.linkedin.com/in/dongjinleekr <http://kr.linkedin.com/in/dongjinleekr>
>> slideshare: www.slideshare.net/dongjinleekr <http://www.slideshare.net/dongjinleekr>*
>>
>
