I've trimmed the inline contents as this mail is getting too big for the
apache mailing list software to deliver :-(

1. the important thing for interoperability is for different "interested
parties" (plugins, infra layers/wrappers, user code) to be able to stick
pieces of metadata onto msgs without getting in each other's way. a common
key scheme (Strings, as of the time of this writing?) is all that's required
for that. it is assumed that the other end interested in any such piece of
metadata knows the encoding, and byte[] provides the most flexibility.
i believe this is the same logic behind core kafka being byte[]/byte[] -
Strings are more "usable" but bytes are more flexible and so were chosen.
Also - core kafka doesn't even do that good of a job on usability of the
payload (example - i have to specify the no-op byte[] "decoders" explicitly
in conf), and again sacrifices usability for the sake of performance (no
convenient single-record processing as poll is a batch, lots of obscure
little config details exposing internals of the batching mechanism, etc.)
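
A minimal sketch of the idea in point 1, assuming headers are modelled as a
plain String-to-byte[] map. The class name HeaderBagSketch and the keys
"tracing.id" and "infra.sent-at" are made up for illustration; this is not
the KIP-82 API. Two parties attach values under their own keys, each with
its own encoding, and neither needs to understand the other's bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not the KIP-82 API.
final class HeaderBagSketch {
    // String keys are the shared scheme; values stay opaque bytes.
    private final Map<String, byte[]> headers = new LinkedHashMap<>();

    void put(String key, byte[] value) {
        headers.put(key, value.clone());
    }

    // Returns a copy of the value, or null if the key is absent.
    byte[] get(String key) {
        byte[] value = headers.get(key);
        return value == null ? null : value.clone();
    }

    public static void main(String[] args) {
        HeaderBagSketch msg = new HeaderBagSketch();
        // A tracing plugin writes a UTF-8 string under its own (assumed) key.
        msg.put("tracing.id", "abc-123".getBytes(StandardCharsets.UTF_8));
        // An infra wrapper writes a big-endian long under a different key.
        msg.put("infra.sent-at",
                ByteBuffer.allocate(Long.BYTES).putLong(1487116560000L).array());

        // Each reader decodes only the key it owns, with the encoding it chose.
        String traceId = new String(msg.get("tracing.id"), StandardCharsets.UTF_8);
        long sentAt = ByteBuffer.wrap(msg.get("infra.sent-at")).getLong();
        System.out.println(traceId + " " + sentAt);
    }
}
```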

this is also why i really dislike the idea of a "type system" for header
values. it further degrades usability, adds complexity and will
eventually get in people's way. also, it would be the 2nd/3rd home-grown
serialization mechanism in core kafka (counting 2 iterations of the "type
definition DSL").
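
For comparison, a rough sketch of what a "leading type byte" value encoding
(the typed alternative described in the quoted mail below) could look like.
The tag values and the class name TypedValueSketch are assumptions made up
here, not taken from the KIP. The point is only that every reader has to
branch on the tag before it can touch the payload.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; the tag values are invented for illustration.
final class TypedValueSketch {
    static final byte TYPE_BYTES = 0;   // assumed tag
    static final byte TYPE_STRING = 1;  // assumed tag
    static final byte TYPE_INT = 2;     // assumed tag

    static byte[] encodeBytes(byte[] raw) {
        return ByteBuffer.allocate(1 + raw.length).put(TYPE_BYTES).put(raw).array();
    }

    static byte[] encodeString(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(1 + utf8.length).put(TYPE_STRING).put(utf8).array();
    }

    static byte[] encodeInt(int i) {
        return ByteBuffer.allocate(1 + Integer.BYTES).put(TYPE_INT).putInt(i).array();
    }

    // Reads the leading tag, then decodes the remaining bytes accordingly.
    static Object decode(byte[] value) {
        ByteBuffer buf = ByteBuffer.wrap(value);
        byte tag = buf.get();
        switch (tag) {
            case TYPE_STRING:
                return new String(value, 1, value.length - 1, StandardCharsets.UTF_8);
            case TYPE_INT:
                return buf.getInt();
            case TYPE_BYTES:
                byte[] raw = new byte[buf.remaining()];
                buf.get(raw);
                return raw;
            default:
                throw new IllegalArgumentException("unknown type tag: " + tag);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(encodeString("hello")));
        System.out.println(decode(encodeInt(42)));
    }
}
```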

2. this is an implementation detail, and not even a very "user facing" one?
to the best of my understanding the vote process is on proposed
API/behaviour. also - since we're willing to go with strings, just serialize
a 0-sized header blob and IIUC you don't need any optionals anymore.
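
A minimal sketch of the "0-sized header blob" idea, assuming a layout of an
int32 header count followed by length-prefixed key/value pairs. That layout
and the class name HeaderBlobSketch are assumptions for illustration, not
the wire format from the KIP. With no headers the blob is just the 4-byte
count, which would match the 4 bytes of overhead mentioned in the quoted
mail, and no optional field is needed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the layout below is assumed, not the KIP wire format.
final class HeaderBlobSketch {
    // Assumed layout: int32 count, then per header:
    // int16 key length, key bytes (UTF-8), int32 value length, value bytes.
    static byte[] encode(Map<String, byte[]> headers) {
        int size = Integer.BYTES;
        for (Map.Entry<String, byte[]> e : headers.entrySet()) {
            int keyLen = e.getKey().getBytes(StandardCharsets.UTF_8).length;
            if (keyLen > Short.MAX_VALUE) {
                throw new IllegalArgumentException("key too long: " + e.getKey());
            }
            size += Short.BYTES + keyLen + Integer.BYTES + e.getValue().length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(headers.size());
        for (Map.Entry<String, byte[]> e : headers.entrySet()) {
            byte[] key = e.getKey().getBytes(StandardCharsets.UTF_8);
            buf.putShort((short) key.length);
            buf.put(key);
            buf.putInt(e.getValue().length);
            buf.put(e.getValue());
        }
        return buf.array();
    }

    static Map<String, byte[]> decode(byte[] blob) {
        ByteBuffer buf = ByteBuffer.wrap(blob);
        int count = buf.getInt();
        Map<String, byte[]> headers = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            byte[] key = new byte[buf.getShort()];
            buf.get(key);
            byte[] value = new byte[buf.getInt()];
            buf.get(value);
            headers.put(new String(key, StandardCharsets.UTF_8), value);
        }
        return headers;
    }

    public static void main(String[] args) {
        // An empty header set is always present and costs only the count.
        System.out.println(encode(new LinkedHashMap<>()).length);
    }
}
```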

3. yes, we can :-)

On Tue, Feb 14, 2017 at 11:56 PM, Michael Pearce <michael.pea...@ig.com>
wrote:

> Hi Jay,
>
> 1) There was some initial debate on the value part; as you'll note, String,
> String headers were discounted early on. The reason for this is flexibility
> and keeping in line with the flexibility of key, value of the message
> object itself. I don’t think it takes away from an ecosystem as each plugin
> will care for their own key; this way ints, booleans, exotic custom binary
> can all be catered for.
> a. If you really wanted to push for a typed value interface, I wouldn’t
> want just String values supported, but the primitives plus string, while
> also still keeping the ability to have a binary for custom binaries that
> some organisations may have.
> i. I have written this slight alternative here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-82+-+Add+Record+Headers+-+Typed
> ii. Essentially the value bytes have a leading byte of overhead.
> 1.  This tells you what type the value is, before reading the rest of the
> bytes, allowing serialisation/deserialisation to and from the primitives,
> string and byte[]. This is akin to some other messaging systems.
> 2) We are making it optional, so that those not wanting headers have 0
> bytes overhead (think of it as a feature flag). I don’t think this is
> complex, especially compared to changes proposed in other KIPs like
> KIP-98.
> a. If you really really don’t like this, we can drop it, but it would mean
> buying into 4 bytes extra overhead for users who do not want to use headers.
> 3) In the summary yes, it is at a higher level, but I think this is well
> documented in the proposed changes section.
> a. Added getHeaders method to Producer/Consumer record (that is it)
> b. We’ve also detailed the new Headers class that this method returns that
> encapsulates the headers protocol and logic.
>
> Best,
> Mike
>
> ==Original questions from the vote thread from Jay.==
>
> Couple of things I think we still need to work out:
>
>    1. I think we agree about the key, but I think we haven't talked about
>    the value yet. I think if our goal is an open ecosystem of these headers
>    spread across many plugins from many systems we should consider making
>    this a string as well so it can be printed, set via a UI, set in config,
>    etc. Basically encouraging pluggable serialization formats here will
>    lead to a bit of a tower of babel.
>    2. This proposal still includes a pretty big change to our serialization
>    and protocol definition layer. Essentially it is introducing an optional
>    type, where the format is data dependent. I think this is actually a big
>    change though it doesn't seem like it. It means you can no longer
>    specify this type with our type definition DSL, and likewise it requires
>    custom handling in client libs. This isn't a huge thing, since the Record
>    definition is custom anyway, but I think this kind of protocol
>    inconsistency is very non-desirable and ties you to hand-coding things.
>    I think the type should instead be [Key Value] in our BNF, where key and
>    value are both short strings as used elsewhere. This brings it in line
>    with the rest of the protocol.
>    3. Could we get more specific about the exact Java API change to
>    ProducerRecord, ConsumerRecord, Record, etc?
>
> -Jay
>
