Bhaskar Gollapudi created KAFKA-5761:
----------------------------------------
Summary: Serializer API should support ByteBuffer
Key: KAFKA-5761
URL: https://issues.apache.org/jira/browse/KAFKA-5761
Project: Kafka
Issue Type: Improvement
Components: clients
Affects Versions: 0.11.0.0
Reporter: Bhaskar Gollapudi

Consider the Serializer. Its main method is:

    byte[] serialize(String topic, T data);

Producer applications create an implementation that takes an instance (of T) and converts it to a byte[]. This byte array is allocated anew for each message, and is then handed over to the Kafka Producer API internals, which write the bytes to a buffer/network socket.

When the next message arrives, the serializer should try to reuse the existing byte[] for the new message instead of creating a new one. This requires two things:

1. The hand-off of the bytes to the buffer/socket and the reuse of the byte[] must happen on the same thread.
2. There must be a way to mark the end of the available bytes in the byte[].

The first is reasonably simple to understand. If it does not hold, and without other synchronization, the byte[] gets corrupted, and so does the message written to the buffer/socket. However, this requirement is easy to meet for a producer application, because it controls the threads on which the serializer is invoked.

The second is where the problem lies with the current API. It does not allow a variable number of bytes to be read from a container; it is limited by the byte[]'s length. This forces the producer to either:

1. create a new byte[] for any message that is bigger than the previous one, or
2. decide on a maximum size and use padding.

Both are cumbersome and error prone, and may waste network bandwidth. Instead, if there were a Serializer with this method:

    ByteBuffer serialize(String topic, T data);

it would help clients implement a reusable bytes container and avoid an allocation for each message.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
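A minimal sketch of what such a serializer could look like. The `ByteBufferSerializer` interface and `ReusableStringSerializer` class are hypothetical names, not part of the Kafka API; the point is that a ByteBuffer's position/limit mark the end of the valid bytes, so one buffer can be reused across messages of varying sizes:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical ByteBuffer-based serializer interface, as proposed in this issue.
interface ByteBufferSerializer<T> {
    ByteBuffer serialize(String topic, T data);
}

// Illustrative implementation that reuses one growable buffer. The returned
// buffer's limit marks the end of this message's bytes, so no padding or
// per-message byte[] allocation is needed. The caller must consume the bytes
// before the next serialize() call, on the same thread.
class ReusableStringSerializer implements ByteBufferSerializer<String> {
    private ByteBuffer buffer = ByteBuffer.allocate(64); // initial capacity

    @Override
    public ByteBuffer serialize(String topic, String data) {
        // Note: getBytes() itself allocates; a production version could use a
        // CharsetEncoder to encode directly into the buffer. Kept simple here.
        byte[] encoded = data.getBytes(StandardCharsets.UTF_8);
        if (buffer.capacity() < encoded.length) {
            // Grow only when a message exceeds the current capacity.
            buffer = ByteBuffer.allocate(Math.max(encoded.length, buffer.capacity() * 2));
        }
        buffer.clear();
        buffer.put(encoded);
        buffer.flip(); // limit now marks the end of the available bytes
        return buffer;
    }
}
```

Successive calls return the same underlying buffer (until a larger message forces growth), with `remaining()` giving the variable length of each message, which is exactly the "end of available bytes" marker the byte[]-based API lacks.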