Hi,

I've noticed something a bit surprising (to me at least) when a Kafka 0.8
producer writes to a Kafka 0.10 cluster and the messages are subsequently
processed by a Kafka Connect sink.  The messages are Avro encoded (a suitable
Avro key/value converter is specified in worker.properties), which makes the
raw bytes a little difficult to inspect, but creating a String from the byte[]
gives a reasonable idea of the message contents.
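
Since a String built from Avro-encoded bytes hides any non-printable bytes, a
hex dump can make the same inspection less ambiguous.  The following is only a
minimal sketch: PayloadDump and its methods are hypothetical names (not part of
Kafka Connect), and the sample bytes are made up for illustration.

```java
// Hypothetical helper, not part of Kafka Connect: renders a byte[] both as
// hex and as printable ASCII so that non-printable bytes stay visible.
final class PayloadDump {

    // Returns the bytes as space-separated two-digit lowercase hex values.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    // Returns printable ASCII characters as-is and replaces the rest with '.'.
    static String toPrintable(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            int v = b & 0xff;
            sb.append(v >= 0x20 && v < 0x7f ? (char) v : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample bytes, not taken from a real message.
        byte[] sample = {0x48, 0x61, 0x62, 0x00, (byte) 0xff};
        System.out.println(toHex(sample));
        System.out.println(toPrintable(sample));
    }
}
```

Calling both methods on the byte[] passed to toConnectData() would show
exactly which bytes arrive on each invocation.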

The general pattern seems to be:

The toConnectData() method in the org.apache.kafka.connect.storage.Converter 
interface takes 2 parameters: the topic and a byte[].

On the first invocation, the byte[] contains only part of the message,
specifically the id attribute.  Unsurprisingly, this causes the Avro decoder
to fail with an EOFException.  A second invocation of toConnectData() then
follows, and this time the byte[] contains an entire, parseable message.  In
that second byte[], the message id component is prefixed by the character 'H'
(i.e. 'H' is the first element of the byte[]).
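
One possible reading of the leading 'H', offered as an assumption rather than
a confirmed diagnosis: Avro's binary encoding writes a string as a zigzag
varint length followed by the UTF-8 bytes, so a leading byte of 0x48 ('H')
would decode to a length of 36.  The sketch below shows that decoding.
AvroLengthPrefix is a hypothetical name, and whether the id here really is a
36-character string is not known from the observations above.

```java
// Hypothetical helper: decodes the zigzag varint that Avro's binary encoding
// uses for int/long values, including the length prefix of a string.
final class AvroLengthPrefix {

    // Reads one zigzag varint starting at index 0 of the given bytes.
    static long readZigZagLong(byte[] data) {
        long raw = 0;
        int shift = 0;
        for (int i = 0; i < data.length; i++) {
            int b = data[i] & 0xff;
            // The low 7 bits carry payload; the high bit marks continuation.
            raw |= (long) (b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                // Undo the zigzag mapping (0, -1, 1, -2, ... <- 0, 1, 2, 3, ...).
                return (raw >>> 1) ^ -(raw & 1);
            }
            shift += 7;
            if (shift > 63) {
                throw new IllegalArgumentException("varint too long");
            }
        }
        throw new IllegalArgumentException("truncated varint");
    }

    public static void main(String[] args) {
        // 'H' is 0x48; under the assumption above it is a string length prefix.
        System.out.println(readZigZagLong(new byte[]{(byte) 'H'})); // prints 36
    }
}
```

If that reading holds, the 'H' would simply be the encoded length of the id
string rather than part of the id itself.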

Is this expected behaviour?  Is there any way to suppress the first invocation?

Thanks,
David
