Surprising behaviour with a Kafka 8 Producer writing to a Kafka 10 Cluster

2017-05-11 Thread david.franklin
Hi, I've noticed something a bit surprising (to me at least) when a Kafka 8 producer writes to a Kafka 10 Cluster where the messages are subsequently processed by a Kafka Connect sink. The messages are Avro encoded (a suitable Avro key/value converter is specified via worker.properties
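
For reference, converter settings of this kind live in the Connect worker configuration; a minimal sketch, assuming a custom Avro converter class (the class name below is hypothetical, the original post does not name the class actually used):

```properties
# worker.properties - converter settings for the Connect worker (sketch).
# com.example.connect.AvroBinaryConverter is a hypothetical custom Avro converter.
key.converter=com.example.connect.AvroBinaryConverter
value.converter=com.example.connect.AvroBinaryConverter
# Converter-specific options, if any, go under these prefixes:
# key.converter.<option>=...
# value.converter.<option>=...
```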

RE: Kafka Connect and Partitions

2017-05-03 Thread david.franklin
Hi Randall, Many thanks for your and Gwen's help with this - it's very reassuring that help is at hand in such circumstances :) All the best, David -Original Message- From: Randall Hauch [mailto:rha...@gmail.com] Sent: 02 May 2017 21:01 To: dev@kafka.apache.org Subject: Re: Kafka

RE: Kafka Connect and Partitions

2017-05-02 Thread david.franklin
Hi Gwen/Randall, I think I've finally understood, more or less, how partitioning relates to SourceRecords. Because I was using the SourceRecord constructor that doesn't provide values for key and key schema, the key is null. The DefaultPartitioner appears to map null to a constant value rather
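
For anyone hitting the same issue: the null key comes from using the shorter SourceRecord constructor. A minimal sketch of the longer constructor that supplies a key schema and key (the topic name, partition/offset maps and key value here are illustrative):

```java
import java.util.Collections;
import java.util.Map;

import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.source.SourceRecord;

public class KeyedSourceRecordExample {
    // Sketch: build a SourceRecord with an explicit key so the producer's
    // default partitioner has something other than null to hash on.
    public static SourceRecord build(byte[] payload) {
        Map<String, ?> sourcePartition = Collections.singletonMap("folder", "inbox"); // illustrative
        Map<String, ?> sourceOffset = Collections.singletonMap("position", 42L);      // illustrative

        return new SourceRecord(
                sourcePartition,
                sourceOffset,
                "my-topic",               // illustrative topic name
                Schema.STRING_SCHEMA,     // key schema
                "some-key",               // key value - drives partition assignment
                Schema.BYTES_SCHEMA,      // value schema
                payload);                 // value
    }
}
```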

RE: Kafka Connect and Partitions

2017-04-28 Thread david.franklin
Hi Randall, I'd prefer it if my source connector didn't explicitly set a partition number - it seems cleaner for this to be set by a partitioner based on information generated by the source, i.e. to keep these aspects cleanly decoupled. Re your final paragraph, I do understand that the source

RE: Kafka Connect and Partitions

2017-04-28 Thread david.franklin
Hi Gwen, Having added a custom partitioner (via the producer.partitioner.class property in worker.properties) that simply randomly selects a partition, the data is now written evenly across all the partitions :) The root of my confusion regarding why the default partitioner writes all data to
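
For completeness, a minimal sketch of a partitioner along those lines, assuming the 0.10.x producer Partitioner interface (the class and package names are illustrative):

```java
package com.example.connect;

import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;

// Sketch of a partitioner that ignores the key and spreads records
// randomly across all partitions of the target topic.
public class RandomPartitioner implements Partitioner {

    @Override
    public void configure(Map<String, ?> configs) {
        // no configuration needed
    }

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        return ThreadLocalRandom.current().nextInt(numPartitions);
    }

    @Override
    public void close() {
        // nothing to clean up
    }
}
```

It would then be registered in worker.properties with a line such as producer.partitioner.class=com.example.connect.RandomPartitioner, as described above.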

RE: Kafka Connect and Partitions

2017-04-27 Thread david.franklin
Hi Gwen, Many thanks for your much appreciated offer to help with this. In answer to your questions: * Are you writing a connector or trying to use an existing one? I'm writing a new source/sink connector pipeline: folderToTopics piped into topicsToFolders. * Is the connector reading from the

Kafka Connect and Partitions

2017-04-26 Thread david.franklin
I've defined several Kafka Connect tasks via the tasks.max property to process a set of topics. Initially I set the partitions on the topics to 1 and partitioned the topics across the tasks programmatically so that each task processed a subset of the topics (or so I thought ...). I then noticed
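
For readers tripping over the same thing: the usual place to split work across tasks is the connector's taskConfigs(), which the framework calls with the effective maximum (bounded by tasks.max). A rough sketch, assuming a comma-separated topics setting in the connector config (the property name is illustrative):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.kafka.connect.util.ConnectorUtils;

// Sketch: divide the configured topics into at most maxTasks groups, one
// group per task configuration returned to the Connect framework.
public class TopicAssignmentExample {
    public static List<Map<String, String>> taskConfigs(String topicsSetting, int maxTasks) {
        List<String> topics = Arrays.asList(topicsSetting.split(","));
        int numGroups = Math.min(topics.size(), maxTasks);
        List<List<String>> groups = ConnectorUtils.groupPartitions(topics, numGroups);

        List<Map<String, String>> configs = new ArrayList<>();
        for (List<String> group : groups) {
            Map<String, String> taskConfig = new HashMap<>();
            taskConfig.put("topics", String.join(",", group)); // property name is illustrative
            configs.add(taskConfig);
        }
        return configs;
    }
}
```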

RE: MirrorMaker across Kafka Clusters with different versions

2017-01-20 Thread david.franklin
Thanks for confirming that Gwen. Best wishes, David -Original Message- From: Gwen Shapira [mailto:g...@confluent.io] Sent: 19 January 2017 20:41 To: dev@kafka.apache.org Subject: Re: MirrorMaker across Kafka Clusters with different versions As you figured out - 0.10 clients (including

MirrorMaker across Kafka Clusters with different versions

2017-01-18 Thread david.franklin
Can MirrorMaker work across different major versions of Kafka, specifically from a v10 producer to a v8 consumer? I suspect, given that the client API is not backwards compatible, that the answer unfortunately is no. But it would be useful to get a definitive answer on that, and any

RE: KafkaConnect SinkTask::put

2017-01-09 Thread david.franklin
Hi Shikhar - many thanks - that works a treat :) -Original Message- From: Shikhar Bhushan [mailto:shik...@confluent.io] Sent: 06 January 2017 17:39 To: dev@kafka.apache.org Subject: Re: KafkaConnect SinkTask::put Sorry I forgot to specify, this needs to go into your Connect worker
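
For anyone searching the archive later: the detail lost in the truncation above is that sink tasks take their consumer settings from the Connect worker configuration, not from config/consumer.properties. Assuming the property under discussion was the consumer's max.poll.records (the value below is arbitrary), the worker override looks like:

```properties
# worker.properties (Connect worker configuration), not config/consumer.properties.
# "consumer." prefixed settings are passed through to the sink tasks' consumer;
# max.poll.records caps how many records each poll (and hence each put()) can deliver.
consumer.max.poll.records=500
```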

RE: KafkaConnect SinkTask::put

2017-01-06 Thread david.franklin
Hi Shikhar, I've just added this to ~config/consumer.properties in the Kafka folder but it doesn't appear to have made any difference. Have I put it in the wrong place? Thanks again, David -Original Message- From: Shikhar Bhushan [mailto:shik...@confluent.io] Sent: 05 January 2017

RE: KafkaConnect SinkTask::put

2017-01-06 Thread david.franklin
Hi Shikhar - thank you very much for that :) Best wishes, David -Original Message- From: Shikhar Bhushan [mailto:shik...@confluent.io] Sent: 05 January 2017 18:12 To: dev@kafka.apache.org Subject: Re: KafkaConnect SinkTask::put Hi David, You can override the underlying consumer's

KafkaConnect SinkTask::put

2017-01-05 Thread david.franklin
Is there any way of limiting the number of events that are passed into the call to the put(Collection) method? I'm writing a set of events to Kafka via a source Connector/Task and reading these from a sink Connector/Task. If I generate of the order of 10k events the number of SinkRecords passed

RE: Kafka Connect key.converter and value.converter properties for Avro encoding

2016-11-08 Thread david.franklin
Hi Ewen, Thanks for the additional insight. Because I have no Connect schema (only an Avro schema), am I safe to just use the byte[] <-> (Avro Schema, SpecificRecord) conversion? This seems to work with the, admittedly limited, testing I've done so far. Without converting my Avro schema into

RE: Kafka Connect key.converter and value.converter properties for Avro encoding

2016-11-07 Thread david.franklin
Hi Ewen, Sorry but I didn't understand much of that. I currently have an implementation of the Converter interface that uses Avro's BinaryEncoder/Decoder, SpecificDatumReader/Writer. The main mismatch I faced is that I need to use org.apache.avro.Schema for serialization whereas the Converter
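
For anyone with a similar setup, a rough sketch of a Connect Converter built on Avro's binary encoding, assuming a single known SpecificRecord class (MyEvent is a placeholder for a generated Avro class); it sidesteps the Connect-schema mismatch by passing the records through with no Connect schema attached:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Map;

import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.avro.specific.SpecificDatumWriter;
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaAndValue;
import org.apache.kafka.connect.errors.DataException;
import org.apache.kafka.connect.storage.Converter;

// Sketch of a Converter that (de)serializes one generated Avro class with
// Avro's binary encoding. MyEvent is a placeholder; the Connect schema
// returned is null, mirroring the byte[] <-> SpecificRecord approach
// discussed in this thread.
public class MyEventAvroConverter implements Converter {

    private final SpecificDatumWriter<MyEvent> writer = new SpecificDatumWriter<>(MyEvent.class);
    private final SpecificDatumReader<MyEvent> reader = new SpecificDatumReader<>(MyEvent.class);

    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // nothing to configure in this sketch
    }

    @Override
    public byte[] fromConnectData(String topic, Schema schema, Object value) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
            writer.write((MyEvent) value, encoder);
            encoder.flush();
            return out.toByteArray();
        } catch (IOException e) {
            throw new DataException("Failed to serialize MyEvent for topic " + topic, e);
        }
    }

    @Override
    public SchemaAndValue toConnectData(String topic, byte[] value) {
        try {
            BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(value, null);
            MyEvent event = reader.read(null, decoder);
            return new SchemaAndValue(null, event);
        } catch (IOException e) {
            throw new DataException("Failed to deserialize MyEvent for topic " + topic, e);
        }
    }
}
```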

RE: Kafka Connect key.converter and value.converter properties for Avro encoding

2016-11-03 Thread david.franklin
Thanks to Gwen and Tommy Baker for their helpful replies. Currently, the environment I need to work with doesn't use the Schema Registry; hopefully one day it will but for now that's not an option. Events are written to Kafka without the schema embedded and each side of the interface assumes a

Kafka Connect key.converter and value.converter properties for Avro encoding

2016-11-02 Thread david.franklin
I am using Kafka Connect in source mode i.e. using it to send events to Kafka topics. With the key.converter and value.converter properties set to org.apache.kafka.connect.storage.StringConverter I can attach a consumer to the topics and see the events in a readable form. This is helpful and
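
The settings being described are the Connect worker's converter properties; for reference, they look like this in worker.properties:

```properties
# worker.properties - string converters make the topic contents human-readable
# with a plain console consumer, at the cost of losing any schema information.
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
```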