Hi,
Supposed that a Kafka topic has 3 partitions only. Now we want to partition it
into 20 partition, each one will produce an output collection. The purpose is
to write to the sink in parallel from all 20 output collections.
Will this code achieve that purpose?
KafkaIO.Read<byte[], Long> reader =
KafkaIO.<byte[], Long>read()
.withConsumerFactoryFn(
new ConsumerFactoryFn(
topic, 10, numElements, OffsetResetStrategy.EARLIEST)) // 10
partitions
PCollection<Long> input =
p.apply(reader.withoutMetadata()).apply(KafkaToCassandraRow).apply(CassadraIO.write);
Regards
Dinh