I am using Samza on YARN with Kafka. I need to reduce the number of
Kafka topic partitions, and I am OK with some data loss. What is the
recommended way to do this?

My Samza job config looks like this:

job.factory.class = org.apache.samza.job.yarn.YarnJobFactory
task.class = com.vnera.grid.task.GenericStreamTask
task.window.ms = 100
systems.kafka.samza.factory = org.apache.samza.system.kafka.KafkaSystemFactory
systems.kafka.consumer.zookeeper.connect = localhost:2181
systems.kafka.consumer.auto.offset.reset = largest
systems.kafka.producer.metadata.broker.list = localhost:9092
systems.kafka.producer.producer.type = sync
systems.kafka.producer.batch.num.messages = 1
systems.kafka.samza.key.serde = string
serializers.registry.string.class = org.apache.samza.serializers.StringSerdeFactory
yarn.package.path = file://${basedir}/target/${project.artifactId}-${pom.version}-dist.tar.gz
yarn.container.count = ${container.count}
yarn.am.container.memory.mb = ${samzajobs.memory.mb}
job.name = job4
task.inputs = kafka.Topic3
