Hi Casey,

On the first question, I *believe* you see 2 partitions on the output streams because Kafka creates them automatically (auto.create.topics.enable set to true). In that case the partition count comes from the broker's num.partitions setting, which seems to be 2 rather than 1 here.
For the rest I am completely out of my depth and will wait for one of the other guys to jump in. :)

On the second question, what are you trying to achieve here? Do you want a mechanism for a single message to go to all partitions of a stream, or do you want every client to read every message across all the partitions? The two are different, though I'm not sure I've made that clear. I ask because there was a discussion earlier this week about possibly adding broadcast streams, which are closer to the second case.

Regards
Garry

-----Original Message-----
From: Anh Thu Vu [mailto:[email protected]]
Sent: 27 February 2014 15:14
To: [email protected]
Subject: Stream partition

Hi all,

It's me again :) I have some questions regarding partitioning the stream.

Consider the wikipedia feed task in hello-samza. When I run it, I see 2 partitions when I list the Kafka topics:

    topic: wikipedia-raw  partition: 0  leader: 0  replicas: 0  isr: 0
    topic: wikipedia-raw  partition: 1  leader: 0  replicas: 0  isr: 0

My first question is how to increase the number of partitions? I was playing around with the OutgoingMessageEnvelope (OUTPUT_STREAM is the kafka.wikipedia-raw SystemStream):

1)
    collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM, outgoingMap));

2)
    collector.send(new OutgoingMessageEnvelope(new SystemStreamPartition(OUTPUT_STREAM, new Partition(0)), outgoingMap));
    collector.send(new OutgoingMessageEnvelope(new SystemStreamPartition(OUTPUT_STREAM, new Partition(1)), outgoingMap));
    collector.send(new OutgoingMessageEnvelope(new SystemStreamPartition(OUTPUT_STREAM, new Partition(2)), outgoingMap));

3)
    collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM, 0, null, outgoingMap));
    collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM, 1, null, outgoingMap));
    collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM, 2, null, outgoingMap));

but in all cases above, I always got 2 partitions (and why 2 but not 1?)
Then, when I try this:

    collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM, 0, 0, outgoingMap));
    collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM, 1, 1, outgoingMap));
    collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM, 2, 2, outgoingMap));

I don't receive anything on the OUTPUT_STREAM. Does it have anything to do with the SinglePartitionSystemAdmin in WikipediaSystemFactory?

The second question is whether it's possible to send a message to all the partitions of a stream (maybe by sending a message multiple times, each time specifying a partition)?

Thanks in advance,
Casey
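
To make the "send it once per partition" idea from the second question concrete, here is a minimal, self-contained sketch. It does not call Samza or Kafka. It assumes (and this is only an assumption, not something confirmed in this thread) that a partition key is mapped to a partition by hashing it modulo the number of existing partitions. The class and method names (PartitionFanOutSketch, partitionFor, fanOutKeys) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; no Samza or Kafka classes are used. */
final class PartitionFanOutSketch {

    /**
     * ASSUMPTION: the partition is the key's hash modulo the partition count.
     * The real partitioner used by the system producer may differ in detail.
     */
    static int partitionFor(Object partitionKey, int numPartitions) {
        return Math.floorMod(partitionKey.hashCode(), numPartitions);
    }

    /**
     * Returns one partition key per partition. Sending the same message once
     * per key would then place one copy in every partition.
     */
    static List<Integer> fanOutKeys(int numPartitions) {
        List<Integer> keys = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            // Integer.hashCode() is the value itself, so key p maps to partition p.
            keys.add(p);
        }
        return keys;
    }

    public static void main(String[] args) {
        int numPartitions = 2; // the count shown in the wikipedia-raw listing
        for (int key = 0; key <= 2; key++) {
            System.out.println("partition key " + key
                    + " -> partition " + partitionFor(key, numPartitions));
        }
        System.out.println("fan-out keys: " + fanOutKeys(numPartitions));
    }
}
```

Under that assumption, partition keys 0, 1 and 2 land on partitions 0, 1 and 0 of a 2-partition topic, which would be consistent with never seeing a third partition: the key only selects among partitions that already exist, and the count itself is fixed when the topic is created.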
