Thank you, Chris!

So if I want to have a list of streams/topics with different number of
partitions, I should write a script to create all the topics before
starting my samza jobs? There is no way to set that "automatically" from my
samza jobs?

Casey


On Fri, Feb 28, 2014 at 12:45 AM, Chris Riccomini
<[email protected]>wrote:

> Hey Casey,
>
> Garry is right. By default Kafka ships with 2 partitions per topic. This
> can be configured exactly as Garry describes.
>
> To create a new topic with a custom partition count, use:
>
>   bin/kafka-create-topic.sh --zookeeper localhost:2181 --replica 1
> --partition 1 --topic test
>
> See http://kafka.apache.org/08/quickstart.html for details.
>
> To expand a topic's partition count for a topic that already exists, run:
>
>   bin/kafka-add-partitions.sh
>
> See https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools
> for details.
>
> You can't shrink a topic's partition count for a topic that already exists.
>
> You can send a message to all partitions manually, as you describe.
> Something like:
>
>   For(int partition: partitions) {
>     collector.send(OUTGOING_STREAM, partition, null, outgoingMap)
>   }
>
> This should result in the message being sent to each partition. If you
> have a kafka-console-consumer.sh running against the stream, you should
> see the message once for each partition.
>
> Cheers,
> Chris
>
> On 2/27/14 9:21 AM, "Anh Thu Vu" <[email protected]> wrote:
>
> >Hi Garry,
> >
> >I really owe you a lot today :)
> >
> >About the broadcasting, I think both cases will serve my purpose.
> >Basically, for some particular streams, I want to broadcast every messages
> >to all the replicas of the recipient StreamTask.
> >
> >Casey
> >
> >
> >On Thu, Feb 27, 2014 at 6:00 PM, Garry Turkington <
> >[email protected]> wrote:
> >
> >> Hi Casey,
> >>
> >> On the first question I *believe* you see 2 partitions on the created
> >> output streams because they are created automatically by Kafka
> >> (auto.create.topics.enabled set to true) and if so then it seems to
> >>default
> >> to 2 and not 1 partitions.
> >>
> >> For the rest I am completely out of my depth and will wait for one of
> >>the
> >> other guys to jump in. :)
> >>
> >> On the second question what are you trying to achieve here? Do you want
> >> the mechanism for a single message to go to all partitions of a stream
> >>or
> >> do you want every client to read every message across all the
> >>partitions?
> >> The two are different though I'm not sure I've made that clear. I ask
> >> because there was a discussion earlier this week about a possible
> >>addition
> >> of broadcast streams which are more in the second case.
> >>
> >> Regards
> >> Garry
> >>
> >> -----Original Message-----
> >> From: Anh Thu Vu [mailto:[email protected]]
> >> Sent: 27 February 2014 15:14
> >> To: [email protected]
> >> Subject: Stream partition
> >>
> >> Hi all,
> >>
> >> It's me again :)
> >>
> >> I have some questions regarding partitioning the stream.
> >> Consider the wikipedia feed task in hello-samza, when I run it, I saw 2
> >> partitions when I list kafka topics:
> >> topic: wikipedia-raw    partition: 0    leader: 0    replicas: 0
> >>isr: 0
> >> topic: wikipedia-raw    partition: 1    leader: 0    replicas: 0
> >>isr: 0
> >>
> >> My first question is how to increase the number of partitions? I was
> >> playing around with the OutgoingMessageEnvelope (OUTPUT_STREAM is the
> >> kafka.wikipedia-raw
> >> SystemStream):
> >> 1)
> >> collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM, outgoingMap));
> >>
> >> 2)
> >> collector.send(new OutgoingMessageEnvelope(new
> >> SystemStreamPartition(OUTPUT_STREAM, new Partition(0)), outgoingMap));
> >> collector.send(new OutgoingMessageEnvelope(new
> >> SystemStreamPartition(OUTPUT_STREAM, new Partition(1)), outgoingMap));
> >> collector.send(new OutgoingMessageEnvelope(new
> >> SystemStreamPartition(OUTPUT_STREAM, new Partition(2)), outgoingMap));
> >>
> >> 3)
> >> collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM,0, null,
> >> outgoingMap)); collector.send(new
> >>OutgoingMessageEnvelope(OUTPUT_STREAM,1,
> >> null, outgoingMap)); collector.send(new
> >> OutgoingMessageEnvelope(OUTPUT_STREAM,2, null, outgoingMap));
> >>
> >> but in all cases above, I always got 2 partitions (and why 2 but not 1?)
> >>
> >> Then, when I try this:
> >> collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM,0, 0,
> >> outgoingMap)); collector.send(new
> >>OutgoingMessageEnvelope(OUTPUT_STREAM,1,
> >> 1, outgoingMap)); collector.send(new
> >> OutgoingMessageEnvelope(OUTPUT_STREAM,2, 2, outgoingMap)); I don't
> >>receive
> >> anything on the OUTPUT_STREAM.
> >>
> >> Does it has anything to do with the SinglePartitionSystemAdmin in
> >> WikipediaSystemFactory?
> >>
> >> The second question is whether it's possible to send a message to all
> >>the
> >> partitions of a stream? (maybe, by sending a message multiple times,
> >>each
> >> specifying a partition?)
> >>
> >> Thanks in advance,
> >> Casey
> >>
> >> -----
> >> No virus found in this message.
> >> Checked by AVG - www.avg.com
> >> Version: 2014.0.4259 / Virus Database: 3705/7127 - Release Date:
> >>02/26/14
> >>
>
>

Reply via email to