We don't have a partition per user; there is no need for that, just as a
distributed database doesn't have a partition per user. A partition is
just a physical grouping of keys.
-Jay
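
To illustrate the "physical grouping of keys" point, here is a minimal
sketch, assuming keys are assigned to partitions by a stable hash modulo
a fixed partition count. The hash function (CRC32), the partition count,
and the name partition_for are assumptions made for illustration, not
anything stated in this thread.

```python
import zlib

NUM_PARTITIONS = 8  # hypothetical; fixed per topic, independent of user count


def partition_for(user_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition using a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions


# Many users share each partition; a given user always maps to the same one.
users = [f"user-{i}" for i in range(100_000)]
counts = [0] * NUM_PARTITIONS
for user in users:
    counts[partition_for(user)] += 1
```

With this kind of scheme the number of logs on disk follows the partition
count, not the number of users.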
On Tue, Nov 27, 2012 at 12:00 PM, S Ahmed wrote:
How does that work out, though? With 10 million users, that is at least
10 million files.
On Mon, Nov 26, 2012 at 2:02 PM, Jay Kreps wrote:
Yeah a partition is physically implemented as a log (i.e. a sequence of
files containing a bunch of messages indexed by offset). So each server can
have lots of partitions, but each partition exists entirely on one server.
So in the "newsfeed" case if you partition by user id, you would be
guarantee
>Yes, your description is correct. A particular member's data would all be
>in one partition.
When you say in one partition, does that also mean on the same server? Or
can a partition span broker nodes?
At the file level, I'm guessing it has its own physical file then? (or set
of files as it grows
Yes, your description is correct. A particular member's data would all be
in one partition.
Broker partitions are just the unit of parallelism--think of each partition
as a totally ordered log you can append to and read from. The consumption
of one of these partition logs is single threaded.
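
As a rough sketch of that idea (not the broker's actual implementation),
a partition can be modelled as an in-memory, append-only list indexed by
offset, with one consumer reading it in order. The class PartitionLog,
its methods, and the "visit:<member>" event format are hypothetical names
chosen for this example.

```python
class PartitionLog:
    """Toy in-memory stand-in for one partition: append-only, offset-indexed."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Append a message and return the offset it was assigned."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages=10):
        """Return up to max_messages messages starting at offset, in log order."""
        return self._messages[offset:offset + max_messages]


log = PartitionLog()
for event in ["visit:alice", "visit:bob", "visit:alice"]:
    log.append(event)

# A single consumer walks the log in order and tracks its own position.
position = 0
visits = {}
while True:
    batch = log.read(position, max_messages=2)
    if not batch:
        break
    for event in batch:
        member = event.split(":", 1)[1]
        visits[member] = visits.get(member, 0) + 1
    position += len(batch)
```

Because all of a member's events land in one log and are read by one
consumer, a per-member count like visits needs no coordination across
consumers.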
The
sorry wrong list.
On Sun, Nov 25, 2012 at 10:54 PM, S Ahmed wrote:
> The wiki states "Consider an application that would like to maintain an
> aggregation of the number of profile visitors for each member. It would
> like to send all profile visit events for a member to a particular
> partition