Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "HedWig/TopicManagement" page has been changed by ErwinTam.
http://wiki.apache.org/hadoop/HedWig/TopicManagement?action=diff&rev1=1&rev2=2

--------------------------------------------------

- ---+ Topic management in Hedwig
+ == Topic management in Hedwig ==
- %TOC%
+ === ZooKeeper data structure ===
- ---++ !ZooKeeper data structure
- 
- Metadata about topics, subscribers, and hubs will be stored in !ZooKeeper. For a given Hedwig region, we will store the following structure:
+ Metadata about topics, subscribers, and hubs will be stored in ZooKeeper. For a given Hedwig region, we will store the following structure:
- 
- <img src="%ATTACHURLPATH%/hedwigzk.png" alt="hedwigzk.png"/>
  The rectangles in this diagram are znodes; rectangles with dashed borders are ephemeral znodes.
- * <B>hedwig</B> is the root. If the !ZooKeeper instance is shared between regions, then there would be a region-specific root.
+ * '''hedwig''' is the root. If the !ZooKeeper instance is shared between regions, then there would be a region-specific root.
- * <B>regionname</B> is the region this Hedwig cluster lives in.
+ * '''regionname''' is the region this Hedwig cluster lives in.
- * <B>topics</B> is the root for the topics subtree. All topics that have been created live under this root.
+ * '''topics''' is the root for the topics subtree. All topics that have been created live under this root.
- * <B>T1, T2 ... </B> are nodes for each topic. The nodes are named by the topic name. The fact that a topic has a node under the <B>Topics</B> node means that the topic exists. If we had any other topic metadata (like permissions) that we wanted to store, we could store them as the content of the node.
+ * '''T1, T2 ... ''' are nodes for each topic. The nodes are named by the topic name. The fact that a topic has a node under the <B>Topics</B> node means that the topic exists. If we had any other topic metadata (like permissions) that we wanted to store, we could store them as the content of the node.
- * <B>Ti.hub</B> is the hub that is currently assigned to the topic. This is an ephemeral node, so that if the hub fails, another hub can take over. The name of the node is "Hub" and the content is the hostname of the hub.
+ * '''Ti.hub''' is the hub that is currently assigned to the topic. This is an ephemeral node, so that if the hub fails, another hub can take over. The name of the node is "Hub" and the content is the hostname of the hub.
- * <B>hubscribers</B> is the root of the tree of subscribers to the topic.
+ * '''hubscribers''' is the root of the tree of subscribers to the topic.
- * <B>Sub1, Sub2 ...</B> are the subscribers. Each node is named with the subscriber ID (which should be descriptive enough for the hub to be able to deliver messages to the subscriber.) The content of the node includes the subscriber's current consume mark for the topic.
+ * '''Sub1, Sub2 ...''' are the subscribers. Each node is named with the subscriber ID (which should be descriptive enough for the hub to be able to deliver messages to the subscriber.) The content of the node includes the subscriber's current consume mark for the topic.
- * <B>hosts</B> is the root of the tree of hubs in the region.
+ * '''hosts''' is the root of the tree of hubs in the region.
- * <B>S1:port, S2:port ... </B> are the hubs. Each node is named with the hostname:port of the hub. The fact that a node exists for the hub means that the hub is in the cluster, and is eligible to be assigned topics.
+ * '''S1:port, S2:port ... ''' are the hubs. Each node is named with the hostname:port of the hub. The fact that a node exists for the hub means that the hub is in the cluster, and is eligible to be assigned topics.
- * <B>Alive</B> is an ephemeral node that indicates whether the hub is currently alive.
+ * '''Alive''' is an ephemeral node that indicates whether the hub is currently alive.
- ---++ Topic creation
+ === Topic creation ===
  Topic creation and assignment to a hub is a lazy process. Topics are created on demand (e.g., when there is a subscriber) and assigned to a hub on demand (e.g., when there is a new subscriber or a message published.) When a hub responsible for a topic fails, we reassign the topic on demand; e.g. when the connected subscribers reconnect.
- ---+++ Subscription process
+ === Subscription process ===
  The subscribe call takes three parameters:
  * The subscriber ID (string)
@@ -39, +35 @@
  Upon receiving a <i>subscribe</i> message for topic T, the hub H1 will follow these steps:
  * The hub H1 will check in !ZooKeeper to see if the topic T exists as a child of <b>Topics</B>. If the topic T does not exist:
- * H1 will create the node <B>Topics.T</B>, and the node <B>Topics.T.Subscribers</B>.
+ * H1 will create the node '''Topics.T''', and the node '''Topics.T.Subscribers'''.
- * The hub H1 will read <b>T.Hub</b>, the current hub assigned to the topic (say, H2).
+ * The hub H1 will read '''T.Hub''', the current hub assigned to the topic (say, H2).
- * If a hub exists, and is the same hub the client contacted (e.g. H1==H2), then H1 will add C to the list of subscribers (under <b>T.Subscribers</B>), and set up its internal bookkeeping to begin delivering messages to C.
+ * If a hub exists, and is the same hub the client contacted (e.g. H1==H2), then H1 will add C to the list of subscribers (under '''T.Subscribers'''), and set up its internal bookkeeping to begin delivering messages to C.
  * If a hub exists, and is a different hub than the one the client contacted (e.g. H1!=H2), then H1 will return to the client a <I>redirect</I> message, requesting that the client retry its subscription at H2.
  * If no hub exists, H1 will check the flag of the <i>subscribe</I> call to see if the client was redirected.
  * If false (the client has not yet been redirected) H1 will choose a random hub H3 (possibly itself) to manage the topic.
@@ -102, +98 @@
  * We may want to optimize this procedure so that the hub does not have to contact !ZooKeeper on every publish. This could be done using leases, or by depending on the disconnect exception...the proper way to do this is an open question.
  * Instead of redirecting the client, we could forward the published message to the correct hub. But redirecting seems cleaner, since we would really like the publisher to be directly connected to the correct hub.
- ---+++ Topic redistribution
+ === Topic redistribution ===
  Occasionally, we should shuffle topics between hubs to ensure load balancing. For example, when a new hub joins, we want topics to be assigned to it. Similarly, if some topics are hotter than others, the hub should be able to shed load. Since all of the persistent state about a topic is in !ZooKeeper or !BookKeeper, shuffling a single topic can be easy: the hub just stops accepting publishes and deletes its ephemeral node. The next time a client tries to subscribe or publish to the topic, it will get assigned to a random hub.
@@ -113, +109 @@
  The constant shuffling of topics should help to keep load evenly balanced across hubs, without human intervention. Moreover, lazily abandoning topics will help the shuffling to occur in an incremental fashion spread out over time.
- ---++ Open questions
+ === Open questions ===
  * How to handle the lease of the hub's primary status for a topic
  * How to handle access control - who is allowed to create topics, who is allowed to subscribe to them, who is allowed to publish to them
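
The znode layout described under "ZooKeeper data structure" can be pictured as a set of paths. The following is only an illustrative sketch: the helper names `topic_paths` and `hub_paths`, the `/hedwig` root string, and the lowercase node names `hub`, `subscribers`, and `alive` are assumptions made for the example, not names confirmed by the page (which writes the node names with varying case).

```python
def topic_paths(region, topic, root="/hedwig"):
    """Build the znode paths for one topic (hypothetical helper, assumed names)."""
    base = f"{root}/{region}/topics/{topic}"
    return {
        "topic": base,                        # persistent: its existence means the topic exists
        "hub": f"{base}/hub",                 # ephemeral: content is the assigned hub's hostname
        "subscribers": f"{base}/subscribers"  # persistent: one child per subscriber ID
    }


def hub_paths(region, host, port, root="/hedwig"):
    """Build the znode paths for one hub (hypothetical helper, assumed names)."""
    base = f"{root}/{region}/hosts/{host}:{port}"
    return {
        "hub": base,               # persistent: the hub is a member of the cluster
        "alive": f"{base}/alive",  # ephemeral: present only while the hub is up
    }


paths = topic_paths("region1", "T1")
print(paths["hub"])
```

Keeping the hub assignment and the liveness flag as separate ephemeral nodes matches the description above: cluster membership survives a crash, while ownership of a topic and liveness do not.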

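The subscribe steps listed under "Subscription process" can be sketched as a small decision function. This is a minimal sketch under stated assumptions, not Hedwig's implementation: a plain dictionary stands in for ZooKeeper, and `handle_subscribe`, the path strings, and the returned action labels are all hypothetical. The diff excerpt does not show what happens after a random hub H3 is chosen, or when the client has already been redirected, so the sketch only reports those cases rather than guessing.

```python
import random


def handle_subscribe(zk, hubs, h1, topic, subscriber, redirected, rng=random):
    """Decide how hub h1 reacts to a subscribe request (hypothetical helper).

    zk is a dict standing in for ZooKeeper: keys are znode paths, values are contents.
    """
    topic_path = f"topics/{topic}"
    subs_path = f"{topic_path}/subscribers"
    hub_path = f"{topic_path}/hub"

    # Topics are created lazily: create T and T.Subscribers if T does not exist.
    if topic_path not in zk:
        zk[topic_path] = None
        zk[subs_path] = None

    # Read T.Hub, the hub currently assigned to the topic (H2), if any.
    h2 = zk.get(hub_path)
    if h2 is not None:
        if h2 == h1:
            # Same hub the client contacted: record the subscriber.
            zk[f"{subs_path}/{subscriber}"] = None
            return ("subscribed", h1)
        # A different hub owns the topic: ask the client to retry there.
        return ("redirect", h2)

    # No hub is assigned to the topic.
    if not redirected:
        # Choose a random hub H3 (possibly h1 itself); the follow-up step is
        # not shown in the excerpt, so the choice is only reported here.
        return ("chosen", rng.choice(hubs))
    # Already-redirected case: elided in the excerpt, left unspecified.
    return ("unspecified", None)


zk = {"topics/T": None, "topics/T/subscribers": None, "topics/T/hub": "h1:4080"}
print(handle_subscribe(zk, ["h1:4080", "h2:4080"], "h2:4080", "T", "C", False))
```

Returning an action label instead of performing network calls keeps the branch structure easy to compare against the bullet list.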