Added: bookkeeper/site/trunk/content/docs/master/hedwigConsole.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/hedwigConsole.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/hedwigConsole.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/hedwigConsole.textile Tue Dec  9 
16:00:52 2014
@@ -0,0 +1,187 @@
+Title:        Hedwig Console
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+        .
+
+h1. Hedwig Console
+
+Apache Hedwig provides a console client, which allows users and administrators 
to interact with a hedwig cluster. 
+
+h2. Connecting to a Hedwig cluster
+
+The Hedwig console client ships with the Hedwig server package.
+
+p. To start the console client:
+
+ @hedwig-server/bin/hedwig console@
+
+p. By default, the console client connects to the hub server on localhost. To connect to a different hub server, override the following environment variables.
+
+| @HEDWIG_CONSOLE_SERVER_CONF@ | Path of a hub server configuration file. Override it to make the console client connect to the correct ZooKeeper cluster. |
+| @HEDWIG_CONSOLE_CLIENT_CONF@ | Path of a Hedwig client configuration file. Override it to make the console client communicate with the correct hub servers. |
+
+p. Once connected, you should see something like:
+
+<pre>
+Connecting to zookeeper/bookkeeper using HedwigAdmin
+
+Connecting to default hub server localhost/127.0.0.1:4080
+Welcome to Hedwig!
+JLine support is enabled
+JLine history support is enabled
+[hedwig: (standalone) 16] 
+</pre>
+
+p. From the shell, type __help__ to get a list of commands that can be 
executed from the client:
+
+<pre>
+[hedwig: (standalone) 16] help
+HedwigConsole [options] [command] [args]
+
+Available commands:
+        pub
+        sub
+        closesub
+        unsub
+        rmsub
+        consume
+        consumeto
+        pubsub
+        show
+        describe
+        readtopic
+        set
+        history
+        redo
+        help
+        quit
+        exit
+
+Finished 0.0020 s.
+</pre>
+
+p. To see detailed usage for a command, type __help {command}__ in the shell. For example:
+
+<pre>
+[hedwig: (standalone) 17] help pub
+pub: Publish a message to a topic in Hedwig
+usage: pub {topic} {message}
+
+  {topic}   : topic name.
+              any printable string without spaces.
+  {message} : message body.
+              remaining arguments are used as message body to publish.
+
+Finished 0.0 s.
+</pre>
+
+h2. Commands
+
+The commands available in the Hedwig console fall into three groups: __interactive commands__, __admin commands__, and __utility commands__.
+
+h3. Interactive Commands
+
+p. Interactive commands are used by users to communicate with a hedwig 
cluster. They are __pub__, __sub__, __closesub__, __unsub__, __consume__ and 
__consumeto__.
+
+p. These commands are simple and have the same semantics as the API provided by the Hedwig client.
+
+h3. Admin Commands
+
+p. Admin commands are used by administrators to operate or debug a hedwig 
cluster. They are __show__, __describe__, __pubsub__ and __readtopic__.
+
+p. __show__ is used to list all available hub servers or topics in the cluster.
+
+p. Use __show hubs__ to list the hub servers and see how many are alive in the cluster.
+
+<pre>
+[hedwig: (standalone) 27] show hubs
+Available Hub Servers:
+        192.168.1.102:4080:9876 :       0
+Finished 0.0040 s.
+</pre>
+
+p. Use __show topics__ to list all topics. If the cluster has a lot of topics, this command will take a long time to run.
+
+<pre>
+[hedwig: (standalone) 28] show topics
+Topic List:
+[mytopic]
+Finished 0.0020 s.
+</pre>
+
+p. To see the details of a topic, run __describe__. This shows the metadata of a topic, including the topic owner, persistence info, and subscription info.
+
+<pre>
+[hedwig: (standalone) 43] describe topic mytopic
+===== Topic Information : mytopic =====
+
+Owner : 192.168.1.102:4080:9876
+
+>>> Persistence Info <<<
+Ledger 3 [ 1 ~ 9 ]
+
+>>> Subscription Info <<<
+Subscriber mysub : consumeSeqId: local:0
+
+Finished 0.011 s.
+</pre>
+
+p. When you run the __describe__ command, keep in mind that it reads the metadata from __ZooKeeper__ directly, so the subscription info might not be completely up to date, because hub servers update the subscription metadata lazily.
+
+p. The __readtopic__ command is useful to see which messages have not been 
consumed by the client.
+
+<pre>
+[hedwig: (standalone) 46] readtopic mytopic
+
+>>>>> Ledger 3 [ 1 ~ 9] <<<<<
+
+---------- MSGID=LOCAL(1) ----------
+MsgId:     LOCAL(1)
+SrcRegion: standalone
+Message:
+
+hello
+
+---------- MSGID=LOCAL(2) ----------
+MsgId:     LOCAL(2)
+SrcRegion: standalone
+Message:
+
+hello 2
+
+---------- MSGID=LOCAL(3) ----------
+MsgId:     LOCAL(3)
+SrcRegion: standalone
+Message:
+
+hello 3
+
+...
+</pre>
+
+p. __pubsub__ is another useful command for administrators. It can be used to test the availability and functionality of a cluster. It generates a temporary subscriber id with the current timestamp, subscribes to the given topic using the generated subscriber id, publishes a message to the given topic, and tests whether the subscriber received the message.
+
+<pre>
+[hedwig: (standalone) 48] pubsub testtopic testsub- 10 test message for 
availability
+Starting PUBSUB test ...
+Sub topic testtopic, subscriber id testsub--1338126964504
+Pub topic testtopic : test message for availability-1338126964504
+Received message : test message for availability-1338126964504
+PUBSUB SUCCESS. TIME: 377 MS
+Finished 0.388 s.
+</pre>
+
+h3. Utility Commands
+
+p. Utility Commands are __help__, __history__, __redo__, __quit__ and __exit__.
+
+p. __quit__ and __exit__ are used to exit the console, while __history__ and __redo__ are used to manage the history of commands executed in the shell.

Added: bookkeeper/site/trunk/content/docs/master/hedwigDesign.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/hedwigDesign.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/hedwigDesign.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/hedwigDesign.textile Tue Dec  9 
16:00:52 2014
@@ -0,0 +1,72 @@
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+
+h1. Style
+
+We have provided an Eclipse Formatter file @formatter.xml@ with all the 
formatting conventions currently used in the project. Highlights include no 
tabs, 4-space indentation, and 120-char width. Please respect this so as to 
reduce the amount of formatting-related noise produced in commits.
+
+h1. Static Analysis
+
+We would like to use static analysis tools PMD and FindBugs to maintain code 
quality. However, we have not yet arrived at a consensus on what rules to 
adhere to, and what to ignore.
+
+h1. Netty Notes
+
+The asynchronous network IO infrastructure that Hedwig uses is 
"Netty":http://www.jboss.org/netty. Here are some notes on Netty's concurrency 
architecture and its filter pipeline design.
+
+h2. Concurrency Architecture
+
+After calling @ServerBootstrap.bind()@, Netty starts a boss thread 
(@NioServerSocketPipelineSink.Boss@) that just accepts new connections and 
registers them with one of the workers from the @NioWorker@ pool in round-robin 
fashion (pool size defaults to CPU count). Each worker runs its own select loop 
over just the set of keys that have been registered with it. Workers start 
lazily on demand and run only so long as there are interested fd's/keys. All 
selected events are handled in the same thread and sent up the pipeline 
attached to the channel (this association is established by the boss as soon as 
a new connection is accepted).
+
+All workers, and the boss, run via the executor thread pool; hence, the 
executor must support at least two simultaneous threads.
+
+h2. Handler Pipeline
+
+A pipeline implements the intercepting filter pattern. A pipeline is a 
sequence of handlers. Whenever a packet is read from the wire, it travels up 
the stream, stopping at each handler that can handle upstream events. 
Vice-versa for writes. Between each filter, control flows back through the 
centralized pipeline, and a linked list of contexts keeps track of where we are 
in the pipeline (one context object per handler).
+
+
+h1. Pseudocode
+
+This summarizes the control flow through the system.
+
+h2. publish
+
+Need to document
+
+h2. subscribe
+
+Need to document
+
+h1. ReadAhead Cache
+
+The delivery manager class is responsible for pushing published messages from the hubs to the subscribers. The most common case is that all subscribers are connected and either caught up, or close to the tail end of the topic. In this case, we don't want the delivery manager to be polling BookKeeper for any newly arrived messages on the topic; new messages should just be pushed to the delivery manager. However, there is also the uncommon case when a subscriber is behind, and messages must be pulled from BookKeeper.
+
+Since all publishes go through the hub, it is possible to cache the recently published messages in the hub. The delivery manager then doesn't have to make the trip to BookKeeper to get the messages, and can instead get them from local process memory.
+
+These ideas of push, pull, and caching are unified in the following way:
+
+* A hub has a cache of messages.
+* When the delivery manager wants to deliver a message, it asks the cache for it. There are 3 cases:
+** The message is available in the cache, in which case it is given to the delivery manager.
+** The message is not present in the cache and the seq-id of the message is beyond the last message published on that topic (this happens if the subscriber is totally caught up for that topic). In this case, a stub is put in the cache in order to notify the delivery manager when that message does happen to be published.
+** The message is not in the cache but has been published to the topic. In this case, a stub is put in the cache, and a read is issued to BookKeeper.
+* Whenever a message is published, it is cached. If there is a stub already in the cache for that message, the delivery manager is notified.
+* Whenever a message is read from BookKeeper, it is cached. There must be a stub for that message (since reads to BookKeeper are issued only after putting a stub), so the delivery manager is notified.
+* The cache does readahead, i.e., if a message requested by the delivery manager is not in the cache, a stub is established not only for that message, but also for the next n messages where n is configurable (default 10). On a cache hit, we look ahead n/2 messages, and if that message is not present, we establish another n/2 stubs. In short, we always ensure that the next n stubs are established.
+* Over time, the cache will grow in size. There are 2 pruning mechanisms:
+** Once all subscribers have consumed up to a particular seq-id, they notify the cache, and all messages up to that seq-id are pruned from the cache.
+** If the above pruning is not working (e.g., because some subscribers are down), the cache will eventually hit its size limit, which is configurable (default: half of the maximum JVM heap size). At this point, messages are just pruned in FIFO order. We use the size of the blobs in the message for estimating the cache size. The assumption is that this size will dominate over fixed, object-level size overheads.
+* Stubs are not purged because, according to the above simplification, they are of 0 size.
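+
+The interaction between stubs, publishes, and BookKeeper reads described above can be sketched as follows. This is a minimal single-topic, single-threaded illustration, not Hedwig's actual cache: the class @ReadAheadCacheSketch@, its method names, and the two lists standing in for BookKeeper reads and delivery manager notifications are assumptions made for the example, and pruning is omitted.
+

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical single-topic sketch of the stub logic; not Hedwig's real cache.
class ReadAheadCacheSketch {
    // seq-id -> message body; Optional.empty() marks a stub.
    private final Map<Long, Optional<String>> cache = new HashMap<>();
    private final int readAhead;   // "n" in the description above
    private long lastPublished;    // highest seq-id published on the topic so far
    final List<Long> issuedReads = new ArrayList<>(); // stand-in for reads sent to BookKeeper
    final List<Long> notified = new ArrayList<>();    // stand-in for delivery manager callbacks

    ReadAheadCacheSketch(int readAhead, long lastPublished) {
        this.readAhead = readAhead;
        this.lastPublished = lastPublished;
    }

    // The delivery manager asks for a message: returned on a hit, empty otherwise.
    Optional<String> request(long seqId) {
        Optional<String> entry = cache.get(seqId);
        if (entry != null && entry.isPresent()) {
            // Cache hit: look n/2 ahead and extend the stubs if nothing is there.
            long ahead = seqId + readAhead / 2;
            if (!cache.containsKey(ahead)) {
                establishStubs(ahead, readAhead / 2);
            }
            return entry;
        }
        // Miss: stub the requested message and the next n messages.
        establishStubs(seqId, readAhead + 1);
        return Optional.empty();
    }

    private void establishStubs(long from, int count) {
        for (long id = from; id < from + count; id++) {
            if (cache.containsKey(id)) {
                continue; // already cached or already stubbed
            }
            cache.put(id, Optional.empty());
            if (id <= lastPublished) {
                issuedReads.add(id); // already published, so it must be read back
            }
        }
    }

    // A publish is always cached; a waiting stub triggers a notification.
    long onPublish(String body) {
        long seqId = ++lastPublished;
        boolean hadStub = cache.containsKey(seqId);
        cache.put(seqId, Optional.of(body));
        if (hadStub) {
            notified.add(seqId);
        }
        return seqId;
    }

    // A completed read replaces the stub and notifies the delivery manager.
    void onRead(long seqId, String body) {
        cache.put(seqId, Optional.of(body));
        notified.add(seqId);
    }

    public static void main(String[] args) {
        ReadAheadCacheSketch cache = new ReadAheadCacheSketch(4, 6);
        System.out.println("hit on first request: " + cache.request(5).isPresent());
        System.out.println("reads issued: " + cache.issuedReads);
    }
}
```

+
+In this sketch a stub for an unpublished seq-id simply waits for @onPublish@, while a stub for an already published seq-id records a read; both paths end with the same notification, which is the unification of push and pull that the list describes.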
+
+h1. Scalability Bottlenecks Down the Road
+
+* Currently each topic subscription is served on a different channel. The number of channels will become a bottleneck as the number of subscriptions grows. We should switch to an architecture where multiple topic subscriptions between the same (client, hub) pair are served on the same channel. We can have commands to start and stop subscriptions sent all the way to the server (right now these are local).
+* Publishes for a topic are serialized through a hub to get ordering guarantees. Currently, all subscriptions to that topic are served from the same hub. If we start having a large number of subscribers to heavy-volume topics, the outbound bandwidth or the CPU at that hub might become the bottleneck. In that case, we can set up other regions through which the messages are routed (this hierarchical scheme reduces bandwidth requirements at any single node). It should be possible to do this entirely through configuration.
+

Added: bookkeeper/site/trunk/content/docs/master/hedwigJMX.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/hedwigJMX.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/hedwigJMX.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/hedwigJMX.textile Tue Dec  9 
16:00:52 2014
@@ -0,0 +1,32 @@
+Title:        Hedwig JMX
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+        .
+
+h1. JMX
+
+Apache Hedwig has extensive support for JMX, which allows viewing and managing 
a hedwig cluster.
+
+This document assumes that you have basic knowledge of JMX. See "Sun JMX 
Technology":http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/
 page to get started with JMX.
+
+See the "JMX Management 
Guide":http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html 
for details on setting up local and remote management of VM instances. By 
default the included __hedwig__ script supports only local management - review 
the linked document to enable support for remote management (beyond the scope 
of this document).
+
+__Hub Server__ is a JMX manageable server, which registers the proper MBeans 
during initialization to support JMX monitoring and management of the instance.
+
+h1. Hub Server MBean Reference
+
+This table details JMX for a hub server.
+
+| _.MBean | _.MBean Object Name | _.Description |
+| PubSubServer | PubSubServer | Represents a hub server. It is the root MBean for a hub server and exposes its statistics, e.g. the number of packets sent/received/redirected and statistics for pub/sub/unsub/consume operations. |
+| NettyHandlers | NettyHandler | Provides statistics for Netty handlers. Currently it just returns the number of subscription channels established to a hub server. |
+| ReadAheadCache | ReadAheadCache | Provides read-ahead cache statistics. |

Added: bookkeeper/site/trunk/content/docs/master/hedwigMessageFilter.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/hedwigMessageFilter.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/hedwigMessageFilter.textile 
(added)
+++ bookkeeper/site/trunk/content/docs/master/hedwigMessageFilter.textile Tue 
Dec  9 16:00:52 2014
@@ -0,0 +1,76 @@
+Title:        Hedwig Message Filter
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .
+        .
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+        .
+
+h1. Message Filter
+
+Apache Hedwig provides an efficient mechanism for supporting 
application-defined __message filtering__.
+
+h2. Message
+
+Most message-oriented middleware (MOM) products treat messages as lightweight 
entities that consist of a header and a payload. The header contains fields 
used for message routing and identification; the payload contains the 
application data being sent.
+
+Hedwig messages follow a similar template, being composed of the following parts:
+
+* @Header@ - All messages support both system defined fields and application 
defined property values. Properties provide an efficient mechanism for 
supporting application-defined message filtering.
+* @Body@ - Hedwig considers the message body as an opaque binary blob.
+* @SrcRegion@ - Indicates where the message comes from.
+* @MessageSeqId@ - The unique message sequence id assigned by Hedwig.
+
+h3. Message Header Properties
+
+A __Message__ object contains a built-in facility for supporting 
application-defined property values. In effect, this provides a mechanism for 
adding application-specific header fields to a message.
+
+By using properties and  __message filters__, an application can have Hedwig 
select, or filter, messages on its behalf using application-specific criteria.
+
+Each property name must be a non-null __String__, while property values are binary blobs. The flexibility of binary blobs allows applications to define their own serialize/deserialize functions, allowing structured data to be stored in the message header.
+
+h2. Message Filter
+
+A __Message Filter__ allows an application to specify, via header properties, the messages it is interested in. Only messages which pass validation of a __Message Filter__, specified by a subscriber, are delivered to the subscriber.
+
+A message filter could be run either on the __server side__ or on the __client side__. For both __server side__ and __client side__, a __Message Filter__ implementation needs to implement the following two methods:
+
+* @setSubscriptionPreferences(topic, subscriberId, preferences)@: The __subscription preferences__ of the subscriber are passed to the message filter when it is attached to its subscription, either on the server side or on the client side.
+* @testMessage(message)@: Used to test whether a particular message passes the 
filter or not.
+
+The __subscription preferences__ are used to specify the messages that the 
user is interested in. The __message filter__ uses the __subscription 
preferences__ to decide which messages are passed to the user.
+
+Take a book store (using topic __BookStore__) as an example:
+
+# User A may only care about History books. He subscribes to __BookStore__ 
with his custom preferences : type="History".
+# User B may only care about Romance books. He subscribes to __BookStore__ 
with his custom preferences : type="Romance".
+# A new book arrives at the book store; a message is sent to __BookStore__ with type="History" in its header.
+# The message is then delivered to __BookStore__'s subscribers.
+# Subscriber A filters the message by checking messages' header to accept 
those messages whose type is "History".
+# Subscriber B filters out the message, as the type does not match its 
preferences.
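+
+The book store example above can be sketched in code as follows. The types here are simplified stand-ins and not Hedwig's real classes: @SketchMessage@ models only header properties, topic and subscriber id are plain strings, and the preferences are passed as a map. The class @BookTypeFilter@ and the helper @type@ are hypothetical names chosen for the illustration; only the two method names come from the description above.
+

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for a message: only its header properties are modelled.
class SketchMessage {
    final Map<String, byte[]> properties;

    SketchMessage(Map<String, byte[]> properties) {
        this.properties = properties;
    }
}

// Hypothetical filter accepting messages whose "type" property matches the preference.
class BookTypeFilter {
    private byte[] wantedType;

    // Called once when the filter is attached to a subscription.
    BookTypeFilter setSubscriptionPreferences(String topic, String subscriberId,
                                              Map<String, byte[]> preferences) {
        this.wantedType = preferences.get("type");
        return this;
    }

    // Returns true if the message should be delivered to the subscriber.
    boolean testMessage(SketchMessage message) {
        byte[] actual = message.properties.get("type");
        return wantedType != null && actual != null && Arrays.equals(wantedType, actual);
    }

    // Helper building a one-entry property map; values are binary blobs.
    static Map<String, byte[]> type(String value) {
        return Collections.singletonMap("type", value.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        BookTypeFilter userA = new BookTypeFilter()
                .setSubscriptionPreferences("BookStore", "A", type("History"));
        BookTypeFilter userB = new BookTypeFilter()
                .setSubscriptionPreferences("BookStore", "B", type("Romance"));
        SketchMessage newBook = new SketchMessage(type("History"));
        System.out.println("A accepts: " + userA.testMessage(newBook));
        System.out.println("B accepts: " + userB.testMessage(newBook));
    }
}
```

+
+Property values are compared as raw bytes here because the text defines them as binary blobs; an application with structured preferences would deserialize them first.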
+
+h3. Client Message Filter
+
+A __ClientMessageFilter__ runs on the client side. Each subscriber can write 
its own filter and pass it as a parameter when starting delivery ( 
__startDelivery(topic, subscriberId, messageHandler, messageFilter)__ ).
+
+h3. Server Message Filter
+
+A __ServerMessageFilter__ runs on the server side (a hub server). A hub server instantiates a server message filter, by means of reflection, using the message filter class specified in the subscription preferences provided by the subscriber. Since __ServerMessageFilter__s run on the hub server, filtered-out messages are never delivered to the client, reducing unnecessary network traffic. Hedwig uses an implementation of __ServerMessageFilter__ to filter unnecessary message deliveries between regions.
+
+Since hub servers use reflection to instantiate a __ServerMessageFilter__, an 
implementation of __ServerMessageFilter__ needs to implement two additional 
methods:
+
+* @initialize(conf)@: Initialize the message filter before filtering messages.
+* @uninitialize()@: Uninitialize the message filter to release resources used 
by the message filter.
+
+For the hub server to load the message filter, the implementation class must 
be in the server's classpath at startup.
+
+h3. Which message filter should be used?
+
+It depends on application requirements. Using a __ServerMessageFilter__ reduces network traffic by filtering out unnecessary messages, but it competes for resources on the hub server (CPU, memory, etc.). Conversely, __ClientMessageFilter__s have the advantage of inducing no extra load on the hub server, but at the price of higher network utilization. A filter can be installed both on the server side and on the client side; Hedwig does not restrict this.
+

Added: bookkeeper/site/trunk/content/docs/master/hedwigMetadata.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/hedwigMetadata.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/hedwigMetadata.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/hedwigMetadata.textile Tue Dec  9 
16:00:52 2014
@@ -0,0 +1,123 @@
+Title:        Hedwig Metadata Management
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .
+        .
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+        .
+
+h1. Metadata Management
+
+There are two classes of metadata that need to be managed in Hedwig. One is the __list of available hubs__, which is used to track server availability (ZooKeeper is designed naturally for this). The other consists of the data structures that track __topic states__ and __subscription states__. This second class can be handled by any key/value store which provides a __CAS (Compare And Set)__ operation. The metadata in this class are:
+
+* @Topic Ownership@: tracks which hub server is assigned to serve requests for 
a specific topic.
+* @Topic Persistence Info@: records what __bookkeeper ledgers__ are used to 
store messages for a specific topic and their message id ranges.
+* @Subscription Data@: records the preferences and subscription state for a 
specific subscription (topic, subscriber).
+
+Each kind of metadata is handled by a specific metadata manager. They are 
__TopicOwnershipManager__, __TopicPersistenceManager__ and 
__SubscriptionDataManager__.
+
+h2. Topic Ownership Management
+
+There are two ways to manage topic ownership. One is to leverage ZooKeeper's ephemeral znodes, recording the topic's owner info as a child ephemeral znode under its topic znode. When a hub server owning a specific topic crashes, the ephemeral znode which signifies topic ownership is deleted due to the loss of the ZooKeeper session. Other hubs can then be assigned the ownership of the topic. The other is to leverage the __CAS__ operation provided by key/value stores to do leader election. __CAS__ doesn't require the underlying key/value store to provide functionality similar to ZooKeeper's ephemeral nodes. With __CAS__ it is possible to guarantee that only one hub server gains the ownership of a specific topic, which is a more scalable and generic solution.
+
+An implementation of __TopicOwnershipManager__ is required to implement the following methods:
+
+<pre><code>
+
+public void readOwnerInfo(ByteString topic, Callback<Versioned<HubInfo>> 
callback, Object ctx);
+
+public void writeOwnerInfo(ByteString topic, HubInfo owner, Version version,
+                           Callback<Version> callback, Object ctx);
+
+public void deleteOwnerInfo(ByteString topic, Version version,
+                            Callback<Void> callback, Object ctx);
+
+</code></pre>
+
+* @readOwnerInfo@: Read the owner info from the underlying key/value store. The implementation is responsible for deserializing the metadata into a __HubInfo__ object identifying a hub server. Its current __version__ also needs to be returned for future updates. If no owner info is found for a topic, a null value is returned.
+
+* @writeOwnerInfo@: Write the owner info into the underlying key/value store with the given __version__. If the current __version__ in the underlying key/value store doesn't equal the provided __version__, the write should be rejected with __BadVersionException__. The new __version__ should be returned for a successful write. __NoTopicOwnerInfoException__ is returned if no owner info is found for a topic.
+
+* @deleteOwnerInfo@: Delete the owner info from key/value store with the given 
__version__. The owner info should be removed if the current __version__ in 
key/value store is equal to the provided __version__. Otherwise, the deletion 
should be rejected with __BadVersionException__. __NoTopicOwnerInfoException__ 
is returned if no owner info is found for the topic.
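+
+The version checks described above can be illustrated with a small in-memory store. This is a sketch under several assumptions and not a real __TopicOwnershipManager__: the calls are synchronous instead of callback-based, topics and owners are plain strings, versions are longs, the exception classes are local stand-ins named after the ones in the text, and the @NEW@ sentinel used to create the first entry is invented for the example.
+

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-ins named after the exceptions mentioned in the text.
class BadVersionException extends Exception { }
class NoTopicOwnerInfoException extends Exception { }

// Hypothetical synchronous, in-memory illustration of versioned (CAS) updates.
class OwnerStoreSketch {
    static final long NEW = -1L; // assumption: sentinel meaning "no entry exists yet"

    static final class VersionedOwner {
        final String owner;
        final long version;

        VersionedOwner(String owner, long version) {
            this.owner = owner;
            this.version = version;
        }
    }

    private final Map<String, VersionedOwner> store = new HashMap<>();

    // Returns the owner and its version, or null if the topic has no owner info.
    synchronized VersionedOwner readOwnerInfo(String topic) {
        return store.get(topic);
    }

    // Writes only if the caller's version matches the stored one; returns the new version.
    synchronized long writeOwnerInfo(String topic, String owner, long version)
            throws BadVersionException, NoTopicOwnerInfoException {
        VersionedOwner current = store.get(topic);
        if (current == null) {
            if (version != NEW) {
                throw new NoTopicOwnerInfoException();
            }
            store.put(topic, new VersionedOwner(owner, 0L));
            return 0L;
        }
        if (current.version != version) {
            throw new BadVersionException();
        }
        long next = current.version + 1;
        store.put(topic, new VersionedOwner(owner, next));
        return next;
    }

    // Deletes only if the caller's version matches the stored one.
    synchronized void deleteOwnerInfo(String topic, long version)
            throws BadVersionException, NoTopicOwnerInfoException {
        VersionedOwner current = store.get(topic);
        if (current == null) {
            throw new NoTopicOwnerInfoException();
        }
        if (current.version != version) {
            throw new BadVersionException();
        }
        store.remove(topic);
    }

    public static void main(String[] args) throws Exception {
        OwnerStoreSketch store = new OwnerStoreSketch();
        store.writeOwnerInfo("mytopic", "hubA:4080", NEW);
        try {
            // A second hub racing for the same topic loses the compare-and-set.
            store.writeOwnerInfo("mytopic", "hubB:4080", NEW);
        } catch (BadVersionException e) {
            System.out.println("hubB lost the race; owner is "
                    + store.readOwnerInfo("mytopic").owner);
        }
    }
}
```

+
+The @main@ method shows why CAS is enough for leader election: both hubs attempt the same conditional write, and the store guarantees that only one of them succeeds.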
+
+h2. Topic Persistence Info Management
+
+Similar to __TopicOwnershipManager__, an implementation of __TopicPersistenceManager__ is required to implement the READ/WRITE/DELETE interfaces below:
+
+<pre><code>
+public void readTopicPersistenceInfo(ByteString topic,
+                                     Callback<Versioned<LedgerRanges>> 
callback, Object ctx);
+
+public void writeTopicPersistenceInfo(ByteString topic, LedgerRanges ranges, 
Version version,
+                                      Callback<Version> callback, Object ctx);
+
+public void deleteTopicPersistenceInfo(ByteString topic, Version version,
+                                       Callback<Void> callback, Object ctx);
+</code></pre>
+
+* @readTopicPersistenceInfo@: Read the persistence info from the underlying key/value store. The implementation is responsible for deserializing the metadata into a __LedgerRanges__ object that includes the ledgers used to store messages. Its current __version__ also needs to be returned for future updates. If no persistence info is found for a topic, a null value is returned.
+
+* @writeTopicPersistenceInfo@: Write the persistence info into the underlying 
key/value store with the given __version__. If the current __version__ in the 
underlying key/value store doesn't equal the provided __version__, the write 
should be rejected with __BadVersionException__. The new __version__ should be 
returned on a successful write. __NoTopicPersistenceInfoException__ is returned 
if no persistence info is found for a topic.
+
+* @deleteTopicPersistenceInfo@: Delete the persistence info from the key/value store with the given __version__. The persistence info should be removed if the current __version__ in the key/value store equals the provided __version__. Otherwise, the deletion should be rejected with __BadVersionException__. __NoTopicPersistenceInfoException__ is returned if no persistence info is found for a topic.
+
+h2. Subscription Data Management
+
+__SubscriptionDataManager__ has READ/CREATE/WRITE/DELETE interfaces similar to those of the other managers. In addition, the implementation needs to implement the __READ SUBSCRIPTIONS__ interface, which fetches all the subscriptions for a given topic.
+
+<pre><code>
+public void createSubscriptionData(ByteString topic, ByteString subscriberId, 
SubscriptionData data,
+                                   Callback<Version> callback, Object ctx);
+
+public boolean isPartialUpdateSupported();
+
+public void updateSubscriptionData(ByteString topic, ByteString subscriberId, 
SubscriptionData dataToUpdate, 
+                                   Version version, Callback<Version> 
callback, Object ctx);
+
+public void replaceSubscriptionData(ByteString topic, ByteString subscriberId, 
SubscriptionData dataToReplace,
+                                    Version version, Callback<Version> 
callback, Object ctx);
+
+public void deleteSubscriptionData(ByteString topic, ByteString subscriberId, 
Version version,
+                                   Callback<Void> callback, Object ctx);
+
+public void readSubscriptionData(ByteString topic, ByteString subscriberId,
+                                 Callback<Versioned<SubscriptionData>> 
callback, Object ctx);
+
+public void readSubscriptions(ByteString topic, Callback<Map<ByteString, 
Versioned<SubscriptionData>>> cb,
+                              Object ctx);
+</code></pre>
+
+h3. Create/Update Subscriptions
+
+The metadata for a subscription has two parts: preferences and subscription state. __SubscriptionPreferences__ tracks all the preferences for a subscriber (e.g., an application could store its customized preferences for message filtering), while __SubscriptionState__ is used internally to track the message consumption state for a given subscriber. These two kinds of metadata are quite different: __SubscriptionPreferences__ is not updated frequently, while __SubscriptionState__ is updated frequently as messages are consumed. If the underlying key/value store supports independent field updates for a given key (subscription), __SubscriptionPreferences__ and __SubscriptionState__ could be stored as two different fields for a given subscription. In this case __isPartialUpdateSupported__ should return true. Otherwise, __isPartialUpdateSupported__ should return false and the implementation should serialize/deserialize __SubscriptionData__ as an opaque blob.
+
+* @createSubscriptionData@: Create a subscription entry for a given topic. The 
initial __version__ is returned on successful creation. 
__SubscriptionStateExistsException__ is returned if the subscription entry 
already exists.
+
+* @updateSubscriptionData/replaceSubscriptionData@: Update/replace the 
subscription data in the underlying key/value store with the given __version__. 
If the current __version__ in the underlying key/value store does not equal the 
provided __version__, the update should be rejected with 
__BadVersionException__. The new __version__ should be returned for a 
successful write. __NoSubscriptionStateException__ is returned if no 
subscription entry is found for a subscription (topic, subscriber).
+
+h3. Read Subscriptions
+
+* @readSubscriptionData@: Read the subscription data from the underlying 
key/value store. The implementation is responsible for deserializing the 
metadata into a __SubscriptionData__ object, including its preferences and 
subscription state. Its current __version__ must also be returned for use in 
future updates. If no subscription data is found for a subscription, a null 
value is returned.
+
+* @readSubscriptions@: Read all the subscription data from the key/value store 
for a given topic. The implementation is responsible for organizing all 
subscriptions for a topic so that they can be accessed efficiently. An empty 
map is returned if no subscriptions are found for a given topic.
+
+h3. Delete Subscription
+
+* @deleteSubscriptionData@: Delete the subscription data from the key/value 
store with the given __version__ for a specific subscription (topic, 
subscriber). The subscription info should be removed if the current 
__version__ in the key/value store equals the provided __version__. Otherwise, 
the deletion should be rejected with __BadVersionException__. 
__NoSubscriptionStateException__ is returned if no subscription data is found 
for a subscription (topic, subscriber).
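+
+The version checks described above for create, update/replace, and delete can 
be sketched with a small in-memory store. This is a minimal illustration only, 
not Hedwig's implementation: the class @VersionedSubscriptionStore@, its method 
signatures, the @long@ counter used as the version, and the string payload are 
all assumptions, and the exception classes are local stand-ins that merely 
reuse the names mentioned in this section.
+

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store; names and signatures are illustrative, not Hedwig's API.
class VersionedSubscriptionStore {
    // Local stand-ins for the exceptions named in the text (unchecked for brevity).
    static class BadVersionException extends RuntimeException {}
    static class NoSubscriptionStateException extends RuntimeException {}
    static class SubscriptionStateExistsException extends RuntimeException {}

    private static final class Entry {
        final String data;
        final long version;
        Entry(String data, long version) { this.data = data; this.version = version; }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    private static String key(String topic, String subscriberId) {
        return topic + "\u0000" + subscriberId;
    }

    // Create: rejected if the entry already exists; returns the initial version.
    synchronized long create(String topic, String subscriberId, String data) {
        String k = key(topic, subscriberId);
        if (entries.containsKey(k)) throw new SubscriptionStateExistsException();
        entries.put(k, new Entry(data, 0L));
        return 0L;
    }

    // Replace: applied only if the caller's version matches; returns the new version.
    synchronized long replace(String topic, String subscriberId, String data, long expected) {
        String k = key(topic, subscriberId);
        Entry current = entries.get(k);
        if (current == null) throw new NoSubscriptionStateException();
        if (current.version != expected) throw new BadVersionException();
        long next = current.version + 1;
        entries.put(k, new Entry(data, next));
        return next;
    }

    // Delete: applied only if the caller's version matches the stored one.
    synchronized void delete(String topic, String subscriberId, long expected) {
        String k = key(topic, subscriberId);
        Entry current = entries.get(k);
        if (current == null) throw new NoSubscriptionStateException();
        if (current.version != expected) throw new BadVersionException();
        entries.remove(k);
    }

    public static void main(String[] args) {
        VersionedSubscriptionStore store = new VersionedSubscriptionStore();
        long v0 = store.create("topic", "subscriber", "state-a");
        long v1 = store.replace("topic", "subscriber", "state-b", v0);
        try {
            store.replace("topic", "subscriber", "state-c", v0); // stale version
        } catch (BadVersionException e) {
            System.out.println("stale write rejected");
        }
        store.delete("topic", "subscriber", v1);
    }
}
```

+
+A writer that receives a bad-version rejection would typically re-read the 
entry to obtain the current version before retrying.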
+
+h1. How to choose a key/value store for Hedwig.
+
+Based on the interface above, a key/value store must meet several requirements 
before it can be used for Hedwig:
+
+* @CAS@: The ability to do strict updates according to a specific condition, 
e.g. a specific version (ZooKeeper) or the same content (HBase).
+* @Optimized for Writes@: The metadata access pattern for Hedwig is an initial 
read followed by continuous updates.
+* @Optimized for retrieving all subscriptions for a topic@: Either 
hierarchical structures to maintain such relationships (ZooKeeper), or ordered 
key/value storage to cluster the subscription for a topic together, would 
provide efficient subscription data management.
+
+__ZooKeeper__ is the default implementation for Hedwig metadata management. It 
holds data in memory and provides a filesystem-like namespace, meeting the 
above requirements. __ZooKeeper__ is suitable for most Hedwig use cases. 
However, if your application needs to manage millions of topics/subscriptions, 
a more scalable solution would be __HBase__, which also meets the above 
requirements.

Added: bookkeeper/site/trunk/content/docs/master/hedwigParams.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/hedwigParams.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/hedwigParams.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/hedwigParams.textile Tue Dec  9 
16:00:52 2014
@@ -0,0 +1,92 @@
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+        
+h1. Hedwig configuration parameters        
+        
+This page contains detailed information about configuration parameters used 
for Hubs, Regions, ZooKeeper, and BookKeeper.
+        
+h2. Hedwig server configuration parameters
+
+Please also refer to the configuration file that comes with the distribution: 
_hedwig-server/conf/hw_server.conf_.  
+
+h3. Region related parameters
+
+| @region@ | Region identifier. Default is "standalone". |
+| @regions@ | List of region identifiers, space separated. Default is empty. |
+| @inter_region_ssl_enabled (deprecated)@ | Enables SSL across regions. 
Default is false. *Since this parameter has been deprecated, use 
__ssl_enabled__ in _hedwig-server/conf/hw_region_client.conf_ to enable SSL 
across regions instead.* |
+| @retry_remote_subscribe_thread_run_interval@ | This parameter is used to 
determine how often we run a thread to retry those failed remote subscriptions 
in asynchronous mode (in milliseconds). Default is 2 minutes. |
+
+h3. Hub server parameters
+
+| @standalone@ | Sets the hub server to run in standalone mode (no regions). 
Default is false. |
+| @server_port@ | Sets the server port that receives client connections. 
Default is 4080. |
+| @ssl_enabled@ | Enables SSL. Default is false. |
+| @ssl_server_port@ | Sets the server port for SSL connections. Default is 
9876. | 
+| @password@ | Password used for the pkcs12 certificate. Default is the empty 
string. |
+| @cert_name@ | Sets the name of the SSL certificate if available as a 
resource. Default is the null string. |
+| @cert_path@ | Sets the path to the SSL certificate if it is available as a 
file. Default is the null string. |
+
+h3. Read-ahead cache parameters
+
+| @readahead_enabled@ | Enables read-ahead. Enabled by default. | 
+| @readahead_count@ | Number of messages to read ahead. Default is 10. |
+| @readahead_size@ | Maximum number of bytes to read during a scan. Default is 
4 megabytes. |
+
+bq. Upon a range scan request for a given topic, two hints are provided as to 
when scanning should stop: the number of messages scanned and the total size of 
messages scanned. Scanning stops whenever one of these limits is exceeded.
+
+| @cache_size@ | Sets the size of the read-ahead cache. Default is the 
smaller of 2G and half the heap size. | 
+| @cache_entry_ttl@ | Sets the TTL for cache entries. Each time a new entry is 
added to the cache, expired cache entries are discarded. If the value is zero 
or negative, cache entries are not evicted until the cache is full or the 
messages have been consumed. Default is 0. |
+| @scan_backoff_ms@ | The backoff time (in milliseconds) to retry scans after 
failures. Default is 1s (1000ms). |
+| @num_readahead_cache_threads@ | Sets the number of threads to be used for 
the read-ahead mechanism. Default is the number of cores as returned with a 
call to <code>Runtime.getRuntime().availableProcessors()</code>.|
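+
+The two stop hints for a range scan (@readahead_count@ and @readahead_size@) 
can be illustrated with a short sketch. This is a simplified model, not 
Hedwig's scanner: the class @RangeScanSketch@, the @scan@ method, and the 
representation of a topic as an in-memory list are assumptions. The exact 
boundary behaviour is assumed as well: here the message that pushes the total 
past the size limit is still returned, and scanning stops before the next one.
+

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a bounded range scan; not Hedwig's actual scanner.
class RangeScanSketch {
    // Returns messages starting at startIndex, stopping once either the
    // message-count limit or the total-size limit has been reached.
    static List<byte[]> scan(List<byte[]> topicLog, int startIndex,
                             int messageLimit, long sizeLimit) {
        List<byte[]> result = new ArrayList<>();
        long totalBytes = 0;
        for (int i = startIndex; i < topicLog.size(); i++) {
            if (result.size() >= messageLimit || totalBytes >= sizeLimit) {
                break; // one of the two hints says we have scanned enough
            }
            byte[] message = topicLog.get(i);
            result.add(message);
            totalBytes += message.length;
        }
        return result;
    }

    public static void main(String[] args) {
        List<byte[]> log = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            log.add(new byte[4]);
        }
        // Size-bound scan: stops after the total size reaches 10 bytes.
        System.out.println(scan(log, 0, 10, 10).size());
        // Count-bound scan: stops after 2 messages.
        System.out.println(scan(log, 0, 2, 1000).size());
    }
}
```
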
+
+h3. Publish and subscription parameters 
+
+| @max_message_size@ | Sets the maximum message size. Default is 1.2 
megabytes. |
+| @default_message_window_size@ | This parameter is used for setting the 
default maximum number of messages that can be delivered to a subscriber 
without being consumed. We pause delivery to a subscriber when reaching the 
window size. Default is unlimited (0). |
+| @consume_interval@ | Sets the number of messages consumed before persisting 
information about consumed messages. A value greater than one avoids persisting 
information about consumed messages upon every consumed message. Default is 50.|
+| @retention_secs@ | The interval (in seconds) after which an owned topic is 
released. If this parameter is greater than zero, a task is scheduled to 
release an owned topic. Default is 0 (never released). |
+| @messages_consumed_thread_run_interval@ | Time interval (in milliseconds) at 
which the messages-consumed timer task runs to delete consumed ledgers in 
BookKeeper. Default is 1 minute (60,000 ms). |
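+
+The effect of @consume_interval@ can be sketched as a counter that persists 
the consumed position only once every N consumed messages. This is an 
illustration under stated assumptions rather than Hedwig's code: the class 
@ConsumeIntervalSketch@, its fields, and the idea of persistence as a simple 
assignment are all hypothetical.
+

```java
// Hypothetical model of batching consume-position writes; not Hedwig's implementation.
class ConsumeIntervalSketch {
    private final int consumeInterval;
    private long lastConsumedSeqId = 0;
    private long lastPersistedSeqId = 0;
    private int consumedSincePersist = 0;
    private int persistCount = 0;

    ConsumeIntervalSketch(int consumeInterval) {
        this.consumeInterval = consumeInterval;
    }

    // Record a consumed message; persist only every consumeInterval messages.
    void consumed(long seqId) {
        lastConsumedSeqId = seqId;
        consumedSincePersist++;
        if (consumedSincePersist >= consumeInterval) {
            persist();
        }
    }

    // Stand-in for a write to the metadata store.
    private void persist() {
        lastPersistedSeqId = lastConsumedSeqId;
        consumedSincePersist = 0;
        persistCount++;
    }

    long getLastPersistedSeqId() { return lastPersistedSeqId; }

    int getPersistCount() { return persistCount; }

    public static void main(String[] args) {
        ConsumeIntervalSketch tracker = new ConsumeIntervalSketch(50);
        for (long seqId = 1; seqId <= 120; seqId++) {
            tracker.consumed(seqId);
        }
        System.out.println("writes: " + tracker.getPersistCount()
                + ", persisted up to: " + tracker.getLastPersistedSeqId());
    }
}
```

+
+In this sketch the in-memory position can run ahead of the persisted one by up 
to @consume_interval@ minus one messages, which is the price paid for fewer 
metadata writes.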
+
+
+h3. ZooKeeper parameters
+ 
+| @zk_host@ | Sets the ZooKeeper list of servers. Default is localhost:2181. |
+| @zk_timeout@ | Sets the ZooKeeper session timeout. Default is 2s. |
+
+h3. BookKeeper parameters
+
+| @bk_ensemble_size@ | Sets the ensemble size. Default is 3. |
+| @bk_write_quorum_size@ | Sets the write quorum size. Default is 2. |
+| @bk_ack_quorum_size@ | Sets the ack quorum size. Default is 2. |
+
+bq. Note that the ack quorum size must be less than or equal to the write 
quorum size.
+
+| @max_entries_per_ledger@ | Maximum number of entries before we roll a 
ledger. Default is unlimited (0). |
+
+h3. Metadata parameters
+
+| @zk_prefix@ | Sets the ZooKeeper path prefix. Default is _/hedwig_. |
+| @metadata_manager_based_topic_manager_enabled@ | Enables the use of a 
metadata manager for topic management. Default is false. |
+| @metadata_manager_factory_class@ | Sets the default factory for the metadata 
manager. Default is null. |
+
+h2. Region manager configuration parameters
+
+Please also refer to the configuration file that comes with the distribution: 
_hedwig-server/conf/hw_region_client.conf_.
+
+| @ssl_enabled@ | This parameter is a boolean flag indicating if communication 
with the server should be done via SSL for encryption. The Hedwig server hubs 
also need to be SSL enabled for this to work. Default value is false. |
+| @max_message_size@ | Sets the maximum message size in bytes. The default 
value is 2 MB (2097152). |
+| @max_server_redirects@ | Sets the maximum number of redirects we permit 
before signaling an error. Default value is 2. |
+| @auto_send_consume_message_enabled@ | A flag indicating whether the client 
library should automatically send consume messages to the server. Default value 
is true. |
+| @consumed_messages_buffer_size@ | Sets the number of messages we buffer 
before sending a consume message to the server. Default value is 5. |
+| @max_outstanding_messages@ | Support for client side throttling, sets the 
maximum number of outstanding messages. Default value is 10. |
+| @server_ack_response_timeout@ | Sets the timeout (in milliseconds) before we 
error out any existing requests. Default value is 30s (30,000). |
+        

Added: bookkeeper/site/trunk/content/docs/master/hedwigUser.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/hedwigUser.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/hedwigUser.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/hedwigUser.textile Tue Dec  9 
16:00:52 2014
@@ -0,0 +1,63 @@
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+
+h1. Design
+
+In Hedwig, clients publish messages associated with a topic, and they 
subscribe to a topic to receive all messages published with that topic. Clients 
are associated with (publish to and subscribe from) a Hedwig _instance_ (also 
referred to as a _region_), which consists of a number of servers called 
_hubs_. The hubs partition up topic ownership among themselves, and all 
publishes and subscribes to a topic must be done to its owning hub. When a 
client doesn't know the owning hub, it tries a default hub, which may redirect 
the client.
+
+Running a Hedwig instance requires a ZooKeeper server and at least three 
BookKeeper servers.
+
+An instance is designed to run within a datacenter. For wide-area messaging 
across datacenters, specify in the server configuration the set of default 
servers for each of the other instances. Dissemination among instances 
currently takes place over an all-to-all topology. Local subscriptions cause 
the hub to subscribe to all other regions on this topic, so that the local 
region receives all updates to it. Future work includes allowing the user to 
overlay alternative topologies.
+
+Because all messages on a topic go through a single hub per region, all 
messages within a region are ordered. This means that, for a given topic, 
messages are delivered in the same order to all subscribers within a region, 
and messages from any particular region are delivered in the same order to all 
subscribers globally, but messages from different regions may be delivered in 
different orders to different regions. Providing global ordering is 
prohibitively expensive in the wide area. However, in Hedwig clients such as 
PNUTS, the lack of global ordering is not a problem, as PNUTS serializes all 
updates to a table row at a single designated master for that row.
+
+Topics are independent; Hedwig provides no ordering across different topics.
+
+Version vectors are associated with each topic and serve as the identifiers 
for each message. Vectors consist of one component per region. A component 
value is the region's local sequence number on the topic, and is incremented 
each time a hub persists a message (published either locally or remotely) to 
BookKeeper.
+
+TODO: More on how version vectors are to be used, and on maintaining 
vector-maxes.
+
+h1. Entry Points
+
+The main class for running the server is 
@org.apache.hedwig.server.netty.PubSubServer@. It takes a single argument, 
which is a "Commons Configuration":http://commons.apache.org/configuration/ 
file. Currently, for configuration, the source is the documentation. See 
@org.apache.hedwig.server.conf.ServerConfiguration@ for server configuration 
parameters.
+
+The client is a library intended to be consumed by user applications. It takes 
a Commons Configuration object, for which the source/documentation is in 
@org.apache.hedwig.client.conf.ClientConfiguration@.
+
+h1. Deployment
+
+h2. Limits
+
+Because the current implementation uses a single socket per subscription, 
Hedwig requires a high @ulimit@ on the number of open file descriptors. 
Non-root users can only use up to the limit specified in 
@/etc/security/limits.conf@; to raise this to 1024^2, as root, modify the 
"nofile" line in @/etc/security/limits.conf@ on all hubs.
+
+h2. Running Servers
+
+Hedwig requires BookKeeper to run. For BookKeeper setup instructions see 
"BookKeeper Getting Started":./bookkeeperStarted.html.
+
+To start a Hedwig hub server:
+
+@hedwig-server/bin/hedwig server@
+
+Hedwig takes its configuration from @hedwig-server/conf/hw_server.conf@ by 
default. To use a configuration file in a different location, set the 
@HEDWIG_SERVER_CONF@ environment variable.
+
+h1. Debugging
+
+You can attach an Eclipse debugger (or any debugger) to a Java process running 
on a remote host, as long as it has been started with the appropriate JVM 
flags. (See the Building Hedwig document to set up your Eclipse environment.) 
To launch something using @bin/hedwig@ with debugger attachment enabled, prefix 
the command with 
@HEDWIG_EXTRA_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,address=5000@, 
e.g.:
+
+@HEDWIG_EXTRA_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,address=5000 
hedwig-server/bin/hedwig server@
+
+h1. Logging
+
+Hedwig uses "slf4j":http://www.slf4j.org for logging, with the log4j bindings 
enabled by default. To enable logging in Hedwig, create a @log4j.properties@ 
file and point the environment variable @HEDWIG_LOG_CONF@ to the file. The path 
to the @log4j.properties@ file must be absolute.
+
+@export HEDWIG_LOG_CONF=/tmp/log4j.properties@
+@hedwig-server/bin/hedwig server@
+
+

Added: bookkeeper/site/trunk/content/docs/master/index.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/index.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/index.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/index.textile Tue Dec  9 16:00:52 
2014
@@ -0,0 +1,52 @@
+Title:     BookKeeper Documentation
+Notice:    Licensed to the Apache Software Foundation (ASF) under one
+           or more contributor license agreements.  See the NOTICE file
+           distributed with this work for additional information
+           regarding copyright ownership.  The ASF licenses this file
+           to you under the Apache License, Version 2.0 (the
+           "License"); you may not use this file except in compliance
+           with the License.  You may obtain a copy of the License at
+           .
+             http://www.apache.org/licenses/LICENSE-2.0
+           .
+           Unless required by applicable law or agreed to in writing,
+           software distributed under the License is distributed on an
+           "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+           KIND, either express or implied.  See the License for the
+           specific language governing permissions and limitations
+           under the License.
+
+h1. Apache BookKeeper documentation
+
+* "Overview":./bookkeeperOverview.html
+* "Getting started":./bookkeeperStarted.html
+* "Programmer's Guide":./bookkeeperProgrammer.html
+* "Bookie Server Configuration Parameters":./bookieConfigParams.html
+* "BookKeeper Configuration Parameters":./bookkeeperConfigParams.html
+* "BookKeeper Internals":./bookkeeperInternals.html
+* "Bookie Recovery":./bookieRecovery.html
+* "Using BookKeeper stream library":./bookkeeperStream.html
+* "BookKeeper Metadata Management":./bookkeeperMetadata.html
+
+h2. BookKeeper Admin & Ops
+
+* "Admin Guide":./bookkeeperConfig.html
+* "BookKeeper JMX":./bookkeeperJMX.html
+
+h1. Apache Hedwig documentation
+
+* "Building Hedwig, or how to set up Hedwig":./hedwigBuild.html
+* "User's Guide, or how to program against the Hedwig API and how to run 
it":./hedwigUser.html
+* "Developer's Guide, or Hedwig internals and hacking 
details":./hedwigDesign.html
+* "Configuration parameters":./hedwigParams.html
+* "Message Filtering":./hedwigMessageFilter.html
+* "Hedwig Metadata Management":./hedwigMetadata.html
+
+h2. Hedwig Admin & Ops
+
+* "Hedwig Console":./hedwigConsole.html
+* "Hedwig JMX":./hedwigJMX.html
+
+h1. Metastore documentation
+
+* "Metastore Interface":./metastore.html

Added: bookkeeper/site/trunk/content/docs/master/metastore.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/metastore.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/metastore.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/metastore.textile Tue Dec  9 
16:00:52 2014
@@ -0,0 +1,47 @@
+Title:        Metastore Interface
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .
+        .
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+        .
+
+h1. Metastore Interface
+
+Although Apache BookKeeper provides "LedgerManager":./bookkeeperMetadata.html 
and "Hedwig Metadata Managers":./hedwigMetadata.html so that users can plug in 
different metadata stores for both BookKeeper and Hedwig, implementing a 
correct and efficient manager requires detailed knowledge of both projects and 
is quite difficult. The __MetaStore__ interface extracts the commonality of 
the metadata storage interfaces, so that users can focus on adapting the 
underlying storage itself without having to worry about the detailed logic of 
BookKeeper and Hedwig.
+
+h2. MetaStore
+
+The __MetaStore__ interface provides users with access to __MetastoreTable__s 
used for BookKeeper and Hedwig metadata management. There are two kinds of 
table defined in a __MetaStore__: __MetastoreTable__, which provides basic 
__PUT__, __GET__, __REMOVE__, and __SCAN__ operations and does not assume any 
ordering requirements from the underlying storage; and 
__MetastoreScannableTable__, which is derived from __MetastoreTable__ but 
*does* assume that data is stored in key order in the underlying storage.
+
+* @getName@: Return the name of the __MetaStore__.
+* @getVersion@: Return current __MetaStore__ plugin version.
+* @init@: Initialize the __MetaStore__ library with the given configuration 
and its version.
+* @close@: Close the __MetaStore__, freeing all resources, e.g. releasing all 
open connections and occupied memory.
+* @createTable@: Create a table instance to access the data stored in it. A 
table name is given to locate the table. A __MetastoreTable__ object is 
returned.
+* @createScannableTable@: Similar to __createTable__, but returns a 
__MetastoreScannableTable__ rather than a __MetastoreTable__ object. If the 
underlying table is not an ordered table, __MetastoreException__ should be 
thrown.
+
+h2. MetaStore Table
+
+__MetastoreTable__ is the basic unit in a __MetaStore__ and is used to handle 
different types of metadata, e.g. one __MetastoreTable__ is used to store 
metadata for ledgers, while another __MetastoreTable__ is used to store 
metadata for topic persistence info. The interface for a __MetastoreTable__ is 
quite simple:
+
+* @get@: Retrieve an entry by a given __key__. __OK__ and the entry's current 
version in metadata storage are returned on success. __NoKey__ is returned for 
a non-existent key. If __fields__ are specified, return only the specified 
fields for the key.
+* @put@: Put the given __value__ associated with __key__ with given 
__version__. The value is only updated when the given __version__ equals the 
current version in metadata storage. A new __version__ should be returned when 
updated successfully. __NoKey__ is returned for a non-existent key, 
__BadVersion__ is returned when an update is attempted with a __version__ which 
does not match the one in the metadata store.
+* @remove@: Remove the given __value__ associated with __key__. The value is 
only removed when the given __version__ equals its current version in metadata 
storage. __NoKey__ is returned for a non-existent key, __BadVersion__ is 
returned when remove is attempted with a __version__ which does not match.
+* @openCursor@: Open a __cursor__ to iterate over all the entries of a table. 
The returned cursor does not need to provide any ordering or transactional 
guarantees.
+
+h2. MetaStore Scannable Table
+
+__MetastoreScannableTable__ is identical to a __MetastoreTable__ except that 
it provides an additional interface to iterate over entries in the table in key 
order.
+
+* @openCursor@: Open a __cursor__ to iterate over all the entries of a table 
between the key range of __firstKey__ and __lastKey__.
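+
+A key-ordered range cursor of this kind can be sketched on top of a sorted 
map. This is a minimal illustration, not the MetaStore API: the class 
@ScannableTableSketch@, the string keys and values, the @put@ helper, and the 
choice of inclusive bounds at both ends are assumptions made for the example.
+

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical ordered table backed by a TreeMap; not the actual MetaStore classes.
class ScannableTableSketch {
    private final TreeMap<String, String> rows = new TreeMap<>();

    void put(String key, String value) {
        rows.put(key, value);
    }

    // Iterate, in key order, over entries with firstKey <= key <= lastKey.
    // Inclusive bounds are an assumption of this sketch.
    Iterator<Map.Entry<String, String>> openCursor(String firstKey, String lastKey) {
        return rows.subMap(firstKey, true, lastKey, true).entrySet().iterator();
    }

    public static void main(String[] args) {
        ScannableTableSketch table = new ScannableTableSketch();
        table.put("topicB/sub1", "state-3");
        table.put("topicA/sub2", "state-2");
        table.put("topicA/sub1", "state-1");

        // Keys that share a topic prefix are adjacent, so one range covers them all.
        Iterator<Map.Entry<String, String>> cursor =
                table.openCursor("topicA/", "topicA/\uffff");
        while (cursor.hasNext()) {
            Map.Entry<String, String> entry = cursor.next();
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

+
+Prefixing each key with the topic name, as in the usage above, is one way an 
ordered store can keep all subscriptions of a topic together for a single 
range scan.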
+
+h2. How to organize your metadata.
+
+Some metadata in Hedwig and BookKeeper does not need to be stored in the order 
of the ledger id or the topic: namely, topic ownership and topic persistence 
info. You could use a kind of hash table to store them. Subscription state and 
ledger metadata, by contrast, must be stored in key order due to the current 
logic in Hedwig/BookKeeper.

