Author: ivank
Date: Tue Dec  9 16:00:52 2014
New Revision: 1644097

URL: http://svn.apache.org/r1644097
Log:
Syncing website with git

Added:
    bookkeeper/site/trunk/content/docs/master/bookieConfigParams.textile
    bookkeeper/site/trunk/content/docs/master/bookieRecovery.textile
    bookkeeper/site/trunk/content/docs/master/bookkeeperConfig.textile
    bookkeeper/site/trunk/content/docs/master/bookkeeperConfigParams.textile
    bookkeeper/site/trunk/content/docs/master/bookkeeperInternals.textile
    bookkeeper/site/trunk/content/docs/master/bookkeeperJMX.textile
    bookkeeper/site/trunk/content/docs/master/bookkeeperMetadata.textile
    bookkeeper/site/trunk/content/docs/master/bookkeeperOverview.textile
    bookkeeper/site/trunk/content/docs/master/bookkeeperProgrammer.textile
    bookkeeper/site/trunk/content/docs/master/bookkeeperStarted.textile
    bookkeeper/site/trunk/content/docs/master/bookkeeperStream.textile
    bookkeeper/site/trunk/content/docs/master/doc.textile
    bookkeeper/site/trunk/content/docs/master/hedwigBuild.textile
    bookkeeper/site/trunk/content/docs/master/hedwigConsole.textile
    bookkeeper/site/trunk/content/docs/master/hedwigDesign.textile
    bookkeeper/site/trunk/content/docs/master/hedwigJMX.textile
    bookkeeper/site/trunk/content/docs/master/hedwigMessageFilter.textile
    bookkeeper/site/trunk/content/docs/master/hedwigMetadata.textile
    bookkeeper/site/trunk/content/docs/master/hedwigParams.textile
    bookkeeper/site/trunk/content/docs/master/hedwigUser.textile
    bookkeeper/site/trunk/content/docs/master/index.textile
    bookkeeper/site/trunk/content/docs/master/metastore.textile

Added: bookkeeper/site/trunk/content/docs/master/bookieConfigParams.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/bookieConfigParams.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/bookieConfigParams.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/bookieConfigParams.textile Tue 
Dec  9 16:00:52 2014
@@ -0,0 +1,94 @@
+Title:        Bookie Configuration Parameters
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+
+h1. Bookie Configuration Parameters
+
+This page contains detailed information about configuration parameters used 
for configuring a bookie server. There is an example in 
"bookkeeper-server/conf/bk_server.conf". 
+
+h2. Server parameters
+
+| @bookiePort@        |Port that bookie server listens on. The default value 
is 3181.|
+| @journalDirectory@        | Directory to which Bookkeeper outputs its write 
ahead log, ideally on a dedicated device. The default value is "/tmp/bk-txn". |
+| @ledgerDirectories@        | Directory to which Bookkeeper outputs ledger 
snapshots.  Multiple directories can be defined, separated by comma, e.g. 
/tmp/bk1-data,/tmp/bk2-data. Ideally ledger dirs and journal dir are each on a 
different device, which reduces the contention between random I/O and 
sequential writes. It is possible to run with a single disk,  but performance 
will be significantly lower.|
+| @indexDirectories@  | Directories to store index files. If not specified, 
bookie will use ledgerDirectories to store index files. |
+| @bookieDeathWatchInterval@ | Interval to check whether a bookie is dead or 
not, in milliseconds. |
+| @gcWaitTime@        | Interval to trigger next garbage collection, in 
milliseconds. Since garbage collection is running in the background, running 
the garbage collector too frequently hurts performance. It is best to set its 
value high enough if there is sufficient disk capacity.|
+| @flushInterval@ | Interval to flush ledger index pages to disk, in milliseconds. Flushing index files will introduce random disk I/O. Consequently, it is important to have the journal dir and ledger dirs on different devices. However, if it is necessary to have the journal dir and ledger dirs on the same device, one option is to increase the flush interval to get higher performance. Upon a failure, the bookie will then take longer to recover. |
+| @numAddWorkerThreads@ | Number of threads that should handle write requests. If zero, writes are handled by Netty threads directly. |
+| @numReadWorkerThreads@ | Number of threads that should handle read requests. If zero, reads are handled by Netty threads directly. |
+
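+A minimal @bk_server.conf@ fragment covering the server parameters above might look like the following (the directory paths and thread counts are illustrative, not recommendations):

```
bookiePort=3181
journalDirectory=/mnt/journal/bk-txn
ledgerDirectories=/mnt/data1/bk-data,/mnt/data2/bk-data
indexDirectories=/mnt/data1/bk-index
numAddWorkerThreads=1
numReadWorkerThreads=8
```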
+h2. NIO server settings
+
+| @serverTcpNoDelay@ | This setting is used to enable/disable Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting @serverTcpNoDelay@ to false to enable the Nagle algorithm can provide better performance. The default value is true. |
+
+h2. Journal settings
+
+| @journalMaxSizeMB@  |  Maximum file size of a journal file, in megabytes. A new journal file will be created when the old one reaches the file size limit. The default value is 2048 (2GB). |
+| @journalMaxBackups@ |  Maximum number of old journal files to keep. Keeping a number of old journal files might help data recovery in some special cases. The default value is 5. |
+| @journalPreAllocSizeMB@ | The space that bookie pre-allocate at a time in 
the journal. |
+| @journalWriteBufferSizeKB@ | Size of the write buffers used for the journal. 
|
+| @journalRemoveFromPageCache@ | Whether the bookie removes pages from the page cache after a force write. Used to avoid the journal polluting the OS page cache. |
+| @journalAdaptiveGroupWrites@ | Whether to group journal force writes, which optimizes group commit for higher throughput. |
+| @journalMaxGroupWaitMSec@ | Maximum latency to impose on a journal write to 
achieve grouping. |
+| @journalBufferedWritesThreshold@ | Maximum writes to buffer to achieve 
grouping. |
+| @journalFlushWhenQueueEmpty@ | Whether to flush the journal when the journal queue is empty. Disabling it provides sustained journal add throughput. |
+| @numJournalCallbackThreads@ | The number of threads that should handle 
journal callbacks. |
+
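+For example, a journal tuned for grouped commits might be configured as follows (the values are illustrative, not recommendations):

```
journalMaxSizeMB=2048
journalMaxBackups=5
journalAdaptiveGroupWrites=true
journalMaxGroupWaitMSec=2
journalBufferedWritesThreshold=524288
journalFlushWhenQueueEmpty=false
```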
+h2. Ledger cache settings
+
+| @openFileLimit@ | Maximum number of ledger index files that can be opened in a bookie. If the number of ledger index files reaches this limit, the bookie starts to flush some ledger indexes from memory to disk. If flushing happens too frequently, performance is affected. You can tune this number accordingly to improve performance. |
+| @pageSize@ | Size of an index page in the ledger cache, in bytes. A larger index page can improve performance when writing pages to disk, which is efficient when you have a small number of ledgers and these ledgers have a similar number of entries. With a large number of ledgers and a few entries per ledger, a smaller index page improves memory usage. |
+| @pageLimit@ | Maximum number of index pages to store in the ledger cache. If the number of index pages reaches this limit, the bookie server starts to flush ledger indexes from memory to disk. Incrementing this value is an option when flushing becomes frequent. It is important to make sure, though, that pageLimit*pageSize is not more than the JVM max memory limit; otherwise it will raise an OutOfMemoryError. In general, incrementing pageLimit while using a smaller index page gives better performance in the case of a large number of ledgers with few entries per ledger. If pageLimit is -1, a bookie uses 1/3 of the JVM memory to compute the maximum number of index pages. |
+
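+The pageLimit*pageSize memory bound described above can be checked with a small calculation. The sketch below (plain Python; the values are made up for illustration) shows why the product must stay well under the JVM heap:

```python
# Illustrative ledger-cache sizing check. The variable names mirror the
# pageSize/pageLimit parameters above; the values are hypothetical.
page_size = 8192        # pageSize: 8 KB per index page
page_limit = 131072     # pageLimit: max index pages kept in memory
cache_bytes = page_size * page_limit
# This must be comfortably below the JVM max heap, or the bookie risks
# an OutOfMemoryError.
print(cache_bytes // (1024 * 1024))  # 1024 (MB)
```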
+h2. Ledger manager settings
+
+| @ledgerManagerType@ | What kind of ledger manager is used to manage how 
ledgers are stored, managed and garbage collected. See "BookKeeper 
Internals":./bookkeeperInternals.html for detailed info. Default is flat. |
+| @zkLedgersRootPath@ | Root zookeeper path to store ledger metadata. Default 
is /ledgers. |
+
+h2. Entry Log settings
+
+| @logSizeLimit@      | Maximum file size of entry logger, in bytes. A new 
entry log file will be created when the old one reaches the file size 
limitation. The default value is 2GB. |
+| @entryLogFilePreallocationEnabled@ | Enable/disable entry logger preallocation. Enabling this provides sustained higher throughput and reduces latency impact. |
+| @readBufferSizeBytes@ | The number of bytes used as capacity for 
BufferedReadChannel. Default is 512 bytes. |
+| @writeBufferSizeBytes@ | The number of bytes used as capacity for the write 
buffer. Default is 64KB. |
+
+h2. Entry Log compaction settings
+
+| @minorCompactionInterval@ | Interval to run minor compaction, in seconds. If 
it is set to less than or equal to zero, then minor compaction is disabled. 
Default is 1 hour. |
+| @minorCompactionThreshold@ | Entry log files with remaining size under this threshold value will be compacted in a minor compaction. If it is set to less than or equal to zero, minor compaction is disabled. Default is 0.2. |
+| @majorCompactionInterval@ | Interval to run major compaction, in seconds. If 
it is set to less than or equal to zero, then major compaction is disabled. 
Default is 1 day. |
+| @majorCompactionThreshold@ | Entry log files with remaining size below this 
threshold value will be compacted in a major compaction. Those entry log files 
whose remaining size percentage is still higher than the threshold value will 
never be compacted. If it is set to less than or equal to zero, the major 
compaction is disabled. Default is 0.8. |
+| @compactionMaxOutstandingRequests@ | The maximum number of entries which can be compacted without flushing. When compacting, the entries are written to the entry log and the new offsets are cached in memory. Once the entry log is flushed, the index is updated with the new offsets. This parameter controls the number of entries added to the entry log before a flush is forced. A higher value for this parameter means more memory will be used for offsets. Each offset consists of three longs. This parameter should _not_ be modified unless you know what you're doing. |
+| @compactionRate@ | The rate at which compaction will re-add entries. The 
unit is adds per second. |
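+The interaction of the two compaction thresholds can be sketched as follows (a plain-Python model of the behaviour described above, not BookKeeper's actual implementation):

```python
# Model of the compaction thresholds described above (illustrative).
MINOR_THRESHOLD = 0.2   # minorCompactionThreshold
MAJOR_THRESHOLD = 0.8   # majorCompactionThreshold

def compaction_kind(remaining_fraction):
    """Which compaction would pick up an entry log with this much live data."""
    if remaining_fraction < MINOR_THRESHOLD:
        return "minor"
    if remaining_fraction < MAJOR_THRESHOLD:
        return "major"
    return "never"      # above majorCompactionThreshold: never compacted

print(compaction_kind(0.1))   # minor
print(compaction_kind(0.5))   # major
print(compaction_kind(0.9))   # never
```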
+
+h2. Statistics
+
+| @enableStatistics@ | Enables the collection of statistics. Default is on. |
+
+h2. Auto-replication
+
+| @openLedgerRereplicationGracePeriod@ | This is the grace period which the 
rereplication worker waits before fencing and replicating a ledger fragment 
which is still being written to upon a bookie failure. The default is 30s. |
+
+h2. Read-only mode support
+
+| @readOnlyModeEnabled@ | Enables/disables the read-only Bookie feature. A 
bookie goes into read-only mode when it finds integrity issues with stored 
data. If @readOnlyModeEnabled@ is false, the bookie shuts down if it finds 
integrity issues. By default it is enabled. |
+
+h2. Disk utilization
+
+| @diskUsageThreshold@ | Fraction of the total utilized usable disk space to 
declare the disk full. The total available disk space is obtained with 
File.getUsableSpace(). Default is 0.95. |
+| @diskCheckInterval@ | Interval between consecutive checks of disk 
utilization. Default is 10s. |
+
+h2. ZooKeeper parameters
+
+| @zkServers@ | A list of one or more servers on which ZooKeeper is running. The server list is comma separated, e.g., zk1:2181,zk2:2181,zk3:2181 |
+| @zkTimeout@ | ZooKeeper client session timeout in milliseconds. The bookie server will exit if it receives SESSION_EXPIRED because it was partitioned off from ZooKeeper for longer than the session timeout. JVM garbage collection or disk I/O can cause SESSION_EXPIRED. Incrementing this value can help avoid this issue. The default value is 10,000. |
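+For example (hostnames illustrative):

```
zkServers=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
zkTimeout=10000
```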

Added: bookkeeper/site/trunk/content/docs/master/bookieRecovery.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/bookieRecovery.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/bookieRecovery.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/bookieRecovery.textile Tue Dec  9 
16:00:52 2014
@@ -0,0 +1,79 @@
+Title:     BookKeeper Bookie Recovery
+Notice:    Licensed to the Apache Software Foundation (ASF) under one
+           or more contributor license agreements.  See the NOTICE file
+           distributed with this work for additional information
+           regarding copyright ownership.  The ASF licenses this file
+           to you under the Apache License, Version 2.0 (the
+           "License"); you may not use this file except in compliance
+           with the License.  You may obtain a copy of the License at
+           .
+             http://www.apache.org/licenses/LICENSE-2.0
+           .
+           Unless required by applicable law or agreed to in writing,
+           software distributed under the License is distributed on an
+           "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+           KIND, either express or implied.  See the License for the
+           specific language governing permissions and limitations
+           under the License.
+h1. Bookie Ledger Recovery
+
+p. When a Bookie crashes, any ledgers with data on that Bookie become underreplicated. There are two options for bringing the ledgers back to full replication: Autorecovery and Manual Bookie Recovery.
+
+h2. Autorecovery
+
+p. Autorecovery runs as a daemon alongside the Bookie daemon on each Bookie. Autorecovery detects when a bookie in the cluster has become unavailable, and rereplicates all the ledgers which were on that bookie, so that those ledgers are brought back to full replication. See the "Admin Guide":./bookkeeperConfig.html for instructions on how to start autorecovery.
+
+h2. Manual Bookie Recovery
+
+p. If autorecovery is not enabled, it is possible for the administrator to manually rereplicate the data from the failed bookie.
+
+To run recovery, with zk1.example.com as the zookeeper ensemble, and 
192.168.1.10 as the failed bookie, do the following:
+
+@bookkeeper-server/bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools 
zk1.example.com:2181 192.168.1.10:3181@
+
+It is necessary to specify the host and port of the failed bookie, as this is how it identifies itself to ZooKeeper. It is possible to specify a third argument, which is the bookie to replicate to. If this is omitted, as in our example, a random bookie is chosen for each ledger segment. A ledger segment is a continuous sequence of entries in a bookie which share the same ensemble.
+
+h2. AutoRecovery Internals
+
+Auto-Recovery has two components:
+
+* *Auditor*, a singleton node which watches for bookie failure, and creates 
rereplication tasks for the ledgers on failed bookies.
+* *ReplicationWorker*, runs on each Bookie, takes rereplication tasks and 
executes them.
+
+Both components run as threads in the *AutoRecoveryMain* process. The *AutoRecoveryMain* process runs on each Bookie in the cluster. All recovery nodes will participate in leader election to decide which node becomes the auditor. Those which fail to become the auditor will watch the elected auditor, and will run the election again if they see that it has failed.
+
+h3. Auditor
+
+The auditor watches the list of bookies registered with ZooKeeper in the cluster. A Bookie registers with ZooKeeper during startup. If the bookie crashes or is killed, the bookie's registration disappears. The auditor is notified of changes in the registered bookies list.
+
+When the auditor sees that a bookie has disappeared from the list, it immediately scans the complete ledger list to find ledgers which have stored data on the failed bookie. Once it has a list of ledgers which need to be rereplicated, it will publish a rereplication task for each ledger under the /underreplicated/ znode in ZooKeeper.
+
+h3. ReplicationWorker
+
+Each replication worker watches for tasks being published in the 
/underreplicated/ znode. When a new task appears, it will try to get a lock on 
it. If it cannot acquire the lock, it tries the next entry. The locks are 
implemented using ZooKeeper ephemeral znodes.
+
+The replication worker will scan through the rereplication task's ledger for segments of which its local bookie is not a member. When it finds segments matching this criterion, it will replicate the entries of those segments to the local bookie. If, after this process, the ledger is fully replicated, the ledger's entry under /underreplicated/ is deleted, and the lock is released. If there is a problem replicating, or there are segments in the ledger which are still underreplicated (due to the local bookie already being part of the ensemble for the segment), then the lock is simply released.
+
+If the replication worker finds a segment which needs rereplication, but does 
not have a defined endpoint (i.e. the final segment of a ledger currently being 
written to), it will wait for a grace period before attempting rereplication. 
If the segment needing rereplication still does not have a defined endpoint, 
the ledger is fenced and rereplication then takes place. This avoids the case 
where a client is writing to a ledger, and one of the bookies goes down, but 
the client has not written an entry to that bookie before rereplication takes 
place. The client could continue writing to the old segment, even though the 
ensemble for the segment had changed. This could lead to data loss. Fencing 
prevents this scenario from happening. In the normal case, the client will try 
to write to the failed bookie within the grace period, and will have started a 
new segment before rereplication starts. See the "Admin 
Guide":./bookkeeperConfig.html for how to configure this grace period.
+
+h2. The Rereplication process
+
+The ledger rereplication process is as follows.
+
+# The client goes through all ledger segments in the ledger, selecting those 
which contain the failed bookie;
+# A recovery process is initiated for each ledger segment in this list;
+## The client selects a bookie to which all entries in the ledger segment will 
be replicated; In the case of autorecovery, this will always be the local 
bookie;
+## the client reads entries that belong to the ledger segment from other 
bookies in the ensemble and writes them to the selected bookie;
+## Once all entries have been replicated, the zookeeper metadata for the 
segment is updated to reflect the new ensemble;
+## The segment is marked as fully replicated in the recovery tool;
+# Once all ledger segments are marked as fully replicated, the ledger is 
marked as fully replicated.
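+Step 1 above can be modelled as a simple filter over the ledger's segments. The sketch below is illustrative Python; the data structures are hypothetical, not BookKeeper's actual representation:

```python
# Hypothetical model of selecting the ledger segments that need recovery:
# keep only the segments whose ensemble contains the failed bookie.
segments = [
    {"first_entry": 0,   "ensemble": ["bk1:3181", "bk2:3181", "bk3:3181"]},
    {"first_entry": 100, "ensemble": ["bk1:3181", "bk4:3181", "bk3:3181"]},
    {"first_entry": 200, "ensemble": ["bk4:3181", "bk5:3181", "bk3:3181"]},
]
failed_bookie = "bk2:3181"
to_recover = [s for s in segments if failed_bookie in s["ensemble"]]
print([s["first_entry"] for s in to_recover])  # [0]
```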
+
+h2. The Manual Bookie Recovery process
+
+The manual bookie recovery process is as follows.
+
+# The client reads the metadata of active ledgers from zookeeper;
+# From this, the ledgers which contain segments using the failed bookie in 
their ensemble are selected;
+# A recovery process is initiated for each ledger in this list;
+## The Ledger rereplication process is run for each ledger;
+# Once all ledgers are marked as fully replicated, bookie recovery is finished.

Added: bookkeeper/site/trunk/content/docs/master/bookkeeperConfig.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/bookkeeperConfig.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/bookkeeperConfig.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/bookkeeperConfig.textile Tue Dec  
9 16:00:52 2014
@@ -0,0 +1,167 @@
+Title:        BookKeeper Administrator's Guide
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+        .
+
+h1. Abstract
+
+This document contains information about deploying, administering and 
maintaining BookKeeper. It also discusses best practices and common problems. 
+
+h1. Running a BookKeeper instance
+
+h2. System requirements
+
+A typical BookKeeper installation comprises a set of bookies and a set of 
ZooKeeper replicas. The exact number of bookies depends on the quorum mode, 
desired throughput, and number of clients using this installation 
simultaneously. The minimum number of bookies is three for self-verifying 
(stores a message authentication code along with each entry) and four for 
generic (does not store a message authentication code with each entry), and 
there is no upper limit on the number of bookies. Increasing the number of 
bookies will, in fact, enable higher throughput.
+
+For performance, we require each server to have at least two disks. It is 
possible to run a bookie with a single disk, but performance will be 
significantly lower in this case.
+
+For ZooKeeper, there is no constraint with respect to the number of replicas. 
Having a single machine running ZooKeeper in standalone mode is sufficient for 
BookKeeper. For resilience purposes, it might be a good idea to run ZooKeeper 
in quorum mode with multiple servers. Please refer to the ZooKeeper 
documentation for detail on how to configure ZooKeeper with multiple replicas.
+
+h2. Starting and Stopping Bookies
+
+To *start* a bookie, execute the following command:
+
+* To run a bookie in the foreground:
+@bookkeeper-server/bin/bookkeeper bookie@
+
+* To run a bookie in the background:
+@bookkeeper-server/bin/bookkeeper-daemon.sh start bookie@
+
+The configuration parameters can be set in 
bookkeeper-server/conf/bk_server.conf.
+
+The important parameters are:
+
+* @bookiePort@, Port number that the bookie listens on; 
+* @zkServers@, Comma separated list of ZooKeeper servers with a hostname:port 
format; 
+* @journalDir@, Path for Log Device (stores bookie write-ahead log); 
+* @ledgerDir@, Path for Ledger Device (stores ledger entries); 
+
+Ideally, @journalDir@ and @ledgerDir@ are each in a different device. See 
"Bookie Configuration Parameters":./bookieConfigParams.html for a full list of 
configuration parameters.
+
+To *stop* a bookie running in the background, execute the following command:
+
+@bookkeeper-server/bin/bookkeeper-daemon.sh stop bookie [-force]@
+@-force@ is optional; it is used to stop the bookie forcefully if the bookie server has not stopped gracefully within _BOOKIE_STOP_TIMEOUT_ (an environment variable), which is 30 seconds by default.
+
+h3. Upgrading
+
+From time to time, we may make changes to the filesystem layout of the bookie, 
which are incompatible with previous versions of bookkeeper and require that 
directories used with previous versions are upgraded. If you upgrade your 
bookkeeper software, and an upgrade is required, then the bookie will fail to 
start and print an error such as:
+
+@2012-05-25 10:41:50,494 - ERROR - [main:Bookie@246] - Directory layout 
version is less than 3, upgrade needed@
+
+BookKeeper provides a utility for upgrading the filesystem.
+@bookkeeper-server/bin/bookkeeper upgrade@
+
+The upgrade application takes 3 possible switches: @--upgrade@, @--rollback@ or @--finalize@. A normal upgrade process looks like this:
+
+# @bookkeeper-server/bin/bookkeeper upgrade --upgrade@
+# @bookkeeper-server/bin/bookkeeper bookie@
+# Check everything is working. Kill bookie, ^C
+# If everything is ok, @bookkeeper-server/bin/bookkeeper upgrade --finalize@
+# Start bookie again @bookkeeper-server/bin/bookkeeper bookie@
+# If something is amiss, you can roll back the upgrade 
@bookkeeper-server/bin/bookkeeper upgrade --rollback@
+
+h3. Formatting
+
+To format the bookie metadata in Zookeeper, execute the following command once:
+
+@bookkeeper-server/bin/bookkeeper shell metaformat [-nonInteractive] [-force]@
+
+To format the bookie local filesystem data, execute the following command on 
each bookie node:
+
+@bookkeeper-server/bin/bookkeeper shell bookieformat [-nonInteractive] 
[-force]@
+
+The @-nonInteractive@ and @-force@ switches are optional.
+
+If @-nonInteractive@ is set, the user will not be asked to confirm the format operation if old data exists; in that case the format operation will abort, unless the @-force@ switch has been specified, in which case it will proceed.
+
+By default, the user will be prompted to confirm the format operation if old data exists.
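+The combined effect of the two switches can be summarised as follows (a plain-Python model of the documented behaviour, not the shell tool's actual code):

```python
# Model of the metaformat/bookieformat flag handling described above.
def may_format(old_data_exists, non_interactive=False, force=False,
               user_confirms=False):
    if not old_data_exists:
        return True               # nothing to overwrite
    if non_interactive:
        return force              # abort unless -force was also given
    return user_confirms          # default: prompt the user

print(may_format(True, non_interactive=True))              # False
print(may_format(True, non_interactive=True, force=True))  # True
```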
+
+h3. Logging
+
+BookKeeper uses "slf4j":http://www.slf4j.org for logging, with the log4j 
bindings enabled by default. To enable logging from a bookie, create a 
log4j.properties file and point the environment variable BOOKIE_LOG_CONF to the 
configuration file. The path to the log4j.properties file must be absolute.
+
+@export BOOKIE_LOG_CONF=/tmp/log4j.properties@
+@bookkeeper-server/bin/bookkeeper bookie@
+
+h3. Missing disks or directories
+
+Replacing disks or removing directories accidentally can cause a bookie to fail while trying to read a ledger fragment which the ledger metadata claims exists on the bookie. For this reason, when a bookie is started for the first time, its disk configuration is fixed for the lifetime of that bookie. Any change to the disk configuration of the bookie, such as a crashed disk or an accidental configuration change, will result in the bookie being unable to start, with the following error:
+
+@2012-05-29 18:19:13,790 - ERROR - [main:BookieServer@314] - Exception running 
bookie server : @
[email protected]$InvalidCookieException@
[email protected] org.apache.bookkeeper.bookie.Cookie.verify(Cookie.java:82)@
[email protected] 
org.apache.bookkeeper.bookie.Bookie.checkEnvironment(Bookie.java:275)@
[email protected] org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:351)@
+
+If the change was the result of an accidental configuration change, the change 
can be reverted and the bookie can be restarted. However, if the change cannot 
be reverted, such as is the case when you want to add a new disk or replace a 
disk, the bookie must be wiped and then all its data re-replicated onto it. To 
do this, do the following:
+
+# Increment the _bookiePort_ in _bk_server.conf_.
+# Ensure that all directories specified by _journalDirectory_ and 
_ledgerDirectories_ are empty.
+# Start the bookie.
+# Run @bin/bookkeeper org.apache.bookkeeper.tools.BookKeeperTools <zkserver> 
<oldbookie> <newbookie>@ to re-replicate data. <oldbookie> and <newbookie> are 
identified by their external IP and bookiePort. For example if this process is 
being run on a bookie with an external IP of 192.168.1.10, with an old 
_bookiePort_ of 3181 and a new _bookiePort_ of 3182, and with zookeeper running 
on _zk1.example.com_, the command to run would be <br/>@bin/bookkeeper 
org.apache.bookkeeper.tools.BookKeeperTools zk1.example.com 192.168.1.10:3181 
192.168.1.10:3182@. See "Bookie Recovery":./bookieRecovery.html for more 
details on the re-replication process.
+
+The mechanism that prevents the bookie from starting up after configuration changes exists to avoid the following silent failures:
+
+# A strict subset of the ledger devices (among multiple ledger devices) has 
been replaced, consequently making the content of the replaced devices 
unavailable;
+# A strict subset of the ledger directories has been accidentally deleted.
+
+h3. Full or failing disks
+
+A bookie can go into read-only mode if it detects problems with its disks. In 
read-only mode, the bookie will serve read requests, but will not allow any 
writes. Any ledger currently writing to the bookie will replace the bookie in 
its ensemble. No new ledgers will select the read-only bookie for writing.
+
+The bookie goes into read-only mode in the following conditions.
+
+# All disks are full.
+# An error occurred flushing to the ledger disks.
+# An error occurred writing to the journal disk.
+
+Important parameters are:
+
+* @readOnlyModeEnabled@, whether read-only mode is enabled. If read-only mode 
is not enabled, the bookie will shutdown on encountering any of the above 
conditions. By default, read-only mode is disabled.
+* @diskUsageThreshold@, percentage threshold at which a disk will be 
considered full. This value must be between 0 and 1.0. By default, the value is 
0.95.
+* @diskCheckInterval@, interval at which the disks are checked to see if they 
are full. Specified in milliseconds. By default the check occurs every 10000 
milliseconds (10 seconds).
+
+h2. Running Autorecovery nodes
+
+To run autorecovery nodes, execute the following command on every Bookie node:
+@bookkeeper-server/bin/bookkeeper autorecovery@
+
+Configuration parameters for autorecovery can be set in 
*bookkeeper-server/conf/bk_server.conf*.
+
+Important parameters are:
+
+* @auditorPeriodicCheckInterval@, interval at which the auditor will do a 
check of all ledgers in the cluster. By default this runs once a week. The 
interval is set in seconds. To disable the periodic check completely, set this 
to 0. Note that periodic checking will put extra load on the cluster, so it 
should not be run more frequently than once a day.
+
+* @rereplicationEntryBatchSize@ specifies the number of entries which a replication worker will rereplicate in parallel. The default value is 10. A larger value for this parameter will increase the speed at which autorecovery occurs, but will increase the memory requirement of the autorecovery process and create more load on the cluster.
+
+* @openLedgerRereplicationGracePeriod@ is the amount of time, in milliseconds, which a recovery worker will wait before recovering a ledger segment which has no defined end, i.e. the client is still writing to that segment. If the client is still active, it should detect the bookie failure and start writing to a new ledger segment, with a new ensemble which doesn't include the failed bookie. Creating a new ledger segment defines the end of the previous segment. If, after the grace period, the ledger segment's end has not been defined, we assume the writing client has crashed. The ledger is fenced and the client is blocked from writing any more entries to the ledger. The default value is 30000ms.
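+For example (values illustrative; the intervals correspond to one week and 30 seconds):

```
auditorPeriodicCheckInterval=604800
rereplicationEntryBatchSize=10
openLedgerRereplicationGracePeriod=30000
```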
+
+
+h3. Disabling Autorecovery during maintenance
+
+It is useful to disable autorecovery during maintenance, for example, to avoid 
a Bookie's data being unnecessarily rereplicated when it is only being taken 
down for a short period to update the software, or change the configuration.
+
+To disable autorecovery, run:
+@bookkeeper-server/bin/bookkeeper shell autorecovery -disable@
+
+To reenable, run:
+@bookkeeper-server/bin/bookkeeper shell autorecovery -enable@
+
+Autorecovery enable/disable only needs to be run once for the whole cluster, 
and not individually on each Bookie in the cluster.
+
+h2. Setting up a test ensemble
+
+Sometimes it is useful to run an ensemble of bookies on your local machine for testing. We provide a utility for doing this. It will set up N bookies and a zookeeper instance locally. The data on these bookies and on the zookeeper instance is not persisted over restarts, so obviously this should never be used in a production environment. To run a test ensemble of 10 bookies, do the following:
+
+@bookkeeper-server/bin/bookkeeper localbookie 10@
+

Added: bookkeeper/site/trunk/content/docs/master/bookkeeperConfigParams.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/bookkeeperConfigParams.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/bookkeeperConfigParams.textile 
(added)
+++ bookkeeper/site/trunk/content/docs/master/bookkeeperConfigParams.textile 
Tue Dec  9 16:00:52 2014
@@ -0,0 +1,39 @@
+Title:        BookKeeper Configuration Parameters
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+
+h1. BookKeeper Configuration Parameters
+
+This page contains detailed information about configuration parameters used 
for configuring a BookKeeper client.
+
+h3. General parameters
+
+| @zkServers@ | A list of one or more servers on which ZooKeeper is running. The server list is comma separated, e.g., zk1:2181,zk2:2181,zk3:2181. |
+| @zkTimeout@ | ZooKeeper client session timeout in milliseconds. The default 
value is 10,000. |
+| @throttle@ | A throttle value used to prevent running out of memory when issuing more requests than the bookie servers can handle. The default is 5,000. |
+| @readTimeout@ | The number of seconds the BookKeeper client waits without hearing a response from a bookie before considering it failed. The default is 5 seconds. |
+| @numWorkerThreads@ | The number of worker threads used by the BookKeeper client to submit operations. The default value is the number of available processors. |
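+As an illustration, a client configuration using the parameters above might look like the following fragment (illustrative values only; adjust the host names and numbers for your deployment):

```
zkServers=zk1:2181,zk2:2181,zk3:2181
zkTimeout=10000
throttle=5000
readTimeout=5
numWorkerThreads=8
```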
+
+h3. NIO server settings
+
+| @clientTcpNoDelay@ | This setting enables/disables Nagle's algorithm, which is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network. If you are sending many small messages, such that more than one can fit in a single IP packet, setting @clientTcpNoDelay@ to false to enable Nagle's algorithm can provide better performance. The default value is true. |
+
+h3. Ledger manager settings
+
+| @ledgerManagerType@ | This parameter determines the type of ledger manager 
used to manage how ledgers are stored, manipulated, and garbage collected. See 
"BookKeeper Internals":./bookkeeperInternals.html for detailed info. Default 
value is flat. |
+| @zkLedgersRootPath@ | Root zookeeper path to store ledger metadata. Default 
is /ledgers. |
+
+h3. Bookie recovery settings
+
+The bookie recovery tool currently needs a digest type and password to open ledgers for recovery. BookKeeper assumes that all ledgers were created with the same DigestType and password. In the future, the tool will need to know, for each ledger, the DigestType and password used to create it before opening it.
+
+| @digestType@ | Digest type used to open ledgers from the bookie recovery tool. |
+| @passwd@ | Password used to open ledgers from bookie recovery tool. |

Added: bookkeeper/site/trunk/content/docs/master/bookkeeperInternals.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/bookkeeperInternals.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/bookkeeperInternals.textile 
(added)
+++ bookkeeper/site/trunk/content/docs/master/bookkeeperInternals.textile Tue 
Dec  9 16:00:52 2014
@@ -0,0 +1,84 @@
+Title:        BookKeeper Internals
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+
+h2. Bookie Internals
+
+p. A bookie server stores its data in multiple ledger directories and its journal files in a journal directory. Ideally, storing journal files in a separate directory from data files increases throughput and decreases latency.
+
+h3. The Bookie Journal
+
+p. The journal directory contains one kind of file:
+
+* @{timestamp}.txn@ - holds transactions executed in the bookie server.
+
+p. Before persisting the ledger index and data to disk, a bookie ensures that the transaction that represents the update is written to a journal in non-volatile storage. A new journal file, named with the current timestamp, is created when a bookie starts or when the old journal file reaches its maximum size.
+
+p. A bookie supports journal rolling to remove old journal files. In order to remove old journal files safely, the bookie server records a LastLogMark on the Ledger Device, which indicates that all updates (including index and data) before LastLogMark have been persisted to the Ledger Device.
+
+p. LastLogMark contains two parts:
+
+* @LastLogId@ - the id of the journal file containing the last persisted transaction.
+* @LastLogPos@ - the position of the last persisted transaction within the LastLogId journal file.
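+The role LastLogMark plays in journal rolling can be sketched as follows (illustrative Python, not the actual bookie code): any journal file whose id is less than LastLogMark's LastLogId contains only updates that have already been persisted to the Ledger Device, so it is safe to delete.

```python
def removable_journals(journal_ids, last_log_id):
    """Journal files strictly older than the journal pointed to by
    LastLogMark contain only already-persisted updates, so they can
    be removed safely."""
    return sorted(jid for jid in journal_ids if jid < last_log_id)

# With journals 100, 101 and 102 on disk and LastLogId = 102,
# journals 100 and 101 are safe to remove.
print(removable_journals([100, 101, 102], 102))
```

+(Subject to @journalMaxBackups@, which may retain some of these old files for manual recovery.)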
+
+p. You may use the following settings to further fine-tune the behavior of journalling on bookies:
+
+| @journalMaxSizeMB@ | The journal file size limit. When a journal file reaches this limit, it is closed and a new journal file is created. |
+| @journalMaxBackups@ | The number of old journal files (those whose id is less than LastLogMark's journal id) to retain. |
+
+bq. NOTE: Keeping a number of old journal files can be useful for manual recovery in special cases.
+
+h1. ZooKeeper Metadata
+
+p. BookKeeper requires a ZooKeeper installation to store metadata; the list of ZooKeeper servers is passed as a parameter to the constructor of the BookKeeper class (@org.apache.bookkeeper.client.BookKeeper@). To set up ZooKeeper, please check the "ZooKeeper documentation":http://zookeeper.apache.org/doc/trunk/index.html. 
+
+p. BookKeeper provides two mechanisms to organize its metadata in ZooKeeper. By default, the @FlatLedgerManager@ is used, and 99% of users should never need to look at anything else. However, in cases where there are many concurrently active ledgers (> 50,000), the @HierarchicalLedgerManager@ should be used. With so many ledgers, a hierarchical approach is needed due to a limit ZooKeeper places on packet sizes ("JIRA Issue":https://issues.apache.org/jira/browse/BOOKKEEPER-39).
+
+| @FlatLedgerManager@ | All ledger metadata are placed as children in a single 
zookeeper path. |
+| @HierarchicalLedgerManager@ | All ledger metadata are partitioned into 
2-level znodes. |
+
+h2. Flat Ledger Manager
+
+p. All ledgers' metadata is put in a single ZooKeeper path, created using ZooKeeper sequential nodes, which ensures the uniqueness of ledger ids. Each ledger node is prefixed with 'L'.
+
+p. A bookie server manages the active ledgers it owns in a hash map, so it is easy for the bookie server to find which ledgers have been deleted from ZooKeeper and garbage collect them. Its garbage collection flow is described below:
+
+* Fetch all existing ledgers from zookeeper (@zkActiveLedgers@).
+* Fetch all ledgers currently active within the Bookie (@bkActiveLedgers@).
+* Loop over @bkActiveLedgers@ to find those ledgers which do not exist in 
@zkActiveLedgers@ and garbage collect them.
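+The flow above amounts to a set difference between the ledgers a bookie stores and the ledgers still present in ZooKeeper. A minimal sketch (illustrative Python, not the actual bookie code):

```python
def gc_candidates(bk_active_ledgers, zk_active_ledgers):
    """Ledgers the bookie still stores but which no longer exist in
    ZooKeeper have been deleted, so the bookie can garbage collect
    their data locally."""
    return sorted(set(bk_active_ledgers) - set(zk_active_ledgers))

# The bookie stores ledgers 1-4, but only 2 and 4 still exist in
# ZooKeeper: ledgers 1 and 3 can be garbage collected.
print(gc_candidates({1, 2, 3, 4}, {2, 4}))
```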
+
+h2. Hierarchical Ledger Manager
+
+p. @HierarchicalLedgerManager@ first obtains a globally unique id from ZooKeeper using an EPHEMERAL_SEQUENTIAL znode.
+
+p. Since the ZooKeeper sequential counter has a format of %010d -- that is, 10 digits with 0 (zero) padding, e.g. "&lt;path&gt;0000000001" -- @HierarchicalLedgerManager@ splits the generated id into 3 parts:
+
+@{level1 (2 digits)}{level2 (4 digits)}{level3 (4 digits)}@
+
+p. These 3 parts are used to form the actual ledger node path used to store 
ledger metadata:
+
+@{ledgers_root_path}/{level1}/{level2}/L{level3}@
+
+p. E.g. ledger id 0000000001 is split into the 3 parts 00, 0000 and 0001, so its metadata is stored in the znode /{ledgers_root_path}/00/0000/L0001. Each znode can thus have at most 10000 ledgers as children, which avoids the problem of the child list being larger than the maximum ZooKeeper packet size.
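+The id-to-path mapping can be sketched in a few lines (illustrative Python; the real implementation lives in @HierarchicalLedgerManager@):

```python
def ledger_path(ledgers_root_path, ledger_id):
    """Zero-pad the ledger id to 10 digits and split it into
    2-digit, 4-digit and 4-digit parts to form the znode path."""
    padded = "%010d" % ledger_id
    level1, level2, level3 = padded[0:2], padded[2:6], padded[6:10]
    return "%s/%s/%s/L%s" % (ledgers_root_path, level1, level2, level3)

print(ledger_path("/ledgers", 1))  # /ledgers/00/0000/L0001
```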
+
+p. Bookie server manages its active ledgers in a sorted map, which simplifies 
access to active ledgers in a particular (level1, level2) partition.
+
+p. Garbage collection in bookie server is processed node by node as follows:
+
+* Fetch all level1 nodes by calling zk#getChildren(ledgerRootPath).
+** For each level1 node, fetch its level2 nodes.
+** For each partition (level1, level2):
+*** Fetch all existing ledgers from ZooKeeper belonging to partition (level1, level2) (@zkActiveLedgers@).
+*** Fetch all ledgers currently active in the bookie which belong to partition (level1, level2) (@bkActiveLedgers@).
+*** Loop over @bkActiveLedgers@ to find those ledgers which do not exist in @zkActiveLedgers@, and garbage collect them.
+
+bq. NOTE: The Hierarchical Ledger Manager is more suitable for managing the large numbers of ledgers that can exist in BookKeeper.
+

Added: bookkeeper/site/trunk/content/docs/master/bookkeeperJMX.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/bookkeeperJMX.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/bookkeeperJMX.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/bookkeeperJMX.textile Tue Dec  9 
16:00:52 2014
@@ -0,0 +1,32 @@
+Title:        BookKeeper JMX
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+        .
+
+h1. JMX
+
+Apache BookKeeper has extensive support for JMX, which allows viewing and 
managing a BookKeeper cluster.
+
+This document assumes that you have basic knowledge of JMX. See "Sun JMX 
Technology":http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/
 page to get started with JMX.
+
+See the "JMX Management 
Guide":http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html 
for details on setting up local and remote management of VM instances. By 
default the included __bookkeeper__ script supports only local management - 
review the linked document to enable support for remote management (beyond the 
scope of this document).
+
+__Bookie Server__ is a JMX manageable server, which registers the proper 
MBeans during initialization to support JMX monitoring and management of the 
instance.
+
+h1. Bookie Server MBean Reference
+
+This table details JMX for a bookie server.
+
+| _.MBean | _.MBean Object Name | _.Description |
+| BookieServer | BookieServer_<port> | Represents a bookie server. Note that the object name includes the port that the bookie server listens on. It is the root MBean for the bookie server, and includes statistics for the server, e.g., the number of packets sent/received and statistics for add/read operations. |
+| Bookie | Bookie | Provides bookie statistics. Currently it just returns the length of the journal queue waiting to be committed. |
+| LedgerCache | LedgerCache | Provides ledger cache statistics, e.g., the number of pages cached in the page cache and the number of files opened for ledger index files. |

Added: bookkeeper/site/trunk/content/docs/master/bookkeeperMetadata.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/bookkeeperMetadata.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/bookkeeperMetadata.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/bookkeeperMetadata.textile Tue 
Dec  9 16:00:52 2014
@@ -0,0 +1,40 @@
+Title:        BookKeeper Metadata Management
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .
+        .
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+        .
+
+h1. Metadata Management
+
+There are two kinds of metadata that need to be managed in BookKeeper: one is the __list of available bookies__, which is used to track server availability (ZooKeeper is naturally designed for this); the other is __ledger metadata__, which can be handled efficiently by different kinds of key/value storage with __CAS (Compare And Set)__ semantics.
+
+__Ledger metadata__ is handled by a __LedgerManager__, which can be plugged with various storage media.
+
+h2. Ledger Metadata Management
+
+The operations on the metadata of a ledger are quite straightforward. They are:
+
+* @createLedger@: create a new entry to store the given ledger metadata. A unique id should be generated as the ledger id for the new ledger.
+* @removeLedgerMetadata@: remove the entry of a ledger from the metadata store. A __Version__ object is provided to do a conditional remove. If the given __Version__ object doesn't match the current __Version__ in the metadata store, a __MetadataVersionException__ should be thrown to indicate the version conflict. A __NoSuchLedgerExistsException__ should be returned if the ledger metadata entry doesn't exist.
+* @readLedgerMetadata@: read the metadata of a ledger from the metadata store. The new __version__ should be set on the returned __LedgerMetadata__ object. A __NoSuchLedgerExistsException__ should be returned if the entry of the ledger metadata doesn't exist.
+* @writeLedgerMetadata@: update the metadata of a ledger matching the given __Version__. The update should be rejected and a __MetadataVersionException__ returned when the given __Version__ doesn't match the current __Version__ in the metadata store. A __NoSuchLedgerExistsException__ should be returned if the entry of the ledger metadata doesn't exist. The version of the __LedgerMetadata__ object should be set to the new __Version__ generated by applying this update.
+* @asyncProcessLedgers@: loop through all existing ledgers in the metadata store and apply a __Processor__. The __Processor__ provided is executed for each ledger. If a failure happens during iteration, the iteration should be terminated and the __final callback__ triggered with failure. Otherwise, the __final callback__ is triggered after all ledgers are processed. No ordering or transactional guarantees need to be provided by implementations of this interface.
+* @getLedgerRanges@: return a list of ranges for the ledgers in the metadata store. The ledger metadata itself does not need to be fetched; only the ledger ids are needed. No ordering is required, but there must be no overlap between ledger ranges, and each ledger range must contain all the ledgers in the metadata store between its defined endpoints (i.e., for a ledger range [x, y], all ledger ids greater than or equal to x and less than or equal to y should exist only in this range). __getLedgerRanges__ is used in the __ScanAndCompare__ gc algorithm.
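+The version-checked (CAS) behaviour of these operations can be modelled with a toy in-memory store (illustrative Python; the class and method names are invented for this sketch, not the real LedgerManager API):

```python
class MetadataVersionException(Exception):
    """Raised when the supplied version doesn't match the store."""

class NoSuchLedgerExistsException(Exception):
    """Raised when the ledger metadata entry doesn't exist."""

class ToyLedgerManager:
    def __init__(self):
        self._store = {}    # ledger_id -> (version, metadata)
        self._next_id = 0

    def create_ledger(self, metadata):
        ledger_id = self._next_id          # unique id generation
        self._next_id += 1
        self._store[ledger_id] = (0, metadata)
        return ledger_id

    def read_ledger_metadata(self, ledger_id):
        if ledger_id not in self._store:
            raise NoSuchLedgerExistsException(ledger_id)
        return self._store[ledger_id]      # (version, metadata)

    def write_ledger_metadata(self, ledger_id, metadata, version):
        if ledger_id not in self._store:
            raise NoSuchLedgerExistsException(ledger_id)
        current, _ = self._store[ledger_id]
        if current != version:             # conditional (CAS) update
            raise MetadataVersionException(ledger_id)
        self._store[ledger_id] = (current + 1, metadata)
        return current + 1                 # the new version
```

+A writer reads the metadata, obtains its version, and supplies that version with the update; a concurrent update by another client bumps the version and causes the stale write to fail with a version exception.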
+
+h1. How to choose a metadata storage medium for BookKeeper
+
+From the interface, several requirements need to be met before choosing a metadata storage medium for BookKeeper:
+
+* @Check and Set (CAS)@: The ability to do a strict update according to a specific condition, e.g., a specific version (ZooKeeper) or matching content (HBase).
+* @Optimized for Writes@: The metadata access pattern for BookKeeper is an initial read followed by continuous updates.
+* @Optimized for Scans@: Scans are required for the __ScanAndCompare__ gc algorithm.
+
+__ZooKeeper__ is the default implementation for BookKeeper metadata management. __ZooKeeper__ holds data in memory, provides a filesystem-like namespace, and meets all the above requirements; it covers most BookKeeper use cases. However, if your application needs to manage millions of ledgers, a more scalable solution is __HBase__, which also meets the above requirements but is more complicated to set up.

Added: bookkeeper/site/trunk/content/docs/master/bookkeeperOverview.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/bookkeeperOverview.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/bookkeeperOverview.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/bookkeeperOverview.textile Tue 
Dec  9 16:00:52 2014
@@ -0,0 +1,185 @@
+Title:        BookKeeper overview
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+
+h1. Abstract
+
+This guide contains detailed information about using BookKeeper for logging. 
It discusses the basic operations BookKeeper supports, and how to create logs 
and perform basic read and write operations on these logs.
+
+h1. BookKeeper introduction
+
+p. BookKeeper is a replicated service to reliably log streams of records. In BookKeeper, servers are "bookies", log streams are "ledgers", and each unit of a log (aka record) is a "ledger entry". BookKeeper is designed to be reliable; bookies, the servers that store ledgers, can crash, corrupt data, or discard data, but as long as there are enough bookies behaving correctly the service as a whole behaves correctly. 
+
+p. The initial motivation for BookKeeper comes from the namenode of HDFS. 
Namenodes have to log operations in a reliable fashion so that recovery is 
possible in the case of crashes. We have found the applications for BookKeeper 
extend far beyond HDFS, however. Essentially, any application that requires an 
append storage can replace their implementations with BookKeeper. BookKeeper 
has the advantage of writing efficiently, replicating for fault tolerance, and 
scaling throughput with the number of servers through striping. 
+
+p. At a high level, a BookKeeper client receives entries from a client application and stores them on sets of bookies. There are a few advantages in having such a service: 
+
+* We can use hardware that is optimized for such a service. We currently 
believe that such a system has to be optimized only for disk I/O; 
+* We can have a pool of servers implementing such a log system, shared among a number of servers; 
+* We can have a higher degree of replication with such a pool, which makes 
sense if the hardware necessary for it is cheaper compared to the one the 
application uses. 
+
+
+h1. In slightly more detail...
+
+p. BookKeeper implements highly available logs, and it has been designed with 
write-ahead logging in mind. Besides high availability due to the replicated 
nature of the service, it provides high throughput due to striping. As we write 
entries in a subset of bookies of an ensemble and rotate writes across 
available quorums, we are able to increase throughput with the number of 
servers for both reads and writes. Scalability is a property that is possible 
to achieve in this case due to the use of quorums. Other replication 
techniques, such as state-machine replication, do not enable such a property. 
+
+p. An application first creates a ledger before writing to bookies through a 
local BookKeeper client instance. Upon creating a ledger, a BookKeeper client 
writes metadata about the ledger to ZooKeeper. Each ledger currently has a 
single writer. This writer has to execute a close ledger operation before any 
other client can read from it. If the writer of a ledger does not close a 
ledger properly because, for example, it has crashed before having the 
opportunity of closing the ledger, then the next client that tries to open a 
ledger executes a procedure to recover it. As closing a ledger consists 
essentially of writing the last entry written to a ledger to ZooKeeper, the 
recovery procedure simply finds the last entry written correctly and writes it 
to ZooKeeper. 
+
+p. Note that currently this recovery procedure is executed automatically upon 
trying to open a ledger and no explicit action is necessary. Although two 
clients may try to recover a ledger concurrently, only one will succeed, the 
first one that is able to create the close znode for the ledger. 
+
+h1. Bookkeeper elements and concepts
+
+p. BookKeeper uses four basic elements: 
+
+*  _Ledger_ : A ledger is a sequence of entries, and each entry is a sequence of bytes. Entries are written sequentially to a ledger, and at most once. Consequently, ledgers have append-only semantics; 
+*  _BookKeeper client_ : A client runs along with a BookKeeper application, 
and it enables applications to execute operations on ledgers, such as creating 
a ledger and writing to it; 
+*  _Bookie_ : A bookie is a BookKeeper storage server. Bookies store the content of ledgers. For any given ledger L, we call the group of bookies storing the content of L an _ensemble_. For performance, we store on each bookie of an ensemble only a fragment of a ledger. That is, we stripe when writing entries to a ledger, such that each entry is written to a sub-group of bookies of the ensemble. 
+*  _Metadata storage service_ : BookKeeper requires a metadata storage service 
to store information related to ledgers and available bookies. We currently use 
ZooKeeper for such a task. 
+
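+Striping can be pictured with a small sketch (illustrative Python; a simple round-robin placement invented for this example, not necessarily the exact policy BookKeeper uses): each entry is written to a sub-group of the ensemble, with the starting bookie rotating per entry.

```python
def write_bookies(ensemble, subgroup_size, entry_id):
    """Pick the sub-group of bookies an entry is striped to by
    rotating round-robin through the ensemble."""
    n = len(ensemble)
    return [ensemble[(entry_id + i) % n] for i in range(subgroup_size)]

ensemble = ["bookie1", "bookie2", "bookie3", "bookie4", "bookie5"]
print(write_bookies(ensemble, 3, 0))  # ['bookie1', 'bookie2', 'bookie3']
print(write_bookies(ensemble, 3, 4))  # ['bookie5', 'bookie1', 'bookie2']
```

+Because consecutive entries land on different sub-groups, read and write load spreads across the whole ensemble, which is what lets throughput scale with the number of servers.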
+
+h1. Bookkeeper initial design
+
+p. A set of bookies implements BookKeeper, and we use a quorum-based protocol to replicate data across the bookies. There are basically two operations on an existing ledger: read and append. Here is the complete API list (more detail "here":bookkeeperProgrammer.html): 
+
+* Create ledger: creates a new empty ledger; 
+* Open ledger: opens an existing ledger for reading; 
+* Add entry: adds a record to a ledger either synchronously or asynchronously; 
+* Read entries: reads a sequence of entries from a ledger either synchronously 
or asynchronously 
+
+
+p. There is only a single client that can write to a ledger. Once that ledger 
is closed or the client fails, no more entries can be added. (We take advantage 
of this behavior to provide our strong guarantees.) There will not be gaps in 
the ledger. Fingers get broken, people get roughed up or end up in prison when 
books are manipulated, so there is no deleting or changing of entries. 
+
+!images/bk-overview.jpg!
+p. A simple use of BookKeeper is to implement a write-ahead transaction log. A 
server maintains an in-memory data structure (with periodic snapshots, for example) and logs changes to that structure before it applies the change. The application server creates a ledger at startup and stores the ledger id and password in a well known place (ZooKeeper, maybe). When it needs to make a change, the server adds an entry with the change information to the ledger and applies the change once BookKeeper has added the entry successfully. The server can even use asyncAddEntry to queue up many changes for high change throughput. BookKeeper meticulously logs the changes in order and calls the completion functions in order. 
+
+p. When the application server dies, a backup server will come online, get the 
last snapshot and then it will open the ledger of the old server and read all 
the entries from the time the snapshot was taken. (Since it doesn't know the 
last entry number it will use MAX_INTEGER). Once all the entries have been 
processed, it will close the ledger and start a new one for its use. 
+
+p. A client library takes care of communicating with bookies and managing 
entry numbers. An entry has the following fields: 
+
+|Field|Type|Description|
+|Ledger number|long|The id of the ledger of this entry|
+|Entry number|long|The id of this entry|
+|last confirmed ( _LC_ )|long|id of the last recorded entry|
+|data|byte[]|the entry data (supplied by application)|
+|authentication code|byte[]|Message authentication code that includes all 
other fields of the entry|
+
+
+p. The client library generates a ledger entry. None of the fields are 
modified by the bookies and only the first three fields are interpreted by the 
bookies. 
+
+p. To add to a ledger, the client generates the entry above using the ledger 
number. The entry number will be one more than the last entry generated. The 
_LC_ field contains the last entry that has been successfully recorded by 
BookKeeper. If the client writes entries one at a time, _LC_ is the last entry 
id. But, if the client is using asyncAddEntry, there may be many entries in 
flight. An entry is considered recorded when both of the following conditions 
are met: 
+
+* the entry has been accepted by a quorum of bookies 
+* all entries with a lower entry id have been accepted by a quorum of bookies 
+
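+The recorded condition can be sketched as follows (illustrative Python; the bookkeeping of acknowledgements is simplified for this example): given the number of bookie acknowledgements per entry, the last recorded entry is the highest id such that it and every lower id have reached a quorum.

```python
def last_recorded_entry(acks, quorum_size):
    """acks maps entry id -> number of bookie acknowledgements.
    Returns the highest entry id for which this entry and all lower
    entries have been accepted by a quorum, or -1 if none."""
    last = -1
    entry_id = 0
    while acks.get(entry_id, 0) >= quorum_size:
        last = entry_id
        entry_id += 1
    return last

# Entries 0 and 1 reached a quorum of 2; entry 2 is still in flight,
# so entry 3 does not count even though it reached a quorum.
print(last_recorded_entry({0: 3, 1: 2, 2: 1, 3: 3}, quorum_size=2))
```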
+
+ _LC_ seems mysterious right now, but it is too early to explain how we use 
it; just smile and move on. 
+
+p. Once all the other fields have been filled in, the client generates an authentication code over all of the previous fields. The entry is then sent to a quorum of bookies to be recorded. Any failures will result in the entry being sent to a new quorum of bookies. 
+
+p. To read, the client library initially contacts a bookie and starts 
requesting entries. If an entry is missing or invalid (a bad MAC for example), 
the client will make a request to a different bookie. By using quorum writes, 
as long as enough bookies are up we are guaranteed to eventually be able to 
read an entry. 
+
+h1. Bookkeeper metadata management
+
+p. There is some metadata that needs to be made available to BookKeeper clients: 
+
+* The available bookies; 
+* The list of ledgers; 
+* The list of bookies that have been used for a given ledger; 
+* The last entry of a ledger; 
+
+
+p. We maintain this information in ZooKeeper. Bookies use ephemeral nodes to 
indicate their availability. Clients use znodes to track ledger creation and 
deletion and also to know the end of the ledger and the bookies that were used 
to store the ledger. Bookies also watch the ledger list so that they can 
cleanup ledgers that get deleted. 
+
+h1. Closing out ledgers
+
+p. The process of closing out the ledger and finding the last entry is 
difficult due to the durability guarantees of BookKeeper: 
+
+* If an entry has been successfully recorded, it must be readable. 
+* If an entry is read once, it must always be available to be read. 
+
+
+p. If the ledger was closed gracefully, ZooKeeper will have the last entry and 
everything will work well. But, if the BookKeeper client that was writing the 
ledger dies, there is some recovery that needs to take place. 
+
+p. The problematic entries are the ones at the end of the ledger. There can be 
entries in flight when a BookKeeper client dies. If the entry only gets to one 
bookie, the entry should not be readable since the entry will disappear if that 
bookie fails. If the entry is only on one bookie, that doesn't mean that the 
entry has not been recorded successfully; the other bookies that recorded the 
entry might have failed. 
+
+p. The trick to making everything work is to have a correct idea of a last 
entry. We do it in roughly three steps: 
+
+# Find the entry with the highest last recorded entry, _LC_ ; 
+# Find the highest consecutively recorded entry, _LR_ ; 
+# Make sure that all entries between _LC_ and _LR_ are on a quorum of bookies; 
+
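+A toy model of these three steps (illustrative Python; real recovery reads the surviving bookies, here modelled as a map from entry id to the number of bookies holding that entry):

```python
def recover_ledger_end(copies, quorum_size):
    """copies maps entry id -> number of bookies storing the entry.
    Returns the recovered last entry (the highest consecutively
    recorded entry) and the entries that must be re-written to reach
    a full quorum before the ledger is closed."""
    last = -1
    entry_id = 0
    while copies.get(entry_id, 0) > 0:     # highest consecutive entry
        last = entry_id
        entry_id += 1
    under_replicated = [e for e in range(last + 1)
                        if copies.get(e, 0) < quorum_size]
    return last, under_replicated

# Entry 2 survived on only one bookie: close at entry 2, after
# re-writing it to a full quorum.
print(recover_ledger_end({0: 2, 1: 2, 2: 1}, quorum_size=2))
```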
+h1. Data Management in Bookies
+
+p. This section gives an overview of how a bookie manages its ledger 
fragments. 
+
+h2. Basic
+
+p. Bookies manage data in a log-structured way, which is implemented using three kinds of files:
+
+* _Journal_ : A journal file contains the BookKeeper transaction logs. Before 
any update takes place, a bookie ensures that a transaction describing the 
update is written to non-volatile storage. A new journal file is created once 
the bookie starts or the older journal file reaches the journal file size 
threshold.
+* _Entry Log_ : An entry log file manages the written entries received from 
BookKeeper clients. Entries from different ledgers are aggregated and written 
sequentially, while their offsets are kept as pointers in _LedgerCache_ for 
fast lookup. A new entry log file is created once the bookie starts or the 
older entry log file reaches the entry log size threshold. Old entry log files 
are removed by the _Garbage Collector Thread_ once they are not associated with 
any active ledger.
+* _Index File_ : An index file is created for each ledger, which comprises a 
header and several fixed-length index pages, recording the offsets of data 
stored in entry log files. 
+
+p. Since updating index files would introduce random disk I/O, for performance 
consideration, index files are updated lazily by a _Sync Thread_ running in the 
background. Before index pages are persisted to disk, they are gathered in 
_LedgerCache_ for lookup.
+
+* _LedgerCache_ : A memory pool that caches ledger index pages, which allows disk head scheduling to be managed more efficiently.
+
+h2. Add Entry
+
+p. When a bookie receives entries from clients to be written, these entries 
will go through the following steps to be persisted to disk:
+
+# Append the entry in _Entry Log_, return its position { logId , offset } ;
+# Update the index of this entry in _Ledger Cache_ ;
+# Append a transaction corresponding to this entry update in _Journal_ ;
+# Respond to BookKeeper client ;
+
+* For performance reasons, _Entry Log_ buffers entries in memory and commits them in batches, while _Ledger Cache_ holds index pages in memory and flushes them lazily. We discuss data flush and how to ensure data integrity in the following section, 'Data Flush'.
+
+h2. Data Flush
+
+p. Ledger index pages are flushed to index files in the following two cases:
+
+# _LedgerCache_ memory reaches its limit. There is no more space available to 
hold newer index pages. Dirty index pages will be evicted from _LedgerCache_ 
and persisted to index files.
+# A background thread _Sync Thread_ is responsible for flushing index pages 
from _LedgerCache_ to index files periodically.
+
+p. Besides flushing index pages, _Sync Thread_ is responsible for rolling journal files when they use too much disk space.
+
+p. The data flush flow in _Sync Thread_ is as follows:
+
+# Records a _LastLogMark_ in memory. The _LastLogMark_ has two parts: _txnLogId_ (the file id of a journal) and _txnLogPos_ (an offset within that journal). The _LastLogMark_ indicates that all entries before it have been persisted to both index and entry log files.
+# Flushes dirty index pages from _LedgerCache_ to index files, and flushes entry log files to ensure all buffered entries are persisted to disk.
+#* Ideally, a bookie would only need to flush the index pages and entry log files that contain entries before _LastLogMark_. However, neither _LedgerCache_ nor _Entry Log_ keeps a mapping to journal positions, so the thread flushes them entirely and may flush entries written after _LastLogMark_. Flushing more than necessary is not a problem, just redundant.
+# Persists _LastLogMark_ to disk, recording that all entries added before _LastLogMark_ have had their entry data and index pages persisted to disk. At this point it is safe to remove journal files created earlier than _txnLogId_.
+#* If the bookie has crashed before persisting _LastLogMark_ to disk, it still 
has journal files containing entries for which index pages may not have been 
persisted. Consequently, when this bookie restarts, it inspects journal files 
to restore those entries; data isn't lost.
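The checkpoint logic above can be modelled in a few lines. This is a simplified sketch under stated assumptions: class names are hypothetical, the flush is a no-op, and journals are modelled as a map, whereas the real _Sync Thread_ flushes actual files and fsyncs the mark:

```java
import java.util.TreeMap;

// Simplified model of the Sync Thread checkpoint (illustrative only).
public class SyncThreadSketch {
    static class LastLogMark {
        final long txnLogId, txnLogPos;
        LastLogMark(long id, long pos) { txnLogId = id; txnLogPos = pos; }
    }

    // journal file id -> current journal length; stands in for on-disk journal files
    static final TreeMap<Long, Long> journals = new TreeMap<>();
    static LastLogMark persistedMark = new LastLogMark(0, 0);

    static void checkpoint() {
        // 1. Record the current LastLogMark in memory.
        long id = journals.lastKey();
        LastLogMark mark = new LastLogMark(id, journals.get(id));
        // 2. Flush LedgerCache and entry logs entirely (may flush past the mark; harmless).
        flushLedgerCacheAndEntryLogs();
        // 3. Persist the mark; journal files older than txnLogId are now safe to delete.
        persistedMark = mark;
        journals.headMap(persistedMark.txnLogId).clear();
    }

    static void flushLedgerCacheAndEntryLogs() { /* no-op in this sketch */ }

    public static void main(String[] args) {
        journals.put(1L, 100L);
        journals.put(2L, 40L);
        checkpoint();
        System.out.println(journals.keySet()); // journal 1 deleted, journal 2 kept
    }
}
```

If a crash happens before step 3, the old persisted mark still points into journals that survive on disk, which is what makes replay on restart possible.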
+
+p. Using the above data flush mechanism, it is safe for the _Sync Thread_ to skip data flushing when the bookie shuts down. However, _Entry Logger_ uses a _BufferedChannel_ to write entries in batches, and there might be data still buffered in the _BufferedChannel_ at shutdown. The bookie must therefore ensure that _Entry Logger_ flushes its buffered data during shutdown; otherwise, _Entry Log_ files become corrupted with partial entries.
+
+p. As described above, _EntryLogger#flush_ is invoked in the following two cases:
+* in _Sync Thread_ : to ensure entries added before _LastLogMark_ are persisted to disk.
+* in _ShutDown_ : to ensure buffered data is persisted to disk, avoiding data corruption from partial entries.
+
+h2. Data Compaction
+
+p. In a bookie server, entries of different ledgers are interleaved in entry log files. A bookie server runs a _Garbage Collector_ thread to delete entry log files that are no longer associated with any active ledger, reclaiming disk space. However, if an entry log file contains entries from even one ledger that has not been deleted, the entry log file is never removed and its disk space is never reclaimed. To avoid this, the _Garbage Collector_ thread also compacts entry log files to reclaim disk space.
+
+p. There are two kinds of compaction, running at different frequencies: _Minor Compaction_ and _Major Compaction_. They differ only in their threshold value and compaction interval.
+
+# _Threshold_ : The fraction of an entry log file's size still occupied by undeleted ledgers. The default minor compaction threshold is 0.2, while the default major compaction threshold is 0.8.
+# _Interval_ : How often to run the compaction. The default minor compaction interval is 1 hour, while the default major compaction interval is 1 day.
+
+p. NOTE: if either _Threshold_ or _Interval_ is set to less than or equal to 
zero, then compaction is disabled.
+
+p. The data compaction flow in _Garbage Collector Thread_ is as follows:
+
+# _Garbage Collector_ thread scans entry log files to get their entry log 
metadata, which records a list of ledgers comprising an entry log and their 
corresponding percentages.
+# During the normal garbage collection flow, once the bookie determines that a ledger has been deleted, the ledger is removed from the entry log metadata and the recorded size of the entry log is reduced.
+# If the remaining size of an entry log file reaches a specified threshold, 
the entries of active ledgers in the entry log will be copied to a new entry 
log file.
+# Once all valid entries have been copied, the old entry log file is deleted.
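The compaction decision described in the flow above can be sketched as follows. This is a simplified, hypothetical helper (the real bookie reads these thresholds from its server configuration and performs the copy incrementally); the default values match those listed earlier:

```java
// Simplified compaction decision; threshold values match the documented defaults.
public class CompactionSketch {
    static final double MINOR_THRESHOLD = 0.2;
    static final double MAJOR_THRESHOLD = 0.8;

    // remainingRatio: fraction of the entry log still owned by undeleted ledgers.
    static String decide(double remainingRatio) {
        if (remainingRatio <= 0.0) return "delete";           // no live data: plain garbage collection
        if (remainingRatio < MINOR_THRESHOLD) return "minor"; // mostly dead: compact frequently
        if (remainingRatio < MAJOR_THRESHOLD) return "major"; // partly dead: compact rarely
        return "keep";                                        // mostly live data: leave the file alone
    }

    public static void main(String[] args) {
        System.out.println(decide(0.0) + " " + decide(0.1) + " " + decide(0.5) + " " + decide(0.9));
    }
}
```

A file selected for "minor" or "major" compaction has its live entries copied to a new entry log before the old file is deleted.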

Added: bookkeeper/site/trunk/content/docs/master/bookkeeperProgrammer.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/bookkeeperProgrammer.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/bookkeeperProgrammer.textile 
(added)
+++ bookkeeper/site/trunk/content/docs/master/bookkeeperProgrammer.textile Tue 
Dec  9 16:00:52 2014
@@ -0,0 +1,99 @@
+Title:        BookKeeper Programmer's Guide
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+
+h1. Abstract
+
+This guide contains detailed information about using BookKeeper for write-ahead logging. It discusses the basic operations BookKeeper supports, and how to create logs and perform basic read and write operations on these logs. The main classes used by the BookKeeper client are "BookKeeper":./apidocs/org/apache/bookkeeper/client/BookKeeper.html and "LedgerHandle":./apidocs/org/apache/bookkeeper/client/LedgerHandle.html.
+
+BookKeeper is the main class used to create, open and delete ledgers. A ledger is a log in BookKeeper, containing a sequence of entries. Only the client which creates a ledger can write to it. A LedgerHandle represents the ledger to the client, and allows the client to read and write entries. When the client is finished writing, it can close the LedgerHandle. Once a ledger has been closed, all clients who read from it are guaranteed to read exactly the same entries in exactly the same order. All methods of BookKeeper and LedgerHandle have synchronous and asynchronous versions; internally, the synchronous versions are implemented using the asynchronous ones.
+
+h1.  Instantiating BookKeeper
+
+To create a BookKeeper client, you need to create a configuration object and 
set the address of the ZooKeeper ensemble in use. For example, if you were 
using @zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181@ as your 
ensemble, you would create the BookKeeper client as follows.
+
+<pre><code>
+ClientConfiguration conf = new ClientConfiguration();
+conf.setZkServers("zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181");
 
+
+BookKeeper client = new BookKeeper(conf);
+</code></pre>
+
+It is important to close the client once you are finished working with it. The set calls on ClientConfiguration are chainable, so instead of putting each set* call on a new line as above, it is possible to make a number of calls on one line. For example:
+
+<pre><code>
+ClientConfiguration conf = new 
ClientConfiguration().setZkServers("localhost:2181").setZkTimeout(5000);
+</code></pre>
+
+There is also a useful shortcut constructor which allows you to pass the ZooKeeper ensemble string directly to BookKeeper.
+<pre><code>
+BookKeeper client = new BookKeeper("localhost:2181");
+</code></pre>
+
+See "BookKeeper":./apidocs/org/apache/bookkeeper/client/BookKeeper.html for 
the full api.
+
+
+h1.  Creating a ledger
+
+p. Before writing entries to BookKeeper, it is necessary to create a ledger. 
Before creating the ledger you must decide the ensemble size and the quorum 
size. 
+
+p. The ensemble size is the number of Bookies over which entries will be 
striped. The quorum size is the number of bookies which an entry will be 
written to. Striping is done in a round robin fashion. For example, if you have 
an ensemble size of 3 (consisting of bk1, bk2 & bk3), and a quorum of 2, entry 
1 will be written to bk1 & bk2, entry 2 will be written to bk2 & bk3, entry 3 
will be written to bk3 & bk1 and so on.
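The round-robin striping described above can be computed like this. The helper below is an illustrative sketch, not the client's actual placement code (which lives inside the BookKeeper client's distribution schedule); bookies bk1, bk2, bk3 are represented as indices 0, 1, 2:

```java
import java.util.ArrayList;
import java.util.List;

// Round-robin striping: with ensemble size n and quorum size q, entry e is
// written to bookies (e % n), (e + 1) % n, ..., (e + q - 1) % n.
public class StripingSketch {
    static List<Integer> bookiesFor(long entryId, int ensembleSize, int quorumSize) {
        List<Integer> bookies = new ArrayList<>();
        for (int i = 0; i < quorumSize; i++) {
            bookies.add((int) ((entryId + i) % ensembleSize));
        }
        return bookies;
    }

    public static void main(String[] args) {
        // Ensemble {bk1, bk2, bk3} as indices {0, 1, 2}, quorum 2:
        System.out.println(bookiesFor(0, 3, 2)); // first entry  -> bk1 & bk2
        System.out.println(bookiesFor(1, 3, 2)); // second entry -> bk2 & bk3
        System.out.println(bookiesFor(2, 3, 2)); // third entry  -> bk3 & bk1
    }
}
```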
+
+p. Ledgers are also created with a digest type and password. The digest type 
is used to generate a checksum so that when reading entries we can ensure that 
the content is the same as what was written. The password is used as an access 
control mechanism.
+
+p. To create a ledger with ensemble size 3, quorum size 2, CRC32 as the digest type, and "foobar" as the password, do the following:
+
+<pre><code>
+LedgerHandle lh = client.createLedger(3, 2, DigestType.CRC32, "foobar");
+</code></pre>
+
+You can now write to this ledger handle. As you probably plan to read the 
ledger at some stage, now is a good time to store the id of the ledger 
somewhere. The ledger id is a long, and can be obtained with @lh.getId()@.
+
+h1.  Adding entries to a ledger
+
+p. Once you have obtained a ledger handle, you can start adding entries to it. 
Entries are simply arrays of bytes. As such, adding entries to the ledger is 
rather simple.
+
+<pre><code>
+lh.addEntry("Hello World!".getBytes());
+</code></pre>
+
+h1.  Closing a ledger
+
+p. Once a client is done writing, it can close the ledger. Closing the ledger is a very important step in BookKeeper: once a ledger is closed, all reading clients are guaranteed to read the same sequence of entries in the same order. Closing takes no parameters.
+
+<pre><code>
+lh.close();
+</code></pre>
+
+h1.  Opening a ledger
+
+To read from a ledger, a client must open it first. To open a ledger you must know its ID, the digest type used when creating it, and its password. To open the ledger we created above, assuming it has ID 1:
+
+<pre><code>
+LedgerHandle lh2 = client.openLedger(1, DigestType.CRC32, "foobar");
+</code></pre>
+
+You can now read entries from the ledger. Any attempt to write to this handle 
will throw an exception.
+
+bq. NOTE: Opening a ledger which another client already has open for writing will prevent that client from writing any new entries to it. If you do not want this to happen, use the openLedgerNoRecovery method. However, keep in mind that without recovery, you lose the guarantees about which entries are in the ledger. You should only use openLedgerNoRecovery if you know what you are doing.
+
+h1. Reading entries from a ledger
+
+p. Now that you have an open ledger, you can read entries from it. You can use 
@getLastAddConfirmed@ to get the id of the last entry in the ledger.
+
+<pre><code>
+long lastEntry = lh2.getLastAddConfirmed();
+Enumeration<LedgerEntry> entries = lh2.readEntries(0, lastEntry);
+while (entries.hasMoreElements()) {
+       byte[] bytes = entries.nextElement().getEntry();
+       System.out.println(new String(bytes));
+}
+</code></pre>
\ No newline at end of file

Added: bookkeeper/site/trunk/content/docs/master/bookkeeperStarted.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/bookkeeperStarted.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/bookkeeperStarted.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/bookkeeperStarted.textile Tue Dec 
 9 16:00:52 2014
@@ -0,0 +1,102 @@
+Title:        BookKeeper Getting Started Guide
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+
+h1. Abstract
+
+This guide contains detailed information about using BookKeeper for logging. 
It discusses the basic operations BookKeeper supports, and how to create logs 
and perform basic read and write operations on these logs.
+
+h1. Getting Started: Setting up BookKeeper to write logs.
+
+p. This document contains information to get you started quickly with BookKeeper. It is aimed primarily at developers wishing to try it out, and contains instructions for a simple BookKeeper installation and a simple programming example. For further programming detail, please refer to the "BookKeeper Programmer's Guide":bookkeeperProgrammer.html.
+
+h1. Pre-requisites
+
+p. See "System Requirements":./bookkeeperConfig.html#bk_sysReq in the Admin guide.
+
+h1. Download
+
+p. BookKeeper trunk can be downloaded from subversion. See "Version Control":http://zookeeper.apache.org/bookkeeper/svn.html.
+
+h1. LocalBookKeeper
+
+p. BookKeeper provides a utility program to start a standalone ZooKeeper 
ensemble and a number of bookies on a local machine. As this all runs on a 
local machine, throughput will be very low. It should only be used for testing.
+
+p. To start a local bookkeeper ensemble with 5 bookies:
+
+ @bookkeeper-server/bin/bookkeeper localbookie 5@
+
+h1. Setting up bookies
+
+p. If you want more than just running things locally, you'll need to run bookies on different servers. You'll need at least three bookies to start with.
+
+p. For each bookie, we need to execute a command like the following: 
+
+ @bookkeeper-server/bin/bookkeeper bookie@
+
+p. This command will use the default directories for storing ledgers and the 
write ahead log, and will look for a zookeeper server on localhost:2181. See 
the "Admin Guide":./bookkeeperConfig.html for more details.
+
+p. To see the default values of these configuration variables, run:
+
+ @bookkeeper-server/bin/bookkeeper help@
+
+h1. Setting up ZooKeeper
+
+p. ZooKeeper stores metadata on behalf of BookKeeper clients and bookies. To 
get a minimal ZooKeeper installation to work with BookKeeper, we can set up one 
server running in standalone mode. Once we have the server running, we need to 
create a few znodes: 
+
+#  @/ledgers @ 
+#  @/ledgers/available @ 
+
+p. We provide a way of bootstrapping it automatically. See the "Admin 
Guide":./bookkeeperConfig.html for a description of how to bootstrap 
automatically, and in particular the shell metaformat command.
+ 
+
+h1. Example
+
+p. In the following excerpt of code, we: 
+
+# Open a bookkeeper client;
+# Create a ledger; 
+# Write to the ledger; 
+# Close the ledger; 
+# Open the same ledger for reading; 
+# Read from the ledger; 
+# Close the ledger again; 
+# Close the bookkeeper client.
+
+<pre><code>
+BookKeeper bkc = new BookKeeper("localhost:2181");
+byte[] ledgerPassword = "foobar".getBytes();
+LedgerHandle lh = bkc.createLedger(ledgerPassword);
+long ledgerId = lh.getId();
+List<byte[]> entries = new ArrayList<byte[]>();
+
+for (int i = 0; i < 10; i++) {
+       ByteBuffer entry = ByteBuffer.allocate(4);
+       entry.putInt(i);
+       entries.add(entry.array());
+       lh.addEntry(entry.array());
+}
+lh.close();
+lh = bkc.openLedger(ledgerId, ledgerPassword);
+
+Enumeration<LedgerEntry> ls = lh.readEntries(0, 9);
+int i = 0;
+while (ls.hasMoreElements()) {
+       ByteBuffer origbb = ByteBuffer.wrap(entries.get(i++));
+       Integer origEntry = origbb.getInt();
+       ByteBuffer result = ByteBuffer.wrap(ls.nextElement().getEntry());
+       Integer retrEntry = result.getInt();
+}
+lh.close();
+bkc.close();
+</code></pre>
\ No newline at end of file

Added: bookkeeper/site/trunk/content/docs/master/bookkeeperStream.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/bookkeeperStream.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/bookkeeperStream.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/bookkeeperStream.textile Tue Dec  
9 16:00:52 2014
@@ -0,0 +1,124 @@
+Title:        Streaming with BookKeeper
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+
+h1. Abstract
+
+This guide contains detailed information about how to stream bytes on top of BookKeeper. It motivates and discusses the basic stream operations currently supported.
+
+h1. Summary
+
+p. When using the BookKeeper API, an application has to split the data to 
write into entries, each entry being a byte array. This is natural for many 
applications. For example, when using BookKeeper for write-ahead logging, an 
application typically wants to write the modifications corresponding to a 
command or a transaction. Some other applications, however, might not have a 
natural boundary for entries, and may prefer to write and read streams of 
bytes. This is exactly the purpose of the stream API we have implemented on top 
of BookKeeper. 
+
+p. The stream API is implemented in the package @Streaming@ , and it contains 
two main classes: @LedgerOutputStream@ and  @LedgerInputStream@ . The class 
names are indicative of what they do. 
+
+h1. Writing a stream of bytes
+
+p. Class @LedgerOutputStream@ implements two constructors and five public 
methods: 
+
+ @public LedgerOutputStream(LedgerHandle lh) @ 
+
+p. where: 
+
+*  @lh@ is a ledger handle for a previously created and open ledger. 
+
+
+ @public LedgerOutputStream(LedgerHandle lh, int size) @ 
+
+p. where: 
+
+*  @lh@ is a ledger handle for a previously created and open ledger. 
+*  @size@ is the size of the byte buffer to store written bytes before 
flushing. 
+
+
+ _Closing a stream._ This call closes the stream by flushing the write buffer. 
+
+ @public void close() @ 
+
+p. which has no parameters. 
+
+ _Flushing a stream._ This call essentially flushes the write buffer. 
+
+ @public synchronized void flush() @ 
+
+p. which has no parameters. 
+
+ _Writing bytes._ There are three calls for writing bytes to a stream. 
+
+ @public synchronized void write(byte[] b) @ 
+
+p. where: 
+
+*  @b@ is an array of bytes to write. 
+
+
+ @public synchronized void write(byte[] b, int off, int len) @ 
+
+p. where: 
+
+*  @b@ is an array of bytes to write. 
+*  @off@ is a buffer offset. 
+*  @len@ is the length to write. 
+
+
+ @public synchronized void write(int b) @ 
+
+p. where: 
+
+*  @b@ contains a byte to write. The method writes only the least significant byte of the four-byte integer.
+
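The buffering behaviour of these write calls can be sketched with a simplified stand-in. The class below is hypothetical (the real @LedgerOutputStream@ hands each flushed buffer to @LedgerHandle#addEntry@; here flushed buffers are just collected in a list):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Simplified model of LedgerOutputStream buffering (illustrative only).
public class BufferedLedgerWriter extends OutputStream {
    private final List<byte[]> entries = new ArrayList<>(); // stands in for the ledger
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final int size; // byte buffer size, as in the (lh, size) constructor

    BufferedLedgerWriter(int size) { this.size = size; }

    @Override public void write(int b) throws IOException {
        buffer.write(b); // only the least significant byte of the int is kept
        if (buffer.size() >= size) flush();
    }

    @Override public synchronized void flush() {
        if (buffer.size() > 0) {
            entries.add(buffer.toByteArray()); // real code: lh.addEntry(...)
            buffer.reset();
        }
    }

    @Override public void close() { flush(); } // closing flushes the write buffer

    List<byte[]> entries() { return entries; }

    public static void main(String[] args) throws IOException {
        BufferedLedgerWriter out = new BufferedLedgerWriter(4);
        out.write("hello!".getBytes()); // 6 bytes through a 4-byte buffer
        out.close();
        System.out.println(out.entries().size()); // flushed as 2 entries
    }
}
```

Writing six bytes through a four-byte buffer produces two flushes: one when the buffer fills, and one from @close()@.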
+
+h1. Reading a stream of bytes
+
+p. Class @LedgerInputStream@ implements two constructors and four public methods:
+
+ @public LedgerInputStream(LedgerHandle lh) throws BKException, 
InterruptedException @ 
+
+p. where: 
+
+*  @lh@ is a ledger handle for a previously created and open ledger. 
+
+
+ @public LedgerInputStream(LedgerHandle lh, int size) throws BKException, 
InterruptedException @ 
+
+p. where: 
+
+*  @lh@ is a ledger handle for a previously created and open ledger. 
+*  @size@ is the size of the byte buffer to store bytes that the application 
will eventually read. 
+
+
+ _Closing._ There is one call to close an input stream, but the call is 
currently empty and the application is responsible for closing the ledger 
handle. 
+
+ @public void close() @ 
+
+p. which has no parameters. 
+
+ _Reading._ There are three calls to read from the stream. 
+
+ @public synchronized int read() throws IOException @ 
+
+p. which has no parameters. 
+
+ @public synchronized int read(byte[] b) throws IOException @ 
+
+p. where: 
+
+*  @b@ is a byte array to write to. 
+
+
+ @public synchronized int read(byte[] b, int off, int len) throws IOException 
@ 
+
+p. where: 
+
+*  @b@ is a byte array to write to. 
+*  @off@ is an offset for byte array @b@ . 
+*  @len@ is the maximum number of bytes to read into @b@ .
+

Added: bookkeeper/site/trunk/content/docs/master/doc.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/doc.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/doc.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/doc.textile Tue Dec  9 16:00:52 
2014
@@ -0,0 +1,21 @@
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+
+In the documentation directory, you'll find:
+
+* @build.txt@: Building Hedwig, or how to set up Hedwig
+* @user.txt@: User's Guide, or how to program against the Hedwig API and how 
to run it
+* @dev.txt@: Developer's Guide, or Hedwig internals and hacking details
+
+These documents are all written in the 
"Pandoc":http://johnmacfarlane.net/pandoc/ dialect of 
"Markdown":http://daringfireball.net/projects/markdown/. This makes them 
readable as plain text files, but also capable of generating HTML or LaTeX 
documentation.
+
+Documents are wrapped at 80 chars and use 2-space indentation.
+

Added: bookkeeper/site/trunk/content/docs/master/hedwigBuild.textile
URL: 
http://svn.apache.org/viewvc/bookkeeper/site/trunk/content/docs/master/hedwigBuild.textile?rev=1644097&view=auto
==============================================================================
--- bookkeeper/site/trunk/content/docs/master/hedwigBuild.textile (added)
+++ bookkeeper/site/trunk/content/docs/master/hedwigBuild.textile Tue Dec  9 
16:00:52 2014
@@ -0,0 +1,38 @@
+Notice: Licensed under the Apache License, Version 2.0 (the "License");
+        you may not use this file except in compliance with the License. You 
may
+        obtain a copy of the License at 
"http://www.apache.org/licenses/LICENSE-2.0":http://www.apache.org/licenses/LICENSE-2.0.
+        .        
+        Unless required by applicable law or agreed to in writing,
+        software distributed under the License is distributed on an "AS IS"
+        BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+        implied. See the License for the specific language governing 
permissions
+        and limitations under the License.
+        .
+
+h1. Pre-requisites
+
+For the core itself:
+
+* JDK 6: "http://java.sun.com/":http://java.sun.com/. Ensure @$JAVA_HOME@ is 
correctly set.
+* Maven 2: "http://maven.apache.org/":http://maven.apache.org/.
+
+Hedwig has been tested on Windows XP, Linux 2.6, and OS X.
+
+h1. Command-Line Instructions
+
+From the top level bookkeeper directory, run @mvn package@. This will compile 
and package the jars necessary for running hedwig. 
+
+See the User's Guide for instructions on running and usage.
+
+h1. Eclipse Instructions
+
+To check out, build, and develop using Eclipse:
+
+# Install the Subclipse plugin. Update site: 
"http://subclipse.tigris.org/update_1.4.x":http://subclipse.tigris.org/update_1.4.x.
+# Install the Maven plugin. Update site: 
"http://m2eclipse.sonatype.org/update":http://m2eclipse.sonatype.org/update. 
From the list of packages available from this site, select everything under the 
&quot;Maven Integration&quot; category, and from the optional components select 
the ones with the word &quot;SCM&quot; in them.
+# Go to Preferences &gt; Team &gt; SVN. For the SVN interface, choose 
&quot;Pure Java&quot;.
+# Choose File &gt; New &gt; Project... &gt; Maven &gt; Checkout Maven Projects 
from SCM.
+# For the SCM URL type, choose SVN. For the URL, enter SVN URL. Maven will 
automatically create a top-level Eclipse project for each of the 4 Maven 
modules (recommended). If you want fewer top-level projects, uncheck the option 
of having a project for each module (under Advanced).
+
+You are now ready to run and debug the client and server code. See the User's 
Guide for instructions on running and usage.
+

