sijie commented on a change in pull request #402: Issue 356: Release notes 4.5.0
File path: site/docs/latest/releaseNotes.md
@@ -0,0 +1,483 @@
+title: Apache BookKeeper 4.5.0 Release Notes
+This is the fifth release of BookKeeper as an Apache Top Level Project!
+The 4.5.0 release incorporates hundreds of new fixes, improvements, and
features since the previous major release, 4.4.0,
+which was released over a year ago. It is a big milestone in the Apache BookKeeper
community, converging from three
+main branches (Salesforce, Twitter and Yahoo).
+Apache BookKeeper users are encouraged to upgrade to `4.5.0`. The technical
details of this release are summarized
+The main features in 4.5.0 cover four areas:
+- Public API
+Prior to this release, Apache BookKeeper only supported simple `DIGEST-MD5`
+With this release of Apache BookKeeper, a number of features are introduced
that can be used, together or separately,
+to secure a BookKeeper cluster.
+The following security features are currently supported.
+- Authentication of connections to bookies from clients, using either `TLS` or
+- Authentication of connections from clients, bookies, autorecovery daemons to
`ZooKeeper`, when using zookeeper
+ based ledger managers.
+- Encryption of data transferred between bookies and clients, between bookies
and autorecovery daemons using `TLS`.
+It's worth noting that those security features are optional - non-secured
clusters are supported, as well as a mix
+of authenticated, unauthenticated, encrypted and non-encrypted clients.
+For more details, have a look at [BookKeeper Security](../security).
+### Public API
+There are multiple new client features introduced in 4.5.0.
+The [Ledger API] is the low-level API provided by BookKeeper for interacting
with `ledgers` in a BookKeeper cluster.
+It is simple but not flexible on ledger id or entry id generation. Apache
BookKeeper introduces `LedgerHandleAdv`
+as an extension of the existing `LedgerHandle` for advanced usage. The new
`LedgerHandleAdv` allows applications to provide
+their own `ledger-id` and to assign an `entry-id` when adding entries.
+See [Ledger Advanced API](../api/ledger-adv-api) for more details.
+#### Long Poll
+`Long Poll` is a main feature that [DistributedLog](https://distributedlog.io)
uses to achieve low-latency tailing.
+This big feature has been merged back in 4.5.0 and is available to BookKeeper
users. It allows tailing reads without
+polling `LastAddConfirmed` every time the readers exhaust known entries.
+Although `Long Poll` brings great latency improvements on tailing reads, it is
still a very low-level primitive.
+It is still recommended to use a high-level API (e.g. [DistributedLog
API](../api/distributedlog-api)) for tailing and streaming use cases.
for more details.
+#### Explicit LAC
+Prior to 4.5.0, the `LAC` is only advanced when subsequent entries are added.
If no subsequent entries are added,
+the last entry written will not be visible to readers until the ledger is
closed. High-level clients (e.g. DistributedLog) or applications
+have to work around this by writing some sort of `control records` to advance
+In 4.5.0, a new `explicit lac` feature is introduced to periodically advance
`LAC` if no subsequent entries are added. This feature
+can be enabled by setting `explicitLacInterval` to a positive value.
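
The visibility gap described above can be sketched with a toy, in-memory model. This is not BookKeeper code: `ToyLedger`, `addEntry`, `flushExplicitLac` and `readerVisibleLac` are hypothetical names, and the model assumes that each entry carries the `LAC` known to the writer when the entry was sent and that every add is acknowledged immediately.

```java
import java.util.ArrayList;
import java.util.List;

// Toy, in-memory model of LAC visibility; hypothetical, not BookKeeper code.
class ToyLedger {
    // LAC value piggybacked on each stored entry (list index == entry id).
    private final List<Long> piggybackedLac = new ArrayList<>();
    private long writerLac = -1L;   // last entry acknowledged to the writer
    private long explicitLac = -1L; // last LAC pushed without adding an entry

    // Append one entry; it carries the LAC known before this add.
    long addEntry() {
        long entryId = piggybackedLac.size();
        piggybackedLac.add(writerLac);
        writerLac = entryId; // assumption: the add is acknowledged immediately
        return entryId;
    }

    // Stand-in for the periodic explicit LAC update.
    void flushExplicitLac() {
        explicitLac = writerLac;
    }

    // Highest entry id a tailing reader is allowed to read.
    long readerVisibleLac() {
        long fromEntries = piggybackedLac.isEmpty()
                ? -1L
                : piggybackedLac.get(piggybackedLac.size() - 1);
        return Math.max(fromEntries, explicitLac);
    }
}
```

In this model, after three adds the reader can only see up to entry 1; entry 2 becomes visible once `flushExplicitLac()` runs, which is the role the periodic explicit update plays when no further entries arrive.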
+There are a lot of performance-related bug fixes and improvements in 4.5.0.
The major performance improvement introduced in 4.5.0 is
+upgrading netty from 3.x to
+For more details, please read the [upgrade guide](../upgrade) for netty-related
tips when upgrading BookKeeper from 4.4.0 to 4.5.0.
+Besides the netty 4 upgrade, other performance-related changes are
highlighted below:
+#### Delay Ensemble Change
+`Ensemble Change` is a feature that Apache BookKeeper uses to achieve high
availability. However, it is an expensive metadata operation.
+Especially when Apache BookKeeper is deployed in a multi-datacenter
environment, losing a data center will cause a churn of metadata
+operations due to ensemble changes. `Delay Ensemble Change` is introduced in
4.5.0 to overcome this problem. Enabling this feature means
+an `Ensemble Change` will only occur when clients can't receive enough valid
responses to satisfy the `ack-quorum` constraint. This feature
+improves the tail latency.
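
As a rough illustration of the rule described above, the decision can be written as a small predicate. This is a simplified sketch, not the client implementation: `EnsembleChangeSketch` and `shouldChangeEnsemble` are hypothetical names, and the non-delayed branch assumes that any failed write triggers an ensemble change.

```java
// Toy decision rule; hypothetical, not the BookKeeper client implementation.
class EnsembleChangeSketch {
    static boolean shouldChangeEnsemble(int writeQuorum, int ackQuorum,
                                        int failedBookies, boolean delayEnsembleChange) {
        if (failedBookies < 0 || failedBookies > writeQuorum) {
            throw new IllegalArgumentException("failedBookies out of range");
        }
        if (!delayEnsembleChange) {
            // Assumed behaviour without the feature: any failure triggers a change.
            return failedBookies > 0;
        }
        // With the feature: change only when the ack quorum can no longer be met.
        int possibleAcks = writeQuorum - failedBookies;
        return possibleAcks < ackQuorum;
    }
}
```

With a write quorum of 3 and an ack quorum of 2, a single failed bookie triggers a change only when the delay is off; two failures trigger it either way.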
+To enable this feature, please set `delayEnsembleChange` to `true` on your
+#### Parallel Ledger Recovery
+BookKeeper clients recover entries one-by-one during ledger recovery. If a
ledger has a very large volume of traffic, it will have
+a large number of entries to recover when client failures occur. BookKeeper
introduces `parallel ledger recovery` in 4.5.0 to allow
+batch recovery to improve ledger recovery speed.
+To enable this feature, please set `enableParallelRecoveryRead` to `true` on
your clients. You can also set `recoveryReadBatchSize`
+to control the batch size of recovery read.
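
A minimal sketch of the two settings and of why batching helps, using only `java.util.Properties`. The class and method names are hypothetical, the batch size is illustrative, and `recoveryRounds` assumes each batch is read fully in parallel, which simplifies the real recovery protocol.

```java
import java.util.Properties;

// Hypothetical helper; only the two setting names come from the release notes.
class ParallelRecoverySketch {
    // Builds the client settings named above; the batch size is illustrative.
    static Properties clientSettings(int batchSize) {
        Properties props = new Properties();
        props.setProperty("enableParallelRecoveryRead", "true");
        props.setProperty("recoveryReadBatchSize", Integer.toString(batchSize));
        return props;
    }

    // Sequential read rounds needed to recover `entries` entries when each
    // round reads up to `batchSize` entries in parallel (a simplification).
    static long recoveryRounds(long entries, int batchSize) {
        if (entries < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("entries >= 0 and batchSize > 0 required");
        }
        return (entries + batchSize - 1) / batchSize; // ceiling division
    }
}
```

Under these assumptions, recovering 1000 entries takes 1000 rounds one-by-one but 16 rounds with a batch size of 64.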
+#### Multiple Journals
+Prior to 4.5.0, bookies are only allowed to configure one journal device. If
you want to have high write bandwidth, you can RAID multiple
+disks into one device and mount that device for the journal directory. However,
because there is only one journal thread, this approach doesn't
+actually improve the write bandwidth.
+BookKeeper introduces multiple journal directories support in 4.5.0. Users can
configure multiple devices for journal directories.
+To enable this feature, please use `journalDirectories` rather than
+Apache BookKeeper supports a pluggable metadata store. By default, it uses
Apache ZooKeeper as its metadata store. Among the zookeeper-based
+ledger manager implementations, `HierarchicalLedgerManager` is the most
popular and widely adopted ledger manager. However, it has a major
+limitation: it assumes `ledger-id` is a 32-bit integer, which limits the
number of ledgers to `2^32`.
+`LongHierarchicalLedgerManager` is introduced to overcome this limitation.
+See [Ledger Manager](../develop/ledger-manager) for more details and learn how
to migrate `HierarchicalLedgerManager` to `LongHierarchicalLedgerManager`.
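
The arithmetic behind the `2^32` limit can be checked with a short sketch. `LedgerIdRangeSketch` and `fitsIn32Bits` are hypothetical names used only for illustration; they say nothing about how either ledger manager stores ids.

```java
// Hypothetical helper illustrating the 32-bit ledger id range.
class LedgerIdRangeSketch {
    // Number of distinct ids representable in 32 bits: 2^32 = 4294967296.
    static final long MAX_32_BIT_LEDGERS = 1L << 32;

    // True when a (64-bit) ledger id still fits in the 32-bit range.
    static boolean fitsIn32Bits(long ledgerId) {
        return ledgerId >= 0 && ledgerId < MAX_32_BIT_LEDGERS;
    }
}
```

Any id at or above `4294967296` falls outside the 32-bit range, while a 64-bit `long` id leaves room up to `Long.MAX_VALUE`.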
+#### Weight-based placement policy
+`Rack-Aware` and `Region-Aware` placement policies are the two available
placement policies in the BookKeeper client. They place ensembles based
+on users' configured network topology. However, they both assume that all nodes
are equal. `weight-based` placement is introduced in 4.5.0 to
+improve the existing placement policies. `weight-based` placement was not built
as a separate policy. It is built into the existing placement policies.
+If you are using `Rack-Aware` or `Region-Aware`, you can simply enable
`weight-based` placement by setting `diskWeightBasedPlacementEnabled` to `true`.
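
To illustrate what treating nodes as unequal can mean, here is a generic weighted-selection sketch. It is not the placement policy's actual algorithm: `WeightedPickSketch`, `pick`, the bookie names, and the use of free disk space as the weight are all assumptions made for illustration.

```java
import java.util.Map;

// Generic weighted selection; hypothetical, not BookKeeper's placement code.
class WeightedPickSketch {
    // Returns the key whose cumulative weight range contains `roll`,
    // where 0 <= roll < sum of all weights. Iteration order of the map
    // defines the ranges, so an ordered map (e.g. LinkedHashMap) is assumed.
    static String pick(Map<String, Long> weights, long roll) {
        if (roll < 0) {
            throw new IllegalArgumentException("roll must be non-negative");
        }
        long cumulative = 0L;
        for (Map.Entry<String, Long> e : weights.entrySet()) {
            cumulative += e.getValue();
            if (roll < cumulative) {
                return e.getKey();
            }
        }
        throw new IllegalArgumentException("roll must be below the total weight");
    }
}
```

Drawing `roll` uniformly from `[0, total)`, for example with `ThreadLocalRandom.current().nextLong(total)`, selects each node with probability proportional to its weight, so a node with three times the weight is picked about three times as often.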
+#### Customized Ledger Metadata
+A `Map<String, byte[]>` is introduced in ledger metadata in 4.5.0. Clients are
now allowed to pass in a key/value map when creating ledgers.
+This customized ledger metadata can later be used by a user-defined placement
policy. This extends the flexibility of bookkeeper API.
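
A minimal sketch of building such a key/value map with the standard library. The keys `application` and `region` and the helper name `buildMetadata` are hypothetical; how the map is handed to ledger creation depends on the client API and is not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; keys and values are illustrative only.
class CustomMetadataSketch {
    // Encodes string values as UTF-8 bytes, since metadata values are byte arrays.
    static Map<String, byte[]> buildMetadata(String application, String region) {
        Map<String, byte[]> metadata = new HashMap<>();
        metadata.put("application", application.getBytes(StandardCharsets.UTF_8));
        metadata.put("region", region.getBytes(StandardCharsets.UTF_8));
        return metadata;
    }
}
```

A user-defined placement policy could then decode a value with `new String(bytes, StandardCharsets.UTF_8)` and act on it.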
+#### Add Prometheus stats provider
+A new [Prometheus](https://prometheus.io/) [stats
+is introduced in 4.5.0. It simplifies the metric collection when running
bookkeeper on [kubernetes](https://kubernetes.io/).
I don't think test improvements need to be in `highlights`.
I will include the tool improvements.