This is an automated email from the ASF dual-hosted git repository.

showuon pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/kafka.git


The following commit(s) were added to refs/heads/trunk by this push:
     new 7376d2c5b1e MINOR: add quick start for tiered storage feature (#14528)
7376d2c5b1e is described below

commit 7376d2c5b1ed455eaf39dfd8443d3a67c9189a36
Author: Luke Chen <[email protected]>
AuthorDate: Tue Oct 17 10:30:11 2023 +0800

    MINOR: add quick start for tiered storage feature (#14528)
    
    Some users complained they don't have a way to determine if there is 
something wrong in the RSM plug-in they implemented, or there's something wrong 
in Kafka itself. Also, if there are users who just want to try the tiered 
storage feature out before implementing anything, it would be good we have an 
RSM implementation by default.
    
    Per the discussion in the KIP, there will be no default RSM implementation 
in Kafka, but we can use the LocalTieredStorage implemented for integration 
test, to resolve the issues above.
    
    Reviewers: Christo Lolov <[email protected]>, Divij Vaidya 
<[email protected]>, Kamal Chandraprakash <[email protected]>, 
Satish Duggana <[email protected]>
---
 docs/ops.html | 82 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 docs/toc.html |  2 +-
 2 files changed, 76 insertions(+), 8 deletions(-)

diff --git a/docs/ops.html b/docs/ops.html
index 3ee9cac4237..4414a0b86ad 100644
--- a/docs/ops.html
+++ b/docs/ops.html
@@ -3984,27 +3984,95 @@ listeners=CONTROLLER://:9093
   If unset, The value in <code>retention.ms</code> and 
<code>retention.bytes</code> will be used.
 </p>
 
-<h4 class="anchor-heading"><a id="tiered_storage_config_ex" 
class="anchor-link"></a><a href="#tiered_storage_config_ex">Configurations 
Example</a></h4>
+<h4 class="anchor-heading"><a id="tiered_storage_config_ex" 
class="anchor-link"></a><a href="#tiered_storage_config_ex">Quick Start 
Example</a></h4>
+
+<p>Apache Kafka doesn't provide an out-of-the-box RemoteStorageManager 
implementation. To have a preview of the tiered storage
+  feature, the <a 
href="https://github.com/apache/kafka/blob/trunk/storage/src/test/java/org/apache/kafka/server/log/remote/storage/LocalTieredStorage.java";>LocalTieredStorage</a>
+  implemented for integration test can be used, which will create a temporary 
directory in local storage to simulate the remote storage.
+</p>
+
+<p>To adopt the `LocalTieredStorage`, the test library needs to be built 
locally</p>
+<pre># please checkout to the specific version tag you're using before 
building it
+# ex: `git checkout 3.6.0`
+./gradlew clean :storage:testJar</pre>
+<p>After build successfully, there should be a `kafka-storage-x.x.x-test.jar` 
file under `storage/build/libs`.
+Next, setting configurations in the broker side to enable tiered storage 
feature.</p>
 
-<p>Here is a sample configuration to enable tiered storage feature in broker 
side:
 <pre>
 # Sample Zookeeper/Kraft broker server.properties listening on 
PLAINTEXT://:9092
 remote.log.storage.system.enable=true
-# Please provide the implementation for remoteStorageManager. This is the 
mandatory configuration for tiered storage.
-# 
remote.log.storage.manager.class.name=org.apache.kafka.server.log.remote.storage.NoOpRemoteStorageManager
-# Using the "PLAINTEXT" listener for the clients in RemoteLogMetadataManager 
to talk to the brokers.
+
+# Setting the listener for the clients in RemoteLogMetadataManager to talk to 
the brokers.
 remote.log.metadata.manager.listener.name=PLAINTEXT
+
+# Please provide the implementation info for remoteStorageManager.
+# This is the mandatory configuration for tiered storage.
+# Here, we use the `LocalTieredStorage` built above.
+remote.log.storage.manager.class.name=org.apache.kafka.server.log.remote.storage.LocalTieredStorage
+remote.log.storage.manager.class.path=/PATH/TO/kafka-storage-x.x.x-test.jar
+
+# These 2 prefix are default values, but customizable
+remote.log.storage.manager.impl.prefix=rsm.config.
+remote.log.metadata.manager.impl.prefix=rlmm.config.
+
+# Configure the directory used for `LocalTieredStorage`
+# Note, please make sure the brokers need to have access to this directory
+rsm.config.dir=/tmp/kafka-remote-storage
+
+# This needs to be changed if number of brokers in the cluster is more than 1
+rlmm.config.remote.log.metadata.topic.replication.factor=1
+
+# Try to speed up the log retention check interval for testing
+log.retention.check.interval.ms=1000
 </pre>
 </p>
 
-<p>After broker is started, creating a topic with tiered storage enabled, and 
a small log time retention value to try this feature:
-<pre>bin/kafka-topics.sh --create --topic tieredTopic --bootstrap-server 
localhost:9092 --config remote.storage.enable=true --config 
local.retention.ms=1000
+<p>Following <a href="#quickstart_startserver">quick start guide</a> to start 
up the kafka environment.
+  Then, create a topic with tiered storage enabled with configs:
+
+<pre>
+# remote.storage.enable=true -> enables tiered storage on the topic
+# local.retention.ms=1000 -> The number of milliseconds to keep the local log 
segment before it gets deleted.
+  Note that a local log segment is eligible for deletion only after it gets 
uploaded to remote.
+# retention.ms=3600000 -> when segments exceed this time, the segments in 
remote storage will be deleted
+# segment.bytes=1048576 -> for test only, to speed up the log segment rolling 
interval
+# file.delete.delay.ms=10000 -> for test only, to speed up the local-log 
segment file delete delay
+
+bin/kafka-topics.sh --create --topic tieredTopic --bootstrap-server 
localhost:9092 \
+--config remote.storage.enable=true --config local.retention.ms=1000 --config 
retention.ms=3600000 \
+--config segment.bytes=1048576 --config file.delete.delay.ms=1000
 </pre>
 </p>
 
+<p>Try to send messages to the `tieredTopic` topic to roll the log segment:</p>
+
+<pre>
+bin/kafka-producer-perf-test.sh --topic tieredTopic --num-records 1200 
--record-size 1024 --throughput -1 --producer-props 
bootstrap.servers=localhost:9092
+</pre>
+
 <p>Then, after the active segment is rolled, the old segment should be moved 
to the remote storage and get deleted.
+  This can be verified by checking the remote log directory configured above. 
For example:
 </p>
 
+<pre> > ls 
/tmp/kafka-remote-storage/kafka-tiered-storage/tieredTopic-0-jF8s79t9SrG_PNqlwv7bAA
+00000000000000000000-knnxbs3FSRyKdPcSAOQC-w.index
+00000000000000000000-knnxbs3FSRyKdPcSAOQC-w.snapshot
+00000000000000000000-knnxbs3FSRyKdPcSAOQC-w.leader_epoch_checkpoint
+00000000000000000000-knnxbs3FSRyKdPcSAOQC-w.timeindex
+00000000000000000000-knnxbs3FSRyKdPcSAOQC-w.log
+</pre>
+
+<p>Lastly, we can try to consume some data from the beginning and print offset 
number, to make sure it will successfully fetch offset 0 from the remote 
storage.</p>
+
+<pre>bin/kafka-console-consumer.sh --topic tieredTopic --from-beginning 
--max-messages 1 --bootstrap-server localhost:9092 --property 
print.offset=true</pre>
+
+<p>Please note, if you want to disable tiered storage at the cluster level, 
you should delete the tiered storage enabled topics explicitly.
+  Attempting to disable tiered storage at the cluster level without deleting 
the topics using tiered storage will result in an exception during startup.</p>
+
+<pre>bin/kafka-topics.sh --delete --topic tieredTopic --bootstrap-server 
localhost:9092</pre>
+
+<p>After topics are deleted, you're safe to set 
<code>remote.log.storage.system.enable=false</code> in the broker 
configuration.</p>
+
 <h4 class="anchor-heading"><a id="tiered_storage_limitation" 
class="anchor-link"></a><a 
href="#tiered_storage_limitation">Limitations</a></h4>
 
 <p>While the early access release of Tiered Storage offers the opportunity to 
try out this new feature, it is important to be aware of the following 
limitations:
diff --git a/docs/toc.html b/docs/toc.html
index 737ef887cd1..73bd66ae41c 100644
--- a/docs/toc.html
+++ b/docs/toc.html
@@ -173,7 +173,7 @@
                     <ul>
                         <li><a href="#tiered_storage_overview">Tiered Storage 
Overview</a></li>
                         <li><a 
href="#tiered_storage_config">Configuration</a></li>
-                        <li><a href="#tiered_storage_config_ex">Configurations 
Example</a></li>
+                        <li><a href="#tiered_storage_config_ex">Quick Start 
Example</a></li>
                         <li><a 
href="#tiered_storage_limitation">Limitations</a></li>
                     </ul>
                 </li>

Reply via email to