Repository: kafka-site
Updated Branches:
  refs/heads/asf-site 1ad8525f1 -> 76217f0b9


additional improvements to 0.10.0 docs


Project: http://git-wip-us.apache.org/repos/asf/kafka-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/kafka-site/commit/76217f0b
Tree: http://git-wip-us.apache.org/repos/asf/kafka-site/tree/76217f0b
Diff: http://git-wip-us.apache.org/repos/asf/kafka-site/diff/76217f0b

Branch: refs/heads/asf-site
Commit: 76217f0b996e0c563359fa3b8aad32d3f2ed46de
Parents: 1ad8525
Author: Gwen Shapira <[email protected]>
Authored: Mon May 9 18:44:25 2016 -0700
Committer: Gwen Shapira <[email protected]>
Committed: Mon May 9 18:44:25 2016 -0700

----------------------------------------------------------------------
 0100/connect.html        |  8 +++++---
 0100/implementation.html | 10 +++++-----
 0100/introduction.html   |  2 +-
 0100/ops.html            |  4 ++--
 0100/upgrade.html        |  5 +++++
 5 files changed, 18 insertions(+), 11 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kafka-site/blob/76217f0b/0100/connect.html
----------------------------------------------------------------------
diff --git a/0100/connect.html b/0100/connect.html
index 5cd4130..c3cf583 100644
--- a/0100/connect.html
+++ b/0100/connect.html
@@ -95,7 +95,9 @@ Since Kafka Connect is intended to be run as a service, it also provides a REST
     <li><code>GET /connectors/{name}</code> - get information about a specific connector</li>
     <li><code>GET /connectors/{name}/config</code> - get the configuration parameters for a specific connector</li>
     <li><code>PUT /connectors/{name}/config</code> - update the configuration parameters for a specific connector</li>
+    <li><code>GET /connectors/{name}/status</code> - get current status of the connector, including if it is running, failed, paused, etc., which worker it is assigned to, error information if it has failed, and the state of all its tasks</li>
     <li><code>GET /connectors/{name}/tasks</code> - get a list of tasks currently running for a connector</li>
+    <li><code>GET /connectors/{name}/tasks/{taskid}/status</code> - get current status of the task, including if it is running, failed, paused, etc., which worker it is assigned to, and error information if it has failed</li>
     <li><code>DELETE /connectors/{name}</code> - delete a connector, halting all tasks and deleting its configuration</li>
 </ul>
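
(For readers trying out the new endpoints: a minimal sketch of querying connector
status over HTTP with plain java.net, assuming a Connect worker on localhost:8083,
the default REST port, and a hypothetical connector named "local-file-source".)

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class ConnectorStatusCheck {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://localhost:8083/connectors/local-file-source/status");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null)
                    System.out.println(line);  // JSON describing connector and task states
            }
        }
    }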
 
@@ -191,8 +193,8 @@ public List&lt;Map&lt;String, String&gt;&gt; getTaskConfigs(int maxTasks) {
 }
 </pre>
 
-Although not used in the example, <code>SourceTask</code> also provides two APIs to commit offsets in the source system: <code>commit</code> and <code>commitSourceRecord</code>. The APIs are provided for source systems which have an acknowledgement mechanism for messages. Overriding these methods allows the source connector to acknowledge messages in the source system, either in bulk or individually, once they have been written to Kafka.
-The <code>commit<code> API stores the offsets in the source system, up to the offsets that have been returned by <code>poll</code>. The implementation of this API should block until the commit is complete. The <code>commitSourceRecord</code> API saves the offset in the source system for each <code>SourceRecord</code> after it is written to Kafka. As Kafka Connect will record offsets automatically, <code>SourceTask<code>s are not required to implement them. In cases where a connector does need to acknowledge messages in the source system, only one of the APIs is typically required.
+Although not used in the example, <code>SourceTask</code> also provides two APIs to commit offsets in the source system: <code>commit</code> and <code>commitRecord</code>. The APIs are provided for source systems which have an acknowledgement mechanism for messages. Overriding these methods allows the source connector to acknowledge messages in the source system, either in bulk or individually, once they have been written to Kafka.
+The <code>commit</code> API stores the offsets in the source system, up to the offsets that have been returned by <code>poll</code>. The implementation of this API should block until the commit is complete. The <code>commitRecord</code> API saves the offset in the source system for each <code>SourceRecord</code> after it is written to Kafka. As Kafka Connect will record offsets automatically, <code>SourceTask</code>s are not required to implement them. In cases where a connector does need to acknowledge messages in the source system, only one of the APIs is typically required.
 
 Even with multiple tasks, this method implementation is usually pretty simple. It just has to determine the number of input tasks, which may require contacting the remote service it is pulling data from, and then divvy them up. Because some patterns for splitting work among tasks are so common, some utilities are provided in <code>ConnectorUtils</code> to simplify these cases.
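
(A minimal sketch of overriding the two commit hooks described above. The ack
calls are hypothetical stand-ins for whatever acknowledgement API the source
system offers; the other task methods are stubbed only to keep the example
compilable.)

    import java.util.List;
    import java.util.Map;
    import org.apache.kafka.connect.source.SourceRecord;
    import org.apache.kafka.connect.source.SourceTask;

    public class AckingSourceTask extends SourceTask {
        @Override public String version() { return "0.1"; }
        @Override public void start(Map<String, String> props) { /* connect to source system */ }
        @Override public List<SourceRecord> poll() throws InterruptedException { return null; }
        @Override public void stop() { /* disconnect from source system */ }

        // Bulk acknowledgement: block until everything returned by poll() so far
        // has been acknowledged in the source system.
        @Override
        public void commit() throws InterruptedException {
            // source.ackUpToLastPolledOffset();  // hypothetical source-system call
        }

        // Per-record acknowledgement: invoked once this record has been written to Kafka.
        @Override
        public void commitRecord(SourceRecord record) throws InterruptedException {
            // source.ack(record.sourceOffset());  // hypothetical source-system call
        }
    }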
 
@@ -232,7 +234,7 @@ Next, we implement the main functionality of the task, the <code>poll()</code> m
 public List&lt;SourceRecord&gt; poll() throws InterruptedException {
     try {
         ArrayList&lt;SourceRecord&gt; records = new ArrayList&lt;&gt;();
-        while (streamValid(stream) && records.isEmpty()) {
+        while (streamValid(stream) &amp;&amp; records.isEmpty()) {
             LineAndOffset line = readToNextLine(stream);
             if (line != null) {
                Map<String, Object> sourcePartition = Collections.singletonMap("filename", filename);
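
(An aside for readers of the fragment above: one plausible way the method
continues. A sketch only: streamValid, readToNextLine and LineAndOffset are
helpers from the surrounding guide, line.offset() and line.line() are assumed
accessors, and Schema, SourceRecord and ConnectException come from
org.apache.kafka.connect.data, .source and .errors respectively.)

    public List<SourceRecord> poll() throws InterruptedException {
        try {
            ArrayList<SourceRecord> records = new ArrayList<>();
            while (streamValid(stream) && records.isEmpty()) {
                LineAndOffset line = readToNextLine(stream);
                if (line != null) {
                    Map<String, Object> sourcePartition = Collections.singletonMap("filename", filename);
                    Map<String, Object> sourceOffset = Collections.singletonMap("position", line.offset());
                    records.add(new SourceRecord(sourcePartition, sourceOffset, topic,
                                                 Schema.STRING_SCHEMA, line.line()));
                }
            }
            return records;
        } catch (IOException e) {
            // The stream went away; surface this as a task failure.
            throw new ConnectException(e);
        }
    }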

http://git-wip-us.apache.org/repos/asf/kafka-site/blob/76217f0b/0100/implementation.html
----------------------------------------------------------------------
diff --git a/0100/implementation.html b/0100/implementation.html
index be81227..0a36c22 100644
--- a/0100/implementation.html
+++ b/0100/implementation.html
@@ -282,7 +282,7 @@ When an element in a path is denoted [xyz], that means that the value of xyz is
 
 <h4><a id="impl_zkbroker" href="#impl_zkbroker">Broker Node Registry</a></h4>
 <pre>
-/brokers/ids/[0...N] --> host:port (ephemeral node)
+/brokers/ids/[0...N] --> {"jmx_port":...,"timestamp":...,"endpoints":[...],"host":...,"version":...,"port":...} (ephemeral node)
 </pre>
 <p>
 This is a list of all present broker nodes, each of which provides a unique logical broker id which identifies it to consumers (which must be given as part of its configuration). On startup, a broker node registers itself by creating a znode with the logical broker id under /brokers/ids. The purpose of the logical broker id is to allow a broker to be moved to a different physical machine without affecting consumers. An attempt to register a broker id that is already in use (say because two servers are configured with the same broker id) results in an error.
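
(A short sketch of inspecting such a registration directly, assuming ZooKeeper
is reachable on localhost:2181 and a broker is registered with id 0.)

    import org.apache.zookeeper.ZooKeeper;

    public class BrokerRegistryDump {
        public static void main(String[] args) throws Exception {
            // 30s session timeout; the no-op lambda ignores watch events.
            ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });
            byte[] data = zk.getData("/brokers/ids/0", false, null);
            System.out.println(new String(data, "UTF-8"));  // prints the JSON shown above
            zk.close();
        }
    }
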
@@ -292,7 +292,7 @@ Since the broker registers itself in ZooKeeper using ephemeral znodes, this regi
 </p>
 <h4><a id="impl_zktopic" href="#impl_zktopic">Broker Topic Registry</a></h4>
 <pre>
-/brokers/topics/[topic]/[0...N] --> nPartitions (ephemeral node)
+/brokers/topics/[topic]/partitions/[0...N]/state --> {"controller_epoch":...,"leader":...,"version":...,"leader_epoch":...,"isr":[...]} (ephemeral node)
 </pre>
 
 <p>
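
(And a sketch of pulling the leader and ISR out of that state JSON, here with
Jackson; an assumption, any JSON parser will do.)

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;

    public class PartitionStateParse {
        public static void main(String[] args) throws Exception {
            // Example payload matching the schema above; real values come from ZooKeeper.
            String json = "{\"controller_epoch\":1,\"leader\":1,\"version\":1,"
                        + "\"leader_epoch\":0,\"isr\":[1,2]}";
            JsonNode state = new ObjectMapper().readTree(json);
            System.out.println("leader = " + state.get("leader").asInt());
            System.out.println("isr    = " + state.get("isr"));
        }
    }
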
@@ -317,7 +317,7 @@ The consumers in a group divide up the partitions as fairly as possible, each pa
 <p>
 In addition to the group_id which is shared by all consumers in a group, each consumer is given a transient, unique consumer_id (of the form hostname:uuid) for identification purposes. Consumer ids are registered in the following directory.
 <pre>
-/consumers/[group_id]/ids/[consumer_id] --> {"topic1": #streams, ..., "topicN": #streams} (ephemeral node)
+/consumers/[group_id]/ids/[consumer_id] --> {"version":...,"subscription":{...:...},"pattern":...,"timestamp":...} (ephemeral node)
 </pre>
 Each of the consumers in the group registers under its group and creates a znode with its consumer_id. The value of the znode contains a map of &lt;topic, #streams&gt;. This id is simply used to identify each of the consumers which is currently active within a group. This is an ephemeral node so it will disappear if the consumer process dies.
 </p>
@@ -327,7 +327,7 @@ Each of the consumers in the group registers under its group and creates a znode
 Consumers track the maximum offset they have consumed in each partition. This value is stored in a ZooKeeper directory if <code>offsets.storage=zookeeper</code>.
 </p>
 <pre>
-/consumers/[group_id]/offsets/[topic]/[broker_id-partition_id] --> offset_counter_value ((persistent node)
+/consumers/[group_id]/offsets/[topic]/[partition_id] --> offset_counter_value ((persistent node)
 </pre>
 
 <h4><a id="impl_zkowner" href="#impl_zkowner">Partition Owner registry</a></h4>
@@ -337,7 +337,7 @@ Each broker partition is consumed by a single consumer within a given consumer g
 </p>
 
 <pre>
-/consumers/[group_id]/owners/[topic]/[broker_id-partition_id] --> consumer_node_id (ephemeral node)
+/consumers/[group_id]/owners/[topic]/[partition_id] --> consumer_node_id (ephemeral node)
 </pre>
 
 <h4><a id="impl_brokerregistration" href="#impl_brokerregistration">Broker node registration</a></h4>

http://git-wip-us.apache.org/repos/asf/kafka-site/blob/76217f0b/0100/introduction.html
----------------------------------------------------------------------
diff --git a/0100/introduction.html b/0100/introduction.html
index ad81e97..c2e3554 100644
--- a/0100/introduction.html
+++ b/0100/introduction.html
@@ -33,7 +33,7 @@ So, at a high level, producers send messages over the network to the Kafka clust
   <img src="images/producer_consumer.png">
 </div>
 
-Communication between the clients and the servers is done with a simple, high-performance, language agnostic <a href="https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol">TCP protocol</a>. We provide a Java client for Kafka, but clients are available in <a href="https://cwiki.apache.org/confluence/display/KAFKA/Clients">many languages</a>.
+Communication between the clients and the servers is done with a simple, high-performance, language agnostic <a href="https://kafka.apache.org/protocol.html">TCP protocol</a>. We provide a Java client for Kafka, but clients are available in <a href="https://cwiki.apache.org/confluence/display/KAFKA/Clients">many languages</a>.
 
 <h4><a id="intro_topics" href="#intro_topics">Topics and Logs</a></h4>
 Let's first dive into the high-level abstraction Kafka provides&mdash;the topic.

http://git-wip-us.apache.org/repos/asf/kafka-site/blob/76217f0b/0100/ops.html
----------------------------------------------------------------------
diff --git a/0100/ops.html b/0100/ops.html
index 8b1cc23..f64a701 100644
--- a/0100/ops.html
+++ b/0100/ops.html
@@ -134,7 +134,7 @@ my-group        my-topic                       1   0               0
 </pre>
 
 
-Note, however, after 0.9.0, the kafka.tools.ConsumerOffsetChecker tool is deprecated and you should use the kafka.admin.ConsumerGroupCommand (or the bin/kafka-consumer-groups.sh script) to manage consumer groups, including consumers created with the <a href="https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Client+Re-Design">new consumer-groups API</a>.
+Note, however, after 0.9.0, the kafka.tools.ConsumerOffsetChecker tool is deprecated and you should use the kafka.admin.ConsumerGroupCommand (or the bin/kafka-consumer-groups.sh script) to manage consumer groups, including consumers created with the <a href="http://kafka.apache.org/documentation.html#newconsumerapi">new consumer API</a>.
 
 <h4><a id="basic_ops_consumer_group" href="#basic_ops_consumer_group">Managing Consumer Groups</a></h4>
 
@@ -156,7 +156,7 @@ test-consumer-group            test-foo                       0          1
 </pre>
 
 
-When you're using the <a href="https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Client+Re-Design">new consumer-groups API</a> where the broker handles coordination of partition handling and rebalance, you can manage the groups with the "--new-consumer" flags:
+When you're using the <a href="http://kafka.apache.org/documentation.html#newconsumerapi">new consumer API</a> where the broker handles coordination of partition handling and rebalance, you can manage the groups with the "--new-consumer" flags:
 
 <pre>
  &gt; bin/kafka-consumer-groups.sh --new-consumer --bootstrap-server broker1:9092 --list

http://git-wip-us.apache.org/repos/asf/kafka-site/blob/76217f0b/0100/upgrade.html
----------------------------------------------------------------------
diff --git a/0100/upgrade.html b/0100/upgrade.html
index b9c4bec..486954c 100644
--- a/0100/upgrade.html
+++ b/0100/upgrade.html
@@ -80,6 +80,11 @@ work with 0.10.0.x brokers. Therefore, 0.9.0.0 clients should be upgraded to 0.9
     <li> MirrorMakerMessageHandler no longer exposes the <code>handle(record: MessageAndMetadata[Array[Byte], Array[Byte]])</code> method as it was never called. </li>
     <li> The 0.7 KafkaMigrationTool is no longer packaged with Kafka. If you need to migrate from 0.7 to 0.10.0, please migrate to 0.8 first and then follow the documented upgrade process to upgrade from 0.8 to 0.10.0. </li>
     <li> The new consumer has standardized its APIs to accept <code>java.util.Collection</code> as the sequence type for method parameters. Existing code may have to be updated to work with the 0.10.0 client library. </li>
+    <li> LZ4-compressed message handling was changed to use an interoperable framing specification (LZ4f v1.5.1).
+         To maintain compatibility with old clients, this change only applies to Message format 0.10.0 and later.
+         Clients that Produce/Fetch LZ4-compressed messages using v0/v1 (Message format 0.9.0) should continue
+         to use the 0.9.0 framing implementation. Clients that use Produce/Fetch protocols v2 or later
+         should use interoperable LZ4f framing. A list of interoperable LZ4 libraries is available at http://www.lz4.org/
 </ul>
 
 <h5><a id="upgrade_10_notable" href="#upgrade_10_notable">Notable changes in 0.10.0.0</a></h5>
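
(For client authors: compression itself is enabled the same way as before; a
sketch of an LZ4 producer with the 0.10.0 Java client, assuming a broker on
localhost:9092 and a placeholder topic name. Which framing ends up on the wire
follows the Produce/Fetch protocol version, as described above.)

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class Lz4Producer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");  // assumed broker address
            props.put("compression.type", "lz4");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("test", "key", "value"));  // "test" is a placeholder topic
            }
        }
    }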
