Repository: kafka
Updated Branches:
  refs/heads/trunk bd0146d98 -> 398714b75


KAFKA-5839: Upgrade Guide doc changes for KIP-130

Author: Florian Hussonnois <[email protected]>

Reviewers: Matthias J. Sax <[email protected]>, Guozhang Wang 
<[email protected]>

Closes #3811 from fhussonnois/KAFKA-5839

minor fixes


Project: http://git-wip-us.apache.org/repos/asf/kafka/repo
Commit: http://git-wip-us.apache.org/repos/asf/kafka/commit/398714b7
Tree: http://git-wip-us.apache.org/repos/asf/kafka/tree/398714b7
Diff: http://git-wip-us.apache.org/repos/asf/kafka/diff/398714b7

Branch: refs/heads/trunk
Commit: 398714b75888fca00319493c38c8ead2a02fd7cc
Parents: bd0146d
Author: Florian Hussonnois <[email protected]>
Authored: Wed Sep 20 16:39:49 2017 +0800
Committer: Guozhang Wang <[email protected]>
Committed: Wed Sep 20 16:50:08 2017 +0800

----------------------------------------------------------------------
 docs/streams/developer-guide.html | 10 +++++
 docs/streams/upgrade-guide.html   | 70 ++++++++++++++++++++++++----------
 2 files changed, 59 insertions(+), 21 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kafka/blob/398714b7/docs/streams/developer-guide.html
----------------------------------------------------------------------
diff --git a/docs/streams/developer-guide.html 
b/docs/streams/developer-guide.html
index ab5a823..3368757 100644
--- a/docs/streams/developer-guide.html
+++ b/docs/streams/developer-guide.html
@@ -2905,6 +2905,16 @@ Note that in the <code>WordCountProcessor</code> 
implementation, users need to r
     </pre>
 
     <p>
+        To retrieve information about the locally running threads, you can use the <code>localThreadsMetadata()</code> method after you start the application.
+    </p>
+
+    <pre class="brush: java;">
+    // For instance, use this method to print/monitor the partitions assigned to each local task.
+    Set&lt;ThreadMetadata&gt; threads = streams.localThreadsMetadata();
+    ...
+    </pre>
+
+    <p>
         To stop the application instance call the <code>close()</code> method:
     </p>
 
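For illustration, here is a sketch of the kind of monitoring loop the new method enables. A live <code>KafkaStreams</code> instance (and hence real <code>ThreadMetadata</code> objects) cannot run here, so the actual API calls appear only in comments, and a simple stand-in formatting helper is used; the method names <code>threadName()</code>, <code>activeTasks()</code>, <code>taskId()</code>, and <code>topicPartitions()</code> reflect the 1.0 API as this editor understands it.

```java
public class ThreadMetadataSketch {
    // With a running application, the loop would look roughly like:
    //   for (ThreadMetadata thread : streams.localThreadsMetadata()) {
    //       for (TaskMetadata task : thread.activeTasks()) {
    //           System.out.println(describeTask(thread.threadName(),
    //                                           task.taskId(),
    //                                           task.topicPartitions().toString()));
    //       }
    //   }

    // Render one "thread: task=partitions" line for logging or monitoring.
    static String describeTask(String threadName, String taskId, String partitions) {
        return threadName + ": " + taskId + "=" + partitions;
    }

    public static void main(String[] args) {
        System.out.println(describeTask("StreamThread-1", "0_0", "[input-topic-0]"));
    }
}
```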

http://git-wip-us.apache.org/repos/asf/kafka/blob/398714b7/docs/streams/upgrade-guide.html
----------------------------------------------------------------------
diff --git a/docs/streams/upgrade-guide.html b/docs/streams/upgrade-guide.html
index 96c5941..c4ae54f 100644
--- a/docs/streams/upgrade-guide.html
+++ b/docs/streams/upgrade-guide.html
@@ -49,7 +49,7 @@
 
     <p>
         With 1.0 a major API refactoring was accomplished and the new API is 
cleaner and easier to use.
-        This change includes the five main classes <code>KafkaStreams<code>, 
<code>KStreamBuilder</code>,
+        This change includes the five main classes <code>KafkaStreams</code>, 
<code>KStreamBuilder</code>,
         <code>KStream</code>, <code>KTable</code>, and 
<code>TopologyBuilder</code> (and some more others).
         All changes are fully backward compatible as old API is only 
deprecated but not removed.
         We recommend to move to the new API as soon as you can.
@@ -59,7 +59,7 @@
     <p>
         The two main classes to specify a topology via the DSL 
(<code>KStreamBuilder</code>)
         or the Processor API (<code>TopologyBuilder</code>) were deprecated 
and replaced by
-        <code>StreamsBuilder</code> and <code>Topology<code> (both new classes 
are located in
+        <code>StreamsBuilder</code> and <code>Topology</code> (both new 
classes are located in
         package <code>org.apache.kafka.streams</code>).
         Note, that <code>StreamsBuilder</code> does not extend 
<code>Topology</code>, i.e.,
         the class hierarchy is different now.
@@ -74,7 +74,7 @@
     </p>
 
     <p>
-        Changing how a topology is specified also affects 
<code>KafkaStreams<code> constructors,
+        Changing how a topology is specified also affects 
<code>KafkaStreams</code> constructors,
         that now only accept a <code>Topology</code>.
         Using the DSL builder class <code>StreamsBuilder</code> one can get 
the constructed
         <code>Topology</code> via <code>StreamsBuilder#build()</code>.
@@ -86,33 +86,61 @@
     </p>
 
     <p>
-        With the introduction of <a 
href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+use+of+custom+storage+engines">KIP-182</a>
-        you should no longer pass in <code>Serde</code> to 
<code>KStream#print</code> operations.
-        If you can't rely on using <code>toString</code> to print your keys an 
values, you should instead you provide a custom <code>KeyValueMapper</code> via 
the <code>Printed#withKeyValueMapper</code> call.
+        New methods in <code>KafkaStreams</code>:
     </p>
-
+    <ul>
+        <li> retrieve the current runtime information about the local threads 
via <code>#localThreadsMetadata()</code> </li>
+    </ul>
+    <p>
+        Deprecated methods in <code>KafkaStreams</code>:
+    </p>
+    <ul>
+        <li><code>toString()</code></li>
+        <li><code>toString(final String indent)</code></li>
+    </ul>
     <p>
-        Windowed aggregations have moved from <code>KGroupedStream</code> to 
<code>WindowedKStream</code>.
-        You can now perform a windowed aggregation by, for example, using 
<code>KGroupedStream#windowedBy(Windows)#reduce(Reducer)</code>.
-        Note: the previous aggregate functions on <code>KGroupedStream</code> 
still work, but have been deprecated.
+        Previously the above methods were used to return static and runtime 
information.
+        They have been deprecated in favor of using the new classes/methods 
<code>#localThreadsMetadata()</code> / <code>ThreadMetadata</code> (returning 
runtime information) and
+        <code>TopologyDescription</code> / <code>Topology#describe()</code> 
(returning static information).
     </p>
 
     <p>
-        The Processor API was extended to allow users to schedule 
<code>punctuate</code> functions either based on data-driven <b>stream time</b> 
or wall-clock time.
-        As a result, the original <code>ProcessorContext#schedule</code> is 
deprecated with a new overloaded function that accepts a user customizable 
<code>Punctuator</code> callback interface, which triggers its 
<code>punctuate</code> API method periodically based on the 
<code>PunctuationType</code>.
-        The <code>PunctuationType</code> determines what notion of time is 
used for the punctuation scheduling: either <a 
href="/{{version}}/documentation/streams/core-concepts#streams_time">stream 
time</a> or wall-clock time (by default, <b>stream time</b> is configured to 
represent event time via <code>TimestampExtractor</code>).
-        In addition, the <code>punctuate</code> function inside 
<code>Processor</code> is also deprecated.
+        More deprecated methods in <code>KafkaStreams</code>:
     </p>
+    <ul>
+        <li>With the introduction of <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-182%3A+Reduce+Streams+DSL+overloads+and+allow+easier+use+of+custom+storage+engines">KIP-182</a>
+            you should no longer pass in <code>Serde</code> to <code>KStream#print</code> operations.
+            If you can't rely on using <code>toString</code> to print your keys and values, you should instead provide a custom <code>KeyValueMapper</code> via the <code>Printed#withKeyValueMapper</code> call.
+        </li>
+        <li>
+            Windowed aggregations have moved from <code>KGroupedStream</code> 
to <code>WindowedKStream</code>.
+            You can now perform a windowed aggregation by, for example, using 
<code>KGroupedStream#windowedBy(Windows)#reduce(Reducer)</code>.
+            Note: the previous aggregate functions on 
<code>KGroupedStream</code> still work, but have been deprecated.
+        </li>
+    </ul>
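The DSL calls themselves need a running topology, but the semantics of a tumbling-window reduce can be sketched in plain Java: each record falls into the window starting at <code>timestamp - (timestamp % windowSize)</code>, and records in the same window are combined with the reducer. This mirrors, per key, what <code>KGroupedStream#windowedBy(TimeWindows.of(size)).reduce(reducer)</code> computes; the helper below is a stand-in for illustration only, not the Kafka Streams implementation.

```java
import java.util.*;
import java.util.function.BinaryOperator;

public class WindowedReduceSketch {
    // Assign each {timestampMs, value} record to a tumbling window of the given
    // size and fold values within the same window using the reducer, producing
    // a map of window start -> aggregated value.
    static SortedMap<Long, Long> tumblingReduce(long windowSizeMs,
                                                List<long[]> records,
                                                BinaryOperator<Long> reducer) {
        SortedMap<Long, Long> result = new TreeMap<>();
        for (long[] rec : records) {
            long windowStart = rec[0] - (rec[0] % windowSizeMs);
            result.merge(windowStart, rec[1], reducer);
        }
        return result;
    }

    public static void main(String[] args) {
        List<long[]> records = Arrays.asList(
            new long[]{5, 1}, new long[]{40, 2},   // both in window [0, 60)
            new long[]{70, 3});                    // window [60, 120)
        // window 0 holds 1+2=3, window 60 holds 3
        System.out.println(tumblingReduce(60, records, Long::sum));
    }
}
```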
 
     <p>
-        Before this, users could only schedule based on stream time (i.e. 
<code>PunctuationType.STREAM_TIME</code>) and hence the <code>punctuate</code> 
function was data-driven only because stream time is determined (and advanced 
forward) by the timestamps derived from the input data.
-        If there is no data arriving at the processor, the stream time would 
not advance and hence punctuation will not be triggered.
-        On the other hand, When wall-clock time (i.e. 
<code>PunctuationType.WALL_CLOCK_TIME</code>) is used, <code>punctuate</code> 
will be triggered purely based on wall-clock time.
-        So for example if the <code>Punctuator</code> function is scheduled 
based on <code>PunctuationType.WALL_CLOCK_TIME</code>, if these 60 records were 
processed within 20 seconds,
-        <code>punctuate</code> would be called 2 times (one time every 10 
seconds);
-        if these 60 records were processed within 5 seconds, then no 
<code>punctuate</code> would be called at all.
-        Users can schedule multiple <code>Punctuator</code> callbacks with 
different <code>PunctuationType</code>s within the same processor by simply 
calling <code>ProcessorContext#schedule</code> multiple times inside 
processor's <code>init()</code> method.
+        Modified methods in <code>Processor</code>:
     </p>
+    <ul>
+        <li>
+            <p>
+                The Processor API was extended to allow users to schedule 
<code>punctuate</code> functions either based on data-driven <b>stream time</b> 
or wall-clock time.
+                As a result, the original 
<code>ProcessorContext#schedule</code> is deprecated with a new overloaded 
function that accepts a user customizable <code>Punctuator</code> callback 
interface, which triggers its <code>punctuate</code> API method periodically 
based on the <code>PunctuationType</code>.
+                The <code>PunctuationType</code> determines what notion of 
time is used for the punctuation scheduling: either <a 
href="/{{version}}/documentation/streams/core-concepts#streams_time">stream 
time</a> or wall-clock time (by default, <b>stream time</b> is configured to 
represent event time via <code>TimestampExtractor</code>).
+                In addition, the <code>punctuate</code> function inside 
<code>Processor</code> is also deprecated.
+            </p>
+            <p>
+                Before this, users could only schedule based on stream time 
(i.e. <code>PunctuationType.STREAM_TIME</code>) and hence the 
<code>punctuate</code> function was data-driven only because stream time is 
determined (and advanced forward) by the timestamps derived from the input data.
+                If there is no data arriving at the processor, the stream time 
would not advance and hence punctuation will not be triggered.
+                On the other hand, when wall-clock time (i.e. <code>PunctuationType.WALL_CLOCK_TIME</code>) is used, <code>punctuate</code> will be triggered purely based on wall-clock time.
+                So, for example, if the <code>Punctuator</code> function is scheduled based on <code>PunctuationType.WALL_CLOCK_TIME</code> and these 60 records were processed within 20 seconds,
+                <code>punctuate</code> would be called 2 times (one time every 10 seconds);
+                if these 60 records were processed within 5 seconds, then no <code>punctuate</code> would be called at all.
+                Users can schedule multiple <code>Punctuator</code> callbacks with different <code>PunctuationType</code>s within the same processor by simply calling <code>ProcessorContext#schedule</code> multiple times inside the processor's <code>init()</code> method.
+            </p>
+        </li>
+    </ul>
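The wall-clock arithmetic in that example is easy to check: with a 10-second interval, <code>punctuate</code> fires floor(elapsed / interval) times regardless of how many records arrived. A minimal Kafka-free sketch, using the interval and elapsed times from the text above:

```java
public class PunctuationMath {
    // Number of wall-clock punctuations fired over 'elapsedMs' when a
    // Punctuator is scheduled every 'intervalMs' (PunctuationType.WALL_CLOCK_TIME):
    // the record count is irrelevant, only elapsed wall-clock time matters.
    static long wallClockPunctuations(long elapsedMs, long intervalMs) {
        return elapsedMs / intervalMs;
    }

    public static void main(String[] args) {
        // 60 records processed within 20 seconds, 10-second interval -> 2 calls
        System.out.println(wallClockPunctuations(20_000, 10_000)); // prints 2
        // the same 60 records within 5 seconds -> 0 calls
        System.out.println(wallClockPunctuations(5_000, 10_000));  // prints 0
    }
}
```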
 
     <p>
         If you are monitoring on task level or processor-node / state store 
level Streams metrics, please note that the metrics sensor name and hierarchy 
was changed:
