[ https://issues.apache.org/jira/browse/KAFKA-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451356#comment-16451356 ]

ASF GitHub Bot commented on KAFKA-6376:
---------------------------------------

guozhangwang closed pull request #4922: KAFKA-6376: Document skipped records metrics changes
URL: https://github.com/apache/kafka/pull/4922
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/docs/ops.html b/docs/ops.html
index 6ffe97653e6..450a268a2a1 100644
--- a/docs/ops.html
+++ b/docs/ops.html
@@ -1353,7 +1353,12 @@ <h5><a id="kafka_streams_thread_monitoring" href="#kafka_streams_thread_monitori
       </tr>
       <tr>
         <td>skipped-records-rate</td>
-        <td>The average number of skipped records per second. </td>
+        <td>The average number of skipped records per second.</td>
+        <td>kafka.streams:type=stream-metrics,client-id=([-.\w]+)</td>
+      </tr>
+      <tr>
+        <td>skipped-records-total</td>
+        <td>The total number of skipped records.</td>
         <td>kafka.streams:type=stream-metrics,client-id=([-.\w]+)</td>
       </tr>
  </tbody>
diff --git a/docs/streams/upgrade-guide.html b/docs/streams/upgrade-guide.html
index 7ffafb547a8..565bd0b263c 100644
--- a/docs/streams/upgrade-guide.html
+++ b/docs/streams/upgrade-guide.html
@@ -101,6 +101,37 @@ <h1>Upgrade Guide and API Changes</h1>
 
     <!-- TODO: verify release verion and update `id` and `href` attributes (also at other places that link to this headline) -->
     <h3><a id="streams_api_changes_120" href="#streams_api_changes_120">Streams API changes in 1.2.0</a></h3>
+    <p>
+        We have removed the <code>skippedDueToDeserializationError-rate</code> and <code>skippedDueToDeserializationError-total</code> metrics.
+        Deserialization errors, and all other causes of record skipping, are now accounted for in the pre-existing metrics
+        <code>skipped-records-rate</code> and <code>skipped-records-total</code>. When a record is skipped, the event is
+        now logged at WARN level. If these warnings become burdensome, we recommend explicitly filtering out unprocessable
+        records instead of depending on record skipping semantics. For more details, see
+        <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-274%3A+Kafka+Streams+Skipped+Records+Metrics">KIP-274</a>.
+        As of right now, the potential causes of skipped records are:
+    </p>
+    <ul>
+        <li><code>null</code> keys in table sources</li>
+        <li><code>null</code> keys in table-table inner/left/outer/right joins</li>
+        <li><code>null</code> keys or values in stream-table joins</li>
+        <li><code>null</code> keys or values in stream-stream joins</li>
+        <li><code>null</code> keys or values in aggregations on grouped streams</li>
+        <li><code>null</code> keys or values in reductions on grouped streams</li>
+        <li><code>null</code> keys in aggregations on windowed streams</li>
+        <li><code>null</code> keys in reductions on windowed streams</li>
+        <li><code>null</code> keys in aggregations on session-windowed streams</li>
+        <li>
+            Errors producing results, when the configured <code>default.production.exception.handler</code> decides to
+            <code>CONTINUE</code> (the default is to <code>FAIL</code> and throw an exception).
+        </li>
+        <li>
+            Errors deserializing records, when the configured <code>default.deserialization.exception.handler</code>
+            decides to <code>CONTINUE</code> (the default is to <code>FAIL</code> and throw an exception).
+            This was the case previously captured in the <code>skippedDueToDeserializationError</code> metrics.
+        </li>
+        <li>Fetched records having a negative timestamp.</li>
+    </ul>
+
     <p>
         We have added support for methods in <code>ReadOnlyWindowStore</code> which allows for querying a single window's key-value pair.
         For users who have customized window store implementations on the above interface, they'd need to update their code to implement the newly added method as well.
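
As a rough illustration of how the thread-level metrics documented in the ops.html hunk above might be inspected, the sketch below asks an MBean server for every MBean matching kafka.streams:type=stream-metrics,client-id=* and reads the skipped-records-total attribute from each. The class and method names (SkippedRecordsProbe, readSkippedTotals) are hypothetical, and the sketch assumes the MBean and attribute names are registered exactly as written in the docs; it uses only javax.management from the JDK and has not been checked against a running Streams application.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helper, not part of Kafka: reads skipped-records-total over JMX.
final class SkippedRecordsProbe {

    // Pattern form of the MBean name from docs/ops.html; '*' matches any client-id.
    // Assumption: the name is registered exactly as the documentation states.
    static final String STREAM_METRICS_PATTERN =
            "kafka.streams:type=stream-metrics,client-id=*";

    // Returns client-id -> skipped-records-total for every matching MBean.
    static Map<String, Object> readSkippedTotals(MBeanServerConnection server) {
        Map<String, Object> totals = new LinkedHashMap<>();
        try {
            ObjectName pattern = new ObjectName(STREAM_METRICS_PATTERN);
            for (ObjectName name : server.queryNames(pattern, null)) {
                Object total = server.getAttribute(name, "skipped-records-total");
                totals.put(name.getKeyProperty("client-id"), total);
            }
        } catch (JMException | IOException e) {
            // Wrap checked JMX and I/O failures so callers need no throws clause.
            throw new IllegalStateException("Could not read stream metrics", e);
        }
        return totals;
    }

    public static void main(String[] args) {
        // The platform server only sees MBeans registered in this same JVM.
        Map<String, Object> totals =
                readSkippedTotals(ManagementFactory.getPlatformMBeanServer());
        if (totals.isEmpty()) {
            System.out.println("No stream-metrics MBeans registered in this JVM.");
        }
        totals.forEach((clientId, total) ->
                System.out.println(clientId + " skipped-records-total=" + total));
    }
}
```

The method takes an MBeanServerConnection rather than the platform server directly, so the same code could be pointed at a remote application through a JMX connector. In a JVM with no Streams instance running, the query simply matches nothing.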


 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve Streams metrics for skipped records
> -------------------------------------------
>
>                 Key: KAFKA-6376
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6376
>             Project: Kafka
>          Issue Type: Improvement
>          Components: metrics, streams
>    Affects Versions: 1.0.0
>            Reporter: Matthias J. Sax
>            Assignee: John Roesler
>            Priority: Major
>              Labels: kip
>             Fix For: 1.2.0
>
>
> Copied from the KIP-210 discussion thread:
> {quote}
> Note that currently we have two metrics for `skipped-records` on different
> levels:
> 1) on the highest level, the thread-level, we have a `skipped-records`,
> that records all the skipped records due to deserialization errors.
> 2) on the lower processor-node level, we have a
> `skippedDueToDeserializationError`, that records the skipped records on
> that specific source node due to deserialization errors.
> So you can see that 1) does not cover any other scenarios and can just be
> thought of as an aggregate of 2) across all the tasks' source nodes.
> However, there are other places that can cause a record to be dropped, for
> example:
> 1) https://issues.apache.org/jira/browse/KAFKA-5784: records could be
> dropped due to window elapsed.
> 2) KIP-210: records could be dropped on the producer side.
> 3) records could be dropped during user-customized processing on errors.
> {quote}
> [~guozhang] Not sure what you mean by "3) records could be dropped during 
> user-customized processing on errors."
> Btw: we also drop records with {{null}} key and/or value for certain DSL 
> operations. This should be included as well.
> KIP: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-274%3A+Kafka+Streams+Skipped+Records+Metrics



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
