kkonstantine commented on a change in pull request #8839:
URL: https://github.com/apache/kafka/pull/8839#discussion_r439227501



##########
File path: docs/connect.html
##########
@@ -180,13 +182,80 @@ <h4><a id="connect_transforms" 
href="#connect_transforms">Transformations</a></h
         <li>SetSchemaMetadata - modify the schema name or version</li>
         <li>TimestampRouter - Modify the topic of a record based on original 
topic and timestamp. Useful when using a sink that needs to write to different 
tables or indexes based on timestamps</li>
         <li>RegexRouter - modify the topic of a record based on original 
topic, replacement string and a regular expression</li>
+        <li>Filter - Removes messages from all further processing. This is 
used with a <a href="#connect_predicates">predicate</a> to selectively filter 
certain messages.</li>
     </ul>
 
     <p>Details on how to configure each transformation are listed below:</p>
 
 
     <!--#include virtual="generated/connect_transforms.html" -->
 
+
+    <h5><a id="connect_predicates" 
href="#connect_predicates">Predicates</a></h5>
+
+    <p>Transformations can be configured with prediates so that the 
transformation is applied only to messages which satisfy some condition. In 
particular, when combined with the <b>Filter</b> transformation predicates can 
be used to selectively filter out certain messages.</p>

Review comment:
       ```suggestion
       <p>Transformations can be configured with predicates so that the 
transformation is applied only to messages which satisfy some condition. In 
particular, when combined with the <b>Filter</b> transformation predicates can 
be used to selectively filter out certain messages.</p>
   ```

##########
File path: docs/connect.html
##########
@@ -180,13 +182,80 @@ <h4><a id="connect_transforms" 
href="#connect_transforms">Transformations</a></h
         <li>SetSchemaMetadata - modify the schema name or version</li>
         <li>TimestampRouter - Modify the topic of a record based on original 
topic and timestamp. Useful when using a sink that needs to write to different 
tables or indexes based on timestamps</li>
         <li>RegexRouter - modify the topic of a record based on original 
topic, replacement string and a regular expression</li>
+        <li>Filter - Removes messages from all further processing. This is 
used with a <a href="#connect_predicates">predicate</a> to selectively filter 
certain messages.</li>
     </ul>
 
     <p>Details on how to configure each transformation are listed below:</p>
 
 
     <!--#include virtual="generated/connect_transforms.html" -->
 
+
+    <h5><a id="connect_predicates" 
href="#connect_predicates">Predicates</a></h5>
+
+    <p>Transformations can be configured with prediates so that the 
transformation is applied only to messages which satisfy some condition. In 
particular, when combined with the <b>Filter</b> transformation predicates can 
be used to selectively filter out certain messages.</p>
+
+    <p>Predicates are specified in the connector configuration.</p>
+
+    <ul>
+        <li><code>predicates</code> - Set of aliases for the predicates to be 
applied to some of the transformations.</li>
+        <li><code>predicates.$alias.type</code> - Fully qualified class name 
for the predicate.</li>
+        <li><code>predicates.$alias.$predicateSpecificConfig</code> - 
Configuration properties for the predicate.</li>
+    </ul>
+
+    <p>All transformations have the implicit config properties 
<code>predicate</code> and <code>negate</code>. A predicular predicate is 
associated with a transformation by setting the transformation's 
<code>predicate</code> config to the predicate's alias. The predicate's value 
can be reversed using the <code>negate</code> configuration property.</p>
+
+    <p>For example, suppose you have a source connector which produces 
messages to many different topics and you want to:</p>
+    <ul>
+        <li>filter out the messages in the 'foo' topic entirely</li>
+        <li>apply the ExtractField transformation with the field name 
'other_field' to records in all topics <i>except</i> the topic 'bar'</li>
+    </ul>
+
+    <p>To do this we need to first to filter out the records destined for the 
topic 'foo'. The Filter transformation removes records from further processing, 
and can use the TopicNameMatches predicate to apply the transformation only to 
records in topics which match a certain regular expression. TopicNameMatches's 
only configuration property is <code>pattern</code> which is a Java regular 
expression for matching against the topic name. The configuration would look 
like this:</p>
+
+    <pre class="brush: text;">
+        transforms=Filter
+        transforms.Filter.type=org.apache.kafka.connect.transforms.Filter
+        transforms.Filter.predicate=IsFoo
+
+        predicates=IsFoo
+        predicates.IsFoo.type=org.apache.kafka.connect.predicates.TopicNameMatches
+        predicates.IsFoo.pattern=foo
+    </pre>
+        
+    <p>Next we need to apply ExtractField only when the topic name of the 
record is not 'bar'. We can't just use TopicNameMatches directly, because that 
would apply the transformation to matching topic names, not topic names which 
do <i>not</i> match. The transformation's implicit <code>negate</code> config 
properties allows us to invert the set of records which a predicate matches. 
Adding the configuration for this to the previous example we arrive at:</p>
+
+    <pre class="brush: text;">
+        transforms=Filter,Extract
+        transforms.Filter.type=org.apache.kafka.connect.transforms.Filter
+        transforms.Filter.predicate=IsFoo
+
+        transforms.Extract.type=org.apache.kafka.connect.transforms.ExtractField$Key
+        transforms.Extract.field=other_field
+        transforms.Extract.predicate=IsBar
+        transforms.Extract.negate=true
+
+        predicates=IsFoo,IsBar
+        predicates.IsFoo.type=org.apache.kafka.connect.predicates.TopicNameMatches
+        predicates.IsFoo.pattern=foo
+
+        predicates.IsBar.type=org.apache.kafka.connect.predicates.TopicNameMatches
+        predicates.IsBar.pattern=bar
+    </pre>
+
+    <p>Kafka Connect includes the following predicates:</p>
+
+    <ul>
+        <li><code>TopicNameMatches</code> - matches records in a topic with a 
name matching a particular Java regular expression.</li>
+        <li><code>HasHeaderKey</code> - matches records which have a header 
with the given key.</li>
+        <li><code>RecordIsTombstone</code> - matches tombstone records, that 
is, those will a null value.</li>

Review comment:
       ```suggestion
           <li><code>RecordIsTombstone</code> - matches tombstone records, that 
is, records with a null value.</li>
   ```
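
For context on the wording above: a tombstone is a record whose value is null, regardless of its key. A minimal standalone sketch of that check follows. `TombstoneSketch`, `SimpleRecord`, and `IS_TOMBSTONE` are hypothetical names for illustration only; this is not Kafka Connect's own `RecordIsTombstone` implementation.

```java
import java.util.function.Predicate;

public class TombstoneSketch {
    // Hypothetical stand-in for a Connect record; not a Kafka class.
    static final class SimpleRecord {
        final String topic;
        final Object key;
        final Object value;

        SimpleRecord(String topic, Object key, Object value) {
            this.topic = topic;
            this.key = key;
            this.value = value;
        }
    }

    // A tombstone is a record with a null value; the key may be non-null.
    static final Predicate<SimpleRecord> IS_TOMBSTONE = r -> r.value == null;

    public static void main(String[] args) {
        SimpleRecord deleted = new SimpleRecord("foo", "id-1", null);
        SimpleRecord live = new SimpleRecord("foo", "id-1", "payload");
        System.out.println(IS_TOMBSTONE.test(deleted)); // true
        System.out.println(IS_TOMBSTONE.test(live));    // false
    }
}
```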

##########
File path: docs/connect.html
##########
@@ -180,13 +182,80 @@ <h4><a id="connect_transforms" 
href="#connect_transforms">Transformations</a></h
         <li>SetSchemaMetadata - modify the schema name or version</li>
         <li>TimestampRouter - Modify the topic of a record based on original 
topic and timestamp. Useful when using a sink that needs to write to different 
tables or indexes based on timestamps</li>
         <li>RegexRouter - modify the topic of a record based on original 
topic, replacement string and a regular expression</li>
+        <li>Filter - Removes messages from all further processing. This is 
used with a <a href="#connect_predicates">predicate</a> to selectively filter 
certain messages.</li>
     </ul>
 
     <p>Details on how to configure each transformation are listed below:</p>
 
 
     <!--#include virtual="generated/connect_transforms.html" -->
 
+
+    <h5><a id="connect_predicates" 
href="#connect_predicates">Predicates</a></h5>
+
+    <p>Transformations can be configured with prediates so that the 
transformation is applied only to messages which satisfy some condition. In 
particular, when combined with the <b>Filter</b> transformation predicates can 
be used to selectively filter out certain messages.</p>
+
+    <p>Predicates are specified in the connector configuration.</p>
+
+    <ul>
+        <li><code>predicates</code> - Set of aliases for the predicates to be 
applied to some of the transformations.</li>
+        <li><code>predicates.$alias.type</code> - Fully qualified class name 
for the predicate.</li>
+        <li><code>predicates.$alias.$predicateSpecificConfig</code> - 
Configuration properties for the predicate.</li>
+    </ul>
+
+    <p>All transformations have the implicit config properties 
<code>predicate</code> and <code>negate</code>. A predicular predicate is 
associated with a transformation by setting the transformation's 
<code>predicate</code> config to the predicate's alias. The predicate's value 
can be reversed using the <code>negate</code> configuration property.</p>
+
+    <p>For example, suppose you have a source connector which produces 
messages to many different topics and you want to:</p>
+    <ul>
+        <li>filter out the messages in the 'foo' topic entirely</li>
+        <li>apply the ExtractField transformation with the field name 
'other_field' to records in all topics <i>except</i> the topic 'bar'</li>
+    </ul>
+
+    <p>To do this we need to first to filter out the records destined for the 
topic 'foo'. The Filter transformation removes records from further processing, 
and can use the TopicNameMatches predicate to apply the transformation only to 
records in topics which match a certain regular expression. TopicNameMatches's 
only configuration property is <code>pattern</code> which is a Java regular 
expression for matching against the topic name. The configuration would look 
like this:</p>

Review comment:
       ```suggestion
       <p>To do this we need first to filter out the records destined for the 
topic 'foo'. The Filter transformation removes records from further processing, 
and can use the TopicNameMatches predicate to apply the transformation only to 
records in topics which match a certain regular expression. TopicNameMatches's 
only configuration property is <code>pattern</code> which is a Java regular 
expression for matching against the topic name. The configuration would look 
like this:</p>
   ```
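
To make the intended semantics of the `foo`/`bar` example concrete, here is a rough standalone sketch of how a topic-name predicate and the `negate` flag could combine. It assumes the pattern must match the whole topic name (as `Matcher.matches()` does), which the paragraph does not spell out. `PredicateSketch`, `topicMatches`, and `shouldApply` are hypothetical helpers, not Kafka Connect APIs.

```java
import java.util.regex.Pattern;

public class PredicateSketch {
    // Hypothetical helper: true when the whole topic name matches the regex.
    // Full-match semantics are an assumption here.
    static boolean topicMatches(String pattern, String topic) {
        return Pattern.compile(pattern).matcher(topic).matches();
    }

    // Hypothetical helper: the transformation is applied when the predicate
    // result differs from the negate flag.
    static boolean shouldApply(boolean predicateResult, boolean negate) {
        return predicateResult != negate;
    }

    public static void main(String[] args) {
        // Filter with predicate=IsFoo: applied to records in topic 'foo'.
        System.out.println(shouldApply(topicMatches("foo", "foo"), false)); // true
        // Extract with predicate=IsBar and negate=true: skipped for topic 'bar'...
        System.out.println(shouldApply(topicMatches("bar", "bar"), true));  // false
        // ...and applied to any other topic.
        System.out.println(shouldApply(topicMatches("bar", "baz"), true));  // true
    }
}
```

Under the full-match assumption, a pattern of `foo` would not match a topic named `foobar`; a pattern such as `foo.*` would be needed for that.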



