[ 
https://issues.apache.org/jira/browse/KAFKA-4785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018071#comment-16018071
 ] 

ASF GitHub Bot commented on KAFKA-4785:
---------------------------------------

GitHub user jeyhunkarimov opened a pull request:

    https://github.com/apache/kafka/pull/3106

    KAFKA-4785: Records from internal repartitioning topics should always use 
RecordMetadataTimestampExtractor

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jeyhunkarimov/kafka KAFKA-4785

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3106.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3106
    
----
commit 98278baae63638efc2b750b1d3f12e936d845f18
Author: Jeyhun Karimov <je.kari...@gmail.com>
Date:   2017-05-19T21:39:50Z

    Records from internal repartitioning topics should always use 
RecordMetadataTimestampExtractor

----


> Records from internal repartitioning topics should always use 
> RecordMetadataTimestampExtractor
> ----------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4785
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4785
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.2.0
>            Reporter: Matthias J. Sax
>            Assignee: Jeyhun Karimov
>
> Users can specify what timestamp extractor should be used to decode the 
> timestamp of input topic records. As long as RecordMetadataTimestamp or 
> WallclockTime is use this is fine. 
> However, for custom timestamp extractors it might be invalid to apply this 
> custom extractor to records received from internal repartitioning topics. The 
> reason is that Streams sets the current "stream time" as record metadata 
> timestamp explicitly before writing to intermediate repartitioning topics 
> because this timestamp should be use by downstream subtopologies. A custom 
> timestamp extractor might return something different breaking this assumption.
> Thus, for reading data from intermediate repartitioning topic, the configured 
> timestamp extractor should not be used, but the record's metadata timestamp 
> should be extracted as record timestamp.
> In order to leverage the same behavior for intermediate user topic (ie, used 
> in {{through()}})  we can leverage KAFKA-4144 and internally set an extractor 
> for those "intermediate sources" that returns the record's metadata timestamp 
> in order to overwrite the global extractor from {{StreamsConfig}} (ie, set 
> {{FailOnInvalidTimestampExtractor}}).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to