[PR] Add test to verify sequence name of Kafka task (druid)

via GitHub Sun, 19 Nov 2023 19:16:54 -0800


kfaraz opened a new pull request, #15397:
URL: https://github.com/apache/druid/pull/15397


   The sequence name of a streaming task is determined by hashing the following:
   - min message time
   - max message time
   - data schema
   - tuning config
   - start partition offsets
   
   Thus, even if a task fails and a replacement task is created to ingest the 
same data, the replacement is assigned the same start offsets and therefore 
uses the same sequence_name for segment allocation.
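
As a rough illustration of why the task ID does not matter, the hashing described above could be sketched as follows. This is a minimal sketch, not Druid's actual implementation: the function name `sequence_name`, the JSON serialization, the choice of SHA-1, the 15-character truncation, and the sample values are all assumptions made for the example.

```python
import hashlib
import json


def sequence_name(base_name, start_offsets, min_message_time,
                  max_message_time, data_schema, tuning_config):
    """Derive a deterministic name from the hashed inputs listed above.

    Hypothetical helper: the task ID is deliberately not a parameter,
    so it cannot influence the result.
    """
    payload = json.dumps(
        {
            # Partition ids are stringified so the keys sort consistently.
            "startOffsets": {str(p): o for p, o in start_offsets.items()},
            "minMessageTime": min_message_time,
            "maxMessageTime": max_message_time,
            "dataSchema": data_schema,
            "tuningConfig": tuning_config,
        },
        sort_keys=True,  # stable key order, so equal inputs hash equally
    )
    digest = hashlib.sha1(payload.encode("utf-8")).hexdigest()
    return f"{base_name}_{digest[:15]}"


# Two hypothetical tasks with different IDs but identical inputs.
inputs = dict(
    base_name="index_kafka_wiki",
    start_offsets={0: 100, 1: 250},
    min_message_time=None,
    max_message_time=None,
    data_schema={"dataSource": "wiki"},
    tuning_config={"maxRowsPerSegment": 5000000},
)
names_by_task_id = {
    "task_original": sequence_name(**inputs),
    "task_replacement": sequence_name(**inputs),
}
```

Both entries in `names_by_task_id` hold the same value, while changing any start offset yields a different name. That is the property the test in this PR checks for the real code.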
   
   This PR only adds a simple test to verify that the sequence name does not 
depend on the task ID.
   More tests can be added later to verify the other fields that affect the 
sequence name.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

