Hi Sandesh,
I think it’s maybe better to keep it into Jira.

Do you mean keep the key in other Kafka topic or the key is in fact the key of 
Kafka Message which represent user tuple?
If it  is separate key, how to keep the relation between key and value?
If Key is the key of Kafka message, basically, it will change the expected 
data. As I understand, the key here is just used for recovery, it’s not the 
data user required. And the data which write to the Kafka probably need to be 
decided by the customer logic.

Think about a customer build two applications with our operator, the first 
application write data to Kafka, the second one read data from Kafka. And at 
the very beginning, the first application was implemented by a none-exactly 
once output operator, and then changed to exactly once operator. I think the 
customer don’t expect to change the second application. But the second 
application has to be changed if it’s logic depended on key.

thanks
Bright

> On May 13, 2016, at 12:37 PM, Sandesh Hegde <[email protected]> wrote:
> 
> Hi All,
> 
> I am working on Kafka 0.9 output operator and one of the requirement is to
> implement Exactly Once Output operator. Here is the one possible idea,
> please give your feedback or suggest new design.
> 
> -------------------------------------------------------------------------------------------------------------------------
> 
> Use *Key* to store meta information which is used during recovery.
> 
> Operator users will use *Value* to store their key-value pair and implement
> the Kafka partitioning accordingly.
> 
> Format of the *Key* is as specified below:
> 
> 
> 
> Key = 1. OperatorName#ApexPartitionId#WindowId#Message#MessageId ( During
> message write )
> 
>         2. OperatorName#ApexPartitionId#WindowId#CheckPoint ( During end
> Window )
> 
> During End window, checkpoint marker is written to all the Kafka partitions
> of the topic.
> 
> Every message is given a message id, counter-reset every window, and then
> written to Kafka.
> 
> During recovery, Kafka partitions are read until the last checkpoint
> message from this operator is reached and the partially written window is
> constructed.
> 
> --------------------------------------------------------------------------------
> 
> Note: Existing Kafka exactly once operator, ( Kafka 0.8 ) also needs to be
> re-written.
> 
> Thanks

Reply via email to