[GitHub] [pulsar] sijie commented on a change in pull request #4079: PIP-34 Key_Shared subscription core implementation.

GitBox Wed, 24 Apr 2019 00:48:42 -0700

sijie commented on a change in pull request #4079: PIP-34 Key_Shared 
subscription core implementation.
URL: https://github.com/apache/pulsar/pull/4079#discussion_r277993818


 ##########
 File path: 
pulsar-client-api/src/main/java/org/apache/pulsar/client/api/TypedMessageBuilder.java
 ##########
 @@ -103,6 +103,15 @@
      */
     TypedMessageBuilder<T> keyBytes(byte[] key);
 
+    /**
+     * Sets the ordering key of the message for message dispatch in {@link 
SubscriptionType#Key_Shared} mode.
+     * Partition key Will be used if ordering key not specified
+     *
+     * @param orderingKey the ordering key for the message
+     * @return the message builder instance
+     */
+    TypedMessageBuilder<T> orderingKey(byte[] orderingKey);
 
 Review comment:
   > One thing is the per-key delivery, one other is a CDC pipeline. The 2 
don't necessarily have to go together. This is for per-key delivery, then you 
bring up CDC.
   
   A CDC pipeline is one of the most typical use cases for per-key delivery. 
The bottleneck of a CDC pipeline is key ordering: you want to preserve per-key 
ordering, but you also want to scale out. That is why you need per-key 
delivery. I am not sure why you think they are not related.
   
   >  The implementation of this feature also has nothing to do with ordering, 
rather is give me messages with same keys. Ordering is a derivative property.
   
   The change includes two parts: 1) Adding an ordering key, so that people can 
choose a key other than the partition key to define ordering. That is the 
"ordering key" change. Hence the ordering key is a property of messages, not 
of the subscription. 2) Key_Shared is one type of subscription, which chooses 
which key is used for dispatching messages. There can be many other ways of 
using these two keys.
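
   A minimal sketch of how a dispatcher could pick between the two keys, 
following the javadoc in the diff above (the partition key is used when no 
ordering key is set). The class `DispatchKeySketch`, the `Msg` holder, and the 
modulo hash are hypothetical stand-ins for illustration, not the broker code 
in this PR:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DispatchKeySketch {

    /** Hypothetical message holder: a partition key plus an optional ordering key. */
    static final class Msg {
        final String partitionKey;
        final byte[] orderingKey; // null when the producer did not set one

        Msg(String partitionKey, byte[] orderingKey) {
            this.partitionKey = partitionKey;
            this.orderingKey = orderingKey;
        }
    }

    /** Returns the ordering key if present, otherwise falls back to the partition key. */
    static byte[] dispatchKey(Msg m) {
        if (m.orderingKey != null && m.orderingKey.length > 0) {
            return m.orderingKey;
        }
        return m.partitionKey.getBytes(StandardCharsets.UTF_8);
    }

    /** Assumption: a plain hash-modulo picks the consumer; a real dispatcher may differ. */
    static int selectConsumer(Msg m, int numConsumers) {
        return Math.floorMod(Arrays.hashCode(dispatchKey(m)), numConsumers);
    }
}
```

   With this shape, two messages that carry the same ordering key land on the 
same consumer even when their partition keys differ.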
   
   > In my view it's a "sub-key" because the routing is done on 2 levels, to 
partition and to consumers. 
   
   When you use the term "sub-key", it implies that you need both the 
partition key and the sub-key together to decide the ordering. In my example 
explained above, "from_user_id" and "conversation_id" might have some 
application-specific relationship, but "conversation_id" is not necessarily a 
"sub-key" of "from_user_id". Instead, "conversation_id" provides an alternative 
way of grouping and ordering conversations, compared to using "from_user_id". 
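
   To illustrate why "conversation_id" is an alternative grouping rather than 
a refinement of "from_user_id", here is a small sketch. The `GroupingSketch` 
class, the `Chat` record-like holder, and the sample values are hypothetical 
and only model the grouping, not any Pulsar API:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupingSketch {

    /** Hypothetical chat message carrying both candidate keys. */
    static final class Chat {
        final String fromUserId;
        final String conversationId;
        final String text;

        Chat(String fromUserId, String conversationId, String text) {
            this.fromUserId = fromUserId;
            this.conversationId = conversationId;
            this.text = text;
        }
    }

    /** Groups message texts, in arrival order, by conversation id or by sender id. */
    static Map<String, List<String>> groupBy(List<Chat> chats, boolean byConversation) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Chat c : chats) {
            String key = byConversation ? c.conversationId : c.fromUserId;
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(c.text);
        }
        return groups;
    }
}
```

   Grouping by conversation puts messages from different senders into one 
ordered group, which a key nested under "from_user_id" could not express.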
   
   To me it is mostly a matter of naming. I am fine with either "sub-key" or 
"ordering-key". What matters more is clearly documenting how `ordering_key` is 
used, in the javadoc and on the website.
   
   > Just to understand, where is it written that things are set in stone? If 
people are too busy to comment for a few days, one just submits a PR, his buddy 
approves and merges and that's it? Done? No one can comment on it anymore?
   > Also, as a courtesy, it would be nice to actively seek comments from people 
when introducing major features. It's not a race to get PRs merged while others 
are not looking.
   
   You are overreacting to what I said here. I just meant that the name came 
out of the discussion thread. I also didn't say it is final. It can be any 
other name if there is a better one. Throughout this conversation, what I have 
been trying to do is show you the use cases I have learned about and explain 
why I think "sub-key", "delivery-key" and "dispatch-key" are not better than 
"ordering-key". 
   
   
