sijie commented on a change in pull request #4079: PIP-34 Key_Shared 
subscription core implementation.
URL: https://github.com/apache/pulsar/pull/4079#discussion_r277964196
 
 

 ##########
 File path: 
pulsar-client-api/src/main/java/org/apache/pulsar/client/api/TypedMessageBuilder.java
 ##########
 @@ -103,6 +103,15 @@
      */
     TypedMessageBuilder<T> keyBytes(byte[] key);
 
+    /**
+     * Sets the ordering key of the message for message dispatch in {@link 
SubscriptionType#Key_Shared} mode.
+     * Partition key Will be used if ordering key not specified
+     *
+     * @param orderingKey the ordering key for the message
+     * @return the message builder instance
+     */
+    TypedMessageBuilder<T> orderingKey(byte[] orderingKey);
 
 Review comment:
   > I think that's a bit generic at this point. I don't think it's even 
possible to do CDC on Spanner. In any case I don't see why would that be 
strictly required for this feature.
   
   I only used Spanner as an example here; whether Spanner supports CDC is 
beside the point. There are many open source Spanner-like NewSQL databases, 
e.g. TiDB and YugaByte, as well as many in-house solutions. I know people are 
already working on integrations between TiDB and Pulsar, and that is where the 
ordering key shines.
   
   > As always, I think it's better to add things in the API when there is a 
concrete need, rather than speculate possible use cases that might not apply.
   
   Why do you think there is no concrete need when people propose a new PIP?
   
   
   > In this case, since the application expect messages in order by 
conversation_id, using that as the partitioning key will achieve the same 
identical behavior.
   
   Pulsar is a multi-subscription system. One subscription can be a failover 
subscription while another is a key_shared subscription. You can't force the 
application to choose conversation_id as the partition key. As I said, how 
applications use these two keys varies with their needs.
   
   > Why would you care about routing per user_id if you just care of ordering 
per partition_id?
   
   Because some subscriptions are required to consume all the events for a 
particular user_id.
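   
   To make the two answers above concrete, here is a rough sketch in plain 
Java with no Pulsar dependency. It only models the idea: the partition key 
(user_id) picks the partition, and the ordering key (conversation_id) picks the 
consumer in a key_shared subscription, falling back to the partition key when 
no ordering key is set. The class name, method names and the hash function are 
all hypothetical and are not what the broker actually does.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical model only: this is NOT the Pulsar broker or client code.
public class TwoKeyRoutingSketch {

    // Map a key onto one of n slots; floorMod keeps the result non-negative.
    // Arrays.hashCode is a stand-in for whatever hash the real system uses.
    static int slot(byte[] key, int n) {
        return Math.floorMod(Arrays.hashCode(key), n);
    }

    // Producer side: the partition key (e.g. user_id) selects the partition,
    // so a failover subscription sees all events of one user on one partition.
    static int partitionFor(String partitionKey, int numPartitions) {
        return slot(partitionKey.getBytes(StandardCharsets.UTF_8), numPartitions);
    }

    // key_shared dispatch: use the ordering key (e.g. conversation_id) when it
    // is set, otherwise fall back to the partition key.
    static int consumerFor(String partitionKey, byte[] orderingKey, int numConsumers) {
        byte[] effective = (orderingKey != null)
                ? orderingKey
                : partitionKey.getBytes(StandardCharsets.UTF_8);
        return slot(effective, numConsumers);
    }

    public static void main(String[] args) {
        String userId = "alice";
        String[] conversations = {"conversation-42", "conversation-43"};
        for (String conversationId : conversations) {
            byte[] orderingKey = conversationId.getBytes(StandardCharsets.UTF_8);
            // Same user, so the partition is the same for every conversation;
            // the key_shared consumer depends only on the conversation.
            System.out.println(userId + " / " + conversationId
                    + " -> partition " + partitionFor(userId, 4)
                    + ", key_shared consumer " + consumerFor(userId, orderingKey, 3));
        }
    }
}
```

   In this model the partition is decided by user_id alone, while the 
key_shared consumer is decided by conversation_id alone, so each subscription 
gets the grouping it needs from the same messages.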
   
   > Finally, as mentioned above I think that "ordering-key" is a very 
misleading name. It really would be a "sub-key", "delivery-key", "dispatch-key" 
or other name.
   
   I agree that "ordering" can mean different things in different contexts: 
publish order, log order, consumer order, dispatch order, or key order. However, 
I don't think "sub-key", "delivery-key" or "dispatch-key" is a better name than 
"ordering key". In some cases the ordering key is a "sub-key", but in other 
cases it can be a completely different key. The same applies to "delivery-key" 
and "dispatch-key".
   
   IMO "ordering key" is not a bad name. People already have a general idea of 
what it means, and they generally understand what a partition key and an 
ordering key are. Applications can choose how to use them to fit their use 
cases.
   
   However, I don't feel strongly about the name itself. We could have called 
it something else if a better name had come up in the PIP discussion email 
thread.
