[ 
https://issues.apache.org/jira/browse/FLINK-16319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17047172#comment-17047172
 ] 

Tzu-Li (Gordon) Tai commented on FLINK-16319:
---------------------------------------------

Hi [~danp11]! Thanks a lot for bringing this up.

Supporting a pubsub pattern in Stateful Functions has definitely been on the 
radar on brought up quite a few times in discussions.
Ideally we would be able to support this as a first-class primitive in the SDK.
I also believe that this will be a valuable addition to the project.

I took a look at your current implementation, and came up with the following 
understandings:
* Function instances can publish / subscribe to new topics dynamically at 
runtime, by sending {{Subscription}} and {{PublishMessage}} events to the 
PubSub function.
* The PubSub function then routes those messages to the correct topic partition 
(hash partitioning using subscribed / target Address, over a fixed number of 
partitions). The routing logic is currently part of the PubSub function.
* The PubSub function only has in state the topic partition to a list of its 
subscribers (although we might want to use a more efficient state primitive for 
this, rather than {{PersistedValue}}).

Overall, this looks like it is heading towards a good direction.

Before moving forward with anything else / writing or looking at code or PRs, 
I'd like to discuss the following:
* Instead of having functions send {{Subscription}} and {{PublishMessage}} 
messages, lets consider integrating that as a first-class messaging primitive 
provided via {{Context}}. i.e. consider {{Context#subscribe(TopicID)}}, 
{{Context#publish(TopicID, Message)}}. The routing to the correct topic 
partition can probably happen there instead of being handled by the PubSub 
function. In this case with the routing responsibility removed, the PubSub 
function really just becomes a {{TopicPartitionFunction}}.
* What is the current delivery guarantee of published messages / subscription 
timeliness? Do we need to / want to persist published messages in state? This 
is definitely out-of-scope for a first version of this, as it would require 
some more state primitives like a {{PersistedQueue}}, and state TTL in Stateful 
Functions, but none-the-less quite interesting to think beforehand.
* With the current approach, a function *must* be invoked first with some other 
message, before it can subscribe to a topic and receive published messages. 
Would there be cases where a function only ever wants to be invoked with 
messages from a subscribed topic (some way to "eagerly" subscribe a function 
type to a topic)? Also consider the case where if we support in the {{Router}} 
a {{publish(TopicID,  Message)}} method, how would that work?

Please let me know what you think :)
I'm quite excited about this, and would be happy to work with you on this 
feature,
but can't promise I would be super responsive on this specific feature as of 
now, since as you can see a lot of work is going on right now with the 
multi-language support in Statefun.
I think we can discuss these details asynchronously on the side, and once we 
have more capacity after the multi-language work eventually push into adding 
this.

> Pubsub-/Broadcast implementation
> --------------------------------
>
>                 Key: FLINK-16319
>                 URL: https://issues.apache.org/jira/browse/FLINK-16319
>             Project: Flink
>          Issue Type: Improvement
>          Components: Stateful Functions
>            Reporter: Dan Pettersson
>            Priority: Minor
>
> Hi everyone,
>  
> I have a use case where the id of the functions are brokerId + instrumentId 
> that receives trades and orders. The instrument has state changes (Open, 
> halted, closed etc) that almost all the functions are interested in. Some 
> functions only wants for example the Close message whereas other functions 
> wants all state changes for the specific instrument. 
>  
> I've built a statefun pubsub module that exposes two interfaces, Subscriber 
> and Publisher, with these two methods:
>  
> default void subscribe(Context context, Subscription... subscriptions)
>   
> default void publish(Context context, PublishMessage publishMessage)
>  
> Behind the interfaces is a hidden StatefulPubSubFunction that keeps track of 
> which partition the subscriber is located in and to which topic it listens to.
>  
> Code is located under 
> [https://github.com/danp11/flink-statefun/tree/master/statefun-pubsub] if 
> anyone is interested.
>  
> This code is a "classic pub sub" pattern and I think that this kind of 
> functionality would be a great addition to Stateful functions. I create this 
> Jira to see if there is an interest to discuss how a optimal 
> pubsub-/broadcast solution would look like in SF? Igal has previously 
> mentioned that Broadcast could be a good fit for this kind of flow.
> At the moment I don't know the internals of SF and-/or Flink good enough to 
> come up with a proposal myself unfortunately.
>  
> I know you are very busy at the moment (Its impressive how much you have 
> produced only the last couple of weeks!:-) but if someone, on a high level, 
> has any ideas on where and how a pub sub pattern could be implemented I'd 
> really appreciate it. In the future I hope we can come up with a proposal 
> together as I need your help here.  If you think that a pubsub-/broadcast 
> solution would make SF better that is :-)
>  
> Hope to hear your thoughts on this!
>  
> Thanks,
>  
>  /Dan



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to