@Konrad: Great! It seems that I totally misunderstood what I saw in the 
Akka source (I only took a cursory look at it and jumped to conclusions too 
quickly).

So just to confirm my understanding: suppose I have a topic named "apple", 
then pubsub only sends the subscription data of "apple" to nodes with 
"apple" subscribers? From the source, I see that on each node, there's a 
Topic actor that takes care of local subscribers (and maybe this is what 
you refer to as co-location?) But then, a Publisher can be on any node, and 
still needs access to the list of Topic actors. So this list of Topic 
Actors must then be replicated across the cluster, right?
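
To make sure I'm describing the same thing, here is a toy model of my 
understanding. It is only a sketch, not Akka code: ToyRegistry, subscribe 
and targets are names I made up, and I'm assuming the registry is just a 
map from topic name to the set of nodes that host a Topic actor for it.

```python
from collections import defaultdict


class ToyRegistry:
    """Toy stand-in for the replicated topic registry (hypothetical, not Akka)."""

    def __init__(self):
        # topic name -> set of nodes that host a Topic actor for that topic
        self.nodes_by_topic = defaultdict(set)

    def subscribe(self, topic, node):
        # A local subscriber on `node` registers interest in `topic`.
        self.nodes_by_topic[topic].add(node)

    def targets(self, topic):
        # Nodes that a publish to `topic` would have to be forwarded to.
        return set(self.nodes_by_topic.get(topic, set()))


registry = ToyRegistry()
registry.subscribe("apple", "node-1")
registry.subscribe("apple", "node-3")
registry.subscribe("orange", "node-2")

print(sorted(registry.targets("apple")))
```

In this model a message published to "apple" from any node only goes to 
node-1 and node-3, but the publisher's node needs its own copy of the map 
to know that.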

So it seems that if the number of topics is small and the number of 
subscribers for each topic is large, then this is an efficient way of doing 
things. However, if the number of topics is large and each topic has 
only a couple of subscribers (e.g. in the chat example, one topic 
per user), then a large amount of data (i.e. the list of Topic actors) has 
to be replicated across the cluster? As a general question, is pubsub the 
right tool to deal with this kind of problem?
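
Here is a rough back-of-the-envelope version of that worry. It is a sketch 
under my own assumptions, not a measurement of Akka: I assume every node 
holds a full copy of the registry and that the registry has one entry per 
(topic, node with subscribers) pair. The function name, the 10-node cluster 
and the 100,000 users are made up for illustration.

```python
def entries_per_node(nodes_by_topic):
    # One registry entry per (topic, node hosting a subscriber) pair.
    return sum(len(nodes) for nodes in nodes_by_topic.values())


# Hypothetical 10-node cluster.
cluster = [f"node-{i}" for i in range(10)]

# Case 1: few topics, with subscribers on every node.
few_topics = {topic: set(cluster) for topic in ("apple", "orange", "pear")}

# Case 2: one topic per user, each user connected to a single node.
per_user = {f"user-{i}": {cluster[i % len(cluster)]} for i in range(100_000)}

print(entries_per_node(few_topics))             # registry size, case 1
print(entries_per_node(per_user))               # registry size, case 2
print(entries_per_node(per_user) * len(cluster))  # case 2 summed over all nodes
```

Under these assumptions the first case needs 30 entries per node, while the 
per-user case needs 100,000 entries on every node even though each topic 
has a single subscriber.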

I'm a complete newbie (I only started with Akka recently), and your answers 
have been a great help. Thanks a lot!

Alex

PS. This seems to have drifted quite far from the original question. Should I 
start a new thread?

On Monday, December 28, 2015 at 4:16:45 AM UTC-6, Konrad Malawski wrote:
>
>
> But now, there's one thing that I'm not too happy about: both distributed 
> pubsub and distributed data seem to replicate all data onto all nodes 
> instead of spreading out to the cluster. 
>
> That's not true – pubsub only sends data to those nodes that have 
> subscribed to a given topic.
>
> If all people are in all chat rooms, then yes it will be sent to all 
> nodes. But is that really the case? Some people are in `apples chat room` 
> and you can co-locate them, others are on `oranges chat room` and you can 
> co-locate them, decreasing the amount of traffic and network hits a lot. 
> With a distributed map it's a bit weird to do such things.
>
> And this was the reason why I was thinking of Hazelcast. What would a pure 
> akka implementation of this be?
>
> Exactly what you explained and I don't see a problem with it.
>
> "Let's add a distributed map" does the same thing - all nodes see the 
> updates, so what would the upside be? Downsides in terms of increasing 
> complexity I've explained.
>
>
> Note: I don't mean to say bad things about Hazelcast here – in the 
> scenario we're talking about, though, I don't see the need for it, but I do 
> see a lot of mental and complexity cost associated with introducing it. 
>
> Hope this helps!
>
> -- Konrad
>
