Re: [akka-user] Akka Persistence with Multi-Clusters

'Martin Krasser' via Akka User List Sun, 31 Jan 2016 21:13:34 -0800


On 31.01.16 17:48, Roland Kuhn wrote:

Hi Martin!
On 31 Jan 2016, at 15:36, 'Martin Krasser' via Akka User List <[email protected]> wrote:
Hi Roland,

a few clarifications/additions inline ...

On 31.01.16 10:17, Roland Kuhn wrote:
Hi Paul,
Unfortunately it is impossible to make this “just work”, not least because you would first have to define what that means. Volker mentioned Eventuate as a possible solution, but this also is not something that “just works”: it requires your events to be structured such that your defined state update functions have all the right properties to make it work.
The features and the guarantees provided by Eventuate actually do *not* depend on the structure of events. Eventuate provides means to distinguish causally related from concurrent events, and it is just the responsibility of the application to ensure that derived state does not depend on the order of concurrent events.
This is the crux of the matter: ensuring that events are commutative is not possible in the general case. I am in complete agreement as to why ACID (Associative, Commutative, Idempotent, Distributed) makes sense, but we should not gloss over the fact that these systems are not trivially equivalent to strongly consistent systems: it will usually take quite a bit of thinking and possibly also adjustments to the business requirements to make this work. The result will often be desirable for many reasons, but saying that it “just works” is not helpful in my opinion; I’m not saying that you implied that, I just want to be very explicit about this part.

And so are the Eventuate docs :-) Volker didn't imply that it "just works" either; he referred to Eventuate as one option for dealing with issues that are inevitable when choosing AP over CP. On the other hand, these issues are often rather easy to manage, provided an API offers the right abstractions for it. That's at least the experience I gained from my projects in 2015.
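The requirement discussed above, that derived state must not depend on the order of concurrent events, can be checked on small examples. The following is only an illustrative sketch: `apply_all`, `order_independent`, `add_tag` and `set_title` are names invented here and are not part of Eventuate or Akka. It applies the same events in every possible delivery order and compares the resulting states.

```python
from itertools import permutations


def apply_all(update, initial, events):
    """Fold the events into a state using the given update function."""
    state = initial
    for event in events:
        state = update(state, event)
    return state


def order_independent(update, initial, events):
    """True if every delivery order of the events yields the same state."""
    results = [apply_all(update, initial, order) for order in permutations(events)]
    return all(r == results[0] for r in results)


# Adding tags to a set commutes: any order gives the same set.
def add_tag(tags, tag):
    return tags | {tag}


# Overwriting a title does not commute: the last event delivered wins.
def set_title(_title, new_title):
    return new_title


tags_ok = order_independent(add_tag, frozenset(), ["akka", "eventuate"])
title_ok = order_independent(set_title, "", ["Draft", "Final"])
print(tags_ok, title_ok)
```

The second case is the kind of update that needs either coordination or explicit conflict resolution when the two events are concurrent.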

Causally related events are always delivered in the same causal order at all locations (datacenters, for example). Relaxing strict order to causal order is what gives you availability and partition tolerance in multi-datacenter setups.
Imagine there being two copies of yourself running around and doing things: it would not be enough for one to tell the other what it has done, because real conflicts can arise from these independent actions (like one of your selves telling your wife that you love her and shortly thereafter, without having caught up to that point yet, the other filing for divorce). The key here is coordination; without it certain actions cannot be taken. And coordination can be impossible, e.g. due to network partitions, which means that you’ll have to decide whether to be cautious or reckless.
Coordination is just an option. Only if you want to *prevent* conflicts (you called it being "cautious") do you need coordination. Here you give up availability in favor of consistency. The other option is to *allow* conflicts and resolve them later. This does not require coordination but rather means to track, detect and resolve conflicts. With this option, replicas at different datacenters remain writeable even if they are partitioned from others (you called that being "reckless"). However, you "apologize" for being "reckless" with conflict resolution :-) A great introductory read on this is Pat Helland's paper Building on Quicksand <http://db.cs.berkeley.edu/cs286/papers/quicksand-cidr2009.pdf>. With that second option, you choose availability over (strong) consistency.
Yes, you’re quoting from my recent presentations on the matter :-)
The second option is not only relevant for multi-datacenter replication but also more generally for collaboration between (micro)services where you usually don't want to couple the availability of one service to that of others. The price for that is that applications must be able to deal with conflicts, rather than trying to prevent them. A consistent approach for that is often missing in distributed applications, and the problem is often solved with error-prone ad-hoc solutions. Eventuate tries to change that by not only providing APIs to track, detect, and resolve conflicts in an automated or interactive way but also by providing an infrastructure where distributed services can communicate via events in a causally consistent and reliable way. The underlying event logs give you the following guarantees:
- the order of events in a local event log is consistent with causal order, i.e. consumers will never see an effect before its cause.
- event replication across different locations is reliable and idempotent, i.e. consumers will never see duplicates when reading from a local log.
These guarantees are certainly valuable, but I am hesitant to declare victory just yet:
  * Tracking causality in general requires O(n^2) data size where n is
    the total number of participants in a causality chain. This means
    that it works reasonably well for small conversation groups or
    ephemeral interactions, but fails in practice for large networks
    of interacting agents.

That's definitely not the case. For tracking potential causality, O(n) data size is required, where each participant has an entry in a vector clock <http://rbmhtechnology.github.io/eventuate/architecture.html#vector-clocks>. This can be further reduced by using plausible clocks, where participants that share an event log at a given location may also share a clock entry. See #68 <https://github.com/RBMHTechnology/eventuate/issues/68> and #103 <https://github.com/RBMHTechnology/eventuate/issues/103> for details.
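To illustrate the O(n) bookkeeping mentioned above, here is a minimal vector clock sketch. It is not Eventuate's implementation; the names `VectorClock`, `tick`, `merge` and `relation`, as well as the participant ids, are made up for this example. Each participant owns one entry, and comparing two clocks tells causally related events from concurrent ones.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class VectorClock:
    """One counter per participant, so size grows linearly with participants."""
    entries: Dict[str, int] = field(default_factory=dict)

    def tick(self, participant: str) -> "VectorClock":
        # A participant increments only its own entry when it emits an event.
        updated = dict(self.entries)
        updated[participant] = updated.get(participant, 0) + 1
        return VectorClock(updated)

    def merge(self, other: "VectorClock") -> "VectorClock":
        # Entry-wise maximum: the result covers everything both clocks have seen.
        keys = self.entries.keys() | other.entries.keys()
        return VectorClock(
            {k: max(self.entries.get(k, 0), other.entries.get(k, 0)) for k in keys}
        )

    def relation(self, other: "VectorClock") -> str:
        # Missing entries count as zero.
        keys = self.entries.keys() | other.entries.keys()
        le = all(self.entries.get(k, 0) <= other.entries.get(k, 0) for k in keys)
        ge = all(self.entries.get(k, 0) >= other.entries.get(k, 0) for k in keys)
        if le and ge:
            return "equal"
        if le:
            return "before"
        if ge:
            return "after"
        return "concurrent"


a1 = VectorClock().tick("dc-a")   # event written in datacenter A
b1 = VectorClock().tick("dc-b")   # independent event written in datacenter B
a2 = a1.merge(b1).tick("dc-a")    # A has seen b1, then writes again

print(a1.relation(b1))  # concurrent
print(a1.relation(a2))  # before
```

Only the "concurrent" case needs conflict handling; "before" and "after" pairs can simply be applied in causal order.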

  * Having causality alone does not allow your agents to be programmed
    such that they can keep even simple invariants like “do not allow
    the creation of more than 500 blog entries”—if two concurrent
    histories both create the 500th post then conflict resolution
    becomes a hassle.

Right, global invariants require coordination, which limits availability. However, this is not in conflict with an AP-by-default approach, as coordination can be added on top where needed.
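The blog-entry example can be made concrete with a small simulation. This sketch assumes two replicas that each check the invariant against their local state only; `Replica`, `create_entry` and `replicate_from` are hypothetical names and do not come from Akka or Eventuate.

```python
MAX_ENTRIES = 500  # the invariant from the example above


class Replica:
    """A writable replica that checks the invariant against local state only."""

    def __init__(self, name, entries):
        self.name = name
        self.entries = set(entries)  # private copy of the known entries

    def create_entry(self, entry_id):
        # Local check: passes whenever this replica knows fewer than 500 entries.
        if len(self.entries) >= MAX_ENTRIES:
            return False
        self.entries.add(entry_id)
        return True

    def replicate_from(self, other):
        # Set union is commutative and idempotent, so the replicas converge.
        self.entries |= other.entries


shared = {f"post-{i}" for i in range(499)}
dc_a = Replica("dc-a", shared)
dc_b = Replica("dc-b", shared)

# During a partition both replicas accept what each believes is the 500th entry.
accepted_a = dc_a.create_entry("post-from-a")
accepted_b = dc_b.create_entry("post-from-b")

# After the partition heals the replicas exchange state.
dc_a.replicate_from(dc_b)
dc_b.replicate_from(dc_a)

# The converged state now holds 501 entries and violates the global invariant.
print(accepted_a, accepted_b, len(dc_a.entries))
```

Both replicas end up in the same state, so causality and convergence are intact; what is lost is the global limit. Keeping it would require coordinating before accepting the write, or compensating afterwards.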

The latter point is what users of this abstraction need to consider when coming from a strongly consistent RDBMS background: it fundamentally changes the world view. The apologies cannot in all cases be contained within the system; business processes involving human personnel may need to be created to deal with the fallout. This is of course very desirable, because otherwise these processes would need to be improvised when the hair is on fire (a.k.a. the network is split), but it should be mentioned both as a cost and a benefit when introducing such a system.
When it comes to interactions spanning multiple microservices I tend to favor the Saga approach: creating an external entity that manages the causal relationship between the changes effected at different services and the failure handling (i.e. apologies) allows the compression of the required data for causality tracking (because it is no longer generic but tailored to the use case) as well as the colocation of action and compensating action in a single unit. As a bonus these units are highly approachable for non-programmers as well because they directly correspond to business processes.
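A rough sketch of the Saga idea described above: each step pairs an action with its compensating action, and a coordinator undoes the completed steps in reverse order when a later step fails. The names `Step` and `run_saga` and the booking steps are hypothetical; a real Saga would also persist its progress so that it can resume after a crash.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], None]        # the change effected at one service
    compensation: Callable[[], None]  # the "apology" that undoes it


def run_saga(steps: List[Step]) -> bool:
    """Run steps in order; on failure compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            for done in reversed(completed):
                done.compensation()
            return False
    return True


log: List[str] = []


def fail_payment() -> None:
    # Simulates the second service being unreachable.
    raise RuntimeError("payment service unavailable")


saga = [
    Step("reserve", lambda: log.append("reserve seat"), lambda: log.append("release seat")),
    Step("charge", fail_payment, lambda: log.append("refund")),
]
ok = run_saga(saga)
print(ok, log)
```

Action and compensation live side by side in one unit, which is the colocation referred to above.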

I don't want to enumerate all the advantages and disadvantages of central coordination vs. federated collaboration here, but central coordination may be in conflict with availability requirements. Also, central coordination is often the entry point to monolithic solutions :-)

Compare this to plain at-least-once based messaging between persistent actors or services (in different datacenters, for example) where you cannot make assumptions about message ordering and duplicates. It usually makes writing correct business logic much harder. Furthermore, if you want to decouple a service from the availability of others you are again faced with the problem of detecting and resolving conflicts. I should mention here that rejecting commands != being available :-). As soon as distributed services/applications are to become more resilient to network partitions, you'll have to deal with many of the issues that Eventuate is already addressing anyway.
Eventuate has meanwhile evolved from a proof of concept in early 2015 into a production-ready toolkit for building (globally) distributed, service-oriented CQRS/ES applications. As Eventuate is also almost a functional superset of akka-persistence, I wonder if combining efforts would make sense. Please let me know if this sounds interesting to you.
I am very much interested in attacking this problem; we need to give users the tools that reduce their responsibility to the essence of their business problem. I must confess that I have not yet studied the Eventuate source code; all statements above are based on my incomplete understanding of the problem domain and some very interesting conversations at ECOOP last year. On this basis my current impression is that while we hold some promising pieces in our hands we have not yet found the golden hammer, if it exists at all. Perhaps it is time to plan a get-together to assemble the pieces we know now and explore the landscape that emerges?


Good idea. Let's start that initiative.

Regards,

Roland
Regards,
Martin
So, you can use Akka Persistence with the same store in different locations, but you’ll have to make sure that you don’t emit events to the same log from different places: there can only be one running source of truth for each persistenceId at any given time.
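Why two writers for the same persistenceId get in each other's way can be shown with a toy journal. This is a simplified model, not Akka Persistence code: it assumes that each writer reads the highest sequence number once at recovery and then counts locally, and that the store rejects a (persistence id, sequence number) pair that already exists. Real journal plugins differ in how, or whether, they detect this. `Journal` and `Writer` are hypothetical names.

```python
class Journal:
    """Toy event store keyed by (persistence_id, sequence_nr)."""

    def __init__(self):
        self.events = {}

    def highest_sequence_nr(self, persistence_id):
        return max(
            (nr for (pid, nr) in self.events if pid == persistence_id), default=0
        )

    def append(self, persistence_id, sequence_nr, event):
        key = (persistence_id, sequence_nr)
        if key in self.events:
            raise ValueError(
                f"duplicate sequence number {sequence_nr} for {persistence_id}"
            )
        self.events[key] = event


class Writer:
    """Stands in for one running incarnation of a persistent entity."""

    def __init__(self, journal, persistence_id):
        self.journal = journal
        self.persistence_id = persistence_id
        # The counter is read once at recovery and then advanced locally.
        self.sequence_nr = journal.highest_sequence_nr(persistence_id)

    def persist(self, event):
        self.sequence_nr += 1
        self.journal.append(self.persistence_id, self.sequence_nr, event)


journal = Journal()
writer_a = Writer(journal, "account-1")  # started in datacenter A
writer_b = Writer(journal, "account-1")  # started in datacenter B, same id

writer_a.persist("deposited 10")
try:
    writer_b.persist("withdrew 5")       # also tries sequence number 1
    collided = False
except ValueError:
    collided = True
print(collided)
```

A store without such a uniqueness check would not fail here but would end up with overwritten or interleaved events for the same id, which is harder to notice.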
Regards,

Roland
On 22 Jan 2016, at 14:52, Paul Cleary <[email protected]> wrote:
Will Akka Persistence work if you have two different clusters pointing to the same data store?
Imagine I have 2 data centers that point at the same database.
If I have updates happening in both data centers at the same time, will akka persistence stomp all over the journal / snapshots?
I know that akka persistence has sequence numbers, and I am not sure how those are managed.
It would be great if this just worked, but I am thinking I need to implement my own persistence plugin and / or persistence layer in my app to make sure that there are no collisions.
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.
*Dr. Roland Kuhn*
/Akka Tech Lead/
Typesafe <http://typesafe.com/> – Reactive apps on the JVM.
twitter: @rolandkuhn
<http://twitter.com/#%21/rolandkuhn>

--
Martin Krasser

blog:    http://krasserm.github.io
code:    http://github.com/krasserm
twitter: http://twitter.com/mrt1nz




