Thank you for the clarifications, Enrique. That sounds not only reasonable but critical when the runtime and data-index run in the same deployment unit (the same would apply to other co-located services).
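
To check my understanding of the co-located case, here is a minimal sketch of what "consistent within the deployment" could look like. Everything in it is an assumption made for illustration: `LocalTransaction`, `ColocatedDeployment` and the in-memory maps are hypothetical stand-ins, not engine or data-index APIs, and a real deployment would rely on a transaction manager rather than this toy class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a local transaction shared by co-located services. */
final class LocalTransaction {
    private final List<Runnable> stagedWrites = new ArrayList<>();

    /** Stage a write; nothing becomes visible until commit() is called. */
    void stage(Runnable write) {
        stagedWrites.add(write);
    }

    /** Apply every staged write, then clear the staging list. */
    void commit() {
        stagedWrites.forEach(Runnable::run);
        stagedWrites.clear();
    }

    /** Discard every staged write so that no store is modified. */
    void rollback() {
        stagedWrites.clear();
    }
}

/** Hypothetical deployment unit hosting both the runtime and the data index. */
final class ColocatedDeployment {
    final Map<String, String> runtimeStore = new HashMap<>();
    final Map<String, String> dataIndexStore = new HashMap<>();

    /**
     * Executes one operation. Both stores are written in the same
     * transaction, so either both see the new state or neither does.
     */
    boolean execute(String processId, String state, boolean failDuringExecution) {
        LocalTransaction tx = new LocalTransaction();
        try {
            tx.stage(() -> runtimeStore.put(processId, state));
            if (failDuringExecution) {
                throw new IllegalStateException("simulated failure during execution");
            }
            tx.stage(() -> dataIndexStore.put(processId, state));
            tx.commit();
            return true;
        } catch (RuntimeException e) {
            tx.rollback();
            return false;
        }
    }
}
```

With this sketch, a failing execution leaves both the runtime store and the data-index store untouched, which is the property I read as critical when both run in the same deployment unit.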

On Wed, Aug 7, 2024 at 1:04 PM Enrique Gonzalez Martinez <elguard...@gmail.com> wrote:

> Inline answers
>
> On Wed, Aug 7, 2024 at 18:35, Kris Verlaenen
> (<kris.verlae...@gmail.com>) wrote:
> >
> > I share the concern with Francisco that we should be careful that we
> > don't introduce a solution that would require a distributed
> > transaction to guarantee consistency, there are other approaches that
> > might be sufficient or that deliver eventual consistency that I think
> > we should allow.
> >
>
> I have never talked about distributed transactions (if by that you
> understand that we need to export transactions from the runtime to the
> data index during execution in a distributed deployment).
>
> Transaction in this context means an operation execution within the
> deployment.
> We need transactions as several writes to the database are performed
> during the same workflow execution. There is no other way to guarantee
> consistency than using transactions.
> XA (two-phase commit transactions) within the same deployment concerns
> parts of the system that might write to different databases, for
> instance runtime + data index or runtime + data audit. Hence the
> two-phase commit transactions.
>
> > I have a few questions:
> > * Could you clarify what you mean with subsystems? Because depending
> > on your architecture I guess this could be different (for example you
> > could run a job service embedded or as a separate service).
>
> Subsystems are all those services, required or optional, that add
> functionality to the engine: jobs, data index, data audit, user tasks.
>
> > One could argue that the work of another system does not have to be
> > done as part of the same transaction if the communication with that
> > other system is done in a guaranteed way? Or do we consider those not
> > a subsystem in that case?
>
> The transaction is not being exported or imported from one deployment
> to another (never talked about that).
> The only thing we want to guarantee is that an operation within the
> same deployment is consistent. If that requirement is met, all systems
> will be eventually consistent.
> E.g.:
>
> Runtime executes an operation and sends events to Kafka. This is a
> transaction within the same deployment and it is consistent in
> runtime.
> Data index consumes the event from Kafka and writes it to the storage.
> This is another transaction within the data index deployment.
>
> There are two different transactions. They are not the same
> transaction, but within each deployment they are consistent, and among
> systems they will be eventually consistent as the data index will end
> up consuming the event.
>
> > * For some external services it might not be required to be part of a
> > transaction? For example if you're just querying some information it
> > might be totally acceptable to do this REST invocation directly and
> > outside a transaction.
>
> Which system are you referring to? For now the only invocation we
> have within the engine is the job service, so REST cannot be outside
> the transaction as the engine requires information exchange (we send
> the data to create the timer and we get the job id).
> If you are talking about service tasks (WIH), the author will have to
> decide whether it is only querying or needs a compensation mechanism.
>
> > Similarly, it might be fine to send out events directly, there might
> > be other ways to compensate or ignore events later if necessary.
>
> If you try to compensate for events in BPMN you will tie the design
> of the workflow to the environment, so there won't be any point in
> the abstraction of the process itself.
> Phantom events or duplicated events are off the table in BPMN as they
> can impact the workflow execution. Serverless workflow might be more
> tolerant in those abstractions, but I guess that if a workflow
> receives a phantom signal or event and executes it, it won't be any
> good.
>
> > * It's unclear to me how this interacts with the unit of work. The
> > idea of the unit of work is to collect all the changes, and then
> > apply them all at once towards the end.
>
> It does not matter how the unit of work does things. The unit of work
> is not everything that happens in an execution (e.g. correlation
> service, messaging, jobs, index, audit...).
> The interaction is about operations invoked against the engine.
>
> > This way you could have different implementations of the unit of
> > work. Are transactions a specific implementation of the unit of work
> > where you start/commit a transaction when the unit of work is
> > started/ended? And another alternative would be we write everything
> > to a single data source, and use the outbox pattern or similar for
> > further processing?
>
> That won't work because of the limitation of the unit of work
> implementation (see my previous comment).
> Even if you have a single data source you write several times in
> several tables, so you need transactions to keep consistency.
>
> >
> > Thx,
> > Kris
> >
> >
> > On Fri, Aug 2, 2024 at 8:39 AM Enrique Gonzalez Martinez <
> > egonza...@apache.org> wrote:
> >
> > > * Transactions *
> > > This document describes how to support transactions in the domain of
> > > workflow engine and subsystems.
> > >
> > > The use case for transactions in workflows is to enable consistency
> > > during workflow executions.
> > >
> > > * Constraints *
> > >
> > > The constraints for this are related to different types of transaction
> > > problems:
> > >
> > > Workflow transaction execution should be in one single transaction
> > > (until idle elements are reached or there are no more elements to
> > > process)
> > >
> > > Process state should be consistent in storage in one single
> > > transaction. In the case of a database, multiple tables should be
> > > written in an atomic transaction
> > >
> > > Reactive code should be removed as it does not behave properly with
> > > transactions.
> > >
> > > Transactions Policy among workflow runtime and subsystems should be
> > > consistent in terms of configuration (no subcomponent should start a
> > > transaction if there is already one on the go, but they should mandate
> > > to be in a transaction)
> > >
> > > Error handling should still produce an event that can be stored.
> > >
> > > Subsystems execution should be included during transactions
> > >
> > > Async execution will spawn its own transaction.
> > >
> > > * Architecture *
> > >
> > > The architecture of the solution impacts some areas:
> > >
> > > Components with reactive code that are involved in transactions need
> > > to be refactored. So far, the only subsystem using reactive code is
> > > the job service.
> > >
> > > Process Code generation should change in order to reflect the
> > > transactions of the workflow engine
> > >
> > > Error handling should be modified in a way the error is captured
> > > outside the transaction and handled in a different one to avoid event
> > > loss.
> > >
> > > Exchange information among runtime and subsystems should be in a way
> > > that those elements are involved in a transaction or they can be
> > > rolled back. At the moment the communication is being done with a REST
> > > call that is not part of the transaction and cannot be rolled back.
> > >
> > > Events produced within the transaction should be part of the
> > > transaction as well to avoid phantom events (events produced during
> > > workflow execution that are sent at the end of the unit of work)
> > >
> > > * Risk Assessment *
> > >
> > > The risks identified for this work are the following:
> > >
> > > Error handling can be problematic depending on where we set the
> > > boundaries of the transaction. There are two different approaches:
> > >
> > > Boilerplate code for each task to start / commit / rollback the
> > > transaction and deal with errors in the REST call tier itself
> > >
> > > Use the runtime environment to install error handling for doing the
> > > operation.
> > >
> > > Exchange information among systems in a non-transactional way. There
> > > are a couple of approaches:
> > >
> > > Install a transaction sync listener every time the REST call is made
> > > against the subsystem and do a compensation when it fails
> > >
> > > Wrap the REST call in an XAResource that can be enlisted in the
> > > transaction.
> > >
> > > The use of Kafka clients for streams that do not belong to the
> > > transactions
> > >
> > > Wrap with XAResource (the Kafka client supports transactions, but
> > > does not offer an XAResource)
> > >
> > > Install a transaction sync for each transaction.
> > >
> > > Performance impact with transactions.
> > >
> > > Different transaction methods in Quarkus and Spring Boot
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@kie.apache.org
> > > For additional commands, e-mail: dev-h...@kie.apache.org
> >
>
> --
> Regards, Enrique González Martínez :)