Hi Francisco,

I can see the problem. This is not about reacting to an error; it is about how we create process instances. In v7 we did not have this problem because two things prevented it:

1. A single point to create and start a process.
2. Transactions that fail if something goes wrong during the execution of the first process instance transaction.

As a result, the system never had to deal with improperly created processes: the creation of the process was tied to the execution of the first transaction, so it was not possible to end up with an unworkable process instance (at least for the first transaction).

Now this is not the case. The process instance is created independently from the start event of the process instance, which causes this sort of problem.

As I mentioned before, Error means the instance is still alive but requires some sort of human intervention, so I am not really keen to revisit this. As I said in my previous email, I am open to discussing certain aspects:

* I am open to discussing retriggering mechanisms.
* I am also open to discussing certain abort policies if certain criteria are met.

As you mentioned, "Another possible solution, rather than a timeout, is to allow automatic abortion of a workflow which suffers an error." That fits exactly my second bullet, and I am pretty open to going back to certain v7 behaviour related to how the system works when you create the process.

The idea, IMO, would revolve around this call

https://github.com/apache/incubator-kie-kogito-runtimes/blob/fed1bd142a466d6be4e75e3972729278f65cabcc/jbpm/jbpm-flow/src/main/java/org/jbpm/workflow/instance/impl/NodeInstanceImpl.java#L261

and how to implement those policies to avoid process instance creation if it fails in a new process.

On Mon, May 6, 2024 at 10:39, Francisco Javier Tirado Sarti (<[email protected]>) wrote:
>
> The reason I opened this thread is that there are some concerns from
> users about stale processes in error. Let me quote one of them:
> "I can see the system being flooded with those instances in case of wrong
> params or a system which is down that fails the flow"
> Therefore, the use case certainly exists and we need to cope with it
> somehow.
> Another possible solution, rather than a timeout, is to allow automatic
> abortion of a workflow which suffers an error.
> That makes sense for
> workflows which are unlikely to be retriggered. Maybe we can add process
> metadata to indicate a process should be aborted when an error occurs.
>
> On Fri, May 3, 2024 at 10:01 PM Enrique Gonzalez Martinez <
> [email protected]> wrote:
>
> > To be honest, the error state in a process is a bit strange. Anyway, at this
> > point it means that the process is active and stale for some reason.
> >
> > Recovery is something that historically has needed some human
> > intervention, so I don't see why you would try to clean up anything in the
> > system. The instance is alive but stale. Aborting a process should be done
> > manually. We cannot make a decision on behalf of the user.
> >
> > I am open to discussing a retriggering mechanism, but not aborting process
> > instances automatically. It does not cover a real use scenario.
> > I am also open to discussing certain abort policies if certain criteria are
> > met.
> >
> > -1 to the proposal.
> >
> >
> > On Fri, May 3, 2024, 13:11, Francisco Javier Tirado Sarti <
> > [email protected]> wrote:
> >
> > > Hi all,
> > > According to my interpretation of the engine code [1], all unexpected and
> > > unhandled errors during node execution are currently intercepted and the
> > > process state is set to Error, but the process instance remains active to
> > > allow users to update the model and retrigger process instance execution.
> > > Although this is a clever approach to allow recovery of processes that
> > > users do not want to execute again from the start (they might have failed
> > > because there was a typo in a human task), it potentially creates a large
> > > number of idle process instances that are not going to be deleted from
> > > memory/db (depending on whether persistence is configured or not; in
> > > production it will be) unless the users manually abort them. If the user
> > > does not monitor them, this policy might jeopardize the performance of
> > > the whole application.
> > > I would like to explore the possibility of setting a timeout for process
> > > instances in error (which will of course be configurable). If the process
> > > instance has not been acted upon for a reasonable amount of time, it will
> > > be automatically aborted.
> > >
> > > [1]
> > > https://github.com/apache/incubator-kie-kogito-runtimes/blob/main/jbpm/jbpm-flow/src/main/java/org/jbpm/workflow/instance/impl/NodeInstanceImpl.java#L247-L251

--
Regards,
Enrique González Martínez :)
