If data index was supposed to provide a snapshot view of the process instance… why do we keep it after the process instance is finished?
On Mon, Feb 19, 2024 at 7:12 AM Francisco Javier Tirado Sarti <ftira...@redhat.com> wrote:

> Hi Martin.
> After taking a deeper look at this, I realize that the behaviour is the expected one.
> Runtimes DB does not track the completed process instance (that's what the JDBCProcessInstances warning is telling us), but DataIndex, as expected, is tracking it in the processes and nodes tables. And yes, it will grow over time.
> What we need is some configurable purge mechanism for DataIndex, so it eventually removes older completed process instances.
>
> On Tue, Feb 13, 2024 at 12:59 PM Francisco Javier Tirado Sarti <ftira...@redhat.com> wrote:
>
> > Hi Martin,
> > Good catch! Looks like the skipping performed for process instances is not applied to node instances. Something we definitely need to review on the runtimes side.
> >
> > On Mon, Feb 12, 2024 at 11:59 PM Martin Weiler <mwei...@ibm.com.invalid> wrote:
> >
> >> On a somewhat related note, testing a simple workflow (start -> script node -> end), I see the following messages in the logs:
> >> 2024-02-12 22:49:50,493 28758dde544c WARN [org.kie.kogito.persistence.jdbc.JDBCProcessInstances:-1] (executor-thread-3) Skipping create of process instance id: 7083088e-b899-47cb-b85c-5d9ccb0aa166, state: 2
> >>
> >> So far, so good. And I'd expect to see no trace of this process in the database if I don't have data audit enabled.
> >>
> >> However, the 'processes' table contains a row with state=2, with related entries in the 'nodes' table. In a load test, I see these tables grow significantly over time. Am I missing something to have these entries cleaned up automatically?
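The purge mechanism asked for above could look roughly like the following. This is only a sketch to make the idea concrete, not DataIndex code: it uses SQLite from the Python standard library as a stand-in database, the table names 'processes' and 'nodes' and state=2 come from the messages above, and everything else (the end_time column stored as epoch seconds, the process_instance_id link column in this simplified schema, the function names, the cutoff parameter) is an assumption made for illustration.

```python
import sqlite3

COMPLETED = 2  # state value reported for the finished instance in the messages above


def create_schema(conn):
    """Hypothetical, simplified schema; the real DataIndex tables have more columns."""
    conn.executescript(
        """
        CREATE TABLE processes (
            id TEXT PRIMARY KEY,
            state INTEGER NOT NULL,
            end_time INTEGER            -- assumed column: epoch seconds, NULL while running
        );
        CREATE TABLE nodes (
            id TEXT PRIMARY KEY,
            process_instance_id TEXT REFERENCES processes(id)
        );
        """
    )


def purge_completed(conn, cutoff_epoch):
    """Delete completed process instances that ended before cutoff_epoch, plus their nodes.

    Returns the number of process rows removed.
    """
    with conn:  # single transaction: commit on success, roll back on error
        # Remove child rows first so no node is left pointing at a deleted process.
        conn.execute(
            "DELETE FROM nodes WHERE process_instance_id IN "
            "(SELECT id FROM processes WHERE state = ? AND end_time < ?)",
            (COMPLETED, cutoff_epoch),
        )
        cur = conn.execute(
            "DELETE FROM processes WHERE state = ? AND end_time < ?",
            (COMPLETED, cutoff_epoch),
        )
    return cur.rowcount
```

A scheduled job could call purge_completed with "now minus the configured retention period" as the cutoff. Running instances and recently completed ones are left untouched, which keeps the snapshot view intact for anything a user might still look up.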
> >>
> >> ________________________________________
> >> From: Martin Weiler <mwei...@ibm.com.INVALID>
> >> Sent: Monday, February 12, 2024 3:40 PM
> >> To: dev@kie.apache.org
> >> Subject: [EXTERNAL] RE: [DISCUSSION] Performance issues with data-index persistence addon
> >>
> >> Thanks everyone for your input. Based on this discussion, I opened the following PR:
> >> https://github.com/apache/incubator-kie-kogito-apps/pull/1985
> >>
> >> With this change, the performance seems to be stable over time:
> >> https://drive.google.com/file/d/1zkullvfrJpRp7TRjxDa41ok6kEIR7Fty/view?usp=sharing
> >>
> >> Martin
> >>
> >> ________________________________________
> >> From: Gonzalo Muñoz <gmuno...@apache.org>
> >> Sent: Friday, February 9, 2024 9:42 AM
> >> To: dev@kie.apache.org
> >> Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with data-index persistence addon
> >>
> >> Great work Francisco,
> >> Martin, take a look at this link with some related tips (in case you find it useful):
> >> https://www.cybertec-postgresql.com/en/index-your-foreign-key/
> >>
> >> On Fri, Feb 9, 2024 at 17:20, Francisco Javier Tirado Sarti (<ftira...@redhat.com>) wrote:
> >>
> >> > For the time being, we will keep JPA until we exhaust all possibilities; let's call switching from JPA to JDBC our hidden plan B ;)
> >> > I already told Martin, but so that everyone knows: just after writing the previous email, I thought "what if Postgres is not automatically indexing foreign keys like MySQL?" and, eureka.
> >> > Postgres doc: https://www.postgresql.org/docs/current/ddl-constraints.html
> >> > MySQL doc: https://dev.mysql.com/doc/refman/8.0/en/constraint-foreign-key.html
> >> > These are the relevant excerpts:
> >> >
> >> > *Postgresql*
> >> > *A foreign key must reference columns that either are a primary key or form a unique constraint, or are columns from a non-partial unique index. This means that the referenced columns always have an index to allow efficient lookups on whether a referencing row has a match. Since a DELETE of a row from the referenced table or an UPDATE of a referenced column will require a scan of the referencing table for rows matching the old value, it is often a good idea to index the referencing columns too. Because this is not always needed, and there are many choices available on how to index, the declaration of a foreign key constraint does not automatically create an index on the referencing columns.*
> >> > *Mysql*
> >> > *MySQL requires that foreign key columns be indexed; if you create a table with a foreign key constraint but no index on a given column, an index is created.*
> >> >
> >> > So I asked Martin to specifically create an index on the process_instance_id column of the nodes table.
> >> > I think that will fix the problem detected in the thread dump.
> >> > The simpler process test to verify that the queries are fine still stands, though ;)
> >> >
> >> >
> >> > On Fri, Feb 9, 2024 at 5:10 PM Tibor Zimányi <tzima...@apache.org> wrote:
> >> >
> >> > > I always preferred pure JDBC over Hibernate myself, just for the sake of control of what is happening :) So I would not -1 that myself.
> >> > >
> >> > > Tibor
> >> > >
> >> > > On Fri, Feb 9, 2024 at 17:00, Francisco Javier Tirado Sarti <ftira...@redhat.com> wrote:
> >> > >
> >> > > > Hi,
> >> > > > Usually I do not want to talk about work in progress because preliminary conclusions are pretty volatile but, well, there are a couple of things that can be concluded from the really valuable information that Martin provided.
> >> > > > 1) In order to be able to determine if the number of statements is larger than expected, I asked Martin to test with a simpler process definition, one with just three nodes: start, script and end. The script node should change just one variable. This way we can analyze whether the number of queries is the expected one. From the single log (audit was activated then) my conclusion is that the number of inserts/updates over processes and nodes (there are a lot over task, which I would prefer to skip for now; baby steps) is the expected one.
> >> > > > 2) Analysing the thread dump, we see around 15 threads executing this line at
> >> > > > org.kie.kogito.index.jpa.storage.ProcessInstanceEntityStorage.indexNode(ProcessInstanceEntityStorage.java:125),
> >> > > > so it's pretty clear which code is to be optimized ;). I'm evaluating possibilities within JPA/Hibernate, but I'm starting to think that it might be better to switch to JDBC and skip Hibernate. Our lives will be simpler, especially with a relatively simple schema like ours (that would be my recommendation if I were an external consultant).
> >> > > >
> >> > > > On Fri, Feb 9, 2024 at 4:15 PM Tibor Zimányi <tzima...@apache.org> wrote:
> >> > > >
> >> > > > > Hi,
> >> > > > >
> >> > > > > this will be a bit off-topic. However, as far as performance goes, I think we should consider that we have string primary keys (IDs). I would expect database systems to be much better at indexing numeric keys than strings. I remember from the past, when I was working with DBs, that using strings as keys or indexes was a discouraged practice.
> >> > > > > > >> > > > > Best regards, > >> > > > > Tibor > >> > > > > > >> > > > > Dňa št 8. 2. 2024, 22:45 Martin Weiler <mwei...@ibm.com.invalid > > > >> > > > > napísal(a): > >> > > > > > >> > > > > > I changed the test to use MongoDB [1] and I don't see a > >> performance > >> > > > > > degradation with this setup [2]. > >> > > > > > > >> > > > > > Please keep us posted of your findings. Thanks! > >> > > > > > > >> > > > > > Martin > >> > > > > > > >> > > > > > [1] > >> > > > > > >> > > https://github.com/martinweiler/job-service-refactor-test/tree/mongodb > >> > > > > > [2] > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > https://drive.google.com/file/d/1NfacXaxJlgRMw4OQ5S20cvkzvaUKUVFj/view?usp=sharing > >> > > > > > > >> > > > > > ________________________________________ > >> > > > > > From: Francisco Javier Tirado Sarti <ftira...@redhat.com> > >> > > > > > Sent: Wednesday, February 7, 2024 11:40 AM > >> > > > > > To: dev@kie.apache.org > >> > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with > >> > > data-index > >> > > > > > persistence addon > >> > > > > > > >> > > > > > yes, it can be index degradation because of size, but I > believe > >> (I > >> > > > might > >> > > > > be > >> > > > > > wrong) the db is too small (yet) for that. > >> > > > > > But, eventually, Postgres, when the DB is huge enough, > >> unavoidably > >> > > will > >> > > > > > behave like the graphic that Martin sent. > >> > > > > > Since I believe we are not huge enough (yet), lets rule out > >> another > >> > > > issue > >> > > > > > by analysing the sql logs (I requested those to Martin offline > >> and > >> > he > >> > > > is > >> > > > > > going to kindly collect them). > >> > > > > > Also Im curious to know if Mongo behave in the same way. 
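The point made in this thread about indexing the process_instance_id column on the nodes table can be checked with a query plan. The sketch below is not the DataIndex schema and does not use Postgres: it uses SQLite from the Python standard library, which, like Postgres, does not create an index on a referencing (foreign key) column automatically. The two-column tables and the index name are assumptions for illustration only; 'nodes', 'processes' and process_instance_id are the names used in the thread.

```python
import sqlite3


def plan_for_node_lookup(conn):
    """Return SQLite's query plan text for finding the nodes of one process instance."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM nodes WHERE process_instance_id = ?",
        ("some-id",),
    ).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the plan detail


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE processes (id TEXT PRIMARY KEY);
    CREATE TABLE nodes (
        id TEXT PRIMARY KEY,
        process_instance_id TEXT REFERENCES processes(id)
    );
    """
)

# Declaring the foreign key did not index the referencing column,
# so this lookup has to scan the whole nodes table.
plan_before = plan_for_node_lookup(conn)

# Hypothetical index name; the column is the one suggested in the thread.
conn.execute(
    "CREATE INDEX idx_nodes_process_instance_id ON nodes (process_instance_id)"
)
plan_after = plan_for_node_lookup(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions, but the first plan reports a scan of nodes and the second a search using the new index. On Postgres the equivalent check would be an EXPLAIN on the same kind of lookup before and after creating the index.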
> >> > > > > > > >> > > > > > On Wed, Feb 7, 2024 at 7:25 PM Enrique Gonzalez Martinez < > >> > > > > > egonza...@apache.org> wrote: > >> > > > > > > >> > > > > > > Hi Francisco, > >> > > > > > > I would highly recommend to check indexes and how the > updates > >> > work > >> > > in > >> > > > > > data > >> > > > > > > index to avoid full scan table and lock the full table. Some > >> db > >> > are > >> > > > > very > >> > > > > > > sensitive to that. > >> > > > > > > > >> > > > > > > El mié, 7 feb 2024, 18:41, Francisco Javier Tirado Sarti < > >> > > > > > > ftira...@redhat.com> escribió: > >> > > > > > > > >> > > > > > > > Hi Martin, > >> > > > > > > > While I analyze the data, let me ask you if it is possible > >> to > >> > > > perform > >> > > > > > > > another check (similar in a way to disabling data-index > like > >> > you > >> > > > do) > >> > > > > > Can > >> > > > > > > > you switch to MongoDB persistence and check if the same > >> > > degradation > >> > > > > > that > >> > > > > > > is > >> > > > > > > > there for postgres remains? > >> > > > > > > > I do not know if this is feasible but will certainly > >> indicate > >> > the > >> > > > > > problem > >> > > > > > > > is on the postgres storage layer and I do not have a clear > >> > > > prediction > >> > > > > > of > >> > > > > > > > what we will see when doing this switch. > >> > > > > > > > > >> > > > > > > > On Wed, Feb 7, 2024 at 6:37 PM Martin Weiler > >> > > > <mwei...@ibm.com.invalid > >> > > > > > > >> > > > > > > > wrote: > >> > > > > > > > > >> > > > > > > > > Hi Francisco, > >> > > > > > > > > > >> > > > > > > > > thanks for your work on this important topic! > >> > > > > > > > > > >> > > > > > > > > I would like to share some test results here, which > might > >> > help > >> > > to > >> > > > > > > improve > >> > > > > > > > > the codebase even further. I am using the jmeter based > >> test > >> > > case > >> > > > > from > >> > > > > > > > Pere > >> > > > > > > > > and Enrique (thanks guys!) 
[1] which uses a load of 30 > >> > threads > >> > > to > >> > > > > > > > > > >> > > > > > > > > 1) start a new process instance (POST) > >> > > > > > > > > 2) retrieve tasks for a user (GET) > >> > > > > > > > > 3) fetches task details (GET) > >> > > > > > > > > 4) complete a task (POST) > >> > > > > > > > > 5) execute a query on data-audit > >> > > > > > > > > > >> > > > > > > > > With this test setup, I noticed that the performance for > >> the > >> > > POST > >> > > > > > > > > requests, in particular the one to start a new process > >> > > instance, > >> > > > > > > degrades > >> > > > > > > > > over time - see graph [2]. If I run the same test > without > >> > > > > data-index, > >> > > > > > > > then > >> > > > > > > > > there is no such performance degradation [3]. You can > >> find a > >> > > > thread > >> > > > > > > dump > >> > > > > > > > > captured a few minutes into the first test here [4] that > >> > might > >> > > > help > >> > > > > > to > >> > > > > > > > see > >> > > > > > > > > some of the contention points. > >> > > > > > > > > > >> > > > > > > > > I'd appreciate if you could take a look and see if there > >> is > >> > > > > something > >> > > > > > > > that > >> > > > > > > > > can be further improved based on your previous work. If > >> you > >> > > need > >> > > > > any > >> > > > > > > > > additional data, let me know, but otherwise it is > >> > > straightforward > >> > > > > to > >> > > > > > > run > >> > > > > > > > > the jmeter test as well. 
> >> > > > > > > > > > >> > > > > > > > > Thanks, > >> > > > > > > > > Martin > >> > > > > > > > > > >> > > > > > > > > [1] > >> https://github.com/pefernan/job-service-refactor-test/ > >> > > > > > > > > [2] > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > https://drive.google.com/file/d/1Gqn-ixE05kXv2jdssAUlnMuUVcHxIYZ0/view?usp=sharing > >> > > > > > > > > [3] > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > https://drive.google.com/file/d/10gVNyb4JYg_bA18bNhY9dEDbPn3TOxL7/view?usp=sharing > >> > > > > > > > > [4] > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > https://drive.google.com/file/d/1jVrtsO49gCvUlnaC9AUAtkVKTm4PbdUv/view?usp=sharing > >> > > > > > > > > > >> > > > > > > > > ________________________________________ > >> > > > > > > > > From: Francisco Javier Tirado Sarti < > ftira...@redhat.com> > >> > > > > > > > > Sent: Wednesday, January 17, 2024 9:13 AM > >> > > > > > > > > To: dev@kie.apache.org > >> > > > > > > > > Cc: Pere Fernandez Perez > >> > > > > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues > >> with > >> > > > > > data-index > >> > > > > > > > > persistence addon > >> > > > > > > > > > >> > > > > > > > > Hi Alex, > >> > > > > > > > > I did not take times (which depends on a number of > >> variables > >> > > that > >> > > > > > > > > drastically change between environments), but verify > that > >> the > >> > > > > number > >> > > > > > of > >> > > > > > > > > updates has been reduced drastically without losing > >> > > > functionality, > >> > > > > > > which > >> > > > > > > > is > >> > > > > > > > > objectively a good thing. 
If before the change, for > every > >> > node > >> > > > > > > executed, > >> > > > > > > > we > >> > > > > > > > > have an update for every node previously executed, so > if a > >> > > > process > >> > > > > > have > >> > > > > > > > 50 > >> > > > > > > > > nodes to execute, we were performing nearly 50*51/2 > >> updates, > >> > > > which > >> > > > > > > gives > >> > > > > > > > us > >> > > > > > > > > a total of 1275 updates, now we have just one for every > >> node > >> > > > being > >> > > > > > > > > executed, implying a total of 50 updates. > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > On Wed, Jan 17, 2024 at 3:18 PM Alex Porcelli < > >> > > a...@porcelli.me> > >> > > > > > > wrote: > >> > > > > > > > > > >> > > > > > > > > > Francisco, > >> > > > > > > > > > > >> > > > > > > > > > I noticed that your PR has been merged, but I was > >> expecting > >> > > (at > >> > > > > > least > >> > > > > > > > > > was my understanding from this thread) that before > >> merging > >> > > some > >> > > > > > > > > > benchmark data would be shared in advance - to assess > >> the > >> > > > > > > cost/benefit > >> > > > > > > > > > of such a decent size change. > >> > > > > > > > > > > >> > > > > > > > > > Do you have any information to share? > >> > > > > > > > > > > >> > > > > > > > > > On Sat, Dec 23, 2023 at 4:02 AM Francisco Javier > Tirado > >> > Sarti > >> > > > > > > > > > <ftira...@redhat.com> wrote: > >> > > > > > > > > > > > >> > > > > > > > > > > Yes, as intended, now we have one select and one > >> > > > insert/update > >> > > > > > per > >> > > > > > > > node > >> > > > > > > > > > > event. 
> >> > > > > > > > > > > I moved the PR as ready for review and give @Pere > >> > Fernandez > >> > > > > Perez > >> > > > > > > > > > > <pefer...@redhat.com> permission to the branch so > he > >> can > >> > > > edit > >> > > > > it > >> > > > > > > in > >> > > > > > > > > the > >> > > > > > > > > > > next two weeks (Ill be on PTO) if desired, before > >> > merging. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > On Thu, Dec 21, 2023 at 5:58 PM Alex Porcelli < > >> > > > > a...@porcelli.me> > >> > > > > > > > > wrote: > >> > > > > > > > > > > > >> > > > > > > > > > > > Cool, thank you Francisco! > >> > > > > > > > > > > > > >> > > > > > > > > > > > Did you manage to get some preliminary data about > >> > > > > improvements? > >> > > > > > > > > > > > > >> > > > > > > > > > > > On Thu, Dec 21, 2023 at 11:52 AM Francisco Javier > >> > Tirado > >> > > > > Sarti > >> > > > > > > > > > > > <ftira...@redhat.com> wrote: > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > Yes, after some delay because of quarkus 3 > >> migration. > >> > > Im > >> > > > > > > refining > >> > > > > > > > > > this > >> > > > > > > > > > > > > draft PR > >> > > > > > > > > > > > > > >> > > > > > https://github.com/apache/incubator-kie-kogito-apps/pull/1941 > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > On Thu, Dec 21, 2023 at 5:48 PM Alex Porcelli < > >> > > > > > > a...@porcelli.me> > >> > > > > > > > > > wrote: > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > Any update or new findings on this topic? 
> >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > On Tue, Nov 28, 2023 at 8:38 AM Francisco > Javier > >> > > Tirado > >> > > > > > Sarti > >> > > > > > > > > > > > > > <ftira...@redhat.com> wrote: > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Hi Alex, > >> > > > > > > > > > > > > > > After considering different options to > improve > >> > > > > > performance, > >> > > > > > > > we > >> > > > > > > > > > feel > >> > > > > > > > > > > > that > >> > > > > > > > > > > > > > it > >> > > > > > > > > > > > > > > is time to "partially" move away from the > >> current > >> > > Map > >> > > > > > style > >> > > > > > > > > > > > interface ( > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > https://github.com/apache/incubator-kie-kogito-apps/blob/main/persistence-commons/persistence-commons-api/src/main/java/org/kie/kogito/persistence/api/Storage.java > >> > > > > > > > > > > > > > ) > >> > > > > > > > > > > > > > > which was shared with Trusty, to one more > >> > suitable > >> > > > for > >> > > > > > > usage > >> > > > > > > > > > with a > >> > > > > > > > > > > > > > > relational DB like postgresql (but still > >> > compatible > >> > > > > with > >> > > > > > > big > >> > > > > > > > > > table > >> > > > > > > > > > > > dbs). > >> > > > > > > > > > > > > > > The idea will be to replace generic Storage > >> > > interface > >> > > > > by > >> > > > > > > four > >> > > > > > > > > > > > specific > >> > > > > > > > > > > > > > > interfaces (which will inherit from a common > >> one > >> > > that > >> > > > > > keeps > >> > > > > > > > the > >> > > > > > > > > > query > >> > > > > > > > > > > > > > part > >> > > > > > > > > > > > > > > at is it. 
with get and query methods), that > >> will > >> > > > > include > >> > > > > > > the > >> > > > > > > > > > required > >> > > > > > > > > > > > > > > modification operations for the four > DataIndex > >> > > > > "domains": > >> > > > > > > > > > > > > > processinstance, > >> > > > > > > > > > > > > > > usertask, processdefinitions and jobs. Those > >> > > > interfaces > >> > > > > > > will > >> > > > > > > > > > define > >> > > > > > > > > > > > > > methods > >> > > > > > > > > > > > > > > like addNode, addVariable, updateTask, > >> > > > > addAttachment..... > >> > > > > > > > that > >> > > > > > > > > > will > >> > > > > > > > > > > > allow > >> > > > > > > > > > > > > > > the persistent layer implementation to just > >> > update > >> > > > the > >> > > > > > > > needed > >> > > > > > > > > > info > >> > > > > > > > > > > > in > >> > > > > > > > > > > > > > the > >> > > > > > > > > > > > > > > DB (for example, for addNode in Postgres, > >> just > >> > > > insert > >> > > > > a > >> > > > > > > row > >> > > > > > > > > into > >> > > > > > > > > > > > nodes > >> > > > > > > > > > > > > > > table, for addNode in Mongo, basically the > >> same > >> > > > atomic > >> > > > > > > upsert > >> > > > > > > > > > > > operation > >> > > > > > > > > > > > > > > that is currently done). Therefore, we > >> increase > >> > > > > > performance > >> > > > > > > > for > >> > > > > > > > > > > > Postgres > >> > > > > > > > > > > > > > > and keep the current one for Mongo. The > >> current > >> > DB > >> > > > > > schemas > >> > > > > > > > > won't > >> > > > > > > > > > be > >> > > > > > > > > > > > > > > touched. > >> > > > > > > > > > > > > > > Since the code change is large, I do not > think > >> > I'll > >> > > > be > >> > > > > > able > >> > > > > > > > to > >> > > > > > > > > > have > >> > > > > > > > > > > > the > >> > > > > > > > > > > > > > PR > >> > > > > > > > > > > > > > > ready till next week. 
> >> > > > > > > > > > > > > > > But before starting, please let me know if > >> that > >> > > > > approach > >> > > > > > is > >> > > > > > > > > fine > >> > > > > > > > > > for > >> > > > > > > > > > > > you. > >> > > > > > > > > > > > > > > Best regards. > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 6:55 PM Alex > Porcelli > >> < > >> > > > > > > > > a...@porcelli.me> > >> > > > > > > > > > > > wrote: > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > Thank you Francisco to getting deeper on > >> this… > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > Looking forward to see the results of your > >> > > > suggested > >> > > > > > > > > > improvements. > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 9:40 AM Francisco > >> > Javier > >> > > > > Tirado > >> > > > > > > > > Sarti < > >> > > > > > > > > > > > > > > > ftira...@redhat.com> wrote: > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > I forgot to attach the queries > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 3:04 PM > Francisco > >> > > Javier > >> > > > > > Tirado > >> > > > > > > > > > Sarti < > >> > > > > > > > > > > > > > > > > ftira...@redhat.com> wrote: > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> Hi, > >> > > > > > > > > > > > > > > > >> A brief update on this topic. 
> >> > > > > > > > > > > > > > > > >> After doing a simple test with example > >> > > > > > > > > > > > > > > > >> > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > https://github.com/apache/incubator-kie-kogito-examples/tree/stable/serverless-workflow-examples/serverless-workflow-data-index-quarkus > >> > > > > > > > > > > > > > > > , > >> > > > > > > > > > > > > > > > >> the number of updates over Nodes table > is > >> > n*n, > >> > > > so > >> > > > > we > >> > > > > > > > > manage > >> > > > > > > > > > to > >> > > > > > > > > > > > > > obtain a > >> > > > > > > > > > > > > > > > >> perfect quadratic performance > >> degradation. > >> > The > >> > > > > > problem > >> > > > > > > > is > >> > > > > > > > > > worse > >> > > > > > > > > > > > in > >> > > > > > > > > > > > > > the > >> > > > > > > > > > > > > > > > case > >> > > > > > > > > > > > > > > > >> of Serverless Workflow than in BPMN > >> because > >> > we > >> > > > the > >> > > > > > > > number > >> > > > > > > > > of > >> > > > > > > > > > > > nodes > >> > > > > > > > > > > > > > is > >> > > > > > > > > > > > > > > > >> greater than the number of states. In > >> that > >> > > > > example N > >> > > > > > > is > >> > > > > > > > > 16, > >> > > > > > > > > > but > >> > > > > > > > > > > > for > >> > > > > > > > > > > > > > a > >> > > > > > > > > > > > > > > > more > >> > > > > > > > > > > > > > > > >> complex workflow it would be certainly > >> > large. 
> >> > > > > > > > > > > > > > > > >> I think that this is more related to > how > >> we > >> > > are > >> > > > > > > handling > >> > > > > > > > > > JPA in > >> > > > > > > > > > > > the > >> > > > > > > > > > > > > > > > code, > >> > > > > > > > > > > > > > > > >> in particular the mapping from model to > >> > entity > >> > > > > > > > (basically > >> > > > > > > > > > JPA is > >> > > > > > > > > > > > > > blind > >> > > > > > > > > > > > > > > > and > >> > > > > > > > > > > > > > > > >> has to update all nodes for every write > >> > > because > >> > > > it > >> > > > > > > > > believes > >> > > > > > > > > > the > >> > > > > > > > > > > > > > node has > >> > > > > > > > > > > > > > > > >> been updated, although it is not) than > an > >> > > issue > >> > > > in > >> > > > > > the > >> > > > > > > > > table > >> > > > > > > > > > > > > > definition. > >> > > > > > > > > > > > > > > > >> In fact, when using JPA, separating the > >> > server > >> > > > > model > >> > > > > > > > from > >> > > > > > > > > > the > >> > > > > > > > > > > > JPA > >> > > > > > > > > > > > > > > > entity is > >> > > > > > > > > > > > > > > > >> not a good idea, especially if the > entity > >> > > > contains > >> > > > > > > > > > collections. > >> > > > > > > > > > > > I > >> > > > > > > > > > > > > > will > >> > > > > > > > > > > > > > > > try > >> > > > > > > > > > > > > > > > >> to change that without breaking > anything. 
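To put numbers on the growth described above: if every node event causes all nodes recorded so far to be written again, the total number of node updates for a process with n nodes is 1 + 2 + ... + n = n(n+1)/2, which grows quadratically (the message above rounds this to n*n). The sketch below is plain arithmetic, not DataIndex code, and the function names are made up for illustration.

```python
def updates_rewriting_all_nodes(n):
    """Node event k rewrites all k nodes recorded so far: 1 + 2 + ... + n."""
    return sum(range(1, n + 1))


def updates_incremental(n):
    """Each node event writes only its own row."""
    return n


for n in (16, 50):
    print(n, updates_rewriting_all_nodes(n), updates_incremental(n))
# prints:
# 16 136 16
# 50 1275 50
```

The n = 16 case is the example workflow mentioned above, and the n = 50 case matches the 50*51/2 = 1275 versus 50 figure quoted elsewhere in this thread.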
> >> > > > > > > > > > > > > > > > >> > >> > > > > > > > > > > > > > > > >> On Wed, Nov 22, 2023 at 12:10 PM > Enrique > >> > > > Gonzalez > >> > > > > > > > > Martinez < > >> > > > > > > > > > > > > > > > >> egonza...@apache.org> wrote: > >> > > > > > > > > > > > > > > > >> > >> > > > > > > > > > > > > > > > >>> After the events split you now will > >> need to > >> > > > > create > >> > > > > > a > >> > > > > > > > node > >> > > > > > > > > > > > instance > >> > > > > > > > > > > > > > > > >>> model instance of making independent > >> from > >> > the > >> > > > > > process > >> > > > > > > > > > instance. > >> > > > > > > > > > > > > > > > >>> That should do the trick. > >> > > > > > > > > > > > > > > > >>> > >> > > > > > > > > > > > > > > > >>> Regarding deleting/inserting it was > >> fixed > >> > at > >> > > > some > >> > > > > > > > point. > >> > > > > > > > > > > > > > > > >>> > >> > > > > > > > > > > > > > > > >>> El mar, 21 nov 2023 a las 20:22, > >> Francisco > >> > > > Javier > >> > > > > > > > Tirado > >> > > > > > > > > > Sarti > >> > > > > > > > > > > > > > > > >>> (<ftira...@redhat.com>) escribió: > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > > >>> > Hi Martin, > >> > > > > > > > > > > > > > > > >>> > I have a task to review performance > of > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > > >>> > ProcessInstanceNodeDataEventMerger > >> > > > > > > > > > > > > > > > >>> > My idea is to reduce the number of > >> delete > >> > > > > inserts > >> > > > > > > > when > >> > > > > > > > > > > > processing > >> > > > > > > > > > > > > > > > >>> events > >> > > > > > > > > > > > > > > > >>> > and try to do it incremental. > >> > > > > > > > > > > > > > > > >>> > That should improve performance. 
> >> > > > > > > > > > > > > > > > >>> > PS: > >> > > > > > > > > > > > > > > > >>> > I was planning to send an e-mail > >> tomorrow > >> > > > > > > announcing > >> > > > > > > > > > that in > >> > > > > > > > > > > > > > case you > >> > > > > > > > > > > > > > > > >>> were > >> > > > > > > > > > > > > > > > >>> > already working on a fix for that. I > >> > assume > >> > > > you > >> > > > > > are > >> > > > > > > > not > >> > > > > > > > > > and I > >> > > > > > > > > > > > > > would > >> > > > > > > > > > > > > > > > be > >> > > > > > > > > > > > > > > > >>> > sending a PR soon. > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > > >>> > On Tue, Nov 21, 2023 at 6:09 PM > Martin > >> > > Weiler > >> > > > > > > > > > > > > > > > <mwei...@ibm.com.invalid > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > > >>> > wrote: > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > > >>> > > I looked into the new examples > using > >> > > > > data-index > >> > > > > > > > > > persistence > >> > > > > > > > > > > > > > addon - > >> > > > > > > > > > > > > > > > >>> Neus' > >> > > > > > > > > > > > > > > > >>> > > PR#1813 [1] for serverless and > >> Pere's > >> > > > branch > >> > > > > > [2] > >> > > > > > > > for > >> > > > > > > > > > > > workflow > >> > > > > > > > > > > > > > > > (great > >> > > > > > > > > > > > > > > > >>> job > >> > > > > > > > > > > > > > > > >>> > > both!) - and they work without > >> issues > >> > > using > >> > > > > > > single > >> > > > > > > > > > > > requests. 
> >> > > > > > > > > > > > > > > > >>> However, under > >> > > > > > > > > > > > > > > > >>> > > some load (I used 'ab' for testing > >> > with a > >> > > > > light > >> > > > > > > > > > > > concurrency of > >> > > > > > > > > > > > > > 10 > >> > > > > > > > > > > > > > > > >>> parallel > >> > > > > > > > > > > > > > > > >>> > > requests) I ran into the following > >> > > > problems: > >> > > > > > > > > > > > > > > > >>> > > > >> > > > > > > > > > > > > > > > >>> > > (1) Large number of insert/delete > >> calls > >> > > > (eg. > >> > > > > > for > >> > > > > > > > > tables > >> > > > > > > > > > > > such as > >> > > > > > > > > > > > > > > > >>> nodes, > >> > > > > > > > > > > > > > > > >>> > > definitions, etc) > >> > > > > > > > > > > > > > > > >>> > > > >> > > > > > > > > > > > > > > > >>> > > (2) Hibernate > >> OptimisticLockExceptions > >> > / > >> > > > > > > > > > > > StaleStateExceptions > >> > > > > > > > > > > > > > > > >>> > > > >> > > > > > > > > > > > > > > > >>> > > (3) DB deadlocks > >> > > > > > > > > > > > > > > > >>> > > > >> > > > > > > > > > > > > > > > >>> > > (4) Error responses, slow response > >> > times > >> > > > > > > > > > > > > > > > >>> > > > >> > > > > > > > > > > > > > > > >>> > > The reason I am reaching out with > >> this > >> > > > topic > >> > > > > > here > >> > > > > > > > is > >> > > > > > > > > to > >> > > > > > > > > > > > find > >> > > > > > > > > > > > > > out if > >> > > > > > > > > > > > > > > > >>> we are > >> > > > > > > > > > > > > > > > >>> > > aware of this issue, and if > someone > >> is > >> > > > > already > >> > > > > > > > > looking > >> > > > > > > > > > > > into or > >> > > > > > > > > > > > > > > > being > >> > > > > > > > > > > > > > > > >>> > > assigned to it? 
> >> > > > > > > > > > > > > > > > >>> > > > >> > > > > > > > > > > > > > > > >>> > > Thanks, > >> > > > > > > > > > > > > > > > >>> > > Martin > >> > > > > > > > > > > > > > > > >>> > > > >> > > > > > > > > > > > > > > > >>> > > [1] > >> > > > > > > > > > > > > > > > >>> > >> > > > > > > > > > > > > >> > > > > > > > >> > https://github.com/apache/incubator-kie-kogito-examples/pull/1813 > >> > > > > > > > > > > > > > > > >>> > > [2] > >> > > > > > > > > > > > > > > > >>> > > > >> > > > > > > > > > > > > > > > >>> > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > https://github.com/pefernan/kogito-examples/tree/example_data-index_persistence > >> > > > > > > > > > > > > > > > >>> > > > >> > > > > > > > > > > > > > > > >>> > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > >> > > > > >> --------------------------------------------------------------------- > >> > > > > > > > > > > > > > > > >>> > > To unsubscribe, e-mail: > >> > > > > > > > > dev-unsubscr...@kie.apache.org > >> > > > > > > > > > > > > > > > >>> > > For additional commands, e-mail: > >> > > > > > > > > > dev-h...@kie.apache.org > >> > > > > > > > > > > > > > > > >>> > > > >> > > > > > > > > > > > > > > > >>> > > > >> > > > > > > > > > > > > > > > >>> > >> > > > > > > > > > > > > > > > >>> > >> > > > > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > >> > --------------------------------------------------------------------- > >> > > > > > > > > > > > > > > > >>> To unsubscribe, e-mail: > >> > > > > > > dev-unsubscr...@kie.apache.org > >> > > > > > > > > > > > > > > > >>> For additional commands, e-mail: > >> > > > > > > > dev-h...@kie.apache.org > >> > > > > > > > > > > > > > > > >>> > >> > > > > > > > > > > > > > > > >>> > >> > > > > > > > > > > > > > > > > > >> > > > > > > > 