If Data Index was supposed to provide a snapshot view of the process
instance, why do we keep the data after the process instance has finished?


On Mon, Feb 19, 2024 at 7:12 AM Francisco Javier Tirado Sarti <
ftira...@redhat.com> wrote:

> Hi Martin.
> After taking a deeper look at this, I realize that this is the expected
> behaviour.
> The runtimes DB does not track the completed process instance (that's what
> the JDBCProcessInstances warning is telling us), but DataIndex, as expected,
> is tracking it in the processes and nodes tables. And yes, they will grow
> over time.
> What we need is a configurable purge mechanism for DataIndex, so that it
> eventually removes older completed process instances.
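
A minimal sketch of what such a purge could look like, not an existing
DataIndex feature. It uses Python's built-in sqlite3 module as a stand-in for
the real database. The table names `processes` and `nodes`, the
`process_instance_id` column and `state = 2` for completed instances come from
this thread; the `id` and `end_time` columns, the function name and the batch
size are assumptions made for illustration.

```python
import sqlite3

COMPLETED = 2  # state value reported for the finished instance in this thread


def purge_completed(conn, older_than_epoch_s, batch_size=500):
    """Delete completed process instances that ended before the cutoff.

    Works in batches so that each transaction stays short.
    Returns the number of process rows removed.
    """
    removed = 0
    while True:
        ids = [row[0] for row in conn.execute(
            "SELECT id FROM processes WHERE state = ? AND end_time < ? LIMIT ?",
            (COMPLETED, older_than_epoch_s, batch_size))]
        if not ids:
            return removed
        marks = ",".join("?" * len(ids))
        with conn:  # commits the batch, or rolls it back on error
            # Remove the referencing rows first, then the instances themselves.
            conn.execute(
                f"DELETE FROM nodes WHERE process_instance_id IN ({marks})", ids)
            conn.execute(f"DELETE FROM processes WHERE id IN ({marks})", ids)
        removed += len(ids)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE processes (id TEXT PRIMARY KEY, state INTEGER, end_time INTEGER);
        CREATE TABLE nodes (id TEXT PRIMARY KEY, process_instance_id TEXT);
    """)
    conn.executemany(
        "INSERT INTO processes VALUES (?, ?, ?)",
        [("old-done", 2, 100), ("new-done", 2, 900), ("active", 1, None)])
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)",
        [("n1", "old-done"), ("n2", "old-done"), ("n3", "new-done"), ("n4", "active")])
    removed = purge_completed(conn, older_than_epoch_s=500)
    remaining = sorted(r[0] for r in conn.execute("SELECT id FROM processes"))
    return removed, remaining
```

Only the instance that is both completed and older than the cutoff is removed;
active instances and recently completed ones stay. A real implementation would
make the retention period configurable, as suggested above.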
>
> On Tue, Feb 13, 2024 at 12:59 PM Francisco Javier Tirado Sarti <
> ftira...@redhat.com> wrote:
>
> > Hi Martin,
> > Good catch! It looks like the skipping performed for process instances is
> > not applied to node instances. This is something we definitely need to
> > review on the runtimes side.
> >
> > On Mon, Feb 12, 2024 at 11:59 PM Martin Weiler <mwei...@ibm.com.invalid>
> > wrote:
> >
> >> On a somewhat related note, testing a simple workflow (start -> script
> >> node -> end), I see the following messages in the logs:
> >> 2024-02-12 22:49:50,493 28758dde544c WARN
> >> [org.kie.kogito.persistence.jdbc.JDBCProcessInstances:-1]
> >> (executor-thread-3) Skipping create of process instance id:
> >> 7083088e-b899-47cb-b85c-5d9ccb0aa166, state: 2
> >>
> >> So far, so good. And I'd expect to see no trace of this process in the
> >> database if I don't have data audit enabled.
> >>
> >> However, the 'processes' table contains a row with state=2, with related
> >> entries in the 'nodes' table. In a load test, I see these tables grow
> >> significantly over time. Am I missing something to have these entries
> >> cleaned up automatically?
> >>
> >> ________________________________________
> >> From: Martin Weiler <mwei...@ibm.com.INVALID>
> >> Sent: Monday, February 12, 2024 3:40 PM
> >> To: dev@kie.apache.org
> >> Subject: [EXTERNAL] RE: [DISCUSSION] Performance issues with data-index
> >> persistence addon
> >>
> >> Thanks everyone for your input. Based on this discussion, I opened the
> >> following PR:
> >> https://github.com/apache/incubator-kie-kogito-apps/pull/1985
> >>
> >> With this change, the performance seems to be stable over time:
> >> https://drive.google.com/file/d/1zkullvfrJpRp7TRjxDa41ok6kEIR7Fty/view?usp=sharing
> >>
> >> Martin
> >>
> >> ________________________________________
> >> From: Gonzalo Muñoz <gmuno...@apache.org>
> >> Sent: Friday, February 9, 2024 9:42 AM
> >> To: dev@kie.apache.org
> >> Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with data-index
> >> persistence addon
> >>
> >> Great work Francisco,
> >> Martin, take a look at this link with some related tips (in case you
> >> find it useful):
> >> https://www.cybertec-postgresql.com/en/index-your-foreign-key/
> >>
> >> On Fri, Feb 9, 2024 at 17:20, Francisco Javier Tirado Sarti (<
> >> ftira...@redhat.com>) wrote:
> >>
> >> > For the time being, we will keep JPA until we exhaust all
> >> > possibilities; let's call switching from JPA to JDBC our hidden plan B ;)
> >> > I already told Martin, but so that everyone knows: just after writing
> >> > the previous email, I thought "what if Postgres is not automatically
> >> > indexing foreign keys like MySQL?" and, eureka:
> >> > Postgres doc
> >> > https://www.postgresql.org/docs/current/ddl-constraints.html
> >> > MySQL doc
> >> > https://dev.mysql.com/doc/refman/8.0/en/constraint-foreign-key.html
> >> > These are the relevant excerpts:
> >> >
> >> > *Postgresql*
> >> > *A foreign key must reference columns that either are a primary key or
> >> > form a unique constraint, or are columns from a non-partial unique index.
> >> > This means that the referenced columns always have an index to allow
> >> > efficient lookups on whether a referencing row has a match. Since a
> >> > DELETE of a row from the referenced table or an UPDATE of a referenced
> >> > column will require a scan of the referencing table for rows matching
> >> > the old value, it is often a good idea to index the referencing columns
> >> > too. Because this is not always needed, and there are many choices
> >> > available on how to index, the declaration of a foreign key constraint
> >> > does not automatically create an index on the referencing columns.*
> >> > *Mysql*
> >> > *MySQL requires that foreign key columns be indexed; if you create a
> >> > table with a foreign key constraint but no index on a given column, an
> >> > index is created.*
> >> >
> >> > So I asked Martin to explicitly create an index on the
> >> > process_instance_id column of the nodes table.
> >> > I think that will fix the problem detected in the thread dump.
> >> > The simpler process test to verify that the queries are fine still
> >> > stands, though ;)
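
A small illustration of the point about indexing the referencing column. It is
a sketch that uses Python's built-in sqlite3 module rather than PostgreSQL.
SQLite, like PostgreSQL, does not create an index on the referencing column of
a foreign key automatically. The `nodes` table and its `process_instance_id`
column come from this thread; the other columns and the index name
`idx_nodes_process_instance_id` are made up for the example.

```python
import sqlite3


def plan_for_node_lookup(conn):
    """Return the query plan text for finding the nodes of one process instance."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM nodes WHERE process_instance_id = ?",
        ("p1",)).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " ".join(row[-1] for row in rows)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE processes (id TEXT PRIMARY KEY);
        CREATE TABLE nodes (
            id TEXT PRIMARY KEY,
            process_instance_id TEXT REFERENCES processes (id)
        );
    """)
    before = plan_for_node_lookup(conn)  # no index on the referencing column yet
    conn.execute(
        "CREATE INDEX idx_nodes_process_instance_id ON nodes (process_instance_id)")
    after = plan_for_node_lookup(conn)   # the planner can now use the new index
    return before, after
```

Before the index exists the lookup is a full scan of `nodes`; afterwards the
plan names the new index. The equivalent check on PostgreSQL would be an
`EXPLAIN` of the statements seen in the SQL logs.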
> >> >
> >> >
> >> > On Fri, Feb 9, 2024 at 5:10 PM Tibor Zimányi <tzima...@apache.org> wrote:
> >> >
> >> > > I always preferred pure JDBC over Hibernate myself, just for the sake
> >> > > of control over what is happening :) So I would not -1 that myself.
> >> > >
> >> > > Tibor
> >> > >
> >> > > On Fri, Feb 9, 2024, 17:00 Francisco Javier Tirado Sarti <
> >> > > ftira...@redhat.com> wrote:
> >> > >
> >> > > > Hi,
> >> > > > Usually I do not want to talk about work in progress because
> >> > > > preliminary conclusions are pretty volatile but, well, there are a
> >> > > > couple of things that can be concluded from the really valuable
> >> > > > information that Martin provided.
> >> > > > 1) In order to determine whether the number of statements is larger
> >> > > > than expected, I asked Martin to test with a simpler process
> >> > > > definition: one with just three nodes (start, script and end), where
> >> > > > the script node changes just one variable. This way we can analyze
> >> > > > whether the number of queries is the expected one. From the single
> >> > > > log (audit was activated then), my conclusion is that the number of
> >> > > > inserts/updates on processes and nodes is the expected one (there are
> >> > > > a lot on task, which I prefer to skip for now; baby steps).
> >> > > > 2) Analysing the thread dump, we see around 15 threads executing this
> >> > > > line:
> >> > > > org.kie.kogito.index.jpa.storage.ProcessInstanceEntityStorage.indexNode(ProcessInstanceEntityStorage.java:125)
> >> > > > so it's pretty clear which code needs to be optimized ;). I'm
> >> > > > evaluating possibilities within JPA/Hibernate, but I'm starting to
> >> > > > think that it might be better to switch to JDBC and skip Hibernate.
> >> > > > Our lives will be simpler, especially with a relatively simple schema
> >> > > > like ours (that would be my recommendation if I were an external
> >> > > > consultant).
> >> > > >
> >> > > > On Fri, Feb 9, 2024 at 4:15 PM Tibor Zimányi <tzima...@apache.org> wrote:
> >> > > >
> >> > > > > Hi,
> >> > > > >
> >> > > > > this will be a bit off-topic. However, as far as performance is
> >> > > > > concerned, I think we should consider that we have string primary
> >> > > > > keys (IDs). I would expect database systems to be much better at
> >> > > > > indexing numeric keys than strings. I remember from the past, when
> >> > > > > I was working with DBs, that using strings as keys or indexes was
> >> > > > > a discouraged practice.
> >> > > > >
> >> > > > > Best regards,
> >> > > > > Tibor
> >> > > > >
> >> > > > > On Thu, Feb 8, 2024, 22:45 Martin Weiler <mwei...@ibm.com.invalid>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > I changed the test to use MongoDB [1] and I don't see a
> >> > > > > > performance degradation with this setup [2].
> >> > > > > >
> >> > > > > > Please keep us posted of your findings. Thanks!
> >> > > > > >
> >> > > > > > Martin
> >> > > > > >
> >> > > > > > [1] https://github.com/martinweiler/job-service-refactor-test/tree/mongodb
> >> > > > > > [2] https://drive.google.com/file/d/1NfacXaxJlgRMw4OQ5S20cvkzvaUKUVFj/view?usp=sharing
> >> > > > > >
> >> > > > > > ________________________________________
> >> > > > > > From: Francisco Javier Tirado Sarti <ftira...@redhat.com>
> >> > > > > > Sent: Wednesday, February 7, 2024 11:40 AM
> >> > > > > > To: dev@kie.apache.org
> >> > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with
> >> > > > > > data-index persistence addon
> >> > > > > >
> >> > > > > > Yes, it could be index degradation because of size, but I
> >> > > > > > believe (I might be wrong) the DB is too small (yet) for that.
> >> > > > > > But eventually, when the DB is huge enough, Postgres will
> >> > > > > > unavoidably behave like the graph that Martin sent.
> >> > > > > > Since I believe we are not huge enough (yet), let's rule out
> >> > > > > > another issue by analysing the SQL logs (I requested those from
> >> > > > > > Martin offline and he is kindly going to collect them).
> >> > > > > > Also, I'm curious to know whether Mongo behaves in the same way.
> >> > > > > >
> >> > > > > > On Wed, Feb 7, 2024 at 7:25 PM Enrique Gonzalez Martinez <
> >> > > > > > egonza...@apache.org> wrote:
> >> > > > > >
> >> > > > > > > Hi Francisco,
> >> > > > > > > I would highly recommend checking the indexes and how the
> >> > > > > > > updates work in data index, to avoid full table scans and
> >> > > > > > > locking the whole table. Some DBs are very sensitive to that.
> >> > > > > > >
> >> > > > > > > On Wed, Feb 7, 2024, 18:41, Francisco Javier Tirado Sarti <
> >> > > > > > > ftira...@redhat.com> wrote:
> >> > > > > > >
> >> > > > > > > > Hi Martin,
> >> > > > > > > > While I analyze the data, let me ask whether it is possible
> >> > > > > > > > to perform another check (similar in a way to disabling
> >> > > > > > > > data-index, as you did). Can you switch to MongoDB
> >> > > > > > > > persistence and check whether the same degradation that is
> >> > > > > > > > there for Postgres remains?
> >> > > > > > > > I do not know if this is feasible, but it would certainly
> >> > > > > > > > indicate whether the problem is in the Postgres storage
> >> > > > > > > > layer, and I do not have a clear prediction of what we will
> >> > > > > > > > see when doing this switch.
> >> > > > > > > >
> >> > > > > > > > On Wed, Feb 7, 2024 at 6:37 PM Martin Weiler <mwei...@ibm.com.invalid> wrote:
> >> > > > > > > >
> >> > > > > > > > > Hi Francisco,
> >> > > > > > > > >
> >> > > > > > > > > thanks for your work on this important topic!
> >> > > > > > > > >
> >> > > > > > > > > I would like to share some test results here, which might
> >> > > > > > > > > help to improve the codebase even further. I am using the
> >> > > > > > > > > JMeter-based test case from Pere and Enrique (thanks
> >> > > > > > > > > guys!) [1], which uses a load of 30 threads to
> >> > > > > > > > >
> >> > > > > > > > > 1) start a new process instance (POST)
> >> > > > > > > > > 2) retrieve tasks for a user (GET)
> >> > > > > > > > > 3) fetch task details (GET)
> >> > > > > > > > > 4) complete a task (POST)
> >> > > > > > > > > 5) execute a query on data-audit
> >> > > > > > > > >
> >> > > > > > > > > With this test setup, I noticed that the performance of
> >> > > > > > > > > the POST requests, in particular the one to start a new
> >> > > > > > > > > process instance, degrades over time - see graph [2]. If I
> >> > > > > > > > > run the same test without data-index, there is no such
> >> > > > > > > > > performance degradation [3]. You can find a thread dump
> >> > > > > > > > > captured a few minutes into the first test here [4]; it
> >> > > > > > > > > might help to see some of the contention points.
> >> > > > > > > > >
> >> > > > > > > > > I'd appreciate it if you could take a look and see if
> >> > > > > > > > > there is something that can be further improved based on
> >> > > > > > > > > your previous work. If you need any additional data, let
> >> > > > > > > > > me know, but otherwise it is straightforward to run the
> >> > > > > > > > > JMeter test as well.
> >> > > > > > > > >
> >> > > > > > > > > Thanks,
> >> > > > > > > > > Martin
> >> > > > > > > > >
> >> > > > > > > > > [1] https://github.com/pefernan/job-service-refactor-test/
> >> > > > > > > > > [2] https://drive.google.com/file/d/1Gqn-ixE05kXv2jdssAUlnMuUVcHxIYZ0/view?usp=sharing
> >> > > > > > > > > [3] https://drive.google.com/file/d/10gVNyb4JYg_bA18bNhY9dEDbPn3TOxL7/view?usp=sharing
> >> > > > > > > > > [4] https://drive.google.com/file/d/1jVrtsO49gCvUlnaC9AUAtkVKTm4PbdUv/view?usp=sharing
> >> > > > > > > > >
> >> > > > > > > > > ________________________________________
> >> > > > > > > > > From: Francisco Javier Tirado Sarti <ftira...@redhat.com>
> >> > > > > > > > > Sent: Wednesday, January 17, 2024 9:13 AM
> >> > > > > > > > > To: dev@kie.apache.org
> >> > > > > > > > > Cc: Pere Fernandez Perez
> >> > > > > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues
> >> > > > > > > > > with data-index persistence addon
> >> > > > > > > > >
> >> > > > > > > > > Hi Alex,
> >> > > > > > > > > I did not take timings (which depend on a number of
> >> > > > > > > > > variables that change drastically between environments),
> >> > > > > > > > > but I verified that the number of updates has been reduced
> >> > > > > > > > > drastically without losing functionality, which is
> >> > > > > > > > > objectively a good thing. Before the change, for every
> >> > > > > > > > > node executed we had an update for every node previously
> >> > > > > > > > > executed, so if a process had 50 nodes to execute, we were
> >> > > > > > > > > performing nearly 50*51/2 updates, which gives us a total
> >> > > > > > > > > of 1275 updates. Now we have just one for every node being
> >> > > > > > > > > executed, implying a total of 50 updates.
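
The arithmetic above can be checked with a few lines of Python. This is only a
worked example of the counting argument, under the assumption that executing
the k-th node used to cause k updates (one for itself plus one for each of the
k - 1 nodes executed before it); the function names are made up.

```python
def updates_before(n):
    """Total updates when the k-th executed node rewrites all k nodes so far."""
    return sum(range(1, n + 1))  # 1 + 2 + ... + n = n * (n + 1) / 2


def updates_after(n):
    """Total updates when each executed node causes exactly one write."""
    return n


def reduction_factor(n):
    """How many times fewer updates the new scheme performs."""
    return updates_before(n) / updates_after(n)
```

For a 50-node process this gives 1275 updates before the change and 50 after,
a reduction by a factor of 25.5, and the gap widens linearly with the number of
nodes.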
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > On Wed, Jan 17, 2024 at 3:18 PM Alex Porcelli <a...@porcelli.me> wrote:
> >> > > > > > > > >
> >> > > > > > > > > > Francisco,
> >> > > > > > > > > >
> >> > > > > > > > > > I noticed that your PR has been merged, but I was
> >> > > > > > > > > > expecting (at least that was my understanding from this
> >> > > > > > > > > > thread) that some benchmark data would be shared before
> >> > > > > > > > > > merging, to assess the cost/benefit of a change of such
> >> > > > > > > > > > a decent size.
> >> > > > > > > > > >
> >> > > > > > > > > > Do you have any information to share?
> >> > > > > > > > > >
> >> > > > > > > > > > On Sat, Dec 23, 2023 at 4:02 AM Francisco Javier Tirado Sarti
> >> > > > > > > > > > <ftira...@redhat.com> wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > Yes, as intended, now we have one select and one
> >> > > > > > > > > > > insert/update per node event.
> >> > > > > > > > > > > I marked the PR as ready for review and gave @Pere
> >> > > > > > > > > > > Fernandez Perez <pefer...@redhat.com> permission on
> >> > > > > > > > > > > the branch so he can edit it over the next two weeks
> >> > > > > > > > > > > (I'll be on PTO) if desired, before merging.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Thu, Dec 21, 2023 at 5:58 PM Alex Porcelli <a...@porcelli.me> wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > > Cool, thank you Francisco!
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Did you manage to get some preliminary data about
> >> > > > > > > > > > > > improvements?
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > On Thu, Dec 21, 2023 at 11:52 AM Francisco Javier Tirado Sarti
> >> > > > > > > > > > > > <ftira...@redhat.com> wrote:
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Yes, after some delay because of the Quarkus 3
> >> > > > > > > > > > > > > migration. I'm refining this draft PR:
> >> > > > > > > > > > > > > https://github.com/apache/incubator-kie-kogito-apps/pull/1941
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > On Thu, Dec 21, 2023 at 5:48 PM Alex Porcelli <a...@porcelli.me> wrote:
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Any update or new findings on this topic?
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > On Tue, Nov 28, 2023 at 8:38 AM Francisco Javier Tirado Sarti
> >> > > > > > > > > > > > > > <ftira...@redhat.com> wrote:
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > Hi Alex,
> >> > > > > > > > > > > > > > > After considering different options to improve
> >> > > > > > > > > > > > > > > performance, we feel that it is time to "partially"
> >> > > > > > > > > > > > > > > move away from the current Map-style interface
> >> > > > > > > > > > > > > > > (https://github.com/apache/incubator-kie-kogito-apps/blob/main/persistence-commons/persistence-commons-api/src/main/java/org/kie/kogito/persistence/api/Storage.java),
> >> > > > > > > > > > > > > > > which was shared with Trusty, to one more suitable
> >> > > > > > > > > > > > > > > for use with a relational DB like PostgreSQL (but
> >> > > > > > > > > > > > > > > still compatible with big-table DBs).
> >> > > > > > > > > > > > > > > The idea is to replace the generic Storage
> >> > > > > > > > > > > > > > > interface with four specific interfaces (which will
> >> > > > > > > > > > > > > > > inherit from a common one that keeps the query part
> >> > > > > > > > > > > > > > > as it is, with get and query methods) that will
> >> > > > > > > > > > > > > > > include the required modification operations for
> >> > > > > > > > > > > > > > > the four DataIndex "domains": processinstance,
> >> > > > > > > > > > > > > > > usertask, processdefinitions and jobs. Those
> >> > > > > > > > > > > > > > > interfaces will define methods like addNode,
> >> > > > > > > > > > > > > > > addVariable, updateTask, addAttachment... that will
> >> > > > > > > > > > > > > > > allow the persistence layer implementation to
> >> > > > > > > > > > > > > > > update just the needed info in the DB (for example,
> >> > > > > > > > > > > > > > > for addNode in Postgres, just insert a row into the
> >> > > > > > > > > > > > > > > nodes table; for addNode in Mongo, basically the
> >> > > > > > > > > > > > > > > same atomic upsert operation that is currently
> >> > > > > > > > > > > > > > > done). Therefore, we increase performance for
> >> > > > > > > > > > > > > > > Postgres and keep the current one for Mongo. The
> >> > > > > > > > > > > > > > > current DB schemas won't be touched.
> >> > > > > > > > > > > > > > > Since the code change is large, I do not think I'll
> >> > > > > > > > > > > > > > > be able to have the PR ready until next week.
> >> > > > > > > > > > > > > > > But before starting, please let me know if that
> >> > > > > > > > > > > > > > > approach is fine for you.
> >> > > > > > > > > > > > > > > Best regards.
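
A rough sketch of the shape of that proposal, written in Python only to keep
it short and runnable; it is not the actual Java API from the PR. The class
names `StorageFetcher`, `ProcessInstanceStorage` and
`InMemoryProcessInstanceStorage` are hypothetical, the `query` method is left
out, and the in-memory dictionaries stand in for database tables. Only the
operation names addNode and addVariable come from the message above.

```python
from abc import ABC, abstractmethod


class StorageFetcher(ABC):
    """Read-only part that every domain-specific storage would inherit."""

    @abstractmethod
    def get(self, key): ...


class ProcessInstanceStorage(StorageFetcher):
    """Process-instance domain: fine-grained modification operations."""

    @abstractmethod
    def add_node(self, process_instance_id, node): ...

    @abstractmethod
    def add_variable(self, process_instance_id, name, value): ...


class InMemoryProcessInstanceStorage(ProcessInstanceStorage):
    def __init__(self):
        self.nodes = {}      # (process_instance_id, node_id) -> node
        self.variables = {}  # (process_instance_id, name) -> value
        self.writes = 0      # rows touched, counted for illustration

    def get(self, key):
        return {
            "id": key,
            "nodes": [n for (pid, _), n in self.nodes.items() if pid == key],
            "variables": {name: value
                          for (pid, name), value in self.variables.items()
                          if pid == key},
        }

    def add_node(self, process_instance_id, node):
        # Touches exactly one row, like inserting one row into a nodes table.
        self.nodes[(process_instance_id, node["id"])] = node
        self.writes += 1

    def add_variable(self, process_instance_id, name, value):
        self.variables[(process_instance_id, name)] = value
        self.writes += 1
```

With this shape, recording three nodes and one variable costs four writes,
instead of rewriting the whole process instance on every event.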
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 6:55 PM Alex Porcelli <a...@porcelli.me> wrote:
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > Thank you Francisco for digging deeper into this…
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > Looking forward to seeing the results of your
> >> > > > > > > > > > > > > > > > suggested improvements.
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 9:40 AM Francisco Javier Tirado Sarti <
> >> > > > > > > > > > > > > > > > ftira...@redhat.com> wrote:
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > I forgot to attach the queries
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 3:04 PM Francisco Javier Tirado Sarti <
> >> > > > > > > > > > > > > > > > > ftira...@redhat.com> wrote:
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > >> Hi,
> >> > > > > > > > > > > > > > > > >> A brief update on this topic.
> >> > > > > > > > > > > > > > > > >> After doing a simple test with the example
> >> > > > > > > > > > > > > > > > >> https://github.com/apache/incubator-kie-kogito-examples/tree/stable/serverless-workflow-examples/serverless-workflow-data-index-quarkus
> >> > > > > > > > > > > > > > > > >> the number of updates on the Nodes table is
> >> > > > > > > > > > > > > > > > >> n*n, so we manage to obtain a perfect
> >> > > > > > > > > > > > > > > > >> quadratic performance degradation. The
> >> > > > > > > > > > > > > > > > >> problem is worse in the case of Serverless
> >> > > > > > > > > > > > > > > > >> Workflow than in BPMN because the number of
> >> > > > > > > > > > > > > > > > >> nodes is greater than the number of states.
> >> > > > > > > > > > > > > > > > >> In that example n is 16, but for a more
> >> > > > > > > > > > > > > > > > >> complex workflow it would certainly be
> >> > > > > > > > > > > > > > > > >> larger.
> >> > > > > > > > > > > > > > > > >> I think that this is more related to how we
> >> > > > > > > > > > > > > > > > >> are handling JPA in the code, in particular
> >> > > > > > > > > > > > > > > > >> the mapping from model to entity (basically
> >> > > > > > > > > > > > > > > > >> JPA is blind and has to update all nodes for
> >> > > > > > > > > > > > > > > > >> every write because it believes each node has
> >> > > > > > > > > > > > > > > > >> been updated, although it has not), than to
> >> > > > > > > > > > > > > > > > >> an issue in the table definition.
> >> > > > > > > > > > > > > > > > >> In fact, when using JPA, separating the
> >> > > > > > > > > > > > > > > > >> server model from the JPA entity is not a
> >> > > > > > > > > > > > > > > > >> good idea, especially if the entity contains
> >> > > > > > > > > > > > > > > > >> collections. I will try to change that
> >> > > > > > > > > > > > > > > > >> without breaking anything.
> >> > > > > > > > > > > > > > > > >>
> >> > > > > > > > > > > > > > > > >> On Wed, Nov 22, 2023 at 12:10 PM Enrique Gonzalez Martinez <
> >> > > > > > > > > > > > > > > > >> egonza...@apache.org> wrote:
> >> > > > > > > > > > > > > > > > >>
> >> > > > > > > > > > > > > > > > >>> After the events split, you will now need
> >> > > > > > > > > > > > > > > > >>> to create a node instance model, making it
> >> > > > > > > > > > > > > > > > >>> independent from the process instance.
> >> > > > > > > > > > > > > > > > >>> That should do the trick.
> >> > > > > > > > > > > > > > > > >>>
> >> > > > > > > > > > > > > > > > >>> Regarding deleting/inserting, it was fixed
> >> > > > > > > > > > > > > > > > >>> at some point.
> >> > > > > > > > > > > > > > > > >>>
> >> > > > > > > > > > > > > > > > >>> On Tue, Nov 21, 2023 at 20:22, Francisco Javier Tirado Sarti
> >> > > > > > > > > > > > > > > > >>> (<ftira...@redhat.com>) wrote:
> >> > > > > > > > > > > > > > > > >>> >
> >> > > > > > > > > > > > > > > > >>> > Hi Martin,
> >> > > > > > > > > > > > > > > > >>> > I have a task to review the performance
> >> > > > > > > > > > > > > > > > >>> > of ProcessInstanceNodeDataEventMerger.
> >> > > > > > > > > > > > > > > > >>> > My idea is to reduce the number of
> >> > > > > > > > > > > > > > > > >>> > deletes/inserts when processing events
> >> > > > > > > > > > > > > > > > >>> > and try to do it incrementally.
> >> > > > > > > > > > > > > > > > >>> > That should improve performance.
> >> > > > > > > > > > > > > > > > >>> > PS:
> >> > > > > > > > > > > > > > > > >>> > I was planning to send an e-mail
> >> > > > > > > > > > > > > > > > >>> > tomorrow announcing that, in case you
> >> > > > > > > > > > > > > > > > >>> > were already working on a fix for it. I
> >> > > > > > > > > > > > > > > > >>> > assume you are not, and I will be
> >> > > > > > > > > > > > > > > > >>> > sending a PR soon.
> >> > > > > > > > > > > > > > > > >>> >
> >> > > > > > > > > > > > > > > > >>> > On Tue, Nov 21, 2023 at 6:09 PM Martin Weiler <mwei...@ibm.com.invalid> wrote:
> >> > > > > > > > > > > > > > > > >>> >
> >> > > > > > > > > > > > > > > > >>> > > I looked into the new examples using
> >> > > > > > > > > > > > > > > > >>> > > the data-index persistence addon -
> >> > > > > > > > > > > > > > > > >>> > > Neus' PR#1813 [1] for serverless and
> >> > > > > > > > > > > > > > > > >>> > > Pere's branch [2] for workflow (great
> >> > > > > > > > > > > > > > > > >>> > > job both!) - and they work without
> >> > > > > > > > > > > > > > > > >>> > > issues using single requests. However,
> >> > > > > > > > > > > > > > > > >>> > > under some load (I used 'ab' for
> >> > > > > > > > > > > > > > > > >>> > > testing with a light concurrency of 10
> >> > > > > > > > > > > > > > > > >>> > > parallel requests) I ran into the
> >> > > > > > > > > > > > > > > > >>> > > following problems:
> >> > > > > > > > > > > > > > > > >>> > >
> >> > > > > > > > > > > > > > > > >>> > > (1) Large number of insert/delete
> >> > > > > > > > > > > > > > > > >>> > > calls (e.g. for tables such as nodes,
> >> > > > > > > > > > > > > > > > >>> > > definitions, etc.)
> >> > > > > > > > > > > > > > > > >>> > >
> >> > > > > > > > > > > > > > > > >>> > > (2) Hibernate OptimisticLockExceptions
> >> > > > > > > > > > > > > > > > >>> > > / StaleStateExceptions
> >> > > > > > > > > > > > > > > > >>> > >
> >> > > > > > > > > > > > > > > > >>> > > (3) DB deadlocks
> >> > > > > > > > > > > > > > > > >>> > >
> >> > > > > > > > > > > > > > > > >>> > > (4) Error responses, slow response
> >> > > > > > > > > > > > > > > > >>> > > times
> >> > > > > > > > > > > > > > > > >>> > >
> >> > > > > > > > > > > > > > > > >>> > > The reason I am reaching out with this
> >> > > > > > > > > > > > > > > > >>> > > topic here is to find out whether we
> >> > > > > > > > > > > > > > > > >>> > > are aware of this issue, and whether
> >> > > > > > > > > > > > > > > > >>> > > someone is already looking into it or
> >> > > > > > > > > > > > > > > > >>> > > has been assigned to it.
> >> > > > > > > > > > > > > > > > >>> > >
> >> > > > > > > > > > > > > > > > >>> > > Thanks,
> >> > > > > > > > > > > > > > > > >>> > > Martin
> >> > > > > > > > > > > > > > > > >>> > >
> >> > > > > > > > > > > > > > > > >>> > > [1] https://github.com/apache/incubator-kie-kogito-examples/pull/1813
> >> > > > > > > > > > > > > > > > >>> > > [2] https://github.com/pefernan/kogito-examples/tree/example_data-index_persistence
> >> > > > > > > > > > > > > > > > >>> > >
> >> > > > > > > > > > > > > > > > >>> > >
> >> > > > > > > > > > > > > > > > >>> > > ---------------------------------------------------------------------
> >> > > > > > > > > > > > > > > > >>> > > To unsubscribe, e-mail: dev-unsubscr...@kie.apache.org
> >> > > > > > > > > > > > > > > > >>> > > For additional commands, e-mail: dev-h...@kie.apache.org
> >> > > > > > > > > > > > > > > > >>> > >
> >> > > > > > > > > > > > > > > > >>> > >
>
