Re: [DISCUSSION] Performance issues with data-index persistence addon

Enrique Gonzalez Martinez Wed, 21 Feb 2024 03:35:57 -0800

Hi Francisco

The discussion we need to have, before deciding how to achieve certain
features, is about the overall user experience. If you want to clean it up
that way, it is fine by me. I have nothing against setting a policy for some
sort of clean-up of processes completed some time ago.


As we discussed previously, the data index is a snapshot of the last state
of the process instance, including completed ones, but nothing was said
about when a completed instance needs to be cleaned up, so any policy is
welcome. How to achieve that policy is something completely different.

The main problem is how every microservice is exposing that API to the end
user or consumer (other system) and which system is going to be the façade
for complex deployments and operations. That is the discussion I was
mentioning before.

What I want to avoid is setting different policies among components; we
should strive, as much as possible, to offer certain capabilities in the
same fashion, e.g. the clean-up mechanism.
In the same way, I want to avoid making the current systems more complex
than they are. So far each of them has one simple responsibility, and I
would be against mixing, for instance, the job service with the data index.
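
To make the policy question concrete, here is a minimal sketch of the
simplest option from the quoted message: a fixed minimum duration after
which completed process instances become eligible for purge. Everything in
it is an assumption for illustration only: the ProcessInstanceRow type, the
select_purgeable function and the field names are hypothetical, state 2 is
taken to mean "completed" (as the log quoted further down suggests), and
Python is used just for brevity. It is not the data-index implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional

COMPLETED = 2  # assumption: state 2 means "completed", as in the quoted log


@dataclass(frozen=True)
class ProcessInstanceRow:
    """Hypothetical, simplified view of a row in the data-index store."""
    id: str
    state: int
    end: Optional[datetime]  # None while the instance is still active


def select_purgeable(rows: Iterable[ProcessInstanceRow],
                     retention: timedelta,
                     now: datetime) -> List[str]:
    """Return ids of completed instances that ended more than `retention` ago.

    Active instances are never selected, however old they are.
    """
    cutoff = now - retention
    return [r.id for r in rows
            if r.state == COMPLETED and r.end is not None and r.end < cutoff]


if __name__ == "__main__":
    now = datetime(2024, 2, 21, tzinfo=timezone.utc)
    rows = [
        ProcessInstanceRow("old-completed", COMPLETED, now - timedelta(days=40)),
        ProcessInstanceRow("recent-completed", COMPLETED, now - timedelta(days=1)),
        ProcessInstanceRow("old-active", 1, None),
    ]
    print(select_purgeable(rows, timedelta(days=30), now))
```

Whatever component ends up owning the purge, the selection rule itself can
stay this small; the open question remains which API exposes the retention
value and who is allowed to change it.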

On Wed, 21 Feb 2024 at 12:13, Francisco Javier Tirado Sarti (<
[email protected]>) wrote:

> Hi Enrique,
> In the case of data index I think the data to be purged is finished process
> instances (I do not think we should remove process instance that has been
> alive for ages, even if it is very likely they are not going to be ever
> completed)
> Once you delete those process instances, you also delete the associated
> user tasks and jobs.
> Therefore the problem is relatively simple: being able to configure how
> long a completed process instance should remain in the data index
> database. We can take a simple approach: a property with a minimum duration
> that cannot be changed once data index is started; a slightly more complex
> one: the same property, but watched so that it reacts to changes; or the
> full suite: an admin API to change the policy at any moment.
> I think this discussion is worth having.
>
> On Wed, Feb 21, 2024 at 6:15 AM Enrique Gonzalez Martinez <
> [email protected]> wrote:
>
> > Hi Martin, the main problem with the purge is that the policy on what
> > tech to use, and on the future components, is still unclear.
> >
> > Recently we had a discussion about proposing GraphQL for this sort of admin
> > tasks. So far, for subsystems we have been using REST endpoints (like update
> > timers, modify human task or change processes). There is one exception,
> > the gateway, which is pure GraphQL and uses GraphQL for everything,
> > performing complex operations under the hood.
> >
> > This has somewhat frozen the purge for data audit for a bit. The proposal
> > was to use REST endpoints to do the clean-up in the component and offer the
> > GraphQL counterpart in the gateway, promoting it to a first-class citizen
> > component instead of having it embedded in the data index.
> >
> > I would suggest coming up with a policy first, at least regarding the
> > convention by which every component should address this.
> >
> >
> > On Tue, 20 Feb 2024, 23:39, Martin Weiler <[email protected]>
> > wrote:
> >
> > > IMO, it is good to have this discussion around data sanity now instead of
> > > putting it off until later when data has already accumulated in production
> > > environments.
> > >
> > > Based on the input here, we are dealing with three types of data:
> > > 1. Runtime data - active instances only, engine cleans up the data
> > > automatically at process instance end
> > > 2. Historic log data - data created by data-audit intended for long-term
> > > storage
> > > 3. Data-index data - somehow this data falls in between the two
> > > aforementioned categories, with the idea of the data being "recent", but
> > > not restricted to active instances only
> > >
> > > We'd need purge strategies for both #2 and #3 (perhaps different ones, or
> > > with different config settings) in order to prevent unlimited data growth.
> > >
> > > ________________________________________
> > > From: Enrique Gonzalez Martinez <[email protected]>
> > > Sent: Monday, February 19, 2024 7:11 AM
> > > To: [email protected]
> > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with data-index
> > > persistence addon
> > >
> > > Hi Francisco,
> > > To give you more context about this.
> > >
> > > STP is a concept: a process with certain constraints, namely no persistence
> > > and returning the outcome in the call (sync execution with no idle states).
> > > It was a requirement from a user in the past. One of the requirements was
> > > leaving no trail. In v7 that was easy because you could disable the audit in
> > > that case. Actually, we can do the same here as we did in v7, as you can
> > > add or remove the index just by adding or removing deps.
> > >
> > > We have the same outcome with different approaches and STP is already
> > > delivered.
> > >
> > > On Mon, 19 Feb 2024 at 14:46, Francisco Javier Tirado Sarti (<
> > > [email protected]>) wrote:
> > >
> > > > Regarding STP (which is not a concept that we have in the code; I mean,
> > > > STP are processes just as non-STP are), I guess that, like all processes,
> > > > they were kept in DataIndex once completed because users wanted (and still
> > > > want) to check the result once the call had been performed. If we want to
> > > > leave no trace of them in DataIndex for some reason, we will need to make
> > > > it a Runtimes concept so DataIndex can handle them in a different way.
> > > >
> > > > On Mon, Feb 19, 2024 at 2:27 PM Enrique Gonzalez Martinez <
> > > > [email protected]> wrote:
> > > >
> > > > > Alex:
> > > > > Right now the data index is working in the same way as it did in v7 with
> > > > > the emitters. The only difference between the two implementations is that
> > > > > here the storage is PostgreSQL instead of Elasticsearch. You are right
> > > > > that it is a snapshot of the last state of the process, but we never
> > > > > defined how long that data would be kept alive. Honestly, I am happy right
> > > > > now with the way it works. The clean-up mechanism is still TBD because we
> > > > > still need to discuss other stuff first.
> > > > >
> > > > >
> > > > > Regarding STP, the idea is to leave no trail because you can get the
> > > > > outcome directly from the call. It was defined like that in v7. So there
> > > > > is no use for the index or the audit.
> > > > >
> > > > > On Mon, 19 Feb 2024, 14:13, Francisco Javier Tirado Sarti <
> > > > > [email protected]> wrote:
> > > > >
> > > > > > Hi Alex,
> > > > > > There has been some confusion about the purpose of DataIndex. To be honest,
> > > > > > I believed it had already been sorted out, but your e-mail makes me think
> > > > > > that is not the case ;). I will let Kris clarify that with you. My view is
> > > > > > that data-index is a way to query recently closed and active processes (the
> > > > > > key here is the definition of recently, which in my opinion should be
> > > > > > configurable).
> > > > > > But, besides that discussion and being pragmatic, keeping finished process
> > > > > > instances "for a while" in DataIndex was the only way for users to query
> > > > > > the result of straight-through processes. That's a function that cannot be
> > > > > > removed right now.
> > > > > >
> > > > > > On Mon, Feb 19, 2024 at 1:33 PM Alex Porcelli <[email protected]> wrote:
> > > > > > >
> > > > > > > if data index was supposed to provide a snapshot view of the process
> > > > > > > instance… why do we keep it after the process instance is finished?
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Feb 19, 2024 at 7:12 AM Francisco Javier Tirado Sarti <
> > > > > > > [email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Martin.
> > > > > > > > After taking a deeper look at this, I realize that the behaviour is the
> > > > > > > > expected one.
> > > > > > > > Runtimes DB does not track the completed process instance (that's what the
> > > > > > > > JDBCProcessInstances warning is telling us), but DataIndex, as expected, is
> > > > > > > > tracking it in the processes and nodes tables. And yes, it will grow over
> > > > > > > > time.
> > > > > > > > What we need is some configurable purge mechanism for DataIndex, so it
> > > > > > > > eventually removes older completed process instances.
> > > > > > > >
> > > > > > > > On Tue, Feb 13, 2024 at 12:59 PM Francisco Javier Tirado Sarti <
> > > > > > > > [email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi Martin,
> > > > > > > > > Good catch! Looks like the skipping performed for process instances is
> > > > > > > > > not applied to node instances. Something we definitely need to review on
> > > > > > > > > the runtimes side.
> > > > > > > > >
> > > > > > > > > On Mon, Feb 12, 2024 at 11:59 PM Martin Weiler <[email protected]>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > >> On a somewhat related note, testing a simple workflow (start -> script
> > > > > > > > > >> node -> end), I see the following messages in the logs:
> > > > > > > > > >> 2024-02-12 22:49:50,493 28758dde544c WARN
> > > > > > > > > >> [org.kie.kogito.persistence.jdbc.JDBCProcessInstances:-1]
> > > > > > > > > >> (executor-thread-3) Skipping create of process instance id:
> > > > > > > > > >> 7083088e-b899-47cb-b85c-5d9ccb0aa166, state: 2
> > > > > > > > > >>
> > > > > > > > > >> So far, so good. And I'd expect to see no trace of this process in the
> > > > > > > > > >> database if I don't have data audit enabled.
> > > > > > > > > >>
> > > > > > > > > >> However, the 'processes' table contains a row with state=2, with related
> > > > > > > > > >> entries in the 'nodes' table. In a load test, I see these tables grow
> > > > > > > > > >> significantly over time. Am I missing something to have these entries
> > > > > > > > > >> cleaned up automatically?
> > > > > > > > >>
> > > > > > > > >> ________________________________________
> > > > > > > > >> From: Martin Weiler <[email protected]>
> > > > > > > > >> Sent: Monday, February 12, 2024 3:40 PM
> > > > > > > > >> To: [email protected]
> > > > > > > > > >> Subject: [EXTERNAL] RE: [DISCUSSION] Performance issues with data-index
> > > > > > > > > >> persistence addon
> > > > > > > > > >>
> > > > > > > > > >> Thanks everyone for your input. Based on this discussion, I opened the
> > > > > > > > > >> following PR:
> > > > > > > > > >> https://github.com/apache/incubator-kie-kogito-apps/pull/1985
> > > > > > > > > >>
> > > > > > > > > >> With this change, the performance seems to be stable over time:
> > > > > > > > > >> https://drive.google.com/file/d/1zkullvfrJpRp7TRjxDa41ok6kEIR7Fty/view?usp=sharing
> > > > > > > > >>
> > > > > > > > >> Martin
> > > > > > > > >>
> > > > > > > > >> ________________________________________
> > > > > > > > >> From: Gonzalo Muñoz <[email protected]>
> > > > > > > > >> Sent: Friday, February 9, 2024 9:42 AM
> > > > > > > > >> To: [email protected]
> > > > > > > > > >> Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with data-index
> > > > > > > > > >> persistence addon
> > > > > > > > > >>
> > > > > > > > > >> Great work Francisco,
> > > > > > > > > >> Martin, take a look at this link with some related tips (in case you find
> > > > > > > > > >> it useful):
> > > > > > > > > >> https://www.cybertec-postgresql.com/en/index-your-foreign-key/
> > > > > > > > > >>
> > > > > > > > > >> On Fri, 9 Feb 2024 at 17:20, Francisco Javier Tirado Sarti (<
> > > > > > > > > >> [email protected]>) wrote:
> > > > > > > > >>
> > > > > > > > > >> > For the time being, we will keep JPA till we exhaust all possibilities;
> > > > > > > > > >> > let's call switching from JPA to JDBC our hidden plan B ;)
> > > > > > > > > >> > I already told Martin, but so that everyone knows: just after writing
> > > > > > > > > >> > the previous email, I thought "what if Postgres is not automatically
> > > > > > > > > >> > indexing foreign keys like MySQL?" and, eureka.
> > > > > > > > > >> > Postgres doc:
> > > > > > > > > >> > https://www.postgresql.org/docs/current/ddl-constraints.html
> > > > > > > > > >> > MySQL doc:
> > > > > > > > > >> > https://dev.mysql.com/doc/refman/8.0/en/constraint-foreign-key.html
> > > > > > > > > >> > These are the relevant excerpts:
> > > > > > > > > >> >
> > > > > > > > > >> > *Postgresql*
> > > > > > > > > >> > *A foreign key must reference columns that either are a primary key or form
> > > > > > > > > >> > a unique constraint, or are columns from a non-partial unique index. This
> > > > > > > > > >> > means that the referenced columns always have an index to allow efficient
> > > > > > > > > >> > lookups on whether a referencing row has a match. Since a DELETE of a row
> > > > > > > > > >> > from the referenced table or an UPDATE of a referenced column will require
> > > > > > > > > >> > a scan of the referencing table for rows matching the old value, it is
> > > > > > > > > >> > often a good idea to index the referencing columns too. Because this is not
> > > > > > > > > >> > always needed, and there are many choices available on how to index, the
> > > > > > > > > >> > declaration of a foreign key constraint does not automatically create an
> > > > > > > > > >> > index on the referencing columns.*
> > > > > > > > > >> > *Mysql*
> > > > > > > > > >> > *MySQL requires that foreign key columns be indexed; if you create a table
> > > > > > > > > >> > with a foreign key constraint but no index on a given column, an index is
> > > > > > > > > >> > created.*
> > > > > > > > > >> >
> > > > > > > > > >> > So I asked Martin to specifically create an index for the
> > > > > > > > > >> > process_instance_id column on the nodes table.
> > > > > > > > > >> > I think that will fix the problem detected in the thread dump.
> > > > > > > > > >> > The simpler process test to verify queries are fine still stands, though ;)
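
The effect of the missing index described in the message above can be seen
with a small, self-contained sketch. It uses SQLite from the Python standard
library only because it runs without a server; like PostgreSQL, SQLite does
not create an index on the referencing column of a foreign key. The table
and column names mirror the ones mentioned in this thread, but the schema is
a simplified assumption (not the real data-index schema) and the index name
is made up.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str) -> str:
    """Return the EXPLAIN QUERY PLAN detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE processes (id TEXT PRIMARY KEY, state INTEGER);
    CREATE TABLE nodes (
        id TEXT PRIMARY KEY,
        process_instance_id TEXT REFERENCES processes(id)
    );
""")

lookup = "SELECT id FROM nodes WHERE process_instance_id = 'some-id'"

# No index on the referencing column yet: the lookup scans the nodes table.
plan_before = query_plan(conn, lookup)

# Hypothetical index name; this is the kind of index asked for in the thread.
conn.execute(
    "CREATE INDEX idx_nodes_process_instance_id ON nodes (process_instance_id)"
)
plan_after = query_plan(conn, lookup)

print("before:", plan_before)
print("after: ", plan_after)
```

The same CREATE INDEX statement shape is valid on PostgreSQL; whether it
removes the contention seen in the thread dump still has to be confirmed by
the load test.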
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > > >> > On Fri, Feb 9, 2024 at 5:10 PM Tibor Zimányi <[email protected]> wrote:
> > > > > > > > >> >
> > > > > > > > > >> > > I always preferred pure JDBC over Hibernate myself, just for the sake of
> > > > > > > > > >> > > control of what is happening :) So I would not -1 that myself.
> > > > > > > > >> > >
> > > > > > > > >> > > Tibor
> > > > > > > > >> > >
> > > > > > > > > >> > > On Fri, 9 Feb 2024, 17:00 Francisco Javier Tirado Sarti <
> > > > > > > > > >> > > [email protected]> wrote:
> > > > > > > > >> > >
> > > > > > > > > >> > > > Hi,
> > > > > > > > > >> > > > Usually I do not want to talk about work in progress because preliminary
> > > > > > > > > >> > > > conclusions are pretty volatile but, well, there are a couple of things
> > > > > > > > > >> > > > that can be concluded from the really valuable information that Martin
> > > > > > > > > >> > > > provided.
> > > > > > > > > >> > > > 1) In order to be able to determine if the number of statements is larger
> > > > > > > > > >> > > > than expected, I asked Martin to test with a simpler process definition.
> > > > > > > > > >> > > > One with just three nodes: start, script and end. The script one should
> > > > > > > > > >> > > > change just one variable. This way we can analyze if the number of queries
> > > > > > > > > >> > > > is the expected one. From the single log (audit was activated then) my
> > > > > > > > > >> > > > conclusion is that the number of inserts/updates over processes and nodes
> > > > > > > > > >> > > > (there are a lot over task, which I would prefer to skip for now, baby
> > > > > > > > > >> > > > steps) is the expected one.
> > > > > > > > > >> > > > 2) Analysing the thread dump, we see around 15 threads executing this line at
> > > > > > > > > >> > > > org.kie.kogito.index.jpa.storage.ProcessInstanceEntityStorage.indexNode(ProcessInstanceEntityStorage.java:125),
> > > > > > > > > >> > > > so it's pretty clear which code is to be optimized ;). I'm evaluating
> > > > > > > > > >> > > > possibilities within JPA/Hibernate, but I'm starting to think that it might
> > > > > > > > > >> > > > be better to switch to JDBC and skip Hibernate. Our lives will be simpler,
> > > > > > > > > >> > > > especially with a relatively simple schema like ours (that would be my
> > > > > > > > > >> > > > recommendation if I were an external consultant).
> > > > > > > > >> > > >
> > > > > > > > > >> > > > On Fri, Feb 9, 2024 at 4:15 PM Tibor Zimányi <[email protected]> wrote:
> > > > > > > > >> > > >
> > > > > > > > > >> > > > > Hi,
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > this will be a bit off-topic. However, as far as performance goes, I think
> > > > > > > > > >> > > > > we should consider that we have string primary keys (IDs). I would expect
> > > > > > > > > >> > > > > database systems to be much better at indexing numeric keys than
> > > > > > > > > >> > > > > strings. I remember from the past, when I was working with DBs, that using
> > > > > > > > > >> > > > > strings as keys or indexes was a discouraged practice.
> > > > > > > > >> > > > >
> > > > > > > > >> > > > > Best regards,
> > > > > > > > >> > > > > Tibor
> > > > > > > > >> > > > >
> > > > > > > > > >> > > > > On Thu, 8 Feb 2024, 22:45 Martin Weiler <[email protected]> wrote:
> > > > > > > > >> > > > >
> > > > > > > > > >> > > > > > I changed the test to use MongoDB [1] and I don't see a performance
> > > > > > > > > >> > > > > > degradation with this setup [2].
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > Please keep us posted of your findings. Thanks!
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > Martin
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > [1] https://github.com/martinweiler/job-service-refactor-test/tree/mongodb
> > > > > > > > > >> > > > > > [2] https://drive.google.com/file/d/1NfacXaxJlgRMw4OQ5S20cvkzvaUKUVFj/view?usp=sharing
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > ________________________________________
> > > > > > > > > >> > > > > > From: Francisco Javier Tirado Sarti <[email protected]>
> > > > > > > > > >> > > > > > Sent: Wednesday, February 7, 2024 11:40 AM
> > > > > > > > > >> > > > > > To: [email protected]
> > > > > > > > > >> > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with data-index
> > > > > > > > > >> > > > > > persistence addon
> > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > Yes, it can be index degradation because of size, but I believe (I might be
> > > > > > > > > >> > > > > > wrong) the DB is too small (yet) for that.
> > > > > > > > > >> > > > > > But, eventually, when the DB is huge enough, Postgres will unavoidably
> > > > > > > > > >> > > > > > behave like the graph that Martin sent.
> > > > > > > > > >> > > > > > Since I believe we are not huge enough (yet), let's rule out another issue
> > > > > > > > > >> > > > > > by analysing the SQL logs (I requested those from Martin offline and he is
> > > > > > > > > >> > > > > > going to kindly collect them).
> > > > > > > > > >> > > > > > Also, I'm curious to know if Mongo behaves in the same way.
> > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > On Wed, Feb 7, 2024 at 7:25 PM Enrique Gonzalez Martinez <
> > > > > > > > > >> > > > > > [email protected]> wrote:
> > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > > Hi Francisco,
> > > > > > > > > >> > > > > > > I would highly recommend checking the indexes and how the updates work in
> > > > > > > > > >> > > > > > > data index, to avoid full table scans and locking the full table. Some DBs
> > > > > > > > > >> > > > > > > are very sensitive to that.
> > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > > > On Wed, 7 Feb 2024, 18:41, Francisco Javier Tirado Sarti <
> > > > > > > > > >> > > > > > > [email protected]> wrote:
> > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > > > > Hi Martin,
> > > > > > > > > >> > > > > > > > While I analyze the data, let me ask you if it is possible to perform
> > > > > > > > > >> > > > > > > > another check (similar in a way to disabling data-index like you do). Can
> > > > > > > > > >> > > > > > > > you switch to MongoDB persistence and check if the same degradation that is
> > > > > > > > > >> > > > > > > > there for Postgres remains?
> > > > > > > > > >> > > > > > > > I do not know if this is feasible, but it would certainly indicate whether
> > > > > > > > > >> > > > > > > > the problem is in the Postgres storage layer, and I do not have a clear
> > > > > > > > > >> > > > > > > > prediction of what we will see when doing this switch.
> > > > > > > > >> > > > > > > >
> > > > > > > > > >> > > > > > > > On Wed, Feb 7, 2024 at 6:37 PM Martin Weiler <[email protected]>
> > > > > > > > > >> > > > > > > > wrote:
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > > > Hi Francisco,
> > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > thanks for your work on this important topic!
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > I would like to share some test results here, which might help to improve
> > > > > > > > > >> > > > > > > > > the codebase even further. I am using the jmeter based test case from Pere
> > > > > > > > > >> > > > > > > > > and Enrique (thanks guys!) [1] which uses a load of 30 threads to
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > 1) start a new process instance (POST)
> > > > > > > > > >> > > > > > > > > 2) retrieve tasks for a user (GET)
> > > > > > > > > >> > > > > > > > > 3) fetch task details (GET)
> > > > > > > > > >> > > > > > > > > 4) complete a task (POST)
> > > > > > > > > >> > > > > > > > > 5) execute a query on data-audit
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > With this test setup, I noticed that the performance for the POST
> > > > > > > > > >> > > > > > > > > requests, in particular the one to start a new process instance, degrades
> > > > > > > > > >> > > > > > > > > over time - see graph [2]. If I run the same test without data-index, then
> > > > > > > > > >> > > > > > > > > there is no such performance degradation [3]. You can find a thread dump
> > > > > > > > > >> > > > > > > > > captured a few minutes into the first test here [4] that might help to see
> > > > > > > > > >> > > > > > > > > some of the contention points.
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > I'd appreciate if you could take a look and see if there is something that
> > > > > > > > > >> > > > > > > > > can be further improved based on your previous work. If you need any
> > > > > > > > > >> > > > > > > > > additional data, let me know, but otherwise it is straightforward to run
> > > > > > > > > >> > > > > > > > > the jmeter test as well.
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > Thanks,
> > > > > > > > >> > > > > > > > > Martin
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > [1]
> > > > > > > > >> https://github.com/pefernan/job-service-refactor-test/
> > > > > > > > >> > > > > > > > > [2]
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://drive.google.com/file/d/1Gqn-ixE05kXv2jdssAUlnMuUVcHxIYZ0/view?usp=sharing
> > > > > > > > >> > > > > > > > > [3]
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://drive.google.com/file/d/10gVNyb4JYg_bA18bNhY9dEDbPn3TOxL7/view?usp=sharing
> > > > > > > > >> > > > > > > > > [4]
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://drive.google.com/file/d/1jVrtsO49gCvUlnaC9AUAtkVKTm4PbdUv/view?usp=sharing
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > ________________________________________
> > > > > > > > > >> > > > > > > > > From: Francisco Javier Tirado Sarti <[email protected]>
> > > > > > > > > >> > > > > > > > > Sent: Wednesday, January 17, 2024 9:13 AM
> > > > > > > > > >> > > > > > > > > To: [email protected]
> > > > > > > > > >> > > > > > > > > Cc: Pere Fernandez Perez
> > > > > > > > > >> > > > > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with data-index
> > > > > > > > > >> > > > > > > > > persistence addon
> > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > Hi Alex,
> > > > > > > > > >> > > > > > > > > I did not take timings (which depend on a number of variables that
> > > > > > > > > >> > > > > > > > > drastically change between environments), but verified that the number of
> > > > > > > > > >> > > > > > > > > updates has been reduced drastically without losing functionality, which is
> > > > > > > > > >> > > > > > > > > objectively a good thing. Before the change, for every node executed, we
> > > > > > > > > >> > > > > > > > > had an update for every node previously executed, so if a process has 50
> > > > > > > > > >> > > > > > > > > nodes to execute, we were performing nearly 50*51/2 updates, which gives us
> > > > > > > > > >> > > > > > > > > a total of 1275 updates. Now we have just one for every node being
> > > > > > > > > >> > > > > > > > > executed, implying a total of 50 updates.
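
The arithmetic in the message above can be checked with a few lines. This is
only a counting sketch under the description given there (before the change,
executing the k-th node also rewrote the previously executed nodes; after
it, each node event costs one update). The function names are made up for
illustration, and it does not measure the real storage code.

```python
def updates_before(node_count: int) -> int:
    """Executing the k-th node triggers k updates: itself plus the k-1 earlier nodes."""
    return sum(range(1, node_count + 1))


def updates_after(node_count: int) -> int:
    """One update per executed node."""
    return node_count


if __name__ == "__main__":
    n = 50
    print(updates_before(n), updates_after(n))  # 1275 50
```

The first count grows quadratically with the number of nodes, n*(n+1)/2, and
the second linearly, so the gap widens quickly for larger process
definitions.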
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > On Wed, Jan 17, 2024 at 3:18 PM Alex Porcelli <[email protected]> wrote:
> > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > > Francisco,
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > I noticed that your PR has been merged, but I was expecting (at least
> > > > > > > > > >> > > > > > > > > > that was my understanding from this thread) that before merging some
> > > > > > > > > >> > > > > > > > > > benchmark data would be shared in advance - to assess the cost/benefit
> > > > > > > > > >> > > > > > > > > > of such a decent-size change.
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > Do you have any information to share?
> > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > On Sat, Dec 23, 2023 at 4:02 AM Francisco Javier Tirado Sarti <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > Yes, as intended, we now have one select and one insert/update per node
> > > > > > > > > >> > > > > > > > > > > > event.
> > > > > > > > > >> > > > > > > > > > > > I marked the PR as ready for review and gave @Pere Fernandez Perez
> > > > > > > > > >> > > > > > > > > > > > <[email protected]> permission on the branch so he can edit it, if
> > > > > > > > > >> > > > > > > > > > > > desired, in the next two weeks (I'll be on PTO) before merging.
> > > > > > > > >> > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > On Thu, Dec 21, 2023 at 5:58 PM Alex Porcelli <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > Cool, thank you Francisco!
> > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > Did you manage to get some preliminary data about improvements?
> > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > On Thu, Dec 21, 2023 at 11:52 AM Francisco Javier Tirado Sarti <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > Yes, after some delay because of the Quarkus 3 migration. I'm refining
> > > > > > > > > >> > > > > > > > > > > > > > this draft PR:
> > > > > > > > > >> > > > > > > > > > > > > > https://github.com/apache/incubator-kie-kogito-apps/pull/1941
> > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > On Thu, Dec 21, 2023 at 5:48 PM Alex Porcelli <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > Any update or new findings on this topic?
> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > On Tue, Nov 28, 2023 at 8:38 AM Francisco Javier Tirado Sarti <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > Hi Alex,
> > > > > > > > > >> > > > > > > > > > > > > > > > After considering different options to improve performance, we feel that
> > > > > > > > > >> > > > > > > > > > > > > > > > it is time to "partially" move away from the current Map-style interface
> > > > > > > > > >> > > > > > > > > > > > > > > > (
> > > > > > > > > >> > > > > > > > > > > > > > > > https://github.com/apache/incubator-kie-kogito-apps/blob/main/persistence-commons/persistence-commons-api/src/main/java/org/kie/kogito/persistence/api/Storage.java
> > > > > > > > > >> > > > > > > > > > > > > > > > ), which was shared with Trusty, to one more suitable for usage with a
> > > > > > > > > >> > > > > > > > > > > > > > > > relational DB like PostgreSQL (but still compatible with big table DBs).
> > > > > > > > > >> > > > > > > > > > > > > > > > The idea is to replace the generic Storage interface with four specific
> > > > > > > > > >> > > > > > > > > > > > > > > > interfaces (which will inherit from a common one that keeps the query
> > > > > > > > > >> > > > > > > > > > > > > > > > part as it is, with get and query methods) that will include the
> > > > > > > > > >> > > > > > > > > > > > > > > > required modification operations for the four DataIndex "domains":
> > > > > > > > > >> > > > > > > > > > > > > > > > processinstance, usertask, processdefinitions and jobs. Those interfaces
> > > > > > > > > >> > > > > > > > > > > > > > > > will define methods like addNode, addVariable, updateTask,
> > > > > > > > > >> > > > > > > > > > > > > > > > addAttachment, ... that will allow the persistence layer implementation
> > > > > > > > > >> > > > > > > > > > > > > > > > to update just the needed info in the DB (for example, for addNode in
> > > > > > > > > >> > > > > > > > > > > > > > > > Postgres, just insert a row into the nodes table; for addNode in Mongo,
> > > > > > > > > >> > > > > > > > > > > > > > > > basically the same atomic upsert operation that is currently done).
> > > > > > > > > >> > > > > > > > > > > > > > > > Therefore, we increase performance for Postgres and keep the current one
> > > > > > > > > >> > > > > > > > > > > > > > > > for Mongo. The current DB schemas won't be touched.
> > > > > > > > > >> > > > > > > > > > > > > > > > Since the code change is large, I do not think I'll be able to have the
> > > > > > > > > >> > > > > > > > > > > > > > > > PR ready till next week.
> > > > > > > > > >> > > > > > > > > > > > > > > > But before starting, please let me know if that approach is fine for you.
> > > > > > > > > >> > > > > > > > > > > > > > > > Best regards.
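
The interface split proposed in the message above might look roughly like the sketch below. It is only an illustration under assumptions: every type name (ReadOnlyStore, ProcessInstanceStore, ProcessInstanceView, InMemoryProcessInstanceStore) is hypothetical and not taken from the Kogito code base; only the method names addNode and addVariable and the get/query pair come from the thread. The in-memory class stands in for a real persistence layer and simply counts writes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Common read-only part shared by all domain interfaces (hypothetical name).
interface ReadOnlyStore<K, V> {
    V get(K key);
    List<V> query(Predicate<V> filter);
}

// One of the four domain-specific interfaces; the other domains
// (user tasks, process definitions, jobs) would follow the same pattern.
interface ProcessInstanceStore extends ReadOnlyStore<String, ProcessInstanceView> {
    void create(String processInstanceId);
    void addNode(String processInstanceId, String nodeId);
    void addVariable(String processInstanceId, String name, Object value);
}

// Minimal stand-in for the stored process instance state.
final class ProcessInstanceView {
    final String id;
    final List<String> nodes = new ArrayList<>();
    final Map<String, Object> variables = new HashMap<>();

    ProcessInstanceView(String id) {
        this.id = id;
    }
}

// In-memory stand-in for a persistence layer: each modification touches
// only the affected piece of state and counts as a single write.
final class InMemoryProcessInstanceStore implements ProcessInstanceStore {
    private final Map<String, ProcessInstanceView> instances = new HashMap<>();
    int writes = 0;

    private ProcessInstanceView require(String id) {
        ProcessInstanceView view = instances.get(id);
        if (view == null) {
            throw new IllegalArgumentException("Unknown process instance: " + id);
        }
        return view;
    }

    @Override
    public void create(String processInstanceId) {
        instances.put(processInstanceId, new ProcessInstanceView(processInstanceId));
        writes++;
    }

    @Override
    public void addNode(String processInstanceId, String nodeId) {
        require(processInstanceId).nodes.add(nodeId); // one row, not the whole collection
        writes++;
    }

    @Override
    public void addVariable(String processInstanceId, String name, Object value) {
        require(processInstanceId).variables.put(name, value);
        writes++;
    }

    @Override
    public ProcessInstanceView get(String key) {
        return instances.get(key);
    }

    @Override
    public List<ProcessInstanceView> query(Predicate<ProcessInstanceView> filter) {
        List<ProcessInstanceView> result = new ArrayList<>();
        for (ProcessInstanceView view : instances.values()) {
            if (filter.test(view)) {
                result.add(view);
            }
        }
        return result;
    }
}
```

The point of the design is that the caller states the intent (add this node) rather than handing over a whole replaced object, so a relational backend could translate each call into a single insert while a document store could keep its existing upsert.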
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 6:55 PM Alex Porcelli <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > Thank you Francisco for getting deeper into this…
> > > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > Looking forward to seeing the results of your suggested improvements.
> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 9:40 AM Francisco Javier Tirado Sarti <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > > I forgot to attach the queries
> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 3:04 PM Francisco Javier Tirado Sarti <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> Hi,
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> A brief update on this topic.
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> After doing a simple test with the example
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> https://github.com/apache/incubator-kie-kogito-examples/tree/stable/serverless-workflow-examples/serverless-workflow-data-index-quarkus
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> , the number of updates over the Nodes table is n*n, so we manage to
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> obtain a perfect quadratic performance degradation. The problem is worse
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> in the case of Serverless Workflow than in BPMN because the number of
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> nodes is greater than the number of states. In that example N is 16, but
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> for a more complex workflow it would certainly be larger.
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> I think that this is more related to how we are handling JPA in the
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> code, in particular the mapping from model to entity (basically JPA is
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> blind and has to update all nodes for every write because it believes
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> the node has been updated, although it has not), than to an issue in the
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> table definition.
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> In fact, when using JPA, separating the server model from the JPA entity
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> is not a good idea, especially if the entity contains collections. I
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> will try to change that without breaking anything.
> > > > > > > > >> > > > > > > > > > > > > > > > >>
> > > > > > > > > >> > > > > > > > > > > > > > > > > >> On Wed, Nov 22, 2023 at 12:10 PM Enrique Gonzalez Martinez <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > > > > > > > >>
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> After the events split you now will need to create a node instance
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> model instance of making independent from the process instance.
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> That should do the trick.
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>>
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> Regarding deleting/inserting it was fixed at some point.
> > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> On Tue, Nov 21, 2023 at 20:22, Francisco Javier Tirado Sarti (<[email protected]>) wrote:
> > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > Hi Martin,
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > I have a task to review the performance of
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > ProcessInstanceNodeDataEventMerger.
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > My idea is to reduce the number of delete/insert operations when
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > processing events and try to do it incrementally.
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > That should improve performance.
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > PS:
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > I was planning to send an e-mail tomorrow announcing that, in case you
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > were already working on a fix for that. I assume you are not, and I
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > will be sending a PR soon.
> > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > On Tue, Nov 21, 2023 at 6:09 PM Martin Weiler <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > I looked into the new examples using the data-index persistence addon -
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > Neus' PR#1813 [1] for serverless and Pere's branch [2] for workflow
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > (great job both!) - and they work without issues using single requests.
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > However, under some load (I used 'ab' for testing with a light
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > concurrency of 10 parallel requests) I ran into the following problems:
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > (1) Large number of insert/delete calls (e.g. for tables such as nodes,
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > definitions, etc.)
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > (2) Hibernate OptimisticLockExceptions / StaleStateExceptions
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > (3) DB deadlocks
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > (4) Error responses, slow response times
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > The reason I am reaching out with this topic here is to find out if we
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > are aware of this issue, and if someone is already looking into it or
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > has been assigned to it.
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > Thanks,
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > Martin
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > >
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > [1] https://github.com/apache/incubator-kie-kogito-examples/pull/1813
> > > > > > > > > >> > > > > > > > > > > > > > > > > >>> > > [2] https://github.com/pefernan/kogito-examples/tree/example_data-index_persistence
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > >
> > > > > > > > >>
> > > > > >
> > ---------------------------------------------------------------------
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > > To unsubscribe,
> > e-mail:
> > > > > > > > >> > > > > > > > > [email protected]
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > > For additional
> > commands,
> > > > > > e-mail:
> > > > > > > > >> > > > > > > > > > [email protected]
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > >
> > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> >
> > > > > > >
> > > ---------------------------------------------------------------------
> > > > > > > > >> > > > > > > > > > > > > > > > >>> To unsubscribe, e-mail:
> > > > > > > > >> > > > > > > [email protected]
> > > > > > > > >> > > > > > > > > > > > > > > > >>> For additional commands,
> > > > e-mail:
> > > > > > > > >> > > > > > > > [email protected]
> > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > >
> > > > > > > > >>
> > > > > >
> > ---------------------------------------------------------------------
> > > > > > > > >> > > > > > > > > > > > > > > > > To unsubscribe, e-mail:
> > > > > > > > >> > > > > > [email protected]
> > > > > > > > >> > > > > > > > > > > > > > > > > For additional commands,
> > > e-mail:
> > > > > > > > >> > > > > > > [email protected]
> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> >
> > > > > > >
> > > ---------------------------------------------------------------------
> > > > > > > > >> > > > > > > > > > > > > > To unsubscribe, e-mail:
> > > >
> >
>
