Re: [DISCUSSION] Performance issues with data-index persistence addon

Francisco Javier Tirado Sarti Wed, 21 Feb 2024 03:04:03 -0800

Richard,
You can also disable event publishing (which is the mechanism used to
synchronize runtimes with DataIndex). In any case, since this mechanism is
async, the performance of the process execution won't be affected, even if
it is enabled.


On Wed, Feb 21, 2024 at 11:24 AM Richard Bourner <[email protected]> wrote:

> Got it. Thanks Enrique.
>
>
> On Wed, Feb 21, 2024, 00:18, Enrique Gonzalez Martinez <
> [email protected]> wrote:
>
> > Regarding your STP concerns, they were addressed in my previous comment:
> >
> > STP is a concept: a process with certain constraints, namely no
> > persistence and returning the outcome in the call (sync execution with no
> > idle states). It was a requirement from a user in the past. One of the
> > requirements was leaving no trail. In v7 that was easy because you could
> > disable the audit in that case. Actually, we can do here the same thing we
> > did in v7, as you can add or remove the index just by adding or removing
> > dependencies.
> >
> > So in fact, in a microservice you just need to exclude the data index from
> > the app and you won't have a data index.
> >
> >
> >
> > On Wed, Feb 21, 2024, 1:40, Richard Bourner <[email protected]> wrote:
> >
> > > +1 with Martin's email.
> > >
> > > One question, though, regarding Martin's point #3 and the following
> > > earlier statement from Francisco: "*... keeping finishing process
> > > instances "for a while" in DataIndex was the only way for users to
> > > query the result of straight through processes*"
> > > --> Is this the only use case where data index would be needed for STP?
> > > I am asking because clients will already get their result in the JSON
> > > returned from the synchronous REST call, so adding extra computing time
> > > for data index persistence does not seem right to me in the context of
> > > decision services that are supposed to return very fast (typically
> > > rule tasks + scripts).
> > > Or is it that we also want to provide some GraphQL capabilities, even
> > > for STP use cases?
> > >
> > > Also, what would "*for a while*" mean exactly? Will it be configurable?
> > > Will there be a default expiration value?
> > >
> > > I am assuming this is all work in progress, and you may not have answers
> > > to all my questions; no problem with that.
> > >
> > > Thanks.
> > >
> > > On Tue, Feb 20, 2024 at 5:39 PM Martin Weiler <[email protected]>
> > > wrote:
> > >
> > > > IMO, it is good to have this discussion around data sanity now instead
> > > > of putting it off until later when data has already accumulated in
> > > > production environments.
> > > >
> > > > Based on the input here, we are dealing with three types of data:
> > > > 1. Runtime data - active instances only, engine cleans up the data
> > > > automatically at process instance end
> > > > 2. Historic log data - data created by data-audit intended for long
> > > > term storage
> > > > 3. Data-index data - somehow this data falls in between the two
> > > > aforementioned categories, with the idea of the data being "recent",
> > > > but not restricted to active instances only
> > > >
> > > > We'd need purge strategies for both #2 and #3 (perhaps different ones,
> > > > or with different config settings) in order to prevent unlimited data
> > > > growth.
> > > >
> > > > ________________________________________
> > > > From: Enrique Gonzalez Martinez <[email protected]>
> > > > Sent: Monday, February 19, 2024 7:11 AM
> > > > To: [email protected]
> > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with data-index
> > > > persistence addon
> > > >
> > > > Hi Francisco,
> > > > To give you more context about this:
> > > >
> > > > STP is a concept: a process with certain constraints, namely no
> > > > persistence and returning the outcome in the call (sync execution with
> > > > no idle states). It was a requirement from a user in the past. One of
> > > > the requirements was leaving no trail. In v7 that was easy because you
> > > > could disable the audit in that case. Actually, we can do here the same
> > > > thing we did in v7, as you can add or remove the index just by adding or
> > > > removing dependencies.
> > > >
> > > > We have the same outcome with different approaches, and STP is already
> > > > delivered.
> > > >
> > > > On Mon, Feb 19, 2024 at 14:46, Francisco Javier Tirado Sarti (<
> > > > [email protected]>) wrote:
> > > >
> > > > > Regarding STP (which is not a concept that we have in the code; I mean,
> > > > > STP are processes just as non-STP are), I guess that, like all
> > > > > processes, they were kept in DataIndex once completed because users
> > > > > wanted (and still want) to check the result once the call had been
> > > > > performed. If we want to leave no trace of them in DataIndex for some
> > > > > reason, we will need to make it a Runtimes concept so DataIndex can
> > > > > handle them in a different way.
> > > > >
> > > > > On Mon, Feb 19, 2024 at 2:27 PM Enrique Gonzalez Martinez <
> > > > > [email protected]> wrote:
> > > > >
> > > > > > Alex:
> > > > > > Right now the data index is working in the same way as it did in v7
> > > > > > with the emitters. The only difference between the two
> > > > > > implementations is that here the storage is PostgreSQL instead of
> > > > > > Elasticsearch. You are right that it is a snapshot of the last state
> > > > > > of the process, but we never defined how long that data would stay
> > > > > > alive. Honestly, I am happy right now with the way it works. The
> > > > > > cleanup mechanism is still TBD because we still need to discuss other
> > > > > > stuff first.
> > > > > >
> > > > > >
> > > > > > Regarding STP, the point is to leave no trail, because you can get the
> > > > > > outcome directly from the call. It was defined like that in v7. So
> > > > > > there is no use for the index or the audit.
> > > > > >
> > > > > > On Mon, Feb 19, 2024, 14:13, Francisco Javier Tirado Sarti <
> > > > > > [email protected]> wrote:
> > > > > >
> > > > > > > Hi Alex,
> > > > > > > There has been some confusion about the purpose of DataIndex. To be
> > > > > > > honest, I believed it had already been sorted out, but your e-mail
> > > > > > > makes me think that is not the case ;). I will let Kris clarify that
> > > > > > > with you. My view is that data-index is a way to query recently
> > > > > > > closed and active processes (the key here is the definition of
> > > > > > > "recently", which in my opinion should be configurable).
> > > > > > > But, besides that discussion, and being pragmatic, keeping finishing
> > > > > > > process instances "for a while" in DataIndex was the only way for
> > > > > > > users to query the result of straight through processes. That's a
> > > > > > > function that cannot be removed right now.
> > > > > > >
> > > > > > > On Mon, Feb 19, 2024 at 1:33 PM Alex Porcelli <[email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > If data index was supposed to provide a snapshot view of the
> > > > > > > > process instance… why do we keep it after the process instance
> > > > > > > > is finished?
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Feb 19, 2024 at 7:12 AM Francisco Javier Tirado Sarti <
> > > > > > > > [email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi Martin.
> > > > > > > > > After taking a deeper look at this, I realize that the behaviour
> > > > > > > > > is the expected one.
> > > > > > > > > The Runtimes DB does not track the completed process instance
> > > > > > > > > (that's what the JDBCProcessInstances warning is telling us), but
> > > > > > > > > DataIndex, as expected, is tracking it in the processes and nodes
> > > > > > > > > tables. And yes, it will grow over time.
> > > > > > > > > What we need is some configurable purge mechanism for DataIndex,
> > > > > > > > > so it eventually removes older completed process instances.
> > > > > > > > >
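A configurable purge of the kind described above could look roughly like the sketch below. It is only an illustration under stated assumptions, not DataIndex code: it uses an in-memory SQLite database as a stand-in for Postgres, the `processes` and `nodes` table names and the `process_instance_id` column come from this thread, while the `end_time` column, the `purge_completed` function and the `retention_seconds` setting are hypothetical names made up for the example. State 2 is treated as "completed" because that is the state shown in the quoted log line.

```python
import sqlite3

COMPLETED = 2  # state value that appears in the log line quoted in this thread


def purge_completed(conn, retention_seconds, now):
    """Delete completed instances (and their nodes) that ended before the cutoff."""
    cutoff = now - retention_seconds
    with conn:  # one transaction: commit on success, roll back on error
        # Remove child rows first so no node is left pointing at a purged process.
        conn.execute(
            "DELETE FROM nodes WHERE process_instance_id IN "
            "(SELECT id FROM processes WHERE state = ? AND end_time < ?)",
            (COMPLETED, cutoff),
        )
        cur = conn.execute(
            "DELETE FROM processes WHERE state = ? AND end_time < ?",
            (COMPLETED, cutoff),
        )
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical, heavily simplified schema (end_time is an assumed column).
    conn.executescript(
        """
        CREATE TABLE processes (id TEXT PRIMARY KEY, state INTEGER, end_time REAL);
        CREATE TABLE nodes (id TEXT PRIMARY KEY, process_instance_id TEXT);
        """
    )
    now = 1_000_000.0  # fixed "current time" in epoch seconds, for repeatability
    conn.executemany(
        "INSERT INTO processes VALUES (?, ?, ?)",
        [("old", 2, now - 7200), ("recent", 2, now - 60), ("active", 1, None)],
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)",
        [("n1", "old"), ("n2", "recent"), ("n3", "active")],
    )
    conn.commit()
    removed = purge_completed(conn, retention_seconds=3600, now=now)
    remaining = [row[0] for row in conn.execute("SELECT id FROM processes ORDER BY id")]
    return removed, remaining


if __name__ == "__main__":
    print(demo())
```

With a one-hour retention, only the instance that completed two hours earlier is removed; the recently completed and the active instances stay queryable, which matches the "recent, but not restricted to active" idea discussed in this thread.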
> > > > > > > > > On Tue, Feb 13, 2024 at 12:59 PM Francisco Javier Tirado Sarti <
> > > > > > > > > [email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Martin,
> > > > > > > > > > Good catch! It looks like the skipping performed for process
> > > > > > > > > > instances is not applied to node instances. Something we
> > > > > > > > > > definitely need to review on the runtimes side.
> > > > > > > > > >
> > > > > > > > > > On Mon, Feb 12, 2024 at 11:59 PM Martin Weiler
> > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > >> On a somewhat related note, testing a simple workflow (start ->
> > > > > > > > > > >> script node -> end), I see the following messages in the logs:
> > > > > > > > > > >> 2024-02-12 22:49:50,493 28758dde544c WARN
> > > > > > > > > > >> [org.kie.kogito.persistence.jdbc.JDBCProcessInstances:-1]
> > > > > > > > > > >> (executor-thread-3) Skipping create of process instance id:
> > > > > > > > > > >> 7083088e-b899-47cb-b85c-5d9ccb0aa166, state: 2
> > > > > > > > > > >>
> > > > > > > > > > >> So far, so good. And I'd expect to see no trace of this process in
> > > > > > > > > > >> the database if I don't have data audit enabled.
> > > > > > > > > > >>
> > > > > > > > > > >> However, the 'processes' table contains a row with state=2, with
> > > > > > > > > > >> related entries in the 'nodes' table. In a load test, I see these
> > > > > > > > > > >> tables grow significantly over time. Am I missing something to have
> > > > > > > > > > >> these entries cleaned up automatically?
> > > > > > > > > >>
> > > > > > > > > >> ________________________________________
> > > > > > > > > >> From: Martin Weiler <[email protected]>
> > > > > > > > > >> Sent: Monday, February 12, 2024 3:40 PM
> > > > > > > > > >> To: [email protected]
> > > > > > > > > > >> Subject: [EXTERNAL] RE: [DISCUSSION] Performance issues with data-index
> > > > > > > > > > >> persistence addon
> > > > > > > > > > >>
> > > > > > > > > > >> Thanks everyone for your input. Based on this discussion, I opened the
> > > > > > > > > > >> following PR:
> > > > > > > > > > >> https://github.com/apache/incubator-kie-kogito-apps/pull/1985
> > > > > > > > > > >>
> > > > > > > > > > >> With this change, the performance seems to be stable over time:
> > > > > > > > > > >> https://drive.google.com/file/d/1zkullvfrJpRp7TRjxDa41ok6kEIR7Fty/view?usp=sharing
> > > > > > > > > >>
> > > > > > > > > >> Martin
> > > > > > > > > >>
> > > > > > > > > >> ________________________________________
> > > > > > > > > >> From: Gonzalo Muñoz <[email protected]>
> > > > > > > > > >> Sent: Friday, February 9, 2024 9:42 AM
> > > > > > > > > >> To: [email protected]
> > > > > > > > > > >> Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with data-index
> > > > > > > > > > >> persistence addon
> > > > > > > > > > >>
> > > > > > > > > > >> Great work Francisco,
> > > > > > > > > > >> Martin, take a look at this link with some related tips (in case you
> > > > > > > > > > >> find it useful):
> > > > > > > > > > >> https://www.cybertec-postgresql.com/en/index-your-foreign-key/
> > > > > > > > > >>
> > > > > > > > > > >> On Fri, Feb 9, 2024 at 17:20, Francisco Javier Tirado Sarti (<
> > > > > > > > > > >> [email protected]>) wrote:
> > > > > > > > > >>
> > > > > > > > > > >> > For the time being, we will keep JPA until we exhaust all
> > > > > > > > > > >> > possibilities; let's call switching from JPA to JDBC our hidden plan B ;)
> > > > > > > > > > >> > I already told Martin, but so that everyone knows: just after writing
> > > > > > > > > > >> > the previous email, I thought "what if Postgres is not automatically
> > > > > > > > > > >> > indexing foreign keys like MySQL?" and, eureka!
> > > > > > > > > > >> > Postgres doc:
> > > > > > > > > > >> > https://www.postgresql.org/docs/current/ddl-constraints.html
> > > > > > > > > > >> > MySQL doc:
> > > > > > > > > > >> > https://dev.mysql.com/doc/refman/8.0/en/constraint-foreign-key.html
> > > > > > > > > > >> > These are the relevant excerpts:
> > > > > > > > > > >> >
> > > > > > > > > > >> > *PostgreSQL*
> > > > > > > > > > >> > *A foreign key must reference columns that either are a primary key or
> > > > > > > > > > >> > form a unique constraint, or are columns from a non-partial unique index.
> > > > > > > > > > >> > This means that the referenced columns always have an index to allow
> > > > > > > > > > >> > efficient lookups on whether a referencing row has a match. Since a
> > > > > > > > > > >> > DELETE of a row from the referenced table or an UPDATE of a referenced
> > > > > > > > > > >> > column will require a scan of the referencing table for rows matching the
> > > > > > > > > > >> > old value, it is often a good idea to index the referencing columns too.
> > > > > > > > > > >> > Because this is not always needed, and there are many choices available
> > > > > > > > > > >> > on how to index, the declaration of a foreign key constraint does not
> > > > > > > > > > >> > automatically create an index on the referencing columns.*
> > > > > > > > > > >> > *MySQL*
> > > > > > > > > > >> > *MySQL requires that foreign key columns be indexed; if you create a
> > > > > > > > > > >> > table with a foreign key constraint but no index on a given column, an
> > > > > > > > > > >> > index is created.*
> > > > > > > > > > >> >
> > > > > > > > > > >> > So I asked Martin to explicitly create an index for the
> > > > > > > > > > >> > process_instance_id column on the nodes table.
> > > > > > > > > > >> > I think that will fix the problem detected in the thread dump.
> > > > > > > > > > >> > The simpler process test to verify that the queries are fine still
> > > > > > > > > > >> > stands, though ;)
> > > > > > > > > >> >
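The effect of the missing index can be illustrated with a small sketch. This is not the DataIndex schema or the actual fix: it assumes an in-memory SQLite database as a stand-in for Postgres (SQLite likewise does not create an index on the referencing column of a foreign key), a heavily simplified version of the `processes` and `nodes` tables, and a hypothetical index name, `idx_nodes_process_instance_id`. The `CREATE INDEX` statement itself is plain SQL that Postgres accepts as well.

```python
import sqlite3

# Lookup of all node rows that belong to one process instance.
QUERY = "SELECT id FROM nodes WHERE process_instance_id = ?"


def query_plan(conn):
    """Return the textual query plan SQLite reports for QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("p-1",)).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan detail


def plans():
    conn = sqlite3.connect(":memory:")
    # Simplified, hypothetical schema: the foreign key alone creates no index
    # on nodes.process_instance_id.
    conn.executescript(
        """
        CREATE TABLE processes (id TEXT PRIMARY KEY, state INTEGER);
        CREATE TABLE nodes (
            id TEXT PRIMARY KEY,
            process_instance_id TEXT NOT NULL REFERENCES processes(id)
        );
        """
    )
    before = query_plan(conn)
    # Explicit index on the referencing column (index name is made up).
    conn.execute(
        "CREATE INDEX idx_nodes_process_instance_id ON nodes (process_instance_id)"
    )
    after = query_plan(conn)
    return before, after


if __name__ == "__main__":
    before, after = plans()
    print("without index:", before)
    print("with index:   ", after)
```

Before the index exists, the plan is a scan of the whole `nodes` table; afterwards the lookup goes through the new index, so its cost no longer grows with the total number of node rows.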
> > > > > > > > > >> >
> > > > > > > > > > >> > On Fri, Feb 9, 2024 at 5:10 PM Tibor Zimányi <[email protected]>
> > > > > > > > > > >> > wrote:
> > > > > > > > > > >> >
> > > > > > > > > > >> > > I always preferred pure JDBC over Hibernate myself, just for the sake
> > > > > > > > > > >> > > of control over what is happening :) So I would not -1 that myself.
> > > > > > > > > >> > >
> > > > > > > > > >> > > Tibor
> > > > > > > > > >> > >
> > > > > > > > > > >> > > On Fri, Feb 9, 2024, 17:00, Francisco Javier Tirado Sarti <
> > > > > > > > > > >> > > [email protected]> wrote:
> > > > > > > > > >> > >
> > > > > > > > > > >> > > > Hi,
> > > > > > > > > > >> > > > Usually I do not want to talk about work in progress because
> > > > > > > > > > >> > > > preliminary conclusions are pretty volatile but, well, there are a
> > > > > > > > > > >> > > > couple of things that can be concluded from the really valuable
> > > > > > > > > > >> > > > information that Martin provided.
> > > > > > > > > > >> > > > 1) In order to be able to determine if the number of statements is
> > > > > > > > > > >> > > > larger than expected, I asked Martin to test with a simpler process
> > > > > > > > > > >> > > > definition: one with just three nodes (start, script and end). The
> > > > > > > > > > >> > > > script node should change just one variable. This way we can analyze
> > > > > > > > > > >> > > > whether the number of queries is the expected one. From the single log
> > > > > > > > > > >> > > > (audit was activated then), my conclusion is that the number of
> > > > > > > > > > >> > > > inserts/updates on processes and nodes (there are a lot on task, which
> > > > > > > > > > >> > > > I would prefer to skip for now, baby steps) is the expected one.
> > > > > > > > > > >> > > > 2) Analysing the thread dump, we see around 15 threads executing this
> > > > > > > > > > >> > > > line at
> > > > > > > > > > >> > > > org.kie.kogito.index.jpa.storage.ProcessInstanceEntityStorage.indexNode(ProcessInstanceEntityStorage.java:125),
> > > > > > > > > > >> > > > so it's pretty clear which code needs to be optimized ;). I'm evaluating
> > > > > > > > > > >> > > > possibilities within JPA/Hibernate, but I'm starting to think that it
> > > > > > > > > > >> > > > might be better to switch to JDBC and skip Hibernate. Our lives will be
> > > > > > > > > > >> > > > simpler, especially with a relatively simple schema like ours (that
> > > > > > > > > > >> > > > would be my recommendation if I were an external consultant).
> > > > > > > > > >> > > >
> > > > > > > > > > >> > > > On Fri, Feb 9, 2024 at 4:15 PM Tibor Zimányi <[email protected]>
> > > > > > > > > > >> > > > wrote:
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > > Hi,
> > > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > this will be a bit off-topic. However, as far as performance is
> > > > > > > > > > >> > > > > concerned, I think we should consider that we have string primary
> > > > > > > > > > >> > > > > keys (IDs). I would expect database systems to be much better at
> > > > > > > > > > >> > > > > indexing numeric keys than strings. I remember from the past, when I
> > > > > > > > > > >> > > > > was working with DBs, that using strings as keys or indexes was a
> > > > > > > > > > >> > > > > discouraged practice.
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > Best regards,
> > > > > > > > > >> > > > > Tibor
> > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > On Thu, Feb 8, 2024, 22:45, Martin Weiler <[email protected]>
> > > > > > > > > > >> > > > > wrote:
> > > > > > > > > >> > > > >
> > > > > > > > > > >> > > > > > I changed the test to use MongoDB [1] and I don't see a performance
> > > > > > > > > > >> > > > > > degradation with this setup [2].
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > Please keep us posted of your findings. Thanks!
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > Martin
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > [1] https://github.com/martinweiler/job-service-refactor-test/tree/mongodb
> > > > > > > > > > >> > > > > > [2] https://drive.google.com/file/d/1NfacXaxJlgRMw4OQ5S20cvkzvaUKUVFj/view?usp=sharing
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > ________________________________________
> > > > > > > > > > >> > > > > > From: Francisco Javier Tirado Sarti <[email protected]>
> > > > > > > > > > >> > > > > > Sent: Wednesday, February 7, 2024 11:40 AM
> > > > > > > > > > >> > > > > > To: [email protected]
> > > > > > > > > > >> > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with data-index
> > > > > > > > > > >> > > > > > persistence addon
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > Yes, it can be index degradation because of size, but I believe (I
> > > > > > > > > > >> > > > > > might be wrong) the DB is too small (yet) for that.
> > > > > > > > > > >> > > > > > But eventually, when the DB is huge enough, Postgres will unavoidably
> > > > > > > > > > >> > > > > > behave like the graph that Martin sent.
> > > > > > > > > > >> > > > > > Since I believe we are not huge enough (yet), let's rule out another
> > > > > > > > > > >> > > > > > issue by analysing the SQL logs (I requested those from Martin offline
> > > > > > > > > > >> > > > > > and he is going to kindly collect them).
> > > > > > > > > > >> > > > > > Also, I'm curious to know if Mongo behaves in the same way.
> > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > On Wed, Feb 7, 2024 at 7:25 PM Enrique Gonzalez Martinez <
> > > > > > > > > > >> > > > > > [email protected]> wrote:
> > > > > > > > > > >> > > > > >
> > > > > > > > > > >> > > > > > > Hi Francisco,
> > > > > > > > > > >> > > > > > > I would highly recommend checking the indexes and how the updates work
> > > > > > > > > > >> > > > > > > in data index, to avoid full table scans and locking the full table.
> > > > > > > > > > >> > > > > > > Some DBs are very sensitive to that.
> > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > On Wed, Feb 7, 2024, 18:41, Francisco Javier Tirado Sarti <
> > > > > > > > > > >> > > > > > > [email protected]> wrote:
> > > > > > > > > >> > > > > > >
> > > > > > > > > > >> > > > > > > > Hi Martin,
> > > > > > > > > > >> > > > > > > > While I analyze the data, let me ask you if it is possible to perform
> > > > > > > > > > >> > > > > > > > another check (similar in a way to disabling data-index, like you did).
> > > > > > > > > > >> > > > > > > > Can you switch to MongoDB persistence and check if the same
> > > > > > > > > > >> > > > > > > > degradation that is there for Postgres remains?
> > > > > > > > > > >> > > > > > > > I do not know if this is feasible, but it would certainly indicate
> > > > > > > > > > >> > > > > > > > whether the problem is in the Postgres storage layer, and I do not have
> > > > > > > > > > >> > > > > > > > a clear prediction of what we will see when doing this switch.
> > > > > > > > > >> > > > > > > >
> > > > > > > > > > >> > > > > > > > On Wed, Feb 7, 2024 at 6:37 PM Martin Weiler <[email protected]>
> > > > > > > > > > >> > > > > > > > wrote:
> > > > > > > > > >> > > > > > > >
> > > > > > > > > >> > > > > > > > > Hi Francisco,
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > thanks for your work on this important topic!
> > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > I would like to share some test results here, which might help to
> > > > > > > > > > >> > > > > > > > > improve the codebase even further. I am using the JMeter-based test
> > > > > > > > > > >> > > > > > > > > case from Pere and Enrique (thanks guys!) [1], which uses a load of 30
> > > > > > > > > > >> > > > > > > > > threads to
> > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > 1) start a new process instance (POST)
> > > > > > > > > > >> > > > > > > > > 2) retrieve tasks for a user (GET)
> > > > > > > > > > >> > > > > > > > > 3) fetch task details (GET)
> > > > > > > > > > >> > > > > > > > > 4) complete a task (POST)
> > > > > > > > > > >> > > > > > > > > 5) execute a query on data-audit
> > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > With this test setup, I noticed that the performance for the POST
> > > > > > > > > > >> > > > > > > > > requests, in particular the one to start a new process instance,
> > > > > > > > > > >> > > > > > > > > degrades over time - see graph [2]. If I run the same test without
> > > > > > > > > > >> > > > > > > > > data-index, then there is no such performance degradation [3]. You can
> > > > > > > > > > >> > > > > > > > > find a thread dump captured a few minutes into the first test here [4]
> > > > > > > > > > >> > > > > > > > > that might help to see some of the contention points.
> > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > I'd appreciate it if you could take a look and see if there is
> > > > > > > > > > >> > > > > > > > > something that can be further improved based on your previous work. If
> > > > > > > > > > >> > > > > > > > > you need any additional data, let me know, but otherwise it is
> > > > > > > > > > >> > > > > > > > > straightforward to run the JMeter test as well.
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > Thanks,
> > > > > > > > > >> > > > > > > > > Martin
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > [1] https://github.com/pefernan/job-service-refactor-test/
> > > > > > > > > > >> > > > > > > > > [2] https://drive.google.com/file/d/1Gqn-ixE05kXv2jdssAUlnMuUVcHxIYZ0/view?usp=sharing
> > > > > > > > > > >> > > > > > > > > [3] https://drive.google.com/file/d/10gVNyb4JYg_bA18bNhY9dEDbPn3TOxL7/view?usp=sharing
> > > > > > > > > > >> > > > > > > > > [4] https://drive.google.com/file/d/1jVrtsO49gCvUlnaC9AUAtkVKTm4PbdUv/view?usp=sharing
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > ________________________________________
> > > > > > > > > >> > > > > > > > > From: Francisco Javier Tirado Sarti <
> > > > > > > > > [email protected]>
> > > > > > > > > >> > > > > > > > > Sent: Wednesday, January 17, 2024 9:13
> AM
> > > > > > > > > >> > > > > > > > > To: [email protected]
> > > > > > > > > >> > > > > > > > > Cc: Pere Fernandez Perez
> > > > > > > > > >> > > > > > > > > Subject: [EXTERNAL] Re: [DISCUSSION]
> > > > Performance
> > > > > > > > issues
> > > > > > > > > >> with
> > > > > > > > > >> > > > > > data-index
> > > > > > > > > >> > > > > > > > > persistence addon
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > Hi Alex,
> > > > > > > > > > >> > > > > > > > > I did not take timings (they depend on a number of variables that
> > > > > > > > > > >> > > > > > > > > change drastically between environments), but I verified that the
> > > > > > > > > > >> > > > > > > > > number of updates has been reduced drastically without losing
> > > > > > > > > > >> > > > > > > > > functionality, which is objectively a good thing. Before the change,
> > > > > > > > > > >> > > > > > > > > every node executed triggered an update for every node previously
> > > > > > > > > > >> > > > > > > > > executed, so if a process had 50 nodes to execute, we were performing
> > > > > > > > > > >> > > > > > > > > nearly 50*51/2 updates, a total of 1275. Now we have just one update
> > > > > > > > > > >> > > > > > > > > for every node executed, a total of 50.
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > > >> > > > > > > > > On Wed, Jan 17, 2024 at 3:18 PM Alex Porcelli <[email protected]> wrote:
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > > Francisco,
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > I noticed that your PR has been merged, but I was expecting (at least
> > > > > > > > > > >> > > > > > > > > > that was my understanding from this thread) that some benchmark data
> > > > > > > > > > >> > > > > > > > > > would be shared before merging, to assess the cost/benefit of such a
> > > > > > > > > > >> > > > > > > > > > sizable change.
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > Do you have any information to share?
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > On Sat, Dec 23, 2023 at 4:02 AM Francisco Javier Tirado Sarti
> > > > > > > > > > >> > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > Yes, as intended, we now have one select and one insert/update per
> > > > > > > > > > >> > > > > > > > > > > node event.
> > > > > > > > > > >> > > > > > > > > > > I marked the PR as ready for review and gave @Pere Fernandez Perez
> > > > > > > > > > >> > > > > > > > > > > <[email protected]> permission on the branch so he can edit it
> > > > > > > > > > >> > > > > > > > > > > over the next two weeks (I'll be on PTO) if desired, before merging.
> > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > On Thu, Dec 21, 2023 at 5:58 PM Alex Porcelli <[email protected]> wrote:
> > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > Cool, thank you Francisco!
> > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > > Did you manage to get some preliminary data about improvements?
> > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > > On Thu, Dec 21, 2023 at 11:52 AM Francisco Javier Tirado Sarti
> > > > > > > > > > >> > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > > > Yes, after some delay because of the Quarkus 3 migration. I'm refining
> > > > > > > > > > >> > > > > > > > > > > > > this draft PR:
> > > > > > > > > > >> > > > > > > > > > > > > https://github.com/apache/incubator-kie-kogito-apps/pull/1941
> > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > > > On Thu, Dec 21, 2023 at 5:48 PM Alex Porcelli <[email protected]> wrote:
> > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > > > > Any update or new findings on this topic?
> > > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > > > > On Tue, Nov 28, 2023 at 8:38 AM Francisco Javier Tirado Sarti
> > > > > > > > > > >> > > > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > Hi Alex,
> > > > > > > > > > >> > > > > > > > > > > > > > > After considering different options to improve performance, we feel
> > > > > > > > > > >> > > > > > > > > > > > > > > that it is time to "partially" move away from the current Map-style
> > > > > > > > > > >> > > > > > > > > > > > > > > interface
> > > > > > > > > > >> > > > > > > > > > > > > > > (https://github.com/apache/incubator-kie-kogito-apps/blob/main/persistence-commons/persistence-commons-api/src/main/java/org/kie/kogito/persistence/api/Storage.java),
> > > > > > > > > > >> > > > > > > > > > > > > > > which was shared with Trusty, to one more suitable for use with a
> > > > > > > > > > >> > > > > > > > > > > > > > > relational DB like PostgreSQL (but still compatible with big-table
> > > > > > > > > > >> > > > > > > > > > > > > > > DBs).
> > > > > > > > > > >> > > > > > > > > > > > > > > The idea is to replace the generic Storage interface with four
> > > > > > > > > > >> > > > > > > > > > > > > > > specific interfaces (which will inherit from a common one that keeps
> > > > > > > > > > >> > > > > > > > > > > > > > > the query part as it is, with get and query methods) that will
> > > > > > > > > > >> > > > > > > > > > > > > > > include the required modification operations for the four DataIndex
> > > > > > > > > > >> > > > > > > > > > > > > > > "domains": processinstance, usertask, processdefinitions and jobs.
> > > > > > > > > > >> > > > > > > > > > > > > > > Those interfaces will define methods like addNode, addVariable,
> > > > > > > > > > >> > > > > > > > > > > > > > > updateTask, addAttachment... that will allow the persistence layer
> > > > > > > > > > >> > > > > > > > > > > > > > > implementation to update just the needed info in the DB (for
> > > > > > > > > > >> > > > > > > > > > > > > > > example, for addNode in Postgres, just insert a row into the nodes
> > > > > > > > > > >> > > > > > > > > > > > > > > table; for addNode in Mongo, basically the same atomic upsert
> > > > > > > > > > >> > > > > > > > > > > > > > > operation that is currently done). Therefore, we increase
> > > > > > > > > > >> > > > > > > > > > > > > > > performance for Postgres and keep the current one for Mongo. The
> > > > > > > > > > >> > > > > > > > > > > > > > > current DB schemas won't be touched.
> > > > > > > > > > >> > > > > > > > > > > > > > > Since the code change is large, I do not think I'll be able to have
> > > > > > > > > > >> > > > > > > > > > > > > > > the PR ready until next week.
> > > > > > > > > > >> > > > > > > > > > > > > > > But before starting, please let me know if that approach is fine for
> > > > > > > > > > >> > > > > > > > > > > > > > > you.
> > > > > > > > > >> > > > > > > > > > > > > > > Best regards.
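
[Editor's sketch] One way to picture the proposal above: a shared read-only interface plus a domain-specific interface that exposes fine-grained write operations. Everything here is an assumption made for illustration. Only the method names addNode and addVariable come from the message; the type names, signatures and the in-memory implementation are hypothetical and are not the actual Kogito API. The query method mentioned in the message is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Common read-only part shared by every domain storage (hypothetical name).
interface StorageFetcher<K, V> {
    Optional<V> get(K key);
}

// Write operations for the "processinstance" domain (hypothetical signatures).
interface ProcessInstanceStorage extends StorageFetcher<String, ProcessInstanceView> {
    void addNode(String processInstanceId, String nodeName);
    void addVariable(String processInstanceId, String name, Object value);
}

// Minimal read model used only by this sketch.
final class ProcessInstanceView {
    final List<String> nodes = new ArrayList<>();
    final Map<String, Object> variables = new HashMap<>();
}

// In-memory stand-in for a relational implementation: addNode appends a
// single entry, the analogue of inserting one row into a nodes table.
final class InMemoryProcessInstanceStorage implements ProcessInstanceStorage {
    private final Map<String, ProcessInstanceView> instances = new HashMap<>();
    int writes = 0; // counts individual write operations, for illustration

    @Override
    public Optional<ProcessInstanceView> get(String key) {
        return Optional.ofNullable(instances.get(key));
    }

    @Override
    public void addNode(String processInstanceId, String nodeName) {
        view(processInstanceId).nodes.add(nodeName);
        writes++;
    }

    @Override
    public void addVariable(String processInstanceId, String name, Object value) {
        view(processInstanceId).variables.put(name, value);
        writes++;
    }

    // Creates the read model on first use.
    private ProcessInstanceView view(String id) {
        return instances.computeIfAbsent(id, k -> new ProcessInstanceView());
    }

    public static void main(String[] args) {
        InMemoryProcessInstanceStorage storage = new InMemoryProcessInstanceStorage();
        storage.addNode("pi-1", "start");
        storage.addNode("pi-1", "end");
        System.out.println("writes: " + storage.writes);
    }
}
```

The point of the split is that each event maps to one targeted write, instead of a whole process instance being written back through a generic Map-style put.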
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 6:55 PM Alex Porcelli <[email protected]> wrote:
> > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > > > > > > Thank you Francisco for digging deeper into this…
> > > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > > > > > > Looking forward to seeing the results of your suggested improvements.
> > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 9:40 AM Francisco Javier Tirado Sarti <
> > > > > > > > > > >> > > > > > > > > > > > > > > > [email protected]> wrote:
> > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > > > > > > > I forgot to attach the queries
> > > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > > > >> > > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 3:04 PM Francisco Javier Tirado Sarti <
> > > > > > > > > > >> > > > > > > > > > > > > > > > > [email protected]> wrote:
> > > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > >> Hi,
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> A brief update on this topic.
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> After doing a simple test with the example
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> https://github.com/apache/incubator-kie-kogito-examples/tree/stable/serverless-workflow-examples/serverless-workflow-data-index-quarkus,
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> the number of updates over the Nodes table is n*n, so we manage to
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> obtain a perfect quadratic performance degradation. The problem is
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> worse in the case of Serverless Workflow than in BPMN because the
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> number of nodes is greater than the number of states. In that
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> example N is 16, but for a more complex workflow it would certainly
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> be larger.
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> I think that this is more related to how we are handling JPA in the
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> code, in particular the mapping from model to entity (basically JPA
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> is blind and has to update all nodes on every write because it
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> believes each node has been updated, although it has not), than to
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> an issue in the table definition.
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> In fact, when using JPA, separating the server model from the JPA
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> entity is not a good idea, especially if the entity contains
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> collections. I will try to change that without breaking anything.
> > > > > > > > > >> > > > > > > > > > > > > > > > >>
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> On Wed, Nov 22, 2023 at 12:10 PM Enrique Gonzalez Martinez <
> > > > > > > > > > >> > > > > > > > > > > > > > > > >> [email protected]> wrote:
> > > > > > > > > >> > > > > > > > > > > > > > > > >>
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> After the events split you now will need to create a node instance
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> model instance of making independent from the process instance.
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> That should do the trick.
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> Regarding deleting/inserting it was fixed at some point.
> > > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> On Tue, Nov 21, 2023 at 20:22, Francisco Javier Tirado Sarti
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> (<[email protected]>) wrote:
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> > Hi Martin,
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> > I have a task to review the performance of
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> > ProcessInstanceNodeDataEventMerger.
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> > My idea is to reduce the number of delete/inserts when processing
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> > events and try to do it incrementally.
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> > That should improve performance.
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> > PS:
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> > I was planning to send an e-mail tomorrow announcing that, in case
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> > you were already working on a fix for that. I assume you are not,
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> > and I will be sending a PR soon.
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> > On Tue, Nov 21, 2023 at 6:09 PM Martin Weiler
> > > > > > > > > > >> > > > > > > > > > > > > > > > >>> > <[email protected]
> > > > > > > > > >> > > > > > > > > > > > > > > > >>
>
