Re: [DISCUSSION] Performance issues with data-index persistence addon

Richard Bourner Wed, 21 Feb 2024 02:24:17 -0800

Got it. Thanks Enrique.


On Wed, Feb 21, 2024, 00:18, Enrique Gonzalez Martinez <
[email protected]> wrote:

> Your STP concerns were addressed in my previous comment:
>
> STP is a concept: a process with certain constraints, namely no persistence
> and returning the outcome in the call (sync execution with no idle states).
> It was a requirement from a user in the past. One of the requirements was
> leaving no trail. In v7 this was easy because you could disable the audit
> in that case. We can do here what we did in v7, since you can add or
> remove the index just by adding or removing dependencies.
>
> So in a microservice you just need to exclude the data index from the app
> and you won't have a data index.
>
>
>
> On Wed, Feb 21, 2024, 1:40, Richard Bourner <[email protected]> wrote:
>
> > +1 with Martin's email.
> >
> > One question, though, regarding Martin's point #3 and the following
> > earlier statement from Francisco: "*... keeping finishing process
> > instances "for a while" in DataIndex was the only way for users to query
> > the result of straight through processes*"
> > --> Is this the only use case where data index would be needed for STP?
> > I am asking because clients will already get their result in the JSON
> > returned from the synchronous REST call, so adding extra computing time
> > for data index persistence does not seem right to me in the context of
> > decision services that are supposed to return very fast (typically
> > rule tasks + scripts).
> > Or is it that we also want to provide some GraphQL capabilities, even for
> > STP use cases?
> >
> > Also, what would "*for a while*" mean exactly? Will it be configurable?
> > Will there be a default expiration value?
> >
> > I am assuming this is all work in progress, and you may not have answers
> > to all my questions; no problem with that.
> >
> > Thanks.
> >
> > On Tue, Feb 20, 2024 at 5:39 PM Martin Weiler <[email protected]>
> > wrote:
> >
> > > > IMO, it is good to have this discussion around data sanity now
> > > > instead of putting it off until later when data has already
> > > > accumulated in production environments.
> > > >
> > > > Based on the input here, we are dealing with three types of data:
> > > > 1. Runtime data - active instances only, engine cleans up the data
> > > > automatically at process instance end
> > > > 2. Historic log data - data created by data-audit intended for
> > > > long-term storage
> > > > 3. Data-index data - somehow this data falls in between the two
> > > > aforementioned categories, with the idea of the data being "recent",
> > > > but not restricted to active instances only
> > > >
> > > > We'd need purge strategies for both #2 and #3 (perhaps different ones,
> > > > or with different config settings) in order to prevent unlimited data
> > > > growth.
> > >
> > > ________________________________________
> > > From: Enrique Gonzalez Martinez <[email protected]>
> > > Sent: Monday, February 19, 2024 7:11 AM
> > > To: [email protected]
> > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with data-index
> > > persistence addon
> > >
> > > > Hi Francisco,
> > > > To give you more context about this:
> > > >
> > > > STP is a concept: a process with certain constraints, namely no
> > > > persistence and returning the outcome in the call (sync execution
> > > > with no idle states). It was a requirement from a user in the past.
> > > > One of the requirements was leaving no trail. In v7 this was easy
> > > > because you could disable the audit in that case. We can do here what
> > > > we did in v7, since you can add or remove the index just by adding or
> > > > removing deps.
> > > >
> > > > We have the same outcome with different approaches and STP is already
> > > > delivered.
> > >
> > > > On Mon, Feb 19, 2024 at 14:46, Francisco Javier Tirado Sarti (<
> > > > [email protected]>) wrote:
> > >
> > > > > Regarding STP (which is not a concept that we have in the code; I
> > > > > mean STP are processes just as non-STP are), I guess that, like all
> > > > > processes, they were kept in DataIndex once completed because users
> > > > > wanted (and still want) to check the result once the call had been
> > > > > performed. If we want to leave no trace of them in DataIndex for
> > > > > some reason, we will need to make it a Runtimes concept so DataIndex
> > > > > can handle them in a different way.
> > > >
> > > > On Mon, Feb 19, 2024 at 2:27 PM Enrique Gonzalez Martinez <
> > > > [email protected]> wrote:
> > > >
> > > > > > Alex:
> > > > > > Right now the data index is working in the same way as it did in
> > > > > > v7 with the emitters. The only difference between the two
> > > > > > implementations is that here the storage is pgsql instead of
> > > > > > Elasticsearch. You are right that it is a snapshot of the last
> > > > > > state of the process, but we never defined how long that data
> > > > > > would be kept alive. Honestly, I am happy right now with the way it
> > > > > > works. The cleanup mechanism is still TBD because we still need to
> > > > > > discuss other stuff first.
> > > > > >
> > > > > >
> > > > > > Regarding STP, the point is to leave no trail because you can get
> > > > > > the outcome directly from the call. It was defined like that in
> > > > > > v7. So there is no use for the index or the audit.
> > > > >
> > > > > > On Mon, Feb 19, 2024, 14:13, Francisco Javier Tirado Sarti <
> > > > > > [email protected]> wrote:
> > > > >
> > > > > > > Hi Alex,
> > > > > > > There has been some confusion about the purpose of DataIndex. To
> > > > > > > be honest, I believed it had already been sorted out, but your
> > > > > > > e-mail makes me think that is not the case ;). I'll let Kris
> > > > > > > clarify that with you. My view is that data-index is a way to
> > > > > > > query recently closed and active processes (the key here is the
> > > > > > > definition of "recently", which in my opinion should be
> > > > > > > configurable).
> > > > > > > But, besides that discussion and being pragmatic, keeping
> > > > > > > finishing process instances "for a while" in DataIndex was the
> > > > > > > only way for users to query the result of straight through
> > > > > > > processes. That's a function that cannot be removed right now.
> > > > > >
> > > > > > > On Mon, Feb 19, 2024 at 1:33 PM Alex Porcelli <[email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > If data index was supposed to provide a snapshot view of the
> > > > > > > > process instance… why do we keep it after the process instance
> > > > > > > > is finished?
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Feb 19, 2024 at 7:12 AM Francisco Javier Tirado Sarti <
> > > > > > > [email protected]> wrote:
> > > > > > >
> > > > > > > > > Hi Martin.
> > > > > > > > > After taking a deeper look at this, I realize that the
> > > > > > > > > behaviour is the expected one.
> > > > > > > > > Runtimes DB does not track the completed process instance
> > > > > > > > > (that's what the JDBCProcessInstances warning is telling us),
> > > > > > > > > but DataIndex, as expected, is tracking it in the processes
> > > > > > > > > and nodes tables. And yes, it will grow over time.
> > > > > > > > > What we need is some configurable purge mechanism for
> > > > > > > > > DataIndex, so it eventually removes older completed process
> > > > > > > > > instances.
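
A minimal sketch of the kind of configurable purge discussed above, using Python's built-in sqlite3 purely as a stand-in for the real PostgreSQL storage. Only the table names 'processes' and 'nodes' and the state value 2 come from this thread; the column names (end_time, process_instance_id), the retention parameter and the purge_completed function are hypothetical and do not reflect the actual DataIndex schema or any existing Kogito API.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

COMPLETED = 2  # state value seen in the JDBCProcessInstances log line in this thread


def purge_completed(conn, retention, now=None):
    """Delete completed process instances that ended before (now - retention).

    Returns the number of rows removed from 'processes'.
    Column names are assumptions made for this sketch.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - retention).isoformat()
    with conn:  # single transaction: child rows first, then their parents
        conn.execute(
            "DELETE FROM nodes WHERE process_instance_id IN ("
            " SELECT id FROM processes WHERE state = ? AND end_time < ?)",
            (COMPLETED, cutoff),
        )
        cur = conn.execute(
            "DELETE FROM processes WHERE state = ? AND end_time < ?",
            (COMPLETED, cutoff),
        )
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE processes (id TEXT PRIMARY KEY, state INTEGER, end_time TEXT);
        CREATE TABLE nodes (id TEXT PRIMARY KEY,
                            process_instance_id TEXT REFERENCES processes(id));
        """
    )
    now = datetime(2024, 2, 21, tzinfo=timezone.utc)
    old = (now - timedelta(days=10)).isoformat()
    recent = (now - timedelta(hours=1)).isoformat()
    conn.executemany(
        "INSERT INTO processes VALUES (?, ?, ?)",
        [("old-done", COMPLETED, old), ("new-done", COMPLETED, recent),
         ("active", 1, None)],
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?)",
        [("n1", "old-done"), ("n2", "new-done"), ("n3", "active")],
    )
    removed = purge_completed(conn, timedelta(days=7), now=now)
    remaining = [r[0] for r in conn.execute("SELECT id FROM processes ORDER BY id")]
    nodes_left = conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
    return removed, remaining, nodes_left
```

With a 7-day retention, only the instance completed 10 days ago (and its node row) is removed; recently completed and active instances stay queryable. The retention value is where a "for a while" setting would plug in.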
> > > > > > > >
> > > > > > > > > On Tue, Feb 13, 2024 at 12:59 PM Francisco Javier Tirado Sarti <
> > > > > > > > > [email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Martin,
> > > > > > > > > > Good catch! It looks like the skipping performed for process
> > > > > > > > > > instances is not applied to node instances. Something we
> > > > > > > > > > definitely need to review on the runtimes side.
> > > > > > > > >
> > > > > > > > > > On Mon, Feb 12, 2024 at 11:59 PM Martin Weiler <[email protected]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > >> On a somewhat related note, testing a simple workflow (start ->
> > > > > > > > > >> script node -> end), I see the following messages in the logs:
> > > > > > > > > >> 2024-02-12 22:49:50,493 28758dde544c WARN
> > > > > > > > > >> [org.kie.kogito.persistence.jdbc.JDBCProcessInstances:-1]
> > > > > > > > > >> (executor-thread-3) Skipping create of process instance id:
> > > > > > > > > >> 7083088e-b899-47cb-b85c-5d9ccb0aa166, state: 2
> > > > > > > > > >>
> > > > > > > > > >> So far, so good. And I'd expect to see no trace of this process
> > > > > > > > > >> in the database if I don't have data audit enabled.
> > > > > > > > > >>
> > > > > > > > > >> However, the 'processes' table contains a row with state=2, with
> > > > > > > > > >> related entries in the 'nodes' table. In a load test, I see these
> > > > > > > > > >> tables grow significantly over time. Am I missing something to
> > > > > > > > > >> have these entries cleaned up automatically?
> > > > > > > > >>
> > > > > > > > >> ________________________________________
> > > > > > > > >> From: Martin Weiler <[email protected]>
> > > > > > > > >> Sent: Monday, February 12, 2024 3:40 PM
> > > > > > > > >> To: [email protected]
> > > > > > > > > >> Subject: [EXTERNAL] RE: [DISCUSSION] Performance issues with
> > > > > > > > > >> data-index persistence addon
> > > > > > > > > >>
> > > > > > > > > >> Thanks everyone for your input. Based on this discussion, I
> > > > > > > > > >> opened the following PR:
> > > > > > > > > >> https://github.com/apache/incubator-kie-kogito-apps/pull/1985
> > > > > > > > > >>
> > > > > > > > > >> With this change, the performance seems to be stable over time:
> > > > > > > > > >> https://drive.google.com/file/d/1zkullvfrJpRp7TRjxDa41ok6kEIR7Fty/view?usp=sharing
> > > > > > > > >>
> > > > > > > > >> Martin
> > > > > > > > >>
> > > > > > > > >> ________________________________________
> > > > > > > > >> From: Gonzalo Muñoz <[email protected]>
> > > > > > > > >> Sent: Friday, February 9, 2024 9:42 AM
> > > > > > > > >> To: [email protected]
> > > > > > > > > >> Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with
> > > > > > > > > >> data-index persistence addon
> > > > > > > > > >>
> > > > > > > > > >> Great work Francisco,
> > > > > > > > > >> Martin, take a look at this link with some related tips (in case
> > > > > > > > > >> you find it useful):
> > > > > > > > > >> https://www.cybertec-postgresql.com/en/index-your-foreign-key/
> > > > > > > > >>
> > > > > > > > > >> On Fri, Feb 9, 2024 at 17:20, Francisco Javier Tirado Sarti (<
> > > > > > > > > >> [email protected]>) wrote:
> > > > > > > > >>
> > > > > > > > > >> > For the time being, we will keep JPA until we exhaust all
> > > > > > > > > >> > possibilities; let's call switching from JPA to JDBC our
> > > > > > > > > >> > hidden plan B ;)
> > > > > > > > > >> > I already told Martin, but so that everyone knows: just after
> > > > > > > > > >> > writing the previous email, I thought "what if Postgres is not
> > > > > > > > > >> > automatically indexing foreign keys like MySQL?" and, eureka!
> > > > > > > > > >> > Postgres doc:
> > > > > > > > > >> > https://www.postgresql.org/docs/current/ddl-constraints.html
> > > > > > > > > >> > MySQL doc:
> > > > > > > > > >> > https://dev.mysql.com/doc/refman/8.0/en/constraint-foreign-key.html
> > > > > > > > > >> > These are the relevant excerpts:
> > > > > > > > > >> >
> > > > > > > > > >> > *PostgreSQL*
> > > > > > > > > >> > *A foreign key must reference columns that either are a primary
> > > > > > > > > >> > key or form a unique constraint, or are columns from a
> > > > > > > > > >> > non-partial unique index. This means that the referenced
> > > > > > > > > >> > columns always have an index to allow efficient lookups on
> > > > > > > > > >> > whether a referencing row has a match. Since a DELETE of a row
> > > > > > > > > >> > from the referenced table or an UPDATE of a referenced column
> > > > > > > > > >> > will require a scan of the referencing table for rows matching
> > > > > > > > > >> > the old value, it is often a good idea to index the referencing
> > > > > > > > > >> > columns too. Because this is not always needed, and there are
> > > > > > > > > >> > many choices available on how to index, the declaration of a
> > > > > > > > > >> > foreign key constraint does not automatically create an index
> > > > > > > > > >> > on the referencing columns.*
> > > > > > > > > >> > *MySQL*
> > > > > > > > > >> > *MySQL requires that foreign key columns be indexed; if you
> > > > > > > > > >> > create a table with a foreign key constraint but no index on a
> > > > > > > > > >> > given column, an index is created.*
> > > > > > > > > >> >
> > > > > > > > > >> > So I asked Martin to specifically create an index on the
> > > > > > > > > >> > process_instance_id column of the nodes table.
> > > > > > > > > >> > I think that will fix the problem detected in the thread dump.
> > > > > > > > > >> > The simpler process test to verify the queries are fine still
> > > > > > > > > >> > stands, though ;)
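
To illustrate the point about foreign keys not being indexed automatically, here is a small sketch that uses Python's built-in sqlite3 as a stand-in (SQLite, like PostgreSQL, does not create an index on the referencing column). The table and column names nodes and process_instance_id come from this thread; the rest of the schema, the index name and the helper functions are assumptions made for the example, and query plans on a real PostgreSQL instance will look different.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE processes (id TEXT PRIMARY KEY, state INTEGER);
        CREATE TABLE nodes (
            id TEXT PRIMARY KEY,
            name TEXT,
            process_instance_id TEXT REFERENCES processes(id)
        );
        """
    )
    lookup = "SELECT id, name FROM nodes WHERE process_instance_id = ?"
    # Without an index on the referencing column this lookup scans the table.
    before = query_plan(conn, lookup, ("p1",))
    # Explicit index on the foreign key column (index name is hypothetical).
    conn.execute(
        "CREATE INDEX idx_nodes_process_instance_id ON nodes (process_instance_id)"
    )
    after = query_plan(conn, lookup, ("p1",))
    return before, after
```

Calling demo() returns the plan text before and after the index is created; after creation the plan should mention idx_nodes_process_instance_id instead of a full scan of nodes. On PostgreSQL the equivalent step would be a plain CREATE INDEX statement on nodes (process_instance_id).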
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > > >> > On Fri, Feb 9, 2024 at 5:10 PM Tibor Zimányi <[email protected]>
> > > > > > > > > >> > wrote:
> > > > > > > > > >> >
> > > > > > > > > >> > > I always preferred pure JDBC over Hibernate myself, just for
> > > > > > > > > >> > > the sake of control of what is happening :) So I would not -1
> > > > > > > > > >> > > that myself.
> > > > > > > > > >> > >
> > > > > > > > > >> > > Tibor
> > > > > > > > > >> > >
> > > > > > > > > >> > > On Fri, Feb 9, 2024, 17:00 Francisco Javier Tirado Sarti <
> > > > > > > > > >> > > [email protected]> wrote:
> > > > > > > > >> > >
> > > > > > > > > >> > > > Hi,
> > > > > > > > > >> > > > Usually I do not want to talk about work in progress because
> > > > > > > > > >> > > > preliminary conclusions are pretty volatile but, well, there
> > > > > > > > > >> > > > are a couple of things that can be concluded from the really
> > > > > > > > > >> > > > valuable information that Martin provided.
> > > > > > > > > >> > > > 1) In order to determine whether the number of statements is
> > > > > > > > > >> > > > larger than expected, I asked Martin to test with a simpler
> > > > > > > > > >> > > > process definition: one with just three nodes (start, script
> > > > > > > > > >> > > > and end), where the script node changes just one variable.
> > > > > > > > > >> > > > This way we can analyze whether the number of queries is the
> > > > > > > > > >> > > > expected one. From the single log (audit was activated then)
> > > > > > > > > >> > > > my conclusion is that the number of inserts/updates on
> > > > > > > > > >> > > > processes and nodes (there are a lot on tasks, which I prefer
> > > > > > > > > >> > > > to skip for now, baby steps) is the expected one.
> > > > > > > > > >> > > > 2) Analysing the thread dump, we see around 15 threads
> > > > > > > > > >> > > > executing this line at
> > > > > > > > > >> > > > org.kie.kogito.index.jpa.storage.ProcessInstanceEntityStorage.indexNode(ProcessInstanceEntityStorage.java:125),
> > > > > > > > > >> > > > so it's pretty clear which code needs to be optimized ;). I'm
> > > > > > > > > >> > > > evaluating possibilities within JPA/Hibernate, but I'm
> > > > > > > > > >> > > > starting to think that it might be better to switch to JDBC
> > > > > > > > > >> > > > and skip Hibernate. Our lives will be simpler, especially
> > > > > > > > > >> > > > with a relatively simple schema like ours (that would be my
> > > > > > > > > >> > > > recommendation if I were an external consultant).
> > > > > > > > >> > > >
> > > > > > > > > >> > > > On Fri, Feb 9, 2024 at 4:15 PM Tibor Zimányi <[email protected]>
> > > > > > > > > >> > > > wrote:
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > > Hi,
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > this will be a bit off-topic. However, as far as performance
> > > > > > > > > >> > > > > goes, I think we should consider that we have string primary
> > > > > > > > > >> > > > > keys (IDs). I would expect database systems to be much
> > > > > > > > > >> > > > > better at indexing numeric keys than strings. I remember
> > > > > > > > > >> > > > > from the past, when I was working with DBs, that using
> > > > > > > > > >> > > > > strings as keys or indexes was a discouraged practice.
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > Best regards,
> > > > > > > > > >> > > > > Tibor
> > > > > > > > > >> > > > >
> > > > > > > > > >> > > > > On Thu, Feb 8, 2024, 22:45 Martin Weiler <[email protected]>
> > > > > > > > > >> > > > > wrote:
> > > > > > > > >> > > > >
> > > > > > > > > >> > > > > > I changed the test to use MongoDB [1] and I don't see a
> > > > > > > > > >> > > > > > performance degradation with this setup [2].
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > Please keep us posted of your findings. Thanks!
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > Martin
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > [1] https://github.com/martinweiler/job-service-refactor-test/tree/mongodb
> > > > > > > > > >> > > > > > [2] https://drive.google.com/file/d/1NfacXaxJlgRMw4OQ5S20cvkzvaUKUVFj/view?usp=sharing
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > > > ________________________________________
> > > > > > > > > >> > > > > > From: Francisco Javier Tirado Sarti <[email protected]>
> > > > > > > > > >> > > > > > Sent: Wednesday, February 7, 2024 11:40 AM
> > > > > > > > > >> > > > > > To: [email protected]
> > > > > > > > > >> > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with
> > > > > > > > > >> > > > > > data-index persistence addon
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > Yes, it can be index degradation because of size, but I
> > > > > > > > > >> > > > > > believe (I might be wrong) the DB is too small (yet) for
> > > > > > > > > >> > > > > > that.
> > > > > > > > > >> > > > > > But, eventually, Postgres, when the DB is huge enough, will
> > > > > > > > > >> > > > > > unavoidably behave like the graph that Martin sent.
> > > > > > > > > >> > > > > > Since I believe we are not huge enough (yet), let's rule
> > > > > > > > > >> > > > > > out another issue by analysing the SQL logs (I requested
> > > > > > > > > >> > > > > > those from Martin offline and he is kindly going to collect
> > > > > > > > > >> > > > > > them).
> > > > > > > > > >> > > > > > Also, I'm curious to know whether Mongo behaves in the same
> > > > > > > > > >> > > > > > way.
> > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > On Wed, Feb 7, 2024 at 7:25 PM Enrique Gonzalez Martinez <
> > > > > > > > > >> > > > > > [email protected]> wrote:
> > > > > > > > > >> > > > > >
> > > > > > > > > >> > > > > > > Hi Francisco,
> > > > > > > > > >> > > > > > > I would highly recommend checking the indexes and how the
> > > > > > > > > >> > > > > > > updates work in data index, to avoid full table scans and
> > > > > > > > > >> > > > > > > locking the whole table. Some DBs are very sensitive to
> > > > > > > > > >> > > > > > > that.
> > > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > > > On Wed, Feb 7, 2024, 18:41, Francisco Javier Tirado Sarti <
> > > > > > > > > >> > > > > > > [email protected]> wrote:
> > > > > > > > >> > > > > > >
> > > > > > > > > >> > > > > > > > Hi Martin,
> > > > > > > > > >> > > > > > > > While I analyze the data, let me ask you if it is
> > > > > > > > > >> > > > > > > > possible to perform another check (similar in a way to
> > > > > > > > > >> > > > > > > > disabling data-index like you do). Can you switch to
> > > > > > > > > >> > > > > > > > MongoDB persistence and check if the same degradation
> > > > > > > > > >> > > > > > > > that is there for Postgres remains?
> > > > > > > > > >> > > > > > > > I do not know if this is feasible, but it would certainly
> > > > > > > > > >> > > > > > > > indicate whether the problem is in the Postgres storage
> > > > > > > > > >> > > > > > > > layer, and I do not have a clear prediction of what we
> > > > > > > > > >> > > > > > > > will see when making this switch.
> > > > > > > > > >> > > > > > > >
> > > > > > > > > >> > > > > > > > On Wed, Feb 7, 2024 at 6:37 PM Martin Weiler <[email protected]>
> > > > > > > > > >> > > > > > > > wrote:
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > > > > Hi Francisco,
> > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > thanks for your work on this important topic!
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > I would like to share some test results here, which
> > > > > > > > > >> > > > > > > > > might help to improve the codebase even further. I am
> > > > > > > > > >> > > > > > > > > using the jmeter based test case from Pere and Enrique
> > > > > > > > > >> > > > > > > > > (thanks guys!) [1], which uses a load of 30 threads to
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > 1) start a new process instance (POST)
> > > > > > > > > >> > > > > > > > > 2) retrieve tasks for a user (GET)
> > > > > > > > > >> > > > > > > > > 3) fetch task details (GET)
> > > > > > > > > >> > > > > > > > > 4) complete a task (POST)
> > > > > > > > > >> > > > > > > > > 5) execute a query on data-audit
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > With this test setup, I noticed that the performance of
> > > > > > > > > >> > > > > > > > > the POST requests, in particular the one to start a new
> > > > > > > > > >> > > > > > > > > process instance, degrades over time - see graph [2].
> > > > > > > > > >> > > > > > > > > If I run the same test without data-index, then there
> > > > > > > > > >> > > > > > > > > is no such performance degradation [3]. You can find a
> > > > > > > > > >> > > > > > > > > thread dump captured a few minutes into the first test
> > > > > > > > > >> > > > > > > > > here [4] that might help to see some of the contention
> > > > > > > > > >> > > > > > > > > points.
> > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > I'd appreciate it if you could take a look and see if
> > > > > > > > > >> > > > > > > > > there is something that can be further improved based
> > > > > > > > > >> > > > > > > > > on your previous work. If you need any additional data,
> > > > > > > > > >> > > > > > > > > let me know, but otherwise it is straightforward to run
> > > > > > > > > >> > > > > > > > > the jmeter test as well.
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > Thanks,
> > > > > > > > >> > > > > > > > > Martin
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > [1]
> > > > > > > > >> https://github.com/pefernan/job-service-refactor-test/
> > > > > > > > >> > > > > > > > > [2]
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://drive.google.com/file/d/1Gqn-ixE05kXv2jdssAUlnMuUVcHxIYZ0/view?usp=sharing
> > > > > > > > >> > > > > > > > > [3]
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://drive.google.com/file/d/10gVNyb4JYg_bA18bNhY9dEDbPn3TOxL7/view?usp=sharing
> > > > > > > > >> > > > > > > > > [4]
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://drive.google.com/file/d/1jVrtsO49gCvUlnaC9AUAtkVKTm4PbdUv/view?usp=sharing
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > > ________________________________________
> > > > > > > > > >> > > > > > > > > From: Francisco Javier Tirado Sarti <[email protected]>
> > > > > > > > > >> > > > > > > > > Sent: Wednesday, January 17, 2024 9:13 AM
> > > > > > > > > >> > > > > > > > > To: [email protected]
> > > > > > > > > >> > > > > > > > > Cc: Pere Fernandez Perez
> > > > > > > > > >> > > > > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues
> > > > > > > > > >> > > > > > > > > with data-index persistence addon
> > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > Hi Alex,
> > > > > > > > > >> > > > > > > > > I did not take timings (which depend on a number of
> > > > > > > > > >> > > > > > > > > variables that change drastically between
> > > > > > > > > >> > > > > > > > > environments), but I verified that the number of
> > > > > > > > > >> > > > > > > > > updates has been reduced drastically without losing
> > > > > > > > > >> > > > > > > > > functionality, which is objectively a good thing.
> > > > > > > > > >> > > > > > > > > Before the change, for every node executed we had an
> > > > > > > > > >> > > > > > > > > update for every node previously executed, so if a
> > > > > > > > > >> > > > > > > > > process has 50 nodes to execute, we were performing
> > > > > > > > > >> > > > > > > > > nearly 50*51/2 updates, which gives a total of 1275
> > > > > > > > > >> > > > > > > > > updates. Now we have just one for every node being
> > > > > > > > > >> > > > > > > > > executed, implying a total of 50 updates.
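
The arithmetic above can be checked with a short Python sketch. The function names are made up for this example; the only assumption taken from the message is that, before the change, executing the k-th node caused about k updates (one per node executed so far), and that after the change each node causes exactly one.

```python
def updates_before(n):
    """Executing the k-th node updated every node executed so far (k updates)."""
    return sum(range(1, n + 1))  # 1 + 2 + ... + n, i.e. n * (n + 1) // 2


def updates_after(n):
    """After the change: one update per executed node."""
    return n


if __name__ == "__main__":
    for n in (3, 10, 50):
        print(n, updates_before(n), updates_after(n))
```

For a 50-node process this gives 1275 updates before the change versus 50 after, matching the figures quoted; the old behaviour grows quadratically with the number of nodes while the new one grows linearly.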
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > On Wed, Jan 17, 2024 at 3:18 PM Alex Porcelli <[email protected]>
> > > > > > > > > >> > > > > > > > > wrote:
> > > > > > > > > >> > > > > > > > >
> > > > > > > > > >> > > > > > > > > > Francisco,
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > I noticed that your PR has been merged, but I was
> > > > > > > > > >> > > > > > > > > > expecting (at least that was my understanding from
> > > > > > > > > >> > > > > > > > > > this thread) that some benchmark data would be shared
> > > > > > > > > >> > > > > > > > > > before merging, to assess the cost/benefit of such a
> > > > > > > > > >> > > > > > > > > > decent-sized change.
> > > > > > > > > >> > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > Do you have any information to share?
> > > > > > > > >> > > > > > > > > >
> > > > > > > > >> > > > > > > > > > On Sat, Dec 23, 2023 at 4:02 AM
> Francisco
> > > > Javier
> > > > > > > > Tirado
> > > > > > > > >> > Sarti
> > > > > > > > >> > > > > > > > > > <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > Yes, as intended, now we have one
> select
> > > and
> > > > > one
> > > > > > > > >> > > > insert/update
> > > > > > > > >> > > > > > per
> > > > > > > > >> > > > > > > > node
> > > > > > > > >> > > > > > > > > > > event.
> > > > > > > > >> > > > > > > > > > > I moved the PR as ready for review and
> > > give
> > > > > > @Pere
> > > > > > > > >> > Fernandez
> > > > > > > > >> > > > > Perez
> > > > > > > > >> > > > > > > > > > > <[email protected]> permission to
> the
> > > > > branch
> > > > > > so
> > > > > > > > he
> > > > > > > > >> can
> > > > > > > > >> > > > edit
> > > > > > > > >> > > > > it
> > > > > > > > >> > > > > > > in
> > > > > > > > >> > > > > > > > > the
> > > > > > > > > >> > > > > > > > > > > next two weeks (I'll be on PTO) if
> > > desired,
> > > > > > before
> > > > > > > > >> > merging.
> > > > > > > > >> > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > On Thu, Dec 21, 2023 at 5:58 PM Alex
> > > > Porcelli
> > > > > <
> > > > > > > > >> > > > > [email protected]>
> > > > > > > > >> > > > > > > > > wrote:
> > > > > > > > >> > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > Cool, thank you Francisco!
> > > > > > > > >> > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > Did you manage to get some
> preliminary
> > > > data
> > > > > > > about
> > > > > > > > >> > > > > improvements?
> > > > > > > > >> > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > On Thu, Dec 21, 2023 at 11:52 AM
> > > Francisco
> > > > > > > Javier
> > > > > > > > >> > Tirado
> > > > > > > > >> > > > > Sarti
> > > > > > > > >> > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > Yes, after some delay because of
> > > > quarkus 3
> > > > > > > > >> migration.
> > > > > > > > > >> > > I'm
> > > > > > > > >> > > > > > > refining
> > > > > > > > >> > > > > > > > > > this
> > > > > > > > >> > > > > > > > > > > > > draft PR
> > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > https://github.com/apache/incubator-kie-kogito-apps/pull/1941
> > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > On Thu, Dec 21, 2023 at 5:48 PM
> Alex
> > > > > > Porcelli
> > > > > > > <
> > > > > > > > >> > > > > > > [email protected]>
> > > > > > > > >> > > > > > > > > > wrote:
> > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > Any update or new findings on
> this
> > > > > topic?
> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > On Tue, Nov 28, 2023 at 8:38 AM
> > > > > Francisco
> > > > > > > > Javier
> > > > > > > > >> > > Tirado
> > > > > > > > >> > > > > > Sarti
> > > > > > > > >> > > > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > > Hi Alex,
> > > > > > > > >> > > > > > > > > > > > > > > After considering different
> > > options
> > > > to
> > > > > > > > improve
> > > > > > > > >> > > > > > performance,
> > > > > > > > >> > > > > > > > we
> > > > > > > > >> > > > > > > > > > feel
> > > > > > > > >> > > > > > > > > > > > that
> > > > > > > > >> > > > > > > > > > > > > > it
> > > > > > > > >> > > > > > > > > > > > > > > is time to "partially" move
> away
> > > > from
> > > > > > the
> > > > > > > > >> current
> > > > > > > > >> > > Map
> > > > > > > > >> > > > > > style
> > > > > > > > >> > > > > > > > > > > > interface (
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > >
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-kie-kogito-apps/blob/main/persistence-commons/persistence-commons-api/src/main/java/org/kie/kogito/persistence/api/Storage.java
> > > > > > > > >> > > > > > > > > > > > > > )
> > > > > > > > >> > > > > > > > > > > > > > > which was shared with Trusty,
> to
> > > one
> > > > > > more
> > > > > > > > >> > suitable
> > > > > > > > >> > > > for
> > > > > > > > >> > > > > > > usage
> > > > > > > > >> > > > > > > > > > with a
> > > > > > > > >> > > > > > > > > > > > > > > relational DB like postgresql
> > (but
> > > > > still
> > > > > > > > >> > compatible
> > > > > > > > >> > > > > with
> > > > > > > > >> > > > > > > big
> > > > > > > > >> > > > > > > > > > table
> > > > > > > > >> > > > > > > > > > > > dbs).
> > > > > > > > >> > > > > > > > > > > > > > > The idea will be to replace
> > > generic
> > > > > > > Storage
> > > > > > > > >> > > interface
> > > > > > > > >> > > > > by
> > > > > > > > >> > > > > > > four
> > > > > > > > >> > > > > > > > > > > > specific
> > > > > > > > >> > > > > > > > > > > > > > > interfaces (which will inherit
> > > from
> > > > a
> > > > > > > common
> > > > > > > > >> one
> > > > > > > > >> > > that
> > > > > > > > >> > > > > > keeps
> > > > > > > > >> > > > > > > > the
> > > > > > > > >> > > > > > > > > > query
> > > > > > > > >> > > > > > > > > > > > > > part
> > > > > > > > > >> > > > > > > > > > > > > > > as it is, with get and query
> > > > methods),
> > > > > > > that
> > > > > > > > >> will
> > > > > > > > >> > > > > include
> > > > > > > > >> > > > > > > the
> > > > > > > > >> > > > > > > > > > required
> > > > > > > > >> > > > > > > > > > > > > > > modification operations for
> the
> > > four
> > > > > > > > DataIndex
> > > > > > > > >> > > > > "domains":
> > > > > > > > >> > > > > > > > > > > > > > processinstance,
> > > > > > > > >> > > > > > > > > > > > > > > usertask, processdefinitions
> and
> > > > jobs.
> > > > > > > Those
> > > > > > > > >> > > > interfaces
> > > > > > > > >> > > > > > > will
> > > > > > > > >> > > > > > > > > > define
> > > > > > > > >> > > > > > > > > > > > > > methods
> > > > > > > > >> > > > > > > > > > > > > > > like addNode, addVariable,
> > > > updateTask,
> > > > > > > > >> > > > > addAttachment.....
> > > > > > > > >> > > > > > > > that
> > > > > > > > >> > > > > > > > > > will
> > > > > > > > >> > > > > > > > > > > > allow
> > > > > > > > >> > > > > > > > > > > > > > > the persistent layer
> > > implementation
> > > > > to
> > > > > > > just
> > > > > > > > >> > update
> > > > > > > > >> > > > the
> > > > > > > > >> > > > > > > > needed
> > > > > > > > >> > > > > > > > > > info
> > > > > > > > >> > > > > > > > > > > > in
> > > > > > > > >> > > > > > > > > > > > > > the
> > > > > > > > >> > > > > > > > > > > > > > > DB  (for example, for addNode
> in
> > > > > > Postgres,
> > > > > > > > >> just
> > > > > > > > >> > > > insert
> > > > > > > > >> > > > > a
> > > > > > > > >> > > > > > > row
> > > > > > > > >> > > > > > > > > into
> > > > > > > > >> > > > > > > > > > > > nodes
> > > > > > > > >> > > > > > > > > > > > > > > table, for addNode in Mongo,
> > > > basically
> > > > > > the
> > > > > > > > >> same
> > > > > > > > >> > > > atomic
> > > > > > > > >> > > > > > > upsert
> > > > > > > > >> > > > > > > > > > > > operation
> > > > > > > > >> > > > > > > > > > > > > > > that is currently done).
> > > Therefore,
> > > > we
> > > > > > > > >> increase
> > > > > > > > >> > > > > > performance
> > > > > > > > >> > > > > > > > for
> > > > > > > > >> > > > > > > > > > > > Postgres
> > > > > > > > >> > > > > > > > > > > > > > > and keep the current one for
> > > Mongo.
> > > > > The
> > > > > > > > >> current
> > > > > > > > >> > DB
> > > > > > > > >> > > > > > schemas
> > > > > > > > >> > > > > > > > > won't
> > > > > > > > >> > > > > > > > > > be
> > > > > > > > >> > > > > > > > > > > > > > > touched.
> > > > > > > > >> > > > > > > > > > > > > > > Since the code change is
> large,
> > I
> > > do
> > > > > not
> > > > > > > > think
> > > > > > > > >> > I'll
> > > > > > > > >> > > > be
> > > > > > > > >> > > > > > able
> > > > > > > > >> > > > > > > > to
> > > > > > > > >> > > > > > > > > > have
> > > > > > > > >> > > > > > > > > > > > the
> > > > > > > > >> > > > > > > > > > > > > > PR
> > > > > > > > >> > > > > > > > > > > > > > > ready till next week.
> > > > > > > > >> > > > > > > > > > > > > > > But before starting, please
> let
> > me
> > > > > know
> > > > > > if
> > > > > > > > >> that
> > > > > > > > >> > > > > approach
> > > > > > > > >> > > > > > is
> > > > > > > > >> > > > > > > > > fine
> > > > > > > > >> > > > > > > > > > for
> > > > > > > > >> > > > > > > > > > > > you.
> > > > > > > > >> > > > > > > > > > > > > > > Best regards.
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at
> 6:55 PM
> > > Alex
> > > > > > > > Porcelli
> > > > > > > > >> <
> > > > > > > > >> > > > > > > > > [email protected]>
> > > > > > > > >> > > > > > > > > > > > wrote:
> > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > Thank you Francisco for
> getting
> > > > > deeper
> > > > > > on
> > > > > > > > >> this…
> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > >> > > > > > > > > > > > > > > > Looking forward to seeing the
> > > results
> > > > > of
> > > > > > > your
> > > > > > > > >> > > > suggested
> > > > > > > > >> > > > > > > > > > improvements.
> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at
> > 9:40 AM
> > > > > > > Francisco
> > > > > > > > >> > Javier
> > > > > > > > >> > > > > Tirado
> > > > > > > > >> > > > > > > > > Sarti <
> > > > > > > > >> > > > > > > > > > > > > > > > [email protected]> wrote:
> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > > > > I forgot to attach the
> > queries
> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at
> > > 3:04 PM
> > > > > > > > Francisco
> > > > > > > > >> > > Javier
> > > > > > > > >> > > > > > Tirado
> > > > > > > > >> > > > > > > > > > Sarti <
> > > > > > > > >> > > > > > > > > > > > > > > > > [email protected]>
> wrote:
> > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > > > > >> Hi,
> > > > > > > > >> > > > > > > > > > > > > > > > >> A brief update on this
> > topic.
> > > > > > > > >> > > > > > > > > > > > > > > > >> After doing a simple test
> > > with
> > > > > > > example
> > > > > > > > >> > > > > > > > > > > > > > > > >>
> > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > > > >
> > > > > > > > >> > > > > > > > > >
> > > > > > > > >> > > > > > > > >
> > > > > > > > >> > > > > > > >
> > > > > > > > >> > > > > > >
> > > > > > > > >> > > > > >
> > > > > > > > >> > > > >
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-kie-kogito-examples/tree/stable/serverless-workflow-examples/serverless-workflow-data-index-quarkus
> > > > > > > > >> > > > > > > > > > > > > > > > ,
> > > > > > > > >> > > > > > > > > > > > > > > > >> the number of updates
> over
> > > > Nodes
> > > > > > > table
> > > > > > > > is
> > > > > > > > >> > n*n,
> > > > > > > > >> > > > so
> > > > > > > > >> > > > > we
> > > > > > > > >> > > > > > > > > manage
> > > > > > > > >> > > > > > > > > > to
> > > > > > > > >> > > > > > > > > > > > > > obtain a
> > > > > > > > >> > > > > > > > > > > > > > > > >> perfect quadratic
> > performance
> > > > > > > > >> degradation.
> > > > > > > > >> > The
> > > > > > > > >> > > > > > problem
> > > > > > > > >> > > > > > > > is
> > > > > > > > >> > > > > > > > > > worse
> > > > > > > > >> > > > > > > > > > > > in
> > > > > > > > >> > > > > > > > > > > > > > the
> > > > > > > > >> > > > > > > > > > > > > > > > case
> > > > > > > > >> > > > > > > > > > > > > > > > >> of Serverless Workflow
> than
> > > in
> > > > > BPMN
> > > > > > > > >> because
> > > > > > > > >> > > > the
> > > > > > > > >> > > > > > > > number
> > > > > > > > >> > > > > > > > > of
> > > > > > > > >> > > > > > > > > > > > nodes
> > > > > > > > >> > > > > > > > > > > > > > is
> > > > > > > > >> > > > > > > > > > > > > > > > >> greater than the number
> of
> > > > > states.
> > > > > > In
> > > > > > > > >> that
> > > > > > > > >> > > > > example N
> > > > > > > > >> > > > > > > is
> > > > > > > > >> > > > > > > > > 16,
> > > > > > > > >> > > > > > > > > > but
> > > > > > > > >> > > > > > > > > > > > for
> > > > > > > > >> > > > > > > > > > > > > > a
> > > > > > > > >> > > > > > > > > > > > > > > > more
> > > > > > > > >> > > > > > > > > > > > > > > > >> complex workflow it would
> > be
> > > > > > > certainly
> > > > > > > > >> > large.
> > > > > > > > >> > > > > > > > > > > > > > > > >> I think that this is more
> > > > related
> > > > > > to
> > > > > > > > how
> > > > > > > > >> we
> > > > > > > > >> > > are
> > > > > > > > >> > > > > > > handling
> > > > > > > > >> > > > > > > > > > JPA in
> > > > > > > > >> > > > > > > > > > > > the
> > > > > > > > >> > > > > > > > > > > > > > > > code,
> > > > > > > > >> > > > > > > > > > > > > > > > >> in particular the mapping
> > > from
> > > > > > model
> > > > > > > to
> > > > > > > > >> > entity
> > > > > > > > >> > > > > > > > (basically
> > > > > > > > >> > > > > > > > > > JPA is
> > > > > > > > >> > > > > > > > > > > > > > blind
> > > > > > > > >> > > > > > > > > > > > > > > > and
> > > > > > > > >> > > > > > > > > > > > > > > > >> has to update all nodes
> for
> > > > every
> > > > > > > write
> > > > > > > > >> > > because
> > > > > > > > >> > > > it
> > > > > > > > >> > > > > > > > > believes
> > > > > > > > >> > > > > > > > > > the
> > > > > > > > >> > > > > > > > > > > > > > node has
> > > > > > > > >> > > > > > > > > > > > > > > > >> been updated, although it
> > is
> > > > not)
> > > > > > > than
> > > > > > > > an
> > > > > > > > >> > > issue
> > > > > > > > >> > > > in
> > > > > > > > >> > > > > > the
> > > > > > > > >> > > > > > > > > table
> > > > > > > > >> > > > > > > > > > > > > > definition.
> > > > > > > > >> > > > > > > > > > > > > > > > >> In fact, when using JPA,
> > > > > separating
> > > > > > > the
> > > > > > > > >> > server
> > > > > > > > >> > > > > model
> > > > > > > > >> > > > > > > > from
> > > > > > > > >> > > > > > > > > > the
> > > > > > > > >> > > > > > > > > > > > JPA
> > > > > > > > >> > > > > > > > > > > > > > > > entity is
> > > > > > > > >> > > > > > > > > > > > > > > > >> not a good idea,
> especially
> > > if
> > > > > the
> > > > > > > > entity
> > > > > > > > >> > > > contains
> > > > > > > > >> > > > > > > > > > collections.
> > > > > > > > >> > > > > > > > > > > > I
> > > > > > > > >> > > > > > > > > > > > > > will
> > > > > > > > >> > > > > > > > > > > > > > > > try
> > > > > > > > >> > > > > > > > > > > > > > > > >> to change that without
> > > breaking
> > > > > > > > anything.
> > > > > > > > >> > > > > > > > > > > > > > > > >>
> > > > > > > > >> > > > > > > > > > > > > > > > >> On Wed, Nov 22, 2023 at
> > > > 12:10 PM
> > > > > > > > Enrique
> > > > > > > > >> > > > Gonzalez
> > > > > > > > >> > > > > > > > > Martinez <
> > > > > > > > >> > > > > > > > > > > > > > > > >> [email protected]>
> > wrote:
> > > > > > > > >> > > > > > > > > > > > > > > > >>
> > > > > > > > >> > > > > > > > > > > > > > > > >>> After the events split
> you
> > > now
> > > > > > will
> > > > > > > > >> need to
> > > > > > > > >> > > > > create
> > > > > > > > >> > > > > > a
> > > > > > > > >> > > > > > > > node
> > > > > > > > >> > > > > > > > > > > > instance
> > > > > > > > >> > > > > > > > > > > > > > > > >>> model instance of making
> > > > > > independent
> > > > > > > > >> from
> > > > > > > > >> > the
> > > > > > > > >> > > > > > process
> > > > > > > > >> > > > > > > > > > instance.
> > > > > > > > >> > > > > > > > > > > > > > > > >>> That should do the
> trick.
> > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > >> > > > > > > > > > > > > > > > >>> Regarding
> > deleting/inserting
> > > > it
> > > > > > was
> > > > > > > > >> fixed
> > > > > > > > >> > at
> > > > > > > > >> > > > some
> > > > > > > > >> > > > > > > > point.
> > > > > > > > >> > > > > > > > > > > > > > > > >>>
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> On Tue, Nov 21, 2023 at 20:22, Francisco Javier Tirado Sarti
> > > > > > > > > >> > > > > > > > > > > > > > > > >>> (<[email protected]>) wrote:
> > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > Hi Martin,
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > I have a task to
> review
> > > > > > > performance
> > > > > > > > of
> > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > ProcessInstanceNodeDataEventMerger
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > My idea is to reduce
> the
> > > > > number
> > > > > > of
> > > > > > > > >> delete
> > > > > > > > >> > > > > inserts
> > > > > > > > >> > > > > > > > when
> > > > > > > > >> > > > > > > > > > > > processing
> > > > > > > > >> > > > > > > > > > > > > > > > >>> events
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > and try to do it
> > > > incrementally.
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > That should improve
> > > > > performance.
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > PS:
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > I was planning to send
> > an
> > > > > e-mail
> > > > > > > > >> tomorrow
> > > > > > > > >> > > > > > > announcing
> > > > > > > > >> > > > > > > > > > that in
> > > > > > > > >> > > > > > > > > > > > > > case you
> > > > > > > > >> > > > > > > > > > > > > > > > >>> were
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > already working on a
> fix
> > > for
> > > > > > > that. I
> > > > > > > > >> > assume
> > > > > > > > >> > > > you
> > > > > > > > >> > > > > > are
> > > > > > > > >> > > > > > > > not
> > > > > > > > >> > > > > > > > > > and I
> > > > > > > > >> > > > > > > > > > > > > > would
> > > > > > > > >> > > > > > > > > > > > > > > > be
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > sending a PR soon.
> > > > > > > > >> > > > > > > > > > > > > > > > >>> >
> > > > > > > > >> > > > > > > > > > > > > > > > >>> > On Tue, Nov 21, 2023
> at
> > > > > 6:09 PM
> > > > > > > > Martin
> > > > > > > > >> > > Weiler
> > > > > > > > >> > > > > > > > > > > > > > > > <[email protected]
> > > > > > > > >> > > > > > > > > > > > > > > > >>
