Re: [DISCUSSION] Performance issues with data-index persistence addon

Alex Porcelli Wed, 21 Feb 2024 15:15:34 -0800

I think I need to clarify what I mean by REST...

The REST endpoint I have in mind would execute the necessary cleanup
based on an input parameter (e.g. instances closed for more than X days,
or something like that).


The role of cron is that an external cron job could call the cleanup REST
endpoint periodically (once a month, every Sunday night, etc.).
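
To make the idea concrete, here is a minimal sketch of the cleanup logic
such an endpoint could run. It is an illustration only, not the data-index
implementation: it uses Python with an in-memory SQLite database as a
stand-in for PostgreSQL. The `processes` and `nodes` tables, the
`process_instance_id` column and state 2 come from this thread; the
`end_time` column, the index name and the function names are assumptions
made up for the example.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

COMPLETED = 2  # state value reported in the thread's log line ("state: 2")


def create_schema(conn):
    """Create a toy schema loosely modelled on the tables named in the thread."""
    conn.executescript("""
        CREATE TABLE processes (
            id TEXT PRIMARY KEY,
            state INTEGER NOT NULL,
            end_time INTEGER          -- hypothetical column: epoch seconds, NULL while active
        );
        CREATE TABLE nodes (
            id TEXT PRIMARY KEY,
            process_instance_id TEXT NOT NULL REFERENCES processes(id)
        );
        -- Index on the referencing column, so deleting a process does not
        -- need a full scan of nodes (hypothetical index name).
        CREATE INDEX idx_nodes_process_instance_id ON nodes(process_instance_id);
    """)


def purge_completed(conn, older_than_days, now=None):
    """Delete completed instances that ended more than older_than_days ago.

    Returns the number of process rows removed. Active instances are never
    touched, however old they are.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = int((now - timedelta(days=older_than_days)).timestamp())
    with conn:  # one transaction: child rows first, then the parents
        conn.execute(
            "DELETE FROM nodes WHERE process_instance_id IN "
            "(SELECT id FROM processes WHERE state = ? AND end_time < ?)",
            (COMPLETED, cutoff),
        )
        cur = conn.execute(
            "DELETE FROM processes WHERE state = ? AND end_time < ?",
            (COMPLETED, cutoff),
        )
    return cur.rowcount
```

An external cron job would then only need to call the REST endpoint that
wraps something like `purge_completed`, passing the desired number of days.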


On Wed, Feb 21, 2024 at 10:50 AM Francisco Javier Tirado Sarti
<[email protected]> wrote:
>
> Hi Alex,
> I'm wondering how cron and REST can live together.
> With cron, the responsibility for the cleanup lies with a process
> external to the microservice that accesses the DB to be purged, whereas
> with a REST API the process performing the cleanup is typically (but not
> necessarily) the same microservice that accesses the DB. In that sense,
> a REST API is similar to using properties (in both approaches the
> microservice is responsible for cleaning up the data), but they differ in
> that REST is more sophisticated (yet compatible: implementation-wise we
> can use a ConfigSource that is updated by the REST call and read, as a
> property, by the cleanup routine). So I'm probably missing something, but
> I don't see how the cron approach can be reconciled with the REST approach.
> Anyway, that is just a detail; I agree the purge approach should be similar
> for all microservices.
>
> On Wed, Feb 21, 2024 at 4:37 PM Alex Porcelli <[email protected]> wrote:
>
> > +1 for providing consistent behavior across the board
> >
> > my 2c:
> >
> > +1 for REST interface (preferred for external general management
> > perspective)
> > -1 for properties to auto clean; this can be achieved with cron and rest
> > call
> >
> > Now about GraphQL: the platform today is heavily invested in that
> > interface, so it might make sense. BUT I’d consider it at a later stage.
> >
> >
> > On Wed, Feb 21, 2024 at 6:52 AM Francisco Javier Tirado Sarti <
> > [email protected]> wrote:
> >
> > > Hi Enrique,
> > > If we configure such policies through properties, won't it be enough to
> > > define a naming convention and rely on the target platform's configuration
> > > capabilities (Spring Boot or Quarkus)?
> > >
> > > On Wed, Feb 21, 2024 at 12:36 PM Enrique Gonzalez Martinez <
> > > [email protected]> wrote:
> > >
> > > > Hi Francisco
> > > >
> > > > The discussion we need to have, before how to achieve certain features,
> > > > is about the overall user experience. If you want to clean it up that
> > > > way, it is fine by me. I have nothing against setting a policy for some
> > > > sort of cleanup of processes completed some time ago.
> > > >
> > > > As we discussed previously, the data index is a snapshot of the last
> > > > state of the process instance, including completed ones, but nothing was
> > > > said about when we need to clean that up once completed, so any policy is
> > > > welcome. How to achieve that policy is something completely different.
> > > >
> > > > The main problem is how every microservice exposes that API to the end
> > > > user or consumer (another system), and which system is going to be the
> > > > façade for complex deployments and operations. That is the discussion I
> > > > was mentioning before.
> > > >
> > > > What I want to avoid is setting different policies among components; we
> > > > should strive, as much as possible, to offer certain capabilities in
> > > > the same fashion, e.g. the cleanup mechanism.
> > > > In the same way, I want to avoid making the current systems more complex
> > > > than they are. So far they are aligned, each offering one simple
> > > > responsibility, but I would be against mixing, for instance, the job
> > > > service with the data index.
> > > >
> > > > On Wed, Feb 21, 2024 at 12:13, Francisco Javier Tirado Sarti (<
> > > > [email protected]>) wrote:
> > > >
> > > > > Hi Enrique,
> > > > > In the case of data index, I think the data to be purged is finished
> > > > > process instances (I do not think we should remove a process instance
> > > > > that has been alive for ages, even if it is very likely never going to
> > > > > be completed).
> > > > > Once you delete those process instances, you also delete the
> > > > > associated user tasks and jobs.
> > > > > Therefore the problem is relatively simple: being able to configure how
> > > > > long a completed process instance should remain in the data index
> > > > > database. We can take a simple approach: a property with a minimum
> > > > > duration that cannot be changed once data index is started; a slightly
> > > > > more complex one: the same property, but watched so we can react to
> > > > > changes; or the full suite: an admin API to change the policy at any
> > > > > moment.
> > > > > I think this discussion is worth having.
> > > > >
> > > > > On Wed, Feb 21, 2024 at 6:15 AM Enrique Gonzalez Martinez <
> > > > > [email protected]> wrote:
> > > > >
> > > > > > Hi Martin, the main problem regarding the purge is that the policy
> > > > > > on which tech to use, and for future components, is still unclear.
> > > > > >
> > > > > > Recently we had a discussion about proposing GraphQL for this sort of
> > > > > > admin task. So far, for subsystems we have been using REST endpoints
> > > > > > (like update timers, modify human tasks or change processes). There is
> > > > > > one exception, the gateway, which is pure GraphQL and uses GraphQL for
> > > > > > everything, performing complex operations under the hood.
> > > > > >
> > > > > > This has frozen the purge for data audit for a bit, and the proposal
> > > > > > was to use REST endpoints to do the cleanup in the component and offer
> > > > > > the GraphQL counterpart in the gateway, promoting it to a first-class
> > > > > > citizen component instead of having it embedded in the data index.
> > > > > >
> > > > > > I would suggest coming up with a policy first, at least regarding the
> > > > > > convention every component should follow for this.
> > > > > >
> > > > > >
> > > > > > On Tue, Feb 20, 2024, 23:39, Martin Weiler <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > IMO, it is good to have this discussion around data sanity now
> > > > > > > instead of putting it off until later, when data has already
> > > > > > > accumulated in production environments.
> > > > > > >
> > > > > > > Based on the input here, we are dealing with three types of data:
> > > > > > > 1. Runtime data - active instances only; the engine cleans up the
> > > > > > > data automatically at process instance end
> > > > > > > 2. Historic log data - data created by data-audit, intended for
> > > > > > > long-term storage
> > > > > > > 3. Data-index data - this data falls somewhere in between the two
> > > > > > > aforementioned categories, with the idea of the data being "recent"
> > > > > > > but not restricted to active instances only
> > > > > > >
> > > > > > > We'd need purge strategies for both #2 and #3 (perhaps different
> > > > > > > ones, or with different config settings) in order to prevent
> > > > > > > unlimited data growth.
> > > > > > >
> > > > > > > ________________________________________
> > > > > > > From: Enrique Gonzalez Martinez <[email protected]>
> > > > > > > Sent: Monday, February 19, 2024 7:11 AM
> > > > > > > To: [email protected]
> > > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with
> > > > > > > data-index persistence addon
> > > > > > >
> > > > > > > Hi Francisco,
> > > > > > > To give you more context about this:
> > > > > > >
> > > > > > > STP is a concept, a process with certain constraints: no
> > > > > > > persistence, and returning the outcome in the call (sync execution
> > > > > > > with no idle states). It was a requirement from a user in the past.
> > > > > > > One of the requirements was leaving no trail. In v7 that was easy
> > > > > > > because you could disable the audit in that case. Actually, here we
> > > > > > > have the same way to do what we did in v7, as you can add/remove the
> > > > > > > index just by adding/removing deps.
> > > > > > >
> > > > > > > We have the same outcome with different approaches, and STP is
> > > > > > > already delivered.
> > > > > > >
> > > > > > > On Mon, Feb 19, 2024 at 14:46, Francisco Javier Tirado Sarti (<
> > > > > > > [email protected]>) wrote:
> > > > > > >
> > > > > > > > Regarding STP (which is not a concept that we have in the code; I
> > > > > > > > mean, STP are processes just as non-STP are), I guess that, like all
> > > > > > > > processes, they were kept in DataIndex once completed because users
> > > > > > > > wanted (and still want) to check the result once the call had been
> > > > > > > > performed. If we want to leave no trace of them in DataIndex for
> > > > > > > > some reason, we will need to make it a Runtimes concept so DataIndex
> > > > > > > > can handle them in a different way.
> > > > > > > >
> > > > > > > > On Mon, Feb 19, 2024 at 2:27 PM Enrique Gonzalez Martinez <
> > > > > > > > [email protected]> wrote:
> > > > > > > >
> > > > > > > > > Alex:
> > > > > > > > > Right now the data index works the same way as it did in v7 with
> > > > > > > > > the emitters. The only difference between the two implementations
> > > > > > > > > is that here the storage is pgsql instead of Elasticsearch. You are
> > > > > > > > > right that it is a snapshot of the last state of the process, but
> > > > > > > > > we never defined how long that data would be kept alive. Honestly,
> > > > > > > > > I am happy right now with the way it works. The cleanup mechanism
> > > > > > > > > is still TBD because we still need to discuss other stuff first.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Regarding STP, the point is to leave no trail, because you can get
> > > > > > > > > the outcome directly from the call. It was defined like that in v7.
> > > > > > > > > So there is no use for the index or the audit.
> > > > > > > > >
> > > > > > > > > On Mon, Feb 19, 2024, 14:13, Francisco Javier Tirado Sarti <
> > > > > > > > > [email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Alex,
> > > > > > > > > > There has been some confusion about the purpose of DataIndex. To
> > > > > > > > > > be honest, I believed it had already been sorted out, but your
> > > > > > > > > > e-mail makes me think that is not the case ;). I'll let Kris
> > > > > > > > > > clarify that with you. My view is that data-index is a way to
> > > > > > > > > > query recently closed and active processes (the key here is the
> > > > > > > > > > definition of "recently", which in my opinion should be
> > > > > > > > > > configurable).
> > > > > > > > > > But, besides that discussion and being pragmatic, keeping
> > > > > > > > > > finished process instances "for a while" in DataIndex was the only
> > > > > > > > > > way for users to query the result of straight-through processes.
> > > > > > > > > > That's a function that cannot be removed right now.
> > > > > > > > > >
> > > > > > > > > > On Mon, Feb 19, 2024 at 1:33 PM Alex Porcelli <[email protected]>
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > if data index was supposed to provide a snapshot view of the
> > > > > > > > > > > process instance… why do we keep it after the process instance
> > > > > > > > > > > is finished?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Feb 19, 2024 at 7:12 AM Francisco Javier Tirado Sarti <
> > > > > > > > > > > [email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Martin.
> > > > > > > > > > > > After taking a deeper look at this, I realize that the
> > > > > > > > > > > > behaviour is the expected one.
> > > > > > > > > > > > The Runtimes DB does not track the completed process instance
> > > > > > > > > > > > (that's what the JDBCProcessInstances warning is telling us),
> > > > > > > > > > > > but DataIndex, as expected, is tracking it in the processes and
> > > > > > > > > > > > nodes tables. And yes, it will grow over time.
> > > > > > > > > > > > What we need is some configurable purge mechanism for
> > > > > > > > > > > > DataIndex, so it eventually removes older completed process
> > > > > > > > > > > > instances.
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Feb 13, 2024 at 12:59 PM Francisco Javier Tirado Sarti <
> > > > > > > > > > > > [email protected]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Martin,
> > > > > > > > > > > > > Good catch! It looks like the skipping performed for process
> > > > > > > > > > > > > instances is not applied to node instances. Something we
> > > > > > > > > > > > > definitely need to review on the runtimes side.
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Feb 12, 2024 at 11:59 PM Martin Weiler <[email protected]>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > >> On a somewhat related note, testing a simple workflow (start ->
> > > > > > > > > > > > > >> script node -> end), I see the following messages in the logs:
> > > > > > > > > > > > > >> 2024-02-12 22:49:50,493 28758dde544c WARN
> > > > > > > > > > > > > >> [org.kie.kogito.persistence.jdbc.JDBCProcessInstances:-1]
> > > > > > > > > > > > > >> (executor-thread-3) Skipping create of process instance id:
> > > > > > > > > > > > > >> 7083088e-b899-47cb-b85c-5d9ccb0aa166, state: 2
> > > > > > > > > > > > >>
> > > > > > > > > > > > > >> So far, so good. And I'd expect to see no trace of this process
> > > > > > > > > > > > > >> in the database if I don't have data audit enabled.
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> However, the 'processes' table contains a row with state=2, with
> > > > > > > > > > > > > >> related entries in the 'nodes' table. In a load test, I see these
> > > > > > > > > > > > > >> tables grow significantly over time. Am I missing something to
> > > > > > > > > > > > > >> have these entries cleaned up automatically?
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> ________________________________________
> > > > > > > > > > > > >> From: Martin Weiler <[email protected]>
> > > > > > > > > > > > >> Sent: Monday, February 12, 2024 3:40 PM
> > > > > > > > > > > > >> To: [email protected]
> > > > > > > > > > > > > >> Subject: [EXTERNAL] RE: [DISCUSSION] Performance issues with
> > > > > > > > > > > > > >> data-index persistence addon
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> Thanks everyone for your input. Based on this discussion, I
> > > > > > > > > > > > > >> opened the following PR:
> > > > > > > > > > > > > >> https://github.com/apache/incubator-kie-kogito-apps/pull/1985
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> With this change, the performance seems to be stable over time:
> > > > > > > > > > > > > >> https://drive.google.com/file/d/1zkullvfrJpRp7TRjxDa41ok6kEIR7Fty/view?usp=sharing
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> Martin
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> ________________________________________
> > > > > > > > > > > > >> From: Gonzalo Muñoz <[email protected]>
> > > > > > > > > > > > >> Sent: Friday, February 9, 2024 9:42 AM
> > > > > > > > > > > > >> To: [email protected]
> > > > > > > > > > > > > >> Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with
> > > > > > > > > > > > > >> data-index persistence addon
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> Great work Francisco,
> > > > > > > > > > > > > >> Martin, take a look at this link with some related tips (in case
> > > > > > > > > > > > > >> you find it useful):
> > > > > > > > > > > > > >> https://www.cybertec-postgresql.com/en/index-your-foreign-key/
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> On Fri, Feb 9, 2024 at 17:20, Francisco Javier Tirado Sarti (<
> > > > > > > > > > > > > >> [email protected]>) wrote:
> > > > > > > > > > > > >>
> > > > > > > > > > > > > >> > For the time being, we will keep JPA until we exhaust all
> > > > > > > > > > > > > >> > possibilities; let's call switching from JPA to JDBC our hidden
> > > > > > > > > > > > > >> > plan B ;)
> > > > > > > > > > > > > >> > I already told Martin, but so that everyone knows: just after
> > > > > > > > > > > > > >> > writing the previous email, I thought "what if Postgres is not
> > > > > > > > > > > > > >> > automatically indexing foreign keys like MySQL?" and, eureka:
> > > > > > > > > > > > > >> > Postgres doc
> > > > > > > > > > > > > >> > https://www.postgresql.org/docs/current/ddl-constraints.html
> > > > > > > > > > > > > >> > MySQL doc
> > > > > > > > > > > > > >> > https://dev.mysql.com/doc/refman/8.0/en/constraint-foreign-key.html
> > > > > > > > > > > > > >> > These are the relevant excerpts:
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > *PostgreSQL*
> > > > > > > > > > > > > >> > *A foreign key must reference columns that either are a primary
> > > > > > > > > > > > > >> > key or form a unique constraint, or are columns from a
> > > > > > > > > > > > > >> > non-partial unique index. This means that the referenced columns
> > > > > > > > > > > > > >> > always have an index to allow efficient lookups on whether a
> > > > > > > > > > > > > >> > referencing row has a match. Since a DELETE of a row from the
> > > > > > > > > > > > > >> > referenced table or an UPDATE of a referenced column will
> > > > > > > > > > > > > >> > require a scan of the referencing table for rows matching the
> > > > > > > > > > > > > >> > old value, it is often a good idea to index the referencing
> > > > > > > > > > > > > >> > columns too. Because this is not always needed, and there are
> > > > > > > > > > > > > >> > many choices available on how to index, the declaration of a
> > > > > > > > > > > > > >> > foreign key constraint does not automatically create an index on
> > > > > > > > > > > > > >> > the referencing columns.*
> > > > > > > > > > > > > >> > *MySQL*
> > > > > > > > > > > > > >> > *MySQL requires that foreign key columns be indexed; if you
> > > > > > > > > > > > > >> > create a table with a foreign key constraint but no index on a
> > > > > > > > > > > > > >> > given column, an index is created.*
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > So I asked Martin to explicitly create an index on the
> > > > > > > > > > > > > >> > process_instance_id column of the nodes table.
> > > > > > > > > > > > > >> > I think that will fix the problem detected in the thread dump.
> > > > > > > > > > > > > >> > The simpler process test to verify that the queries are fine
> > > > > > > > > > > > > >> > still stands, though ;)
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > On Fri, Feb 9, 2024 at 5:10 PM Tibor Zimányi <[email protected]>
> > > > > > > > > > > > > >> > wrote:
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > > I always preferred pure JDBC over Hibernate myself, just for
> > > > > > > > > > > > > >> > > the sake of control over what is happening :) So I would not
> > > > > > > > > > > > > >> > > -1 that myself.
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > Tibor
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > On Fri, Feb 9, 2024, 17:00 Francisco Javier Tirado Sarti <
> > > > > > > > > > > > > >> > > [email protected]> wrote:
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > > Hi,
> > > > > > > > > > > > > >> > > > Usually I do not want to talk about work in progress because
> > > > > > > > > > > > > >> > > > preliminary conclusions are pretty volatile but, well, there
> > > > > > > > > > > > > >> > > > are a couple of things that can be concluded from the really
> > > > > > > > > > > > > >> > > > valuable information that Martin provided.
> > > > > > > > > > > > > >> > > > 1) In order to determine whether the number of statements is
> > > > > > > > > > > > > >> > > > larger than expected, I asked Martin to test with a simpler
> > > > > > > > > > > > > >> > > > process definition: one with just three nodes: start, script
> > > > > > > > > > > > > >> > > > and end. The script node should change just one variable.
> > > > > > > > > > > > > >> > > > This way we can analyze whether the number of queries is the
> > > > > > > > > > > > > >> > > > expected one. From the single log (audit was activated then),
> > > > > > > > > > > > > >> > > > my conclusion is that the number of inserts/updates on
> > > > > > > > > > > > > >> > > > processes and nodes (there are a lot on tasks, which I prefer
> > > > > > > > > > > > > >> > > > to skip for now; baby steps) is the expected one.
> > > > > > > > > > > > > >> > > > 2) Analysing the thread dump, we see around 15 threads
> > > > > > > > > > > > > >> > > > executing this line:
> > > > > > > > > > > > > >> > > > org.kie.kogito.index.jpa.storage.ProcessInstanceEntityStorage.indexNode(ProcessInstanceEntityStorage.java:125),
> > > > > > > > > > > > > >> > > > so it's pretty clear which code needs to be optimized ;). I'm
> > > > > > > > > > > > > >> > > > evaluating possibilities within JPA/Hibernate, but I'm
> > > > > > > > > > > > > >> > > > starting to think that it might be better to switch to JDBC
> > > > > > > > > > > > > >> > > > and skip Hibernate. Our lives will be simpler, especially
> > > > > > > > > > > > > >> > > > with a relatively simple schema like ours (that would be my
> > > > > > > > > > > > > >> > > > recommendation if I were an external consultant).
> > > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > On Fri, Feb 9, 2024 at 4:15 PM Tibor Zimányi <[email protected]>
> > > > > > > > > > > > > >> > > > wrote:
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > > >> > > > > Hi,
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > this will be a bit off-topic. However, as far as performance
> > > > > > > > > > > > > >> > > > > is concerned, I think we should consider the fact that we
> > > > > > > > > > > > > >> > > > > have string primary keys (IDs). I would expect database
> > > > > > > > > > > > > >> > > > > systems to be much better at indexing numeric keys than
> > > > > > > > > > > > > >> > > > > strings. I remember from the past, when I was working with
> > > > > > > > > > > > > >> > > > > DBs, that using strings as keys or indexes was a discouraged
> > > > > > > > > > > > > >> > > > > practice.
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > Best regards,
> > > > > > > > > > > > > >> > > > > Tibor
> > > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > On Thu, Feb 8, 2024, 22:45 Martin Weiler <[email protected]>
> > > > > > > > > > > > > >> > > > > wrote:
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > > >> > > > > > I changed the test to use MongoDB [1] and I don't see a
> > > > > > > > > > > > > >> > > > > > performance degradation with this setup [2].
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > Please keep us posted on your findings. Thanks!
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > Martin
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > [1] https://github.com/martinweiler/job-service-refactor-test/tree/mongodb
> > > > > > > > > > > > > >> > > > > > [2] https://drive.google.com/file/d/1NfacXaxJlgRMw4OQ5S20cvkzvaUKUVFj/view?usp=sharing
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > ________________________________________
> > > > > > > > > > > > >> > > > > > From: Francisco Javier Tirado Sarti <
> > > > > > > > > [email protected]>
> > > > > > > > > > > > >> > > > > > Sent: Wednesday, February 7, 2024 11:40 AM
> > > > > > > > > > > > >> > > > > > To: [email protected]
> > > > > > > > > > > > > >> > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues
> > > > > > > > > > > > > >> > > > > > with data-index persistence addon
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > Yes, it can be index degradation because of size, but I
> > > > > > > > > > > > > >> > > > > > believe (I might be wrong) the DB is too small (yet) for
> > > > > > > > > > > > > >> > > > > > that.
> > > > > > > > > > > > > >> > > > > > But eventually, when the DB is huge enough, Postgres will
> > > > > > > > > > > > > >> > > > > > unavoidably behave like the graph that Martin sent.
> > > > > > > > > > > > > >> > > > > > Since I believe we are not huge enough (yet), let's rule
> > > > > > > > > > > > > >> > > > > > out another issue by analysing the SQL logs (I requested
> > > > > > > > > > > > > >> > > > > > those from Martin offline and he is kindly going to
> > > > > > > > > > > > > >> > > > > > collect them).
> > > > > > > > > > > > > >> > > > > > Also, I'm curious to know whether Mongo behaves in the
> > > > > > > > > > > > > >> > > > > > same way.
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > On Wed, Feb 7, 2024 at 7:25 PM Enrique Gonzalez Martinez <
> > > > > > > > > > > > > >> > > > > > [email protected]> wrote:
> > > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > > >> > > > > > > Hi Francisco,
> > > > > > > > > > > > > >> > > > > > > I would highly recommend checking the indexes and how
> > > > > > > > > > > > > >> > > > > > > the updates work in data index, to avoid full table
> > > > > > > > > > > > > >> > > > > > > scans and locking the full table. Some DBs are very
> > > > > > > > > > > > > >> > > > > > > sensitive to that.
> > > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > >> > > > > > > On Wed, Feb 7, 2024, 18:41, Francisco Javier Tirado Sarti <
> > > > > > > > > > > > > >> > > > > > > [email protected]> wrote:
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > > >> > > > > > > > Hi Martin,
> > > > > > > > > > > > > >> > > > > > > > While I analyze the data, let me ask you if it is
> > > > > > > > > > > > > >> > > > > > > > possible to perform another check (similar, in a way,
> > > > > > > > > > > > > >> > > > > > > > to disabling data-index as you did). Can you switch to
> > > > > > > > > > > > > >> > > > > > > > MongoDB persistence and check whether the same
> > > > > > > > > > > > > >> > > > > > > > degradation that is there for Postgres remains?
> > > > > > > > > > > > > >> > > > > > > > I do not know if this is feasible, but it would
> > > > > > > > > > > > > >> > > > > > > > certainly indicate whether the problem is in the
> > > > > > > > > > > > > >> > > > > > > > Postgres storage layer; I do not have a clear
> > > > > > > > > > > > > >> > > > > > > > prediction of what we will see when doing this switch.
> > > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > On Wed, Feb 7, 2024 at 6:37 PM Martin Weiler <[email protected]>
> > > > > > > > > > > > > >> > > > > > > > wrote:
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > Hi Francisco,
> > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > thanks for your work on this important topic!
> > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > I would like to share some test results here, which might help to
> > > > > > > > > > > > > >> > > > > > > > > improve the codebase even further. I am using the jmeter based test
> > > > > > > > > > > > > >> > > > > > > > > case from Pere and Enrique (thanks guys!) [1] which uses a load of
> > > > > > > > > > > > > >> > > > > > > > > 30 threads to
> > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > 1) start a new process instance (POST)
> > > > > > > > > > > > > >> > > > > > > > > 2) retrieve tasks for a user (GET)
> > > > > > > > > > > > > >> > > > > > > > > 3) fetch task details (GET)
> > > > > > > > > > > > > >> > > > > > > > > 4) complete a task (POST)
> > > > > > > > > > > > > >> > > > > > > > > 5) execute a query on data-audit
> > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > With this test setup, I noticed that the performance for the POST
> > > > > > > > > > > > > >> > > > > > > > > requests, in particular the one to start a new process instance,
> > > > > > > > > > > > > >> > > > > > > > > degrades over time - see graph [2]. If I run the same test without
> > > > > > > > > > > > > >> > > > > > > > > data-index, then there is no such performance degradation [3]. You
> > > > > > > > > > > > > >> > > > > > > > > can find a thread dump captured a few minutes into the first test
> > > > > > > > > > > > > >> > > > > > > > > here [4] that might help to see some of the contention points.
> > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > I'd appreciate it if you could take a look and see if there is
> > > > > > > > > > > > > >> > > > > > > > > something that can be further improved based on your previous work.
> > > > > > > > > > > > > >> > > > > > > > > If you need any additional data, let me know, but otherwise it is
> > > > > > > > > > > > > >> > > > > > > > > straightforward to run the jmeter test as well.
> > > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > Thanks,
> > > > > > > > > > > > > >> > > > > > > > > Martin
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > [1]
> > > > > > > > > > > > >>
> > > https://github.com/pefernan/job-service-refactor-test/
> > > > > > > > > > > > >> > > > > > > > > [2]
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >>
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://drive.google.com/file/d/1Gqn-ixE05kXv2jdssAUlnMuUVcHxIYZ0/view?usp=sharing
> > > > > > > > > > > > >> > > > > > > > > [3]
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >>
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://drive.google.com/file/d/10gVNyb4JYg_bA18bNhY9dEDbPn3TOxL7/view?usp=sharing
> > > > > > > > > > > > >> > > > > > > > > [4]
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > >
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >>
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://drive.google.com/file/d/1jVrtsO49gCvUlnaC9AUAtkVKTm4PbdUv/view?usp=sharing
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > ________________________________________
> > > > > > > > > > > > > >> > > > > > > > > From: Francisco Javier Tirado Sarti <[email protected]>
> > > > > > > > > > > > > >> > > > > > > > > Sent: Wednesday, January 17, 2024 9:13 AM
> > > > > > > > > > > > > >> > > > > > > > > To: [email protected]
> > > > > > > > > > > > > >> > > > > > > > > Cc: Pere Fernandez Perez
> > > > > > > > > > > > > >> > > > > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with
> > > > > > > > > > > > > >> > > > > > > > > data-index persistence addon
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > Hi Alex,
> > > > > > > > > > > > > >> > > > > > > > > I did not measure times (which depend on a number of variables that
> > > > > > > > > > > > > >> > > > > > > > > change drastically between environments), but I verified that the
> > > > > > > > > > > > > >> > > > > > > > > number of updates has been reduced drastically without losing
> > > > > > > > > > > > > >> > > > > > > > > functionality, which is objectively a good thing. Before the change,
> > > > > > > > > > > > > >> > > > > > > > > every node executed triggered an update for every node previously
> > > > > > > > > > > > > >> > > > > > > > > executed, so for a process with 50 nodes to execute we were
> > > > > > > > > > > > > >> > > > > > > > > performing nearly 50*51/2 = 1275 updates. Now we have just one
> > > > > > > > > > > > > >> > > > > > > > > update for every node being executed, for a total of 50 updates.
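
The update counts quoted above can be checked with a short calculation. This is a minimal sketch, not code from the PR: the class and method names are hypothetical, and it assumes that before the change the k-th executed node caused k updates (one per node executed so far), which is the reading that gives 50*51/2.

```java
public class UpdateCount {

    // Assumed behaviour before the change: executing the k-th node rewrites
    // all k nodes executed so far, so the total is 1 + 2 + ... + n.
    static long updatesBefore(int nodes) {
        long total = 0;
        for (int k = 1; k <= nodes; k++) {
            total += k;
        }
        return total;
    }

    // Behaviour after the change as described in the message:
    // one update per executed node.
    static long updatesAfter(int nodes) {
        return nodes;
    }

    public static void main(String[] args) {
        int nodes = 50;
        System.out.println("before: " + updatesBefore(nodes)); // 1275
        System.out.println("after:  " + updatesAfter(nodes));  // 50
    }
}
```

The loop total equals n*(n+1)/2, so the cost before the change grows quadratically with the number of nodes, while the cost after the change grows linearly.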
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > On Wed, Jan 17, 2024 at 3:18 PM Alex
> > > > > > Porcelli
> > > > > > > <
> > > > > > > > > > > > >> > > [email protected]>
> > > > > > > > > > > > >> > > > > > > wrote:
> > > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > Francisco,
> > > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > > I noticed that your PR has been merged, but I was expecting (at
> > > > > > > > > > > > > >> > > > > > > > > > least that was my understanding from this thread) that some
> > > > > > > > > > > > > >> > > > > > > > > > benchmark data would be shared before merging, to assess the
> > > > > > > > > > > > > >> > > > > > > > > > cost/benefit of such a decent-size change.
> > > > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > > Do you have any information to share?
> > > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > On Sat, Dec 23, 2023 at 4:02 AM
> > > > > Francisco
> > > > > > > > Javier
> > > > > > > > > > > > Tirado
> > > > > > > > > > > > >> > Sarti
> > > > > > > > > > > > >> > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > > > Yes, as intended, now we have one select and one insert/update
> > > > > > > > > > > > > >> > > > > > > > > > > per node event.
> > > > > > > > > > > > > >> > > > > > > > > > > I marked the PR as ready for review and gave @Pere Fernandez
> > > > > > > > > > > > > >> > > > > > > > > > > Perez <[email protected]> permission to the branch so he can
> > > > > > > > > > > > > >> > > > > > > > > > > edit it in the next two weeks (I'll be on PTO) if desired,
> > > > > > > > > > > > > >> > > > > > > > > > > before merging.
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > On Thu, Dec 21, 2023 at 5:58 PM
> > > Alex
> > > > > > > > Porcelli
> > > > > > > > > <
> > > > > > > > > > > > >> > > > > [email protected]>
> > > > > > > > > > > > >> > > > > > > > > wrote:
> > > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > Cool, thank you Francisco!
> > > > > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > Did you manage to get some
> > > > > preliminary
> > > > > > > > data
> > > > > > > > > > > about
> > > > > > > > > > > > >> > > > > improvements?
> > > > > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > On Thu, Dec 21, 2023 at
> > 11:52 AM
> > > > > > > Francisco
> > > > > > > > > > > Javier
> > > > > > > > > > > > >> > Tirado
> > > > > > > > > > > > >> > > > > Sarti
> > > > > > > > > > > > >> > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > > > > > Yes, after some delay because of the Quarkus 3 migration. I'm
> > > > > > > > > > > > > >> > > > > > > > > > > > > refining this draft PR:
> > > > > > > > > > > > > >> > > > > > > > > > > > > https://github.com/apache/incubator-kie-kogito-apps/pull/1941
> > > > > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > On Thu, Dec 21, 2023 at
> > > 5:48 PM
> > > > > Alex
> > > > > > > > > > Porcelli
> > > > > > > > > > > <
> > > > > > > > > > > > >> > > > > > > [email protected]>
> > > > > > > > > > > > >> > > > > > > > > > wrote:
> > > > > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > Any update or new findings
> > > on
> > > > > this
> > > > > > > > > topic?
> > > > > > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > On Tue, Nov 28, 2023 at
> > > > 8:38 AM
> > > > > > > > > Francisco
> > > > > > > > > > > > Javier
> > > > > > > > > > > > >> > > Tirado
> > > > > > > > > > > > >> > > > > > Sarti
> > > > > > > > > > > > >> > > > > > > > > > > > > > <[email protected]>
> > > wrote:
> > > > > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > Hi Alex,
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > After considering different options to improve performance, we
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > feel that it is time to "partially" move away from the current
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > Map style interface (
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > https://github.com/apache/incubator-kie-kogito-apps/blob/main/persistence-commons/persistence-commons-api/src/main/java/org/kie/kogito/persistence/api/Storage.java
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > ), which was shared with Trusty, to one more suitable for usage
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > with a relational DB like PostgreSQL (but still compatible with
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > big table DBs).
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > The idea is to replace the generic Storage interface with four
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > specific interfaces (which will inherit from a common one that
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > keeps the query part as it is, with get and query methods).
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > These will include the required modification operations for the
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > four DataIndex "domains": processinstance, usertask,
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > processdefinitions and jobs. Those interfaces will define
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > methods like addNode, addVariable, updateTask, addAttachment...
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > that will allow the persistence layer implementation to update
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > just the needed info in the DB (for example, for addNode in
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > Postgres, just insert a row into the nodes table; for addNode
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > in Mongo, basically the same atomic upsert operation that is
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > currently done). Therefore, we increase performance for
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > Postgres and keep the current one for Mongo. The current DB
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > schemas won't be touched.
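
The interface split described above could look roughly like the following. This is only an illustrative sketch under stated assumptions, not the code from the PR: the type names (ReadStore, ProcessInstanceStore, InMemoryProcessInstanceStore, ProcessInstance) and the addNode signature are hypothetical. Only the method name addNode and the idea of a common read-only parent with get and query come from the message. The in-memory class merely stands in for a real Postgres or Mongo implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class DomainStorageSketch {

    // Common read-only part shared by all domains (hypothetical name).
    interface ReadStore<K, V> {
        V get(K key);
        List<V> query(Predicate<V> filter);
    }

    // Simplified stand-in for a process instance: an id plus executed node names.
    static class ProcessInstance {
        final String id;
        final List<String> nodes = new ArrayList<>();
        ProcessInstance(String id) { this.id = id; }
    }

    // Domain-specific interface: adds targeted write operations (hypothetical signature).
    interface ProcessInstanceStore extends ReadStore<String, ProcessInstance> {
        void addNode(String processInstanceId, String nodeName);
    }

    // In-memory stand-in; a relational implementation would insert one row per call.
    static class InMemoryProcessInstanceStore implements ProcessInstanceStore {
        private final Map<String, ProcessInstance> data = new HashMap<>();
        int writes = 0; // counts write operations, one per addNode call

        @Override
        public ProcessInstance get(String key) {
            return data.get(key);
        }

        @Override
        public List<ProcessInstance> query(Predicate<ProcessInstance> filter) {
            return data.values().stream().filter(filter).collect(Collectors.toList());
        }

        @Override
        public void addNode(String processInstanceId, String nodeName) {
            data.computeIfAbsent(processInstanceId, ProcessInstance::new).nodes.add(nodeName);
            writes++;
        }
    }

    public static void main(String[] args) {
        InMemoryProcessInstanceStore store = new InMemoryProcessInstanceStore();
        store.addNode("pi-1", "start");
        store.addNode("pi-1", "end");
        System.out.println(store.get("pi-1").nodes + " writes=" + store.writes);
    }
}
```

The point of the design is that each addNode call touches only the new node, so the number of writes equals the number of nodes executed rather than growing with the size of the instance.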
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > Since the code change is large, I do not think I'll be able to
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > have the PR ready till next week.
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > But before starting, please let me know if that approach is
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > fine for you.
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > Best regards.
> > > > > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at
> > > > > 6:55 PM
> > > > > > > Alex
> > > > > > > > > > > > Porcelli
> > > > > > > > > > > > >> <
> > > > > > > > > > > > >> > > > > > > > > [email protected]>
> > > > > > > > > > > > >> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > Thank you Francisco to
> > > > > getting
> > > > > > > > > deeper
> > > > > > > > > > on
> > > > > > > > > > > > >> this…
> > > > > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > Looking forward to see
> > > the
> > > > > > > results
> > > > > > > > > of
> > > > > > > > > > > your
> > > > > > > > > > > > >> > > > suggested
> > > > > > > > > > > > >> > > > > > > > > > improvements.
> > > > > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > On Fri, Nov 24, 2023
> > at
> > > > > > 9:40 AM
> > > > > > > > > > > Francisco
> > > > > > > > > > > > >> > Javier
> > > > > > > > > > > > >> > > > > Tirado
> > > > > > > > > > > > >> > > > > > > > > Sarti <
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > [email protected]>
> > > > wrote:
> > > > > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > > I forgot to attach
> > the
> > > > > > queries
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > > On Fri, Nov 24, 2023
> > > at
> > > > > > > 3:04 PM
> > > > > > > > > > > > Francisco
> > > > > > > > > > > > >> > > Javier
> > > > > > > > > > > > >> > > > > > Tirado
> > > > > > > > > > > > >> > > > > > > > > > Sarti <
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > > [email protected]
> > >
> > > > > wrote:
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> Hi,
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> A brief update on this topic.
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> After doing a simple test with the example
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> https://github.com/apache/incubator-kie-kogito-examples/tree/stable/serverless-workflow-examples/serverless-workflow-data-index-quarkus
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> , the number of updates over the Nodes table is n*n, so we
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> manage to obtain a perfect quadratic performance
> > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> degradation. The problem is worse in the case

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
