Re: [DISCUSSION] Performance issues with data-index persistence addon

Francisco Javier Tirado Sarti Wed, 21 Feb 2024 07:51:02 -0800

Hi Alex,
I'm wondering how cron and REST can live together.
With cron, the responsibility for the cleanup lies with a process
external to the microservice that accesses the DB to be purged, whereas
with a REST API the process performing the cleanup is typically (but not
necessarily) the same microservice that accesses the DB. In that sense,
the REST API is similar to using properties (in both approaches the
microservice is responsible for cleaning up the data), but they differ
in that REST is more sophisticated (yet compatible: implementation-wise we
can use a ConfigSource that is updated by the REST call and read, as a
property, by the cleanup routine). So I am probably missing something, but
I do not see how the cron approach can be reconciled with the REST approach.
Anyway, that is just a detail. I agree the purge approach should be similar
for all microservices.
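
A minimal sketch of the "value updated by REST, read by the cleanup
routine" idea, in plain Java and under stated assumptions. RetentionPolicy,
Instance, update and expired are hypothetical names used only for
illustration; they are not Kogito or MicroProfile Config APIs, and a real
implementation would sit behind a ConfigSource and a database query rather
than an in-memory list.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: none of these names exist in Kogito or MicroProfile.
final class RetentionPolicy {

    // Minimal stand-in for a data-index row; completedAt is null while active.
    static final class Instance {
        final String id;
        final Instant completedAt;

        Instance(String id, Instant completedAt) {
            this.id = id;
            this.completedAt = completedAt;
        }
    }

    // Thread-safe holder playing the role of the mutable configuration value.
    private final AtomicReference<Duration> retention;

    RetentionPolicy(Duration initial) {
        this.retention = new AtomicReference<>(validate(initial));
    }

    // What a REST admin endpoint would call to change the policy at runtime.
    void update(Duration newRetention) {
        retention.set(validate(newRetention));
    }

    // What the cleanup routine would call on each run: ids that may be purged.
    List<String> expired(List<Instance> instances, Instant now) {
        Instant cutoff = now.minus(retention.get());
        List<String> ids = new ArrayList<>();
        for (Instance instance : instances) {
            // Active instances are never purged, however old they are.
            if (instance.completedAt != null && instance.completedAt.isBefore(cutoff)) {
                ids.add(instance.id);
            }
        }
        return ids;
    }

    private static Duration validate(Duration value) {
        Objects.requireNonNull(value, "retention");
        if (value.isNegative()) {
            throw new IllegalArgumentException("retention must not be negative");
        }
        return value;
    }
}
```

Because the routine reads the current value on every run, a change made
through the endpoint takes effect on the next cleanup pass without a
restart. An external cron job could only trigger the same purge through
that endpoint, which is the overlap being questioned here.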


On Wed, Feb 21, 2024 at 4:37 PM Alex Porcelli <[email protected]> wrote:

> +1 for providing consistent behavior across the board
>
> my 2c:
>
> +1 for REST interface (preferred for external general management
> perspective)
> -1 for properties to auto clean; this can be achieved with cron and rest
> call
>
> Now about GraphQL: the platform today is heavily invested in that
> interface, so it might make sense. BUT I’d consider it at a later stage.
>
>
> On Wed, Feb 21, 2024 at 6:52 AM Francisco Javier Tirado Sarti <
> [email protected]> wrote:
>
> > Hi Enrique,
> > If we configure such policies through properties, won't it be enough to
> > define a naming convention and rely on the target platform configuration
> > capabilities (SpringBoot or Quarkus)?
> >
> > On Wed, Feb 21, 2024 at 12:36 PM Enrique Gonzalez Martinez <
> > [email protected]> wrote:
> >
> > > Hi Francisco
> > >
> > > The discussion we need to have, before how to achieve certain features,
> > > is about the overall user experience. If you want to clean it up that
> > > way, it is fine by me. I have nothing against setting a policy for
> > > cleaning up processes that completed some time ago.
> > >
> > > As we discussed previously, the data index is a snapshot of the last
> > > state of the process instance, including completed ones, but nothing was
> > > said about when to clean that up once completed, so any policy is
> > > welcome. How to achieve that policy is something completely different.
> > >
> > > The main problem is how every microservice exposes that API to the end
> > > user or consumer (another system) and which system is going to be the
> > > façade for complex deployments and operations. That is the discussion I
> > > was mentioning before.
> > >
> > > What I want to avoid is setting different policies among components; we
> > > should strive as much as possible to offer certain capabilities in the
> > > same fashion, e.g. the clean up mechanism.
> > > In the same way, I want to avoid making the current systems more complex
> > > than they are. So far they are aligned, each offering one simple
> > > responsibility, but I would be against mixing, for instance, job service
> > > with data index.
> > >
> > > On Wed, Feb 21, 2024 at 12:13, Francisco Javier Tirado Sarti (<
> > > [email protected]>) wrote:
> > >
> > > > Hi Enrique,
> > > > In the case of data index, I think the data to be purged is finished
> > > > process instances (I do not think we should remove process instances
> > > > that have been alive for ages, even if it is very likely they will
> > > > never be completed).
> > > > Once you delete those process instances, you also delete the
> > > > associated user tasks and jobs.
> > > > Therefore the problem is relatively simple: being able to configure
> > > > how long a completed process instance should remain in the data index
> > > > database. We can take a simple approach: a property with a minimum
> > > > duration that cannot be changed once data index is started; a slightly
> > > > more complex one: the same property, but watched so we can react to
> > > > changes; or the full suite: an admin API to change the policy at any
> > > > moment.
> > > > I think this discussion is worth having.
> > > >
> > > > On Wed, Feb 21, 2024 at 6:15 AM Enrique Gonzalez Martinez <
> > > > [email protected]> wrote:
> > > >
> > > > > Hi Martin, the main problem regarding the purge is that the policy
> > > > > on what tech to use, and for the future components, is still unclear.
> > > > >
> > > > > Recently we had a discussion about proposing GraphQL for this sort of
> > > > > admin task. So far, for subsystems, we have been using REST endpoints
> > > > > (like update timers, modify human task or change processes). There is
> > > > > one exception, the gateway, which is pure GraphQL and uses GraphQL for
> > > > > everything, performing complex operations under the hood.
> > > > >
> > > > > This has somewhat frozen the purge for data audit for a bit. The
> > > > > proposal was to use REST endpoints to do the clean up in the component
> > > > > and offer the GraphQL counterpart in the gateway, promoting it to a
> > > > > first-class component instead of having it embedded in the data index.
> > > > >
> > > > > I would suggest coming up with a policy first, at least regarding the
> > > > > convention by which every component should address this.
> > > > >
> > > > >
> > > > > On Tue, Feb 20, 2024, 23:39, Martin Weiler <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > IMO, it is good to have this discussion around data sanity now
> > > > > > instead of putting it off until later when data has already
> > > > > > accumulated in production environments.
> > > > > >
> > > > > > Based on the input here, we are dealing with three types of data:
> > > > > > 1. Runtime data - active instances only, engine cleans up the data
> > > > > > automatically at process instance end
> > > > > > 2. Historic log data - data created by data-audit intended for long
> > > > > > term storage
> > > > > > 3. Data-index data - somehow this data falls in between the two
> > > > > > aforementioned categories, with the idea of the data being "recent",
> > > > > > but not restricted to active instances only
> > > > > >
> > > > > > We'd need purge strategies for both #2 and #3 (perhaps different
> > > > > > ones, or with different config settings) in order to prevent
> > > > > > unlimited data growth.
> > > > > >
> > > > > > ________________________________________
> > > > > > From: Enrique Gonzalez Martinez <[email protected]>
> > > > > > Sent: Monday, February 19, 2024 7:11 AM
> > > > > > To: [email protected]
> > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with
> > > data-index
> > > > > > persistence addon
> > > > > >
> > > > > > Hi Francisco,
> > > > > > To give you more context about this.
> > > > > >
> > > > > > STP is a concept: a process with certain constraints, namely no
> > > > > > persistence and returning the outcome in the call (sync execution
> > > > > > with no idle states). It was a requirement from a user in the past.
> > > > > > One of the requirements was leaving no trail. In v7 it was easy
> > > > > > because you could disable the audit in that case. Actually, we can
> > > > > > do here what we did in v7, as you can add or remove the index just
> > > > > > by removing deps.
> > > > > >
> > > > > > We have the same outcome with different approaches, and STP is
> > > > > > already delivered.
> > > > > >
> > > > > > On Mon, Feb 19, 2024 at 14:46, Francisco Javier Tirado Sarti (<
> > > > > > [email protected]>) wrote:
> > > > > >
> > > > > > > Regarding STP (which is not a concept that we have in the code; I
> > > > > > > mean, STP are processes just as non-STP are), I guess that, like
> > > > > > > all processes, they were kept in DataIndex once completed because
> > > > > > > users wanted (and still want) to check the result once the call
> > > > > > > had been performed. If we want to leave no trace of them in
> > > > > > > DataIndex for some reason, we will need to make it a Runtimes
> > > > > > > concept so DataIndex can handle them in a different way.
> > > > > > >
> > > > > > > On Mon, Feb 19, 2024 at 2:27 PM Enrique Gonzalez Martinez <
> > > > > > > [email protected]> wrote:
> > > > > > >
> > > > > > > > Alex:
> > > > > > > > Right now the data index is working in the same way as it did in
> > > > > > > > v7 with the emitters. The only difference between the two
> > > > > > > > implementations is that here the storage is PostgreSQL instead
> > > > > > > > of Elasticsearch. You are right that it is a snapshot of the
> > > > > > > > last state of the process, but we never defined how long that
> > > > > > > > data would be kept alive. Honestly, I am happy right now with
> > > > > > > > the way it works. The clean up mechanism is still TBD because we
> > > > > > > > still need to discuss other stuff first.
> > > > > > > >
> > > > > > > >
> > > > > > > > Regarding STP, the point is to leave no trail because you can
> > > > > > > > get the outcome directly from the call. It was defined like that
> > > > > > > > in v7. So there is no use for the index or the audit.
> > > > > > > >
> > > > > > > > On Mon, Feb 19, 2024, 14:13, Francisco Javier Tirado Sarti <
> > > > > > > > [email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi Alex,
> > > > > > > > > There has been some confusion about the purpose of DataIndex.
> > > > > > > > > To be honest, I believed it had already been sorted out, but
> > > > > > > > > your e-mail makes me think that is not the case ;). I will let
> > > > > > > > > Kris clarify that with you. My view is that data-index is a
> > > > > > > > > way to query recently closed and active processes (the key
> > > > > > > > > here is the definition of "recently", which in my opinion
> > > > > > > > > should be configurable).
> > > > > > > > > But, besides that discussion and being pragmatic, keeping
> > > > > > > > > finished process instances "for a while" in DataIndex was the
> > > > > > > > > only way for users to query the result of straight through
> > > > > > > > > processes. That is a function that cannot be removed right now.
> > > > > > > > >
> > > > > > > > > On Mon, Feb 19, 2024 at 1:33 PM Alex Porcelli <
> > > > [email protected]
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > If data index was supposed to provide a snapshot view of the
> > > > > > > > > > process instance… why do we keep it after the process
> > > > > > > > > > instance is finished?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Mon, Feb 19, 2024 at 7:12 AM Francisco Javier Tirado
> > > Sarti <
> > > > > > > > > > [email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Martin.
> > > > > > > > > > > After taking a deeper look at this, I realize that the
> > > > > > > > > > > behaviour is the expected one.
> > > > > > > > > > > Runtimes DB does not track the completed process instance
> > > > > > > > > > > (that's what the JDBCProcessInstances warning is telling
> > > > > > > > > > > us), but DataIndex, as expected, is tracking it in the
> > > > > > > > > > > processes and nodes tables. And yes, it will grow over time.
> > > > > > > > > > > What we need is some configurable purge mechanism for
> > > > > > > > > > > DataIndex, so it eventually removes older completed process
> > > > > > > > > > > instances.
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Feb 13, 2024 at 12:59 PM Francisco Javier
> Tirado
> > > > Sarti
> > > > > <
> > > > > > > > > > > [email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Martin,
> > > > > > > > > > > > Good catch! It looks like the skipping performed for
> > > > > > > > > > > > process instances is not applied to node instances. That
> > > > > > > > > > > > is something we definitely need to review on the runtimes
> > > > > > > > > > > > side.
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Feb 12, 2024 at 11:59 PM Martin Weiler
> > > > > > > > > <[email protected]
> > > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > >> On a somewhat related note, testing a simple workflow
> > > > > > > > > > > > >> (start -> script node -> end), I see the following
> > > > > > > > > > > > >> messages in the logs:
> > > > > > > > > > > > >> 2024-02-12 22:49:50,493 28758dde544c WARN
> > > > > > > > > > > > >> [org.kie.kogito.persistence.jdbc.JDBCProcessInstances:-1]
> > > > > > > > > > > > >> (executor-thread-3) Skipping create of process instance id:
> > > > > > > > > > > > >> 7083088e-b899-47cb-b85c-5d9ccb0aa166, state: 2
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> So far, so good. And I'd expect to see no trace of this
> > > > > > > > > > > > >> process in the database if I don't have data audit enabled.
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> However, the 'processes' table contains a row with state=2,
> > > > > > > > > > > > >> with related entries in the 'nodes' table. In a load test,
> > > > > > > > > > > > >> I see these tables grow significantly over time. Am I
> > > > > > > > > > > > >> missing something to have these entries cleaned up
> > > > > > > > > > > > >> automatically?
> > > > > > > > > > > >>
> > > > > > > > > > > >> ________________________________________
> > > > > > > > > > > >> From: Martin Weiler <[email protected]>
> > > > > > > > > > > >> Sent: Monday, February 12, 2024 3:40 PM
> > > > > > > > > > > >> To: [email protected]
> > > > > > > > > > > >> Subject: [EXTERNAL] RE: [DISCUSSION] Performance
> > issues
> > > > with
> > > > > > > > > > data-index
> > > > > > > > > > > >> persistence addon
> > > > > > > > > > > >>
> > > > > > > > > > > > >> Thanks everyone for your input. Based on this discussion,
> > > > > > > > > > > > >> I opened the following PR:
> > > > > > > > > > > > >> https://github.com/apache/incubator-kie-kogito-apps/pull/1985
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> With this change, the performance seems to be stable over
> > > > > > > > > > > > >> time:
> > > > > > > > > > > > >> https://drive.google.com/file/d/1zkullvfrJpRp7TRjxDa41ok6kEIR7Fty/view?usp=sharing
> > > > > > > > > > > >>
> > > > > > > > > > > >> Martin
> > > > > > > > > > > >>
> > > > > > > > > > > >> ________________________________________
> > > > > > > > > > > >> From: Gonzalo Muñoz <[email protected]>
> > > > > > > > > > > >> Sent: Friday, February 9, 2024 9:42 AM
> > > > > > > > > > > >> To: [email protected]
> > > > > > > > > > > >> Subject: [EXTERNAL] Re: [DISCUSSION] Performance
> > issues
> > > > with
> > > > > > > > > > data-index
> > > > > > > > > > > >> persistence addon
> > > > > > > > > > > >>
> > > > > > > > > > > > >> Great work Francisco,
> > > > > > > > > > > > >> Martin, take a look at this link with some related tips (in
> > > > > > > > > > > > >> case you find it useful):
> > > > > > > > > > > > >> https://www.cybertec-postgresql.com/en/index-your-foreign-key/
> > > > > > > > > > > >>
> > > > > > > > > > > > >> On Fri, Feb 9, 2024 at 17:20, Francisco Javier Tirado Sarti
> > > > > > > > > > > > >> (<[email protected]>) wrote:
> > > > > > > > > > > >>
> > > > > > > > > > > > >> > For the time being, we will keep JPA until we exhaust all
> > > > > > > > > > > > >> > possibilities; let's call switching from JPA to JDBC our
> > > > > > > > > > > > >> > hidden plan B ;)
> > > > > > > > > > > > >> > I already told Martin, but so that everyone knows: just
> > > > > > > > > > > > >> > after writing the previous email, I thought "what if
> > > > > > > > > > > > >> > Postgres is not automatically indexing foreign keys like
> > > > > > > > > > > > >> > MySQL?" and, eureka:
> > > > > > > > > > > > >> > Postgres doc
> > > > > > > > > > > > >> > https://www.postgresql.org/docs/current/ddl-constraints.html
> > > > > > > > > > > > >> > MySQL doc
> > > > > > > > > > > > >> > https://dev.mysql.com/doc/refman/8.0/en/constraint-foreign-key.html
> > > > > > > > > > > > >> > These are the relevant excerpts:
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > *Postgresql*
> > > > > > > > > > > > >> > *A foreign key must reference columns that either are a
> > > > > > > > > > > > >> > primary key or form a unique constraint, or are columns
> > > > > > > > > > > > >> > from a non-partial unique index. This means that the
> > > > > > > > > > > > >> > referenced columns always have an index to allow efficient
> > > > > > > > > > > > >> > lookups on whether a referencing row has a match. Since a
> > > > > > > > > > > > >> > DELETE of a row from the referenced table or an UPDATE of a
> > > > > > > > > > > > >> > referenced column will require a scan of the referencing
> > > > > > > > > > > > >> > table for rows matching the old value, it is often a good
> > > > > > > > > > > > >> > idea to index the referencing columns too. Because this is
> > > > > > > > > > > > >> > not always needed, and there are many choices available on
> > > > > > > > > > > > >> > how to index, the declaration of a foreign key constraint
> > > > > > > > > > > > >> > does not automatically create an index on the referencing
> > > > > > > > > > > > >> > columns.*
> > > > > > > > > > > > >> > *Mysql*
> > > > > > > > > > > > >> > *MySQL requires that foreign key columns be indexed; if you
> > > > > > > > > > > > >> > create a table with a foreign key constraint but no index
> > > > > > > > > > > > >> > on a given column, an index is created.*
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > So I asked Martin to explicitly create an index for the
> > > > > > > > > > > > >> > process_instance_id column on the nodes table.
> > > > > > > > > > > > >> > I think that will fix the problem detected in the thread
> > > > > > > > > > > > >> > dump.
> > > > > > > > > > > > >> > The simpler process test to verify queries are fine still
> > > > > > > > > > > > >> > stands, though ;)
> > > > > > > > > > > >> >
> > > > > > > > > > > >> >
> > > > > > > > > > > >> > On Fri, Feb 9, 2024 at 5:10 PM Tibor Zimányi <
> > > > > > > > [email protected]
> > > > > > > > > >
> > > > > > > > > > > >> wrote:
> > > > > > > > > > > >> >
> > > > > > > > > > > > >> > > I always preferred pure JDBC over Hibernate myself, just
> > > > > > > > > > > > >> > > for the sake of control of what is happening :) So I
> > > > > > > > > > > > >> > > would not -1 that myself.
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > Tibor
> > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > On Fri, Feb 9, 2024, 17:00 Francisco Javier Tirado Sarti
> > > > > > > > > > > > >> > > <[email protected]> wrote:
> > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > > Hi,
> > > > > > > > > > > > >> > > > Usually I do not want to talk about work in progress
> > > > > > > > > > > > >> > > > because preliminary conclusions are pretty volatile
> > > > > > > > > > > > >> > > > but, well, there are a couple of things that can be
> > > > > > > > > > > > >> > > > concluded from the really valuable information that
> > > > > > > > > > > > >> > > > Martin provided.
> > > > > > > > > > > > >> > > > 1) In order to determine whether the number of
> > > > > > > > > > > > >> > > > statements is larger than expected, I asked Martin to
> > > > > > > > > > > > >> > > > test with a simpler process definition: one with just
> > > > > > > > > > > > >> > > > three nodes (start, script and end), where the script
> > > > > > > > > > > > >> > > > node changes just one variable. This way we can analyze
> > > > > > > > > > > > >> > > > whether the number of queries is the expected one. From
> > > > > > > > > > > > >> > > > the single log (audit was activated then) my conclusion
> > > > > > > > > > > > >> > > > is that the number of inserts/updates over processes
> > > > > > > > > > > > >> > > > and nodes (there are a lot over task, which I would
> > > > > > > > > > > > >> > > > prefer to skip for now, baby steps) is the expected one.
> > > > > > > > > > > > >> > > > 2) Analysing the thread dump, we see around 15 threads
> > > > > > > > > > > > >> > > > executing this line at
> > > > > > > > > > > > >> > > > org.kie.kogito.index.jpa.storage.ProcessInstanceEntityStorage.indexNode(ProcessInstanceEntityStorage.java:125),
> > > > > > > > > > > > >> > > > so it's pretty clear which code needs to be optimized
> > > > > > > > > > > > >> > > > ;). I'm evaluating possibilities within JPA/Hibernate,
> > > > > > > > > > > > >> > > > but I'm starting to think that it might be better to
> > > > > > > > > > > > >> > > > switch to JDBC and skip Hibernate. Our lives will be
> > > > > > > > > > > > >> > > > simpler, especially with a relatively simple schema
> > > > > > > > > > > > >> > > > like ours (that would be my recommendation if I were an
> > > > > > > > > > > > >> > > > external consultant).
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > On Fri, Feb 9, 2024 at 4:15 PM Tibor Zimányi <
> > > > > > > > > > [email protected]
> > > > > > > > > > > >
> > > > > > > > > > > >> > > wrote:
> > > > > > > > > > > >> > > >
> > > > > > > > > > > >> > > > > Hi,
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > this will be a bit off-topic. However, as far as
> > > > > > > > > > > > >> > > > > performance goes, I think we should consider that we
> > > > > > > > > > > > >> > > > > have string primary keys (IDs). I would expect
> > > > > > > > > > > > >> > > > > database systems to be much better at indexing
> > > > > > > > > > > > >> > > > > numeric keys than strings. I remember from the past,
> > > > > > > > > > > > >> > > > > when I was working with DBs, that using strings as
> > > > > > > > > > > > >> > > > > keys or indexes was a discouraged practice.
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > >> > > > > Best regards,
> > > > > > > > > > > >> > > > > Tibor
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > On Thu, Feb 8, 2024, 22:45 Martin Weiler
> > > > > > > > > > > > >> > > > > <[email protected]> wrote:
> > > > > > > > > > > >> > > > >
> > > > > > > > > > > > >> > > > > > I changed the test to use MongoDB [1] and I don't
> > > > > > > > > > > > >> > > > > > see a performance degradation with this setup [2].
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > Please keep us posted of your findings. Thanks!
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > Martin
> > > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > [1] https://github.com/martinweiler/job-service-refactor-test/tree/mongodb
> > > > > > > > > > > > >> > > > > > [2] https://drive.google.com/file/d/1NfacXaxJlgRMw4OQ5S20cvkzvaUKUVFj/view?usp=sharing
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > ________________________________________
> > > > > > > > > > > >> > > > > > From: Francisco Javier Tirado Sarti <
> > > > > > > > [email protected]>
> > > > > > > > > > > >> > > > > > Sent: Wednesday, February 7, 2024 11:40 AM
> > > > > > > > > > > >> > > > > > To: [email protected]
> > > > > > > > > > > >> > > > > > Subject: [EXTERNAL] Re: [DISCUSSION]
> > > Performance
> > > > > > > issues
> > > > > > > > > with
> > > > > > > > > > > >> > > data-index
> > > > > > > > > > > >> > > > > > persistence addon
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > > >> > > > > > Yes, it can be index degradation because of size,
> > > > > > > > > > > > >> > > > > > but I believe (I might be wrong) the DB is too
> > > > > > > > > > > > >> > > > > > small (yet) for that.
> > > > > > > > > > > > >> > > > > > But eventually Postgres, when the DB is huge
> > > > > > > > > > > > >> > > > > > enough, will unavoidably behave like the graph
> > > > > > > > > > > > >> > > > > > that Martin sent.
> > > > > > > > > > > > >> > > > > > Since I believe we are not huge enough (yet), let's
> > > > > > > > > > > > >> > > > > > rule out another issue by analysing the SQL logs (I
> > > > > > > > > > > > >> > > > > > requested those from Martin offline and he is
> > > > > > > > > > > > >> > > > > > going to kindly collect them).
> > > > > > > > > > > > >> > > > > > Also, I'm curious to know if Mongo behaves in the
> > > > > > > > > > > > >> > > > > > same way.
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > On Wed, Feb 7, 2024 at 7:25 PM Enrique
> > > Gonzalez
> > > > > > > > Martinez <
> > > > > > > > > > > >> > > > > > [email protected]> wrote:
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > > Hi Francisco,
> > > > > > > > > > > > >> > > > > > > I would highly recommend checking the indexes
> > > > > > > > > > > > >> > > > > > > and how the updates work in data index, to avoid
> > > > > > > > > > > > >> > > > > > > full table scans and locking the full table.
> > > > > > > > > > > > >> > > > > > > Some DBs are very sensitive to that.
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > > >> > > > > > > On Wed, Feb 7, 2024, 18:41, Francisco Javier
> > > > > > > > > > > > >> > > > > > > Tirado Sarti <[email protected]> wrote:
> > > > > > > > > > > >> > > > > > >
> > > > > > > > > > > >> > > > > > > > Hi Martin,
> > > > > > > > > > > > >> > > > > > > > While I analyze the data, let me ask you if it
> > > > > > > > > > > > >> > > > > > > > is possible to perform another check (similar
> > > > > > > > > > > > >> > > > > > > > in a way to disabling data-index like you did).
> > > > > > > > > > > > >> > > > > > > > Can you switch to MongoDB persistence and
> > > > > > > > > > > > >> > > > > > > > check if the same degradation that is there
> > > > > > > > > > > > >> > > > > > > > for Postgres remains?
> > > > > > > > > > > > >> > > > > > > > I do not know if this is feasible, but it will
> > > > > > > > > > > > >> > > > > > > > certainly indicate whether the problem is in
> > > > > > > > > > > > >> > > > > > > > the Postgres storage layer, and I do not have
> > > > > > > > > > > > >> > > > > > > > a clear prediction of what we will see when
> > > > > > > > > > > > >> > > > > > > > doing this switch.
> > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > >> > > > > > > > On Wed, Feb 7, 2024 at 6:37 PM Martin
> > > Weiler
> > > > > > > > > > > >> > > > <[email protected]
> > > > > > > > > > > >> > > > > >
> > > > > > > > > > > >> > > > > > > > wrote:
> > > > > > > > > > > >> > > > > > > >
> > > > > > > > > > > >> > > > > > > > > Hi Francisco,
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > thanks for your work on this
> important
> > > > > topic!
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > I would like to share some test results here,
> > > > > > > > > > > > >> > > > > > > > > which might help to improve the codebase even
> > > > > > > > > > > > >> > > > > > > > > further. I am using the jmeter based test
> > > > > > > > > > > > >> > > > > > > > > case from Pere and Enrique (thanks guys!) [1]
> > > > > > > > > > > > >> > > > > > > > > which uses a load of 30 threads to
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > 1) start a new process instance
> (POST)
> > > > > > > > > > > >> > > > > > > > > 2) retrieve tasks for a user (GET)
> > > > > > > > > > > >> > > > > > > > > 3) fetches task details (GET)
> > > > > > > > > > > >> > > > > > > > > 4) complete a task (POST)
> > > > > > > > > > > >> > > > > > > > > 5) execute a query on data-audit
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > With this test setup, I noticed that the performance for the POST
> > > > > > > > > > > > >> > > > > > > > > requests, in particular the one to start a new process instance, degrades
> > > > > > > > > > > > >> > > > > > > > > over time - see graph [2]. If I run the same test without data-index, then
> > > > > > > > > > > > >> > > > > > > > > there is no such performance degradation [3]. You can find a thread dump
> > > > > > > > > > > > >> > > > > > > > > captured a few minutes into the first test here [4] that might help to see
> > > > > > > > > > > > >> > > > > > > > > some of the contention points.
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > I'd appreciate it if you could take a look and see if there is something
> > > > > > > > > > > > >> > > > > > > > > that can be further improved based on your previous work. If you need any
> > > > > > > > > > > > >> > > > > > > > > additional data, let me know, but otherwise it is straightforward to run
> > > > > > > > > > > > >> > > > > > > > > the JMeter test as well.
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > Thanks,
> > > > > > > > > > > >> > > > > > > > > Martin
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > [1] https://github.com/pefernan/job-service-refactor-test/
> > > > > > > > > > > > >> > > > > > > > > [2] https://drive.google.com/file/d/1Gqn-ixE05kXv2jdssAUlnMuUVcHxIYZ0/view?usp=sharing
> > > > > > > > > > > > >> > > > > > > > > [3] https://drive.google.com/file/d/10gVNyb4JYg_bA18bNhY9dEDbPn3TOxL7/view?usp=sharing
> > > > > > > > > > > > >> > > > > > > > > [4] https://drive.google.com/file/d/1jVrtsO49gCvUlnaC9AUAtkVKTm4PbdUv/view?usp=sharing
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > >
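The five-step scenario and the 30-thread load described in Martin's message can be approximated with a small harness. This is only a sketch: the actual test is the JMeter plan linked as [1], the endpoints are not given in the thread, and the names `run_scenario`, `run_load` and `mean_by_half` are hypothetical. Each step is passed in as a callable, so the placeholder calls below would have to be replaced with real HTTP requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence, Tuple

# A step is a label plus a zero-argument callable. In a real run each callable
# would issue one HTTP request; the URLs are not stated in the thread, so they
# are left to the caller (assumption).
Step = Tuple[str, Callable[[], None]]


def run_scenario(steps: Sequence[Step]) -> Dict[str, float]:
    """Run the steps in order and return the elapsed seconds per step."""
    timings: Dict[str, float] = {}
    for name, action in steps:
        started = time.perf_counter()
        action()
        timings[name] = time.perf_counter() - started
    return timings


def run_load(steps: Sequence[Step], threads: int = 30,
             iterations: int = 10) -> List[Dict[str, float]]:
    """Run threads * iterations scenarios on a pool of `threads` workers.

    Results come back in submission order, which roughly follows time order.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(run_scenario, steps)
                   for _ in range(threads * iterations)]
        return [future.result() for future in futures]


def mean_by_half(results: List[Dict[str, float]],
                 step: str) -> Tuple[float, float]:
    """Mean latency of one step in the first and second half of the run."""
    middle = len(results) // 2
    first = [r[step] for r in results[:middle]]
    second = [r[step] for r in results[middle:]]
    return sum(first) / len(first), sum(second) / len(second)


if __name__ == "__main__":
    def placeholder() -> None:
        pass  # stands in for one HTTP call

    scenario: List[Step] = [
        ("start process instance (POST)", placeholder),
        ("retrieve tasks (GET)", placeholder),
        ("fetch task details (GET)", placeholder),
        ("complete task (POST)", placeholder),
        ("query data-audit", placeholder),
    ]
    results = run_load(scenario, threads=30, iterations=5)
    print(mean_by_half(results, "start process instance (POST)"))
```

If the second value returned by `mean_by_half` is clearly larger than the first for the POST steps, that would match the degradation shown in graph [2]; a flat pair would match the run without data-index [3].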
> > > > > > > > > > > > >> > > > > > > > > ________________________________________
> > > > > > > > > > > > >> > > > > > > > > From: Francisco Javier Tirado Sarti <[email protected]>
> > > > > > > > > > > > >> > > > > > > > > Sent: Wednesday, January 17, 2024 9:13 AM
> > > > > > > > > > > > >> > > > > > > > > To: [email protected]
> > > > > > > > > > > > >> > > > > > > > > Cc: Pere Fernandez Perez
> > > > > > > > > > > > >> > > > > > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with data-index
> > > > > > > > > > > > >> > > > > > > > > persistence addon
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > Hi Alex,
> > > > > > > > > > > > >> > > > > > > > > I did not take timings (which depend on a number of variables that
> > > > > > > > > > > > >> > > > > > > > > change drastically between environments), but I verified that the number
> > > > > > > > > > > > >> > > > > > > > > of updates has been reduced drastically without losing functionality,
> > > > > > > > > > > > >> > > > > > > > > which is objectively a good thing. Before the change, for every node
> > > > > > > > > > > > >> > > > > > > > > executed, we had an update for every node previously executed, so if a
> > > > > > > > > > > > >> > > > > > > > > process has 50 nodes to execute, we were performing nearly 50*51/2
> > > > > > > > > > > > >> > > > > > > > > updates, which gives us a total of 1275 updates. Now we have just one
> > > > > > > > > > > > >> > > > > > > > > for every node being executed, implying a total of 50 updates.
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > >
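The arithmetic in the message above can be checked with a few lines. This sketch assumes, as the message describes, that before the change executing the k-th node caused k updates (one for each node recorded so far), and that after the change each node causes exactly one update. The function names are hypothetical.

```python
def updates_before(n_nodes: int) -> int:
    """Old behaviour as described: the k-th executed node triggers k updates."""
    return sum(range(1, n_nodes + 1))  # same value as n * (n + 1) // 2


def updates_after(n_nodes: int) -> int:
    """New behaviour as described: one update per executed node."""
    return n_nodes


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, updates_before(n), updates_after(n))
```

For 50 nodes this gives 1275 updates before and 50 after, the figures quoted in the message; the gap grows quadratically with the number of nodes.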
> > > > > > > > > > > > >> > > > > > > > > On Wed, Jan 17, 2024 at 3:18 PM Alex Porcelli <[email protected]> wrote:
> > > > > > > > > > > >> > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > > Francisco,
> > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > I noticed that your PR has been merged, but I was expecting (at least
> > > > > > > > > > > > >> > > > > > > > > > that was my understanding from this thread) that before merging some
> > > > > > > > > > > > >> > > > > > > > > > benchmark data would be shared in advance - to assess the cost/benefit
> > > > > > > > > > > > >> > > > > > > > > > of such a decent size change.
> > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > Do you have any information to share?
> > > > > > > > > > > >> > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > On Sat, Dec 23, 2023 at 4:02 AM Francisco Javier Tirado Sarti
> > > > > > > > > > > > >> > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > Yes, as intended, now we have one select and one insert/update per node
> > > > > > > > > > > > >> > > > > > > > > > > event.
> > > > > > > > > > > > >> > > > > > > > > > > I marked the PR as ready for review and gave @Pere Fernandez Perez
> > > > > > > > > > > > >> > > > > > > > > > > <[email protected]> permission on the branch so he can edit it in the
> > > > > > > > > > > > >> > > > > > > > > > > next two weeks (I'll be on PTO) if desired, before merging.
> > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > On Thu, Dec 21, 2023 at 5:58 PM Alex Porcelli <[email protected]> wrote:
> > > > > > > > > > > >> > > > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > > > > Cool, thank you Francisco!
> > > > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > Did you manage to get some preliminary data about improvements?
> > > > > > > > > > > >> > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > On Thu, Dec 21, 2023 at 11:52 AM Francisco Javier Tirado Sarti
> > > > > > > > > > > > >> > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > Yes, after some delay because of the Quarkus 3 migration. I'm refining
> > > > > > > > > > > > >> > > > > > > > > > > > > this draft PR
> > > > > > > > > > > > >> > > > > > > > > > > > > https://github.com/apache/incubator-kie-kogito-apps/pull/1941
> > > > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > On Thu, Dec 21, 2023 at 5:48 PM Alex Porcelli <[email protected]> wrote:
> > > > > > > > > > > >> > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > Any update or new findings on this topic?
> > > > > > > > > > > >> > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > On Tue, Nov 28, 2023 at 8:38 AM Francisco Javier Tirado Sarti
> > > > > > > > > > > > >> > > > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > > > > > > > Hi Alex,
> > > > > > > > > > > > >> > > > > > > > > > > > > > > After considering different options to improve performance, we feel that
> > > > > > > > > > > > >> > > > > > > > > > > > > > > it is time to "partially" move away from the current Map style interface
> > > > > > > > > > > > >> > > > > > > > > > > > > > > (https://github.com/apache/incubator-kie-kogito-apps/blob/main/persistence-commons/persistence-commons-api/src/main/java/org/kie/kogito/persistence/api/Storage.java),
> > > > > > > > > > > > >> > > > > > > > > > > > > > > which was shared with Trusty, to one more suitable for usage with a
> > > > > > > > > > > > >> > > > > > > > > > > > > > > relational DB like postgresql (but still compatible with big table dbs).
> > > > > > > > > > > > >> > > > > > > > > > > > > > > The idea is to replace the generic Storage interface with four specific
> > > > > > > > > > > > >> > > > > > > > > > > > > > > interfaces (which will inherit from a common one that keeps the query
> > > > > > > > > > > > >> > > > > > > > > > > > > > > part as it is, with get and query methods) that will include the
> > > > > > > > > > > > >> > > > > > > > > > > > > > > required modification operations for the four DataIndex "domains":
> > > > > > > > > > > > >> > > > > > > > > > > > > > > processinstance, usertask, processdefinitions and jobs. Those interfaces
> > > > > > > > > > > > >> > > > > > > > > > > > > > > will define methods like addNode, addVariable, updateTask,
> > > > > > > > > > > > >> > > > > > > > > > > > > > > addAttachment... that will allow the persistence layer implementation to
> > > > > > > > > > > > >> > > > > > > > > > > > > > > update just the needed info in the DB (for example, for addNode in
> > > > > > > > > > > > >> > > > > > > > > > > > > > > Postgres, just insert a row into the nodes table; for addNode in Mongo,
> > > > > > > > > > > > >> > > > > > > > > > > > > > > basically the same atomic upsert operation that is currently done).
> > > > > > > > > > > > >> > > > > > > > > > > > > > > Therefore, we increase performance for Postgres and keep the current one
> > > > > > > > > > > > >> > > > > > > > > > > > > > > for Mongo. The current DB schemas won't be touched.
> > > > > > > > > > > > >> > > > > > > > > > > > > > > Since the code change is large, I do not think I'll be able to have the
> > > > > > > > > > > > >> > > > > > > > > > > > > > > PR ready till next week.
> > > > > > > > > > > > >> > > > > > > > > > > > > > > But before starting, please let me know if that approach is fine for you.
> > > > > > > > > > > > >> > > > > > > > > > > > > > > Best regards.
> > > > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > > > > > > >
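The difference between the Map style interface and a domain-specific one, as proposed in the message above, can be illustrated with an in-memory sketch. This is not the Kogito API (which is Java): the class names, the `put` and `add_node` signatures, and the `row_writes` counter are all hypothetical. The sketch assumes that a Map style write stores the whole process instance, so every node row recorded so far is written again, while `add_node` writes a single row.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class ReadSide(ABC):
    """Common read part that both styles keep unchanged (hypothetical name)."""

    @abstractmethod
    def get(self, instance_id: str) -> List[str]:
        ...


class MapStyleStorage(ReadSide):
    """Map style: each event stores the whole instance under its key."""

    def __init__(self) -> None:
        self._nodes: Dict[str, List[str]] = {}
        self.row_writes = 0

    def put(self, instance_id: str, nodes: List[str]) -> None:
        self._nodes[instance_id] = list(nodes)
        self.row_writes += len(nodes)  # every node row is written again

    def get(self, instance_id: str) -> List[str]:
        return list(self._nodes.get(instance_id, []))


class ProcessInstanceStorage(ReadSide):
    """Domain-specific style: an event only adds the row it is about."""

    def __init__(self) -> None:
        self._nodes: Dict[str, List[str]] = {}
        self.row_writes = 0

    def add_node(self, instance_id: str, node: str) -> None:
        self._nodes.setdefault(instance_id, []).append(node)
        self.row_writes += 1  # a single insert

    def get(self, instance_id: str) -> List[str]:
        return list(self._nodes.get(instance_id, []))


def simulate(n_nodes: int) -> Tuple[int, int]:
    """Replay n node events against both styles and return their write counts."""
    map_style = MapStyleStorage()
    domain = ProcessInstanceStorage()
    executed: List[str] = []
    for i in range(n_nodes):
        node = f"node-{i}"
        executed.append(node)
        map_style.put("pi-1", executed)
        domain.add_node("pi-1", node)
    # Both styles end up holding the same data; only the write volume differs.
    assert map_style.get("pi-1") == domain.get("pi-1")
    return map_style.row_writes, domain.row_writes


if __name__ == "__main__":
    print(simulate(50))
```

Under these assumptions the read side is identical for both styles, which is why the proposal can keep get and query in a shared parent interface and change only the modification operations.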
> > > > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 6:55 PM Alex Porcelli <[email protected]> wrote:
> > > > > > > > > > > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > Thank you Francisco for getting deeper on this…
> > > > > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > Looking forward to seeing the results of your suggested improvements.
> > > > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 9:40 AM Francisco Javier Tirado Sarti
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > > >> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > > I forgot to attach the queries
> > > > > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > > On Fri, Nov 24, 2023 at 3:04 PM Francisco Javier Tirado Sarti
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > > >> > > > > > > > > > > > > > > > >
> > > > > > > > > > > >> > > > > > > > > > > > > > > > >> Hi,
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> A brief update on this topic.
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> After doing a simple test with example
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> https://github.com/apache/incubator-kie-kogito-examples/tree/stable/serverless-workflow-examples/serverless-workflow-data-index-quarkus,
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> the number of updates over Nodes table is n*n, so we manage to obtain a
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> perfect quadratic performance degradation. The problem is worse in the
> > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> case
