Re: [DISCUSSION] Performance issues with data-index persistence addon

Francisco Javier Tirado Sarti Fri, 09 Feb 2024 08:00:30 -0800

Hi,
Usually I do not want to talk about work in progress because preliminary
conclusions are pretty volatile but, well, there are a couple of things
that can be concluded from the really valuable information that Martin
provided.
1) In order to be able to determine if the number of statements is larger
than expected, I asked Martin to test with a simpler process definition.
One with just three nodes: start, script and end. The script one should
change just one variable. This way we can analyze if the number of queries
is the expected one. From the single log (audit was activated them) my
conclusion is that the number of insert/updates over processes and nodes
(there a lot over task, that I will prefer to skip for now, baby steps) is
the expected one.
2) Analysing the thread dump, we see around 15 threads executing this line
at
org.kie.kogito.index.jpa.storage.ProcessInstanceEntityStorage.indexNode(ProcessInstanceEntityStorage.java:125),
so its pretty clear the code to be optimized ;). I'm evaluating
possibilities within JPA/Hibernate, but I'm starting to think that it might
be better to switch to JDBC and skip hibernate. Our lives will be simpler,
especially with a schema relatively simple like ours (that will be my
recommendation if I was an external consultant)


On Fri, Feb 9, 2024 at 4:15 PM Tibor Zimányi <[email protected]> wrote:

> Hi,
>
> this will be a bit off-topic. However as far as performance, I think we
> should think about that we have string primary keys (IDs). I would expect
> the database systems are much better with indexing numeric keys than
> strings. I remember from the past, when I was working with DBs, that using
> strings as keys or indexes was a discouraged practice.
>
> Best regards,
> Tibor
>
> Dňa št 8. 2. 2024, 22:45 Martin Weiler <[email protected]>
> napísal(a):
>
> > I changed the test to use MongoDB [1] and I don't see a performance
> > degradation with this setup [2].
> >
> > Please keep us posted of your findings. Thanks!
> >
> > Martin
> >
> > [1]
> https://github.com/martinweiler/job-service-refactor-test/tree/mongodb
> > [2]
> >
> https://drive.google.com/file/d/1NfacXaxJlgRMw4OQ5S20cvkzvaUKUVFj/view?usp=sharing
> >
> > ________________________________________
> > From: Francisco Javier Tirado Sarti <[email protected]>
> > Sent: Wednesday, February 7, 2024 11:40 AM
> > To: [email protected]
> > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with data-index
> > persistence addon
> >
> > yes, it can be index degradation because of size, but I believe (I might
> be
> > wrong) the db is too small (yet) for that.
> > But, eventually, Postgres, when the DB is huge enough, unavoidably will
> > behave like the graphic that Martin sent.
> > Since I believe we are not huge enough (yet), lets rule out another issue
> > by analysing the sql logs (I requested those to Martin offline and he is
> > going to kindly collect them).
> > Also Im curious to know if Mongo behave in the same way.
> >
> > On Wed, Feb 7, 2024 at 7:25 PM Enrique Gonzalez Martinez <
> > [email protected]> wrote:
> >
> > > Hi Francisco,
> > > I would highly recommend to check indexes and how the updates work in
> > data
> > > index to avoid full scan table and lock the full table. Some db are
> very
> > > sensitive to that.
> > >
> > > El mié, 7 feb 2024, 18:41, Francisco Javier Tirado Sarti <
> > > [email protected]> escribió:
> > >
> > > > Hi Martin,
> > > > While I analyze the data, let me ask you if it is possible to perform
> > > > another check (similar in a way to disabling data-index like you do)
> > Can
> > > > you switch to MongoDB persistence and check if the same degradation
> > that
> > > is
> > > > there for postgres remains?
> > > > I do not know if this is feasible but will certainly indicate the
> > problem
> > > > is on the postgres storage layer and I do not have a clear prediction
> > of
> > > > what we will see when doing this switch.
> > > >
> > > > On Wed, Feb 7, 2024 at 6:37 PM Martin Weiler <[email protected]
> >
> > > > wrote:
> > > >
> > > > > Hi Francisco,
> > > > >
> > > > > thanks for your work on this important topic!
> > > > >
> > > > > I would like to share some test results here, which might help to
> > > improve
> > > > > the codebase even further. I am using the jmeter based test case
> from
> > > > Pere
> > > > > and Enrique (thanks guys!) [1] which uses a load of 30 threads to
> > > > >
> > > > > 1) start a new process instance (POST)
> > > > > 2) retrieve tasks for a user (GET)
> > > > > 3) fetches task details (GET)
> > > > > 4) complete a task (POST)
> > > > > 5) execute a query on data-audit
> > > > >
> > > > > With this test setup, I noticed that the performance for the POST
> > > > > requests, in particular the one to start a new process instance,
> > > degrades
> > > > > over time - see graph [2]. If I run the same test without
> data-index,
> > > > then
> > > > > there is no such performance degradation [3]. You can find a thread
> > > dump
> > > > > captured a few minutes into the first test here [4] that might help
> > to
> > > > see
> > > > > some of the contention points.
> > > > >
> > > > > I'd appreciate if you could take a look and see if there is
> something
> > > > that
> > > > > can be further improved based on your previous work. If you need
> any
> > > > > additional data, let me know, but otherwise it is straightforward
> to
> > > run
> > > > > the jmeter test as well.
> > > > >
> > > > > Thanks,
> > > > > Martin
> > > > >
> > > > > [1] https://github.com/pefernan/job-service-refactor-test/
> > > > > [2]
> > > > >
> > > >
> > >
> >
> https://drive.google.com/file/d/1Gqn-ixE05kXv2jdssAUlnMuUVcHxIYZ0/view?usp=sharing
> > > > > [3]
> > > > >
> > > >
> > >
> >
> https://drive.google.com/file/d/10gVNyb4JYg_bA18bNhY9dEDbPn3TOxL7/view?usp=sharing
> > > > > [4]
> > > > >
> > > >
> > >
> >
> https://drive.google.com/file/d/1jVrtsO49gCvUlnaC9AUAtkVKTm4PbdUv/view?usp=sharing
> > > > >
> > > > > ________________________________________
> > > > > From: Francisco Javier Tirado Sarti <[email protected]>
> > > > > Sent: Wednesday, January 17, 2024 9:13 AM
> > > > > To: [email protected]
> > > > > Cc: Pere Fernandez Perez
> > > > > Subject: [EXTERNAL] Re: [DISCUSSION] Performance issues with
> > data-index
> > > > > persistence addon
> > > > >
> > > > > Hi Alex,
> > > > > I did not take times (which depends on a number of variables that
> > > > > drastically change between environments), but verify that the
> number
> > of
> > > > > updates has been reduced drastically without losing functionality,
> > > which
> > > > is
> > > > > objectively a good thing. If before the change, for every node
> > > executed,
> > > > we
> > > > > have an update for every node previously executed, so if a process
> > have
> > > > 50
> > > > > nodes to execute, we were performing nearly 50*51/2 updates, which
> > > gives
> > > > us
> > > > > a total of  1275 updates, now we have just one for every node being
> > > > > executed, implying a total of 50 updates.
> > > > >
> > > > >
> > > > > On Wed, Jan 17, 2024 at 3:18 PM Alex Porcelli <[email protected]>
> > > wrote:
> > > > >
> > > > > > Francisco,
> > > > > >
> > > > > > I noticed that your PR has been merged, but I was expecting (at
> > least
> > > > > > was my understanding from this thread) that before merging some
> > > > > > benchmark data would be shared in advance - to assess the
> > > cost/benefit
> > > > > > of such a decent size change.
> > > > > >
> > > > > > Do you have any information to share?
> > > > > >
> > > > > > On Sat, Dec 23, 2023 at 4:02 AM Francisco Javier Tirado Sarti
> > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > Yes, as intended, now we have one select and one insert/update
> > per
> > > > node
> > > > > > > event.
> > > > > > > I moved the PR as ready for review and give @Pere Fernandez
> Perez
> > > > > > > <[email protected]> permission to the branch so he can edit
> it
> > > in
> > > > > the
> > > > > > > next two weeks (Ill be on PTO)  if desired, before merging.
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Dec 21, 2023 at 5:58 PM Alex Porcelli <
> [email protected]>
> > > > > wrote:
> > > > > > >
> > > > > > > > Cool, thank you Francisco!
> > > > > > > >
> > > > > > > > Did you manage to get some preliminary data about
> improvements?
> > > > > > > >
> > > > > > > > On Thu, Dec 21, 2023 at 11:52 AM Francisco Javier Tirado
> Sarti
> > > > > > > > <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > Yes, after some delay because of quarkus 3 migration. Im
> > > refining
> > > > > > this
> > > > > > > > > draft PR
> > > > > > > > >
> > https://github.com/apache/incubator-kie-kogito-apps/pull/1941
> > > > > > > > >
> > > > > > > > > On Thu, Dec 21, 2023 at 5:48 PM Alex Porcelli <
> > > [email protected]>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Any update or new findings on this topic?
> > > > > > > > > >
> > > > > > > > > > On Tue, Nov 28, 2023 at 8:38 AM Francisco Javier Tirado
> > Sarti
> > > > > > > > > > <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi Alex,
> > > > > > > > > > > After considering different options to improve
> > performance,
> > > > we
> > > > > > feel
> > > > > > > > that
> > > > > > > > > > it
> > > > > > > > > > > is time to "partially" move away from the current Map
> > style
> > > > > > > > interface (
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-kie-kogito-apps/blob/main/persistence-commons/persistence-commons-api/src/main/java/org/kie/kogito/persistence/api/Storage.java
> > > > > > > > > > )
> > > > > > > > > > > which was shared with Trusty, to one more suitable for
> > > usage
> > > > > > with a
> > > > > > > > > > > relational DB like postgresql (but still compatible
> with
> > > big
> > > > > > table
> > > > > > > > dbs).
> > > > > > > > > > > The idea will be to replace generic Storage interface
> by
> > > four
> > > > > > > > specific
> > > > > > > > > > > interfaces (which will inherit from a common one that
> > keeps
> > > > the
> > > > > > query
> > > > > > > > > > part
> > > > > > > > > > > at is it. with get and query methods), that will
> include
> > > the
> > > > > > required
> > > > > > > > > > > modification operations for the four DataIndex
> "domains":
> > > > > > > > > > processinstance,
> > > > > > > > > > > usertask, processdefinitions and jobs. Those interfaces
> > > will
> > > > > > define
> > > > > > > > > > methods
> > > > > > > > > > > like addNode, addVariable, updateTask,
> addAttachment.....
> > > > that
> > > > > > will
> > > > > > > > allow
> > > > > > > > > > > the persistent layer implementation  to just update the
> > > > needed
> > > > > > info
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > DB  (for example, for addNode in Postgres, just insert
> a
> > > row
> > > > > into
> > > > > > > > nodes
> > > > > > > > > > > table, for addNode in Mongo, basically the same atomic
> > > upsert
> > > > > > > > operation
> > > > > > > > > > > that is currently done). Therefore, we increase
> > performance
> > > > for
> > > > > > > > Postgres
> > > > > > > > > > > and keep the current one for Mongo. The current DB
> > schemas
> > > > > won't
> > > > > > be
> > > > > > > > > > > touched.
> > > > > > > > > > > Since the code change is large, I do not think I'll be
> > able
> > > > to
> > > > > > have
> > > > > > > > the
> > > > > > > > > > PR
> > > > > > > > > > > ready till next week.
> > > > > > > > > > > But before starting, please let me know if that
> approach
> > is
> > > > > fine
> > > > > > for
> > > > > > > > you.
> > > > > > > > > > > Best regards.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Nov 24, 2023 at 6:55 PM Alex Porcelli <
> > > > > [email protected]>
> > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thank you Francisco to getting deeper on this…
> > > > > > > > > > > >
> > > > > > > > > > > > Looking forward to see the results of your suggested
> > > > > > improvements.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Nov 24, 2023 at 9:40 AM Francisco Javier
> Tirado
> > > > > Sarti <
> > > > > > > > > > > > [email protected]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > I forgot to attach the queries
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Nov 24, 2023 at 3:04 PM Francisco Javier
> > Tirado
> > > > > > Sarti <
> > > > > > > > > > > > > [email protected]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >> Hi,
> > > > > > > > > > > > >> A brief update on this topic.
> > > > > > > > > > > > >> After doing a simple test with example
> > > > > > > > > > > > >>
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/incubator-kie-kogito-examples/tree/stable/serverless-workflow-examples/serverless-workflow-data-index-quarkus
> > > > > > > > > > > > ,
> > > > > > > > > > > > >> the number of updates over Nodes table is n*n, so
> we
> > > > > manage
> > > > > > to
> > > > > > > > > > obtain a
> > > > > > > > > > > > >> perfect quadratic performance degradation. The
> > problem
> > > > is
> > > > > > worse
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > case
> > > > > > > > > > > > >> of Serverless Workflow than in BPMN because we the
> > > > number
> > > > > of
> > > > > > > > nodes
> > > > > > > > > > is
> > > > > > > > > > > > >> greater than the number of states. In that
> example N
> > > is
> > > > > 16,
> > > > > > but
> > > > > > > > for
> > > > > > > > > > a
> > > > > > > > > > > > more
> > > > > > > > > > > > >> complex workflow it would be certainly large.
> > > > > > > > > > > > >> I think that this is more related to how we are
> > > handling
> > > > > > JPA in
> > > > > > > > the
> > > > > > > > > > > > code,
> > > > > > > > > > > > >> in particular the mapping from model to entity
> > > > (basically
> > > > > > JPA is
> > > > > > > > > > blind
> > > > > > > > > > > > and
> > > > > > > > > > > > >> has to update all nodes for every write because it
> > > > > believes
> > > > > > the
> > > > > > > > > > node has
> > > > > > > > > > > > >> been updated, although it is not) than an issue in
> > the
> > > > > table
> > > > > > > > > > definition.
> > > > > > > > > > > > >> In fact, when using JPA, separating the server
> model
> > > > from
> > > > > > the
> > > > > > > > JPA
> > > > > > > > > > > > entity is
> > > > > > > > > > > > >> not a good idea, especially if the entity contains
> > > > > > collections.
> > > > > > > > I
> > > > > > > > > > will
> > > > > > > > > > > > try
> > > > > > > > > > > > >> to change that without breaking anything.
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> On Wed, Nov 22, 2023 at 12:10 PM Enrique Gonzalez
> > > > > Martinez <
> > > > > > > > > > > > >> [email protected]> wrote:
> > > > > > > > > > > > >>
> > > > > > > > > > > > >>> After the events split you now will need to
> create
> > a
> > > > node
> > > > > > > > instance
> > > > > > > > > > > > >>> model instance of making independent from the
> > process
> > > > > > instance.
> > > > > > > > > > > > >>> That should do the trick.
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>> Regarding deleting/inserting it was fixed at some
> > > > point.
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>> El mar, 21 nov 2023 a las 20:22, Francisco Javier
> > > > Tirado
> > > > > > Sarti
> > > > > > > > > > > > >>> (<[email protected]>) escribió:
> > > > > > > > > > > > >>> >
> > > > > > > > > > > > >>> > Hi Martin,
> > > > > > > > > > > > >>> > I have a task to review performance of
> > > > > > > > > > > > >>> >
> > > > > > > > > > > > >>> > ProcessInstanceNodeDataEventMerger
> > > > > > > > > > > > >>> > My idea is to reduce the number of delete
> inserts
> > > > when
> > > > > > > > processing
> > > > > > > > > > > > >>> events
> > > > > > > > > > > > >>> > and try to do it incremental.
> > > > > > > > > > > > >>> > That should improve performance.
> > > > > > > > > > > > >>> > PS:
> > > > > > > > > > > > >>> > I was planning to send an e-mail tomorrow
> > > announcing
> > > > > > that in
> > > > > > > > > > case you
> > > > > > > > > > > > >>> were
> > > > > > > > > > > > >>> > already working on a fix for that. I assume you
> > are
> > > > not
> > > > > > and I
> > > > > > > > > > would
> > > > > > > > > > > > be
> > > > > > > > > > > > >>> > sending a PR soon.
> > > > > > > > > > > > >>> >
> > > > > > > > > > > > >>> > On Tue, Nov 21, 2023 at 6:09 PM Martin Weiler
> > > > > > > > > > > > <[email protected]
> > > > > > > > > > > > >>> >
> > > > > > > > > > > > >>> > wrote:
> > > > > > > > > > > > >>> >
> > > > > > > > > > > > >>> > > I looked into the new examples using
> data-index
> > > > > > persistence
> > > > > > > > > > addon -
> > > > > > > > > > > > >>> Neus'
> > > > > > > > > > > > >>> > > PR#1813 [1] for serverless and Pere's branch
> > [2]
> > > > for
> > > > > > > > workflow
> > > > > > > > > > > > (great
> > > > > > > > > > > > >>> job
> > > > > > > > > > > > >>> > > both!) - and they work without issues using
> > > single
> > > > > > > > requests.
> > > > > > > > > > > > >>> However, under
> > > > > > > > > > > > >>> > > some load (I used 'ab' for testing with a
> light
> > > > > > > > concurrency of
> > > > > > > > > > 10
> > > > > > > > > > > > >>> parallel
> > > > > > > > > > > > >>> > > requests) I ran into the following problems:
> > > > > > > > > > > > >>> > >
> > > > > > > > > > > > >>> > > (1) Large number of insert/delete calls (eg.
> > for
> > > > > tables
> > > > > > > > such as
> > > > > > > > > > > > >>> nodes,
> > > > > > > > > > > > >>> > > definitions, etc)
> > > > > > > > > > > > >>> > >
> > > > > > > > > > > > >>> > > (2) Hibernate OptimisticLockExceptions /
> > > > > > > > StaleStateExceptions
> > > > > > > > > > > > >>> > >
> > > > > > > > > > > > >>> > > (3) DB deadlocks
> > > > > > > > > > > > >>> > >
> > > > > > > > > > > > >>> > > (4) Error responses, slow response times
> > > > > > > > > > > > >>> > >
> > > > > > > > > > > > >>> > > The reason I am reaching out with this topic
> > here
> > > > is
> > > > > to
> > > > > > > > find
> > > > > > > > > > out if
> > > > > > > > > > > > >>> we are
> > > > > > > > > > > > >>> > > aware of this issue, and if someone is
> already
> > > > > looking
> > > > > > > > into or
> > > > > > > > > > > > being
> > > > > > > > > > > > >>> > > assigned to it?
> > > > > > > > > > > > >>> > >
> > > > > > > > > > > > >>> > > Thanks,
> > > > > > > > > > > > >>> > > Martin
> > > > > > > > > > > > >>> > >
> > > > > > > > > > > > >>> > > [1]
> > > > > > > > > > > > >>>
> > > > > > > >
> > > https://github.com/apache/incubator-kie-kogito-examples/pull/1813
> > > > > > > > > > > > >>> > > [2]
> > > > > > > > > > > > >>> > >
> > > > > > > > > > > > >>>
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/pefernan/kogito-examples/tree/example_data-index_persistence
> > > > > > > > > > > > >>> > >
> > > > > > > > > > > > >>> > >
> > > > > > > > > > > >
> > > > > > > >
> > > > ---------------------------------------------------------------------
> > > > > > > > > > > > >>> > > To unsubscribe, e-mail:
> > > > > [email protected]
> > > > > > > > > > > > >>> > > For additional commands, e-mail:
> > > > > > [email protected]
> > > > > > > > > > > > >>> > >
> > > > > > > > > > > > >>> > >
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>>
> > > > > > > > > >
> > > > > >
> > ---------------------------------------------------------------------
> > > > > > > > > > > > >>> To unsubscribe, e-mail:
> > > [email protected]
> > > > > > > > > > > > >>> For additional commands, e-mail:
> > > > [email protected]
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >>>
> > > > > > > > > > > > >
> > > > > > > >
> > > > ---------------------------------------------------------------------
> > > > > > > > > > > > > To unsubscribe, e-mail:
> > [email protected]
> > > > > > > > > > > > > For additional commands, e-mail:
> > > [email protected]
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > >
> > ---------------------------------------------------------------------
> > > > > > > > > > To unsubscribe, e-mail: [email protected]
> > > > > > > > > > For additional commands, e-mail: [email protected]
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > ---------------------------------------------------------------------
> > > > > > > > To unsubscribe, e-mail: [email protected]
> > > > > > > > For additional commands, e-mail: [email protected]
> > > > > > > >
> > > > > > > >
> > > > > >
> > > > > >
> > ---------------------------------------------------------------------
> > > > > > To unsubscribe, e-mail: [email protected]
> > > > > > For additional commands, e-mail: [email protected]
> > > > > >
> > > > > >
> > > > >
> > > > >
> ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: [email protected]
> > > > > For additional commands, e-mail: [email protected]
> > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSSION] Performance issues with data-index persistence addon

Reply via email to