[
https://issues.apache.org/jira/browse/UNOMI-266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Serge Huber updated UNOMI-266:
------------------------------
Description:
After reviewing some code following the migration to Elasticsearch 7, the
following problems were found:
- Event save doesn't use ES batching although it could: we have no need for
real-time event querying, so batching is acceptable here. This should improve
ES ingestion performance, as event saves are among the highest-frequency
operations.
- In the EventService, the hasEventAlreadyBeenRaised method is a problem in its
current implementation because it queries the events, and it is called each
time a rule that is marked as isRaiseEventOnlyOnceForProfile or
hasEventAlreadyBeenRaisedForSession is evaluated, which can be very frequent.
The result should be cached in the session and profile, using system
properties, so that we don't have to check it every time. A solution would be
to first check in the systemProperties whether we have marked the event as
already raised, and to perform the query in ES only if we have not. This
fallback would make it compatible with legacy data. For the profile storage we
would have to be careful, as events are related to the target item ID, so this
could grow significantly if set up on a lot of objects.
- In the rules, we should replace the condition for matching events with a
specific "eventTypesSupported" field that would make it a lot faster to check
whether rules should be evaluated for an incoming event. The field could also
support negated event types such as !view or something like that. This would
avoid going through the condition evaluators when we just need to check whether
a rule matches the event currently being processed. It would also allow
implementing internal optimization (cache) tables to find all the rules that
match an event type even faster. This is actually already tracked in UNOMI-188.
- More... see sub-tasks.
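As a rough illustration of the "check systemProperties first, query ES only
if needed" idea for hasEventAlreadyBeenRaised, here is a minimal sketch. It is
not the actual Unomi implementation: the class name, the "raisedEvents" key,
the marker format and the esQuery callback are all assumptions made for the
example, and the system properties are modelled as a plain Map.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.BiPredicate;

// Hypothetical helper, not part of Unomi: caches "already raised" markers
// in a systemProperties-like map and only falls back to the query on a miss.
final class RaisedEventCache {

    // Assumed key under which the markers are stored in systemProperties.
    static final String RAISED_KEY = "raisedEvents";

    @SuppressWarnings("unchecked")
    static boolean hasEventAlreadyBeenRaised(Map<String, Object> systemProperties,
                                             String eventType,
                                             String targetId,
                                             BiPredicate<String, String> esQuery) {
        Set<String> raised = (Set<String>) systemProperties
                .computeIfAbsent(RAISED_KEY, k -> new HashSet<String>());
        // Events relate to a target item ID, so the marker includes it.
        String marker = eventType + ":" + targetId;
        if (raised.contains(marker)) {
            return true; // cache hit: no query needed
        }
        // Cache miss: run the (expensive) query, which also covers legacy
        // data that was stored before any marker existed.
        boolean found = esQuery.test(eventType, targetId);
        if (found) {
            raised.add(marker); // remember the positive result
        }
        return found;
    }
}
```

Only positive results are remembered in this sketch, so an event that has not
been raised yet is still queried each time. Because one marker is kept per
event type and target item, the set grows with the number of targets, which
is the profile storage concern mentioned above.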
In terms of data model changes and potential migration impact, most of these
changes should have limited or no impact, especially if new fields (such as the
rule one) can be made optional. We still need to check what will happen with
API clients that suddenly receive new fields, as this might break them.
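To make the optional "eventTypesSupported" rule field more concrete, the
following sketch shows one possible lookup table keyed by event type. All
names here (RuleEventTypeIndex, RuleInfo, candidatesFor) are hypothetical, and
the handling of negated types such as !view is only one possible
interpretation. A rule that does not declare the field is treated as a
candidate for every event, so existing rules keep going through their
condition evaluators as before.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical index, not part of Unomi: finds candidate rules per event type.
final class RuleEventTypeIndex {

    // Minimal stand-in for a rule: an id plus the optional field.
    static final class RuleInfo {
        final String id;
        final List<String> eventTypesSupported; // null or empty = not declared

        RuleInfo(String id, List<String> eventTypesSupported) {
            this.id = id;
            this.eventTypesSupported = eventTypesSupported;
        }
    }

    // Rules that declare only positive types, indexed by type.
    private final Map<String, List<RuleInfo>> byType = new HashMap<>();
    // Rules without the field, or using negation: checked one by one.
    private final List<RuleInfo> scanned = new ArrayList<>();

    void add(RuleInfo rule) {
        List<String> types = rule.eventTypesSupported;
        boolean indexable = types != null && !types.isEmpty();
        if (indexable) {
            for (String t : types) {
                if (t.startsWith("!")) { indexable = false; break; }
            }
        }
        if (!indexable) {
            scanned.add(rule);
            return;
        }
        for (String t : new LinkedHashSet<>(types)) { // ignore duplicate entries
            byType.computeIfAbsent(t, k -> new ArrayList<>()).add(rule);
        }
    }

    static boolean matches(RuleInfo rule, String eventType) {
        List<String> types = rule.eventTypesSupported;
        if (types == null || types.isEmpty()) return true; // legacy rule
        boolean hasPositive = false;
        boolean positiveHit = false;
        for (String t : types) {
            if (t.startsWith("!")) {
                if (t.substring(1).equals(eventType)) return false; // excluded
            } else {
                hasPositive = true;
                if (t.equals(eventType)) positiveHit = true;
            }
        }
        return !hasPositive || positiveHit;
    }

    // Rules whose conditions still have to be evaluated for this event type.
    List<RuleInfo> candidatesFor(String eventType) {
        List<RuleInfo> result =
                new ArrayList<>(byType.getOrDefault(eventType, Collections.emptyList()));
        for (RuleInfo r : scanned) {
            if (matches(r, eventType)) result.add(r);
        }
        return result;
    }
}
```

In this sketch the more rules declare positive event types, the shorter the
scanned list becomes and the closer the lookup gets to a single map access.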
This ticket now groups all other back-end improvement tickets as sub-tasks.
They still need to be properly triaged and updated.
was:
After reviewing some code following the migration to Elasticsearch 7, the
following problems were found:
- Event save doesn't use ES batching although it could: we have no need for
real-time event querying, so batching is acceptable here. This should improve
ES ingestion performance, as event saves are among the highest-frequency
operations.
- In the EventService, the hasEventAlreadyBeenRaised method is a problem in its
current implementation because it queries the events, and it is called each
time a rule that is marked as isRaiseEventOnlyOnceForProfile or
hasEventAlreadyBeenRaisedForSession is evaluated, which can be very frequent.
The result should be cached in the session and profile, using system
properties, so that we don't have to check it every time. A solution would be
to first check in the systemProperties whether we have marked the event as
already raised, and to perform the query in ES only if we have not. This
fallback would make it compatible with legacy data.
- In the rules, we should replace the condition for matching events with a
specific "eventTypesSupported" field that would make it a lot faster to check
whether rules should be evaluated for an incoming event. The field could also
support negated event types such as !view or something like that. This would
avoid going through the condition evaluators when we just need to check whether
a rule matches the event currently being processed. It would also allow
implementing internal optimization (cache) tables to find all the rules that
match an event type even faster. This is actually already tracked in UNOMI-188.
- More... see sub-tasks.
In terms of data model changes and potential migration impact, most of these
changes should have limited or no impact, especially if new fields (such as the
rule one) can be made optional. We still need to check what will happen with
API clients that suddenly receive new fields, as this might break them.
This ticket now groups all other back-end improvement tickets as sub-tasks.
They still need to be properly triaged and updated.
> Backend performance improvements
> --------------------------------
>
> Key: UNOMI-266
> URL: https://issues.apache.org/jira/browse/UNOMI-266
> Project: Apache Unomi
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.3.0-incubating, 1.4.0
> Reporter: Serge Huber
> Assignee: Serge Huber
> Priority: Major
> Fix For: 1.5.0
>
>
> After reviewing some code following the migration to Elasticsearch 7, the
> following problems were found:
> - Event save doesn't use ES batching although it could: we have no need for
> real-time event querying, so batching is acceptable here. This should improve
> ES ingestion performance, as event saves are among the highest-frequency
> operations.
> - In the EventService, the hasEventAlreadyBeenRaised method is a problem in
> its current implementation because it queries the events, and it is called
> each time a rule that is marked as isRaiseEventOnlyOnceForProfile or
> hasEventAlreadyBeenRaisedForSession is evaluated, which can be very frequent.
> The result should be cached in the session and profile, using system
> properties, so that we don't have to check it every time. A solution would be
> to first check in the systemProperties whether we have marked the event as
> already raised, and to perform the query in ES only if we have not. This
> fallback would make it compatible with legacy data. For the profile storage
> we would have to be careful, as events are related to the target item ID, so
> this could grow significantly if set up on a lot of objects.
> - In the rules, we should replace the condition for matching events with a
> specific "eventTypesSupported" field that would make it a lot faster to
> check whether rules should be evaluated for an incoming event. The field
> could also support negated event types such as !view or something like that.
> This would avoid going through the condition evaluators when we just need to
> check whether a rule matches the event currently being processed. It would
> also allow implementing internal optimization (cache) tables to find all the
> rules that match an event type even faster. This is actually already tracked
> in UNOMI-188.
> - More... see sub-tasks.
> In terms of data model changes and potential migration impact, most of these
> changes should have limited or no impact, especially if new fields (such as
> the rule one) can be made optional. We still need to check what will happen
> with API clients that suddenly receive new fields, as this might break them.
> This ticket now groups all other back-end improvement tickets as sub-tasks.
> They still need to be properly triaged and updated.