[ 
https://issues.apache.org/jira/browse/UNOMI-266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Serge Huber updated UNOMI-266:
------------------------------
    Description: 
After reviewing some code following the migration to Elasticsearch 7, the 
following problems were found: 
- Event save doesn't use ES batching although it could. Since we have no need 
for real-time event querying, batching is acceptable here. This should improve 
ES ingestion performance, as event saves are among the highest-frequency 
operations.
- In the EventService, the current implementation of the 
hasEventAlreadyBeenRaised method is a problem because it queries the events, 
and it is called each time a rule marked as isRaiseEventOnlyOnceForProfile or 
isRaiseEventOnlyOnceForSession is evaluated, which can be very frequent. The 
result should be cached in the session and profile, using system properties, 
so that we don't have to recompute it every time. A solution would be to first 
check the systemProperties to see whether the event has been marked as already 
raised, and to perform the query in ES only if it has not. This would keep it 
compatible with legacy data.
- In the rules, we should replace the condition for matching events with a 
dedicated "eventTypesSupported" field, which would make it a lot faster to 
check whether a rule should be evaluated for an incoming event. The field could 
also support negated event types such as !view or something similar. This would 
avoid going through the condition evaluators just to check whether a rule 
matches the event currently being processed. It would also make it possible to 
implement internal optimization (cache) tables to find all the rules that match 
an event type even faster. This is actually already tracked in UNOMI-188.
- More... see sub-tasks.
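
As a rough illustration of the hasEventAlreadyBeenRaised point above, the 
"check systemProperties first, query ES only if needed" idea could look like 
the sketch below. This is not the actual Unomi code: the class name, the 
"raisedEventTypes" key, and the BiPredicate standing in for the ES query are 
all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.BiPredicate;

// Hypothetical sketch, not Unomi's actual implementation.
class RaisedEventCache {
    // Assumed key under which already-raised event types are remembered.
    static final String RAISED_KEY = "raisedEventTypes";

    // Stand-in for the expensive ES query: (profileId, eventType) -> already raised?
    private final BiPredicate<String, String> persistenceQuery;

    RaisedEventCache(BiPredicate<String, String> persistenceQuery) {
        this.persistenceQuery = persistenceQuery;
    }

    boolean hasEventAlreadyBeenRaised(String profileId,
                                      Map<String, Object> systemProperties,
                                      String eventType) {
        Set<String> raised = raisedTypes(systemProperties);
        if (raised.contains(eventType)) {
            return true; // fast path: no query needed
        }
        // Fallback for legacy data that was stored before the marker existed.
        boolean found = persistenceQuery.test(profileId, eventType);
        if (found) {
            raised.add(eventType); // remember the answer for next time
        }
        return found;
    }

    // To be called when the event is raised, so later checks skip the query.
    void markRaised(Map<String, Object> systemProperties, String eventType) {
        raisedTypes(systemProperties).add(eventType);
    }

    @SuppressWarnings("unchecked")
    private static Set<String> raisedTypes(Map<String, Object> systemProperties) {
        return (Set<String>) systemProperties
                .computeIfAbsent(RAISED_KEY, k -> new HashSet<String>());
    }
}
```

Note that in this sketch a negative answer is not cached, so an event that was 
never raised still triggers the fallback query until markRaised is called.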

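For the "eventTypesSupported" point, the internal lookup table could be 
sketched roughly as follows. The class and method names are hypothetical, and 
the semantics are assumptions for the example: a rule with positive entries 
matches only those types, a rule with only negated entries (such as !view) 
matches every other type, and a rule without the field is always evaluated so 
that existing rules keep working.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an event type -> rules lookup table.
class EventTypeRuleIndex {
    private final Map<String, List<String>> rulesByType = new HashMap<>();
    private final Map<String, Set<String>> excludedTypesByRule = new LinkedHashMap<>();
    private final List<String> unrestrictedRules = new ArrayList<>();

    void addRule(String ruleId, List<String> eventTypesSupported) {
        if (eventTypesSupported == null || eventTypesSupported.isEmpty()) {
            unrestrictedRules.add(ruleId); // optional field absent: always evaluate
            return;
        }
        List<String> positives = new ArrayList<>();
        Set<String> negatives = new HashSet<>();
        for (String type : eventTypesSupported) {
            if (type.startsWith("!")) {
                negatives.add(type.substring(1));
            } else {
                positives.add(type);
            }
        }
        if (!positives.isEmpty()) {
            // Assumption: positive entries take precedence over negated ones.
            for (String type : positives) {
                rulesByType.computeIfAbsent(type, k -> new ArrayList<>()).add(ruleId);
            }
        } else {
            excludedTypesByRule.put(ruleId, negatives);
        }
    }

    // Returns the ids of the rules whose conditions still need to be evaluated.
    List<String> rulesFor(String eventType) {
        List<String> result = new ArrayList<>(
                rulesByType.getOrDefault(eventType, Collections.emptyList()));
        for (Map.Entry<String, Set<String>> entry : excludedTypesByRule.entrySet()) {
            if (!entry.getValue().contains(eventType)) {
                result.add(entry.getKey());
            }
        }
        result.addAll(unrestrictedRules);
        return result;
    }
}
```

With such a table, only the rules returned by rulesFor would go through the 
condition evaluators for a given incoming event.
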
In terms of data model changes and potential migration impact, most of these 
changes should have limited or no impact, especially if new fields (such as the 
rule one) can be made optional. We still need to check what will happen with 
API clients that suddenly receive new fields, as this might break them.

This ticket now groups all other back-end improvement tickets as sub-tasks. 
They still need to be properly triaged and updated.

  was:
After review some code after the migration to ElasticSearch 7, the following 
problems were found: 
- Event save doesn't use ES batching while it could, as we have no need for 
real-time event querying, batching is acceptable here. This should make ES 
ingestion performance better as event saves are among the highest frequency 
operations
- In the EventService, the hasEventAlreadyBeenRaised method is a problem in its 
current implementation because it performs queries over the events and this 
method is called each time a rule that is marked as 
isRaiseEventOnlyOnceForProfile or hasEventAlreadyBeenRaisedForSession is 
evaluated, which can be very frequent. This calculation should be cached in the 
session and profile, using system properties, so that we don't have to check it 
all the time. A solution would be to first check in the systemProperties if we 
have marked the event as already raised and if not perform the query in ES, but 
only in that case. This would make it compatible with legacy data.
- In the rules, we should replace the condition for matching events with a 
specified "eventTypesSupported" field that would make it a lot faster to check 
if rules should be evaluated for an incoming event. The field could also 
support negative eventTypes such as !view or something like that. This would 
prevent going through the condition evaluators when just needing to check if it 
matches the currently process event. It would also allow to implement internal 
optimization (cache) tables to find all the rules that match an event type even 
faster. This is actually already tracked in UNOMI-188.

In terms of data model changes and potential migration impact, most of these 
changes should have limited/no impact, especially if new fields (such as the 
rule one) can be made optional. It does still need to be checked what will 
happen with API clients that will suddenly retrieve new fields, which might 
make them break.


> Backend performance improvements
> --------------------------------
>
>                 Key: UNOMI-266
>                 URL: https://issues.apache.org/jira/browse/UNOMI-266
>             Project: Apache Unomi
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.3.0-incubating, 1.4.0
>            Reporter: Serge Huber
>            Assignee: Serge Huber
>            Priority: Major
>             Fix For: 1.5.0



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
