[
https://issues.apache.org/jira/browse/UNOMI-748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703540#comment-17703540
]
Kevan Jahanshahi edited comment on UNOMI-748 at 3/22/23 8:22 AM:
-----------------------------------------------------------------
The merge code has been improved in PR
[https://github.com/apache/unomi/pull/593], but another ticket has been
created to update all the other places that need the same change:
https://issues.apache.org/jira/browse/UNOMI-753
was (Author: jkevan):
The merge code has been improved, but another ticket has been created to update
all the other places that need the same change:
https://issues.apache.org/jira/browse/UNOMI-753
> Unomi merge system is exposed to OOM
> ------------------------------------
>
> Key: UNOMI-748
> URL: https://issues.apache.org/jira/browse/UNOMI-748
> Project: Apache Unomi
> Issue Type: Improvement
> Affects Versions: unomi-2.1.0
> Reporter: Kevan Jahanshahi
> Assignee: Kevan Jahanshahi
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently the sessions/events *update* uses the bulkProcessor, which is
> asynchronous: we never know when the bulk will be performed.
> * {+}the benefit{+}: merge requests are fast because nothing is retained;
> the bulk processor does the job in a separate thread.
> * {+}the cons{+}: {*}all previous sessions/events are first loaded in
> memory{*}, so when merging active profiles that contain a lot of past
> events/sessions, {{we could be exposed to OOM}}. {_}(We already had a
> similar case with the purge, which was loading all profiles in memory.){_}
> If we replace the *update (one item at a time)* with {*}updateByQuery{*},
> the request will lose the asynchronous nature provided by the
> BulkProcessor.
> * {+}the benefit{+}: sessions and events are not loaded in memory, so no OOM
> is possible.
> * {+}the cons{+}: the request will be synchronous and {{we expose merge
> requests to timeouts on the client side}}. The merge is actually triggered by
> the login on the jExp side, so adding extra time here could have bad impacts
> and side effects.
>
> Since neither of these solutions seems OK, the ideal solution would be a mix
> of the strengths of both:
> * use *{{updateByQuery}}* in a separate thread to avoid retaining the merge
> request
> ** We have the OOM protection by not loading all the past events/sessions
> ** We have the asynchronous execution done in a separate thread/job to free
> the current request.
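
The combined approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR and not Unomi's persistence API: `ItemStore`, `InMemoryStore`, `mergeAsync`, and the profile and item ids are all hypothetical names introduced here. The in-memory store only stands in for a backend that would apply the update by query on the server side, so the merge request itself never loads past sessions/events.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class AsyncMergeSketch {

    /** Hypothetical store abstraction; not an Unomi or Elasticsearch interface. */
    interface ItemStore {
        /** Reassigns every item owned by oldProfileId to newProfileId; returns the update count. */
        long updateByQuery(String oldProfileId, String newProfileId);
    }

    /** In-memory stand-in so the sketch runs without a search cluster. */
    static final class InMemoryStore implements ItemStore {
        final Map<String, String> ownerByItemId = new ConcurrentHashMap<>();

        @Override
        public long updateByQuery(String oldProfileId, String newProfileId) {
            long updated = 0;
            for (String itemId : ownerByItemId.keySet()) {
                // Atomic compare-and-replace: only items still owned by oldProfileId change.
                if (ownerByItemId.replace(itemId, oldProfileId, newProfileId)) {
                    updated++;
                }
            }
            return updated;
        }
    }

    /** Submits the rewrite to the executor and returns at once, so the caller is not blocked. */
    static CompletableFuture<Long> mergeAsync(ItemStore store, ExecutorService executor,
                                              String oldProfileId, String newProfileId) {
        return CompletableFuture.supplyAsync(
                () -> store.updateByQuery(oldProfileId, newProfileId), executor);
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        store.ownerByItemId.put("session-1", "anonymous-profile");
        store.ownerByItemId.put("event-1", "anonymous-profile");
        store.ownerByItemId.put("event-2", "other-profile");

        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<Long> pending =
                    mergeAsync(store, executor, "anonymous-profile", "master-profile");
            // A real request handler would return here; join() is only for the demo.
            System.out.println("items reassigned: " + pending.join());
        } finally {
            executor.shutdown();
        }
    }
}
```

The merge request only pays the cost of submitting a task, and the background thread issues a single query-style update instead of one update per loaded item. A production version would also need error handling and some bound on the number of queued merges, which this sketch leaves out.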
--
This message was sent by Atlassian Jira
(v8.20.10#820010)