[
https://issues.apache.org/jira/browse/UNOMI-853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Francois Gerthoffert updated UNOMI-853:
---------------------------------------
Fix Version/s: unomi-2.6.0
> Adapt migration job to use asynchronous mode avoiding timeout and connection
> lost
> ---------------------------------------------------------------------------------
>
> Key: UNOMI-853
> URL: https://issues.apache.org/jira/browse/UNOMI-853
> Project: Apache Unomi
> Issue Type: Task
> Reporter: Jonathan Sinovassin-Naïk
> Assignee: Jerome Blanchard
> Priority: Major
> Fix For: unomi-2.6.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> h2. Explanation of the issue:
> When Unomi is used with Elastic Cloud, the migration from unomi-1.x to
> unomi-2.x sometimes fails because of the _reindex requests.
> Elastic Cloud closes any _reindex request that takes more than 2 minutes,
> and this timeout cannot be changed.
> Depending on the size of the index, a reindex can take longer than that,
> in which case the connection to Elasticsearch is closed.
> {color:#00875A}Note that the reindex task is still running in background,
> only the connection between unomi and elasticsearch is closed.{color}
> The migration scripts rely on synchronous behaviour: each script waits for
> the _reindex to finish before going to the next step.
> Here is a reindex call that can trigger the issue:
> https://github.com/apache/unomi/blob/d4f4ccdeb03acfb0493228559dc4d203e1ef7319/tools/shell-commands/src/main/resources/META-INF/cxs/migration/migrate-2.0.0-15-eventsReindex.groovy#L37
> h2. Proposed solution:
> Change the _reindex request to use the parameter wait_for_completion=false.
> Elasticsearch will then execute the reindex operation asynchronously and
> immediately return a response containing the task information, instead of
> waiting for the operation to complete.
> With the returned task id, we can call the _tasks endpoint as follows:
> {code:java}
> GET _tasks/<task_id>
> {code}
> and poll until the task is reported as completed before going to the next
> step.
> *Note: each possible outcome (success, failure, etc.) must be handled.*
> This way the synchronous behaviour will be implemented directly in the
> scripts.
> *Any other solutions are welcome*
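
The approach described in the issue (start the reindex with wait_for_completion=false, then poll GET _tasks/<task_id>) could look roughly like the sketch below. It is written in Python for illustration only and is not the actual Unomi migration code, which is Groovy. The function names (http_json, start_reindex, wait_for_task), the polling interval and the retry limit are all hypothetical. It assumes the task response exposes "completed", "response" and "error" fields, and that a reindex response lists per-document problems under "failures".

```python
import json
import time
import urllib.request


def http_json(method, url, body=None):
    """Send a JSON request and return the decoded JSON response (illustrative helper)."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def start_reindex(es_url, source, dest, request=http_json):
    """Start an asynchronous reindex and return the task id.

    With wait_for_completion=false the call returns immediately,
    so it is not affected by the 2-minute connection timeout.
    """
    resp = request(
        "POST",
        f"{es_url}/_reindex?wait_for_completion=false",
        {"source": {"index": source}, "dest": {"index": dest}},
    )
    return resp["task"]


def wait_for_task(es_url, task_id, request=http_json,
                  poll_seconds=5.0, max_polls=720, sleep=time.sleep):
    """Poll GET _tasks/<task_id> until the task completes.

    Returns the reindex response on success, raises RuntimeError if the
    task reports an error or document failures, and raises TimeoutError
    if the task is still running after max_polls attempts.
    """
    for _ in range(max_polls):
        status = request("GET", f"{es_url}/_tasks/{task_id}")
        if status.get("completed"):
            if "error" in status:
                raise RuntimeError(f"task {task_id} failed: {status['error']}")
            response = status.get("response", {})
            failures = response.get("failures") or []
            if failures:
                raise RuntimeError(f"task {task_id} had failures: {failures}")
            return response
        sleep(poll_seconds)
    raise TimeoutError(f"task {task_id} did not complete after {max_polls} polls")
```

A migration step would then call start_reindex(...) followed by wait_for_task(...) and only move on once the latter returns. Each poll is a short request, so no single connection stays open long enough to hit the timeout. The request and sleep parameters are injectable only to make the sketch testable without a cluster.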
--
This message was sent by Atlassian Jira
(v8.20.10#820010)