[jira] [Created] (UNOMI-861) CLONE - Adapt migration job to use asynchronous mode avoiding timeout and connection lost

Jira Wed, 23 Oct 2024 08:16:07 -0700

Jonathan Sinovassin-Naïk created UNOMI-861:
----------------------------------------------

Summary: CLONE - Adapt migration job to use asynchronous mode
avoiding timeout and connection lost
Key: UNOMI-861
URL: https://issues.apache.org/jira/browse/UNOMI-861
Project: Apache Unomi
Issue Type: Task
Reporter: Jonathan Sinovassin-Naïk
Assignee: Jerome Blanchard
Fix For: unomi-2.6.0

h2. Explanation of the issue:

When Unomi is used with Elastic Cloud, the migration from unomi-1.x to
unomi-2.x sometimes fails because of the _reindex requests.

A timeout closes any _reindex request that takes more than 2 minutes. This
timeout cannot be changed in Elastic Cloud.
Depending on the size of the index, a reindex can take quite a long time,
in which case the connection to Elasticsearch is closed.
{color:#00875A}Note that the reindex task keeps running in the background; only
the connection between Unomi and Elasticsearch is closed.{color}

The migration scripts rely on synchronous behaviour: each script waits for
the _reindex to finish before moving on to the next step.

Here is an example of a reindex call that can trigger the issue:
https://github.com/apache/unomi/blob/d4f4ccdeb03acfb0493228559dc4d203e1ef7319/tools/shell-commands/src/main/resources/META-INF/cxs/migration/migrate-2.0.0-15-eventsReindex.groovy#L37

h2. Solutions to fix:

Change the _reindex request to use the parameter wait_for_completion=false.
Elasticsearch will then execute the reindex operation asynchronously and
immediately return a response containing the task information, instead of
waiting for the operation to complete.
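
As an illustration of the change described above, here is a minimal Python sketch (not the Groovy used by the migration scripts) that submits a reindex with wait_for_completion=false and reads the task id from the reply. The function names, the base URL and the index names are hypothetical, and authentication (which Elastic Cloud would require) is left out. It assumes the asynchronous reply carries the id in a top-level "task" field.

```python
import json
import urllib.request


def build_reindex_request(base_url, source_index, dest_index):
    """Build a POST _reindex request that asks Elasticsearch not to wait."""
    body = json.dumps({
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_reindex?wait_for_completion=false",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_task_id(response_body):
    """Return the task id from the asynchronous _reindex response body.

    Assumption: the reply looks like {"task": "<node_id>:<task_number>"}.
    """
    payload = json.loads(response_body)
    task_id = payload.get("task")
    if not task_id:
        raise ValueError("no task id in _reindex response: %r" % (payload,))
    return task_id


def start_reindex(base_url, source_index, dest_index, send=urllib.request.urlopen):
    """Submit the reindex and return the task id.

    `send` is injectable so the function can be exercised without a cluster.
    """
    request = build_reindex_request(base_url, source_index, dest_index)
    with send(request) as response:
        return extract_task_id(response.read().decode("utf-8"))
```

Because the request returns as soon as the task is registered, it should stay well under the 2 minute limit regardless of the index size.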

With the returned task id, we can call the _tasks endpoint as follows:
{code:java}
GET _tasks/<task_id>
{code}
and poll until the task is reported as completed before going to the next
step.
*Note: every possible outcome (success, failure, etc.) must be handled.*

This way the synchronous behaviour will be implemented directly in the scripts.
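
The polling loop could look roughly like the following Python sketch (again illustrative, not the Groovy of the actual scripts). The helper names, the poll interval and the poll limit are hypothetical. It assumes the _tasks reply exposes a boolean "completed" field plus either an "error" object or a "response" object whose "failures" list is empty on success, which should be checked against the Elasticsearch version in use.

```python
import json
import time
import urllib.request


def fetch_task(base_url, task_id):
    """GET _tasks/<task_id> and return the decoded JSON body."""
    url = base_url.rstrip("/") + "/_tasks/" + task_id
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_task(task_id, fetch, poll_seconds=5.0, max_polls=720, sleep=time.sleep):
    """Poll until the task completes; return its response or raise on failure.

    `fetch` is any callable that maps a task id to the decoded _tasks reply.
    """
    for _ in range(max_polls):
        status = fetch(task_id)
        if not status.get("completed"):
            # Still running: each poll is a short request, so no timeout risk.
            sleep(poll_seconds)
            continue
        if "error" in status:
            raise RuntimeError("task %s failed: %s" % (task_id, status["error"]))
        response = status.get("response", {})
        if response.get("failures"):
            raise RuntimeError(
                "task %s reported failures: %s" % (task_id, response["failures"])
            )
        return response
    raise TimeoutError("task %s still running after %d polls" % (task_id, max_polls))
```

A caller would pass something like `lambda t: fetch_task("http://localhost:9200", t)` as `fetch`. Raising on both "error" and non-empty "failures" keeps the migration from moving to the next step after a partial reindex.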

*Any other solutions are welcome*

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
