[jira] [Updated] (UNOMI-430) Make Unomi batchProfilesUpdate use ES scroll query

Serge Huber (Jira) Mon, 15 Feb 2021 06:16:34 -0800


     [ 
https://issues.apache.org/jira/browse/UNOMI-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Serge Huber updated UNOMI-430:
------------------------------
    Description: 
*As a developer* 
*I want to ensure good performance when calling batchProfilesUpdate, described 
here:* https://unomi.incubator.apache.org/rest-api-doc/#-244007327

h3. Acceptance criteria

When I call batchProfilesUpdate
Then an Elasticsearch scroll query should be used to ensure good performance

When I call batchProfilesUpdate 
Then I should be able to configure the window size (1000) and the duration of 
the scroll validity
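
One possible way to meet the second criterion is sketched below, assuming the two values are read from a `java.util.Properties` object. The `ScrollSettings` class and both property keys are hypothetical names chosen for illustration, not existing Unomi settings, and the fallback values (1000 and "10m") are simply the ones mentioned in this issue.

```java
import java.util.Properties;

// Hypothetical settings holder; the property keys below are illustrative only
// and are not existing Unomi configuration keys.
class ScrollSettings {
    final int windowSize;
    final String scrollTimeValidity;

    ScrollSettings(int windowSize, String scrollTimeValidity) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive: " + windowSize);
        }
        this.windowSize = windowSize;
        this.scrollTimeValidity = scrollTimeValidity;
    }

    /** Reads both values from the given properties, falling back to 1000 and "10m". */
    static ScrollSettings from(Properties props) {
        int size = Integer.parseInt(
                props.getProperty("batchUpdate.scrollWindowSize", "1000").trim());
        String validity = props.getProperty("batchUpdate.scrollTimeValidity", "10m").trim();
        return new ScrollSettings(size, validity);
    }
}
```

The resulting `windowSize` and `scrollTimeValidity` would then replace the hard-coded window size and scroll duration in the scroll query call.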


h3. Designer notes
 
 
h3. Developer notes
This method

{code:java}
    public void batchProfilesUpdate(BatchUpdate update) {
        ParserHelper.resolveConditionType(definitionsService, update.getCondition());
        // Loads every matching profile into memory at once.
        List<Profile> profiles =
                persistenceService.query(update.getCondition(), null, Profile.class);

        for (Profile profile : profiles) {
            if (PropertyHelper.setProperty(profile, update.getPropertyName(),
                    update.getPropertyValue(), update.getStrategy())) {
                save(profile);
            }
        }
    }
{code}

should be updated to something like:

{code:java}
    public void batchProfilesUpdate(BatchUpdate update) {
        ParserHelper.resolveConditionType(definitionsService, update.getCondition());
        // Open a scroll query: 1000 profiles per window, scroll valid for 10 minutes.
        PartialList<Profile> profiles = persistenceService.query(
                update.getCondition(), null, Profile.class, 0, 1000, "10m");

        // Stop when the scroll is exhausted (null or empty window).
        while (profiles != null && !profiles.getList().isEmpty()) {
            for (Profile profile : profiles.getList()) {
                if (PropertyHelper.setProperty(profile, update.getPropertyName(),
                        update.getPropertyValue(), update.getStrategy())) {
                    save(profile);
                }
            }
            // Fetch the next window of profiles.
            profiles = persistenceService.continueScrollQuery(Profile.class,
                    profiles.getScrollIdentifier(), profiles.getScrollTimeValidity());
        }
    }
{code}

This change is needed because, in the existing version of this method, if the 
condition matches a large number of profiles they are all loaded into memory, 
which can be a (big) problem. For example, if we request all the profiles from 
a set of 20 million profiles, all of those profiles are loaded into memory. By 
switching to scroll queries, only the current "window" of profiles is loaded 
into memory.
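
To illustrate the memory argument, here is a minimal, self-contained sketch of the windowed loop. It does not use the Unomi API: `ScrollWindowSketch`, `FakeStore`, `Page`, `startScroll`, `continueScroll` and `batchUpdate` are hypothetical stand-ins, and the in-memory store only mimics the shape of a scroll query. The point is that the loop never holds more than one window of profiles at a time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; none of these types are Unomi classes.
class ScrollWindowSketch {

    /** One window of results plus the identifier needed to fetch the next one. */
    static final class Page {
        final List<Map<String, Object>> items;
        final String scrollId;

        Page(List<Map<String, Object>> items, String scrollId) {
            this.items = items;
            this.scrollId = scrollId;
        }
    }

    /** In-memory store that hands out its profiles one window at a time. */
    static final class FakeStore {
        private final List<Map<String, Object>> profiles = new ArrayList<>();
        // scrollId -> {next offset, window size}
        private final Map<String, int[]> cursors = new HashMap<>();
        private int nextScrollId = 0;

        FakeStore(int profileCount) {
            for (int i = 0; i < profileCount; i++) {
                Map<String, Object> profile = new HashMap<>();
                profile.put("id", i);
                profiles.add(profile);
            }
        }

        /** Opens a scroll and returns the first window. */
        Page startScroll(int windowSize) {
            String scrollId = "scroll-" + nextScrollId++;
            cursors.put(scrollId, new int[] {0, windowSize});
            return continueScroll(scrollId);
        }

        /** Returns the next window; it is empty once the scroll is exhausted. */
        Page continueScroll(String scrollId) {
            int[] cursor = cursors.get(scrollId);
            int from = cursor[0];
            int to = Math.min(from + cursor[1], profiles.size());
            cursor[0] = to;
            return new Page(new ArrayList<>(profiles.subList(from, to)), scrollId);
        }
    }

    /** Sets a property on every profile; returns {updated count, largest window held}. */
    static int[] batchUpdate(FakeStore store, String name, Object value, int windowSize) {
        int updated = 0;
        int largestWindow = 0;
        Page page = store.startScroll(windowSize);
        while (!page.items.isEmpty()) {
            largestWindow = Math.max(largestWindow, page.items.size());
            for (Map<String, Object> profile : page.items) {
                // Count the profile only if the stored value actually changed.
                if (!value.equals(profile.put(name, value))) {
                    updated++;
                }
            }
            page = store.continueScroll(page.scrollId);
        }
        return new int[] {updated, largestWindow};
    }

    public static void main(String[] args) {
        int[] result = batchUpdate(new FakeStore(2500), "segment", "vip", 1000);
        System.out.println("updated=" + result[0] + ", largestWindow=" + result[1]);
    }
}
```

With 2500 profiles and a window of 1000, the loop processes windows of 1000, 1000 and 500 profiles, so the largest list held at once is 1000 rather than 2500.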

Integration tests to validate this change should also be added.

  was:
*As a developer* 
*I want to ensure good performance when calling batchProfilesUpdate, described 
here:* https://unomi.incubator.apache.org/rest-api-doc/#-244007327

h3. Acceptance criteria

When I call batchProfilesUpdate
Then an Elasticsearch scroll query should be used to ensure good performance

When I call batchProfilesUpdate 
Then I should be able to configure the window size (1000) and the duration of 
the scroll validity


h3. Designer notes
 
 
h3. Developer notes
This method

{code:java}
    public void batchProfilesUpdate(BatchUpdate update) {
        ParserHelper.resolveConditionType(definitionsService, update.getCondition());
        // Loads every matching profile into memory at once.
        List<Profile> profiles =
                persistenceService.query(update.getCondition(), null, Profile.class);

        for (Profile profile : profiles) {
            if (PropertyHelper.setProperty(profile, update.getPropertyName(),
                    update.getPropertyValue(), update.getStrategy())) {
                save(profile);
            }
        }
    }
{code}

should be updated to something like:

{code:java}
    public void batchProfilesUpdate(BatchUpdate update) {
        ParserHelper.resolveConditionType(definitionsService, update.getCondition());
        // Open a scroll query: 1000 profiles per window, scroll valid for 10 minutes.
        PartialList<Profile> profiles = persistenceService.query(
                update.getCondition(), null, Profile.class, 0, 1000, "10m");

        // Stop when the scroll is exhausted (null or empty window).
        while (profiles != null && !profiles.getList().isEmpty()) {
            for (Profile profile : profiles.getList()) {
                if (PropertyHelper.setProperty(profile, update.getPropertyName(),
                        update.getPropertyValue(), update.getStrategy())) {
                    save(profile);
                }
            }
            // Fetch the next window of profiles.
            profiles = persistenceService.continueScrollQuery(Profile.class,
                    profiles.getScrollIdentifier(), profiles.getScrollTimeValidity());
        }
    }
{code}

Integration tests to validate this change should also be added.


> Make Unomi batchProfilesUpdate use ES scroll query
> --------------------------------------------------
>
>                 Key: UNOMI-430
>                 URL: https://issues.apache.org/jira/browse/UNOMI-430
>             Project: Apache Unomi
>          Issue Type: Improvement
>            Reporter: romain.gauthier
>            Priority: Major
>             Fix For: 2.0.0
>
>
> *As a developer* 
> *I want to ensure good performance when calling batchProfilesUpdate, 
> described here:* https://unomi.incubator.apache.org/rest-api-doc/#-244007327
> h3. Acceptance criteria
> When I call batchProfilesUpdate
> Then an Elasticsearch scroll query should be used to ensure good performance
> When I call batchProfilesUpdate 
> Then I should be able to configure the window size (1000) and the duration of 
> the scroll validity
> h3. Designer notes
>  
>  
> h3. Developer notes
> This method
> {code:java}
>     public void batchProfilesUpdate(BatchUpdate update) {
>         ParserHelper.resolveConditionType(definitionsService, update.getCondition());
>         // Loads every matching profile into memory at once.
>         List<Profile> profiles =
>                 persistenceService.query(update.getCondition(), null, Profile.class);
>         for (Profile profile : profiles) {
>             if (PropertyHelper.setProperty(profile, update.getPropertyName(),
>                     update.getPropertyValue(), update.getStrategy())) {
>                 save(profile);
>             }
>         }
>     }
> {code}
> should be updated to something like:
> {code:java}
>     public void batchProfilesUpdate(BatchUpdate update) {
>         ParserHelper.resolveConditionType(definitionsService, update.getCondition());
>         // Open a scroll query: 1000 profiles per window, scroll valid for 10 minutes.
>         PartialList<Profile> profiles = persistenceService.query(
>                 update.getCondition(), null, Profile.class, 0, 1000, "10m");
>         // Stop when the scroll is exhausted (null or empty window).
>         while (profiles != null && !profiles.getList().isEmpty()) {
>             for (Profile profile : profiles.getList()) {
>                 if (PropertyHelper.setProperty(profile, update.getPropertyName(),
>                         update.getPropertyValue(), update.getStrategy())) {
>                     save(profile);
>                 }
>             }
>             // Fetch the next window of profiles.
>             profiles = persistenceService.continueScrollQuery(Profile.class,
>                     profiles.getScrollIdentifier(), profiles.getScrollTimeValidity());
>         }
>     }
> {code}
> This change is needed because, in the existing version of this method, if 
> the condition matches a large number of profiles they are all loaded into 
> memory, which can be a (big) problem. For example, if we request all the 
> profiles from a set of 20 million profiles, all of those profiles are loaded 
> into memory. By switching to scroll queries, only the current "window" of 
> profiles is loaded into memory.
> Integration tests to validate this change should also be added.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
