[ https://issues.apache.org/jira/browse/IGNITE-27871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Oleg Valuyskiy updated IGNITE-27871:
------------------------------------
Summary: High latency for locally deployed tasks when
peerClassLoadingEnabled=true (deployment lookup contention) (was: Local
deployment cache miss when peerClassLoadingEnabled=true leads to repeated
synchronized deploy() calls)
> High latency for locally deployed tasks when peerClassLoadingEnabled=true
> (deployment lookup contention)
> --------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-27871
> URL: https://issues.apache.org/jira/browse/IGNITE-27871
> Project: Ignite
> Issue Type: Bug
> Reporter: Oleg Valuyskiy
> Assignee: Oleg Valuyskiy
> Priority: Major
> Labels: ise
>
> When a Compute task is executed from a thin client and the task class is
> available in the node classpath (e.g. placed in libs directory),
> *GridDeploymentManager#getLocalDeployment* creates *GridDeploymentMetadata*
> without *classLoader* and *classLoaderId*.
> *GridDeploymentLocalStore#getDeployment(meta)* first attempts to find an
> existing deployment via *deployment(meta)*. However, *deployment(meta)*
> matches cached deployments only if:
> {code:java}
> dep.classLoaderId() == meta.classLoaderId() || dep.classLoader() == meta.classLoader()
> {code}
> Since both *meta.classLoader* and *meta.classLoaderId* are null, the cached
> local deployment can never be matched.
> As a result, *GridDeploymentLocalStore#deploy(...)* is invoked on every task
> execution. This method is synchronized and performs additional lookup and
> bookkeeping logic, which introduces unnecessary contention and latency under
> high load.
> The issue is reproducible with:
> * peerClassLoadingEnabled = true
> * task class present in node classpath (libs)
> * thin client executing the same task repeatedly by name
> By contrast, when *peerClassLoadingEnabled=false*, *GridDeploymentManager*
> initializes *locDep* and reuses it directly, bypassing
> *GridDeploymentLocalStore*, which avoids this problem.
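
The lookup path described in the issue can be illustrated with a small standalone model. This is a sketch only: Meta, Deployment, LocalStore and countDeployCalls are hypothetical stand-ins, not Ignite classes, and the matching rule is taken from the snippet quoted in the description (with Objects.equals for the ID comparison) rather than from the Ignite source. It counts how often the synchronized deploy() path is reached when the metadata carries no class loader information and when it does.

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

public class DeploymentLookupSketch {
    /** Hypothetical stand-in for GridDeploymentMetadata. */
    static final class Meta {
        final String className;
        final ClassLoader classLoader; // null when the caller does not set it
        final String classLoaderId;    // null when the caller does not set it

        Meta(String className, ClassLoader classLoader, String classLoaderId) {
            this.className = className;
            this.classLoader = classLoader;
            this.classLoaderId = classLoaderId;
        }
    }

    /** Hypothetical stand-in for a cached local deployment. */
    static final class Deployment {
        final String className;
        final ClassLoader classLoader;
        final String classLoaderId;

        Deployment(String className, ClassLoader classLoader, String classLoaderId) {
            this.className = className;
            this.classLoader = classLoader;
            this.classLoaderId = classLoaderId;
        }
    }

    /** Simplified model of the local store (assumption, not Ignite code). */
    static final class LocalStore {
        private final List<Deployment> cache = new CopyOnWriteArrayList<>();
        final AtomicInteger deployCalls = new AtomicInteger();

        /** Unsynchronized lookup using the matching rule quoted in the issue. */
        Deployment deployment(Meta meta) {
            for (Deployment dep : cache) {
                if (Objects.equals(dep.classLoaderId, meta.classLoaderId)
                        || dep.classLoader == meta.classLoader)
                    return dep;
            }
            return null; // always reached when both meta fields are null
        }

        /** Slow path: every caller serializes on the store monitor. */
        synchronized Deployment deploy(Meta meta) {
            deployCalls.incrementAndGet();
            for (Deployment dep : cache) {
                if (dep.className.equals(meta.className))
                    return dep; // already deployed, but the lock was still taken
            }
            Deployment dep = new Deployment(
                meta.className, ClassLoader.getSystemClassLoader(), "ldr-" + cache.size());
            cache.add(dep);
            return dep;
        }

        /** Tries the cached lookup first and falls back to deploy(). */
        Deployment getDeployment(Meta meta) {
            Deployment dep = deployment(meta);
            return dep != null ? dep : deploy(meta);
        }
    }

    /** Runs the given number of lookups and returns how many reached deploy(). */
    static int countDeployCalls(boolean populateMeta, int runs) {
        LocalStore store = new LocalStore();
        // Warm-up call creates the cached deployment; it is not counted.
        Deployment first = store.getDeployment(new Meta("MyTask", null, null));
        store.deployCalls.set(0);

        for (int i = 0; i < runs; i++) {
            Meta meta = populateMeta
                ? new Meta("MyTask", first.classLoader, first.classLoaderId)
                : new Meta("MyTask", null, null);
            store.getDeployment(meta);
        }
        return store.deployCalls.get();
    }

    public static void main(String[] args) {
        System.out.println("null meta:      " + countDeployCalls(false, 1000));
        System.out.println("populated meta: " + countDeployCalls(true, 1000));
    }
}
```

In this model, 1000 lookups with null class loader fields reach deploy() 1000 times, while lookups that carry the cached loader and ID reach it 0 times. The populated-metadata case is included only as a contrast; it is not necessarily how the issue will be resolved.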