[
https://issues.apache.org/jira/browse/IGNITE-17964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aleksey Plekhanov updated IGNITE-17964:
---------------------------------------
Release Note: Fixed potential deadlock in discovery thread while updating
SQL statistics
> Potential deadlock in discovery thread while updating SQL statistics.
> ---------------------------------------------------------------------
>
> Key: IGNITE-17964
> URL: https://issues.apache.org/jira/browse/IGNITE-17964
> Project: Ignite
> Issue Type: Bug
> Components: sql
> Reporter: Andrey Mashenkov
> Assignee: Andrey Mashenkov
> Priority: Major
> Fix For: 2.15
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> On node start/activation, IgniteStatisticsConfigurationManager initializes and
> tries to clean up orphaned records (e.g. for tables that were dropped before
> node stop/crash).
> To do that, the *stat-mgmt* thread updates the distributed metastorage
> synchronously under the read-lock.
> Underneath, the metastorage sends a request via discovery. The discovery
> component then receives the answer to that message and gets stuck trying to
> acquire the write-lock to complete the future.
> As a result, the *stat-mgmt* and *disco-notify* threads inevitably deadlock.
> We should avoid any synchronous operation on the distributed metastorage under
> the read-lock.
> Let’s replace the synchronous CAS deep inside the closure (see
> IgniteStatisticsConfigurationManager.updateLocalStatistics) with an async CAS and
> pull its future up, outside both the closure and the read-lock.
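
The proposed fix can be illustrated with a minimal, self-contained sketch. It is not the actual Ignite code: the class AsyncCasSketch, the method compareAndSetAsync, the single-thread executor standing in for the *disco-notify* thread, and the AtomicReference standing in for the metastorage are all hypothetical names chosen for illustration. The only point is the ordering: the async CAS is started under the read-lock, but its future is awaited only after the read-lock is released.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class AsyncCasSketch {
    /** Hypothetical stand-in for the manager's read/write lock. */
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Hypothetical stand-in for the disco-notify thread. */
    private final ExecutorService discoNotify = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "disco-notify");
        t.setDaemon(true);
        return t;
    });

    /** Hypothetical stand-in for a single metastorage record. */
    private final AtomicReference<String> metastorage = new AtomicReference<>("orphaned");

    /**
     * Assumed async CAS: the reply is handled on disco-notify, which must
     * take the write-lock before it can complete the future.
     */
    CompletableFuture<Boolean> compareAndSetAsync(String expected, String updated) {
        CompletableFuture<Boolean> fut = new CompletableFuture<>();
        discoNotify.submit(() -> {
            lock.writeLock().lock();
            try {
                fut.complete(metastorage.compareAndSet(expected, updated));
            } finally {
                lock.writeLock().unlock();
            }
        });
        return fut;
    }

    /** Starts the CAS under the read-lock, waits for it outside the lock. */
    boolean cleanupOrphanedRecords() throws Exception {
        CompletableFuture<Boolean> fut;
        lock.readLock().lock();
        try {
            // Only start the operation here; do not block on the result.
            fut = compareAndSetAsync("orphaned", "cleaned");
        } finally {
            lock.readLock().unlock();
        }
        // The read-lock is released, so disco-notify can take the write-lock.
        return fut.get(5, TimeUnit.SECONDS);
    }

    /** Runs the sketch once and reports whether the CAS was applied. */
    public static boolean run() throws Exception {
        AsyncCasSketch sketch = new AsyncCasSketch();
        try {
            return sketch.cleanupOrphanedRecords();
        } finally {
            sketch.discoNotify.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("CAS applied: " + run());
    }
}
```

In this sketch, moving the fut.get(...) call inside the read-lock block reproduces the reported problem: the simulated *disco-notify* thread blocks on the write-lock, the future is never completed while the read-lock is held, and the wait ends with a TimeoutException.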
--
This message was sent by Atlassian Jira
(v8.20.10#820010)