[
https://issues.apache.org/jira/browse/IGNITE-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladimir Ozerov updated IGNITE-1697:
------------------------------------
Fix Version/s: (was: 1.5)
1.6
> IGFS: implement reliable Igfs failover logic
> ---------------------------------------------
>
> Key: IGNITE-1697
> URL: https://issues.apache.org/jira/browse/IGNITE-1697
> Project: Ignite
> Issue Type: Bug
> Reporter: Ivan Veselovsky
> Assignee: Vladimir Ozerov
> Fix For: 1.6
>
>
> Problems to solve:
> 1) Currently, a write lock on a file may stay held forever if the node that
> took the lock then crashes.
> 2) Currently, blocks of file content are not written as plain
> dataCache.put() operations, but are sent using ad-hoc async messages. This was
> done earlier to improve performance. However, to implement reliable
> failover we need to drop that mechanism and use simple put() or asyncPut() cache
> operations.
> Solution plan:
> 1) Use async put to write file data blocks.
> 2) Perform writes using the scheme "lock" -> "reserve space" -> "write" -> "commit"
> -> "release lock".
> 3) The id of the node that locked a file should be readable from the lock id.
> 4) When taking a file lock, the following procedure should be performed:
> if the file is already locked, take the id of the node that locked it and
> ask DiscoveryProcessor whether that node is alive. If it is not (the node has
> left the topology), perform the cleanup procedure: delete all data blocks in the
> reserved data range, then delete the lock.
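
The solution plan above can be sketched roughly as follows. This is only an illustration in plain Java, not Ignite code: the class IgfsLockSketch, its LockId and FileInfo types, the aliveNodes set (a stand-in for asking DiscoveryProcessor) and the in-memory maps (stand-ins for the meta and data caches) are all hypothetical names, and everything runs in a single thread without the async put the plan calls for.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Hypothetical single-process model of the plan; none of these names are Ignite APIs. */
class IgfsLockSketch {
    /** Lock id that carries the id of the node that took the lock (plan item 3). */
    static final class LockId {
        final UUID nodeId;
        final long seq;

        LockId(UUID nodeId, long seq) {
            this.nodeId = nodeId;
            this.seq = seq;
        }
    }

    /** Per-file state: current lock (or null) and the block range [committed, reservedTo). */
    static final class FileInfo {
        LockId lock;
        long committed;
        long reservedTo;
    }

    final Set<UUID> aliveNodes = new HashSet<>();          // stand-in for discovery
    final Map<String, FileInfo> meta = new HashMap<>();    // stand-in for the meta cache
    final Map<String, byte[]> dataCache = new HashMap<>(); // stand-in for the data cache
    private long seq;

    static String blockKey(String path, long block) {
        return path + "#" + block;
    }

    /** Takes the lock, or returns null if a live node holds it (plan item 4). */
    LockId tryLock(String path, UUID nodeId) {
        FileInfo info = meta.computeIfAbsent(path, p -> new FileInfo());
        if (info.lock != null) {
            if (aliveNodes.contains(info.lock.nodeId))
                return null; // owner is alive: the lock is valid
            // Owner has left the topology: drop its reserved blocks, then its lock.
            discardReserved(path, info);
            info.lock = null;
        }
        info.lock = new LockId(nodeId, ++seq);
        return info.lock;
    }

    /** Reserves the next {@code blocks} block indexes for the lock holder. */
    void reserve(String path, LockId lock, long blocks) {
        FileInfo info = owned(path, lock);
        info.reservedTo = info.committed + blocks;
    }

    /** Writes one block with a plain put; only reserved block indexes are allowed. */
    void write(String path, LockId lock, long block, byte[] data) {
        FileInfo info = owned(path, lock);
        if (block < info.committed || block >= info.reservedTo)
            throw new IllegalArgumentException("block not reserved: " + block);
        dataCache.put(blockKey(path, block), data);
    }

    /** Makes the reserved range part of the file. */
    void commit(String path, LockId lock) {
        FileInfo info = owned(path, lock);
        info.committed = info.reservedTo;
    }

    /** Releases the lock; anything reserved but not committed is discarded. */
    void unlock(String path, LockId lock) {
        FileInfo info = owned(path, lock);
        discardReserved(path, info);
        info.lock = null;
    }

    private void discardReserved(String path, FileInfo info) {
        for (long b = info.committed; b < info.reservedTo; b++)
            dataCache.remove(blockKey(path, b));
        info.reservedTo = info.committed;
    }

    private FileInfo owned(String path, LockId lock) {
        FileInfo info = meta.get(path);
        if (info == null || info.lock != lock)
            throw new IllegalStateException("lock not held: " + path);
        return info;
    }
}
```

In this model a writer calls tryLock, reserve, write, commit and unlock in that order. If the writer's node id disappears from aliveNodes before commit, the next tryLock from another node removes the uncommitted blocks and takes over the lock, while blocks committed earlier are left in place.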
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)