[jira] [Commented] (IGNITE-6832) handle IO errors while checkpointing

2018-01-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16337725#comment-16337725
 ] 

ASF GitHub Bot commented on IGNITE-6832:


GitHub user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/3394


> handle IO errors while checkpointing
>
>
> Key: IGNITE-6832
> URL: https://issues.apache.org/jira/browse/IGNITE-6832
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.1
> Reporter: Alexander Belyak
> Assignee: Alexey Goncharuk
> Priority: Major
> Fix For: 2.4
>
>
> If an IO error (such as "No space left on device") occurs during checkpointing
> (GridCacheDatabaseSharedManager$WriteCheckpointPages:2509), the node does not
> stop, as it does when the same error occurs while writing the WAL, and clients
> will get "Long running cache futures". We must stop the node in this case!
> Better still, add an internal health check and stop the node anyway if it
> fails several times (to be done in a separate issue).
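
The health check proposed at the end of the description could look roughly like the Java sketch below. It is an illustration only, not code from Ignite or from the linked pull request: the class name NodeHealthCheck, the probe, the failure threshold, and the stop action are all hypothetical, and "won't pass for few times" is read here as "fails several times in a row".

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch: stops the node after N consecutive failed probes. Not an Ignite API. */
final class NodeHealthCheck {
    /** Assumed probe, e.g. "can a small file still be written to the storage directory?". */
    private final BooleanSupplier probe;

    /** Number of consecutive failures tolerated before the node is stopped (assumption). */
    private final int maxFailures;

    /** Assumed stop action supplied by the caller. */
    private final Runnable stopNode;

    private int consecutiveFailures;
    private boolean stopped;

    NodeHealthCheck(BooleanSupplier probe, int maxFailures, Runnable stopNode) {
        if (maxFailures < 1)
            throw new IllegalArgumentException("maxFailures must be >= 1");

        this.probe = probe;
        this.maxFailures = maxFailures;
        this.stopNode = stopNode;
    }

    /** Runs one probe; meant to be called periodically. Returns true once a stop was requested. */
    synchronized boolean runOnce() {
        if (stopped)
            return true;

        boolean ok;

        try {
            ok = probe.getAsBoolean();
        }
        catch (RuntimeException e) {
            ok = false; // A probe that throws is counted as a failed check.
        }

        if (ok) {
            consecutiveFailures = 0; // Any success resets the counter.

            return false;
        }

        if (++consecutiveFailures >= maxFailures) {
            stopped = true;

            stopNode.run(); // Invoked at most once.
        }

        return stopped;
    }
}
```

A scheduler (for example a single-threaded ScheduledExecutorService) would call runOnce() at a fixed interval; counting only consecutive failures keeps a single transient error from stopping the node.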



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-6832) handle IO errors while checkpointing

2018-01-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328809#comment-16328809
 ] 

ASF GitHub Bot commented on IGNITE-6832:


GitHub user Jokser opened a pull request:

https://github.com/apache/ignite/pull/3394

IGNITE-6832 Proper handling LFS and WAL persistence errors.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-6832

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/3394.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3394


commit ec62923d7ea35ca2cf6bc0030de628f9ad9872e2
Author: Jokser 
Date:   2018-01-17T14:03:58Z

IGNITE-6832 Proper handling LFS and WAL persistence errors.




> handle IO errors while checkpointing
>
>
> Key: IGNITE-6832
> URL: https://issues.apache.org/jira/browse/IGNITE-6832
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.1
> Reporter: Alexander Belyak
> Assignee: Pavel Kovalenko
> Priority: Major
>
> If an IO error (such as "No space left on device") occurs during checkpointing
> (GridCacheDatabaseSharedManager$WriteCheckpointPages:2509), the node does not
> stop, as it does when the same error occurs while writing the WAL, and clients
> will get "Long running cache futures". We must stop the node in this case!
> Better still, add an internal health check and stop the node anyway if it
> fails several times (to be done in a separate issue).





[jira] [Commented] (IGNITE-6832) handle IO errors while checkpointing

2018-01-15 Thread Alexey Goncharuk (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326205#comment-16326205
 ] 

Alexey Goncharuk commented on IGNITE-6832:
--

For starters, we need a generic method that checks the environment and that we 
can invoke when an unrecoverable exception occurs.
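
One possible reading of this suggestion, sketched in Java: a single entry point that every persistence code path (checkpointing, WAL writing) calls when it catches an error. Everything here is an assumption for illustration and not Ignite's actual code: the class name CriticalFailureHandler, the rule that an IOException or OutOfMemoryError anywhere in the cause chain counts as unrecoverable, and the stop action passed in by the caller.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of a shared handler for unrecoverable errors. Not an Ignite API. */
final class CriticalFailureHandler {
    /** Assumed stop action supplied by the caller, e.g. a node shutdown routine. */
    private final Runnable stopNode;

    /** Guards the stop action so it runs at most once, even if several threads fail together. */
    private final AtomicBoolean stopped = new AtomicBoolean();

    CriticalFailureHandler(Runnable stopNode) {
        this.stopNode = stopNode;
    }

    /** Returns true if the error was classified as unrecoverable (a stop is then requested). */
    boolean onFailure(Throwable err) {
        if (!isUnrecoverable(err))
            return false;

        if (stopped.compareAndSet(false, true))
            stopNode.run();

        return true;
    }

    /** Assumed classification rule: IO errors and OOM anywhere in the cause chain are fatal. */
    static boolean isUnrecoverable(Throwable err) {
        Throwable t = err;

        // The depth bound protects against cyclic cause chains.
        for (int depth = 0; t != null && depth < 16; depth++, t = t.getCause()) {
            if (t instanceof IOException || t instanceof OutOfMemoryError)
                return true;
        }

        return false;
    }
}
```

With a shared handler like this, the checkpoint writer and the WAL writer would react to the same error in the same way, which is the inconsistency the issue describes.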

> handle IO errors while checkpointing
>
>
> Key: IGNITE-6832
> URL: https://issues.apache.org/jira/browse/IGNITE-6832
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.1
> Reporter: Alexander Belyak
> Priority: Major
>
> If an IO error (such as "No space left on device") occurs during checkpointing
> (GridCacheDatabaseSharedManager$WriteCheckpointPages:2509), the node does not
> stop, as it does when the same error occurs while writing the WAL, and clients
> will get "Long running cache futures". We must stop the node in this case!
> Better still, add an internal health check and stop the node anyway if it
> fails several times (to be done in a separate issue).


