[ 
https://issues.apache.org/jira/browse/ARTEMIS-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033645#comment-17033645
 ] 

Clebert Suconic commented on ARTEMIS-2618:
------------------------------------------

I think it's important to check why you were having the IO Exception on open in 
the first place.

If you really need a retry in the end, we could try to reach someone with kernel 
expertise to validate whether the open(rw) would affect write caches or cause 
other issues.


I wouldn't want to "fix the issue" without knowing the root cause yet.

> Improve Handling of Shutdown on critical I/O Error
> --------------------------------------------------
>
>                 Key: ARTEMIS-2618
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2618
>             Project: ActiveMQ Artemis
>          Issue Type: Improvement
>    Affects Versions: 2.11.0
>            Reporter: Rico Neubauer
>            Priority: Major
>         Attachments: Improve-Handling-of-Shutdown-on-critic.patch
>
>
> Would like to request an improvement in the handling of critical I/O errors 
> on opening journal files.
> If {{org.apache.activemq.artemis.core.io.nio.NIOSequentialFile}} fails to 
> open a journal file, the whole server shuts down with {{@Message(id = 222010, 
> value = "Critical IO Error, shutting down the server. file=1, message=0"}}.
> We have seen this in the wild, where backup software locked the file for a 
> short time while the journal was about to be opened, resulting in the shutdown.
> The proposed improvement would be to have a short-running retry for opening the 
> journal files and only fail fatally if the error persists.
> Will attach a proposal patch. Can also create a PR if you accept.
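
For illustration only, the short-running retry described in the issue could look roughly like the sketch below. This is not the attached patch and not Artemis code: the class `RetryingOpen`, the `Opener` interface, the method names, and the attempt count and delay are all hypothetical, and only standard `java.nio` calls are used. A retry like this can also hide the root cause of the failure, which is the concern raised in the comment above.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Artemis: retries an open a bounded number of times.
final class RetryingOpen {
    private RetryingOpen() {}

    // Hypothetical callback type for any operation that may fail with IOException.
    @FunctionalInterface
    public interface Opener<T> {
        T open() throws IOException;
    }

    // Calls opener up to maxAttempts times, sleeping delayMillis between failed
    // attempts. If every attempt fails, the last IOException is rethrown so the
    // caller can still treat it as a critical I/O error.
    public static <T> T openWithRetry(Opener<T> opener, int maxAttempts, long delayMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return opener.open();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }

    // Example use: open a file read/write, tolerating a short-lived lock.
    // The values 5 and 100 ms are assumptions chosen only for this sketch.
    public static FileChannel openJournalFile(Path file)
            throws IOException, InterruptedException {
        return openWithRetry(
            () -> FileChannel.open(file,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.READ,
                    StandardOpenOption.WRITE),
            5, 100L);
    }
}
```

Keeping the retry bounded and rethrowing the last exception means a persistent failure still reaches the existing critical-error path; only transient failures are absorbed.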



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
