[ 
https://issues.apache.org/jira/browse/LUCENE-8048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247771#comment-16247771
 ] 

Uwe Schindler edited comment on LUCENE-8048 at 11/10/17 4:59 PM:
-----------------------------------------------------------------

The guarantees about fsync are weak anyway (as Robert said), and on top of that, 
fsyncing directory metadata is a hack: Java has no official API for it, so you 
need a trick with a file handle opened on the directory. It works in our 
testing, though ([~mikemccand] had/has a test computer with a remote power 
switch to stress all this for weeks). The directory sync is at least documented 
in the Linux man pages; for other operating systems it is not defined (there is 
no POSIX standard for it).
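
To make the file-handle trick concrete, here is a minimal sketch, not the actual Lucene code. The class and method names (DirFsync, fsyncDirectory) are made up for illustration; only FileChannel.open and FileChannel.force are real JDK calls. It assumes a platform where a directory can be opened for reading; where it cannot (Windows, for example), the open fails with an IOException and the method just reports that.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

final class DirFsync {

    /**
     * Tries to fsync a directory by opening it like a file (hypothetical helper).
     * Returns true if the force() call completed, false if the platform or the
     * path does not allow it. A real implementation would distinguish
     * "unsupported here" from a genuine I/O error instead of hiding both.
     */
    static boolean fsyncDirectory(Path dir) {
        // A directory cannot be opened for WRITE, so READ is used here.
        try (FileChannel channel = FileChannel.open(dir, StandardOpenOption.READ)) {
            channel.force(true); // true = also flush metadata
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("directory fsync succeeded: " + fsyncDirectory(dir));
    }
}
```

Whether force() on such a handle really reaches the disk is exactly the part that the file system and OS decide, as discussed below.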

In short:
- On Linux, the fsync on the directory really works, but we only know this for 
the usual file systems (ext4 and xfs, I think).
- In addition, because atomic rename is such a common way to commit things in 
the Unix world, the kernel already does the right thing: if it sees an atomic 
rename, it automatically fsyncs the directory under certain conditions (read the 
source code). Robert, is this right? It's been a long time since I last looked 
at that code!
- On MacOSX/Solaris the same applies as on Linux, except that the kernel does 
not have this automatism. And we don't know whether fsyncing the directory is 
really done for all file systems. The man page does not say anything and POSIX 
does not define it.
- On Windows, the fsync on a directory does not work at all (it is a no-op in 
Lucene: we have a try-catch around it with an assertion on Windows in the 
exception block). But Windows file systems guarantee that after the atomic 
rename the directory is in a consistent state (it's documented). 
Happens-before also works.
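
For reference, the commit pattern discussed in the points above (write, fsync the file, atomic rename, then fsync the directory) could look roughly like this. It is a simplified sketch under assumptions, not Lucene's implementation: AtomicCommit, commit and the file names are hypothetical, and the directory fsync is treated as best effort, so every IOException from it is swallowed, which a real implementation should not do on platforms where the call is expected to work.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

final class AtomicCommit {

    /** Writes data to pendingName, then publishes it as finalName (hypothetical helper). */
    static void commit(Path dir, String pendingName, String finalName, byte[] data)
            throws IOException {
        Path pending = dir.resolve(pendingName);
        Path target = dir.resolve(finalName);

        // 1. Write the pending file and fsync its contents.
        try (FileChannel ch = FileChannel.open(pending,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true);
        }

        // 2. Atomic rename: readers see either the old state or the new one.
        Files.move(pending, target, StandardCopyOption.ATOMIC_MOVE);

        // 3. Best-effort fsync of the directory so the rename itself is durable.
        try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
            ch.force(true);
        } catch (IOException e) {
            // Assumption: platforms such as Windows cannot open a directory
            // this way; there the step is skipped.
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("commit-demo");
        commit(dir, "pending_segments_1", "segments_1",
                "demo".getBytes(StandardCharsets.UTF_8));
        System.out.println("committed: " + Files.exists(dir.resolve("segments_1")));
    }
}
```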


> Filesystems do not guarantee order of directories updates
> ---------------------------------------------------------
>
>                 Key: LUCENE-8048
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8048
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Nikolay Martynov
>
> Currently, when an index is written to disk, the following sequence of 
> events takes place:
> * write segment file
> * sync segment file
> * write segment file
> * sync segment file
> ...
> * write list of segments
> * sync list of segments
> * rename list of segments
> * sync index directory
> This sequence leaves a window in which the system can crash after 'rename 
> list of segments' but before 'sync index directory'. Depending on the exact 
> filesystem implementation, this may lead to the 'list of segments' being 
> visible in the directory while some of the segments are not.
> The solution is to sync the index directory after all segments have been 
> written. [This 
> commit|https://github.com/mar-kolya/lucene-solr/commit/58e05dd1f633ab9b02d9e6374c7fab59689ae71c]
>  shows the idea implemented. I'm fairly certain that I didn't find all the 
> places where this may happen.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
