[
https://issues.apache.org/jira/browse/LUCENE-6176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273592#comment-14273592
]
Uwe Schindler commented on LUCENE-6176:
---------------------------------------
Hi Wojtek,
Lucene uses the standard mechanisms provided by the JDK to write to files: we
open a standard Java OutputStream with {{Files.newOutputStream(...)}} (in the
upcoming 5.0; {{new FileOutputStream(...)}} in Lucene 4.x). This method does not
allow adding "StandardOpenOption.READ" (it is also "just wrong", which
is why the Java people forbid it). In fact, this looks like a real bug in
Windows, sorry. I don't understand why the cache manager cannot cache if you
don't want to read from a file. Lucene is not the only software affected by
this problem: so is every Java program that copies files or writes large files
through the standard Java APIs. It gets worse if they don't buffer (Lucene uses a
buffer of 8192 bytes).
See:
[http://docs.oracle.com/javase/7/docs/api/java/nio/file/Files.html#newOutputStream(java.nio.file.Path,%20java.nio.file.OpenOption...)]
bq. The FSIndexOutput, in FSDirectory, opens the output file stream for
Write/Append (W/A), but no Read
We open for CREATE, TRUNCATE_EXISTING, and WRITE; see Javadocs:
bq. If no options are present then this method works as if the CREATE,
TRUNCATE_EXISTING, and WRITE options are present.
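To make the two points above concrete, here is a minimal sketch (the class and method names are made up for illustration, this is not Lucene code). It assumes only the documented behaviour of {{Files.newOutputStream(...)}}: with no options the file is created or truncated, and passing READ is rejected with an IllegalArgumentException.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Lucene.
public class OpenOptionsDemo {

    /** Returns true if Files.newOutputStream refuses the READ option. */
    static boolean readIsRejected(Path file) throws IOException {
        try (OutputStream out = Files.newOutputStream(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.READ)) {
            return false; // stream was opened, so READ was accepted
        } catch (IllegalArgumentException rejected) {
            return true;  // READ is not a valid option for this method
        }
    }

    /** Writes data with no explicit options and returns the resulting file size. */
    static long writeWithDefaults(Path file, byte[] data) throws IOException {
        // No options: behaves as CREATE, TRUNCATE_EXISTING, WRITE.
        try (OutputStream out = Files.newOutputStream(file)) {
            out.write(data);
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("open-options-demo", ".bin");
        try {
            writeWithDefaults(file, new byte[100]);
            // The second write truncates the file instead of appending to it.
            long size = writeWithDefaults(file, new byte[10]);
            System.out.println("size after second write: " + size);
            System.out.println("READ rejected: " + readIsRejected(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```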
We may be able to work around this by using a FileChannel (which can be opened
for READ) and wrapping an OutputStream around it, but this should only be a
temporary hack. Another alternative would be to use a larger chunk size on
Windows (instead of 8192 bytes).
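A rough sketch of what such a workaround could look like, using only standard JDK calls ({{FileChannel.open}}, {{Channels.newOutputStream}}, {{BufferedOutputStream}}). The class name, the method name, and the 64 KB buffer size are assumptions chosen for illustration, not what Lucene does. Whether the extra READ access really lets Windows cache writes to an SMB share is the reporter's claim and is not verified here.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, not part of Lucene.
public class ReadWriteOutputDemo {

    /**
     * Opens the file for READ and WRITE through a FileChannel and wraps it in
     * a buffered OutputStream. Closing the returned stream closes the channel.
     */
    static OutputStream openReadWrite(Path file, int bufferSize) throws IOException {
        FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE,
                StandardOpenOption.READ); // allowed here, unlike Files.newOutputStream
        return new BufferedOutputStream(Channels.newOutputStream(channel), bufferSize);
    }

    /** Writes data through the read/write stream and returns what ended up on disk. */
    static byte[] writeAndReadBack(Path file, byte[] data, int bufferSize) throws IOException {
        try (OutputStream out = openReadWrite(file, bufferSize)) {
            out.write(data);
        }
        return Files.readAllBytes(file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("read-write-demo", ".bin");
        try {
            // 64 KB is an arbitrary "larger chunk size" picked for this sketch.
            byte[] onDisk = writeAndReadBack(file, new byte[] {1, 2, 3}, 64 * 1024);
            System.out.println("bytes on disk: " + onDisk.length);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The buffer size is a parameter so that both ideas (READ access and a larger chunk size) can be tried independently.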
In fact, this caching bug should be fixed in Windows. Can you give an
estimate of when this bug in Windows is likely to be fixed? (I see from your
profile that you are a Microsoft employee.) Can you also give a clearer
explanation of why this caching is a problem? I don't see a difference in
visibility on the remote server whether READ is given or not.
In addition, because this bug affects *all* Java projects (see above), you
should open an issue at OpenJDK, too.
> Modify FSIndexOutput in FSDirectory to open output stream for Write and Read
> ----------------------------------------------------------------------------
>
> Key: LUCENE-6176
> URL: https://issues.apache.org/jira/browse/LUCENE-6176
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/store
> Affects Versions: 4.10.2
> Environment: Windows
> Reporter: Wojtek Kozaczynski
>
> The FSIndexOutput, in FSDirectory, opens the output file stream for
> Write/Append (W/A), but no Read. This is an issue when Windows writes to remote
> files. For local storage files the Windows cache manager is part of the
> kernel and can read from the file even if it is opened for W/A only (and it
> needs to read the current content of the page). When accessing remote files,
> like SMB shares, the cache manager is restricted to the access mode requested
> from the remote system. In this case since it is W/A every write, even a
> single byte, is a roundtrip to the remote storage server.
> Opening the output file stream for Write and Read, which does not impact
> other functionality, allows Windows to cache the individual Lucene writes
> regardless of their size.