[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-761700207 > If it returns 0, there may be more to read, just not yet available. This is important for asynchronous channels! Oh I see! I guess since it's reading into a buffer, 0 can also indicate that the buffer is full as well. Thanks for the insights there! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-761623319 > The issue with seek to filesize is well-known to me, that was also my first idea I had yesterday in the evening. But I was ready to go to bed and of course I had a beer, so I was not ready to fix it. > In MMapDirectory rewrite for JDK16+ (see #2176) I have seen similar special cases. We have there partially empty MMAPs (sized 0 bytes) just to allow the seek to end of file. > The issue here was really the block size on Windows which is much smaller so it triggers the failure. Yup I realized this later as well when looking at the bug. In your original implementation you already handled it with the below to not throw exception when at EOF ``` if (n < 0 && filePos > channel.size()) { throw new EOFException("read past EOF: " + this); } ``` But somehow I put in these obviously incompatible code just across a few lines between themselves ``` if(filePos >= channel.size()) { // this condition conflicts with the assertion below throw new EOFException("read past EOF: " + this); } ... try { n = channel.read(buffer, filePos); assert n >= 0 : "read should not past EOF"; } catch (IOException ioe) { ... ``` I guess I'm also a bit surprised that `channel.read` would return -1 to indicate both reading EOF and reading past EOF. Previously I was assuming this will just return 0 for reading EOF. Maybe the underlying JDK implementation is using exception to detect EOF? Will need to look that up. > Sorry that you had to install a Windows VM first. I was on the right plan, too, it was just too late! No problem there. I actually got a relatively powerful NAS lately so I was eager to try out running VMs on it :D > As you are also now familar with IndexInputs, you may also have a look at performance of #2176 !!! Thanks. Thanks for letting me know about this PR! Will definitely take a look! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-761525653 > > I will try to find the issue tomorrow...! (not sure if I will have time, so if you can reproduce on Linux feel free to help out) > > Hmmm that's strange. On my macbook these tests are passing. > > https://user-images.githubusercontent.com/2986273/104795124-dc558080-5760-11eb-8a2d-98aa31a869a1.png;> > > I'll install Windows and Linux VMs over the weekend to further test them out. @uschindler I figured out the issue here. Apparently I introduced a bug on EOF handling when fixing the `BaseDirectoryTestCase#testSeekPastEOF` test failure (fixed one, broke another! :( ) The reason this bug was not uncovered when running the tests on Mac is, the EOF tests in question only failed when the file size being tested is a multiple of block size, which would eventually cause the condition `filePos == channel.size()` in `DirectIOIndexInput#refill` and an EOF exception would be thrown there explicitly. This is the case for Windows platform as the file size in tests is 1024, and the block size is 512; However, for Mac and some linux, the block size is 4096, so this particular EOF handling path was not encountered. I've added a new test to force this condition and pushed up the fix. The tests are passing on my Mac and Windows VM now, could you please take a look and see if it works for you as well? Sorry for the bug there! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-761298120 > I will try to find the issue tomorrow...! (not sure if I will have time, so if you can reproduce on Linux feel free to help out) Hmmm that's strange. On my macbook these tests are passing. https://user-images.githubusercontent.com/2986273/104795124-dc558080-5760-11eb-8a2d-98aa31a869a1.png;> I'll install Windows and Linux VMs over the weekend to further test them out. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-760697048 hi @mikemccand, I just took a look, and pushed some more commits to resolve latest merge conflicts and adopt new formatting. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-753800497 > Hi @dweiss, for the JDK compiler issue, I ended up submitting a bug report to open jdk (internal review id 9068426) as I'm hoping to be able to refer to this issue in the code comment directly if there's any confirmation and follow-up from them. Will post further update here if the report goes through and a ticket gets created. The JDK bug ticket has been opened and available at https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8259039. I've added the link into java doc as well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-753560734 Hi @dweiss, for the JDK compiler issue, I ended up submitting a bug report to open jdk (internal review id 9068426) as I'm hoping to be able to refer to this issue in the code comment directly if there's any confirmation and follow-up from them. Will post further update here if the report goes through and a ticket gets created. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-753557228 Hi Uwe, I've gone ahead and pushed a commit for the following: 1. Add checksum computation 2. Add a new test to exercise the DirectIODirectory with actual index 3. Update some logic in DirectIODirectory to pass 3 tests in `BaseDirectoryTestCase` Please feel free to revert it though if it conflicts with changes you've already made. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-753556939 Hi Uwe, I've gone ahead and pushed a commit for the following: 1. Add checksum This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-751925231 Hi @uschindler, I will have some time in the next few days. Do you mind me taking a look at what's left here and potentially pushing some changes? Don't want to get in the way though if you have the changes almost ready. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-748804203 > This may also happen on other operating systems. It depends on if the buffer is already aligned when you call alignSlice(). If it is not aligned (seems to happen 100% of the time on windows), then the returned slice is ALWAYS smaller than original size. If you read the example code by OpenJDK developers and the tests: You need to overallocate (`finalSize + blockSize - 1`) and then get the aligned slice. Yeah makes sense. I did see and use `Math.addExact(blockSize, blockSize - 1)` before referencing https://bugs.openjdk.java.net/browse/JDK-8189192, but the dots never connected for me that this was done for the purpose of handling unaligned buffer, and that it needs to be adapted to `finalSize + blockSize - 1` here. And for some reasons this misalignment never happens on Mac so the test is always passing for me. Glad that this bug gets caught! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-748563011 Oh wow thanks for the updates there, and sorry to hear the current changes are not yet ready! Yes the original implementation hasn't been used for a while now (as native code compilation is not available after Ant migration), and the DirectIO test was a recent addition as part of the work for this jira ticket. I read through all the comments and some code changes, and the main issues highlighted are: 1. On Windows, alignment is done differently from Linux, and thus the aligned slice size could be smaller than the intended buffer size. This caused problems when we also keep track of buffer manipulation stats using fields outside of aligned byte buffer. So the strategy for fixing here is to resort to ByteBuffer's fields and methods to correctly track the position, limit, mark, capacity etc for buffer manipulation. 2. Some essential methods like `getChecksum` is not implemented (is this interface added after initial Unix Directory?), and since the tests mostly exercise delegate's methods (which are already functioning), and the one that does exercise DirectIO is not using a real index, so the tests didn't discover the implementation is broken. The strategy for fixing here is to force all the tests to also use DirectIO and making sure the implementation passes all of them. Would love to help but don't want to get in the way of your plan either. Please let me know if there's anything you would like me to work on. Appreciate again your help on this @uschindler ! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-748509146 > > Hmm I don't have a windows box to test this. The blockSize is dynamically computed though. If this works on linux/darwin but fails on windows, does it signal another bug in the JDK implementation (since there's butter overflow?), or just some platform specific condition we also need to handle? > > I can debug through this later. Maybe windows returns zero Blocksize or something like this. Sounds good. I also just pushed a commit to update the java doc. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-748506650 > Hi, I implemented the approach and also cleaned up the lookup code to not require unchecked casts. I also made the code safer for exceptions, you only catched ClassNotFound, but actually there can more go wrong! > > * Cast the class after forName() to OpenOption. Then the stream knows what everything is and we can bail out soon. > * catch all Exceptions, as there can happy many: ClassNotFound, ClassCast (if it's not an OpenOption), NullPointerException (if it's not an enum). In static initializers we should not risk any of this. It's just not supported if ANYTHING goes wrong. - BASTA > > I also added an assume to the test, so it won't run without that open option. > > I also changed the Exception to UnsupportedOperationException, as this is what the Javadocs say in the class description. This is also in line with "operating system does not support it" behaviour of JDK, which throws UOE in the open method. > > There's one TODO: If we change to Java module system, we may need to add an import to the unsupported module. Wow thanks @uschindler ! The changes look great! I think there are some java doc I put in earlier that use IOException instead of UOE, I can push a commit fast to update them as well. > Nevertheless, on Windows I see the test failing all the time (JDK 11 and JDK 15 were tested): > > ``` > org.apache.lucene.misc.store.TestDirectIODirectory > testWriteReadWithDirectIO FAILED > java.nio.BufferOverflowException > at __randomizedtesting.SeedInfo.seed([667380F7FFC63E9E:3A56F8A55C75455F]:0) > at java.base/java.nio.DirectByteBuffer.put(DirectByteBuffer.java:409) > at org.apache.lucene.misc.store.DirectIODirectory$DirectIOIndexOutput.writeBytes(DirectIODirectory.java:210) > at org.apache.lucene.misc.store.TestDirectIODirectory.testWriteReadWithDirectIO(TestDirectIODirectory.java:43) > ``` > > The other implicit tests seem to work. Maybe this one stresses the block sizes too much. Hmm I don't have a windows box to test this. The `blockSize` is dynamically computed though. If this works on linux/darwin but fails on windows, does it signal another bug in the JDK implementation (since there's butter overflow?), or just some platform specific condition we also need to handle? > The master branch of the JDK always emits this exception, it seems: > https://github.com/openjdk/jdk/blob/master/src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Check.java#L3534-L3540 > > Previously it was guarded with 'sunapi' lint switch. To be honest this looks like a regression because the sunApiHandler is still there in the code, it's just never used: > https://github.com/openjdk/jdk/blob/master/src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Check.java#L149-L158 Thanks @dweiss for looking it up! It seems that change was introduced here more than 4 years ago https://github.com/openjdk/jdk/commit/286b0caa6cf0ac85128004c091d8d508421389f1 . However, this doesn't seems to explain what were observed in the testings though, where javac and --release using different version causing the warning to be emitted, even when no -Xlint option is provided (the new code looks like the warning message should never be emitted, like in the case where the two versions match, but only logged?). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-748443350 > You can do it statically, look at how the unmapper in Mmapdir works: > Look up the reflective in static intitalizer and save the result at very end in static final. Guard the lookup with correct try blocks and if the lookup fails assign null to the static final. > At runtime when creating the directory, check for null and throw meaningful IOException. > May I push directly to your branch? Oh I see that makes sense! Yes please feel free to push the changes if you have them ready now. I can also put that in when I wake up tomorrow. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-748441602 > I'll try to help out later today, Zach. Sounds good. Thanks Dawid! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-748427066 Wow thanks everyone for the great suggestions and ideas! Really appreciate them! Didn't quite anticipate we would go so deep into looking at java compiler when picking up this issue LOL! I pushed a commit that went with the recommended class / enum lookup approach, and it is working with JDK 12 (Yay!). I did not use final static constant for the flag though as it can only be initialized statically (I think), and thus we can't throw checked IOException if the class or the enum is not available (or has to wrap it inside `ExceptionInInitializerError`). Could you please take a look and let me know if there's any concern? > IMHO, instead of adding @SuppressWarnings, you should also review the Gradle config, if there are some -Xlintoptions enabled that cause this. I tried this with the vanilla program and plain javac command, and it seems like without any -Xlint options the warning message will still be emitted if there's a different version passed into "--release" flag ``` % export JAVA_HOME=/Library/Java/JavaVirtualMachines/adoptopenjdk-12.jdk/Contents/Home % javac --version javac 12.0.2 % javac src/com/company/Main.java --release 11 src/com/company/Main.java:19: warning: ExtendedOpenOption is internal proprietary API and may be removed in a future release StandardOpenOption.READ, com.sun.nio.file.ExtendedOpenOption.DIRECT); ^ 1 warning % export JAVA_HOME=/Library/Java/JavaVirtualMachines/adoptopenjdk-11.jdk/Contents/Home % javac --version javac 11.0.9 % javac src/com/company/Main.java --release 11 ``` > You can try to ask on JDK's compiler-dev mailing list > I can only recommend to call out to OpenJDK mailing list, how to disable the warning. I'm also thinking whether I should just create a Jira ticket for them, so that I can include it in the java doc for this change for future reference? > Sources for the JDK are also available - I sometimes hack and recompile a custom distro and this proves the simplest of all solutions (even if it seems daunting at first). It does sound daunting! :D I'm interested in knowing how / where this is emitted in JDK source though also so will find time to check it out! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-747890887 > > Alas, `gradlew precommit` from the command-line on Linux box is still angry for me: > > ``` > > > Task :lucene:misc:compileJava > > /l/trunk/lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java:163: warning: ExtendedOpenOption is internal proprietary API and may be removed in a future release > > com.sun.nio.file.ExtendedOpenOption.DIRECT); > > ^ > > /l/trunk/lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java:282: warning: ExtendedOpenOption is internal proprietary API and may be removed in a future release > > channel = FileChannel.open(path, StandardOpenOption.READ, com.sun.nio.file.ExtendedOpenOption.DIRECT); > > ^ > > error: warnings found and -Werror specified > > ``` > > > > > > Oh I see -- you are using `@SuppressForbidden` (which is indeed necessary, since we are using an API that we otherwise forbid!), but you must also add `@SuppressWarnings("sunapi")` to suppress `javac` warnings. Lots of suppression happening here!! Hmm, but when I tried adding those two lines locally, `gradlew precommit` still fails, hrmph. > > This might be a JDK difference -- I'm using JDK 15. > > Hmm, a little more research uncovers [JDK-6476630](https://bugs.openjdk.java.net/browse/JDK-6476630), which makes it sound like it is not possible to suppress this particular warning. Now I'm not sure what to do! Need @uschindler help again! > > Oh wow this is interesting! Yes when I initially started to use `ExtendedOpenOption.DIRECT` I got build failure from `forbiddenApisMain`. But after I put in `@SuppressForbidden` the failure went away, so I didn't think twice about it. > > I did deal with this type of error before from another project a long time ago, and believed back then using ` @SuppressWarnings("sunapi")` with some flags like `-XDignore.symbol.file` or `-XDenableSunApiLintControl` worked (couldn't remember which one exactly now). However this seems to be something no longer supported in the latest JDK now as I'm also having difficulty getting them to work. > > For the few JDK versions I tried, the one that works on my mac without `@SuppressWarnings("sunapi")` is JDK 11 > > ``` > openjdk 11.0.9 2020-10-20 > OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.9+11) > OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.9+11, mixed mode) > ``` > > But as soon as I moved to JDK 12 below, that error also started to show up > > ``` > openjdk 12.0.2 2019-07-16 > OpenJDK Runtime Environment AdoptOpenJDK (build 12.0.2+10) > OpenJDK 64-Bit Server VM AdoptOpenJDK (build 12.0.2+10, mixed mode) > ``` > > One interesting thing I noticed after playing with a few experiments, is that the following plain vanilla java program that exercises `ExtendedOpenOption.DIRECT` doesn't actually output any warning for sun api use with the JDK 12 above, but does output other "regular" warnings such as `unchecked`, when compiled with `javac src/com/company/Main.java -Xlint:all`. > > ```java > package com.company; > > import java.io.IOException; > import java.nio.channels.Channel; > import java.nio.channels.FileChannel; > import java.nio.file.Path; > import java.nio.file.StandardOpenOption; > import java.util.ArrayList; > import java.util.List; > > public class Main { > > @SuppressWarnings({"rawtypes", "unchecked"}) > public static void main(String[] args) throws IOException { > List words = new ArrayList(); > words.add("hello"); > > Channel channel = FileChannel.open(Path.of("blablabla"), StandardOpenOption.CREATE, StandardOpenOption.WRITE, > StandardOpenOption.READ, com.sun.nio.file.ExtendedOpenOption.DIRECT); > System.out.println(channel.isOpen()); > } > } > ``` > > So I'm a bit confused at this point and need to do more research on this. Okay. After spending the last few nights looking into what's causing the above warning message difference between Lucene and the plain vanilla java program, I finally found out that running different javac version with the one passed into `--release` complier flag (defined in https://github.com/apache/lucene-solr/blob/6af9cbba29ec87b53abd17382862262adaa85144/gradle/defaults-java.gradle#L27) would cause this internal proprietary API warning message to be emitted for `ExtendedOpenOption.DIRECT` (and difficult to suppress). However, when the versions of the two matched, this warning message wouldn't be emitted (confirmed with JDK 12 & 13)
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-745039717 > Alas, `gradlew precommit` from the command-line on Linux box is still angry for me: > > ``` > > Task :lucene:misc:compileJava > /l/trunk/lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java:163: warning: ExtendedOpenOption is internal proprietary API and may be removed in a future release > com.sun.nio.file.ExtendedOpenOption.DIRECT); > ^ > /l/trunk/lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java:282: warning: ExtendedOpenOption is internal proprietary API and may be removed in a future release > channel = FileChannel.open(path, StandardOpenOption.READ, com.sun.nio.file.ExtendedOpenOption.DIRECT); > ^ > error: warnings found and -Werror specified > ``` > > Oh I see -- you are using `@SuppressForbidden` (which is indeed necessary, since we are using an API that we otherwise forbid!), but you must also add `@SuppressWarnings("sunapi")` to suppress `javac` warnings. Lots of suppression happening here!! Hmm, but when I tried adding those two lines locally, `gradlew precommit` still fails, hrmph. > > This might be a JDK difference -- I'm using JDK 15. > > Hmm, a little more research uncovers [JDK-6476630](https://bugs.openjdk.java.net/browse/JDK-6476630), which makes it sound like it is not possible to suppress this particular warning. Now I'm not sure what to do! Need @uschindler help again! Oh wow this is interesting! Yes when I initially started to use `ExtendedOpenOption.DIRECT` I got build failure from `forbiddenApisMain`. But after I put in `@SuppressForbidden` the failure went away, so I didn't think twice about it. I did deal with this type of error before from another project a long time ago, and believed back then using ` @SuppressWarnings("sunapi")` with some flags like `-XDignore.symbol.file` or `-XDenableSunApiLintControl` worked (couldn't remember which one exactly now). However this seems to be something no longer supported in the latest JDK now as I'm also having difficulty getting them to work. For the few JDK versions I tried, the one that works on my mac without `@SuppressWarnings("sunapi")` is JDK 11 ``` openjdk 11.0.9 2020-10-20 OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.9+11) OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.9+11, mixed mode) ``` But as soon as I moved to JDK 12 below, that error also started to show up ``` openjdk 12.0.2 2019-07-16 OpenJDK Runtime Environment AdoptOpenJDK (build 12.0.2+10) OpenJDK 64-Bit Server VM AdoptOpenJDK (build 12.0.2+10, mixed mode) ``` -- One interesting thing I noticed after playing with a few experiments, is that the following plain vanilla java program that exercises `ExtendedOpenOption.DIRECT` doesn't actually output any warning for sun api use with the JDK 12 above, but does output other "regular" warnings such as `unchecked`, when compiled with `javac src/com/company/Main.java -Xlint:all`. ```java package com.company; import java.io.IOException; import java.nio.channels.Channel; import java.nio.channels.FileChannel; import java.nio.file.Path; import java.nio.file.StandardOpenOption; import java.util.ArrayList; import java.util.List; public class Main { @SuppressWarnings({"rawtypes", "unchecked"}) public static void main(String[] args) throws IOException { List words = new ArrayList(); words.add("hello"); Channel channel = FileChannel.open(Path.of("blablabla"), StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.READ, com.sun.nio.file.ExtendedOpenOption.DIRECT); System.out.println(channel.isOpen()); } } ``` So I'm a bit confused at this point and need to do more research on this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-740306114 > Hmm, `gradlew precommit` is angry: > > ``` > > Task :lucene:misc:compileJava > /l/directio/lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java:206: warning: ExtendedOpenOption is internal proprietary API and may be removed in a future release > com.sun.nio.file.ExtendedOpenOption.DIRECT); > ^ > /l/directio/lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java:325: warning: ExtendedOpenOption is internal proprietary API and may be removed in a future release > channel = FileChannel.open(path, StandardOpenOption.READ, com.sun.nio.file.ExtendedOpenOption.DIRECT); > ^ > error: warnings found and -Werror specified > > > Task :lucene:sandbox:ecjLintMain > > Task :lucene:demo:ecjLintMain > > Task :lucene:codecs:ecjLintMain > > > Task :lucene:misc:compileJava FAILED > 1 error > 2 warnings > ``` > > Maybe we need to add something like `@SuppressWarnings("sunapi")` annotation? Hmm this is strange. These warnings should have been suppressed by `@SuppressForbidden` at the constructors already. Both my local and the github PR gradle precommit check passed as well (https://github.com/apache/lucene-solr/pull/2052/checks?check_run_id=1496707362). Is this error showing up in some CI environment / job execution that I can take a further look at? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-737614580 Just want to have a quick follow up on this PR. Are there any more changes expected from my end? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-731491257 > > > Second, it is extremely experimental and not clear when it provides benefits / what risks there are / etc. We need to learn much more about it, in diverse usage, to help here. I'd love to hear from Elasticsearch or Solr users if this helps, since those applications do simultaneous indexing (merging) and searching on the same box. > > > > > > I am sure, you at amazon will test it extensively. But I agree: I would not make this any default, I am still in favour of using plain MMAPDirectory. The risks of making it worse by using direct io is too heavy. > > LOL, actually, no, we at Amazon are not really planning on testing this extensively! Amazon (well, specifically our customer facing product search built directly on Lucene) uses [Lucene's fast segment replication feature](http://blog.mikemccandless.com/2017/09/lucenes-near-real-time-segment-index.html), which is much more efficient than Elasticsearch/Solr document replication when you need deep replicas because you have high peak QPS. So, at Amazon, at least for product search, we never index and search on the same JVM/hardware. Instead we have a few dedicated boxes for pure indexing, then replicate segments via S3 out to many boxes dedicated to searching. > > Lucene's segment replication feature allows us to use much less hardware to simultaneously handle high indexing throughput and high query throughput. > > But, since Elasticsearch/Solr do concurrent indexing (merging) and searching on a single box, by design, I think this Directory would be very interesting to test. It is likely a massive improvement in long-pole query latencies when heavy merges are running, since the merges would now bypass the OS's buffer (IO) cache entirely, using direct IO. > > > > Third, users are able to choose to use this when they instantiate the Directory implementation for their search application, so it is straightforward to adopt and play with, even if Lucene's core does not do so by default. > > > > > > +1 > > Elasticsearch may play with it and may also improve the parts where it is actually used. We do not know yet if it is a good idea to use it when you merge stuff that needs heavy random access to index (like you have a FilterCodecReader during merging, transform an index, resort it,...). Also it depends on codecs and how they are implemented. Unless we know that it works well for merging all partsof Lucene's core codecs, we may do a recommendation. > > If we decide to make it part of Lucene core, we can just move it. It will compile and work out of box with current Java versions and most file systems. > > Yeah, this is an awesome improvement thanks to this PR -- it becomes pure Java, yay! But we need more data of actual usage to decide if this is worth moving to core, let alone somehow defaulting to. > > Thanks @zacharymorn! No problem! Happy to contribute as well! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-731491171 > > I am not sure if we should preserve the native PosixUtil. It's used nowhere and was only added by @mikemccand back at that time to do more testing. > > Nowadays, if you want to try this, we should use JNA as external library, which provides all this. The good thing: JNA has a JAR file with dll/so files for all platforms and allows usage. > > With Java 17 (hopefully), we can use the new native Layer that allows to use any library with correct permissions in security manager. > > Yeah, +1 to remove `PosixUtil.cpp` if indeed this was the only usage of it! RefCount dropped to zero! Removed this and the java one. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-729520080 > I stated a bit earlier, removing WindowsDirectory should be a separate PR. It's not the same thing (it's not about DirectIO), it works around aproblem with positional reads. We should review this again and decide later. Maybe open an issue. > > What's good with this new PR: it allows to use directio also on windows. Sounds good. I've reverted the `WindowsDirectory` removal commit, and will create a follow-up Jira ticket on this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-729393685 I've been following the comment thread on `WindowsDirectory`, but not sure if there's a consensus to remove it in this PR? For the time being, I've pushed a standalone commit to remove them https://github.com/apache/lucene-solr/pull/2052/commits/0e2b787aeca722e96d81cbcc72c02df36b368c11. Please let me know if we would like to handle it in another PR, and I can revert it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-729389857 > > I think we just need to resolve the "replace now or not" question? > > I'd like to improve the testcase and use a real index on top of a combination with FSDirectory.open() wrapped by this directory. It should also test the whole BaseDirectoryTestCase testsuite on the directory, so sublass this class for the test. Done. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-729387261 > > Second, it is extremely experimental and not clear when it provides benefits / what risks there are / etc. We need to learn much more about it, in diverse usage, to help here. I'd love to hear from Elasticsearch or Solr users if this helps, since those applications do simultaneous indexing (merging) and searching on the same box. > > I am sure, you at amazon will test it extensively. But I agree: I would not make this any default, I am still in favour of using plain MMAPDirectory. The risks of making it worse by using direct io is too heavy. > > > Third, users are able to choose to use this when they instantiate the Directory implementation for their search application, so it is straightforward to adopt and play with, even if Lucene's core does not do so by default. > > +1 > > Elasticsearch may play with it and may also improve the parts where it is actually used. We do not know yet if it is a good idea to use it when you merge stuff that needs heavy random access to index (like you have a FilterCodecReader during merging, transform an index, resort it,...). Also it depends on codecs and how they are implemented. Unless we know that it works well for merging all partsof Lucene's core codecs, we may do a recommendation. > > If we decide to make it part of Lucene core, we can just move it. It will compile and work out of box with current Java versions and most file systems. Thanks @mikemccand and @uschindler for your comments on this! They make sense and I guess we will leave it as is for now, and revisit this after some production usage. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] zacharymorn commented on pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory
zacharymorn commented on pull request #2052: URL: https://github.com/apache/lucene-solr/pull/2052#issuecomment-723400516 > Thank you @zacharymorn! It's so nice we can move this to pure java now -- this makes it much easier and lower risk for Lucene users to test. > > I suspect for Lucene applications that do both indexing and searching on a single box (e.g. Elasticsearch), this might be a big improvement in preventing large merges from hurting concurrent queries. No problem @mikemccand ! I'm glad to be able to contribute as well. One thing I notice while working on this, and also related to your comment is, it seems Lucene core doesn't seems to use this facility via any configuration at all (no reference to `NativeUnixDirectory`). So I'm wondering if it makes sense to have a follow up task to add some kind of configuration (through Merge policy?) to use it as well once the changes stabilize ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org