[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2021-01-16 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r558840406



##
File path: 
lucene/misc/src/test/org/apache/lucene/misc/store/TestDirectIODirectory.java
##
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.misc.store;
+
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import org.apache.lucene.document.*;
+import org.apache.lucene.index.DirectoryReader;
+import org.apache.lucene.index.IndexReader;
+import org.apache.lucene.index.RandomIndexWriter;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.PhraseQuery;
+import org.apache.lucene.store.*;
+import org.junit.BeforeClass;
+
+public class TestDirectIODirectory extends BaseDirectoryTestCase {
+
+  @BeforeClass
+  public static void checkSupported() {
+assumeTrue(
+"This test required a JDK version that has support for 
ExtendedOpenOption.DIRECT",
+DirectIODirectory.ExtendedOpenOption_DIRECT != null);
+  }
+
+  @Override
+  protected DirectIODirectory getDirectory(Path path) throws IOException {
+return new DirectIODirectory(
+FSDirectory.open(path), DirectIODirectory.DEFAULT_MERGE_BUFFER_SIZE, 
0L) {
+  @Override
+  protected boolean useDirectIO(IOContext context) {
+return true;
+  }
+};
+  }
+
+  public void testIndexWriteRead() throws IOException {
+try (Directory dir = getDirectory(createTempDir("testDirectIODirectory"))) 
{
+  try (RandomIndexWriter iw = new RandomIndexWriter(random(), dir)) {
+Document doc = new Document();
+Field field = newField("field", "foo bar", TextField.TYPE_STORED);
+doc.add(field);
+
+iw.addDocument(doc);
+iw.commit();
+  }
+
+  try (IndexReader ir = DirectoryReader.open(dir)) {
+IndexSearcher s = newSearcher(ir);
+assertEquals(1, s.count(new PhraseQuery("field", "foo", "bar")));
+  }
+}
+  }
+
+  public void testIllegalEOFWithFileSizeMultipleOfBlockSize() throws Exception 
{
+Path path = createTempDir("testIllegalEOF");
+final int fileSize = 
Math.toIntExact(Files.getFileStore(path).getBlockSize()) * 2;

Review comment:
   Yes that's the correct way to test it, use the local FS block size  .

##
File path: 
lucene/misc/src/test/org/apache/lucene/misc/store/TestDirectIODirectory.java
##
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.misc.store;
+
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import org.apache.lucene.document.*;
+import org.apache.lucene.index.DirectoryReader;
+import org.apache.lucene.index.IndexReader;
+import org.apache.lucene.index.RandomIndexWriter;
+import org.apache.lucene.search.IndexSearcher;
+import org.apache.lucene.search.PhraseQuery;
+import org.apache.lucene.store.*;
+import org.junit.BeforeClass;
+
+public class TestDirectIODirectory extends BaseDirectoryTestCase {
+
+  @BeforeClass
+  public static void checkSupported() {
+assumeTrue(
+"This test required a JDK version that has support for 
ExtendedOpenOption.DIRECT",
+DirectIODirectory.ExtendedOpenOption_DIRECT != null);
+  }
+
+  @Override
+  protected DirectIODirectory getDirectory(Path path) throws IOException {
+return new DirectIODirectory(
+

[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2021-01-16 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r558840321



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -346,17 +360,21 @@ public byte readByte() throws IOException {
 
 private void refill() throws IOException {
   filePos += buffer.capacity();
-  
+
+  // BaseDirectoryTestCase#testSeekPastEOF test for consecutive read past 
EOF,
+  // hence throwing EOFException early to maintain buffer state (position 
in particular)
+  if(filePos >= channel.size()) {

Review comment:
   I think this was exactly the problem. I was yesterday about to change 
this to `>` instead of `>=`. It is allowed to seek to exectly the file size, 
which was obviously causing the test failures on windows.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2021-01-15 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r558492000



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -346,17 +360,21 @@ public byte readByte() throws IOException {
 
 private void refill() throws IOException {
   filePos += buffer.capacity();
-  
+
+  // BaseDirectoryTestCase#testSeekPastEOF test for consecutive read past 
EOF,
+  // hence throwing EOFException early to maintain buffer state (position 
in particular)
+  if(filePos >= channel.size()) {

Review comment:
   Thanks, this was driving me crazy when I tried to fix this (I gave up 
after some beers deep night).





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2021-01-15 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r558491882



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -307,7 +314,14 @@ public void close() throws IOException {
 
 @Override
 public long getFilePointer() {
-  return filePos + buffer.position();
+  long filePointer = filePos + buffer.position();
+
+  // opening the input and immediately calling getFilePointer without 
calling readX (and thus refill) first,
+  // will result in negative value equal to bufferSize being returned,
+  // due to the initialization method filePos = -bufferSize used in 
constructor.
+  assert filePointer == -buffer.capacity() || filePointer >= 0 :
+"filePointer should either be initial value equal to negative buffer 
capacity, or larger than or equal to 0";
+  return Math.max(filePointer, 0);

Review comment:
   Thanks, this was driving me crazy when I tried to fix this (I gave up 
after some beers deep night).





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-12-20 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r546454653



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -0,0 +1,392 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.misc.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.nio.ByteBuffer;
+import java.nio.channels.FileChannel;
+import java.nio.file.Files;
+import java.nio.file.OpenOption;
+import java.nio.file.Path;
+import java.nio.file.StandardOpenOption;
+import java.util.Arrays;
+
+import org.apache.lucene.store.*;
+import org.apache.lucene.store.IOContext.Context;
+
+/**
+ * A {@link Directory} implementation for all Unixes and Windows that uses
+ * DIRECT I/O to bypass OS level IO caching during
+ * merging.  For all other cases (searching, writing) we delegate
+ * to the provided Directory instance.
+ *
+ * See Overview
+ * for more details.
+ *
+ * WARNING: this code is very new and quite easily
+ * could contain horrible bugs.
+ *
+ * This directory passes Solr and Lucene tests on Linux, OS X,
+ * and Windows; other systems should work but have not been
+ * tested! Use at your own risk.
+ *
+ * @throws UnsupportedOperationException if the operating system, file 
system or JDK
+ * does not support Direct I/O or a sufficient equivalent.
+ *
+ * @lucene.experimental
+ */
+public class DirectIODirectory extends FilterDirectory {
+
+  /** Default buffer size before writing to disk (256 KB);
+   *  larger means less IO load but more RAM and direct
+   *  buffer storage space consumed during merging. */
+
+  public final static int DEFAULT_MERGE_BUFFER_SIZE = 262144;
+
+  /** Default min expected merge size before direct IO is
+   *  used (10 MB): */
+  public final static long DEFAULT_MIN_BYTES_DIRECT = 10*1024*1024;
+
+  private final int blockSize, mergeBufferSize;
+  private final long minBytesDirect;
+  private final Path path;
+  
+  volatile boolean isOpen = true;
+
+  /** Reference to {@code com.sun.nio.file.ExtendedOpenOption.DIRECT} by 
reflective class and enum lookup.
+   * There are two reasons for using this instead of directly referencing 
ExtendedOpenOption.DIRECT:
+   * 
+   *  ExtendedOpenOption.DIRECT is OpenJDK's internal proprietary API. 
This API causes un-suppressible(?) warning to be emitted
+   *  when compiling with --release flag and value N, where N is smaller than 
the the version of javac used for compilation.
+   *  It is possible that Lucene is run using JDK that does not support 
ExtendedOpenOption.DIRECT. In such a
+   *  case, dynamic lookup allows us to bail out with 
UnsupportedOperationException with meaningful error message.
+   * 
+   * This reference is {@code null}, if the JDK does not support direct I/O.
+   */
+  static final OpenOption ExtendedOpenOption_DIRECT; // visible for test
+  static {
+OpenOption option;
+try {
+  final Class clazz = 
Class.forName("com.sun.nio.file.ExtendedOpenOption").asSubclass(OpenOption.class);
+  option = Arrays.stream(clazz.getEnumConstants())
+  .filter(e -> e.toString().equalsIgnoreCase("DIRECT"))
+  .findFirst()
+  .orElse(null);
+} catch (Exception e) {
+  option = null;
+}
+ExtendedOpenOption_DIRECT = option;
+  }
+
+  /** Create a new DirectIODirectory for the named location.
+   * 
+   * @param path the path of the directory
+   * @param mergeBufferSize Size of buffer to use for
+   *merging.
+   * @param minBytesDirect Merges, or files to be opened for
+   *   reading, smaller than this will
+   *   not use direct IO.  See {@link
+   *   #DEFAULT_MIN_BYTES_DIRECT}
+   * @param delegate fallback Directory for non-merges
+   * @throws IOException If there is a low-level I/O error
+   */
+  public DirectIODirectory(Path path, int mergeBufferSize, long 
minBytesDirect, FSDirectory delegate) throws IOException {
+super(delegate);
+this.blockSize = Math.toIntExact(Files.getFileStore(path).getBlockSize());
+this.mergeBufferSize = mergeBufferSize;
+this.minBytesDirect = 

[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-12-20 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r546454647



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -0,0 +1,392 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.misc.store;
+
+import java.io.EOFException;
+import java.io.IOException;
+import java.io.UncheckedIOException;
+import java.nio.ByteBuffer;
+import java.nio.channels.FileChannel;
+import java.nio.file.Files;
+import java.nio.file.OpenOption;
+import java.nio.file.Path;
+import java.nio.file.StandardOpenOption;
+import java.util.Arrays;
+
+import org.apache.lucene.store.*;
+import org.apache.lucene.store.IOContext.Context;
+
+/**
+ * A {@link Directory} implementation for all Unixes and Windows that uses
+ * DIRECT I/O to bypass OS level IO caching during
+ * merging.  For all other cases (searching, writing) we delegate
+ * to the provided Directory instance.
+ *
+ * See Overview
+ * for more details.
+ *
+ * WARNING: this code is very new and quite easily
+ * could contain horrible bugs.
+ *
+ * This directory passes Solr and Lucene tests on Linux, OS X,
+ * and Windows; other systems should work but have not been
+ * tested! Use at your own risk.
+ *
+ * @throws UnsupportedOperationException if the operating system, file 
system or JDK
+ * does not support Direct I/O or a sufficient equivalent.
+ *
+ * @lucene.experimental
+ */
+public class DirectIODirectory extends FilterDirectory {
+
+  /** Default buffer size before writing to disk (256 KB);
+   *  larger means less IO load but more RAM and direct
+   *  buffer storage space consumed during merging. */
+
+  public final static int DEFAULT_MERGE_BUFFER_SIZE = 262144;
+
+  /** Default min expected merge size before direct IO is
+   *  used (10 MB): */
+  public final static long DEFAULT_MIN_BYTES_DIRECT = 10*1024*1024;
+
+  private final int blockSize, mergeBufferSize;
+  private final long minBytesDirect;
+  private final Path path;
+  
+  volatile boolean isOpen = true;
+
+  /** Reference to {@code com.sun.nio.file.ExtendedOpenOption.DIRECT} by 
reflective class and enum lookup.
+   * There are two reasons for using this instead of directly referencing 
ExtendedOpenOption.DIRECT:
+   * 
+   *  ExtendedOpenOption.DIRECT is OpenJDK's internal proprietary API. 
This API causes un-suppressible(?) warning to be emitted
+   *  when compiling with --release flag and value N, where N is smaller than 
the the version of javac used for compilation.
+   *  It is possible that Lucene is run using JDK that does not support 
ExtendedOpenOption.DIRECT. In such a
+   *  case, dynamic lookup allows us to bail out with 
UnsupportedOperationException with meaningful error message.
+   * 
+   * This reference is {@code null}, if the JDK does not support direct I/O.
+   */
+  static final OpenOption ExtendedOpenOption_DIRECT; // visible for test
+  static {
+OpenOption option;
+try {
+  final Class clazz = 
Class.forName("com.sun.nio.file.ExtendedOpenOption").asSubclass(OpenOption.class);
+  option = Arrays.stream(clazz.getEnumConstants())
+  .filter(e -> e.toString().equalsIgnoreCase("DIRECT"))
+  .findFirst()
+  .orElse(null);
+} catch (Exception e) {
+  option = null;
+}
+ExtendedOpenOption_DIRECT = option;
+  }
+
+  /** Create a new DirectIODirectory for the named location.
+   * 
+   * @param path the path of the directory
+   * @param mergeBufferSize Size of buffer to use for
+   *merging.
+   * @param minBytesDirect Merges, or files to be opened for
+   *   reading, smaller than this will
+   *   not use direct IO.  See {@link
+   *   #DEFAULT_MIN_BYTES_DIRECT}
+   * @param delegate fallback Directory for non-merges
+   * @throws IOException If there is a low-level I/O error
+   */
+  public DirectIODirectory(Path path, int mergeBufferSize, long 
minBytesDirect, FSDirectory delegate) throws IOException {

Review comment:
   I fixed this in 
https://github.com/apache/lucene-solr/pull/2052/commits/9a59b56f2f94aed53d2268c3053e0fa37382ce44





[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-18 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r525918279



##
File path: 
lucene/misc/src/test/org/apache/lucene/misc/store/TestDirectIODirectory.java
##
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.misc.store;
+
+import com.carrotsearch.randomizedtesting.LifecycleScope;
+import com.carrotsearch.randomizedtesting.RandomizedTest;
+import org.apache.lucene.store.*;
+
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+
+import static 
org.apache.lucene.misc.store.DirectIODirectory.DEFAULT_MIN_BYTES_DIRECT;
+
+public class TestDirectIODirectory extends BaseDirectoryTestCase {
+  public void testWriteReadWithDirectIO() throws IOException {
+try(Directory dir = 
getDirectory(RandomizedTest.newTempDir(LifecycleScope.TEST))) {
+  final long blockSize = 
Files.getFileStore(createTempFile()).getBlockSize();
+  final long minBytesDirect = 
Double.valueOf(Math.ceil(DEFAULT_MIN_BYTES_DIRECT / blockSize)).longValue() *
+blockSize;
+  // Need to worry about overflows here?
+  final int writtenByteLength = Math.toIntExact(minBytesDirect);
+
+  MergeInfo mergeInfo = new MergeInfo(1000, Integer.MAX_VALUE, true, 1);
+  final IOContext context = new IOContext(mergeInfo);
+
+  IndexOutput indexOutput = dir.createOutput("test", context);
+  indexOutput.writeBytes(new byte[writtenByteLength], 0, 
writtenByteLength);
+  IndexInput indexInput = dir.openInput("test", context);
+
+  assertEquals("The length of bytes read should equal to written", 
writtenByteLength, indexInput.length());
+
+  indexOutput.close();
+  indexInput.close();
+}
+  }
+
+  @Override
+  protected Directory getDirectory(Path path) throws IOException {

Review comment:
   I verified: `BaseDirectoryTestCase` uses 
[newIOContext(Random)](https://lucene.apache.org/core/8_7_0/test-framework/org/apache/lucene/util/LuceneTestCase.html#newIOContext-java.util.Random-).
 So we have the correct randomization and our directory is used from time to 
time.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-18 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r525917794



##
File path: 
lucene/misc/src/test/org/apache/lucene/misc/store/TestDirectIODirectory.java
##
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.misc.store;
+
+import com.carrotsearch.randomizedtesting.LifecycleScope;
+import com.carrotsearch.randomizedtesting.RandomizedTest;
+import org.apache.lucene.store.*;
+
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+
+import static 
org.apache.lucene.misc.store.DirectIODirectory.DEFAULT_MIN_BYTES_DIRECT;
+
+public class TestDirectIODirectory extends BaseDirectoryTestCase {
+  public void testWriteReadWithDirectIO() throws IOException {
+try(Directory dir = 
getDirectory(RandomizedTest.newTempDir(LifecycleScope.TEST))) {
+  final long blockSize = 
Files.getFileStore(createTempFile()).getBlockSize();
+  final long minBytesDirect = 
Double.valueOf(Math.ceil(DEFAULT_MIN_BYTES_DIRECT / blockSize)).longValue() *
+blockSize;
+  // Need to worry about overflows here?
+  final int writtenByteLength = Math.toIntExact(minBytesDirect);
+
+  MergeInfo mergeInfo = new MergeInfo(1000, Integer.MAX_VALUE, true, 1);
+  final IOContext context = new IOContext(mergeInfo);
+
+  IndexOutput indexOutput = dir.createOutput("test", context);
+  indexOutput.writeBytes(new byte[writtenByteLength], 0, 
writtenByteLength);
+  IndexInput indexInput = dir.openInput("test", context);
+
+  assertEquals("The length of bytes read should equal to written", 
writtenByteLength, indexInput.length());
+
+  indexOutput.close();
+  indexInput.close();
+}
+  }
+
+  @Override
+  protected Directory getDirectory(Path path) throws IOException {
+Directory delegate = FSDirectory.open(path);

Review comment:
   I verified: `BaseDirectoryTestCase` uses 
[newIOContext(Random)](https://lucene.apache.org/core/8_7_0/test-framework/org/apache/lucene/util/LuceneTestCase.html#newIOContext-java.util.Random-).
 So we have the correct randomization and our directory is used from time to 
time.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-18 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r525912478



##
File path: 
lucene/misc/src/test/org/apache/lucene/misc/store/TestDirectIODirectory.java
##
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.misc.store;
+
+import com.carrotsearch.randomizedtesting.LifecycleScope;
+import com.carrotsearch.randomizedtesting.RandomizedTest;
+import org.apache.lucene.store.*;
+
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+
+import static 
org.apache.lucene.misc.store.DirectIODirectory.DEFAULT_MIN_BYTES_DIRECT;
+
+public class TestDirectIODirectory extends BaseDirectoryTestCase {
+  public void testWriteReadWithDirectIO() throws IOException {
+try(Directory dir = 
getDirectory(RandomizedTest.newTempDir(LifecycleScope.TEST))) {
+  final long blockSize = 
Files.getFileStore(createTempFile()).getBlockSize();
+  final long minBytesDirect = 
Double.valueOf(Math.ceil(DEFAULT_MIN_BYTES_DIRECT / blockSize)).longValue() *
+blockSize;
+  // Need to worry about overflows here?
+  final int writtenByteLength = Math.toIntExact(minBytesDirect);
+
+  MergeInfo mergeInfo = new MergeInfo(1000, Integer.MAX_VALUE, true, 1);
+  final IOContext context = new IOContext(mergeInfo);
+
+  IndexOutput indexOutput = dir.createOutput("test", context);
+  indexOutput.writeBytes(new byte[writtenByteLength], 0, 
writtenByteLength);
+  IndexInput indexInput = dir.openInput("test", context);
+
+  assertEquals("The length of bytes read should equal to written", 
writtenByteLength, indexInput.length());
+
+  indexOutput.close();
+  indexInput.close();
+}
+  }
+
+  @Override
+  protected Directory getDirectory(Path path) throws IOException {

Review comment:
   Thanks great. We have to check, if the base class also sometimes send 
the correct IOContexts, so our directory is triggered. Because the wrong 
IOContext will cause everything to be delegated to the underlying FSDirectory.

##
File path: 
lucene/misc/src/test/org/apache/lucene/misc/store/TestDirectIODirectory.java
##
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.misc.store;
+
+import com.carrotsearch.randomizedtesting.LifecycleScope;
+import com.carrotsearch.randomizedtesting.RandomizedTest;
+import org.apache.lucene.store.*;
+
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+
+import static 
org.apache.lucene.misc.store.DirectIODirectory.DEFAULT_MIN_BYTES_DIRECT;
+
+public class TestDirectIODirectory extends BaseDirectoryTestCase {
+  public void testWriteReadWithDirectIO() throws IOException {
+try(Directory dir = 
getDirectory(RandomizedTest.newTempDir(LifecycleScope.TEST))) {
+  final long blockSize = 
Files.getFileStore(createTempFile()).getBlockSize();
+  final long minBytesDirect = 
Double.valueOf(Math.ceil(DEFAULT_MIN_BYTES_DIRECT / blockSize)).longValue() *
+blockSize;
+  // Need to worry about overflows here?
+  final int writtenByteLength = Math.toIntExact(minBytesDirect);
+
+  MergeInfo mergeInfo = new MergeInfo(1000, Integer.MAX_VALUE, true, 1);
+  final IOContext context = new IOContext(mergeInfo);
+
+  IndexOutput indexOutput = 

[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-18 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r525910929



##
File path: 
lucene/misc/src/test/org/apache/lucene/misc/store/TestDirectIODirectory.java
##
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.misc.store;
+
+import com.carrotsearch.randomizedtesting.LifecycleScope;
+import com.carrotsearch.randomizedtesting.RandomizedTest;
+import org.apache.lucene.store.*;
+
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Path;
+
+import static 
org.apache.lucene.misc.store.DirectIODirectory.DEFAULT_MIN_BYTES_DIRECT;
+
+public class TestDirectIODirectory extends BaseDirectoryTestCase {

Review comment:
   That's because of the new tests in it, it looks too different to the 
forked file. So it's 2 different files now. But this really depends on the Git 
version that is used.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-18 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r525909321



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -74,12 +65,12 @@
  *
  * @lucene.experimental
  */
-public class NativeUnixDirectory extends FSDirectory {

Review comment:
   You are right, WindowsDirectory is unrelated to direct IO. So let's 
discuss this on separate issue! The background is here: 
https://issues.apache.org/jira/browse/LUCENE-2791. The issue behind that is the 
following: https://bugs.openjdk.java.net/browse/JDK-6265734





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-17 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r525305942



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -74,12 +65,12 @@
  *
  * @lucene.experimental
  */
-public class NativeUnixDirectory extends FSDirectory {

Review comment:
   > Will push another commit to remove WindowsDirectory in a day, as it's 
getting too late for me now :)
   
   We should make a completely separate PR out of this. WindowsDirectory is 
also native, but we should still discuss if it was ever used. On Windows, 
currently you can only use MMapDirectory in a highly concurrent environment, 
because of the locking problem with positional reads in FileChannel.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-17 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r525294149



##
File path: lucene/misc/native/src/main/posix/NativePosixUtil.cpp
##
@@ -102,6 +102,7 @@ JNIEXPORT jint JNICALL 
Java_org_apache_lucene_misc_store_NativePosixUtil_posix_1
 }
 #endif
 
+// TODO: To be removed together?

Review comment:
   I'd like to get rid of it as soon as possible. This class is obsolete 
and completely unused, unless we also keep the original directory 
implementation.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-17 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r524967715



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -74,12 +65,12 @@
  *
  * @lucene.experimental
  */
-public class NativeUnixDirectory extends FSDirectory {

Review comment:
   I can confirm, windows works: 
http://cr.openjdk.java.net/~bpb/8164900/webrev.19/





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-17 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r524965464



##
File path: 
lucene/misc/src/test/org/apache/lucene/misc/store/DirectIODirectoryTest.java
##
@@ -18,29 +18,37 @@
 
 import com.carrotsearch.randomizedtesting.LifecycleScope;
 import com.carrotsearch.randomizedtesting.RandomizedTest;
-import org.apache.lucene.store.ByteBuffersDirectory;
-import org.apache.lucene.store.Directory;
-import org.apache.lucene.store.IOContext;
-import org.apache.lucene.store.MergeInfo;
+import org.apache.lucene.store.*;
 import org.apache.lucene.util.LuceneTestCase;
-import org.junit.Rule;
-import org.junit.rules.TestRule;
 
 import java.io.IOException;
-import java.util.EnumSet;
+import java.nio.file.Files;
 
-public class NativeUnixDirectoryTest extends LuceneTestCase {
-  @Rule
-  public static TestRule requiresNative = new NativeLibEnableRule(
-  EnumSet.of(NativeLibEnableRule.OperatingSystem.MAC,
-  NativeLibEnableRule.OperatingSystem.FREE_BSD,
-  NativeLibEnableRule.OperatingSystem.LINUX));
+import static 
org.apache.lucene.misc.store.DirectIODirectory.DEFAULT_MIN_BYTES_DIRECT;
 
-  public void testLibraryLoaded() throws IOException {
+public class DirectIODirectoryTest extends LuceneTestCase {

Review comment:
   OK, then WindowsDirectory is obsolete, too. Thanks for testing. 
Nevertheless, as mentioned before, we should add javadocs, that on unsupported 
platforms, this directory may throw UOE on opening/closing IndexInputs.
   
   We should also extend this test to use a random directory (FS based) as 
delegate. It looks strange to me to just use a ByteBuffersDirectory as 
delegate, as the filestore is totally different.
   
   So: Extend the BaseDirectoryTestcase and then create a random temp dir and 
then build  FSDirectory.open() as backend and use DirectIO as wrapper on top.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-17 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r524963604



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -74,12 +65,12 @@
  *
  * @lucene.experimental
  */
-public class NativeUnixDirectory extends FSDirectory {

Review comment:
   IMHO, we should add more Javadocs, that this directory may throw 
`UnsupportedOperationException` if the underlying platform does not allow 
direct IO.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-17 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r524961831



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -74,12 +65,12 @@
  *
  * @lucene.experimental
  */
-public class NativeUnixDirectory extends FSDirectory {

Review comment:
   I am not fully sure if the ExtendedOpenOption works at all with Windows. 
When checking the OpenJDK sources it seems to throw UOE, but haven't tested 
with new JDKs yet.
   
   Anyways, we should try to get rid of WindowsDirectory, too, as the intention 
behind it is the same: do direct IO, it's just named differently on Windows.
   
   We should maybe also look into the other ExtendedOpenptions.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-17 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r524954118



##
File path: lucene/misc/src/java/overview.html
##
@@ -35,7 +35,7 @@ NativeUnixDirectory
 have to compile on your platform.
 
 
-{@link org.apache.lucene.misc.store.NativeUnixDirectory} is a Directory 
implementation that bypasses the
+{@link org.apache.lucene.misc.store.DirectIODirectory} is a Directory 
implementation that bypasses the

Review comment:
   More of the javadocs below need to go away, too.

##
File path: 
lucene/misc/src/test/org/apache/lucene/misc/store/DirectIODirectoryTest.java
##
@@ -18,29 +18,37 @@
 
 import com.carrotsearch.randomizedtesting.LifecycleScope;
 import com.carrotsearch.randomizedtesting.RandomizedTest;
-import org.apache.lucene.store.ByteBuffersDirectory;
-import org.apache.lucene.store.Directory;
-import org.apache.lucene.store.IOContext;
-import org.apache.lucene.store.MergeInfo;
+import org.apache.lucene.store.*;
 import org.apache.lucene.util.LuceneTestCase;
-import org.junit.Rule;
-import org.junit.rules.TestRule;
 
 import java.io.IOException;
-import java.util.EnumSet;
+import java.nio.file.Files;
 
-public class NativeUnixDirectoryTest extends LuceneTestCase {
-  @Rule
-  public static TestRule requiresNative = new NativeLibEnableRule(
-  EnumSet.of(NativeLibEnableRule.OperatingSystem.MAC,
-  NativeLibEnableRule.OperatingSystem.FREE_BSD,
-  NativeLibEnableRule.OperatingSystem.LINUX));
+import static 
org.apache.lucene.misc.store.DirectIODirectory.DEFAULT_MIN_BYTES_DIRECT;
 
-  public void testLibraryLoaded() throws IOException {
+public class DirectIODirectoryTest extends LuceneTestCase {

Review comment:
   In my opinion, this test should also extend the BaseDirectoryTestCase, 
because it then does a full test of all I/O functionality, including 
multi-threaded and corner cases - this is important to make sure that the 
directory works as expected, also if somebody uses it when not merging (e.g. 
for searching). The original test did not do this as this relied on a native 
library.
   
   Nevertheless: I am not fully sure if Windows supports this 
ExtendedOpenOption. I had no time to test this yet.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on a change in pull request #2052: LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory

2020-11-17 Thread GitBox


uschindler commented on a change in pull request #2052:
URL: https://github.com/apache/lucene-solr/pull/2052#discussion_r524950844



##
File path: 
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
##
@@ -74,12 +65,12 @@
  *
  * @lucene.experimental
  */
-public class NativeUnixDirectory extends FSDirectory {

Review comment:
   I don't think any explicit perf testing is needed, as we just replace 
the calls to our CPP file by setting the flag on open of FileChannel.
   
   It's good that I have seen this pull request, because I was wondering why we 
took the "make native builds available" at all. 
   
   We should also remove WindowsDirectory, because this one also works on 
Windows.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org