[GitHub] [ignite-3] rpuch commented on a diff in pull request #1407: IGNITE-18322 Define scan contract SortedIndexStorage

GitBox Thu, 08 Dec 2022 07:05:45 -0800


rpuch commented on code in PR #1407:
URL: https://github.com/apache/ignite-3/pull/1407#discussion_r1043430490



##########
modules/storage-api/src/testFixtures/java/org/apache/ignite/internal/storage/index/AbstractSortedIndexStorageTest.java:
##########
@@ -579,6 +585,401 @@ void testNullValues(ColumnDefinition columnDefinition) 
throws Exception {
         }
     }
 
+    /**
+     * Checks simple scenarios for a scanning cursor.
+     */
+    @Test
+    void testScanSimple() {
+        SortedIndexDefinition indexDefinition = 
SchemaBuilders.sortedIndex("TEST_IDX")
+                
.addIndexColumn(ColumnType.INT32.typeSpec().name()).asc().done()
+                .build();
+
+        SortedIndexStorage indexStorage = createIndexStorage(indexDefinition);
+
+        BinaryTupleRowSerializer serializer = new 
BinaryTupleRowSerializer(indexStorage.indexDescriptor());
+
+        for (int i = 0; i < 5; i++) {
+            put(indexStorage, serializer.serializeRow(new Object[]{i}, new 
RowId(TEST_PARTITION)));
+        }
+
+        // Checking without borders.
+        assertThat(
+                scan(indexStorage, null, null, 0).stream().map(objects -> 
objects[0]).collect(toList()),
+                contains(0, 1, 2, 3, 4)
+        );
+
+        // Let's check without borders.
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(0),
+                                serializer.serializeRowPrefix(4),
+                                (GREATER_OR_EQUAL | LESS_OR_EQUAL)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(0, 1, 2, 3, 4)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(0),
+                                serializer.serializeRowPrefix(4),
+                                (GREATER_OR_EQUAL | LESS)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(0, 1, 2, 3)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(0),
+                                serializer.serializeRowPrefix(4),
+                                (GREATER | LESS_OR_EQUAL)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(1, 2, 3, 4)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(0),
+                                serializer.serializeRowPrefix(4),
+                                (GREATER | LESS)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(1, 2, 3)
+        );
+
+        // Let's check only with the lower bound.
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(1),
+                                null,
+                                (GREATER_OR_EQUAL | LESS_OR_EQUAL)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(1, 2, 3, 4)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(1),
+                                null,
+                                (GREATER_OR_EQUAL | LESS)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(1, 2, 3, 4)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(1),
+                                null,
+                                (GREATER | LESS_OR_EQUAL)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(2, 3, 4)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(1),
+                                null,
+                                (GREATER | LESS)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(2, 3, 4)
+        );
+
+        // Let's check only with the upper bound.
+        assertThat(
+                indexStorage.scan(
+                                null,
+                                serializer.serializeRowPrefix(3),
+                                (GREATER_OR_EQUAL | LESS_OR_EQUAL)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(0, 1, 2, 3)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                null,
+                                serializer.serializeRowPrefix(3),
+                                (GREATER_OR_EQUAL | LESS)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(0, 1, 2)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                null,
+                                serializer.serializeRowPrefix(3),
+                                (GREATER | LESS_OR_EQUAL)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(0, 1, 2, 3)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                null,
+                                serializer.serializeRowPrefix(3),
+                                (GREATER | LESS)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(0, 1, 2)
+        );
+    }
+
+    @Test
+    void testScanContractForEmptyIndex() {
+        SortedIndexDefinition indexDefinition = 
SchemaBuilders.sortedIndex("TEST_IDX")
+                
.addIndexColumn(ColumnType.INT32.typeSpec().name()).asc().done()
+                .build();
+
+        SortedIndexStorage indexStorage = createIndexStorage(indexDefinition);
+
+        BinaryTupleRowSerializer serializer = new 
BinaryTupleRowSerializer(indexStorage.indexDescriptor());
+
+        Cursor<IndexRow> scanBeforeHashNext = indexStorage.scan(null, null, 0);

Review Comment:
   ```suggestion
           Cursor<IndexRow> scanBeforeHasNext = indexStorage.scan(null, null, 
0);
   ```



##########
modules/storage-api/src/testFixtures/java/org/apache/ignite/internal/storage/index/AbstractSortedIndexStorageTest.java:
##########
@@ -579,6 +585,401 @@ void testNullValues(ColumnDefinition columnDefinition) 
throws Exception {
         }
     }
 
+    /**
+     * Checks simple scenarios for a scanning cursor.
+     */
+    @Test
+    void testScanSimple() {

Review Comment:
   This test contains a lot of repetition. How about rewriting it using 
`@ParameterizedTest` (giving it upper+lower bounds, flags, expected output)?



##########
modules/storage-api/src/testFixtures/java/org/apache/ignite/internal/storage/index/AbstractSortedIndexStorageTest.java:
##########
@@ -579,6 +585,401 @@ void testNullValues(ColumnDefinition columnDefinition) 
throws Exception {
         }
     }
 
+    /**
+     * Checks simple scenarios for a scanning cursor.
+     */
+    @Test
+    void testScanSimple() {
+        SortedIndexDefinition indexDefinition = 
SchemaBuilders.sortedIndex("TEST_IDX")
+                
.addIndexColumn(ColumnType.INT32.typeSpec().name()).asc().done()
+                .build();
+
+        SortedIndexStorage indexStorage = createIndexStorage(indexDefinition);
+
+        BinaryTupleRowSerializer serializer = new 
BinaryTupleRowSerializer(indexStorage.indexDescriptor());
+
+        for (int i = 0; i < 5; i++) {
+            put(indexStorage, serializer.serializeRow(new Object[]{i}, new 
RowId(TEST_PARTITION)));
+        }
+
+        // Checking without borders.
+        assertThat(
+                scan(indexStorage, null, null, 0).stream().map(objects -> 
objects[0]).collect(toList()),
+                contains(0, 1, 2, 3, 4)
+        );
+
+        // Let's check without borders.

Review Comment:
   ```suggestion
           // Let's check with borders.
   ```



##########
modules/page-memory/src/main/java/org/apache/ignite/internal/pagememory/tree/BplusTree.java:
##########
@@ -6222,6 +6250,72 @@ public void close() {
         }
     }
 
+    /**
+     * Class for getting the next row.
+     */
+    private final class GetNext extends Get {
+        @Nullable
+        private T nextRow;
+
+        private GetNext(L row, boolean includeRow) {
+            super(row, false);
+
+            shift = includeRow ? -1 : 1;
+        }
+
+        @Override
+        boolean found(BplusIo<L> io, long pageAddr, int idx, int lvl) {
+            // Must never be called because we always have a shift.
+            throw new IllegalStateException();
+        }
+
+        @Override
+        boolean notFound(BplusIo<L> io, long pageAddr, int idx, int lvl) 
throws IgniteInternalCheckedException {
+            if (lvl != 0) {
+                return false;
+            }
+
+            int cnt = io.getCount(pageAddr);
+
+            if (cnt == 0) {
+                // Empty tree.
+                assert io.getForward(pageAddr, partId) == 0L;
+            } else {
+                assert io.isLeaf() : io;
+                assert cnt > 0 : cnt;
+                assert idx >= 0 || idx == -1 : idx;

Review Comment:
   How could `idx` be -1? In the following code
   
   ```
                   idx = fix(idx);
   
                   if (g.notFound(io, pageAddr, idx, lvl)) {
                       // No way down, stop here.
                       return NOT_FOUND;
                   }
   ```
   
   on entry `idx` is negative, so `fix()` will make it non-negative.
   
   Also, I ran `testFindNext()` with a debugger and a breakpoint inside `if 
(idx == -1)` was never triggered.



##########
modules/page-memory/src/main/java/org/apache/ignite/internal/pagememory/tree/BplusTree.java:
##########
@@ -1572,6 +1572,34 @@ public final T findOne(L row) throws 
IgniteInternalCheckedException {
         return findOne(row, null, null);
     }
 
+    /**
+     * Searches for the next row from the passed from the arguments.
+     *
+     * @param row Row.

Review Comment:
   How about renaming the parameter to `lowerBound`? It seems that it would 
make it clearer what this parameter means.



##########
modules/page-memory/src/main/java/org/apache/ignite/internal/pagememory/tree/BplusTree.java:
##########
@@ -6222,6 +6250,72 @@ public void close() {
         }
     }
 
+    /**
+     * Class for getting the next row.
+     */
+    private final class GetNext extends Get {
+        @Nullable
+        private T nextRow;
+
+        private GetNext(L row, boolean includeRow) {
+            super(row, false);
+
+            shift = includeRow ? -1 : 1;
+        }
+
+        @Override
+        boolean found(BplusIo<L> io, long pageAddr, int idx, int lvl) {
+            // Must never be called because we always have a shift.
+            throw new IllegalStateException();
+        }
+
+        @Override
+        boolean notFound(BplusIo<L> io, long pageAddr, int idx, int lvl) 
throws IgniteInternalCheckedException {
+            if (lvl != 0) {
+                return false;
+            }
+
+            int cnt = io.getCount(pageAddr);
+
+            if (cnt == 0) {
+                // Empty tree.
+                assert io.getForward(pageAddr, partId) == 0L;
+            } else {
+                assert io.isLeaf() : io;
+                assert cnt > 0 : cnt;
+                assert idx >= 0 || idx == -1 : idx;
+                assert cnt >= idx : "cnt=" + cnt + ", idx=" + idx;
+
+                checkDestroyed();
+
+                if (idx == -1) {
+                    idx = findNextRowIdx(pageAddr, io, cnt);
+                }
+
+                if (cnt - idx > 0) {

Review Comment:
   How about changing the condition to `idx < cnt`? It seems to be easier to 
grasp.



##########
modules/storage-api/src/testFixtures/java/org/apache/ignite/internal/storage/index/AbstractSortedIndexStorageTest.java:
##########
@@ -579,6 +585,401 @@ void testNullValues(ColumnDefinition columnDefinition) 
throws Exception {
         }
     }
 
+    /**
+     * Checks simple scenarios for a scanning cursor.
+     */
+    @Test
+    void testScanSimple() {
+        SortedIndexDefinition indexDefinition = 
SchemaBuilders.sortedIndex("TEST_IDX")
+                
.addIndexColumn(ColumnType.INT32.typeSpec().name()).asc().done()
+                .build();
+
+        SortedIndexStorage indexStorage = createIndexStorage(indexDefinition);
+
+        BinaryTupleRowSerializer serializer = new 
BinaryTupleRowSerializer(indexStorage.indexDescriptor());
+
+        for (int i = 0; i < 5; i++) {
+            put(indexStorage, serializer.serializeRow(new Object[]{i}, new 
RowId(TEST_PARTITION)));
+        }
+
+        // Checking without borders.
+        assertThat(
+                scan(indexStorage, null, null, 0).stream().map(objects -> 
objects[0]).collect(toList()),
+                contains(0, 1, 2, 3, 4)
+        );
+
+        // Let's check without borders.
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(0),
+                                serializer.serializeRowPrefix(4),
+                                (GREATER_OR_EQUAL | LESS_OR_EQUAL)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(0, 1, 2, 3, 4)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(0),
+                                serializer.serializeRowPrefix(4),
+                                (GREATER_OR_EQUAL | LESS)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(0, 1, 2, 3)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(0),
+                                serializer.serializeRowPrefix(4),
+                                (GREATER | LESS_OR_EQUAL)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(1, 2, 3, 4)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(0),
+                                serializer.serializeRowPrefix(4),
+                                (GREATER | LESS)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(1, 2, 3)
+        );
+
+        // Let's check only with the lower bound.
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(1),
+                                null,
+                                (GREATER_OR_EQUAL | LESS_OR_EQUAL)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(1, 2, 3, 4)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(1),
+                                null,
+                                (GREATER_OR_EQUAL | LESS)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(1, 2, 3, 4)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(1),
+                                null,
+                                (GREATER | LESS_OR_EQUAL)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(2, 3, 4)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                serializer.serializeRowPrefix(1),
+                                null,
+                                (GREATER | LESS)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(2, 3, 4)
+        );
+
+        // Let's check only with the upper bound.
+        assertThat(
+                indexStorage.scan(
+                                null,
+                                serializer.serializeRowPrefix(3),
+                                (GREATER_OR_EQUAL | LESS_OR_EQUAL)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(0, 1, 2, 3)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                null,
+                                serializer.serializeRowPrefix(3),
+                                (GREATER_OR_EQUAL | LESS)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(0, 1, 2)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                null,
+                                serializer.serializeRowPrefix(3),
+                                (GREATER | LESS_OR_EQUAL)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(0, 1, 2, 3)
+        );
+
+        assertThat(
+                indexStorage.scan(
+                                null,
+                                serializer.serializeRowPrefix(3),
+                                (GREATER | LESS)
+                        ).stream()
+                        .map(indexRow -> 
serializer.deserializeColumns(indexRow)[0])
+                        .collect(toList()),
+                contains(0, 1, 2)
+        );
+    }
+
+    @Test
+    void testScanContractForEmptyIndex() {
+        SortedIndexDefinition indexDefinition = 
SchemaBuilders.sortedIndex("TEST_IDX")
+                
.addIndexColumn(ColumnType.INT32.typeSpec().name()).asc().done()
+                .build();
+
+        SortedIndexStorage indexStorage = createIndexStorage(indexDefinition);
+
+        BinaryTupleRowSerializer serializer = new 
BinaryTupleRowSerializer(indexStorage.indexDescriptor());
+
+        Cursor<IndexRow> scanBeforeHashNext = indexStorage.scan(null, null, 0);
+
+        Cursor<IndexRow> scanAfterHasNext = indexStorage.scan(null, null, 0);
+
+        assertFalse(scanAfterHasNext.hasNext());
+
+        Cursor<IndexRow> scanWithoutHasNext = indexStorage.scan(null, null, 0);
+
+        put(indexStorage, serializer.serializeRow(new Object[]{0}, new 
RowId(TEST_PARTITION)));
+
+        assertTrue(scanBeforeHashNext.hasNext());
+        assertFalse(scanAfterHasNext.hasNext());
+
+        assertEquals(0, 
serializer.deserializeColumns(scanBeforeHashNext.next())[0]);
+
+        assertThrows(NoSuchElementException.class, scanAfterHasNext::next);

Review Comment:
   This is interesting. `scanAfterHasNext` said that it has nothing 
(`hasNext()` returned `false). Then its `next()` threw an exception saying that 
it has nothing. But after this the cursor starts seeing the next element.
   
   So it looks like the returned cursor does not work over a snapshot of data 
that existed during the cursor creation, but it can cache its view of the 
underlying data (just one row) if `hasNext()` is called. Is this true? If yes, 
probably this contract should be described on the `scan()` method, it's not too 
obvious.



##########
modules/page-memory/src/main/java/org/apache/ignite/internal/pagememory/tree/BplusTree.java:
##########
@@ -1572,6 +1572,34 @@ public final T findOne(L row) throws 
IgniteInternalCheckedException {
         return findOne(row, null, null);
     }
 
+    /**
+     * Searches for the next row from the passed from the arguments.

Review Comment:
   ```suggestion
        * Searches for the row that (strictly or loosely, depending on {@code 
includeRow}) follows the row passed as an argument.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [ignite-3] rpuch commented on a diff in pull request #1407: IGNITE-18322 Define scan contract SortedIndexStorage

Reply via email to