[GitHub] [phoenix] kadirozde commented on a change in pull request #611: PHOENIX-5535 Replay delete markers during server side global index re…

GitBox Tue, 29 Oct 2019 12:01:42 -0700

kadirozde commented on a change in pull request #611: PHOENIX-5535 Replay 
delete markers during server side global index re…
URL: https://github.com/apache/phoenix/pull/611#discussion_r340266698


 ##########
 File path: phoenix-core/src/it/java/org/apache/phoenix/end2end/IndexToolIT.java
 ##########
 @@ -157,6 +162,70 @@ public static void setup() throws Exception {
         return TestUtil.filterTxParamData(list,0);
     }
 
+    private void setEveryNthRowWithNull(int nrows, int nthRowNull, 
PreparedStatement stmt) throws Exception {
+        for (int i = 1; i <= nrows; i++) {
+            stmt.setInt(1, i);
+            stmt.setInt(2, i + 1);
+            if (i % nthRowNull != 0) {
+                stmt.setInt(3, i * i);
+            } else {
+                stmt.setNull(3, Types.INTEGER);
+            }
+            stmt.execute();
+        }
+    }
+
+    @Test
+    public void testWithSetNull() throws Exception {
+        // This test is for building non-transactional mutable global indexes 
with direct api
+        if (localIndex || transactional || !mutable || !directApi || 
useSnapshot) {
+            return;
+        }
+        // This tests the cases where a column having a null value is 
overwritten with a not null value and vice versa;
+        // and after that the index table is still rebuilt correctly
+        final int NROWS = 2 * 3 * 5 * 7;
+        String schemaName = generateUniqueName();
+        String dataTableName = generateUniqueName();
+        String dataTableFullName = SchemaUtil.getTableName(schemaName, 
dataTableName);
+        String indexTableName = generateUniqueName();
+        String indexTableFullName = SchemaUtil.getTableName(schemaName, 
indexTableName);
+        Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
+        try (Connection conn = DriverManager.getConnection(getUrl(), props)) {
+            conn.createStatement().execute("CREATE TABLE " + dataTableFullName
+                    + " (ID INTEGER NOT NULL PRIMARY KEY, VAL1 INTEGER, VAL2 
INTEGER) "
+                    + tableDDLOptions);
+            String upsertStmt = "UPSERT INTO " + dataTableFullName + " 
VALUES(?,?,?)";
+            PreparedStatement stmt = conn.prepareStatement(upsertStmt);
+            setEveryNthRowWithNull(NROWS, 2, stmt);
+            conn.commit();
+            setEveryNthRowWithNull(NROWS, 3, stmt);
+            conn.commit();
+            conn.createStatement().execute(String.format(
+                    "CREATE %s INDEX %s ON %s (VAL1) INCLUDE (VAL2) ASYNC ",
+                    (localIndex ? "LOCAL" : ""), indexTableName, 
dataTableFullName));
+            // Run the index MR job and verify that the index table is built 
correctly
+            IndexTool indexTool = runIndexTool(directApi, useSnapshot, 
schemaName, dataTableName, indexTableName, null, 0, new String[0]);
+            assertEquals(NROWS, 
indexTool.getJob().getCounters().findCounter(INPUT_RECORDS).getValue());
+            long actualRowCount = IndexScrutiny.scrutinizeIndex(conn, 
dataTableFullName, indexTableFullName);
+            assertEquals(NROWS, actualRowCount);
+            // Check after compaction
+            TestUtil.doMajorCompaction(conn, dataTableFullName);
+            actualRowCount = IndexScrutiny.scrutinizeIndex(conn, 
dataTableFullName, indexTableFullName);
+            assertEquals(NROWS, actualRowCount);
+            setEveryNthRowWithNull(NROWS, 5, stmt);
+            conn.commit();
+            actualRowCount = IndexScrutiny.scrutinizeIndex(conn, 
dataTableFullName, indexTableFullName);
+            assertEquals(NROWS, actualRowCount);
+            setEveryNthRowWithNull(NROWS, 7, stmt);
+            conn.commit();
+            actualRowCount = IndexScrutiny.scrutinizeIndex(conn, 
dataTableFullName, indexTableFullName);
+            assertEquals(NROWS, actualRowCount);
+            TestUtil.doMajorCompaction(conn, dataTableFullName);
+            actualRowCount = IndexScrutiny.scrutinizeIndex(conn, 
dataTableFullName, indexTableFullName);
+            assertEquals(NROWS, actualRowCount);
 
 Review comment:
   Got it. The loop is for getting the maximum timestamp for the mutation of a  
given row and using the max timestamp for the delete markers for the missing 
columns. Since we do not read delete markers, we do not know exactly when the 
delete markers are added or if the row is created with missing columns without 
delete markers in the first place. Even if we attempt to get the delete 
markers, we may not find them because they were not added in the first place or 
were removed by compaction.  So reading delete markers does not help either. 
Now, why we do insert these delete markers here, you may ask. It is to make 
sure that index rebuild code on IndexRegionObserver passes these delete markers 
to the index table. If these markers are not there in the index table for 
missing columns and there is a wrong/stale row in the index table before 
rebuild, the missing columns will inherit these stale/wrong values. 
   Please note that the test for delete markers is pretty involved. It creates 
210 rows and overwrites these rows with different null patterns three times and 
checks if the index table is correct with and without compaction.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

[GitHub] [phoenix] kadirozde commented on a change in pull request #611: PHOENIX-5535 Replay delete markers during server side global index re…

Reply via email to