[jira] [Commented] (PHOENIX-6420) Wrong result when conditional and regular upserts are passed in the same commit batch

ASF GitHub Bot (Jira) Fri, 26 Mar 2021 11:02:06 -0700


    [ 
https://issues.apache.org/jira/browse/PHOENIX-6420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17309614#comment-17309614
 ]


ASF GitHub Bot commented on PHOENIX-6420:
-----------------------------------------

gjacoby126 commented on a change in pull request #1183:
URL: https://github.com/apache/phoenix/pull/1183#discussion_r601889798



##########
File path: phoenix-core/src/test/java/org/apache/phoenix/execute/MutationStateTest.java
##########
@@ -17,23 +17,7 @@
  */
 package org.apache.phoenix.execute;
 
-import static org.apache.phoenix.execute.MutationState.joinSortedIntArrays;
-import static org.apache.phoenix.util.TestUtil.TEST_PROPERTIES;
-import static org.junit.Assert.assertArrayEquals;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertTrue;
-import static org.mockito.Mockito.doReturn;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.spy;
-import static org.mockito.Mockito.when;
-
-import java.sql.Connection;
-import java.sql.DriverManager;
-import java.sql.SQLException;
-import java.util.Iterator;
-import java.util.List;
-import java.util.Properties;
-
+import com.google.common.collect.ImmutableList;

Review comment:
       good to use the shaded version of guava for easier porting to 5.x

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/execute/MutationState.java
##########
@@ -404,67 +434,89 @@ public int getNumRows() {
         return numRows;
     }
 
+    private MultiRowMutationState getLastMutationBatch(Map<TableRef, List<MultiRowMutationState>> mutations, TableRef tableRef) {
+        List<MultiRowMutationState> mutationBatches = mutations.get(tableRef);
+        if (mutationBatches == null || mutationBatches.isEmpty()) {
+            return null;
+        }
+        return mutationBatches.get(mutationBatches.size() - 1);
+    }
+
     private void joinMutationState(TableRef tableRef, MultiRowMutationState srcRows,
-            Map<TableRef, MultiRowMutationState> dstMutations) {
+        Map<TableRef, List<MultiRowMutationState>> dstMutations) {
         PTable table = tableRef.getTable();
         boolean isIndex = table.getType() == PTableType.INDEX;
-        boolean incrementRowCount = dstMutations == this.mutations;
-        MultiRowMutationState existingRows = dstMutations.put(tableRef, srcRows);
-        if (existingRows != null) { // Rows for that table already exist
-            // Loop through new rows and replace existing with new
-            for (Map.Entry<ImmutableBytesPtr, RowMutationState> rowEntry : srcRows.entrySet()) {
-                // Replace existing row with new row
-                RowMutationState existingRowMutationState = existingRows.put(rowEntry.getKey(), rowEntry.getValue());
-                if (existingRowMutationState != null) {
-                    Map<PColumn, byte[]> existingValues = existingRowMutationState.getColumnValues();
-                    if (existingValues != PRow.DELETE_MARKER) {
-                        Map<PColumn, byte[]> newRow = rowEntry.getValue().getColumnValues();
-                        // if new row is PRow.DELETE_MARKER, it means delete, and we don't need to merge it with
-                        // existing row.
-                        if (newRow != PRow.DELETE_MARKER) {
-                            // decrement estimated size by the size of the old row
-                            estimatedSize -= existingRowMutationState.calculateEstimatedSize();
-                            // Merge existing column values with new column values
-                            existingRowMutationState.join(rowEntry.getValue());
-                            // increment estimated size by the size of the new row
-                            estimatedSize += existingRowMutationState.calculateEstimatedSize();
-                            // Now that the existing row has been merged with the new row, replace it back
-                            // again (since it was merged with the new one above).
-                            existingRows.put(rowEntry.getKey(), existingRowMutationState);
-                        }
-                    }
-                } else {
-                    if (incrementRowCount && !isIndex) { // Don't count index rows in row count
-                        numRows++;
-                        // increment estimated size by the size of the new row
-                        estimatedSize += rowEntry.getValue().calculateEstimatedSize();
-                    }
-                }
-            }
-            // Put the existing one back now that it's merged
-            dstMutations.put(tableRef, existingRows);
-        } else {
+        boolean incrementRowCount = dstMutations == this.mutationsMap;
+        // we only need to check if the new mutation batch (srcRows) conflicts with the
+        // last mutation batch
+        MultiRowMutationState existingRows = getLastMutationBatch(dstMutations, tableRef);
+
+        if (existingRows == null) { // no rows found for this table
             // Size new map at batch size as that's what it'll likely grow to.
             MultiRowMutationState newRows = new MultiRowMutationState(connection.getMutateBatchSize());
             newRows.putAll(srcRows);
-            dstMutations.put(tableRef, newRows);
+            addMutations(dstMutations, tableRef, newRows);
             if (incrementRowCount && !isIndex) {
                 numRows += srcRows.size();
                 // if we added all the rows from newMutationState we can just increment the
                 // estimatedSize by newMutationState.estimatedSize
                 estimatedSize += srcRows.estimatedSize;
             }
+            return;
+        }
+
+        // for conflicting rows
+        MultiRowMutationState conflictingRows = new MultiRowMutationState(connection.getMutateBatchSize());
+
+        // Rows for this table already exist, check for conflicts
+        for (Map.Entry<ImmutableBytesPtr, RowMutationState> rowEntry : srcRows.entrySet()) {
+            ImmutableBytesPtr key = rowEntry.getKey();
+            RowMutationState newRowMutationState = rowEntry.getValue();
+            RowMutationState existingRowMutationState = existingRows.get(key);
+            if (existingRowMutationState == null) {
+                existingRows.put(key, newRowMutationState);
+                if (incrementRowCount && !isIndex) { // Don't count index rows in row count
+                    numRows++;
+                    // increment estimated size by the size of the new row
+                    estimatedSize += newRowMutationState.calculateEstimatedSize();
+                }
+                continue;
+            }
+            Map<PColumn, byte[]> existingValues = existingRowMutationState.getColumnValues();
+            Map<PColumn, byte[]> newValues = newRowMutationState.getColumnValues();
+            if (existingValues != PRow.DELETE_MARKER && newValues != PRow.DELETE_MARKER) {
+                // Check if we can merge existing column values with new column values
+                long beforeMerge = existingRowMutationState.calculateEstimatedSize();

Review comment:
       nit: maybe beforeMergeSize? This reads like a boolean name to me. 
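
For readers following the hunk above, here is a minimal sketch of the "list of batches per table" idea it introduces. It is not the Phoenix implementation: the class `BatchedMutations`, the `RowMutation` type, and its `conditional` flag are hypothetical stand-ins, and the conflict rule (two mutations on the same row key conflict when either one is a conditional upsert) is an assumption, since the diff is cut off before the actual merge check. As in the diff, only the last batch is checked for conflicts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of one table's pending mutations.
class BatchedMutations {
    // Stand-in for RowMutationState; "conditional" marks an ON DUPLICATE KEY upsert (assumption).
    static final class RowMutation {
        final Map<String, Object> columns = new HashMap<>();
        final boolean conditional;

        RowMutation(boolean conditional) {
            this.conditional = conditional;
        }

        RowMutation set(String column, Object value) {
            columns.put(column, value);
            return this;
        }
    }

    // Ordered list of batches; each batch maps row key -> mutation.
    private final List<Map<String, RowMutation>> batches = new ArrayList<>();

    void join(Map<String, RowMutation> srcRows) {
        if (batches.isEmpty()) { // no rows yet for this table
            batches.add(new HashMap<>(srcRows));
            return;
        }
        // Only the last batch is checked for conflicts.
        Map<String, RowMutation> last = batches.get(batches.size() - 1);
        Map<String, RowMutation> conflicting = new HashMap<>();
        for (Map.Entry<String, RowMutation> entry : srcRows.entrySet()) {
            RowMutation existing = last.get(entry.getKey());
            RowMutation incoming = entry.getValue();
            if (existing == null) {
                last.put(entry.getKey(), incoming); // new row, no conflict
            } else if (!existing.conditional && !incoming.conditional) {
                existing.columns.putAll(incoming.columns); // plain upserts merge in place
            } else {
                conflicting.put(entry.getKey(), incoming); // assumed conflict rule
            }
        }
        if (!conflicting.isEmpty()) {
            batches.add(conflicting); // conflicting rows start a new batch
        }
    }

    int batchCount() {
        return batches.size();
    }

    Map<String, RowMutation> batch(int index) {
        return batches.get(index);
    }
}
```

Under these assumptions, a regular upsert followed by a conditional upsert on the same key yields two batches that can be sent in order, instead of the second silently overwriting the first.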

##########
File path: phoenix-core/src/main/java/org/apache/phoenix/execute/MutationState.java
##########
@@ -586,10 +638,11 @@ public boolean hasNext() {
                     // the tables in the mutations map
                     if (!sendAll) {
                         TableRef key = new TableRef(index);
-                        MultiRowMutationState multiRowMutationState = mutations.remove(key);
+                        List<MultiRowMutationState> multiRowMutationState = mutationsMap.remove(key);
                         if (multiRowMutationState != null) {
                             final List<Mutation> deleteMutations = Lists.newArrayList();
-                            generateMutations(key, mutationTimestamp, serverTimestamp, multiRowMutationState, deleteMutations, null);
+                            // for index table there will only be 1 mutation batch in the list

Review comment:
       why only one mutation batch? What if there are conflicting batches for 
the index?

##########
File path: phoenix-core/src/test/java/org/apache/phoenix/execute/MutationStateTest.java
##########
@@ -210,4 +216,63 @@ public void testPendingMutationsOnDDL() throws Exception {
                     + "( id1 UNSIGNED_INT not null primary key," + "appId1 VARCHAR)");
         }
     }
+
+    @Test
+    public void testOnDupAndUpsertInSameCommitBatch() throws Exception {
+        try (Connection conn = DriverManager.getConnection(getUrl())) {
+            conn.createStatement().execute(
+                "create table MUTATION_TEST1" +
+                    "( id1 UNSIGNED_INT not null primary key," +
+                    "appId1 VARCHAR)");
+            conn.createStatement().execute(
+                "create table MUTATION_TEST2" +
+                    "( id2 UNSIGNED_INT not null primary key," +
+                    "appId2 VARCHAR)");
+
+            conn.createStatement().execute("upsert into MUTATION_TEST1(id1,appId1) values(111,'app1')");
+            conn.createStatement().execute(
+                "upsert into MUTATION_TEST1(id1,appId1) values(111, 'app1') ON DUPLICATE KEY UPDATE appId1 = null");
+            conn.createStatement().execute("upsert into MUTATION_TEST2(id2,appId2) values(222,'app2')");
+            conn.createStatement().execute(
+                "upsert into MUTATION_TEST2(id2,appId2) values(222,'app2') ON DUPLICATE KEY UPDATE appId2 = null");
+
+            final PhoenixConnection pconn = conn.unwrap(PhoenixConnection.class);
+            MutationState state = pconn.getMutationState();
+            assertEquals(2, state.getNumRows());
+
+            int actualPairs = 0;
+            Iterator<Pair<byte[], List<Mutation>>> mutations = state.toMutations();

Review comment:
       A comment here on what you're checking would be helpful. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


> Wrong result when conditional and regular upserts are passed in the same 
> commit batch
> -------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-6420
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6420
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Tanuj Khurana
>            Assignee: Tanuj Khurana
>            Priority: Major
>
> Consider this example:
> {code:java}
> CREATE TABLE T1 (k integer not null primary key, v1 bigint, v2 bigint);
> {code}
> Now consider this batch:
> {code:java}
> UPSERT INTO T1 VALUES(0,0,1);
> UPSERT INTO T1 VALUES(0,1,1) ON DUPLICATE KEY UPDATE v1 = v1 + 2;
> commit();
> {code}
> Expected row state: 0, 2, 1
> Actual: 0, 2, 0
> The column that the conditional expression does not update (v2) keeps its 
> default value. Its value should have been the one set by the regular 
> upsert in the same batch.
>  Now the row exists. Consider another batch of updates:
> {code:java}
> UPSERT INTO T1 VALUES(0, 7, 4);
> UPSERT INTO T1 VALUES(0,1,1) ON DUPLICATE KEY UPDATE v1 = v1 + 2;
> commit();
> {code}
> Expected row state: 0, 2, 1 -> 0, 9, 4
> Actual: 0, 2, 0 -> 0, 4, 0
> The conditional update expression is evaluated and applied on the row state 
> already committed instead of on the regular update in the same batch. Also, 
> v2 still remains 0 (the default value).
>  Now consider the case of a partial regular update following a conditional 
> update:
> {code:java}
> UPSERT INTO T1 (k, v2) VALUES(0,100) ON DUPLICATE KEY UPDATE v1 = v1 + 2;
> UPSERT INTO T1 (k, v2) VALUES (0,125);
> commit();
> {code}
> Expected row state: 0, 9, 4 -> 0, 11, 125
> Actual: 0, 4, 0 -> 0, 4, 125
> Only the regular update is applied and the conditional update is completely 
> ignored.
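
The expected row states in the description follow from applying the statements of each batch strictly in order. The sketch below replays the three batches against an in-memory stand-in for T1; it does not use Phoenix at all. The class `ExpectedUpsertSemantics`, its methods, and the use of 0 for an unset column (mirroring the "default value" wording in the report) are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical in-memory model of T1 (k -> v1, v2); not Phoenix code.
class ExpectedUpsertSemantics {
    static final class Row {
        final long v1;
        final long v2;

        Row(long v1, long v2) {
            this.v1 = v1;
            this.v2 = v2;
        }

        @Override
        public String toString() {
            return v1 + ", " + v2;
        }
    }

    private final Map<Integer, Row> table = new HashMap<>();

    // Regular upsert: a null argument means the column was not listed in the statement.
    void upsert(int k, Long v1, Long v2) {
        Row old = table.get(k);
        long newV1 = v1 != null ? v1 : (old != null ? old.v1 : 0L);
        long newV2 = v2 != null ? v2 : (old != null ? old.v2 : 0L);
        table.put(k, new Row(newV1, newV2));
    }

    // Conditional upsert: insert the values if the row is absent, else apply the update.
    void upsertOnDuplicateKey(int k, Long v1, Long v2, UnaryOperator<Row> onDuplicate) {
        Row old = table.get(k);
        if (old == null) {
            upsert(k, v1, v2);
        } else {
            table.put(k, onDuplicate.apply(old));
        }
    }

    Row get(int k) {
        return table.get(k);
    }

    public static void main(String[] args) {
        ExpectedUpsertSemantics t1 = new ExpectedUpsertSemantics();
        UnaryOperator<Row> addTwoToV1 = r -> new Row(r.v1 + 2, r.v2);

        // Batch 1
        t1.upsert(0, 0L, 1L);
        t1.upsertOnDuplicateKey(0, 1L, 1L, addTwoToV1);
        System.out.println("0, " + t1.get(0)); // prints "0, 2, 1"

        // Batch 2
        t1.upsert(0, 7L, 4L);
        t1.upsertOnDuplicateKey(0, 1L, 1L, addTwoToV1);
        System.out.println("0, " + t1.get(0)); // prints "0, 9, 4"

        // Batch 3: conditional update first, then a partial regular upsert of v2
        t1.upsertOnDuplicateKey(0, null, 100L, addTwoToV1);
        t1.upsert(0, null, 125L);
        System.out.println("0, " + t1.get(0)); // prints "0, 11, 125"
    }
}
```

The "Actual" values in the report differ because the conditional update was evaluated against the already committed row rather than against the earlier statement in the same batch.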



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
