[ https://issues.apache.org/jira/browse/PHOENIX-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16716652#comment-16716652 ]

Hadoop QA commented on PHOENIX-5055:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12951315/PHOENIX-5055-4.x-HBase-1.4-v4.patch
  against 4.x-HBase-1.4 branch at commit 
f0881a137c9b1a020d807f4d1651ca139ee1a7be.
  ATTACHMENT ID: 12951315

    {color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

    {color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

    {color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

    {color:red}-1 release audit{color}.  The applied patch generated 1 release 
audit warning (more than the master's current 0 warnings).

    {color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
    +        try (PhoenixConnection conn = 
DriverManager.getConnection(getUrl(), props).unwrap(PhoenixConnection.class)) {
+            conn.createStatement().executeUpdate("CREATE INDEX " + indexName + 
" on "  + tableName + " (C) INCLUDE(D)");
+            conn.createStatement().executeUpdate("UPSERT INTO "  + tableName + 
"(A,B,C,D) VALUES ('A2','B2','C2','D2')");
+            conn.createStatement().executeUpdate("UPSERT INTO "  + tableName + 
"(A,B,C,D) VALUES ('A3','B3', 'C3', null)");
+                        assertEquals("(" + cell.toString() + ") has different 
ts", ts, cell.getTimestamp());
+        // set the batch size (rows) to 2 since three are at least 2 mutations 
when updates a single row
+     * Split the list of mutations into multiple lists. since a single row 
update can contain multiple mutations,
+    public static List<List<Mutation>> getMutationBatchList(long batchSize, 
long batchSizeBytes, List<Mutation> allMutationList) {
+                "Mutation types are put or delete, for one row all mutations 
must be in one batch.");
+            List<Mutation> list = ImmutableList.of(new Put(r3), new Put(r1), 
new Delete(r1), new Put(r2), new Put(r4), new Delete(r4));

    {color:green}+1 core tests{color}.  The patch passed unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/2202//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/2202//artifact/patchprocess/patchReleaseAuditWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-PHOENIX-Build/2202//console

This message is automatically generated.

> Split mutations batches probably affects correctness of index data
> ------------------------------------------------------------------
>
>                 Key: PHOENIX-5055
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5055
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0, 4.14.1
>            Reporter: Jaanai
>            Assignee: Jaanai
>            Priority: Critical
>             Fix For: 5.1.0
>
>         Attachments: ConcurrentTest.java, 
> PHOENIX-5055-4.x-HBase-1.4-v2.patch, PHOENIX-5055-4.x-HBase-1.4-v3.patch, 
> PHOENIX-5055-4.x-HBase-1.4-v4.patch, PHOENIX-5055-v4.x-HBase-1.4.patch
>
>
> To improve performance, we split the list of mutations into multiple 
> batches in MutationState. A single upsert SQL statement with some null 
> values produces two types of KeyValues (Put and DeleteColumn); these 
> KeyValues must carry the same timestamp so that the write remains an atomic 
> operation on the corresponding row key.
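> A batch split must therefore happen only on row boundaries. Below is a 
> minimal sketch of such a splitter (the signature matches the 
> getMutationBatchList line quoted in the QA report above; the grouping logic 
> is an illustrative assumption, not the exact patch code):
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
> import org.apache.hadoop.hbase.client.Mutation;
> import org.apache.hadoop.hbase.util.Bytes;
> 
> /**
>  * Split mutations into batches without ever separating the mutations of a
>  * single row, so a Put and its companion DeleteColumn are committed together
>  * and receive the same server timestamp. Assumes mutations for the same row
>  * are adjacent in allMutationList; a batch may exceed the limits when one
>  * row contributes several mutations.
>  */
> public static List<List<Mutation>> getMutationBatchList(long batchSize,
>         long batchSizeBytes, List<Mutation> allMutationList) {
>     List<List<Mutation>> batches = new ArrayList<>();
>     List<Mutation> current = new ArrayList<>();
>     long currentBytes = 0;
>     byte[] prevRow = null;
>     for (Mutation m : allMutationList) {
>         boolean rowBoundary =
>                 prevRow == null || !Bytes.equals(prevRow, m.getRow());
>         // Only close a batch on a row boundary, never between the
>         // mutations of one row.
>         if (rowBoundary && !current.isEmpty()
>                 && (current.size() >= batchSize
>                         || currentBytes >= batchSizeBytes)) {
>             batches.add(current);
>             current = new ArrayList<>();
>             currentBytes = 0;
>         }
>         current.add(m);
>         currentBytes += m.heapSize();
>         prevRow = m.getRow();
>     }
>     if (!current.isEmpty()) {
>         batches.add(current);
>     }
>     return batches;
> }
> {code}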
> [^ConcurrentTest.java] generated random upsert/delete SQL statements and 
> executed them concurrently; some SQL snippets follow:
> {code:java}
> 1149:UPSERT INTO ConcurrentReadWritTest(A,C,E,F,G) VALUES 
> ('3826','2563','3052','3170','3767');
> 1864:UPSERT INTO ConcurrentReadWritTest(A,B,C,D,E,F,G) VALUES 
> ('2563','4926','3526','678',null,null,'1617');
> 2332:UPSERT INTO ConcurrentReadWritTest(A,B,C,D,E,F,G) VALUES 
> ('1052','2563','1120','2314','1456',null,null);
> 2846:UPSERT INTO ConcurrentReadWritTest(A,B,C,D,G) VALUES 
> ('1922','146',null,'469','2563');
> 2847:DELETE FROM ConcurrentReadWritTest WHERE A = '2563';
> {code}
> Querying the index tables with sqlline showed incorrect index data.
> !https://gw.alicdn.com/tfscom/TB1nSDqpxTpK1RjSZFGXXcHqFXa.png|width=665,height=400!
> Debugging the batched mutations on the server side showed that the 
> DeleteColumns and Puts for a single upsert were split into different 
> batches, and the DeleteFamily was executed by another thread; as a result, 
> under multiple threads the DeleteColumns' timestamp was larger than the 
> DeleteFamily's.
> !https://gw.alicdn.com/tfscom/TB1frHmpCrqK1RjSZK9XXXyypXa.png|width=901,height=120!
>  
> Running the following:
> {code:java}
> conn.createStatement().executeUpdate("CREATE TABLE " + tableName + " (" +
>     "A VARCHAR NOT NULL PRIMARY KEY," + "B VARCHAR," + "C VARCHAR," +
>     "D VARCHAR) COLUMN_ENCODED_BYTES = 0");
> conn.createStatement().executeUpdate("CREATE INDEX " + indexName + " on " +
>     tableName + " (C) INCLUDE(D)");
> conn.createStatement().executeUpdate("UPSERT INTO " + tableName +
>     "(A,B,C,D) VALUES ('A2','B2','C2','D2')");
> conn.createStatement().executeUpdate("UPSERT INTO " + tableName +
>     "(A,B,C,D) VALUES ('A3','B3', 'C3', null)");
> {code}
> Dump of the IndexMemStore:
> {code:java}
> phoenix.hbase.index.covered.data.IndexMemStore(117): 
> Inserting:\x01A3/0:D/1542190446218/DeleteColumn/vlen=0/seqid=0/value= 
> phoenix.hbase.index.covered.data.IndexMemStore(133): Current kv state: 
> phoenix.hbase.index.covered.data.IndexMemStore(135): KV: 
> \x01A3/0:B/1542190446167/Put/vlen=2/seqid=5/value=B3 
> phoenix.hbase.index.covered.data.IndexMemStore(135): KV: 
> \x01A3/0:C/1542190446167/Put/vlen=2/seqid=5/value=C3 
> phoenix.hbase.index.covered.data.IndexMemStore(135): KV: 
> \x01A3/0:D/1542190446218/DeleteColumn/vlen=0/seqid=0/value= 
> phoenix.hbase.index.covered.data.IndexMemStore(135): KV: 
> \x01A3/0:_0/1542190446167/Put/vlen=1/seqid=5/value=x 
> phoenix.hbase.index.covered.data.IndexMemStore(137): ========== END MemStore 
> Dump ==================
> {code}
>  
> The DeleteColumn's timestamp is larger than that of the other mutations.
>  
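> A quick way to check the invariant, in the spirit of the assertEquals line 
> quoted in the QA report: raw-scan the row and assert that every cell, 
> including the DeleteColumn marker, carries the same timestamp. The 
> assertSingleTimestamp helper below is hypothetical, sketched against the 
> HBase 1.4 client API:
> {code:java}
> import org.apache.hadoop.hbase.Cell;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.client.ResultScanner;
> import org.apache.hadoop.hbase.client.Scan;
> import org.apache.hadoop.hbase.client.Table;
> import static org.junit.Assert.assertEquals;
> 
> // Hypothetical helper: all cells written by one upsert (the Puts and the
> // DeleteColumn for the null column) must share one timestamp.
> static void assertSingleTimestamp(Table table, byte[] row) throws Exception {
>     Scan scan = new Scan(row);
>     scan.setRaw(true);       // a raw scan returns delete markers too
>     scan.setMaxVersions();   // all versions, not just the latest
>     try (ResultScanner scanner = table.getScanner(scan)) {
>         Result result = scanner.next();
>         long ts = result.rawCells()[0].getTimestamp();
>         for (Cell cell : result.rawCells()) {
>             assertEquals("(" + cell + ") has different ts",
>                     ts, cell.getTimestamp());
>         }
>     }
> }
> {code}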



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
