[ 
https://issues.apache.org/jira/browse/PHOENIX-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067361#comment-14067361
 ] 

Jan Fernando commented on PHOENIX-1100:
---------------------------------------

[~jamestaylor] As promised here is an IT test we can use to repro. Add this to 
UpsertSelectIT. If I remove SALT_BUCKETS= 64 from the DDL for 
DUMMY_SEQ_TEST_DATA, the tests pass which corroborates my theory that 
parallelization is causing this.

{code}
    @Test
    public void testUpsertSelectWithSequenceAndOrderByWithSalting() throws 
Exception {
        
                int numOfRecords = 2000;
        long ts = nextTimestamp();
        Properties props = new Properties();
              props.setProperty(QueryServices.THREAD_POOL_SIZE_ATTRIB, 
Integer.toString(64));
              props.setProperty(QueryServices.QUEUE_SIZE_ATTRIB, 
Integer.toString(500));
              props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, 
Long.toString(ts));
        Connection conn = DriverManager.getConnection(getUrl(), props);
        String ddl = "CREATE TABLE IF NOT EXISTS DUMMY_CURSOR_STORAGE ("
                                        + "ORGANIZATION_ID CHAR(15) NOT NULL, 
QUERY_ID CHAR(15) NOT NULL, CURSOR_ORDER BIGINT NOT NULL, K1 INTEGER, V1 
INTEGER "
                                        + "CONSTRAINT MAIN_PK PRIMARY KEY 
(ORGANIZATION_ID, QUERY_ID, CURSOR_ORDER) "
                                        + ") SALT_BUCKETS = 64";
        conn.createStatement().execute(ddl);
        conn.createStatement().execute("CREATE TABLE DUMMY_SEQ_TEST_DATA " +
                                                                                
                                                 "(ORGANIZATION_ID CHAR(15) NOT 
NULL, k1 integer NOT NULL, v1 integer NOT NULL " +
                                                                                
                                                 "CONSTRAINT PK PRIMARY KEY 
(ORGANIZATION_ID, k1, v1) ) VERSIONS=1, SALT_BUCKETS = 64");
        conn.createStatement().execute("create sequence s cache " + 
Integer.MAX_VALUE);
        conn.close();

        props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts + 
10));
        conn = DriverManager.getConnection(getUrl(), props);
        for (int i = 0; i < numOfRecords; i++) {
                conn.createStatement().execute("UPSERT INTO DUMMY_SEQ_TEST_DATA 
values ('00Dxx0000001gEH'," + i + "," + (i + 2) + ")");
        }
        conn.commit();

        props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts + 
15));
        conn = DriverManager.getConnection(getUrl(), props);
        conn.setAutoCommit(true);
        conn.createStatement().execute("UPSERT INTO DUMMY_CURSOR_STORAGE SELECT 
'00Dxx0000001gEH', 'MyQueryId', NEXT VALUE FOR S, k1, v1  FROM 
DUMMY_SEQ_TEST_DATA ORDER BY K1, V1");

        props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts + 
20));
        conn = DriverManager.getConnection(getUrl(), props);
        ResultSet rs = conn.createStatement().executeQuery("select count(*) 
from DUMMY_CURSOR_STORAGE");
        
        assertTrue(rs.next());
        assertEquals(numOfRecords, rs.getLong(1));
        conn.close();

        props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts + 
25));
        ResultSet rs2 = conn.createStatement().executeQuery("select 
cursor_order, k1, v1 from DUMMY_CURSOR_STORAGE order by cursor_order");
        long seq = 1;
        while (rs2.next()) {
            assertEquals(seq, rs2.getLong("cursor_order"));
            // This value should be the sequence - 1 as we said order by k1 in 
the UPSERT...SELECT, but is not because of sequence processing.
            assertEquals(seq - 1, rs2.getLong("k1"));
            seq++;
        }
        conn.close();
    
    }
{code}

> Upsert Select with Sequence and Order By doesn't result in sequence getting 
> assigned based on requested ordering 
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-1100
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1100
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 4.0.0, 5.0.0
>            Reporter: Jan Fernando
>
> I working with Query More implementation I noticed that once we had 
> sufficient parallelization in the SELECT portion of UPSERT...SELECT due to 
> Salting or region splits the results were not returned in the correct order. 
> The root cause appeared to be due to the fact that we are using sequences to 
> build a cursor of data for our Query More implementation. What appears to be 
> happening is that as we get the next sequence value from the SequenceManager 
> during result processing parallel threads all increment the sequence and 
> therefore the sequence values reflect the processing order of each iterator 
> versus the actual order specified by the order by. The expectation is that 
> the sequence value should reflect the requested order specified by the ORDER 
> BY versus the processing order.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to