[ 
https://issues.apache.org/jira/browse/PHOENIX-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hieu Nguyen updated PHOENIX-5136:
---------------------------------
    Description: 
Rows with null values inserted using UPSERT .. ON DUPLICATE KEY UPDATE will be 
selected in queries when they should not be.

Here is a failing test that demonstrates the issue:
{noformat}
@Test
public void 
testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched() 
throws Exception {
    Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
    Connection conn = DriverManager.getConnection(getUrl(), props);
    String tableName = generateUniqueName();
    String ddl = " create table " + tableName + "(pk varchar primary key, 
counter1 bigint, counter2 smallint)";
    conn.createStatement().execute(ddl);
    createIndex(conn, tableName);
    // The data has to be specifically starting with null for the first counter 
to fail the test. If you reverse the values, the test passes.
    String dml1 = "UPSERT INTO " + tableName + " VALUES('a',NULL,2) ON 
DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter1 
END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml1);
    conn.commit();

    String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE 
KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, 
" +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml2);
    conn.commit();

    // Using this statement causes the test to pass
    //ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + 
tableName + " WHERE counter2 = 2 AND counter1 = 1");
    // This statement should be equivalent to the one above, but it selects 
both rows.
    ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + 
tableName + " WHERE counter2 = 2 AND (counter1 = 1 OR counter1 = 1)");
    assertTrue(rs.next());
    assertEquals("b",rs.getString(1));
    assertEquals(1,rs.getLong(2));
    assertEquals(2,rs.getLong(3));
    assertFalse(rs.next());

    conn.close();
}{noformat}
The conditions are fairly specific:
 * Must use ON DUPLICATE KEY UPDATE.  Inserting rows using UPSERT by itself 
will have correct results
 * The "counter2 = 2 AND (counter1 = 1 OR counter1 = 1)" condition caused the 
test to fail, as opposed to the equivalent but simpler "counter2 = 2 AND 
counter1 = 1".  I tested a similar "counter2 = 2 AND (counter1 = 1 OR counter1 
< 1)", which also caused the test to fail.
 * If the NULL value for row 'a' is instead in the last position (counter2), 
then row 'a' is not selected in the query as expected.  The below test 
demonstrates this behavior (it passes as expected):

{noformat}
@Test
public void 
testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched() 
throws Exception {
    Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
    Connection conn = DriverManager.getConnection(getUrl(), props);
    String tableName = generateUniqueName();
    String ddl = " create table " + tableName + "(pk varchar primary key, 
counter1 bigint, counter2 smallint)";
    conn.createStatement().execute(ddl);
    createIndex(conn, tableName);

    String dml1 = "UPSERT INTO " + tableName + " VALUES('a',1,NULL) ON 
DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, 
" +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter2 
END";
    conn.createStatement().execute(dml1);
    conn.commit();

    String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE 
KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, 
" +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml2);
    conn.commit();

    ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + 
tableName + " WHERE counter1 = 1 AND (counter2 = 2 OR counter2 = 2)");
    assertTrue(rs.next());
    assertEquals("b",rs.getString(1));
    assertEquals(1,rs.getLong(2));
    assertEquals(2,rs.getLong(3));
    assertFalse(rs.next());

    conn.close();
}
{noformat}

We also noticed this behavior when upserting and selecting manually against a 
View.

  was:
Rows with null values inserted using UPSERT .. ON DUPLICATE KEY UPDATE will be 
selected in queries when they should not be.

Here is a failing test that demonstrates the issue:
{noformat}
@Test
public void 
testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched() 
throws Exception {
    Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
    Connection conn = DriverManager.getConnection(getUrl(), props);
    String tableName = generateUniqueName();
    String ddl = " create table " + tableName + "(pk varchar primary key, 
counter1 bigint, counter2 smallint)";
    conn.createStatement().execute(ddl);
    createIndex(conn, tableName);
    // The data has to be specifically starting with null for the first counter 
to fail the test. If you reverse the values, the test passes.
    String dml1 = "UPSERT INTO " + tableName + " VALUES('a',NULL,2) ON 
DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter1 
END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml1);
    conn.commit();

    String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE 
KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, 
" +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml2);
    conn.commit();

    // Using this statement causes the test to pass
    //ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + 
tableName + " WHERE counter2 = 2 AND counter1 = 1");
    // This statement should be equivalent to the one above, but it selects 
both rows.
    ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + 
tableName + " WHERE counter2 = 2 AND (counter1 = 1 OR counter1 = 1)");
    assertTrue(rs.next());
    assertEquals("b",rs.getString(1));
    assertEquals(1,rs.getLong(2));
    assertEquals(2,rs.getLong(3));
    assertFalse(rs.next());

    conn.close();
}{noformat}
The conditions are fairly specific:
 * Must use ON DUPLICATE KEY UPDATE.  Inserting rows using UPSERT by itself 
will have correct results
 * The "counter2 = 2 AND (counter1 = 1 OR counter1 = 1)" condition caused the 
test to fail, as opposed to the equivalent but simpler "counter2 = 2 AND 
counter1 = 1".  I tested a similar "counter2 = 2 AND (counter1 = 1 OR counter1 
< 1)", which also caused the test to fail.
 * If the NULL value for row 'a' is instead in the last position (counter2), 
then row 'a' is not selected in the query as expected.  The below test 
demonstrates this behavior (it passes as expected):

{noformat}
@Test
public void 
testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched() 
throws Exception {
    Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
    Connection conn = DriverManager.getConnection(getUrl(), props);
    String tableName = generateUniqueName();
    String ddl = " create table " + tableName + "(pk varchar primary key, 
counter1 bigint, counter2 smallint)";
    conn.createStatement().execute(ddl);
    createIndex(conn, tableName);

    String dml1 = "UPSERT INTO " + tableName + " VALUES('a',1,NULL) ON 
DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, 
" +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter2 
END";
    conn.createStatement().execute(dml1);
    conn.commit();

    String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE 
KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, 
" +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml2);
    conn.commit();

    ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + 
tableName + " WHERE counter1 = 1 AND (counter2 = 2 OR counter2 = 2)");
    assertTrue(rs.next());
    assertEquals("b",rs.getString(1));
    assertEquals(1,rs.getLong(2));
    assertEquals(2,rs.getLong(3));
    assertFalse(rs.next());

    conn.close();
}{noformat}


> Rows with null values inserted by UPSERT .. ON DUPLICATE KEY UPDATE are 
> included in query results when they shouldn't be
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5136
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5136
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0
>            Reporter: Hieu Nguyen
>            Priority: Major
>
> Rows with null values inserted using UPSERT .. ON DUPLICATE KEY UPDATE will 
> be selected in queries when they should not be.
> Here is a failing test that demonstrates the issue:
> {noformat}
> @Test
> public void 
> testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched()
>  throws Exception {
>     Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
>     Connection conn = DriverManager.getConnection(getUrl(), props);
>     String tableName = generateUniqueName();
>     String ddl = " create table " + tableName + "(pk varchar primary key, 
> counter1 bigint, counter2 smallint)";
>     conn.createStatement().execute(ddl);
>     createIndex(conn, tableName);
>     // The data has to be specifically starting with null for the first 
> counter to fail the test. If you reverse the values, the test passes.
>     String dml1 = "UPSERT INTO " + tableName + " VALUES('a',NULL,2) ON 
> DUPLICATE KEY UPDATE " +
>             "counter1 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter1 
> END, " +
>             "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 
> END";
>     conn.createStatement().execute(dml1);
>     conn.commit();
>     String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE 
> KEY UPDATE " +
>             "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 
> END, " +
>             "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 
> END";
>     conn.createStatement().execute(dml2);
>     conn.commit();
>     // Using this statement causes the test to pass
>     //ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + 
> tableName + " WHERE counter2 = 2 AND counter1 = 1");
>     // This statement should be equivalent to the one above, but it selects 
> both rows.
>     ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + 
> tableName + " WHERE counter2 = 2 AND (counter1 = 1 OR counter1 = 1)");
>     assertTrue(rs.next());
>     assertEquals("b",rs.getString(1));
>     assertEquals(1,rs.getLong(2));
>     assertEquals(2,rs.getLong(3));
>     assertFalse(rs.next());
>     conn.close();
> }{noformat}
> The conditions are fairly specific:
>  * Must use ON DUPLICATE KEY UPDATE.  Inserting rows using UPSERT by itself 
> will have correct results
>  * The "counter2 = 2 AND (counter1 = 1 OR counter1 = 1)" condition caused the 
> test to fail, as opposed to the equivalent but simpler "counter2 = 2 AND 
> counter1 = 1".  I tested a similar "counter2 = 2 AND (counter1 = 1 OR 
> counter1 < 1)", which also caused the test to fail.
>  * If the NULL value for row 'a' is instead in the last position (counter2), 
> then row 'a' is not selected in the query as expected.  The below test 
> demonstrates this behavior (it passes as expected):
> {noformat}
> @Test
> public void 
> testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched()
>  throws Exception {
>     Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
>     Connection conn = DriverManager.getConnection(getUrl(), props);
>     String tableName = generateUniqueName();
>     String ddl = " create table " + tableName + "(pk varchar primary key, 
> counter1 bigint, counter2 smallint)";
>     conn.createStatement().execute(ddl);
>     createIndex(conn, tableName);
>     String dml1 = "UPSERT INTO " + tableName + " VALUES('a',1,NULL) ON 
> DUPLICATE KEY UPDATE " +
>             "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 
> END, " +
>             "counter2 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter2 
> END";
>     conn.createStatement().execute(dml1);
>     conn.commit();
>     String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE 
> KEY UPDATE " +
>             "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 
> END, " +
>             "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 
> END";
>     conn.createStatement().execute(dml2);
>     conn.commit();
>     ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + 
> tableName + " WHERE counter1 = 1 AND (counter2 = 2 OR counter2 = 2)");
>     assertTrue(rs.next());
>     assertEquals("b",rs.getString(1));
>     assertEquals(1,rs.getLong(2));
>     assertEquals(2,rs.getLong(3));
>     assertFalse(rs.next());
>     conn.close();
> }
> {noformat}
> We also noticed this behavior when upserting and selecting manually against a 
> View.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to