[jira] [Assigned] (PHOENIX-5003) Fix ViewIT.testCreateViewMappedToExistingHbaseTableWithNamespaceMappingEnabled()

2019-02-12 Thread Kadir OZDEMIR (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kadir OZDEMIR reassigned PHOENIX-5003:
--

Assignee: Thomas D'Silva  (was: Kadir OZDEMIR)

> Fix 
> ViewIT.testCreateViewMappedToExistingHbaseTableWithNamespaceMappingEnabled()
> 
>
> Key: PHOENIX-5003
> URL: https://issues.apache.org/jira/browse/PHOENIX-5003
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Thomas D'Silva
>Assignee: Thomas D'Silva
>Priority: Major
>
> FYI @Daniel Wong, this test is failing consistently on the 1.3 branch.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (PHOENIX-5018) Index mutations created by UPSERT SELECT will have wrong timestamps

2019-02-12 Thread Kadir OZDEMIR (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kadir OZDEMIR updated PHOENIX-5018:
---
Attachment: PHOENIX-5018.4.x-HBase-1.3.001.patch

> Index mutations created by UPSERT SELECT will have wrong timestamps
> ---
>
> Key: PHOENIX-5018
> URL: https://issues.apache.org/jira/browse/PHOENIX-5018
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.0, 5.0.0
>Reporter: Geoffrey Jacoby
>Assignee: Kadir OZDEMIR
>Priority: Major
> Attachments: PHOENIX-5018.4.x-HBase-1.3.001.patch, 
> PHOENIX-5018.4.x-HBase-1.4.001.patch, PHOENIX-5018.master.001.patch, 
> PHOENIX-5018.master.002.patch, PHOENIX-5018.master.003.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> When doing a full rebuild (or initial async build) of a local or global index 
> using IndexTool and PhoenixIndexImportDirectMapper, or doing a synchronous 
> initial build of a global index using the index create DDL, we generate the 
> index mutations by using an UPSERT SELECT query from the base table to the 
> index.
> The timestamps of the mutations use the default HBase behavior, which is to 
> take the current wall clock. However, the timestamp of an index KeyValue 
> should use the timestamp of the initial KeyValue in the base table.
> Having base table and index timestamps out of sync can cause all sorts of 
> weird side effects, such as if the base table has data with an expired TTL 
> that isn't expired in the index yet. Also inserting old mutations with new 
> timestamps may overwrite the data that has been newly overwritten by the 
> regular data path during index build, which would lead to data loss and 
> inconsistency issues.
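The timestamp-propagation idea described above can be sketched in isolation. This is a conceptual illustration only, not Phoenix's actual rebuild code; the `Cell` class and method names are invented for the example:

```java
// Conceptual sketch only -- not Phoenix's actual code. It illustrates why an
// index mutation should reuse the base cell's timestamp rather than the wall clock.
public class IndexTimestampSketch {
    // Minimal stand-in for an HBase cell: a value plus a timestamp.
    static final class Cell {
        final String value;
        final long timestamp;
        Cell(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    // Buggy variant: stamps the derived index cell with the current wall-clock
    // time, so base table and index timestamps drift apart.
    static Cell indexCellWallClock(Cell base) {
        return new Cell(base.value, System.currentTimeMillis());
    }

    // Intended behavior: carry the base cell's timestamp forward so TTL expiry
    // and newer-write ordering stay consistent between table and index.
    static Cell indexCellFromBase(Cell base) {
        return new Cell(base.value, base.timestamp);
    }

    public static void main(String[] args) {
        Cell base = new Cell("v1", 1000L); // an old base-table cell
        Cell idx = indexCellFromBase(base);
        if (idx.timestamp != base.timestamp) {
            throw new AssertionError("index timestamp must match base timestamp");
        }
        System.out.println(idx.timestamp); // prints 1000
    }
}
```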





[jira] [Assigned] (PHOENIX-5019) Index mutations created by synchronous index builds will have wrong timestamps

2019-02-12 Thread Kadir OZDEMIR (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kadir OZDEMIR reassigned PHOENIX-5019:
--

Assignee: Kadir OZDEMIR

> Index mutations created by synchronous index builds will have wrong timestamps
> --
>
> Key: PHOENIX-5019
> URL: https://issues.apache.org/jira/browse/PHOENIX-5019
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.0.0, 4.14.1
>Reporter: Vincent Poon
>Assignee: Kadir OZDEMIR
>Priority: Major
>
> Similar to PHOENIX-5018, when we do a synchronous index build, the UPSERT 
> SELECT it runs gives each index mutation the current wall-clock timestamp 
> instead of the timestamp of its data table counterpart.





[jira] [Updated] (PHOENIX-5136) Rows with null values inserted by UPSERT .. ON DUPLICATE KEY UPDATE are included in query results when they shouldn't be

2019-02-12 Thread Hieu Nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hieu Nguyen updated PHOENIX-5136:
-
Description: 
Rows with null values inserted using UPSERT .. ON DUPLICATE KEY UPDATE will be 
selected in queries when they should not be.

Here is a failing test that demonstrates the issue:
{noformat}
@Test
public void testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched() throws Exception {
    Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
    Connection conn = DriverManager.getConnection(getUrl(), props);
    String tableName = generateUniqueName();
    String ddl = " create table " + tableName + "(pk varchar primary key, counter1 bigint, counter2 smallint)";
    conn.createStatement().execute(ddl);
    createIndex(conn, tableName);
    // The data has to start with null in the first counter to make the test fail. If you reverse the values, the test passes.
    String dml1 = "UPSERT INTO " + tableName + " VALUES('a',NULL,2) ON DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter1 END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml1);
    conn.commit();

    String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml2);
    conn.commit();

    // Using this statement causes the test to pass
    //ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + tableName + " WHERE counter2 = 2 AND counter1 = 1");
    // This statement should be equivalent to the one above, but it selects both rows.
    ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + tableName + " WHERE counter2 = 2 AND (counter1 = 1 OR counter1 = 1)");
    assertTrue(rs.next());
    assertEquals("b", rs.getString(1));
    assertEquals(1, rs.getLong(2));
    assertEquals(2, rs.getLong(3));
    assertFalse(rs.next());

    conn.close();
}{noformat}
The conditions are fairly specific:
 * Must use ON DUPLICATE KEY UPDATE.  Inserting rows using UPSERT by itself 
will have correct results
 * The "counter2 = 2 AND (counter1 = 1 OR counter1 = 1)" condition caused the 
test to fail, as opposed to the equivalent but simpler "counter2 = 2 AND 
counter1 = 1".  I tested a similar "counter2 = 2 AND (counter1 = 1 OR counter1 
< 1)", which also caused the test to fail.
 * If the NULL value for row 'a' is instead in the last position (counter2), 
then row 'a' is not selected in the query as expected.  The below test 
demonstrates this behavior (it passes as expected):

{noformat}
@Test
public void testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched() throws Exception {
    Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
    Connection conn = DriverManager.getConnection(getUrl(), props);
    String tableName = generateUniqueName();
    String ddl = " create table " + tableName + "(pk varchar primary key, counter1 bigint, counter2 smallint)";
    conn.createStatement().execute(ddl);
    createIndex(conn, tableName);

    String dml1 = "UPSERT INTO " + tableName + " VALUES('a',1,NULL) ON DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter2 END";
    conn.createStatement().execute(dml1);
    conn.commit();

    String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml2);
    conn.commit();

    ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + tableName + " WHERE counter1 = 1 AND (counter2 = 2 OR counter2 = 2)");
    assertTrue(rs.next());
    assertEquals("b", rs.getString(1));
    assertEquals(1, rs.getLong(2));
    assertEquals(2, rs.getLong(3));
    assertFalse(rs.next());

    conn.close();
}
{noformat}

We also noticed this behavior when upserting and selecting manually against a 
View.

Any ideas on where to look to fix this issue?


[jira] [Created] (PHOENIX-5136) Rows with null values inserted by UPSERT .. ON DUPLICATE KEY UPDATE are included in query results when they shouldn't be

2019-02-12 Thread Hieu Nguyen (JIRA)
Hieu Nguyen created PHOENIX-5136:


 Summary: Rows with null values inserted by UPSERT .. ON DUPLICATE 
KEY UPDATE are included in query results when they shouldn't be
 Key: PHOENIX-5136
 URL: https://issues.apache.org/jira/browse/PHOENIX-5136
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 5.0.0
Reporter: Hieu Nguyen


Rows with null values inserted using UPSERT .. ON DUPLICATE KEY UPDATE will be 
selected in queries when they should not be.

Here is a failing test that demonstrates the issue:
{noformat}
@Test
public void testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched() throws Exception {
    Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
    Connection conn = DriverManager.getConnection(getUrl(), props);
    String tableName = generateUniqueName();
    String ddl = " create table " + tableName + "(pk varchar primary key, counter1 bigint, counter2 smallint)";
    conn.createStatement().execute(ddl);
    createIndex(conn, tableName);
    // The data has to start with null in the first counter to make the test fail. If you reverse the values, the test passes.
    String dml1 = "UPSERT INTO " + tableName + " VALUES('a',NULL,2) ON DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter1 END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml1);
    conn.commit();

    String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml2);
    conn.commit();

    // Using this statement causes the test to pass
    //ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + tableName + " WHERE counter2 = 2 AND counter1 = 1");
    // This statement should be equivalent to the one above, but it selects both rows.
    ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + tableName + " WHERE counter2 = 2 AND (counter1 = 1 OR counter1 = 1)");
    assertTrue(rs.next());
    assertEquals("b", rs.getString(1));
    assertEquals(1, rs.getLong(2));
    assertEquals(2, rs.getLong(3));
    assertFalse(rs.next());

    conn.close();
}{noformat}
The conditions are fairly specific:
 * Must use ON DUPLICATE KEY UPDATE.  Inserting rows using UPSERT by itself 
will have correct results
 * The "counter2 = 2 AND (counter1 = 1 OR counter1 = 1)" condition caused the 
test to fail, as opposed to the equivalent but simpler "counter2 = 2 AND 
counter1 = 1".  I tested a similar "counter2 = 2 AND (counter1 = 1 OR counter1 
< 1)", which also caused the test to fail.
 * If the NULL value for row 'a' is instead in the last position (counter2), 
then row 'a' is not selected in the query as expected.  The below test 
demonstrates this behavior (it passes as expected):

{noformat}
@Test
public void testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched() throws Exception {
    Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
    Connection conn = DriverManager.getConnection(getUrl(), props);
    String tableName = generateUniqueName();
    String ddl = " create table " + tableName + "(pk varchar primary key, counter1 bigint, counter2 smallint)";
    conn.createStatement().execute(ddl);
    createIndex(conn, tableName);

    String dml1 = "UPSERT INTO " + tableName + " VALUES('a',1,NULL) ON DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter2 END";
    conn.createStatement().execute(dml1);
    conn.commit();

    String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml2);
    conn.commit();

    ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + tableName + " WHERE counter1 = 1 AND (counter2 = 2 OR counter2 = 2)");
    assertTrue(rs.next());
    assertEquals("b", rs.getString(1));
    assertEquals(1, rs.getLong(2));
    assertEquals(2, rs.getLong(3));
    assertFalse(rs.next());

    conn.close();
}{noformat}





[jira] [Updated] (PHOENIX-5136) Rows with null values inserted by UPSERT .. ON DUPLICATE KEY UPDATE are included in query results when they shouldn't be

2019-02-12 Thread Hieu Nguyen (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hieu Nguyen updated PHOENIX-5136:
-
Description: 
Rows with null values inserted using UPSERT .. ON DUPLICATE KEY UPDATE will be 
selected in queries when they should not be.

Here is a failing test that demonstrates the issue:
{noformat}
@Test
public void testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched() throws Exception {
    Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
    Connection conn = DriverManager.getConnection(getUrl(), props);
    String tableName = generateUniqueName();
    String ddl = " create table " + tableName + "(pk varchar primary key, counter1 bigint, counter2 smallint)";
    conn.createStatement().execute(ddl);
    createIndex(conn, tableName);
    // The data has to start with null in the first counter to make the test fail. If you reverse the values, the test passes.
    String dml1 = "UPSERT INTO " + tableName + " VALUES('a',NULL,2) ON DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter1 END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml1);
    conn.commit();

    String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml2);
    conn.commit();

    // Using this statement causes the test to pass
    //ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + tableName + " WHERE counter2 = 2 AND counter1 = 1");
    // This statement should be equivalent to the one above, but it selects both rows.
    ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + tableName + " WHERE counter2 = 2 AND (counter1 = 1 OR counter1 = 1)");
    assertTrue(rs.next());
    assertEquals("b", rs.getString(1));
    assertEquals(1, rs.getLong(2));
    assertEquals(2, rs.getLong(3));
    assertFalse(rs.next());

    conn.close();
}{noformat}
The conditions are fairly specific:
 * Must use ON DUPLICATE KEY UPDATE.  Inserting rows using UPSERT by itself 
will have correct results
 * The "counter2 = 2 AND (counter1 = 1 OR counter1 = 1)" condition caused the 
test to fail, as opposed to the equivalent but simpler "counter2 = 2 AND 
counter1 = 1".  I tested a similar "counter2 = 2 AND (counter1 = 1 OR counter1 
< 1)", which also caused the test to fail.
 * If the NULL value for row 'a' is instead in the last position (counter2), 
then row 'a' is not selected in the query as expected.  The below test 
demonstrates this behavior (it passes as expected):

{noformat}
@Test
public void testRowsCreatedViaUpsertOnDuplicateKeyShouldNotBeReturnedInQueryIfNotMatched() throws Exception {
    Properties props = PropertiesUtil.deepCopy(TEST_PROPERTIES);
    Connection conn = DriverManager.getConnection(getUrl(), props);
    String tableName = generateUniqueName();
    String ddl = " create table " + tableName + "(pk varchar primary key, counter1 bigint, counter2 smallint)";
    conn.createStatement().execute(ddl);
    createIndex(conn, tableName);

    String dml1 = "UPSERT INTO " + tableName + " VALUES('a',1,NULL) ON DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN NULL ELSE counter2 END";
    conn.createStatement().execute(dml1);
    conn.commit();

    String dml2 = "UPSERT INTO " + tableName + " VALUES('b',1,2) ON DUPLICATE KEY UPDATE " +
            "counter1 = CASE WHEN (counter1 IS NULL) THEN 1 ELSE counter1 END, " +
            "counter2 = CASE WHEN (counter1 IS NULL) THEN 2 ELSE counter2 END";
    conn.createStatement().execute(dml2);
    conn.commit();

    ResultSet rs = conn.createStatement().executeQuery("SELECT * FROM " + tableName + " WHERE counter1 = 1 AND (counter2 = 2 OR counter2 = 2)");
    assertTrue(rs.next());
    assertEquals("b", rs.getString(1));
    assertEquals(1, rs.getLong(2));
    assertEquals(2, rs.getLong(3));
    assertFalse(rs.next());

    conn.close();
}
{noformat}

We also noticed this behavior when upserting and selecting manually against a 
View.


[jira] [Updated] (PHOENIX-5018) Index mutations created by UPSERT SELECT will have wrong timestamps

2019-02-12 Thread Kadir OZDEMIR (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kadir OZDEMIR updated PHOENIX-5018:
---
Attachment: PHOENIX-5018.master.003.patch

> Index mutations created by UPSERT SELECT will have wrong timestamps
> ---
>
> Key: PHOENIX-5018
> URL: https://issues.apache.org/jira/browse/PHOENIX-5018
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.0, 5.0.0
>Reporter: Geoffrey Jacoby
>Assignee: Kadir OZDEMIR
>Priority: Major
> Attachments: PHOENIX-5018.4.x-HBase-1.4.001.patch, 
> PHOENIX-5018.master.001.patch, PHOENIX-5018.master.002.patch, 
> PHOENIX-5018.master.003.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> When doing a full rebuild (or initial async build) of a local or global index 
> using IndexTool and PhoenixIndexImportDirectMapper, or doing a synchronous 
> initial build of a global index using the index create DDL, we generate the 
> index mutations by using an UPSERT SELECT query from the base table to the 
> index.
> The timestamps of the mutations use the default HBase behavior, which is to 
> take the current wall clock. However, the timestamp of an index KeyValue 
> should use the timestamp of the initial KeyValue in the base table.
> Having base table and index timestamps out of sync can cause all sorts of 
> weird side effects, such as if the base table has data with an expired TTL 
> that isn't expired in the index yet. Also inserting old mutations with new 
> timestamps may overwrite the data that has been newly overwritten by the 
> regular data path during index build, which would lead to data loss and 
> inconsistency issues.





[jira] [Updated] (PHOENIX-5018) Index mutations created by UPSERT SELECT will have wrong timestamps

2019-02-12 Thread Kadir OZDEMIR (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kadir OZDEMIR updated PHOENIX-5018:
---
Attachment: PHOENIX-5018.4.x-HBase-1.4.001.patch

> Index mutations created by UPSERT SELECT will have wrong timestamps
> ---
>
> Key: PHOENIX-5018
> URL: https://issues.apache.org/jira/browse/PHOENIX-5018
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.0, 5.0.0
>Reporter: Geoffrey Jacoby
>Assignee: Kadir OZDEMIR
>Priority: Major
> Attachments: PHOENIX-5018.4.x-HBase-1.4.001.patch, 
> PHOENIX-5018.master.001.patch, PHOENIX-5018.master.002.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> When doing a full rebuild (or initial async build) of a local or global index 
> using IndexTool and PhoenixIndexImportDirectMapper, or doing a synchronous 
> initial build of a global index using the index create DDL, we generate the 
> index mutations by using an UPSERT SELECT query from the base table to the 
> index.
> The timestamps of the mutations use the default HBase behavior, which is to 
> take the current wall clock. However, the timestamp of an index KeyValue 
> should use the timestamp of the initial KeyValue in the base table.
> Having base table and index timestamps out of sync can cause all sorts of 
> weird side effects, such as if the base table has data with an expired TTL 
> that isn't expired in the index yet. Also inserting old mutations with new 
> timestamps may overwrite the data that has been newly overwritten by the 
> regular data path during index build, which would lead to data loss and 
> inconsistency issues.





[jira] [Created] (PHOENIX-5135) Spooled files are not cleaned up for large ORDER BYs

2019-02-12 Thread Abhishek Singh Chouhan (JIRA)
Abhishek Singh Chouhan created PHOENIX-5135:
---

 Summary: Spooled files are not cleaned up for large ORDER BYs
 Key: PHOENIX-5135
 URL: https://issues.apache.org/jira/browse/PHOENIX-5135
 Project: Phoenix
  Issue Type: Bug
Affects Versions: 4.13.1
Reporter: Abhishek Singh Chouhan


For concurrent ORDER BY queries running against a large region (thereby 
leading to spooling), we've seen that temp files are not cleaned up. In most 
cases this happens when the query times out on the client while the server 
side is still doing the work, but in some cases we've seen the files lying 
around even after the query completes.
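A generic defensive pattern against this kind of leak (a sketch, not the actual Phoenix spooling code) is to tie the spool file's lifetime to a try/finally block, with deleteOnExit as a backstop:

```java
import java.io.File;
import java.io.IOException;

public class SpoolCleanupSketch {
    // Create a spool file, use it, and guarantee deletion even if the
    // "query" aborts partway (e.g., a client-side timeout).
    static boolean spoolAndCleanup() throws IOException {
        File spool = File.createTempFile("phoenix-spool", ".bin");
        spool.deleteOnExit(); // backstop in case the finally block never runs
        try {
            // ... write sorted intermediate results to `spool` ...
            return spool.exists();
        } finally {
            // Eager cleanup: don't wait for JVM exit.
            if (!spool.delete()) {
                System.err.println("could not delete " + spool);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(spoolAndCleanup()); // prints "true"; the file is gone afterwards
    }
}
```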





[jira] [Updated] (PHOENIX-5018) Index mutations created by UPSERT SELECT will have wrong timestamps

2019-02-12 Thread Kadir OZDEMIR (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kadir OZDEMIR updated PHOENIX-5018:
---
Attachment: PHOENIX-5018.master.002.patch

> Index mutations created by UPSERT SELECT will have wrong timestamps
> ---
>
> Key: PHOENIX-5018
> URL: https://issues.apache.org/jira/browse/PHOENIX-5018
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 4.14.0, 5.0.0
>Reporter: Geoffrey Jacoby
>Assignee: Kadir OZDEMIR
>Priority: Major
> Attachments: PHOENIX-5018.master.001.patch, 
> PHOENIX-5018.master.002.patch
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> When doing a full rebuild (or initial async build) of a local or global index 
> using IndexTool and PhoenixIndexImportDirectMapper, or doing a synchronous 
> initial build of a global index using the index create DDL, we generate the 
> index mutations by using an UPSERT SELECT query from the base table to the 
> index.
> The timestamps of the mutations use the default HBase behavior, which is to 
> take the current wall clock. However, the timestamp of an index KeyValue 
> should use the timestamp of the initial KeyValue in the base table.
> Having base table and index timestamps out of sync can cause all sorts of 
> weird side effects, such as if the base table has data with an expired TTL 
> that isn't expired in the index yet. Also inserting old mutations with new 
> timestamps may overwrite the data that has been newly overwritten by the 
> regular data path during index build, which would lead to data loss and 
> inconsistency issues.





CDH 6.X road map

2019-02-12 Thread Mahdi Salarkia
Hi
Is there a plan to release an HBase-2.0-CDH (Cloudera 6.x) compatible
version of Phoenix anytime soon?
I can see that Phoenix currently supports older versions of CDH (5.11, ...),
but there doesn't seem to be much work being done for version 6.
P.S.: I'll be happy to help build the CDH 6.x version if given instructions.

Thanks
Mehdi


[jira] [Assigned] (PHOENIX-5068) Autocommit off is not working as expected might be a bug!?

2019-02-12 Thread Xinyi Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinyi Yan reassigned PHOENIX-5068:
--

Assignee: Xinyi Yan

> Autocommit off is not working as expected might be a bug!?
> --
>
> Key: PHOENIX-5068
> URL: https://issues.apache.org/jira/browse/PHOENIX-5068
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Amarnath Ramamoorthi
>Assignee: Xinyi Yan
>Priority: Minor
> Attachments: test_foo_data.sql
>
>
> Autocommit off is behaving strangely; this might be a bug!?
> Here is what we found when using autocommit off.
> A table has only 2 int columns, both part of the primary key, containing 100 
> rows in total.
> With *"autocommit off"*, when we try to upsert values into the same table, it 
> says 200 rows affected.
> It works fine when we run the same UPSERT command with fewer than 100 rows 
> using a WHERE clause, as you can see below.
> There is something wrong with autocommit off for upserts of >= 100 rows.
> {code:java}
> 0: jdbc:phoenix:XXYYZZ> select count(*) from "FOO".DEMO;
> +---+
> | COUNT(1)  |
> +---+
> | 100   |
> +---+
> 1 row selected (0.025 seconds)
> 0: jdbc:phoenix:XXYYZZ> SELECT * FROM "FOO".DEMO WHERE "id_x"=9741;
> ++---+
> | id_x  |   id_y   |
> ++---+
> | 9741   | 63423770  |
> ++---+
> 1 row selected (0.04 seconds)
> 0: jdbc:phoenix:XXYYZZ> !autocommit off
> Autocommit status: false
> 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".DEMO;
> 200 rows affected (0.023 seconds)
> 0: jdbc:phoenix:XXYYZZ> 
> 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".DEMO WHERE 
> "id_x"=9741;
> 1 row affected (0.014 seconds)
> 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".DEMO WHERE 
> "id_x"!=9741;
> 99 rows affected (0.045 seconds)
> 0: jdbc:phoenix:XXYYZZ>
> 0: jdbc:phoenix:XXYYZZ> !autocommit on
> Autocommit status: true
> 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".DEMO;
> 100 rows affected (0.065 seconds)
> {code}
> Tested once again, but now select from different table
> {code:java}
> 0: jdbc:phoenix:XXYYZZ> !autocommit off
> Autocommit status: false
> 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".TEST limit 
> 100;
> 200 rows affected (0.052 seconds)
> 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".TEST limit 
> 99;
> 99 rows affected (0.029 seconds)
> 0: jdbc:phoenix:XXYYZZ> UPSERT INTO "FOO".DEMO SELECT * FROM "FOO".TEST limit 
> 500;
> 1,000 rows affected (0.041 seconds)
> {code}
> Still the same: it shows 1,000 rows affected even though we limited it to 
> 500. It keeps doubling up.
> It would be really helpful if someone could help with this, please.
>  
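One purely hypothetical model of the doubling (a guess, not verified against Phoenix's code): if the scan feeding UPSERT ... SELECT reads both the committed rows and the client-side mutation buffer that autocommit-off maintains, every source row would be counted twice:

```java
import java.util.HashMap;
import java.util.Map;

public class AutocommitCountSketch {
    // Hypothetical model, not Phoenix's verified behavior: with autocommit off,
    // upserted rows sit in a client-side buffer until commit(). If the scan
    // feeding UPSERT ... SELECT also re-reads that pending buffer, every source
    // row is processed twice -- "200 rows affected" for a 100-row table.
    static int simulateRowsAffected(int committedRows) {
        Map<Integer, Integer> committed = new HashMap<>();
        for (int i = 0; i < committedRows; i++) committed.put(i, i);

        Map<Integer, Integer> pendingBuffer = new HashMap<>();
        int rowsAffected = 0;
        // First pass: every committed row becomes a pending mutation.
        for (Integer key : committed.keySet()) {
            pendingBuffer.put(key, committed.get(key));
            rowsAffected++;
        }
        // Buggy second pass: the pending buffer itself is scanned again.
        rowsAffected += pendingBuffer.size();
        return rowsAffected;
    }

    public static void main(String[] args) {
        System.out.println(simulateRowsAffected(100)); // prints 200
    }
}
```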





[jira] [Updated] (PHOENIX-5132) View indexes with different owners but of the same base table can be assigned same ViewIndexId

2019-02-12 Thread Geoffrey Jacoby (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Geoffrey Jacoby updated PHOENIX-5132:
-
Attachment: PHOENIX-5132-4.x-HBase-1.4.v2.patch

> View indexes with different owners but of the same base table can be assigned 
> same ViewIndexId
> --
>
> Key: PHOENIX-5132
> URL: https://issues.apache.org/jira/browse/PHOENIX-5132
> Project: Phoenix
>  Issue Type: Bug
>Affects Versions: 5.0.0, 4.14.1
>Reporter: Geoffrey Jacoby
>Assignee: Geoffrey Jacoby
>Priority: Critical
> Attachments: PHOENIX-5132-4.x-HBase-1.4.patch, 
> PHOENIX-5132-4.x-HBase-1.4.v2.patch, PHOENIX-5132-repro.patch
>
>
> All indexes on views for a particular base table are stored in the same 
> physical HBase table. Phoenix distinguishes them by prepending each row key 
> with an encoded short or long integer called a ViewIndexId. 
> The ViewIndexId is generated by using a sequence to guarantee that each view 
> index id is unique. Unfortunately, the sequence used follows a convention of 
> [SaltByte, Tenant, Schema, BaseTable] for its key, which means that there's a 
> separate sequence for each tenant that owns an index in the view index table. 
> (See MetaDataUtil.getViewIndexSequenceKey) Since all the sequences start at 
> the same value, collisions are not only possible but likely. 
> I've written a test that confirms the ViewIndexId collision. This means it's 
> very likely that query results using one view index could mistakenly include 
> rows from another index, but I haven't confirmed this. 
> All view indexes for a base table, regardless of whether globally or 
> tenant-owned, should use the same sequence. 
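The collision mechanism described above is easy to model (illustrative only; real ViewIndexIds come from HBase-backed sequences, not in-memory counters):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class ViewIndexIdSketch {
    // Per-tenant sequences all start at the same value, so two tenants' first
    // view indexes on the same base table get the SAME id -> collision.
    static long nextIdPerTenant(Map<String, AtomicLong> seqs, String tenant) {
        return seqs.computeIfAbsent(tenant, t -> new AtomicLong(0)).getAndIncrement();
    }

    public static void main(String[] args) {
        Map<String, AtomicLong> perTenant = new HashMap<>();
        long a = nextIdPerTenant(perTenant, "tenant1");
        long b = nextIdPerTenant(perTenant, "tenant2");
        System.out.println(a == b); // true: two distinct indexes, same id

        // Fix sketched in the ticket: one sequence per base table, shared by
        // all tenants, so every view index gets a unique id.
        AtomicLong shared = new AtomicLong(0);
        long c = shared.getAndIncrement();
        long d = shared.getAndIncrement();
        System.out.println(c == d); // false: unique ids
    }
}
```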





[jira] [Created] (PHOENIX-5134) Phoenix Connection Driver #normalize does not distinguish different url with same ZK quorum but different Properties

2019-02-12 Thread Xu Cang (JIRA)
Xu Cang created PHOENIX-5134:


 Summary: Phoenix Connection Driver #normalize does not distinguish 
different url with same ZK quorum but different Properties
 Key: PHOENIX-5134
 URL: https://issues.apache.org/jira/browse/PHOENIX-5134
 Project: Phoenix
  Issue Type: Improvement
Reporter: Xu Cang


In this code
https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixDriver.java#L228


Phoenix now uses a cache to maintain HConnections. The cache's key is generated 
by the 'normalize' method here:
https://github.com/apache/phoenix/blob/master/phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixEmbeddedDriver.java#L312
The normalize method takes the ZK quorum, port, rootNode, principal, and keytab 
into account, but not the properties passed in the URL.

E.g.
Request to create one connection by this URL: 
jdbc:phoenix:localhost:61733;TenantId=1
Request to create another connection by this URL: 
jdbc:phoenix:localhost:61733;TenantId=2

Based on the logic we have, both will resolve to the same HConnection in the 
connection cache. 
This might not be what we really want. 
For example, different tenants may want different HBase configs (such as HBase 
timeout settings). With the same HConnection returned, tenant2's config will be 
ignored silently.
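A sketch of the suggested direction (class and field names are invented for illustration): include the URL properties in the cache key so identical quorums with different properties map to distinct cache entries:

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

public class ConnectionKeySketch {
    // Hypothetical cache key. The existing key is roughly (quorum, port,
    // rootNode, principal, keytab); adding the url properties (e.g. TenantId)
    // keeps connections with different configs from sharing a cache entry.
    static final class ConnKey {
        final String quorum;
        final int port;
        final Map<String, String> props; // copied into a TreeMap for stable equality
        ConnKey(String quorum, int port, Map<String, String> props) {
            this.quorum = quorum;
            this.port = port;
            this.props = new TreeMap<>(props);
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof ConnKey)) return false;
            ConnKey k = (ConnKey) o;
            return port == k.port && quorum.equals(k.quorum) && props.equals(k.props);
        }
        @Override public int hashCode() { return Objects.hash(quorum, port, props); }
    }

    public static void main(String[] args) {
        ConnKey k1 = new ConnKey("localhost", 61733, Map.of("TenantId", "1"));
        ConnKey k2 = new ConnKey("localhost", 61733, Map.of("TenantId", "2"));
        // Different properties -> different cache entries.
        System.out.println(k1.equals(k2)); // prints false
    }
}
```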








[jira] [Updated] (PHOENIX-5128) Provide option to skip header with CsvBulkLoadTool

2019-02-12 Thread Josh Elser (JIRA)


 [ 
https://issues.apache.org/jira/browse/PHOENIX-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated PHOENIX-5128:

Fix Version/s: 5.1.0
   4.15.0

> Provide option to skip header with CsvBulkLoadTool
> --
>
> Key: PHOENIX-5128
> URL: https://issues.apache.org/jira/browse/PHOENIX-5128
> Project: Phoenix
>  Issue Type: New Feature
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Major
> Fix For: 4.15.0, 5.1.0
>
> Attachments: PHOENIX-5128.001.patch
>
>
> We can pretty easily add a feature to skip the "header" row of a CSV file 
> with a thin wrapper around the TextInputFormat/LineRecordReader.
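The wrapper idea can be sketched without the Hadoop classes (a simplified stand-in for a LineRecordReader wrapper; the method and parameter names are invented): skip the first line, but only for the split that starts at byte offset 0, since other splits begin mid-file and must not drop a data row.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class SkipHeaderSketch {
    // Read lines, dropping the header line only when this split is the one
    // that starts at the beginning of the file (splitStart == 0).
    static List<String> readLines(BufferedReader r, long splitStart) throws IOException {
        List<String> rows = new ArrayList<>();
        String line;
        boolean first = true;
        while ((line = r.readLine()) != null) {
            if (first && splitStart == 0) { first = false; continue; } // skip header
            first = false;
            rows.add(line);
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        String csv = "id,name\n1,alice\n2,bob\n";
        List<String> rows = readLines(new BufferedReader(new StringReader(csv)), 0);
        System.out.println(rows); // prints [1,alice, 2,bob]
    }
}
```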


