[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-07-16 Thread Chinmay Kulkarni (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545547#comment-16545547
 ] 

Chinmay Kulkarni commented on PHOENIX-3955:
---

[~jamestaylor] I definitely plan to. I've just been caught up with a few things 
at work. Will update this soon. Thanks!

> Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync 
> between the physical data table and index tables
> --
>
> Key: PHOENIX-3955
> URL: https://issues.apache.org/jira/browse/PHOENIX-3955
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Chinmay Kulkarni
>Priority: Major
>
> We need to make sure that indexes inherit the REPLICATION_SCOPE, 
> KEEP_DELETED_CELLS and TTL properties from the base table. Otherwise we can 
> run into situations where the data was removed (or not removed) from the data 
> table but was removed (or not removed) from the index. Or vice-versa. We also 
> need to make sure that any ALTER TABLE SET TTL or ALTER TABLE SET 
> KEEP_DELETED_CELLS statements propagate the properties to the indexes too.
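
To make the failure mode in the description concrete, here is a minimal sketch in
plain Java with no HBase or Phoenix dependencies. It assumes a simplified model in
which a cell is readable while its age is below the table's TTL; the class and
method names are hypothetical and are not part of Phoenix.

```java
/** Hypothetical model: a cell is readable while its age is strictly below the TTL. */
final class TtlDriftDemo {

    /** Returns true if a cell written at writeTs is still readable at nowTs. */
    static boolean isLive(long writeTs, long nowTs, long ttlSeconds) {
        return nowTs - writeTs < ttlSeconds;
    }

    /** Returns true when the data table and the index disagree on whether the row exists. */
    static boolean diverged(long writeTs, long nowTs, long dataTtl, long indexTtl) {
        return isLive(writeTs, nowTs, dataTtl) != isLive(writeTs, nowTs, indexTtl);
    }

    public static void main(String[] args) {
        // Data table TTL = 1200s, index TTL = 10s: 60s after the write the
        // index row has expired while the data row is still readable.
        System.out.println(diverged(0, 60, 1200, 10));   // true
        // Same TTL on both sides: the two tables always agree.
        System.out.println(diverged(0, 60, 1200, 1200)); // false
    }
}
```

With mismatched TTLs a query served from the index can return a different result
than the same query served from the data table, which is the inconsistency this
issue is meant to rule out.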



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-07-11 Thread James Taylor (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540238#comment-16540238
 ] 

James Taylor commented on PHOENIX-3955:
---

You've made great progress on this one, [~ckulkarni]. Will you be able to see 
this one through? It's pretty high on the prioritized list here: 
https://docs.google.com/document/d/16BWU73qlnCQdxCflM3LScyIxE_OGi76ZYT2WHUfewiE



[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508746#comment-16508746
 ] 

ASF GitHub Bot commented on PHOENIX-3955:
-

Github user twdsilva commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/304#discussion_r194546247
  
--- Diff: 
phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java
 ---
@@ -767,50 +748,117 @@ private void 
modifyColumnFamilyDescriptor(HColumnDescriptor hcd, Map 
ensureTableColumnFamilyPropsInSync(String tableName, byte[] defaultFamilyBytes) 
throws SQLException {
+        HTableDescriptor tableDescriptor = getTableDescriptor(Bytes.toBytes(tableName));
+        HColumnDescriptor[] colFamilies = tableDescriptor.getColumnFamilies();
+        HColumnDescriptor  defaultColDescriptor = tableDescriptor.getFamily(defaultFamilyBytes);
+        // It's possible that the table has specific column families and none of them are declared to be the DEFAULT_COLUMN_FAMILY
+        defaultColDescriptor = defaultColDescriptor != null ? defaultColDescriptor : colFamilies[0];
+
+        for (String propName: MetaDataUtil.PROPERTIES_TO_KEEP_IN_SYNC_AMONG_COL_FAMS_AND_INDEXES) {
+            if (defaultColDescriptor.getValue(propName) == null) {
--- End diff --

Do you need to throw UpgradeRequiredException  if the default column table 
property is null here?




[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508747#comment-16508747
 ] 

ASF GitHub Bot commented on PHOENIX-3955:
-

Github user twdsilva commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/304#discussion_r194544646
  
--- Diff: 
phoenix-core/src/main/java/org/apache/phoenix/util/MetaDataUtil.java ---
@@ -651,6 +653,10 @@ public static String getJdbcUrl(RegionCoprocessorEnvironment env) {
             + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + zkParentNode;
     }
 
+    public static boolean isPropertyAllowedToBeOutOfSyncAmongColFamsAndIndexes(String colFamProp) {
--- End diff --

Just return .contains since the callers always negate the boolean returned 
by this function.
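
A minimal sketch of what that suggestion could look like, in plain Java. The
class, constant, and method names below are hypothetical (they are not the ones
in the patch), and the set contents are taken from the three properties named in
this issue.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper illustrating a positive "contains" check. */
final class SyncedProps {

    // The three column family properties this issue wants kept in sync.
    static final Set<String> SYNCED_COL_FAM_PROPERTIES = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("TTL", "KEEP_DELETED_CELLS", "REPLICATION_SCOPE")));

    /** Returns true if the property must match across column families and indexes. */
    static boolean isPropertySynced(String colFamProp) {
        return SYNCED_COL_FAM_PROPERTIES.contains(colFamProp);
    }
}
```

Callers can then write `if (SyncedProps.isPropertySynced(name))` directly instead
of negating an "allowed to be out of sync" predicate at every call site.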




[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508748#comment-16508748
 ] 

ASF GitHub Bot commented on PHOENIX-3955:
-

Github user twdsilva commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/304#discussion_r194545366
  
--- Diff: 
phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java
 ---
@@ -767,50 +748,117 @@ private void 
modifyColumnFamilyDescriptor(HColumnDescriptor hcd, Map 
ensureTableColumnFamilyPropsInSync(String tableName, byte[] defaultFamilyBytes) 
throws SQLException {
+        HTableDescriptor tableDescriptor = getTableDescriptor(Bytes.toBytes(tableName));
+        HColumnDescriptor[] colFamilies = tableDescriptor.getColumnFamilies();
+        HColumnDescriptor  defaultColDescriptor = tableDescriptor.getFamily(defaultFamilyBytes);
+        // It's possible that the table has specific column families and none of them are declared to be the DEFAULT_COLUMN_FAMILY
+        defaultColDescriptor = defaultColDescriptor != null ? defaultColDescriptor : colFamilies[0];
+
+        for (String propName: MetaDataUtil.PROPERTIES_TO_KEEP_IN_SYNC_AMONG_COL_FAMS_AND_INDEXES) {
+            if (defaultColDescriptor.getValue(propName) == null) {
+                if (!isUpgradeRequired()) {
+                    // We cannot have a null value for any of the properties that need to be kept in sync amongst all column families
+                    logger.error("Cannot have null value for column family property: " + propName);
+                    setUpgradeRequired();
+                    throw new UpgradeRequiredException();
+                } else {
+                    defaultColDescriptor.setValue(propName, HColumnDescriptor.getDefaultValues().get(propName));
+                }
+            }
+        }
+        // Used in the upgrade path to actually fix the table descriptor by syncing properties
+        HTableDescriptor syncedTableDescriptor = new HTableDescriptor(tableDescriptor);
+        // Check that these properties are in sync amongst all column families of the table
+        for (HColumnDescriptor family: colFamilies) {
+            if (isUpgradeRequired()) {
+                family = syncedTableDescriptor.getFamily(family.getName());
+            }
+            for (String propName: MetaDataUtil.PROPERTIES_TO_KEEP_IN_SYNC_AMONG_COL_FAMS_AND_INDEXES) {
+                if (!family.getValue(propName).toUpperCase(Locale.ROOT).equals(defaultColDescriptor.getValue(propName).toUpperCase(Locale.ROOT))) {
--- End diff --

Do you need the toUpperCase(Locale.ROOT) ?
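
If a case-insensitive comparison does turn out to be needed (for example if one
family stores 'false' and another 'FALSE', which this thread does not confirm),
one possible alternative to upper-casing both sides is `String.equalsIgnoreCase`.
The sketch below is plain Java with a hypothetical helper name; it also treats a
missing (null) value explicitly instead of letting the comparison throw.

```java
/** Hypothetical helper: null-safe, case-insensitive comparison of property values. */
final class PropCompare {

    /** Returns true if both values are null, or both are non-null and equal ignoring case. */
    static boolean sameValue(String a, String b) {
        if (a == null || b == null) {
            return a == b; // equal only when both are null
        }
        return a.equalsIgnoreCase(b);
    }
}
```

For the plain ASCII values these properties take, this should behave the same as
comparing the two `toUpperCase(Locale.ROOT)` results.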




[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508745#comment-16508745
 ] 

ASF GitHub Bot commented on PHOENIX-3955:
-

Github user twdsilva commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/304#discussion_r194539685
  
--- Diff: 
phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java
 ---
@@ -98,14 +84,7 @@
 import javax.annotation.concurrent.GuardedBy;
 
 import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.hbase.HColumnDescriptor;
-import org.apache.hadoop.hbase.HConstants;
-import org.apache.hadoop.hbase.HRegionLocation;
-import org.apache.hadoop.hbase.HTableDescriptor;
-import org.apache.hadoop.hbase.NamespaceDescriptor;
-import org.apache.hadoop.hbase.NamespaceNotFoundException;
-import org.apache.hadoop.hbase.TableExistsException;
-import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.*;
--- End diff --

Are you using the phoenix code template 
https://phoenix.apache.org/develop.html ?




[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-06-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508744#comment-16508744
 ] 

ASF GitHub Bot commented on PHOENIX-3955:
-

Github user twdsilva commented on the issue:

https://github.com/apache/phoenix/pull/304
  
I think it is better if you add code to UpgradeUtil that runs at the next 
major upgrade, checks all the tables, and ensures these three property 
values are in sync (see addParentToChildLinks for an example). Then you 
wouldn't need to validate that the properties are in sync in multiple places.




[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-06-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498331#comment-16498331
 ] 

ASF GitHub Bot commented on PHOENIX-3955:
-

Github user ChinmaySKulkarni commented on the issue:

https://github.com/apache/phoenix/pull/304
  
Ran into [PHOENIX-3167](https://issues.apache.org/jira/browse/PHOENIX-3167) 
when doing some testing for this JIRA. 

This can be a bigger problem now that we force an upgrade in case column 
families are out of sync for a table.

Steps to repro:
create table test (id INTEGER not null primary key, name varchar(10)) TTL=1200,KEEP_DELETED_CELLS='false',REPLICATION_SCOPE='1';
alter table test add cf1.random varchar(10);
-> New column family cf1 inherits TTL, KEEP_DELETED_CELLS and REPLICATION_SCOPE properties
alter table test set TTL=10;
create table if not exists test (id INTEGER not null primary key, name varchar(10)) TTL=1200,KEEP_DELETED_CELLS='false',REPLICATION_SCOPE='1';
-> Modifies HBase metadata for the table so TTL for default cf is changed to 1200, whereas newly added cf1's TTL is still 10.

With this patch, at this point, any future alter table command or create 
index will fail with UpgradeRequiredException since column family properties 
are out of sync.

Even if we run the second create table with 'if not exists', the behavior is 
the same. A similar problem is seen with alter table for a global index. See 
[PHOENIX-4743](https://issues.apache.org/jira/browse/PHOENIX-4743)
@twdsilva @vincentpoon 
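
The drift in the repro steps can be modeled with a small in-memory sketch in
plain Java. This is only an illustration of the sequence described above, not
Phoenix or HBase code: the class and method names are hypothetical, and the
default column family name "0" is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory model of per-column-family TTLs. */
final class ReproModel {
    static final String DEFAULT_CF = "0"; // assumed default column family name

    final Map<String, Integer> ttlByFamily = new LinkedHashMap<>();

    /** CREATE TABLE [IF NOT EXISTS]: as in the repro, only the default family is (re)written. */
    void createTable(int ttl) {
        ttlByFamily.put(DEFAULT_CF, ttl);
    }

    /** ALTER TABLE ADD cf.col: the new family inherits the default family's TTL. */
    void addFamily(String cf) {
        ttlByFamily.put(cf, ttlByFamily.get(DEFAULT_CF));
    }

    /** ALTER TABLE SET TTL: applied to every existing family. */
    void setTtl(int ttl) {
        ttlByFamily.replaceAll((cf, old) -> ttl);
    }

    /** Returns true when all families share a single TTL value. */
    boolean inSync() {
        return ttlByFamily.values().stream().distinct().count() <= 1;
    }

    public static void main(String[] args) {
        ReproModel t = new ReproModel();
        t.createTable(1200);   // create table ... TTL=1200
        t.addFamily("CF1");    // alter table test add cf1.random ...
        t.setTtl(10);          // alter table test set TTL=10
        t.createTable(1200);   // second create table rewrites only the default family
        System.out.println(t.ttlByFamily + " inSync=" + t.inSync());
    }
}
```

After the last step the model holds TTL 1200 for the default family and 10 for
CF1, which is the out-of-sync state that would trigger UpgradeRequiredException
with this patch.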




[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-06-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498322#comment-16498322
 ] 

ASF GitHub Bot commented on PHOENIX-3955:
-

Github user ChinmaySKulkarni commented on the issue:

https://github.com/apache/phoenix/pull/304
  
@twdsilva @vincentpoon please review. I will be adding stuff mentioned in 
"TODO", but just wanted your feedback on this initial patch.




[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-06-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498319#comment-16498319
 ] 

ASF GitHub Bot commented on PHOENIX-3955:
-

GitHub user ChinmaySKulkarni opened a pull request:

https://github.com/apache/phoenix/pull/304

PHOENIX-3955: Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL 
properties stay in sync between the physical data table and index tables

**[DO NOT MERGE - WIP patch]**

- Disallow specifying TTL, KEEP_DELETED_CELLS and REPLICATION_SCOPE at
column family level when creating table and also while creating indexes
- Sync these properties in create index code path after checking for
sync of existing column families in data table
- Modified alter table and alter global index code path to ensure
syncing of properties
- Added syncing properties of column families of tables and their
indexes as a step in execute upgrade code path

TODO:
- Add local testing that has been done
- Test old client new server and execute upgrade
- Handle indexes on views
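
As a rough illustration of the first bullet (disallowing these properties at
column family level), the check could look something like the sketch below. It
is plain Java with hypothetical names, it assumes a column-family-qualified
property is written as "CF.PROPERTY", and it returns a boolean where the real
code path would presumably raise a SQLException.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical validator for property names used in CREATE TABLE / CREATE INDEX. */
final class CreateTablePropValidator {

    // Properties that must be identical across all column families and indexes.
    static final Set<String> SYNCED = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("TTL", "KEEP_DELETED_CELLS", "REPLICATION_SCOPE")));

    /** Returns false if a synced property is qualified with a column family, e.g. "CF1.TTL". */
    static boolean isAllowed(String propertyName) {
        int dot = propertyName.indexOf('.');
        if (dot < 0) {
            return true; // table-level property: applies to every column family
        }
        String bareName = propertyName.substring(dot + 1).toUpperCase(Locale.ROOT);
        return !SYNCED.contains(bareName);
    }
}
```

Under this sketch `TTL=1200` and `CF1.COMPRESSION='GZ'` would pass, while
`CF1.KEEP_DELETED_CELLS='true'` would be rejected.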

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ChinmaySKulkarni/phoenix PHOENIX-3955

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/phoenix/pull/304.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #304


commit a370da3ecf1692fbed36463c184e1f4953ad545e
Author: Chinmay Kulkarni 
Date:   2018-06-01T17:36:36Z

PHOENIX-3955: Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL 
properties stay in sync between the physical data table and index tables
[DO NOT MERGE - WIP patch]

- Disallow specifying TTL, KEEP_DELETED_CELLS and REPLICATION_SCOPE at
column family level when creating table and also while creating indexes
- Sync these properties in create index code path after checking for
sync of existing column families in data table
- Modified alter table and alter global index code path to ensure
syncing of properties
- Added syncing properties of column families of tables and their
indexes as a step in execute upgrade code path

TODO:
- Did lots of local testing for corner cases. Need to add tests
- Test old client new server and execute upgrade
- Handle indexes on views






[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-05-17 Thread Thomas D'Silva (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480129#comment-16480129
 ] 

Thomas D'Silva commented on PHOENIX-3955:
-

At upgrade time, you can check all tables and, if you find a table with 
multiple column families whose property values aren't in sync, change them all 
to the value of the default column family. If the table has indexes, change 
their property values to keep them in sync as well. This should cover tables 
created with older clients that might have inconsistent properties, right?
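
A minimal sketch of that upgrade-time step, using plain Java maps in place of
HBase descriptors. All names are hypothetical; each column family and each index
is modeled as a simple property map, and the default column family is assumed to
exist and to carry a value for the property.

```java
import java.util.Map;

/** Hypothetical upgrade step: the default column family is the source of truth. */
final class UpgradeSyncSketch {

    /**
     * Overwrites prop on every column family of the data table and on every index
     * with the default column family's value, and returns the value that was used.
     */
    static String syncProperty(String prop,
                               String defaultCf,
                               Map<String, Map<String, String>> dataTableFamilies,
                               Map<String, Map<String, String>> indexTables) {
        String source = dataTableFamilies.get(defaultCf).get(prop);
        for (Map<String, String> family : dataTableFamilies.values()) {
            family.put(prop, source); // bring every family in line with the default CF
        }
        for (Map<String, String> index : indexTables.values()) {
            index.put(prop, source);  // indexes inherit the same value
        }
        return source;
    }
}
```

Running this once per synced property for each table would leave the data table
and its indexes in agreement, regardless of which client version created them.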



[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-05-17 Thread Chinmay Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480007#comment-16480007
 ] 

Chinmay Kulkarni commented on PHOENIX-3955:
---

[~jamestaylor] [~tdsilva] [~gjacoby] In the upgrade path, I guess we would have 
to do 2 things then: For every table, make sure these properties are in sync 
amongst all column families; and ensure these properties are in sync for each 
index table. In the first case, I guess we can use the default CF as the source 
of truth.

What about the case where a table is created with an old phoenix client and so 
these properties have different values amongst its own column families, and we 
then try to create an index on this table with a new phoenix client? Since the 
base table's properties are out of sync amongst its own CFs, we won't know 
which properties to inherit during index creation. One solution is to force an 
entire upgrade/throw an UpgradeRequiredException, but "EXECUTE UPGRADE" does a 
lot of other stuff which we don't require at this point.

Is it worth the effort to introduce some new command like "SYNC TABLE  " which syncs these properties amongst all its column 
families and also all the indexes of that table?



[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-05-15 Thread Thomas D'Silva (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476567#comment-16476567
 ] 

Thomas D'Silva commented on PHOENIX-3955:
-

As part of the upgrade to the next release we can check index tables to ensure 
these properties are in sync. 



[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-05-15 Thread Geoffrey Jacoby (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476550#comment-16476550
 ] 

Geoffrey Jacoby commented on PHOENIX-3955:
--

[~tdsilva] [~jamestaylor] - if you're not allowed to alter these properties on 
an index table, what would a user do to get an index's properties back in sync 
with a base table if they were created improperly using a prior version of 
Phoenix?



[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-05-15 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476547#comment-16476547
 ] 

James Taylor commented on PHOENIX-3955:
---

Agree, [~tdsilva].



[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-05-15 Thread Thomas D'Silva (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476525#comment-16476525
 ] 

Thomas D'Silva commented on PHOENIX-3955:
-

 

I think KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL should be the same for 
all column families of a table (and any indexes on the table). 

If you alter them on the data table the properties on the index table should be 
kept in sync. You should also not be allowed to alter them on index tables.

WDYT [~jamestaylor]?



[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-05-15 Thread Chinmay Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476523#comment-16476523
 ] 

Chinmay Kulkarni commented on PHOENIX-3955:
---

Why do we allow different values to be set for KEEP_DELETED_CELLS and 
REPLICATION_SCOPE across different column families, but not allow each CF to 
have its own TTL? I don't understand the use case for this. Can we enforce that 
all 3 properties be the same for all CFs of a table? We can disallow users from 
creating indexes with their own values for these properties and force the 
indexes to have the same values as the CFs of the data table.

Accordingly, we should disallow "ALTER TABLE" from changing these 3 properties 
for specific CFs and instead apply the modified property to all CFs of the data 
table and to all indexes as well. What do you think, [~jamestaylor], [~tdsilva]?
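
The table-wide enforcement suggested above could look roughly like the following. This is a sketch under stated assumptions, not Phoenix code: `apply_table_wide` is a hypothetical helper, and the `CF.PROPERTY` naming simply mirrors the CF-qualified form used in CREATE TABLE statements.

```python
SYNCED_PROPERTIES = {"KEEP_DELETED_CELLS", "REPLICATION_SCOPE", "TTL"}


def apply_table_wide(column_families, prop, value):
    """Set `prop` on every column family, rejecting CF-qualified synced names.

    `column_families` maps a CF name to its property dict. Index CFs could be
    included in the same mapping so they receive the same values.
    """
    if "." in prop:
        family, _, name = prop.partition(".")
        if name.upper() in SYNCED_PROPERTIES:
            # Proposed rule: no per-CF values for the three synced properties.
            raise ValueError(f"{name} cannot be set for a single CF ({family})")
        # Other properties may still be CF-specific.
        column_families[family][name] = value
        return
    for props in column_families.values():
        props[prop] = value


cfs = {"0": {}, "CF1": {}}
apply_table_wide(cfs, "KEEP_DELETED_CELLS", "true")
print(cfs)
```

With this check, `apply_table_wide(cfs, "CF1.TTL", 12)` is rejected, while a property outside the synced set, such as a hypothetical `CF1.COMPRESSION`, is still applied to that CF alone.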



[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables

2018-05-11 Thread Chinmay Kulkarni (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471710#comment-16471710
 ] 

Chinmay Kulkarni commented on PHOENIX-3955:
---

Hey [~jamestaylor], [~samarthjain], [~tdsilva],
Here are some points on achieving this, along with some questions I have.
Let's take a simple example. Say I create the base data table with the 
following query:

{code:sql}
CREATE TABLE IF NOT EXISTS z_base_table (
    id INTEGER NOT NULL PRIMARY KEY,
    CF1.host VARCHAR(10),
    flag BOOLEAN
) TTL=12, CF1.KEEP_DELETED_CELLS='true', REPLICATION_SCOPE='1';
{code}

We have the following paths to consider:

1. Create Index code path:
* *Case 1: We create the data table with specific column families and there is 
no default CF*:
In this case, the global index table's default CF and the CFs corresponding to 
all local indexes should keep the default values for REPLICATION_SCOPE and 
KEEP_DELETED_CELLS, as they do now, BUT they should inherit the TTL property 
from the data table's non-local-index CFs. It should be sufficient to check the 
TTL of any one non-local-index CF, since they are all enforced to be the same. 

* *Case 2: The data table has a default CF*:
In this case, the global index table's default CF and the CFs corresponding to 
all local indexes should inherit the REPLICATION_SCOPE, KEEP_DELETED_CELLS and 
TTL properties from the data table's default CF.

* *Question 1*: If we create an index with its own properties, say something 
like:
{code:sql}
CREATE INDEX diff_properties_z_index ON z_base_table(host) 
TTL=5000,KEEP_DELETED_CELLS='true';
{code}
we override the data table's properties, leaving the index table and data table 
properties out of sync. This JIRA might set the expectation that these 
properties are always in sync between the index tables and the data table, so 
should we disallow this henceforth? At the very least, we may want to log that 
the index table and data table properties will be out of sync after executing 
this query.

* *Question 1.1*: Given the above situation, if we later on alter the data 
table, should we blindly also alter the properties of the index tables (given 
that we want them to be in sync), or only alter index table properties in case 
they are equivalent to the data table properties?

* "Create index code path" changes should be achievable by changes in 
_CQSI.generateTableDescriptor_ before we apply specific properties of the index 
tables themselves.
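
The two create-index cases above can be sketched as follows. This is an illustration in plain Python, not the actual _CQSI.generateTableDescriptor_ logic: `inherited_index_properties` is a hypothetical name, the default values are HBase's column family defaults as I understand them, and "0" is assumed to be Phoenix's default CF name (consistent with the "L#0" family mentioned in this comment).

```python
# HBase column family defaults (a TTL of 2147483647 seconds means "forever").
HBASE_DEFAULTS = {
    "TTL": 2147483647,
    "KEEP_DELETED_CELLS": "FALSE",
    "REPLICATION_SCOPE": 0,
}
DEFAULT_CF = "0"  # assumed name of Phoenix's default column family


def inherited_index_properties(data_table_cfs):
    """Return the properties a new index CF should start with.

    `data_table_cfs` maps each non-local-index CF of the data table to its
    property dict and is assumed to be non-empty. Index-specific properties,
    if still allowed, would be applied on top of the result afterwards.
    """
    props = dict(HBASE_DEFAULTS)
    if DEFAULT_CF in data_table_cfs:
        # Case 2: inherit all three properties from the default CF.
        source = data_table_cfs[DEFAULT_CF]
        for key in HBASE_DEFAULTS:
            if key in source:
                props[key] = source[key]
    else:
        # Case 1: only TTL is inherited; any CF will do because TTL is
        # already enforced to be identical across CFs.
        any_cf = next(iter(data_table_cfs.values()))
        if "TTL" in any_cf:
            props["TTL"] = any_cf["TTL"]
    return props


# z_base_table from the example, assuming unqualified properties apply to
# every CF and CF1.KEEP_DELETED_CELLS applies to CF1 only.
z_base_table = {
    "0": {"TTL": 12, "REPLICATION_SCOPE": 1},
    "CF1": {"TTL": 12, "KEEP_DELETED_CELLS": "TRUE", "REPLICATION_SCOPE": 1},
}
print(inherited_index_properties(z_base_table))
```

Under these assumptions the index would start with TTL=12 and REPLICATION_SCOPE=1 from the default CF, and would not pick up the KEEP_DELETED_CELLS value that was set only on CF1.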

2. Alter table set  code path:
* Here we can keep track of properties to be applied to 
_QueryConstants.ALL_FAMILY_PROPERTIES_KEY_ and not to specific CFs. In case we 
are changing TTL, REPLICATION_SCOPE or KEEP_DELETED_CELLS for all families, we 
will alter the properties for index table CFs as well.

* *Case 1: Global Index Tables:*
We can have _CQSI.separateAndValidateProperties_ return a _Map>_ and then later store all tabledescs and 
call _sendHBaseMetaData_() with this list of changes (which will now include 
GLOBAL index table changes as well). 

* *Case 2: Local Indexes:*
Can we simply change the column descriptor for the local index CF of the data 
table? I'm not sure if this makes sense, so feel free to shed some light on 
this case.

* *Question 2:* If I create a local index on a CF-specific column like:
{code:sql}
CREATE LOCAL INDEX cf_specific_z_index ON z_base_table(host);
{code}
then shouldn't the local index be using a CF of "L#CF1" instead of the default 
"L#0"? In sqlline, when I run _select * from cf_specific_z_index;_, I see the 
column as _CF1:Host_, but when I run _desc 'z_base_table'_ in the HBase shell, 
I see the CF name as "L#0". 

* *Question 3:* How do we handle the case of multiple local indexes created on 
the same table? If I run the following:
{code:sql}
CREATE LOCAL INDEX local_z_index1 ON z_base_table(host) 
TTL=,KEEP_DELETED_CELLS='true';
CREATE LOCAL INDEX local_z_index2 ON z_base_table(flag) 
TTL=,KEEP_DELETED_CELLS='false';
{code}
The actual HBase metadata change only reflects the last statement, since both 
local indexes map to the 'L#0' column family. Please let me know if this is 
handled at the Phoenix layer and I'm missing something.
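
To make the alter-table path from item 2 concrete, here is a rough sketch of collecting the descriptor changes for one ALTER TABLE before sending them to HBase. It is plain Python, not Phoenix code: `plan_alter` and the table names are hypothetical, and limiting local index CFs to the three synced properties is an assumption of this sketch.

```python
SYNCED_PROPERTIES = {"KEEP_DELETED_CELLS", "REPLICATION_SCOPE", "TTL"}
LOCAL_INDEX_CF_PREFIX = "L#"


def plan_alter(data_table, global_index_tables, changes):
    """Return {table name: {CF name: properties to set}} for one ALTER TABLE.

    `data_table` and each entry of `global_index_tables` are
    (name, [CF names]) pairs; `changes` holds table-wide properties only.
    """
    synced = {k: v for k, v in changes.items() if k in SYNCED_PROPERTIES}
    name, families = data_table
    table_plan = {}
    for cf in families:
        # Local index CFs (e.g. "L#0") live in the data table itself; in this
        # sketch they only receive the synced subset of the changes.
        props = synced if cf.startswith(LOCAL_INDEX_CF_PREFIX) else changes
        if props:
            table_plan[cf] = dict(props)
    plan = {name: table_plan}
    if synced:
        # Global index tables are separate tables that need their own update.
        for index_name, index_families in global_index_tables:
            plan[index_name] = {cf: dict(synced) for cf in index_families}
    return plan


plan = plan_alter(
    ("Z_BASE_TABLE", ["0", "CF1", "L#0"]),
    [("Z_INDEX", ["0"])],
    {"TTL": 5000},
)
print(plan)
```

Because every local index shares the single "L#0" family in this model, the plan contains only one entry for all of them, which matches the behaviour described in Question 3: per-index values for these properties cannot coexist at the HBase level.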

> Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync 
> between the physical data table and index tables
> --
>
> Key: PHOENIX-3955
> URL: https://issues.apache.org/jira/browse/PHOENIX-3955
> Project: Phoenix
>  Issue Type: Bug
>Reporter: Samarth Jain
>Assignee: Chinmay Kulkarni
>Priority: Major
>
> We need to make sure that indexes inherit the REPLICATION_SCOPE, 
> KEEP_DELETED_CELLS and TTL properties from the base table. Otherwise we can 
> run into situations where the data was removed (or not removed) from the data 
> table but was