[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545547#comment-16545547 ] Chinmay Kulkarni commented on PHOENIX-3955: --- [~jamestaylor] I definitely plan to..just been caught up with a few things at work. Will update this soon. Thanks! > Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync > between the physical data table and index tables > -- > > Key: PHOENIX-3955 > URL: https://issues.apache.org/jira/browse/PHOENIX-3955 > Project: Phoenix > Issue Type: Bug >Reporter: Samarth Jain >Assignee: Chinmay Kulkarni >Priority: Major > > We need to make sure that indexes inherit the REPLICATION_SCOPE, > KEEP_DELETED_CELLS and TTL properties from the base table. Otherwise we can > run into situations where the data was removed (or not removed) from the data > table but was removed (or not removed) from the index. Or vice-versa. We also > need to make sure that any ALTER TABLE SET TTL or ALTER TABLE SET > KEEP_DELETED_CELLS statements propagate the properties to the indexes too. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540238#comment-16540238 ] James Taylor commented on PHOENIX-3955: --- You've made great progress on this one, [~ckulkarni]. Will you be able to see this one through? It's pretty high on the prioritized list here: https://docs.google.com/document/d/16BWU73qlnCQdxCflM3LScyIxE_OGi76ZYT2WHUfewiE
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508746#comment-16508746 ] ASF GitHub Bot commented on PHOENIX-3955: - Github user twdsilva commented on a diff in the pull request: https://github.com/apache/phoenix/pull/304#discussion_r194546247
--- Diff: phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java ---
@@ -767,50 +748,117 @@ private void modifyColumnFamilyDescriptor(HColumnDescriptor hcd, Map ensureTableColumnFamilyPropsInSync(String tableName, byte[] defaultFamilyBytes) throws SQLException {
+    HTableDescriptor tableDescriptor = getTableDescriptor(Bytes.toBytes(tableName));
+    HColumnDescriptor[] colFamilies = tableDescriptor.getColumnFamilies();
+    HColumnDescriptor defaultColDescriptor = tableDescriptor.getFamily(defaultFamilyBytes);
+    // It's possible that the table has specific column families and none of them are declared to be the DEFAULT_COLUMN_FAMILY
+    defaultColDescriptor = defaultColDescriptor != null ? defaultColDescriptor : colFamilies[0];
+
+    for (String propName: MetaDataUtil.PROPERTIES_TO_KEEP_IN_SYNC_AMONG_COL_FAMS_AND_INDEXES) {
+        if (defaultColDescriptor.getValue(propName) == null) {
--- End diff --
Do you need to throw UpgradeRequiredException if the default column table property is null here?
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508747#comment-16508747 ] ASF GitHub Bot commented on PHOENIX-3955: - Github user twdsilva commented on a diff in the pull request: https://github.com/apache/phoenix/pull/304#discussion_r194544646
--- Diff: phoenix-core/src/main/java/org/apache/phoenix/util/MetaDataUtil.java ---
@@ -651,6 +653,10 @@ public static String getJdbcUrl(RegionCoprocessorEnvironment env) {
         + PhoenixRuntime.JDBC_PROTOCOL_SEPARATOR + zkParentNode;
 }

+public static boolean isPropertyAllowedToBeOutOfSyncAmongColFamsAndIndexes(String colFamProp) {
--- End diff --
Just return .contains since the callers always negate the boolean returned by this function.
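The review suggestion above can be sketched roughly as follows. This is only an illustration: the diff names the constant `PROPERTIES_TO_KEEP_IN_SYNC_AMONG_COL_FAMS_AND_INDEXES` but does not show its type, so the `Set<String>` below is an assumption, and the class name `SyncedProps` and method name `mustStayInSync` are hypothetical stand-ins rather than Phoenix code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class SyncedProps {
    // Assumption: a stand-in for MetaDataUtil.PROPERTIES_TO_KEEP_IN_SYNC_AMONG_COL_FAMS_AND_INDEXES,
    // modelled here as an immutable set of the three property names from the issue title.
    static final Set<String> PROPERTIES_TO_KEEP_IN_SYNC = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("TTL", "KEEP_DELETED_CELLS", "REPLICATION_SCOPE")));

    // Hypothetical replacement for the negated "allowed to be out of sync" helper:
    // return the membership test directly so callers no longer need to negate it.
    static boolean mustStayInSync(String colFamProp) {
        return PROPERTIES_TO_KEEP_IN_SYNC.contains(colFamProp);
    }
}
```

With this shape, a caller would write `if (SyncedProps.mustStayInSync(prop))` instead of negating an "allowed to be out of sync" predicate.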
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508748#comment-16508748 ] ASF GitHub Bot commented on PHOENIX-3955: - Github user twdsilva commented on a diff in the pull request: https://github.com/apache/phoenix/pull/304#discussion_r194545366
--- Diff: phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java ---
@@ -767,50 +748,117 @@ private void modifyColumnFamilyDescriptor(HColumnDescriptor hcd, Map ensureTableColumnFamilyPropsInSync(String tableName, byte[] defaultFamilyBytes) throws SQLException {
+    HTableDescriptor tableDescriptor = getTableDescriptor(Bytes.toBytes(tableName));
+    HColumnDescriptor[] colFamilies = tableDescriptor.getColumnFamilies();
+    HColumnDescriptor defaultColDescriptor = tableDescriptor.getFamily(defaultFamilyBytes);
+    // It's possible that the table has specific column families and none of them are declared to be the DEFAULT_COLUMN_FAMILY
+    defaultColDescriptor = defaultColDescriptor != null ? defaultColDescriptor : colFamilies[0];
+
+    for (String propName: MetaDataUtil.PROPERTIES_TO_KEEP_IN_SYNC_AMONG_COL_FAMS_AND_INDEXES) {
+        if (defaultColDescriptor.getValue(propName) == null) {
+            if (!isUpgradeRequired()) {
+                // We cannot have a null value for any of the properties that need to be kept in sync amongst all column families
+                logger.error("Cannot have null value for column family property: " + propName);
+                setUpgradeRequired();
+                throw new UpgradeRequiredException();
+            } else {
+                defaultColDescriptor.setValue(propName, HColumnDescriptor.getDefaultValues().get(propName));
+            }
+        }
+    }
+    // Used in the upgrade path to actually fix the table descriptor by syncing properties
+    HTableDescriptor syncedTableDescriptor = new HTableDescriptor(tableDescriptor);
+    // Check that these properties are in sync amongst all column families of the table
+    for (HColumnDescriptor family: colFamilies) {
+        if (isUpgradeRequired()) {
+            family = syncedTableDescriptor.getFamily(family.getName());
+        }
+        for (String propName: MetaDataUtil.PROPERTIES_TO_KEEP_IN_SYNC_AMONG_COL_FAMS_AND_INDEXES) {
+            if (!family.getValue(propName).toUpperCase(Locale.ROOT).equals(defaultColDescriptor.getValue(propName).toUpperCase(Locale.ROOT))) {
--- End diff --
Do you need the toUpperCase(Locale.ROOT) ?
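As a rough illustration of the check under review, the sketch below compares each column family's properties against a reference family without the explicit `toUpperCase(Locale.ROOT)` calls, using `String.equalsIgnoreCase` (which is locale-independent) plus null handling. It is not the patch's code: column families are modelled as plain maps instead of `HColumnDescriptor`, and the class name `ColumnFamilySyncCheck` and method `findOutOfSync` are hypothetical. Whether a case-insensitive comparison is needed at all depends on how the values are stored, which the thread leaves open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ColumnFamilySyncCheck {
    // Assumption: each column family is modelled as propertyName -> value,
    // keyed by family name; this stands in for HBase descriptors.
    static List<String> findOutOfSync(
            Map<String, Map<String, String>> families,
            String referenceFamily,
            List<String> propNames) {
        Map<String, String> reference = families.get(referenceFamily);
        if (reference == null) {
            throw new IllegalArgumentException("No such column family: " + referenceFamily);
        }
        List<String> mismatches = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> family : families.entrySet()) {
            for (String prop : propNames) {
                String expected = reference.get(prop);
                String actual = family.getValue().get(prop);
                // Null-safe, case-insensitive comparison: "false" and "FALSE" count as equal.
                boolean same = expected == null
                        ? actual == null
                        : expected.equalsIgnoreCase(actual);
                if (!same) {
                    mismatches.add(family.getKey() + "." + prop);
                }
            }
        }
        return mismatches;
    }
}
```

The returned list names every `family.property` pair that differs from the reference family, so a caller can decide whether to raise an error or repair the values.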
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508745#comment-16508745 ] ASF GitHub Bot commented on PHOENIX-3955: - Github user twdsilva commented on a diff in the pull request: https://github.com/apache/phoenix/pull/304#discussion_r194539685
--- Diff: phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java ---
@@ -98,14 +84,7 @@
 import javax.annotation.concurrent.GuardedBy;

 import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.hbase.HColumnDescriptor;
-import org.apache.hadoop.hbase.HConstants;
-import org.apache.hadoop.hbase.HRegionLocation;
-import org.apache.hadoop.hbase.HTableDescriptor;
-import org.apache.hadoop.hbase.NamespaceDescriptor;
-import org.apache.hadoop.hbase.NamespaceNotFoundException;
-import org.apache.hadoop.hbase.TableExistsException;
-import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.*;
--- End diff --
Are you using the phoenix code template https://phoenix.apache.org/develop.html ?
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16508744#comment-16508744 ] ASF GitHub Bot commented on PHOENIX-3955: - Github user twdsilva commented on the issue: https://github.com/apache/phoenix/pull/304 I think it is better if you add code to UpgradeUtil that runs at the next major upgrade, checks all the tables, and ensures these three property values are in sync (see addParentToChildLinks for an example). Then you wouldn't need to validate that the properties are in sync in multiple places.
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498331#comment-16498331 ] ASF GitHub Bot commented on PHOENIX-3955: - Github user ChinmaySKulkarni commented on the issue: https://github.com/apache/phoenix/pull/304
Ran into [PHOENIX-3167](https://issues.apache.org/jira/browse/PHOENIX-3167) when doing some testing for this JIRA. This can be a bigger problem now that we force an upgrade in case column families are out of sync for a table.
Steps to repro:
    create table test (id INTEGER not null primary key, name varchar(10)) TTL=1200,KEEP_DELETED_CELLS='false',REPLICATION_SCOPE='1';
    alter table test add cf1.random varchar(10);
    -> New column family cf1 inherits TTL, KEEP_DELETED_CELLS and REPLICATION_SCOPE properties
    alter table test set TTL=10;
    create table if not exists test (id INTEGER not null primary key, name varchar(10)) TTL=1200,KEEP_DELETED_CELLS='false',REPLICATION_SCOPE='1';
    -> Modifies HBase metadata for the table so TTL for default cf is changed to 1200, whereas newly added cf1's TTL is still 10.
With this patch, at this point, any future alter table command or create index will fail with UpgradeRequiredException since column family properties are out of sync. Even if we run the second create table with 'if not exists', same behavior.
Similar problem is seen with alter table for global index. See [PHOENIX-4743](https://issues.apache.org/jira/browse/PHOENIX-4743)
@twdsilva @vincentpoon
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498322#comment-16498322 ] ASF GitHub Bot commented on PHOENIX-3955: - Github user ChinmaySKulkarni commented on the issue: https://github.com/apache/phoenix/pull/304 @twdsilva @vincentpoon please review. I will be adding stuff mentioned in "TODO", but just wanted your feedback on this initial patch.
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498319#comment-16498319 ] ASF GitHub Bot commented on PHOENIX-3955: - GitHub user ChinmaySKulkarni opened a pull request: https://github.com/apache/phoenix/pull/304
PHOENIX-3955: Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
**[DO NOT MERGE - WIP patch]**
- Disallow specifying TTL, KEEP_DELETED_CELLS and REPLICATION_SCOPE at column family level when creating table and also while creating indexes
- Sync these properties in create index code path after checking for sync of existing column families in data table
- Modified alter table and alter global index code path to ensure syncing of properties
- Added syncing properties of column families of tables and their indexes as a step in execute upgrade code path
TODO:
- Add local testing that has been done
- Test old client new server and execute upgrade
- Handle indexes on views
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/ChinmaySKulkarni/phoenix PHOENIX-3955
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/phoenix/pull/304.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #304
commit a370da3ecf1692fbed36463c184e1f4953ad545e
Author: Chinmay Kulkarni
Date: 2018-06-01T17:36:36Z
PHOENIX-3955: Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[DO NOT MERGE - WIP patch]
- Disallow specifying TTL, KEEP_DELETED_CELLS and REPLICATION_SCOPE at column family level when creating table and also while creating indexes
- Sync these properties in create index code path after checking for sync of existing column families in data table
- Modified alter table and alter global index code path to ensure syncing of properties
- Added syncing properties of column families of tables and their indexes as a step in execute upgrade code path
TODO:
- Did lots of local testing for corner cases. Need to add tests
- Test old client new server and execute upgrade
- Handle indexes on views
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480129#comment-16480129 ] Thomas D'Silva commented on PHOENIX-3955: - At upgrade time, check all tables, and if you find a table with multiple column families whose property values aren't in sync, change them all to the value of the default column family. If the table has indexes, change their property values to keep them in sync as well. This should cover tables created with older clients that might have inconsistent properties, right?
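The upgrade-time repair described in this comment could look roughly like the sketch below. It is a simplified model under stated assumptions, not Phoenix's upgrade code: tables and indexes are represented as nested maps (family name to property map) rather than HBase descriptors, and `UpgradeSync` and `syncToDefaultFamily` are hypothetical names. Properties that the default family does not define are skipped here, which is a choice made for the sketch only.

```java
import java.util.List;
import java.util.Map;

final class UpgradeSync {
    // Copies each listed property from the data table's default column family
    // to every other family of the data table and to every family of each index.
    // Assumption: all inner maps are mutable.
    static void syncToDefaultFamily(
            Map<String, Map<String, String>> dataTableFamilies,
            String defaultFamily,
            List<Map<String, Map<String, String>>> indexTables,
            List<String> propNames) {
        Map<String, String> source = dataTableFamilies.get(defaultFamily);
        if (source == null) {
            throw new IllegalArgumentException("No such column family: " + defaultFamily);
        }
        for (String prop : propNames) {
            String value = source.get(prop);
            if (value == null) {
                continue; // nothing to propagate for this property in this sketch
            }
            for (Map<String, String> family : dataTableFamilies.values()) {
                family.put(prop, value);
            }
            for (Map<String, Map<String, String>> index : indexTables) {
                for (Map<String, String> family : index.values()) {
                    family.put(prop, value);
                }
            }
        }
    }
}
```

Treating the default column family as the single source of truth keeps the repair deterministic: running it twice yields the same result as running it once.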
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480007#comment-16480007 ] Chinmay Kulkarni commented on PHOENIX-3955: --- [~jamestaylor] [~tdsilva] [~gjacoby] In the upgrade path, I guess we would have to do 2 things then: For every table, make sure these properties are in sync amongst all column families; and ensure these properties are in sync for each index table. In the first case, I guess we can use the default CF as the source of truth. What about the case where a table is created with an old phoenix client and so these properties have different values amongst its own column families, and we then try to create an index on this table with a new phoenix client? Since the base table's properties are out of sync amongst its own CFs, we won't know which properties to inherit during index creation. One solution is to force an entire upgrade/throw an UpgradeRequiredException, but "EXECUTE UPGRADE" does a lot of other stuff which we don't require at this point. Is it worth the effort to introduce some new command like "SYNC TABLE " which syncs these properties amongst all its column families and also all the indexes of that table?
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476567#comment-16476567 ] Thomas D'Silva commented on PHOENIX-3955: - As part of the upgrade to the next release we can check index tables to ensure these properties are in sync.
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476550#comment-16476550 ] Geoffrey Jacoby commented on PHOENIX-3955: -- [~tdsilva] [~jamestaylor] - if you're not allowed to alter these properties on an index table, what would a user do to get an index's properties back in sync with a base table if they were created improperly using a prior version of Phoenix?
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476547#comment-16476547 ] James Taylor commented on PHOENIX-3955: --- Agree, [~tdsilva].
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476525#comment-16476525 ] Thomas D'Silva commented on PHOENIX-3955: - I think KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL should be the same for all column families of a table (and any indexes on the table). If you alter them on the data table the properties on the index table should be kept in sync. You should also not be allowed to alter them on index tables. WDYT [~jamestaylor]?
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476523#comment-16476523 ] Chinmay Kulkarni commented on PHOENIX-3955: --- Why do we allow setting different values for KEEP_DELETED_CELLS and REPLICATION_SCOPE across different column families, but not allow each CF to have its own TTL? I don't understand the use-case for this. Can we enforce all 3 properties to be the same for all CFs of a table? We can disallow users from creating indexes with their own values for these properties and force the indexes to have the same value as the other CFs of the data table. Accordingly, we should disallow "ALTER TABLE" from changing these 3 properties for specific CFs and instead apply the modified property to all CFs of the data table and all indexes as well. What say [~jamestaylor], [~tdsilva]?
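The rule proposed in this comment, combined with the restriction from the reply that these properties should not be altered on index tables, can be sketched as a small validation step. Everything here is illustrative: `AlterPropagation` and `isDisallowed` are hypothetical names, and a property key of the form `CF1.TTL` is assumed to mean "TTL for column family CF1", as in the example DDL elsewhere in this thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class AlterPropagation {
    // The three properties the issue wants kept in sync.
    static final Set<String> SYNCED = new HashSet<>(
            Arrays.asList("TTL", "KEEP_DELETED_CELLS", "REPLICATION_SCOPE"));

    // Returns true when an ALTER should be rejected under the proposed rules:
    // a synced property may not be set for one specific column family,
    // and may not be set on an index table at all.
    static boolean isDisallowed(String propKey, boolean targetIsIndex) {
        int dot = propKey.indexOf('.');
        String name = dot < 0 ? propKey : propKey.substring(dot + 1);
        if (!SYNCED.contains(name.toUpperCase(Locale.ROOT))) {
            return false; // other properties are outside the scope of this rule
        }
        boolean familyQualified = dot >= 0;
        return familyQualified || targetIsIndex;
    }
}
```

An accepted ALTER on the data table (unqualified key, not an index) would then be applied to every column family and every index, rather than to a single family.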
[jira] [Commented] (PHOENIX-3955) Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync between the physical data table and index tables
[ https://issues.apache.org/jira/browse/PHOENIX-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16471710#comment-16471710 ]

Chinmay Kulkarni commented on PHOENIX-3955:
---
Hey [~jamestaylor], [~samarthjain], [~tdsilva],

Here are some points on achieving this, along with some questions I have. Let's take a simple example. Say I create the base data table with the following query:

{code:sql}
CREATE TABLE IF NOT EXISTS z_base_table (
    id INTEGER NOT NULL PRIMARY KEY,
    CF1.host VARCHAR(10),
    flag BOOLEAN
) TTL=12, CF1.KEEP_DELETED_CELLS='true', REPLICATION_SCOPE='1';
{code}

We have the following paths to consider:

1. Create Index code path:
* *Case 1: We create the data table with specific column families and there is no default CF*: In this case, the global index table's default CF and the CFs corresponding to all local indexes should keep the default values for REPLICATION_SCOPE and KEEP_DELETED_CELLS, as they do now, BUT they should inherit the TTL property from the data table's non-local-index CFs. It should be sufficient to check the TTL of any one non-local-index CF, since they are all enforced to be the same.
* *Case 2: The data table has a default CF*: In this case, the global index table's default CF and the CFs corresponding to all local indexes should inherit REPLICATION_SCOPE, KEEP_DELETED_CELLS and TTL from the data table's default CF.
* *Question 1*: If we create an index with its own properties, say something like:
{code:sql}
CREATE INDEX diff_properties_z_index ON z_base_table(host) TTL=5000, KEEP_DELETED_CELLS='true';
{code}
we override the data table properties, leaving the index table and data table properties out of sync. This JIRA might set the expectation that these properties are always in sync between index tables and the data table, so should we disallow this henceforth? At the very least, we may want to log that the index table and data table properties will be out of sync after executing this query.
* *Question 1.1*: Given the above situation, if we later alter the data table, should we blindly alter the properties of the index tables as well (given that we want them to be in sync), or only alter the index table properties when they currently match the data table properties?
* The "Create index code path" changes should be achievable in _CQSI.generateTableDescriptor_, before we apply the specific properties of the index tables themselves.

2. Alter table set code path:
* Here we can keep track of properties to be applied to _QueryConstants.ALL_FAMILY_PROPERTIES_KEY_ and not to specific CFs. If we are changing TTL, REPLICATION_SCOPE or KEEP_DELETED_CELLS for all families, we will alter the properties for the index table CFs as well.
* *Case 1: Global Index Tables:* We can have _CQSI.separateAndValidateProperties_ return a _Map>_ and then later store all the table descriptors and call _sendHBaseMetaData_() with this list of changes (which will now include GLOBAL index table changes as well).
* *Case 2: Local Indexes:* Can we simply change the column descriptor for the local index CF of the data table? I'm not sure if this makes sense, so feel free to shed some light on this case.
* *Question 2:* If I create a local index on a CF-specific column like:
{code:sql}
CREATE LOCAL INDEX cf_specific_z_index ON z_base_table(host);
{code}
then shouldn't the local index be using a CF of "L#CF1" instead of the default "L#0"? In sqlline, when I do _select * from cf_specific_z_index;_, I see the column as _CF1:Host_, but when I _desc 'z_base_table'_ in HBase shell, I see the CF name to be "L#0".
* *Question 3:* How do we handle the case of multiple local indexes created on the same table?
If I run the following:
{code:sql}
CREATE LOCAL INDEX local_z_index1 ON z_base_table(host) TTL=,KEEP_DELETED_CELLS='true';
CREATE LOCAL INDEX local_z_index2 ON z_base_table(flag) TTL=,KEEP_DELETED_CELLS='false';
{code}
the actual HBase metadata change only reflects the last statement, since both local indexes map to the 'L#0' column family. Please let me know if this is handled at the Phoenix layer and I'm missing something.

> Ensure KEEP_DELETED_CELLS, REPLICATION_SCOPE, and TTL properties stay in sync > between the physical data table and index tables > -- > > Key: PHOENIX-3955 > URL: https://issues.apache.org/jira/browse/PHOENIX-3955 > Project: Phoenix > Issue Type: Bug >Reporter: Samarth Jain >Assignee: Chinmay Kulkarni >Priority: Major > > We need to make sure that indexes inherit the REPLICATION_SCOPE, > KEEP_DELETED_CELLS and TTL properties from the base table. Otherwise we can > run into situations where the data was removed (or not removed) from the data > table but was