[ 
https://issues.apache.org/jira/browse/PHOENIX-597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286775#comment-14286775
 ] 

James Taylor commented on PHOENIX-597:
--------------------------------------

By default, DROP TABLE <table_name> drops both the HBase metadata and the 
Phoenix metadata, so the original description no longer applies. However, it's 
still an issue when CREATE TABLE IF NOT EXISTS is used:
{code}
CREATE TABLE t (k VARCHAR PRIMARY KEY);
{code}
followed by
{code}
CREATE TABLE IF NOT EXISTS t (k VARCHAR PRIMARY KEY) SALT_BUCKETS=10;
{code}
No exception is thrown, but the table is not salted. So it's clearly an error 
when a create-time-only property is specified with a value that differs from 
the existing one.
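
As a rough sketch of that check, in plain Java rather than actual Phoenix code: 
the class and method names below are hypothetical, and treating a property that 
the statement does not mention as "no opinion" is an assumption, not settled 
behavior.

```java
import java.util.Objects;

/** Hypothetical helper for illustration; not part of Phoenix. */
final class CreateTimePropertyCheck {
    private CreateTimePropertyCheck() {}

    /**
     * Throws if a create-time-only property named in the DDL differs from
     * the value the existing table already has.
     *
     * Assumption: a null requested value means the statement did not mention
     * the property, so there is nothing to compare. A null existing value
     * means the property is not set on the table (for example, unsalted).
     */
    static void verifyUnchanged(String property, Object existing, Object requested) {
        if (requested != null && !Objects.equals(existing, requested)) {
            throw new IllegalStateException(
                "Cannot change " + property + " from " + existing
                    + " to " + requested + " on an existing table");
        }
    }
}
```

With this, verifyUnchanged("SALT_BUCKETS", null, 10) would throw for the 
example above, while repeating the original definition would pass.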

Also, the case where SPLIT ON is provided is somewhat tricky. We could just 
error out, or we could get the regions of the existing table and error out only 
if some provided split point is not already a region boundary:
{code}
CREATE TABLE IF NOT EXISTS t (k VARCHAR PRIMARY KEY) SPLIT ON ('a','b','c');
{code}
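
A minimal sketch of the second option, assuming the start keys of the existing 
table's regions have already been fetched from HBase (how they are fetched is 
left out). The class and method names are hypothetical. Note that the first 
region of an HBase table has an empty start key, which never matches a real 
split point.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of Phoenix. */
final class SplitPointCheck {
    private SplitPointCheck() {}

    /**
     * Returns the requested split points that are not already region
     * boundaries (region start keys) of the existing table. An empty
     * result means the existing table is split at least as finely as
     * the DDL asks for.
     */
    static List<byte[]> missingSplitPoints(List<byte[]> regionStartKeys,
                                           List<byte[]> requestedSplits) {
        List<byte[]> missing = new ArrayList<>();
        for (byte[] split : requestedSplits) {
            boolean found = false;
            for (byte[] startKey : regionStartKeys) {
                if (Arrays.equals(startKey, split)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                missing.add(split);
            }
        }
        return missing;
    }

    /** Convenience for building keys from string literals. */
    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

The caller would throw if the returned list is non-empty. Extra regions in the 
existing table are tolerated here, which is a design choice rather than a 
requirement.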

However, what if a property that may be changed is used? For example:
{code}
CREATE TABLE IF NOT EXISTS t (k VARCHAR PRIMARY KEY) DISABLE_WAL=true;
{code}
In this case, we can either also error out, or (probably better), we can issue 
the call to update the properties.
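
To illustrate the "update the properties" option, here is a small sketch that 
turns a changed mutable property into an ALTER TABLE ... SET statement. It is 
plain Java with hypothetical names, and it only builds the statement text; 
actually issuing it is out of scope.

```java
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Phoenix. */
final class MutablePropertySync {
    private MutablePropertySync() {}

    /**
     * Returns the ALTER TABLE statement needed to bring a mutable property
     * in line with the requested value, or empty if nothing has to change.
     *
     * Assumption: a null requested value means the DDL did not mention
     * the property, so the existing value is left alone.
     */
    static Optional<String> alterFor(String table, String property,
                                     Object existing, Object requested) {
        if (requested == null || requested.equals(existing)) {
            return Optional.empty();
        }
        return Optional.of(
            "ALTER TABLE " + table + " SET " + property + "=" + requested);
    }
}
```

For the example above, alterFor("t", "DISABLE_WAL", false, true) would yield 
ALTER TABLE t SET DISABLE_WAL=true.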

And here's another case, where a column (and/or column family) is added:
{code}
CREATE TABLE IF NOT EXISTS t (k VARCHAR PRIMARY KEY, a.v VARCHAR);
{code}
In this case, we can also either error out, or we can adjust the table 
definition as needed.

And one more case: a CREATE TABLE statement that redefines an existing column 
with a new data type. This is definitely an error:
{code}
CREATE TABLE IF NOT EXISTS t (k BIGINT PRIMARY KEY);
{code}
An attempt to remove a column from the PK, or to add a new NOT NULL column to 
the PK, falls into this same category.

The bulk of changes would be in MetaDataClient.createTableInternal(). The code 
that determines that a table already exists is the following block on line 1605:
{code}
            MetaDataMutationResult result = connection.getQueryServices().createTable(
                    tableMetaData,
                    viewType == ViewType.MAPPED || indexId != null ? physicalNames.get(0).getBytes() : null,
                    tableType, tableProps, familyPropList, splits);
            MutationCode code = result.getMutationCode();
            switch(code) {
            case TABLE_ALREADY_EXISTS:
                connection.addTable(result.getTable());
                if (!statement.ifNotExists()) {
                    throw new TableAlreadyExistsException(schemaName, tableName, result.getTable());
                }
                return null;
{code}
Notice that the PTable is returned from the server, in result.getTable(), and 
that we don't throw an exception when IF NOT EXISTS is in the DDL statement. 
We'd need to add validation at this point that compares this PTable with the 
one that would be created if the table didn't exist (see line 1623, the default 
case statement):
{code}
            default:
                PName newSchemaName = PNameFactory.newName(schemaName);
                PTable table = PTableImpl.makePTable(
                        tenantId, newSchemaName, PNameFactory.newName(tableName), tableType, indexState, result.getMutationTime(),
                        PTable.INITIAL_SEQ_NUM, pkName == null ? null : PNameFactory.newName(pkName), saltBucketNum, columns,
                        dataTableName == null ? null : newSchemaName, dataTableName == null ? null : PNameFactory.newName(dataTableName), Collections.<PTable>emptyList(), isImmutableRows,
                        physicalNames, defaultFamilyName == null ? null : PNameFactory.newName(defaultFamilyName), viewStatement, Boolean.TRUE.equals(disableWAL), multiTenant, storeNulls, viewType,
                        indexId, indexType);
{code}
We could do a logical diff between the two tables, then generate and run an 
ALTER TABLE statement that modifies the table accordingly (or, at a minimum, 
throw an exception if they differ). Another alternative would be to do this on 
the server side in MetaDataEndpointImpl.createTable(). Note that we'd also need 
to take into account the non-Phoenix properties supplied as key=value in the 
CREATE TABLE statement.
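
A very rough sketch of what such a logical diff might look like for columns 
only. It is plain Java and does not use PTable; the class name, the method 
name, and the representation of a table as a map from column name to type name 
are all assumptions made for illustration. Primary key changes, table 
properties, and columns that exist but are missing from the new statement are 
deliberately ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Phoenix. */
final class TableDiffSketch {
    private TableDiffSketch() {}

    /**
     * Compares the columns of the existing table with the columns in the
     * CREATE TABLE IF NOT EXISTS statement.
     *
     * - A column that is new yields an ALTER TABLE ... ADD statement.
     * - A column whose type differs is rejected with an exception.
     * - Existing columns that the statement omits are left untouched.
     */
    static List<String> alterStatements(String tableName,
                                        Map<String, String> existingColumns,
                                        Map<String, String> requestedColumns) {
        List<String> statements = new ArrayList<>();
        for (Map.Entry<String, String> requested : requestedColumns.entrySet()) {
            String column = requested.getKey();
            String requestedType = requested.getValue();
            String existingType = existingColumns.get(column);
            if (existingType == null) {
                statements.add(
                    "ALTER TABLE " + tableName + " ADD " + column + " " + requestedType);
            } else if (!existingType.equals(requestedType)) {
                throw new IllegalArgumentException(
                    "Column " + column + " already exists with type " + existingType
                        + "; cannot redefine it as " + requestedType);
            }
        }
        return statements;
    }
}
```

For the a.v example above this would produce a single ALTER TABLE t ADD a.v 
VARCHAR statement, while the BIGINT example would be rejected.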

FYI, [~aakash.pradeep] - [~samarthjain] is an expert in this area. I'd be 
interested in your opinion here too.

> Detect if HBase table exists and is not split as expected when DDL run
> ----------------------------------------------------------------------
>
>                 Key: PHOENIX-597
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-597
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: James Taylor
>            Assignee: Aakash Pradeep
>
> Currently, if you create an unsplit or unsalted table like this:
>       CREATE TABLE t (k VARCHAR PRIMARY KEY);
> and then drop it:
>       DROP TABLE t;
> and then create it again, either salted or split:
>       CREATE TABLE t (k VARCHAR PRIMARY KEY) SALT_BUCKETS=10;
> Then it's not going to work properly, since the second CREATE will not 
> pre-split the table since the HBase table already exists (as we currently 
> never drop HBase metadata).
> We should either drop the HBase metadata on DROP TABLE, or we should at a 
> minimum detect this case and throw.
> As a workaround, when creating a salted or split table, make sure to disable 
> and drop it from the HBase shell if it already exists.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
