[ https://issues.apache.org/jira/browse/PHOENIX-7026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rushabh Shah updated PHOENIX-7026:
----------------------------------
    Description: 
We need to validate LAST_DDL_TIMESTAMP for write requests.
We can't add an extra RPC the way we do for read requests (PHOENIX-7025), since 
the semantics of executeQuery() and executeMutation() are very different.

The usual flow for batched write requests (assuming auto commit is OFF) is:
{noformat}
Connection conn = DriverManager.getConnection(getUrl(), props);
conn.createStatement().execute("upsert into foo values (1,2)");
conn.createStatement().execute("upsert into foo values (3,4)");
conn.createStatement().execute("upsert into foo values (5,6)");
conn.createStatement().execute("upsert into bar values ('a','b')");
conn.createStatement().execute("upsert into bar values ('c','d')"); 
conn.commit();
{noformat}
If we introduce an RPC for every execute() call, we increase the latency of 
every write request.

There are two options I can think of:
 # Call the validateDDLTimestamp RPC on the very first execute() for each 
table in the batch.
 ** Pros: 
 *** Only one extra RPC is made per table for the whole batch. 
 *** Easy to implement. 
 *** In case of StaleMetadataCacheException, the retry is very easy to 
implement.
 ** Cons:
 # Call the validateDDLTimestamp RPC for all the tables in the batch when 
conn.commit() is called.
 ** Pros 
 *** Only one extra RPC is added, irrespective of the number of tables in the 
batch.
 ** Cons
 *** In case of StaleMetadataCacheException, the retry becomes very complex: we 
would have to re-create all the mutations generated in the executeMutation 
phase.
 *** If auto commit is ON, we introduce an extra RPC for every execute() call, 
which would be a performance regression for this feature.
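A minimal client-side sketch of how Option 1 could track which tables have already been validated in the current batch. All names here (validateLastDdlTimestampRpc, StaleMetadataCacheException, BatchDdlTimestampValidator) are hypothetical stand-ins, not the actual Phoenix API; the RPC itself is stubbed out:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of Option 1: validate LAST_DDL_TIMESTAMP at most once
// per table per batch, on the first execute() that touches that table.
public class BatchDdlTimestampValidator {

    // Stand-in for the exception the server would throw when the client's
    // cached LAST_DDL_TIMESTAMP is stale (unchecked here for brevity).
    static class StaleMetadataCacheException extends RuntimeException {}

    private final Set<String> validatedTables = new HashSet<>();
    private int rpcCount = 0;  // for illustration: extra RPCs issued so far

    // Would be called from executeMutation() before queuing the mutation.
    void maybeValidate(String tableName) {
        if (validatedTables.contains(tableName)) {
            return;  // already validated in this batch: no extra RPC
        }
        validateLastDdlTimestampRpc(tableName);  // one RPC per table per batch
        validatedTables.add(tableName);
    }

    // Would be called from conn.commit() / rollback() to reset batch state,
    // so the next batch re-validates against the latest DDL timestamps.
    void onBatchBoundary() {
        validatedTables.clear();
    }

    // Stub for the server RPC; a real client would send its cached
    // LAST_DDL_TIMESTAMP and the server would throw on a mismatch.
    private void validateLastDdlTimestampRpc(String tableName) {
        rpcCount++;
    }

    int rpcCount() { return rpcCount; }

    public static void main(String[] args) {
        BatchDdlTimestampValidator v = new BatchDdlTimestampValidator();
        // Mirrors the batch above: three upserts to foo, two to bar.
        for (String t : new String[] {"foo", "foo", "foo", "bar", "bar"}) {
            v.maybeValidate(t);
        }
        v.onBatchBoundary();  // commit
        System.out.println(v.rpcCount());  // 2 extra RPCs for 5 statements
    }
}
```

This also shows why the retry is simple under Option 1: on StaleMetadataCacheException only the single failing execute() needs to refresh its cache entry and re-run, not the whole batch.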


> Validate LAST_DDL_TIMESTAMP for write requests.
> -----------------------------------------------
>
>                 Key: PHOENIX-7026
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7026
>             Project: Phoenix
>          Issue Type: Sub-task
>          Components: core
>            Reporter: Rushabh Shah
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
