[ https://issues.apache.org/jira/browse/KUDU-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17727684#comment-17727684 ]

Joe McDonnell commented on KUDU-3484:
-------------------------------------

[~araina] From Impala's point of view, we want the Kudu C++ client to behave 
transparently as if the cache doesn't exist. I think #3 is closest to that 
approach, so here is a sketch of #3:
 # Operation arrives
 # Take a timestamp before accessing the cache
 # Try with the cache
 ## If success, then everything is the same
 # If failure, look up the tablet
 # If the tablet is older than the timestamp from before the cache access, then 
the tablet existed before the operation started. Retry with the new tablet.
 # If the tablet is newer than the timestamp from before the cache access, 
throw an error (tablet not found)

If necessary, Impala can pass in a timestamp for when the operation starts.

> Kudu C++ client needs to invalidate cache if Java client issued a DDL op on 
> same partition
> ------------------------------------------------------------------------------------------
>
>                 Key: KUDU-3484
>                 URL: https://issues.apache.org/jira/browse/KUDU-3484
>             Project: Kudu
>          Issue Type: Improvement
>          Components: client
>            Reporter: Ashwani Raina
>            Assignee: Ashwani Raina
>            Priority: Major
>
> This Jira tracks work to improve the Kudu C++ client's metacache 
> integrity when both the C++ and Java clients are used on the same 
> partition in a particular sequence of steps that results in the cache 
> entry becoming stale in the Kudu C++ client.
> Here is a detailed step-wise execution of a test that could result in such a 
> situation:
> +++
> The metacache in the Kudu client doesn't clean up the old tablet from its 
> cache after a new partition with the same range is created.
> The new tablet id is the valid one, received in the server response, and 
> should be used everywhere from then on.
> If we look at the steps to repro:
> 1. We create a table first with the following query:
> +++
> /** 1. Create table **/
> drop table if exists impala_crash;
> create table if not exists impala_crash (
>   dt string,
>   col string,
>   primary key(dt)
> )
> partition by range(dt) ( partition values <= '00000000' )
> stored as kudu;
> +++
> 2. Then, the table is altered by adding a partition with range 20230301:
> +++
> alter table impala_crash drop if exists range partition value='20230301'; 
> alter table impala_crash add if not exists range partition value='20230301'; 
> insert into impala_crash values ('20230301','abc');
> +++
> 3. Then, we alter the table again by adding a partition with the same range 
> after deleting the old partition:
> +++
> alter table impala_crash drop if exists range partition value='20230301'; 
> alter table impala_crash add if not exists range partition value='20230301'; 
> insert into impala_crash values ('20230301','abc');
> +++
> Even though the old partition is dropped and a new one is added, a 
> reference to the old cache entry (with the old tablet id) still remains in 
> the Kudu client metacache, although it is marked as stale.
> When we try to write a new value to the same range, the client first looks 
> up the entry (by tablet id) in the metacache and finds it stale. As a 
> result, an RPC lookup is issued, which connects to the server and fetches a 
> response containing the new tablet id, since the old tablet no longer 
> exists on the server. The new tablet id is recorded in the client 
> metacache. When PickLeader resumes, it enters the RPC lookup cycle again, 
> which now succeeds via the fast path because the latest entry is present in 
> the cache. But when its callback is invoked, it resumes work with the old 
> tablet id at hand, which never gets updated.
> +++
> Different approaches were discussed to address this. The following are 
> captured here for posterity:
> +++
> 1. Maintain a context in Impala that can be shared among different 
> clients. The same context can be used to notify the C++ client to drop its 
> cache if there has been a set of operations that could invalidate it. 
> Simply passing the tablet id may not work, because that may not be enough 
> for a client to make the decision.
> 2. Impala sends a hint to the C++ client to remove the cache entry after a 
> DDL operation (invoked via the Java client) and perform a remote lookup 
> instead of relying on the local cache.
> 3. Kudu detects the problem internally and propagates it up to the RPC 
> layer, which overwrites the RPC structure with the new tablet object and 
> retries. This is a tricky, unclean approach with the potential to 
> introduce bugs.
> 4. Change the tablet id in the RPC itself. This is a non-trivial and 
> error-prone approach, as the tablet id is declared const, and the RPC, 
> batcher, and client implementations assume the tablet id becomes read-only 
> after the RPC is registered for an incoming op.
> +++
> Approach #1 or #2 is more likely to be finalised than the rest. Hence, two 
> Jiras may be required to track the Kudu work and the Impala work 
> separately.
> This Jira will be used to track the Kudu-side work.
> For the Impala-side work, the following Jira will be used:
> https://issues.apache.org/jira/browse/IMPALA-12172
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
