[ 
https://issues.apache.org/jira/browse/KUDU-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin updated KUDU-3747:
--------------------------------
    Description: 
With KUDU-1802 implemented, the Kudu C++ client is able to obtain information 
on tablet replica locations from scan tokens, avoiding extra calls to Kudu 
masters when such information is present.  It works very well in the majority 
of use cases.  However, a couple of surprises were discovered in a relatively 
rare use case when a Kudu client application creates a new table, immediately 
starts writing data into the newly created table, and then reads data back from 
the table.

The two unexpected items are:
# After populating its metacache with the information on the location of tablet 
replicas and using it while writing data into the newly created table, the 
client still calls {{KuduClient::Data::GetTabletServer()}} as a part of 
{{KuduScanner::Data::OpenTablet()}} when commencing a read/scan operation.
# Even if ignoring the already existing tablet location information in the 
metacache (say, if it's short-lived and already expired), it would be natural 
for the client at least to rely on the information in the scan tokens when 
opening a table for scanning (given KUDU-1802 provisions are already there) 
instead of re-resolving tablet locations.

This became apparent after increasing the 
{{\-\-raft_heartbeat_interval_ms}} setting of the Kudu tablet servers tenfold 
(i.e. from the default 500 ms to 5000 ms): a client application exercising the 
described use case exhibited high latency when calling {{KuduScanner::Open()}} 
right after writing a few rows into the newly created table.

It's necessary to clarify what's going on.

[Attached|https://issues.apache.org/jira/secure/attachment/13081007/repro.cc] 
is a sample Kudu client application in C++.  Compile it similarly to 
{{$KUDU_HOME/examples/cpp/example.cc}} and run against a Kudu cluster having at 
least three tablet servers.

  was:
With KUDU-1802 implemented, the Kudu C++ client is able to obtain information 
on tablet replica locations from scan tokens, avoiding extra calls to Kudu 
masters when such information is present.  It works very well in the majority 
of use cases.  However, a couple of surprises were discovered in a relatively 
rare use case when a Kudu client application creates a new table, immediately 
starts writing data into the newly created table, and then reads data back from 
the table.

The two unexpected items are:
# After populating its metacache with the information on the location of tablet 
replicas and using it while writing data into the newly created table, the 
client still calls {{KuduClient::Data::GetTabletServer()}} as a part of 
{{KuduScanner::Data::OpenTablet()}} when commencing a read/scan operation.
# Even if ignoring the already existing tablet location information in the 
metacache (say, if it's short-lived and already expired), it would be natural 
for the client at least to rely on the information in the scan tokens when 
opening a table for scanning (given KUDU-1802 provisions are already there) 
instead of re-resolving tablet locations.

This became apparent after increasing the 
{{\-\-raft_heartbeat_interval_ms}} setting of the Kudu tablet servers tenfold 
(i.e. from the default 500 ms to 5000 ms): a client application exercising the 
described use case exhibited high latency when calling {{KuduScanner::Open()}} 
right after writing a few rows into the newly created table.

It's necessary to clarify what's going on.

[Attached|https://issues.apache.org/jira/secure/attachment/13081006/repro.cc] 
is a sample Kudu client application in C++.  Compile it similarly to 
{{$KUDU_HOME/examples/cpp/example.cc}} and run against a Kudu cluster having at 
least three tablet servers.


> Clarify on the C++ client meta-cache logic for 
> create-table-and-immediate-write-and-scan use case
> -------------------------------------------------------------------------------------------------
>
>                 Key: KUDU-3747
>                 URL: https://issues.apache.org/jira/browse/KUDU-3747
>             Project: Kudu
>          Issue Type: Task
>          Components: client
>    Affects Versions: 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, 1.17.1, 
> 1.18.1
>            Reporter: Alexey Serbin
>            Priority: Major
>         Attachments: repro.cc
>
>
> With KUDU-1802 implemented, the Kudu C++ client is able to obtain information 
> on tablet replica locations from scan tokens, avoiding extra calls to Kudu 
> masters when such information is present.  It works very well in the majority 
> of use cases.  However, a couple of surprises were discovered in a relatively 
> rare use case when a Kudu client application creates a new table, immediately 
> starts writing data into the newly created table, and then reads data back 
> from the table.
> The two unexpected items are:
> # After populating its metacache with the information on the location of 
> tablet replicas and using it while writing data into the newly created table, 
> the client still calls {{KuduClient::Data::GetTabletServer()}} as a part of 
> {{KuduScanner::Data::OpenTablet()}} when commencing a read/scan operation.
> # Even if ignoring the already existing tablet location information in the 
> metacache (say, if it's short-lived and already expired), it would be natural 
> for the client at least to rely on the information in the scan tokens when 
> opening a table for scanning (given KUDU-1802 provisions are already there) 
> instead of re-resolving tablet locations.
> This became apparent after increasing the 
> {{\-\-raft_heartbeat_interval_ms}} setting of the Kudu tablet servers tenfold 
> (i.e. from the default 500 ms to 5000 ms): a client application exercising 
> the described use case exhibited high latency when calling 
> {{KuduScanner::Open()}} right after writing a few rows into the newly created 
> table.
> It's necessary to clarify what's going on.
> [Attached|https://issues.apache.org/jira/secure/attachment/13081007/repro.cc] 
> is a sample Kudu client application in C++.  Compile it similarly to 
> {{$KUDU_HOME/examples/cpp/example.cc}} and run against a Kudu cluster having 
> at least three tablet servers.
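
The latency observation in the description can be checked by timing the 
scanner-open call directly.  Below is a minimal sketch of such a timing 
helper in plain C++; it is not taken from the attached repro.cc.  The names 
TimeCallMs and FakeOpen are hypothetical, and FakeOpen is only a stand-in 
that sleeps: in a real reproduction the callable would wrap the 
{{KuduScanner::Open()}} call instead.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not part of the Kudu client API): runs `op` once and
// returns the elapsed wall time in milliseconds.  steady_clock is used
// because it is monotonic and unaffected by system clock adjustments.
inline double TimeCallMs(const std::function<void()>& op) {
  const auto start = std::chrono::steady_clock::now();
  op();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical stand-in for the operation being measured.  It only sleeps
// for the given number of milliseconds; it does not talk to a Kudu cluster.
inline void FakeOpen(int delay_ms) {
  std::this_thread::sleep_for(std::chrono::milliseconds(delay_ms));
}
```

A call such as TimeCallMs([] { FakeOpen(50); }) returns the elapsed time of 
the wrapped operation.  Comparing that number for the first scanner open 
after the writes, with the heartbeat interval at 500 ms and then at 5000 ms, 
is one way to confirm the behavior described above.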



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
