[ 
https://issues.apache.org/jira/browse/KUDU-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin updated KUDU-1693:
--------------------------------
    Description: 
Currently, the Kudu C++ client buffers incoming operations regardless of their 
destination tablet server.  Accordingly, it is only possible to set a limit on 
the _total_ buffer space, not per tablet server.  This approach works, but 
there is room for improvement: there are real-world scenarios where per-TS 
buffering would be more robust.  Besides, tablet servers impose a limit on the 
size of RPC operations.

Grouping write operations on a per-tablet-server basis would be beneficial for 
the 'one-out-of-many lagging tablet server' scenario.  There, all tablet 
servers for a table perform well except for one, which runs slow due to 
excessive IO, network issues, a failing disk, etc.  The problem is that the 
lagging server hinders the overall performance.  This is due to the current 
approach to buffer turnaround: a buffer is considered 'flushed', and its space 
is reclaimed all at once, only when _all_ operations in the buffer have 
completed.  So, if 1000 operations have already completed but 1 operation is 
still in progress, the whole buffer space is 'locked' and cannot be used.
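
The turnaround behavior described above can be illustrated with a small 
standalone model.  This is only a sketch under stated assumptions: BatchBuffer, 
LockedBytesShared and LockedBytesPerTs are hypothetical names invented for this 
illustration and are not part of the Kudu client API; the model only captures 
the 'reclaim everything when the last operation completes' rule.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical model (not Kudu code): a buffer whose space is reclaimed all
// at once, only when every operation placed into it has completed.
class BatchBuffer {
 public:
  void Add(std::size_t bytes) {
    used_bytes_ += bytes;
    ++in_flight_;
  }

  // Marks one operation as completed; space is released only when none remain.
  void CompleteOne() {
    if (in_flight_ == 0) return;
    if (--in_flight_ == 0) used_bytes_ = 0;
  }

  std::size_t used_bytes() const { return used_bytes_; }

 private:
  std::size_t used_bytes_ = 0;
  std::size_t in_flight_ = 0;
};

// Bytes still locked when one buffer is shared by all tablet servers and
// every operation has completed except those sent to `slow_ts`.
std::size_t LockedBytesShared(const std::map<std::string, std::size_t>& ops_per_ts,
                              std::size_t op_bytes, const std::string& slow_ts) {
  BatchBuffer shared;
  for (const auto& entry : ops_per_ts) {
    for (std::size_t i = 0; i < entry.second; ++i) shared.Add(op_bytes);
  }
  for (const auto& entry : ops_per_ts) {
    if (entry.first == slow_ts) continue;  // the lagging server never answers
    for (std::size_t i = 0; i < entry.second; ++i) shared.CompleteOne();
  }
  return shared.used_bytes();
}

// Same scenario, but with a separate buffer per tablet server.
std::size_t LockedBytesPerTs(const std::map<std::string, std::size_t>& ops_per_ts,
                             std::size_t op_bytes, const std::string& slow_ts) {
  std::size_t locked = 0;
  for (const auto& entry : ops_per_ts) {
    BatchBuffer buffer;
    for (std::size_t i = 0; i < entry.second; ++i) buffer.Add(op_bytes);
    if (entry.first != slow_ts) {
      for (std::size_t i = 0; i < entry.second; ++i) buffer.CompleteOne();
    }
    locked += buffer.used_bytes();
  }
  return locked;
}
```

In this model, with 1000 operations spread over two healthy servers and 1 
operation stuck on a lagging one, the shared buffer keeps the space of all 1001 
operations locked, whereas per-server buffers keep only the space of the single 
pending operation locked.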

A per-tablet-server buffer limit would also help to address scenarios with 
concurrent writes into tables with extremely diverse partition factors (like 2 
and 100).  E.g., consider a case where incoming write operations for tables 
with diverse partition factors are intermixed in the context of one session.  
The problem is that setting the total buffer space limit high is beneficial for 
writes into the table with many partitions (assuming those writes are evenly 
distributed across the participating tablets), but the same limit may exceed 
the server-side limit on the max transaction size if those writes are targeted 
at the table with only a few partitions.
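
The arithmetic behind this can be sketched as follows.  All names and numbers 
here are assumptions chosen for illustration (a 100 MiB total buffer and an 
8 MiB server-side limit); they are not actual Kudu defaults, and the helper 
functions are not part of the Kudu client API.

```cpp
#include <cstddef>

constexpr std::size_t kMiB = 1024 * 1024;

// Size of the batch destined for each tablet when `total_buffer_bytes` of
// buffered writes are evenly distributed across `num_tablets` tablets.
constexpr std::size_t BytesPerTablet(std::size_t total_buffer_bytes,
                                     std::size_t num_tablets) {
  return num_tablets == 0 ? 0 : total_buffer_bytes / num_tablets;
}

// True if the per-tablet batch stays within a (hypothetical) server-side
// limit on the size of a single write request.
constexpr bool FitsServerLimit(std::size_t total_buffer_bytes,
                               std::size_t num_tablets,
                               std::size_t server_limit_bytes) {
  return BytesPerTablet(total_buffer_bytes, num_tablets) <= server_limit_bytes;
}
```

Under these assumed numbers, a full 100 MiB buffer spread over 100 tablets 
yields 1 MiB per tablet, which fits an 8 MiB limit, while the same buffer 
spread over 2 tablets yields 50 MiB per tablet, which does not.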

  was:
Grouping write operations on per-tablet-server basis would be beneficial for 
'one-out-of-many lagging tablet server' scenario.  There, all tablet servers 
for a table perform good except for one which runs slow due to some reason 
(excessive IO, network issues, failing disk, etc.).  The problem is that the 
lagging server hinders buffers turnaround: a buffer is considered 'flushed' and 
its space is reclaimed at once when _all_ operations in the buffer are 
completed.  So, if 1000 operations are flushed but there is 1 operation still 
in progress, the whole buffer space is 'locked' and cannot be used.

Accordingly, introducing per-tablet-server buffer limit for write operations 
would help to address scenarios with concurrent writes into tables with very 
different partition factors (like 2 and 100).   E.g., the incoming operations 
for tables with very different partition factors are intermixed in the context 
of the same session.  The problem is that setting the total buffer space limit 
high is fine for the writes into the table with many partitions (assuming those 
writes are evenly distributed across participating tablets), but it may be over 
the server-side's limit for max transaction size if those writes are targeted 
for a table with a few partitions.


> Flush write operations on per-TS basis and add corresponding limit on the 
> buffer space
> --------------------------------------------------------------------------------------
>
>                 Key: KUDU-1693
>                 URL: https://issues.apache.org/jira/browse/KUDU-1693
>             Project: Kudu
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 1.0.0
>            Reporter: Alexey Serbin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
