[jira] [Updated] (FLINK-32870) Reading multiple small buffers by reading and slicing one large buffer for tiered storage

Yuxin Tan (Jira) Tue, 22 Aug 2023 00:55:05 -0700


     [ 
https://issues.apache.org/jira/browse/FLINK-32870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Yuxin Tan updated FLINK-32870:
------------------------------
    Description: 
Currently, when the file reader of tiered storage loads data from the disk 
file, it reads data at buffer granularity. Before compression, each buffer is 
32K by default. After compression, the size becomes smaller (possibly less 
than 5K), which is quite small for both the network buffer and file I/O. 
We should read multiple small buffers by reading one large buffer and slicing 
it, which reduces buffer competition and file I/O and should lead to better 
performance.

  was:
Currently, when the file reader of tiered storage loads data from the disk 
file, it reads data at buffer granularity. Before compression, each buffer is 
32K by default; after compression the size becomes smaller (possibly less 
than 5K), which is quite small for both the network buffer and file I/O. 
We should merge the multiple small buffers into a larger one to decrease the 
buffer competition and the file I/O, leading to better performance.


> Reading multiple small buffers by reading and slicing one large buffer for 
> tiered storage
> -----------------------------------------------------------------------------------------
>
>                 Key: FLINK-32870
>                 URL: https://issues.apache.org/jira/browse/FLINK-32870
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Network
>    Affects Versions: 1.18.0
>            Reporter: Yuxin Tan
>            Assignee: Yuxin Tan
>            Priority: Major
>
> Currently, when the file reader of tiered storage loads data from the disk 
> file, it reads data at buffer granularity. Before compression, each buffer is 
> 32K by default. After compression, the size becomes smaller (possibly less 
> than 5K), which is quite small for both the network buffer and file I/O. 
> We should read multiple small buffers by reading one large buffer and slicing 
> it, which reduces buffer competition and file I/O and should lead to better 
> performance.
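
The read-and-slice idea in the description can be illustrated with a minimal 
Java sketch. It is not the Flink implementation: the class name 
SlicedBufferReader, the method readAndSlice, and the on-disk layout (a 4-byte 
length header followed by the payload, repeated) are assumptions made only for 
this example. It fills one large buffer from the file and then hands out 
zero-copy slices, one per small buffer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink.
class SlicedBufferReader {

    // Assumed layout for this sketch: [4-byte length][payload], repeated.
    static final int HEADER_BYTES = 4;

    // Fills one large buffer starting at 'offset', then slices out every
    // complete small buffer it contains without copying the payload bytes.
    static List<ByteBuffer> readAndSlice(Path file, long offset, int largeBufferSize)
            throws IOException {
        ByteBuffer large = ByteBuffer.allocate(largeBufferSize);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long pos = offset;
            while (large.hasRemaining()) {
                int n = channel.read(large, pos);
                if (n < 0) {
                    break; // end of file
                }
                pos += n;
            }
        }
        large.flip();

        List<ByteBuffer> slices = new ArrayList<>();
        while (large.remaining() >= HEADER_BYTES) {
            // Peek at the length without moving the position.
            int length = large.getInt(large.position());
            if (length < 0 || large.remaining() - HEADER_BYTES < length) {
                break; // incomplete trailing record; a caller would re-read it later
            }
            large.position(large.position() + HEADER_BYTES);
            ByteBuffer slice = large.slice(); // shares memory with 'large'
            slice.limit(length);
            slices.add(slice);
            large.position(large.position() + length);
        }
        return slices;
    }
}
```

With payloads of roughly 5K and a large buffer of a few hundred kilobytes, one 
call replaces dozens of per-buffer reads and requests a single buffer instead 
of many. Because the slices share memory with the large buffer, a real reader 
would also need to recycle the large buffer only after all slices are 
released; this sketch leaves that out.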



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
