[
https://issues.apache.org/jira/browse/FLINK-24293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
huntercc updated FLINK-24293:
-----------------------------
Description:
In the current blob storage design, tasks executed by the same TaskExecutor
share a BLOB storage dir, while tasks executed by different TaskExecutors use
different dirs. As a result, a TaskExecutor has to download the user jar even
if the same user jar has already been downloaded by other TaskExecutors on the
machine. We believe there is no need to download multiple copies of the same
user jar to the local machine; doing so exposes two main problems:
# The NIC bandwidth of the distribution terminal may become a bottleneck
!image-2021-09-15-20-43-17-304.png|width=695,height=193!
As shown in the figure above, 24640 Mbps of the total 25000 Mbps NIC bandwidth
was used when we launched a Flink job with 4000 TaskManagers, which causes a
long deployment time and Akka timeout exceptions.
# The duplicate copies take up more disk space
We would like to optimize the sharing mechanism for the user jar by allowing
tasks from the same job on a machine to share a blob storage dir and, more
specifically, the user jar in that dir. Only one task deployed to the machine
downloads the user jar from the BLOB server or distributed file storage, and
subsequent tasks just use the localized user jar. This way, the user jar of a
job only needs to be downloaded once per machine. Here is a comparison of job
startup time before and after the optimization.
||num of TM||before optimization||after optimization||
|1000|62s|37s|
|2000|104s|40s|
|3000|170s|43s|
|4000|211s|45s|
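
The "download once per machine" idea described above can be sketched roughly
as follows. This is only an illustration in Python, not Flink code and not the
implementation in the pull request: the function name localize_user_jar, the
blob_key naming scheme and the download callback are all hypothetical, and the
sketch assumes a POSIX system where fcntl.flock is available and where the
shared dir lives on a single local filesystem.

```python
import fcntl
import os
from pathlib import Path
from typing import Callable


def localize_user_jar(shared_dir: Path, blob_key: str,
                      download: Callable[[], bytes]) -> Path:
    """Return the local jar path, fetching the jar at most once per shared dir.

    Hypothetical helper: `download` stands in for a fetch from the BLOB
    server or distributed file storage.
    """
    shared_dir.mkdir(parents=True, exist_ok=True)
    jar_path = shared_dir / f"{blob_key}.jar"
    lock_path = shared_dir / f"{blob_key}.lock"

    # Fast path: an earlier task has already localized the jar. The jar only
    # appears under its final name after an atomic rename, so it is complete.
    if jar_path.exists():
        return jar_path

    with open(lock_path, "w") as lock_file:
        # Exclusive advisory lock; other tasks on the machine block here.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            # Re-check under the lock: another task may have finished the
            # download while this one was waiting.
            if not jar_path.exists():
                tmp_path = shared_dir / f"{blob_key}.tmp"
                tmp_path.write_bytes(download())
                # Atomic on the same filesystem, so readers never observe a
                # partially written jar.
                os.replace(tmp_path, jar_path)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
    return jar_path
```

With this shape, the first task to take the lock pays for the download and
every later task on the same machine returns the already localized file, which
is the behaviour the startup-time comparison above relies on. Cleanup of the
shared dir when the job finishes is deliberately left out of the sketch.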
was:
In the current blob storage design, tasks executed by the same TaskExecutor
will share BLOBs storage dir and tasks executed by different TaskExecutor use
different dir. As a result, a TaskExecutor has to download user jar even if
there has been the same user jar downloaded by other TaskExecutors on the
machine. We believe that there is no need to download many copies of the same
user jar to the local, two main problems will by exposed:
# The NIC bandwidth of the distribution terminal may become a bottlenec
!image-2021-09-15-20-43-17-304.png|width=695,height=193! As shown in the figure
above, 24640 Mbps of the total 25000 Mbps NIC bandwidth is used when we
launched a flink job with 4000 TaskManagers, which will cause a long deployment
time and akka timeout exception.
# Take up more disk space
We expect to optimize the sharing mechanism of user jar by allowing tasks from
the same job on a machine to share blob storage dir, more specifically, share
the user jar in the dir. Only one task deployed to the machine will download
the user jar from BLOB server or distributed file storage, and the subsequent
tasks just use the localized user jar. In this way, the user jar of one job
only needs to be downloaded once on a machine. Here is a comparison of job
startup time before and after optimization.
||num of TM||before optimization||after optimization||
|1000|62s|37s|
|2000|104s|40s|
|3000|170s|43s|
|4000|211s|45s|
> Tasks from the same job on a machine share user jar
> ----------------------------------------------------
>
> Key: FLINK-24293
> URL: https://issues.apache.org/jira/browse/FLINK-24293
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Reporter: huntercc
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2021-09-15-20-43-11-758.png,
> image-2021-09-15-20-43-17-304.png
>
>
> In the current blob storage design, tasks executed by the same TaskExecutor
> share a BLOB storage dir, while tasks executed by different TaskExecutors use
> different dirs. As a result, a TaskExecutor has to download the user jar even
> if the same user jar has already been downloaded by other TaskExecutors on
> the machine. We believe there is no need to download multiple copies of the
> same user jar to the local machine; doing so exposes two main problems:
> # The NIC bandwidth of the distribution terminal may become a bottleneck
> !image-2021-09-15-20-43-17-304.png|width=695,height=193!
> As shown in the figure above, 24640 Mbps of the total 25000 Mbps NIC
> bandwidth was used when we launched a Flink job with 4000 TaskManagers, which
> causes a long deployment time and Akka timeout exceptions.
> # The duplicate copies take up more disk space
> We would like to optimize the sharing mechanism for the user jar by allowing
> tasks from the same job on a machine to share a blob storage dir and, more
> specifically, the user jar in that dir. Only one task deployed to the machine
> downloads the user jar from the BLOB server or distributed file storage, and
> subsequent tasks just use the localized user jar. This way, the user jar of a
> job only needs to be downloaded once per machine. Here is a comparison of job
> startup time before and after the optimization.
> ||num of TM||before optimization||after optimization||
> |1000|62s|37s|
> |2000|104s|40s|
> |3000|170s|43s|
> |4000|211s|45s|
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)