[
https://issues.apache.org/jira/browse/FLINK-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911145#comment-16911145
]
Till Rohrmann commented on FLINK-13750:
---------------------------------------
Hi Tison,
I would try to go the following way: The {{RestClusterClient}} should only need
the {{webMonitorRetrievalService}}. Hence we should try to get rid of the
{{dispatcherLeaderRetriever}} and then the {{HighAvailabilityServices}} stored
in the {{ClusterClient}}. Btw. the {{ClusterClient}} is a legacy class with a
lot of unneeded code.
Then I would introduce a {{ClientHighAvailabilityServices}} which has a method
{{LeaderRetrievalService getWebMonitorLeaderRetriever();}}. To avoid breaking
backwards compatibility, we could merely deprecate the same method in
{{HighAvailabilityServices}} instead of removing it.
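A minimal sketch of how the two interfaces could look. Only {{ClientHighAvailabilityServices}} and the method signature come from the proposal above; the {{LeaderRetrievalService}} stand-in is simplified, and extending {{AutoCloseable}} on the client interface is an assumption.

```java
// Simplified stand-in for Flink's LeaderRetrievalService (assumption: the real
// interface has more methods; only stop() is kept here).
interface LeaderRetrievalService {
    void stop() throws Exception;
}

// Proposed client-side view: exposes only what the RestClusterClient needs.
// Extending AutoCloseable is an assumption made for this sketch.
interface ClientHighAvailabilityServices extends AutoCloseable {
    LeaderRetrievalService getWebMonitorLeaderRetriever();
}

// The existing interface keeps the method, but marks it as deprecated so that
// current callers continue to compile.
interface HighAvailabilityServices extends AutoCloseable {
    /** @deprecated use ClientHighAvailabilityServices#getWebMonitorLeaderRetriever() */
    @Deprecated
    LeaderRetrievalService getWebMonitorLeaderRetriever();

    // Server-side accessors (blob store, checkpoint metadata, ...) omitted.
}
```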
Next, we would need to add a new method,
{{HighAvailabilityServicesFactory#createClientHAServices}}, which allows us to
create a {{ClientHighAvailabilityServices}} instance. This method should have a
default implementation which fails.
For backwards compatibility, we could still create a
{{HighAvailabilityServices}} if {{#createClientHAServices}} fails and then call
{{HighAvailabilityServices#getWebMonitorLeaderRetriever()}}.
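The fallback could be sketched roughly as follows. All types are simplified stand-ins: the real factory method takes configuration arguments that are omitted here, {{UnsupportedOperationException}} is only one possible way for the default implementation to fail, and {{ClientHaUtils}} is a hypothetical helper name. Resource cleanup is deliberately ignored in this sketch.

```java
// Simplified stand-ins for the Flink types (assumptions, not the real signatures).
interface LeaderRetrievalService {}

interface ClientHighAvailabilityServices {
    LeaderRetrievalService getWebMonitorLeaderRetriever();
}

interface HighAvailabilityServices {
    LeaderRetrievalService getWebMonitorLeaderRetriever();
}

interface HighAvailabilityServicesFactory {
    // Existing method; the real one takes configuration arguments.
    HighAvailabilityServices createHAServices();

    // New method with a failing default, so existing custom factories keep compiling.
    default ClientHighAvailabilityServices createClientHAServices() {
        throw new UnsupportedOperationException(
                "This factory does not provide client-side HA services.");
    }
}

// Hypothetical helper showing the backwards-compatible lookup.
final class ClientHaUtils {
    private ClientHaUtils() {}

    static LeaderRetrievalService webMonitorRetriever(HighAvailabilityServicesFactory factory) {
        try {
            // Preferred path: the factory supports the new client-side services.
            return factory.createClientHAServices().getWebMonitorLeaderRetriever();
        } catch (UnsupportedOperationException e) {
            // Legacy path: fall back to the full services and the deprecated method.
            return factory.createHAServices().getWebMonitorLeaderRetriever();
        }
    }
}
```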
To ensure proper resource cleanup, one either needs to pass the
{{(Client)HighAvailabilityServices}} to the {{RestClusterClient}} (ideally as an
{{AutoCloseable}}) or create a wrapper for the {{LeaderRetrievalService}} which
also closes the services when the {{LeaderRetrievalService}} is closed.
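The wrapper variant might look like the sketch below. {{ClosingLeaderRetrievalService}} is a hypothetical name, and the two interfaces are reduced stand-ins modeled on Flink's {{LeaderRetrievalService}} ({{start}}/{{stop}}); the owning services are passed as a plain {{AutoCloseable}}, which is an assumption.

```java
// Reduced stand-ins modeled on Flink's leader retrieval interfaces (assumptions).
interface LeaderRetrievalListener {}

interface LeaderRetrievalService {
    void start(LeaderRetrievalListener listener) throws Exception;
    void stop() throws Exception;
}

// Hypothetical wrapper: delegates all calls and, when stopped, also closes the
// HA services that created the wrapped retriever.
final class ClosingLeaderRetrievalService implements LeaderRetrievalService {
    private final LeaderRetrievalService delegate;
    private final AutoCloseable owningServices;

    ClosingLeaderRetrievalService(LeaderRetrievalService delegate, AutoCloseable owningServices) {
        this.delegate = delegate;
        this.owningServices = owningServices;
    }

    @Override
    public void start(LeaderRetrievalListener listener) throws Exception {
        delegate.start(listener);
    }

    @Override
    public void stop() throws Exception {
        try {
            delegate.stop();
        } finally {
            // Close the owning services even if stopping the delegate failed.
            owningServices.close();
        }
    }
}
```

With this shape the {{RestClusterClient}} would only hold a {{LeaderRetrievalService}} and would not need to know about the HA services at all; the alternative of passing the services in directly keeps ownership more explicit.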
I hope I haven't overlooked too many details here.
> Separate HA services between client-/ and server-side
> -----------------------------------------------------
>
> Key: FLINK-13750
> URL: https://issues.apache.org/jira/browse/FLINK-13750
> Project: Flink
> Issue Type: Improvement
> Components: Command Line Client, Runtime / Coordination
> Reporter: Chesnay Schepler
> Assignee: TisonKun
> Priority: Major
>
> Currently, we use the same {{HighAvailabilityServices}} on the client and
> server. However, the client does not need several of the features that the
> services currently provide (access to the blobstore or checkpoint metadata).
> Additionally, due to how these services are set up they also require the
> client to have access to the blob storage, despite it never actually being
> used, which can cause issues, like FLINK-13500.
> [~Tison] Would you be interested in this issue?