[
https://issues.apache.org/jira/browse/FLINK-13750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910848#comment-16910848
]
TisonKun commented on FLINK-13750:
----------------------------------
Hi [~Zentol] & [~till.rohrmann].
After investigating, I noticed that {{ClusterClient}} does not need to hold a
field that is, or is like, {{highAvailabilityServices}}. Toward the goal of
making {{ClusterClient}} an interface rather than an abstract class, we can push
the initialization logic down into {{RestClusterClient}} and {{MiniClusterClient}}.
Here are two possible directions for the separation; I post them here for advice.
1. Introduce utility functions in {{HighAvailabilityServicesUtils}} that return a
limited set of high-availability services regarded as client-side services,
without introducing any new class or interface. (A prototype can be found at
https://github.com/TisonKun/flink/commit/1ea7c4ed6c7c2ce2a82da48bcacfd20e2bc0fdfd)
pros:
- easy to implement
- in the custom HA scenario, users don't need to modify their code unless
their implementation has an issue similar to FLINK-13500.
cons:
- there is no explicit client-side service concept.
- {{HighAvailabilityServicesUtils}} knows details of Standalone and ZooKeeper
implementation.
nit: for the prototype, we might split this into separate
{{getDispatcherLeaderRetrievalService}} and
{{getWebMonitorLeaderRetrievalService}} functions; the downside is that we would
initialize {{CuratorFramework}} and the custom HA service twice or more.
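A minimal sketch of what direction 1 could look like, to make the trade-off concrete. Everything here is an assumption for illustration: {{LeaderRetrievalService}} is a simplified stand-in rather than Flink's real interface, and {{ClientHaUtils}}, the config keys {{mode}}, {{address}} and {{quorum}}, and the returned strings are hypothetical. Note how the helper has to branch on the backend itself, which is the second con listed above.

```java
import java.util.Map;

// Simplified stand-in for Flink's LeaderRetrievalService (assumption, not the real API).
interface LeaderRetrievalService {
    String describe();
}

// Hypothetical helper in the spirit of HighAvailabilityServicesUtils:
// it builds only the retrieval service a client needs, with no new HA interface.
final class ClientHaUtils {
    private ClientHaUtils() {}

    // "mode", "address" and "quorum" are made-up config keys for this sketch.
    static LeaderRetrievalService createWebMonitorLeaderRetriever(Map<String, String> config) {
        String mode = config.getOrDefault("mode", "NONE");
        switch (mode) {
            case "NONE":
                // Standalone: the leader sits at a fixed, configured address.
                String address = config.get("address");
                return () -> "standalone:" + address;
            case "ZOOKEEPER":
                // ZooKeeper: a real implementation would create a Curator client here.
                String quorum = config.get("quorum");
                return () -> "zookeeper:" + quorum;
            default:
                throw new IllegalArgumentException("Unsupported HA mode: " + mode);
        }
    }
}
```

A client would call {{ClientHaUtils.createWebMonitorLeaderRetriever(config)}} once during its own initialization and never touch the blob store or checkpoint services.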
2. Introduce an interface {{RetrieverOnlyHighAvailabilityService}} which looks
like
{code:java}
interface RetrieverOnlyHighAvailabilityService {
    LeaderRetrievalService getDispatcherLeaderRetrievalService();
    LeaderRetrievalService getWebMonitorLeaderRetrievalService();
}
{code}
and implement it for different high-availability backends.
pros:
- a clear concept of separation between high-availability services.
- {{HighAvailabilityServicesUtils}} only passes the configuration to create a
{{RetrieverOnlyHighAvailabilityService}}, and only
{{RetrieverOnlyHighAvailabilityService}} knows the details.
cons:
- we need to implement {{RetrieverOnlyHighAvailabilityService}} for every
high-availability service.
- in the {{MiniClusterClient}} scenario, we actually use the service passed from
{{MiniCluster}}. Either we treat it as a special case or we completely change
the {{MiniClusterClient}} initialization logic.
- in the custom HA scenario, users have to implement a new interface.
nit:
In the current codebase it is not true that every {{ClusterClient}} shares the
same retrieval requirements: only {{RestClusterClient}} needs
{{getWebMonitorLeaderRetrievalService}}. Or, at a more conceptual level, the
client should only communicate with the WebMonitor, and requests to the
Dispatcher are routed by the WebMonitor.
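For comparison, a minimal sketch of direction 2 under stated assumptions. The interface {{RetrieverOnlyHighAvailabilityService}} is the one proposed above; {{LeaderRetrievalService}} is again a simplified stand-in for Flink's interface, and {{StandaloneRetrieverOnlyService}}, {{SketchRestClient}} and the addresses are hypothetical names used only for illustration.

```java
// Simplified stand-in for Flink's LeaderRetrievalService (assumption, not the real API).
interface LeaderRetrievalService {
    String leaderAddress();
}

// The client-side interface proposed in direction 2.
interface RetrieverOnlyHighAvailabilityService {
    LeaderRetrievalService getDispatcherLeaderRetrievalService();
    LeaderRetrievalService getWebMonitorLeaderRetrievalService();
}

// Hypothetical standalone backend: leaders live at fixed, configured addresses.
final class StandaloneRetrieverOnlyService implements RetrieverOnlyHighAvailabilityService {
    private final String dispatcherAddress;
    private final String webMonitorAddress;

    StandaloneRetrieverOnlyService(String dispatcherAddress, String webMonitorAddress) {
        this.dispatcherAddress = dispatcherAddress;
        this.webMonitorAddress = webMonitorAddress;
    }

    @Override
    public LeaderRetrievalService getDispatcherLeaderRetrievalService() {
        return () -> dispatcherAddress;
    }

    @Override
    public LeaderRetrievalService getWebMonitorLeaderRetrievalService() {
        return () -> webMonitorAddress;
    }
}

// Hypothetical client that depends only on the narrow interface,
// so it never sees blob store or checkpoint services.
final class SketchRestClient {
    private final LeaderRetrievalService webMonitorRetriever;

    SketchRestClient(RetrieverOnlyHighAvailabilityService haServices) {
        this.webMonitorRetriever = haServices.getWebMonitorLeaderRetrievalService();
    }

    String webMonitorAddress() {
        return webMonitorRetriever.leaderAddress();
    }
}
```

Here the backend-specific details stay inside each implementation, at the cost of one new class per backend, including custom ones.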
> Separate HA services between client-/ and server-side
> -----------------------------------------------------
>
> Key: FLINK-13750
> URL: https://issues.apache.org/jira/browse/FLINK-13750
> Project: Flink
> Issue Type: Improvement
> Components: Command Line Client, Runtime / Coordination
> Reporter: Chesnay Schepler
> Assignee: TisonKun
> Priority: Major
>
> Currently, we use the same {{HighAvailabilityServices}} on the client and
> server. However, the client does not need several of the features that the
> services currently provide (access to the blobstore or checkpoint metadata).
> Additionally, due to how these services are set up, they also require the
> client to have access to the blob storage, despite it never actually being
> used, which can cause issues, like FLINK-13500.
> [~Tison] Would you be interested in this issue?
--
This message was sent by Atlassian Jira
(v8.3.2#803003)