[jira] [Commented] (FLINK-18312) SavepointStatusHandler and StaticFileServerHandler not redirect
[ https://issues.apache.org/jira/browse/FLINK-18312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17336154#comment-17336154 ]

Flink Jira Bot commented on FLINK-18312:

This issue was labeled "stale-major" 7 days ago and has not received any updates, so it is being deprioritized. If this ticket is actually Major, please raise the priority and ask a committer to assign you the issue or revive the public discussion.

> SavepointStatusHandler and StaticFileServerHandler not redirect
>
> Key: FLINK-18312
> URL: https://issues.apache.org/jira/browse/FLINK-18312
> Project: Flink
> Issue Type: Bug
> Components: Runtime / REST
> Affects Versions: 1.8.0, 1.9.0, 1.10.0
> Environment: 1. Deploy flink cluster in standalone mode on kubernetes and use two Jobmanagers for HA.
> 2. Deploy a kubernetes service for the two jobmanagers to provide a unified url.
> Reporter: Yu Wang
> Priority: Major
> Labels: stale-major
>
> Savepoint:
> 1. Deploy our flink cluster in standalone mode on kubernetes and use two Jobmanagers for HA.
> 2. Deploy a kubernetes service for the two jobmanagers to provide a unified url.
> 3. Send a savepoint trigger request to the leader Jobmanager.
> 4. Query the savepoint status from leader Jobmanager, get correct response.
> 5. Query the savepoint status from standby Jobmanager, the response will be 404.
>
> Jobmanager log:
> 1. Query log from leader Jobmanager, get leader log.
> 2. Query log from standby Jobmanager, get standby log.
>
> Both of these requests were redirected to the leader in 1.7.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-18312) SavepointStatusHandler and StaticFileServerHandler not redirect
[ https://issues.apache.org/jira/browse/FLINK-18312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17327768#comment-17327768 ]

Flink Jira Bot commented on FLINK-18312:

This major issue is unassigned and itself and all of its Sub-Tasks have not been updated for 30 days. So, it has been labeled "stale-major". If this ticket is indeed "major", please either assign yourself or give an update. Afterwards, please remove the label. In 7 days the issue will be deprioritized.
[jira] [Commented] (FLINK-18312) SavepointStatusHandler and StaticFileServerHandler not redirect
[ https://issues.apache.org/jira/browse/FLINK-18312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139087#comment-17139087 ]

Yu Wang commented on FLINK-18312:

[~chesnay], [~trohrmann] I agree with you; it would be better to move the cache layer behind the RPC layer.
[jira] [Commented] (FLINK-18312) SavepointStatusHandler and StaticFileServerHandler not redirect
[ https://issues.apache.org/jira/browse/FLINK-18312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138361#comment-17138361 ]

Till Rohrmann commented on FLINK-18312:

True, this is indeed a good idea to think through, [~chesnay]. It would be something like a {{RESTRequestBackend}} which receives all REST requests. It could run as part of the {{Dispatcher}} process. Ideally it would be coupled to the leadership of the {{Dispatcher}} in order to avoid another leader election service.
[jira] [Commented] (FLINK-18312) SavepointStatusHandler and StaticFileServerHandler not redirect
[ https://issues.apache.org/jira/browse/FLINK-18312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138276#comment-17138276 ]

Chesnay Schepler commented on FLINK-18312:

I suppose for synchronizing between REST servers we would need to put all caching/persistent data structures/operations behind the RPC layer, i.e., into the Dispatcher (or some component thereof). This may not be too bad in the long run, and would solve a number of issues.

I don't really want to start doing redirections in the REST layer; then we would have 2 separate ways of redirecting requests. IIRC that is what we wanted to avoid initially.
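
The idea of moving operation state behind the RPC layer can be illustrated with a minimal sketch. It is not Flink code: `LeaderOperationStore`, `RestEndpoint`, and the string status values are hypothetical names chosen for illustration, and a plain object reference stands in for what would be an RPC call to the leading Dispatcher.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch only: operation state lives on the leader side, not in the REST endpoints. */
public class SharedBackendSketch {

    /** Hypothetical stand-in for state kept by the leading Dispatcher. */
    static final class LeaderOperationStore {
        private final Map<String, String> operations = new ConcurrentHashMap<>();

        void register(String triggerId) {
            operations.put(triggerId, "IN_PROGRESS");
        }

        boolean knows(String triggerId) {
            return operations.containsKey(triggerId);
        }
    }

    /** Stateless endpoint: every call is delegated to the leader-side store (an RPC in a real system). */
    static final class RestEndpoint {
        private final LeaderOperationStore leaderStore;

        RestEndpoint(LeaderOperationStore leaderStore) {
            this.leaderStore = leaderStore;
        }

        void triggerSavepoint(String triggerId) {
            leaderStore.register(triggerId);
        }

        /** Returns 200 if the leader knows the operation, 404 otherwise. */
        int queryStatus(String triggerId) {
            return leaderStore.knows(triggerId) ? 200 : 404;
        }
    }

    public static void main(String[] args) {
        LeaderOperationStore store = new LeaderOperationStore();
        RestEndpoint first = new RestEndpoint(store);
        RestEndpoint second = new RestEndpoint(store);

        first.triggerSavepoint("trigger-1");

        // Both endpoints see the operation because neither keeps its own cache.
        System.out.println("first endpoint:  " + first.queryStatus("trigger-1"));
        System.out.println("second endpoint: " + second.queryStatus("trigger-1"));
    }
}
```

With this layout it no longer matters which endpoint the Kubernetes service routes a status query to, at the cost of one extra hop to the leader per request.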
[jira] [Commented] (FLINK-18312) SavepointStatusHandler and StaticFileServerHandler not redirect
[ https://issues.apache.org/jira/browse/FLINK-18312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17138239#comment-17138239 ]

Till Rohrmann commented on FLINK-18312:

It is true that for asynchronous operations, the user needs to query the same {{RestServerEndpoint}} in order to get a response because other {{RestServerEndpoints}} don't know about these operations. I don't think that synchronizing between the {{RestServerEndpoints}} is a feasible solution though.

One potential solution could be to re-introduce the redirection logic and to make it configurable whether the {{RestServerEndpoint}} forwards the requests to the leading {{Dispatcher}} or sends a redirect response to the client.
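
A rough sketch of the configurable choice between forwarding and redirecting might look like the following. Everything here is an assumption made for illustration: the `Mode` enum, the `handle` method, the leader URL, and the request path are hypothetical and do not correspond to an existing Flink option or API. HTTP 307 (Temporary Redirect) is used because it preserves the request method.

```java
/** Illustrative sketch only: a non-leading endpoint either forwards a request or redirects the client. */
public class LeaderRoutingSketch {

    /** Hypothetical configuration switch. */
    enum Mode { FORWARD, REDIRECT }

    /** Minimal response: an HTTP status and an optional Location header value. */
    static final class Response {
        final int status;
        final String location; // null unless the response is a redirect

        Response(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    static Response handle(Mode mode, boolean isLeader, String leaderBaseUrl, String path) {
        if (isLeader || mode == Mode.FORWARD) {
            // The leader answers directly; in FORWARD mode a standby would relay
            // the call to the leader and pass the answer back to the client.
            return new Response(200, null);
        }
        // REDIRECT mode on a standby: tell the client where the leader is.
        return new Response(307, leaderBaseUrl + path);
    }

    public static void main(String[] args) {
        String leader = "http://jobmanager-leader:8081"; // hypothetical address
        String path = "/jobs/job-1/savepoints/trigger-1"; // illustrative path

        Response forwarded = handle(Mode.FORWARD, false, leader, path);
        Response redirected = handle(Mode.REDIRECT, false, leader, path);

        System.out.println("forward:  " + forwarded.status);
        System.out.println("redirect: " + redirected.status + " -> " + redirected.location);
    }
}
```

Redirecting only helps if the client can reach the leader's address directly, which may not hold when the endpoints sit behind a single service URL.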
[jira] [Commented] (FLINK-18312) SavepointStatusHandler and StaticFileServerHandler not redirect
[ https://issues.apache.org/jira/browse/FLINK-18312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136258#comment-17136258 ]

Yu Wang commented on FLINK-18312:

I think there is an issue in "AbstractAsynchronousOperationHandlers". This handler keeps a local in-memory cache, "completedOperationCache", that stores the pending savepoint operation before the request is redirected to the leader jobmanager, and this cache does not seem to be synced between the jobmanagers. As a result, only the jobmanager that received the savepoint trigger request can look up the status of the savepoint, while the others can only return 404.
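
The effect of a per-endpoint in-memory cache can be reproduced in a few lines. This is a simplified model, not the actual handler code: `RestEndpoint`, `triggerSavepoint`, and `queryStatus` are hypothetical names, and a plain map stands in for the operation cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch only: each endpoint tracks async operations in its own local memory. */
public class LocalCacheSketch {

    /** Hypothetical stand-in for one jobmanager's REST endpoint. */
    static final class RestEndpoint {
        // Local to this process; nothing replicates it to other endpoints.
        private final Map<String, String> operationCache = new ConcurrentHashMap<>();

        /** Records the triggered savepoint under its trigger id. */
        void triggerSavepoint(String triggerId) {
            operationCache.put(triggerId, "IN_PROGRESS");
        }

        /** Returns 200 if this endpoint knows the trigger id, 404 otherwise. */
        int queryStatus(String triggerId) {
            return operationCache.containsKey(triggerId) ? 200 : 404;
        }
    }

    public static void main(String[] args) {
        RestEndpoint leader = new RestEndpoint();
        RestEndpoint standby = new RestEndpoint();

        // The trigger request happens to reach the leader's endpoint.
        leader.triggerSavepoint("trigger-1");

        // A later status query may be routed to either endpoint by the service.
        System.out.println("leader:  " + leader.queryStatus("trigger-1"));
        System.out.println("standby: " + standby.queryStatus("trigger-1"));
    }
}
```

Because the Kubernetes service may route the status query to either jobmanager, the client sees the 404 whenever the query lands on the endpoint that did not receive the trigger request.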