[jira] [Commented] (FLINK-9805) HTTP Redirect to Active JM in Flink CLI
[ https://issues.apache.org/jira/browse/FLINK-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17321579#comment-17321579 ] Flink Jira Bot commented on FLINK-9805: --- This issue and all of its Sub-Tasks have not been updated for 180 days. So, it has been labeled "stale-minor". If you are still affected by this bug or are still interested in this issue, please give an update and remove the label. In 7 days the issue will be closed automatically. > HTTP Redirect to Active JM in Flink CLI > --- > > Key: FLINK-9805 > URL: https://issues.apache.org/jira/browse/FLINK-9805 > Project: Flink > Issue Type: Improvement > Components: Command Line Client >Affects Versions: 1.5.0 >Reporter: Sayat Satybaldiyev >Priority: Minor > Labels: newbie, pull-request-available, stale-minor > > Flink CLI allows specifying job manager address via --jobmanager flag. > However, in HA mode the JM can change and then standby JM does HTTP redirect > to the active one. However, during deployment via flink CLI with --jobmanager > flag option the CLI does not redirect to the active one. Thus fails to submit > job with "Could not complete the operation. Number of retries has been > exhausted" > > *Proposal:* > Honor JM HTTP redirect in case leadership changes in flink CLI with > --jobmanager flag active. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-9805) HTTP Redirect to Active JM in Flink CLI
[ https://issues.apache.org/jira/browse/FLINK-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552601#comment-16552601 ] ASF GitHub Bot commented on FLINK-9805: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/6307 > HTTP Redirect to Active JM in Flink CLI > --- > > Key: FLINK-9805 > URL: https://issues.apache.org/jira/browse/FLINK-9805 > Project: Flink > Issue Type: Improvement > Components: Client >Affects Versions: 1.5.0 >Reporter: Sayat Satybaldiyev >Assignee: vinoyang >Priority: Minor > Labels: newbie, pull-request-available > > Flink CLI allows specifying job manager address via --jobmanager flag. > However, in HA mode the JM can change and then standby JM does HTTP redirect > to the active one. However, during deployment via flink CLI with --jobmanager > flag option the CLI does not redirect to the active one. Thus fails to submit > job with "Could not complete the operation. Number of retries has been > exhausted" > > *Proposal:* > Honor JM HTTP redirect in case leadership changes in flink CLI with > --jobmanager flag active. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9805) HTTP Redirect to Active JM in Flink CLI
[ https://issues.apache.org/jira/browse/FLINK-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541590#comment-16541590 ] Chesnay Schepler commented on FLINK-9805: - No, the PR references the correct JIRA. The PR is related to this issue since it fixes the error message that is printed under the circumstances the user described.
[jira] [Commented] (FLINK-9805) HTTP Redirect to Active JM in Flink CLI
[ https://issues.apache.org/jira/browse/FLINK-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541560#comment-16541560 ] vinoyang commented on FLINK-9805: - Hi [~Zentol], the link between Jira and the GitHub PR looks wrong; it seems your PR title is wrong. Please check it.
[jira] [Commented] (FLINK-9805) HTTP Redirect to Active JM in Flink CLI
[ https://issues.apache.org/jira/browse/FLINK-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541556#comment-16541556 ] ASF GitHub Bot commented on FLINK-9805: --- Github user yanghua commented on the issue: https://github.com/apache/flink/pull/6307 +1
[jira] [Commented] (FLINK-9805) HTTP Redirect to Active JM in Flink CLI
[ https://issues.apache.org/jira/browse/FLINK-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16541363#comment-16541363 ] Sayat Satybaldiyev commented on FLINK-9805: --- Is there anything like a "debug" mode in the CLI? I only see a quiet mode there. Yep, if the active JM is specified everything works. Though it's annoying to change the address from time to time when leadership changes.
[jira] [Commented] (FLINK-9805) HTTP Redirect to Active JM in Flink CLI
[ https://issues.apache.org/jira/browse/FLINK-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540109#comment-16540109 ] ASF GitHub Bot commented on FLINK-9805: ---
GitHub user zentol opened a pull request: https://github.com/apache/flink/pull/6307
[FLINK-9805][rest] Catch JsonProcessingException in RestClient
## What is the purpose of the change
With this PR `RestClient#readRawResponse` catches all JSON-related exceptions. The exact circumstances under which a non-`JsonParseException` is thrown aren't clearly defined, but one case where it can occur is when the input is empty.
You can merge this pull request into a Git repository by running: $ git pull https://github.com/zentol/flink 9805
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/6307.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #6307
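The idea in the PR description, catching the common parent type so that sibling exception types are handled too, can be sketched in isolation. This is only an illustration, not the code from the PR: `ProcessingException`, `ParseException`, `MappingException`, `parse` and `describe` are hypothetical stand-ins defined here so the example runs without Jackson, and the assumption that empty input raises the non-parse sibling simply mirrors the case mentioned above.

```java
import java.io.IOException;

final class JsonFallbackDemo {

    // Hypothetical stand-ins for a JSON library's exception hierarchy:
    // one parent type with two sibling subclasses.
    static class ProcessingException extends IOException {
        ProcessingException(String message) { super(message); }
    }

    static class ParseException extends ProcessingException {
        ParseException(String message) { super(message); }
    }

    static class MappingException extends ProcessingException {
        MappingException(String message) { super(message); }
    }

    // Toy parser (assumption): empty input raises the non-parse sibling,
    // anything that is not wrapped in braces raises the parse exception.
    static String parse(String raw) throws ProcessingException {
        if (raw.isEmpty()) {
            throw new MappingException("no content to map");
        }
        if (!raw.startsWith("{") || !raw.endsWith("}")) {
            throw new ParseException("malformed JSON");
        }
        return raw;
    }

    // Catching the parent type covers both siblings, so an empty response
    // is reported the same way as a malformed one.
    static String describe(String raw) {
        try {
            return "parsed " + parse(raw);
        } catch (ProcessingException e) {
            return "unparseable response (" + e.getMessage() + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("{}"));
        System.out.println(describe("oops"));
        System.out.println(describe(""));
    }
}
```

Had `describe` caught only `ParseException`, the empty-input case would not have been handled by that catch block, which is the gap the PR description refers to.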
[jira] [Commented] (FLINK-9805) HTTP Redirect to Active JM in Flink CLI
[ https://issues.apache.org/jira/browse/FLINK-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540089#comment-16540089 ] Chesnay Schepler commented on FLINK-9805: - Well, that exception isn't really helpful; it looks like the exceptional parsing path doesn't cover all cases :/ Will open a PR to fix that... So this is indeed unexpected behavior: if you send anything to a standby JM, it should redirect to the active one. Is there any exception in the jobmanager logs? Usually all exceptions caused by REST operations are also logged on the server. If you specify the address of the active JM, I assume the submission works?
[jira] [Commented] (FLINK-9805) HTTP Redirect to Active JM in Flink CLI
[ https://issues.apache.org/jira/browse/FLINK-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540048#comment-16540048 ] Sayat Satybaldiyev commented on FLINK-9805: --- The Flink cluster is set up in HA mode. It has 2 JMs: one active, the other standby. However, the Flink client does not have access to ZK to retrieve the actual leader information, so I need to specify the endpoint of the JobManager with the -m flag. When the address I specify with --jobmanager belongs to a JM that exists but is in passive standby mode, the client fails to submit the job to the cluster with the exception: Could not complete the operation. Number of retries has been exhausted
Full log:
The program finished with the following exception:
org.apache.flink.client.program.ProgramInvocationException: Could not submit job 80ec6d942e910d0de00d50f5e6886461.
	at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:247)
	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464)
	at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:77)
	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:410)
	at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:785)
	at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:279)
	at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:214)
	at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1025)
	at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1101)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
	at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1101)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
	at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:370)
	at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
	at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
	at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:214)
	at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
	at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
	at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
	at org.apache.flink.runtime.rest.RestClient$ClientHandler.readRawResponse(RestClient.java:428)
	at org.apache.flink.runtime.rest.RestClient$ClientHandler.channelRead0(RestClient.java:374)
	at org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at org.apache.flink.shaded.netty4.io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
	at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at org.apache.flink.shaded.netty4.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
	at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
	at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
	at org.apache.flink.shaded.netty4.io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:147)
	at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
	at
[jira] [Commented] (FLINK-9805) HTTP Redirect to Active JM in Flink CLI
[ https://issues.apache.org/jira/browse/FLINK-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540029#comment-16540029 ] Chesnay Schepler commented on FLINK-9805: - Alright, let's step back for a second; does the address you specify with the {{--jobmanager}} flag point to an actually running JobManager? Could you include the full exception?
[jira] [Commented] (FLINK-9805) HTTP Redirect to Active JM in Flink CLI
[ https://issues.apache.org/jira/browse/FLINK-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540011#comment-16540011 ] Sayat Satybaldiyev commented on FLINK-9805: --- [~Zentol] But shouldn't a client and server be decoupled? If the server redirects a client to a different HTTP endpoint, shouldn't the client follow it instead of just crashing and doing nothing? Due to security concerns, Flink's client can't always be given access to the ZooKeeper address to retrieve the correct leader address.
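The behavior being asked for, a client that follows the standby's HTTP redirect instead of failing, can be illustrated with a minimal sketch built only on the JDK (`java.net.http.HttpClient`, Java 11+, plus the JDK's built-in `com.sun.net.httpserver.HttpServer`). This is not Flink's client code: the two local servers are stand-ins for an active and a standby JM, and the class name `RedirectDemo`, the `/status` path, the response body and the 307 status code are all assumptions made for the demo.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

final class RedirectDemo {

    /** Local stand-in for the active JM: answers every request with the body "active". */
    static HttpServer newActive() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            byte[] body = "active".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        return server;
    }

    /** Local stand-in for the standby JM: redirects every request to the active server. */
    static HttpServer newStandby(String activeBaseUrl) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            exchange.getResponseHeaders().add("Location", activeBaseUrl + exchange.getRequestURI());
            exchange.sendResponseHeaders(307, -1); // -1: no response body
            exchange.close();
        });
        return server;
    }

    /** Sends a GET to the standby and returns the body of the response that finally comes back. */
    static String fetchViaStandby() {
        HttpServer active = null;
        HttpServer standby = null;
        try {
            active = newActive();
            active.start();
            standby = newStandby("http://127.0.0.1:" + active.getAddress().getPort());
            standby.start();

            // The client only knows the standby's address; it is configured to follow redirects.
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .followRedirects(HttpClient.Redirect.NORMAL)
                    .build();
            HttpRequest request = HttpRequest.newBuilder(
                            URI.create("http://127.0.0.1:" + standby.getAddress().getPort() + "/status"))
                    .GET()
                    .build();
            return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            if (standby != null) standby.stop(0);
            if (active != null) active.stop(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchViaStandby());
    }
}
```

With `followRedirects` left at the JDK default (`Redirect.NEVER`), the same request would return the 307 response from the standby rather than the active server's answer, which is roughly the situation described in this issue.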
[jira] [Commented] (FLINK-9805) HTTP Redirect to Active JM in Flink CLI
[ https://issues.apache.org/jira/browse/FLINK-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540008#comment-16540008 ] Chesnay Schepler commented on FLINK-9805: - This sounds like correct behavior to me. If I explicitly specify a JM address, I would expect the client to honor that, even if no JM exists at that address.