Re: Re:flink s3 checkpoint 一直IN_PROGRESS(100%)直到失败
Hi 一般是卡在最后一步从JM写checkpoint meta上面了,建议使用jstack等工具检查一下JM的cpu栈,看问题出在哪里。 祝好 唐云 From: Sun.Zhu <17626017...@163.com> Sent: Tuesday, March 8, 2022 14:12 To: user-zh@flink.apache.org Subject: Re:flink s3 checkpoint 一直IN_PROGRESS(100%)直到失败 图挂了 https://postimg.cc/Z9XdxwSk 在 2022-03-08 14:05:39,"Sun.Zhu" <17626017...@163.com> 写道: hi all, flink 1.13.2,将checkpoint 写到S3但是一直成功不了,一直显示IN_PROGRESS,直到超时失败,有大佬遇到过吗?
?????? Flink????????????
streaming api ??sql api streaming api ---- ??: "user-zh" https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/ hjw <1010445...@qq.com.invalid ??2022??3??9?? 01:32?? sql??SELECT color, sum(id) FROM T GROUP BY colorFlinkTgroup by key??color)??Flink???
Re: flink on yarn任务停止发生异常
异常提示的很明确了,做savepoint的过程中有的task不在running状态,你可以看下你的作业是否发生了failover。 QiZhu Chan 于2022年3月8日周二 17:37写道: > Hi, > > 各位社区大佬们,帮忙看一下如下报错是什么原因造成的?正常情况下客户端日志应该返回一个savepoint路径,但却出现如下异常日志,同时作业已被停止并且查看hdfs有发现当前job产生的savepoint文件 > > > > >
Re: Flink计算机制疑问
只存计算结果,来一条数据更新一次状态并且下发出去。具体可以参考下state: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/state/ hjw <1010445...@qq.com.invalid> 于2022年3月9日周三 01:32写道: > 如下一段sql:SELECT color, sum(id) FROM T GROUP BY > colorFlink在实际计算中会将T流整个存入状态里,流中来一条数据触发一次全流计算。亦或是状态只存计算结果,来了新的一条数据,在原来同group > by key(color)结果进行加减即可。这种具体Flink的运行机制请问有文档翻阅或者有规律进行总结吗?谢谢。
Re: Re: k8s native session 问题咨询
你用新版本试一下,看着是已经修复了 https://issues.apache.org/jira/browse/FLINK-19212 Best, Yang 崔深圳 于2022年3月9日周三 10:31写道: > > > > web ui报错:请求这个接口: /jobs/overview,时而报错, Exception on server > side:\norg.apache.flink.runtime.rpc.akka.exceptions.AkkaRpcException: > Failed to serialize the result for RPC call : > requestMultipleJobDetails.\n\tat > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:417)\n\tat > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$sendAsyncResponse$2(AkkaRpcActor.java:373)\n\tat > java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836)\n\tat > java.util.concurrent.CompletableFuture.uniHandleStage(CompletableFuture.java:848)\n\tat > java.util.concurrent.CompletableFuture.handle(CompletableFuture.java:2168)\n\tat > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.sendAsyncResponse(AkkaRpcActor.java:365)\n\tat > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:332)\n\tat > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:217)\n\tat > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:78)\n\tat > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163)\n\tat > akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24)\n\tat > akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20)\n\tat > scala.PartialFunction.applyOrElse(PartialFunction.scala:123)\n\tat > scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)\n\tat > akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20)\n\tat > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)\n\tat > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)\n\tat > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)\n\tat > akka.actor.Actor.aroundReceive(Actor.scala:537)\n\tat > akka.actor.Actor.aroundReceive$(Actor.scala:535)\n\tat > akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220)\n\tat > akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)\n\tat > akka.actor.ActorCell.invoke(ActorCell.scala:548)\n\tat > akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)\n\tat > akka.dispatch.Mailbox.run(Mailbox.scala:231)\n\tat > akka.dispatch.Mailbox.exec(Mailbox.scala:243)\n\tat > java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)\n\tat > java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)\n\tat > java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)\n\tat > java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)\nCaused > by: java.io.NotSerializableException: java.util.HashMap$Values\n\tat > java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)\n\tat > java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)\n\tat > java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)\n\tat > java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)\n\tat > java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)\n\tat > java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)\n\tat > org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:632)\n\tat > org.apache.flink.runtime.rpc.akka.AkkaRpcSerializedValue.valueOf(AkkaRpcSerializedValue.java:66)\n\tat > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:400)\n\t... > 29 more\n\nEnd of exception on server side" > > > > > > > > > > > > > > > 在 2022-03-09 09:56:21,"yu'an huang" 写道: > >你好,路由到非master节点会有什么问题吗,非master节点在处理请求时应该会通过HA服务找到Active的Job manager, > 然后向Active Job manager拿到结果再返回给client. > > > >> On 7 Mar 2022, at 7:46 PM, 崔深圳 wrote: > >> > >> k8s native session 模式下,配置了ha,job_manager 的数量为3,然后web ui,通过rest > service访问,总是路由到非master节点,有什么办法使其稳定吗? > > >
Re:Re: k8s native session 问题咨询
web ui报错:请求这个接口: /jobs/overview,时而报错, Exception on server side:\norg.apache.flink.runtime.rpc.akka.exceptions.AkkaRpcException: Failed to serialize the result for RPC call : requestMultipleJobDetails.\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:417)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$sendAsyncResponse$2(AkkaRpcActor.java:373)\n\tat java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836)\n\tat java.util.concurrent.CompletableFuture.uniHandleStage(CompletableFuture.java:848)\n\tat java.util.concurrent.CompletableFuture.handle(CompletableFuture.java:2168)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.sendAsyncResponse(AkkaRpcActor.java:365)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:332)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:217)\n\tat org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:78)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163)\n\tat akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24)\n\tat akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20)\n\tat scala.PartialFunction.applyOrElse(PartialFunction.scala:123)\n\tat scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)\n\tat akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20)\n\tat scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)\n\tat scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)\n\tat scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)\n\tat akka.actor.Actor.aroundReceive(Actor.scala:537)\n\tat akka.actor.Actor.aroundReceive$(Actor.scala:535)\n\tat akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220)\n\tat akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)\n\tat akka.actor.ActorCell.invoke(ActorCell.scala:548)\n\tat akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)\n\tat akka.dispatch.Mailbox.run(Mailbox.scala:231)\n\tat akka.dispatch.Mailbox.exec(Mailbox.scala:243)\n\tat java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)\n\tat java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)\n\tat java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)\n\tat java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)\nCaused by: java.io.NotSerializableException: java.util.HashMap$Values\n\tat java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)\n\tat java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)\n\tat java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)\n\tat java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)\n\tat java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)\n\tat java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)\n\tat org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:632)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcSerializedValue.valueOf(AkkaRpcSerializedValue.java:66)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:400)\n\t... 29 more\n\nEnd of exception on server side" 在 2022-03-09 09:56:21,"yu'an huang" 写道: >你好,路由到非master节点会有什么问题吗,非master节点在处理请求时应该会通过HA服务找到Active的Job manager, 然后向Active >Job manager拿到结果再返回给client. > >> On 7 Mar 2022, at 7:46 PM, 崔深圳 wrote: >> >> k8s native session 模式下,配置了ha,job_manager 的数量为3,然后web ui,通过rest >> service访问,总是路由到非master节点,有什么办法使其稳定吗? >
Re:Re: k8s native session 问题咨询
web ui报错:请求这个接口: /jobs/overview,时而报错, Exception on server side:\norg.apache.flink.runtime.rpc.akka.exceptions.AkkaRpcException: Failed to serialize the result for RPC call : requestMultipleJobDetails.\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:417)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$sendAsyncResponse$2(AkkaRpcActor.java:373)\n\tat java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836)\n\tat java.util.concurrent.CompletableFuture.uniHandleStage(CompletableFuture.java:848)\n\tat java.util.concurrent.CompletableFuture.handle(CompletableFuture.java:2168)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.sendAsyncResponse(AkkaRpcActor.java:365)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:332)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:217)\n\tat org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:78)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:163)\n\tat akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24)\n\tat akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20)\n\tat scala.PartialFunction.applyOrElse(PartialFunction.scala:123)\n\tat scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)\n\tat akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20)\n\tat scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)\n\tat scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)\n\tat scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)\n\tat akka.actor.Actor.aroundReceive(Actor.scala:537)\n\tat akka.actor.Actor.aroundReceive$(Actor.scala:535)\n\tat akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220)\n\tat akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)\n\tat akka.actor.ActorCell.invoke(ActorCell.scala:548)\n\tat akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)\n\tat akka.dispatch.Mailbox.run(Mailbox.scala:231)\n\tat akka.dispatch.Mailbox.exec(Mailbox.scala:243)\n\tat java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)\n\tat java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)\n\tat java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)\n\tat java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)\nCaused by: java.io.NotSerializableException: java.util.HashMap$Values\n\tat java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)\n\tat java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)\n\tat java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)\n\tat java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)\n\tat java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)\n\tat java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)\n\tat org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:632)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcSerializedValue.valueOf(AkkaRpcSerializedValue.java:66)\n\tat org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:400)\n\t... 29 more\n\nEnd of exception on server side" 在 2022-03-09 09:56:21,"yu'an huang" 写道: >你好,路由到非master节点会有什么问题吗,非master节点在处理请求时应该会通过HA服务找到Active的Job manager, 然后向Active >Job manager拿到结果再返回给client. > >> On 7 Mar 2022, at 7:46 PM, 崔深圳 wrote: >> >> k8s native session 模式下,配置了ha,job_manager 的数量为3,然后web ui,通过rest >> service访问,总是路由到非master节点,有什么办法使其稳定吗? >
Re: k8s native session 问题咨询
你好,路由到非master节点会有什么问题吗,非master节点在处理请求时应该会通过HA服务找到Active的Job manager, 然后向Active Job manager拿到结果再返回给client. > On 7 Mar 2022, at 7:46 PM, 崔深圳 wrote: > > k8s native session 模式下,配置了ha,job_manager 的数量为3,然后web ui,通过rest > service访问,总是路由到非master节点,有什么办法使其稳定吗?
Flink????????????
sql??SELECT color, sum(id) FROM T GROUP BY colorFlinkTgroup by key??color)??Flink???