ibzib commented on pull request #14719:
URL: https://github.com/apache/beam/pull/14719#issuecomment-836955490


   > Arrghh seems metrics are broken for the Portable runner. Can you maybe 
what is the deal @ibzib ?
   https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Phrase/131/
   
   Looks like it's hitting an NPE while unregistering metrics because the 
[taskExecutorManager](https://github.com/apache/flink/blob/9fd6ecf18ddb744a971f575ab11eb19b44383899/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L705)
 is null. This seems like a Flink bug to me -- if registers these metrics, they 
should remain accessible even after the slot manager has been shut down. 
Getting these metrics should probably return 0 rather than NPE.
   
   ```
   May 08, 2021 4:57:46 AM org.apache.flink.runtime.metrics.MetricRegistryImpl 
unregister
   WARNING: Error while unregistering metric: taskSlotsAvailable.
   java.lang.NullPointerException
        at 
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager.getNumberFreeSlots(DeclarativeSlotManager.java:715)
        at 
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager.lambda$registerSlotManagerMetrics$2(DeclarativeSlotManager.java:200)
        at 
org.apache.beam.runners.flink.metrics.Metrics.toString(Metrics.java:33)
        at 
org.apache.beam.runners.flink.metrics.FileReporter.notifyOfRemovedMetric(FileReporter.java:72)
        at 
org.apache.flink.runtime.metrics.MetricRegistryImpl.unregister(MetricRegistryImpl.java:435)
        at 
org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.close(AbstractMetricGroup.java:333)
        at 
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager.close(DeclarativeSlotManager.java:241)
        at 
org.apache.flink.runtime.resourcemanager.ResourceManager.stopResourceManagerServices(ResourceManager.java:298)
        at 
org.apache.flink.runtime.resourcemanager.ResourceManager.onStop(ResourceManager.java:275)
        at 
org.apache.flink.runtime.rpc.RpcEndpoint.internalCallOnStop(RpcEndpoint.java:214)
        at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StartedState.terminate(AkkaRpcActor.java:563)
        at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleControlMessage(AkkaRpcActor.java:186)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
        at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
        at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
        at akka.actor.ActorCell.invoke(ActorCell.scala:561)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
        at akka.dispatch.Mailbox.run(Mailbox.scala:225)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at 
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at 
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
   
   May 08, 2021 4:57:46 AM org.apache.flink.runtime.metrics.MetricRegistryImpl 
unregister
   WARNING: Error while unregistering metric: taskSlotsTotal.
   java.lang.NullPointerException
        at 
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager.getNumberRegisteredSlots(DeclarativeSlotManager.java:705)
        at 
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager.lambda$registerSlotManagerMetrics$3(DeclarativeSlotManager.java:202)
        at 
org.apache.beam.runners.flink.metrics.Metrics.toString(Metrics.java:33)
        at 
org.apache.beam.runners.flink.metrics.FileReporter.notifyOfRemovedMetric(FileReporter.java:72)
        at 
org.apache.flink.runtime.metrics.MetricRegistryImpl.unregister(MetricRegistryImpl.java:435)
        at 
org.apache.flink.runtime.metrics.groups.AbstractMetricGroup.close(AbstractMetricGroup.java:333)
        at 
org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager.close(DeclarativeSlotManager.java:241)
        at 
org.apache.flink.runtime.resourcemanager.ResourceManager.stopResourceManagerServices(ResourceManager.java:298)
        at 
org.apache.flink.runtime.resourcemanager.ResourceManager.onStop(ResourceManager.java:275)
        at 
org.apache.flink.runtime.rpc.RpcEndpoint.internalCallOnStop(RpcEndpoint.java:214)
        at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StartedState.terminate(AkkaRpcActor.java:563)
        at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleControlMessage(AkkaRpcActor.java:186)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
        at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
        at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
        at akka.actor.ActorCell.invoke(ActorCell.scala:561)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
        at akka.dispatch.Mailbox.run(Mailbox.scala:225)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at 
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at 
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to