[ 
https://issues.apache.org/jira/browse/FLINK-25904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17508643#comment-17508643
 ] 

Matthew McMahon commented on FLINK-25904:
-----------------------------------------

I am seeing something similar on 1.14.4 and getting this error:

{code}
2022-03-18 08:07:21,281 ERROR org.apache.flink.runtime.rest.handler.job.checkpoints.CheckpointingStatisticsHandler [] - Unhandled exception.
org.apache.commons.math3.exception.NullArgumentException: input array
        at org.apache.commons.math3.util.MathArrays.verifyValues(MathArrays.java:1650) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at org.apache.commons.math3.stat.descriptive.AbstractUnivariateStatistic.test(AbstractUnivariateStatistic.java:158) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at org.apache.commons.math3.stat.descriptive.rank.Percentile.evaluate(Percentile.java:272) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at org.apache.commons.math3.stat.descriptive.rank.Percentile.evaluate(Percentile.java:241) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogramStatistics$CommonMetricsSnapshot.getPercentile(DescriptiveStatisticsHistogramStatistics.java:158) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogramStatistics.getQuantile(DescriptiveStatisticsHistogramStatistics.java:52) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at org.apache.flink.runtime.checkpoint.StatsSummarySnapshot.getQuantile(StatsSummarySnapshot.java:108) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at org.apache.flink.runtime.rest.messages.checkpoints.StatsSummaryDto.valueOf(StatsSummaryDto.java:81) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at org.apache.flink.runtime.rest.handler.job.checkpoints.CheckpointingStatisticsHandler.createCheckpointingStatistics(CheckpointingStatisticsHandler.java:129) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at org.apache.flink.runtime.rest.handler.job.checkpoints.CheckpointingStatisticsHandler.handleRequest(CheckpointingStatisticsHandler.java:84) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at org.apache.flink.runtime.rest.handler.job.checkpoints.CheckpointingStatisticsHandler.handleRequest(CheckpointingStatisticsHandler.java:58) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at org.apache.flink.runtime.rest.handler.job.AbstractAccessExecutionGraphHandler.handleRequest(AbstractAccessExecutionGraphHandler.java:68) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at org.apache.flink.runtime.rest.handler.job.AbstractExecutionGraphHandler.lambda$handleRequest$0(AbstractExecutionGraphHandler.java:87) ~[flink-dist_2.12-1.14.4.jar:1.14.4]
        at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) [?:1.8.0_322]
        at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) [?:1.8.0_322]
        at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456) [?:1.8.0_322]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_322]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_322]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_322]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_322]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_322]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_322]
        at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
{code}

I have 2 Job Managers running on different nodes, deployed with 
https://github.com/lyft/flinkk8soperator.

The error occurs on the Job Manager that is not currently the leader.

It doesn't seem to affect operation, though: if I kill the leading JM, the 
other one picks up the job.
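
For anyone reading the trace: the exception message "input array" indicates 
that the percentile computation received a null sample array. Below is a 
minimal, stdlib-only sketch of a defensive quantile helper that returns NaN 
in that situation instead of throwing. The class QuantileGuard and the 
method safeQuantile are hypothetical names used only for illustration; they 
are not part of Flink or commons-math3, the interpolation is a simple linear 
one rather than the commons-math3 estimator, and this is not a claim about 
how the issue is or should be fixed in Flink.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of Flink or commons-math3. */
final class QuantileGuard {
    private QuantileGuard() {}

    /**
     * Returns the q-quantile (0.0 <= q <= 1.0) of the samples using linear
     * interpolation between the two nearest ranks, or NaN when there is no
     * data (null or empty array) instead of throwing.
     */
    static double safeQuantile(double[] samples, double q) {
        if (q < 0.0 || q > 1.0) {
            throw new IllegalArgumentException("q must be in [0, 1], got " + q);
        }
        // Guard: a missing sample array yields NaN rather than an exception.
        if (samples == null || samples.length == 0) {
            return Double.NaN;
        }
        // Sort a copy so the caller's array is left untouched.
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        double pos = q * (sorted.length - 1);
        int lower = (int) Math.floor(pos);
        int upper = (int) Math.ceil(pos);
        return sorted[lower] + (pos - lower) * (sorted[upper] - sorted[lower]);
    }

    public static void main(String[] args) {
        // No data available: prints NaN instead of failing the request.
        System.out.println(safeQuantile(null, 0.5));
        // Median of three samples.
        System.out.println(safeQuantile(new double[] {3.0, 1.0, 2.0}, 0.5));
    }
}
```

Returning NaN keeps a statistics endpoint responsive when no samples are 
available; whether that is the right behaviour for the REST handler in the 
trace is a separate design question.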

> NullArgumentException in case of increasing number of nodes for the job 
> ------------------------------------------------------------------------
>
>                 Key: FLINK-25904
>                 URL: https://issues.apache.org/jira/browse/FLINK-25904
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Metrics
>    Affects Versions: 1.14.3
>            Reporter: Sergey Nuyanzin
>            Priority: Major
>
> We have a job running on one node.
> After increasing the number of nodes to, e.g., 3, the job starts failing on 
> the new nodes with 
> {noformat}ERROR Unhandled exception. (org.apache.flink.runtime.rest.handler.job.checkpoints.CheckpointingStatisticsHandler:260)
> org.apache.commons.math3.exception.NullArgumentException: input array
>          at org.apache.commons.math3.util.MathArrays.verifyValues(MathArrays.java:1650) ~[flink-dist_2.12-1.14.3.jar:1.14.3]
>          at org.apache.commons.math3.stat.descriptive.AbstractUnivariateStatistic.test(AbstractUnivariateStatistic.java:158) ~[flink-dist_2.12-1.14.3.jar:1.14.3]
>          at org.apache.commons.math3.stat.descriptive.rank.Percentile.evaluate(Percentile.java:272) ~[flink-dist_2.12-1.14.3.jar:1.14.3]
>          at org.apache.commons.math3.stat.descriptive.rank.Percentile.evaluate(Percentile.java:241) ~[flink-dist_2.12-1.14.3.jar:1.14.3]
>          at org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogramStatistics$CommonMetricsSnapshot.getPercentile(DescriptiveStatisticsHistogramStatistics.java:158) >
>          at org.apache.flink.runtime.metrics.DescriptiveStatisticsHistogramStatistics.getQuantile(DescriptiveStatisticsHistogramStatistics.java:52) ~[flink-dist_2.12-1.14.3.>
>          at org.apache.flink.runtime.checkpoint.StatsSummarySnapshot.getQuantile(StatsSummarySnapshot.java:108) ~[flink-dist_2.12-1.14.3.jar:1.14.3]
>          at org.apache.flink.runtime.rest.messages.checkpoints.StatsSummaryDto.valueOf(StatsSummaryDto.java:81) ~[flink-dist_2.12-1.14.3.jar:1.14.3]
>          at org.apache.flink.runtime.rest.handler.job.checkpoints.CheckpointingStatisticsHandler.createCheckpointingStatistics(CheckpointingStatisticsHandler.java:129) ~[fli>
>          at org.apache.flink.runtime.rest.handler.job.checkpoints.CheckpointingStatisticsHandler.handleRequest(CheckpointingStatisticsHandler.java:84) ~[flink-dist_2.12-1.14>
>          at org.apache.flink.runtime.rest.handler.job.checkpoints.CheckpointingStatisticsHandler.handleRequest(CheckpointingStatisticsHandler.java:58) ~[flink-dist_2.12-1.14>
>          at org.apache.flink.runtime.rest.handler.job.AbstractAccessExecutionGraphHandler.handleRequest(AbstractAccessExecutionGraphHandler.java:68) ~[flink-dist_2.12-1.14.3>
>          at org.apache.flink.runtime.rest.handler.job.AbstractExecutionGraphHandler.lambda$handleRequest$0(AbstractExecutionGraphHandler.java:87) ~[flink-dist_2.12-1.14.3.ja>
>          at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:642) [?:?]
>          at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) [?:?]
>          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
>          at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
>          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
>          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
>          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
>          at java.lang.Thread.run(Thread.java:829) [?:?]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
