[
https://issues.apache.org/jira/browse/SAMZA-1346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063542#comment-16063542
]
Jake Maes commented on SAMZA-1346:
----------------------------------
Appreciate the feedback.
>If (for some reason) there was a change in JobModelManager that resulted in
>breaking this contract, the grouper can become more defensive by saying that
>host-affinity is enabled and it cannot operate without locality manager.
That's why this check was added: to guard against a change in JobModelManager.
The only difference is that I think this is the core check that must be
performed. If there's no locality manager, the balance method cannot work,
because it depends on the persistence provided by the locality manager. We
could add a secondary check against the config, but I don't think that check
would be self-sufficient, because it also depends on a contract that could
change: that enabling host affinity implies a non-null locality manager.
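To make the shape of the guard concrete, here is a minimal, self-contained
sketch. It is not the actual Samza code: LocalityStore and GuardedGrouper are
hypothetical stand-ins for LocalityManager and GroupByContainerCount, tasks are
plain strings, and the balancing logic is simplified. It only illustrates the
idea of checking for a null locality manager first, logging it, and deferring
to group().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the locality manager: returns the previously
// persisted task -> container index mapping. Not the real Samza interface.
interface LocalityStore {
    Map<String, Integer> readTaskAssignment();
}

// Hypothetical, simplified grouper; containers are lists of task names.
final class GuardedGrouper {
    private final int containerCount;

    GuardedGrouper(int containerCount) {
        if (containerCount <= 0) {
            throw new IllegalArgumentException("containerCount must be positive");
        }
        this.containerCount = containerCount;
    }

    // Stateless grouping: round-robin the tasks over the containers.
    List<List<String>> group(List<String> tasks) {
        List<List<String>> containers = emptyContainers();
        for (int i = 0; i < tasks.size(); i++) {
            containers.get(i % containerCount).add(tasks.get(i));
        }
        return containers;
    }

    // Stateful grouping: needs the persisted assignment, so the null check
    // on the store is the core check and comes before anything else.
    List<List<String>> balance(List<String> tasks, LocalityStore store) {
        if (store == null) {
            System.err.println("Locality store is null; cannot balance. Falling back to group().");
            return group(tasks);
        }
        Map<String, Integer> previous = store.readTaskAssignment();
        List<List<String>> containers = emptyContainers();
        List<String> unplaced = new ArrayList<>();
        for (String task : tasks) {
            Integer saved = previous.get(task);
            if (saved != null && saved >= 0 && saved < containerCount) {
                containers.get(saved).add(task); // keep the earlier placement
            } else {
                unplaced.add(task);
            }
        }
        // Tasks without a valid saved placement go to the emptiest container.
        for (String task : unplaced) {
            smallest(containers).add(task);
        }
        return containers;
    }

    private List<List<String>> emptyContainers() {
        List<List<String>> containers = new ArrayList<>();
        for (int i = 0; i < containerCount; i++) {
            containers.add(new ArrayList<>());
        }
        return containers;
    }

    private static List<String> smallest(List<List<String>> containers) {
        List<String> best = containers.get(0);
        for (List<String> c : containers) {
            if (c.size() < best.size()) {
                best = c;
            }
        }
        return best;
    }
}
```

In this sketch the guard keys off the null reference itself rather than the
host-affinity config, so it keeps working even if the relationship between the
config and the locality manager changes.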
> GroupByContainerCount.balance() should guard against null LocalityManager
> -------------------------------------------------------------------------
>
> Key: SAMZA-1346
> URL: https://issues.apache.org/jira/browse/SAMZA-1346
> Project: Samza
> Issue Type: Improvement
> Reporter: Jake Maes
> Assignee: Jake Maes
>
> While it's less likely after SAMZA-1334, we have seen cases of an NPE in
> embedded mode.
> {noFormat}
> org.apache.samza.SamzaException: Failed to run application
> at org.apache.samza.runtime.LocalApplicationRunner.run(LocalApplicationRunner.java:136)
> at com.linkedin.beam.runners.samza.runtime.fluent.FluentRuntime$RunnerTask.run(FluentRuntime.java:114)
> ... 1 more
> Caused by: java.lang.NullPointerException
> at org.apache.samza.container.grouper.task.GroupByContainerCount.balance(GroupByContainerCount.java:92)
> at org.apache.samza.coordinator.JobModelManager$.readJobModel(JobModelManager.scala:257)
> at org.apache.samza.coordinator.JobModelManager.readJobModel(JobModelManager.scala)
> at org.apache.samza.standalone.StandaloneJobCoordinator.<init>(StandaloneJobCoordinator.java:108)
> at org.apache.samza.standalone.StandaloneJobCoordinatorFactory.getJobCoordinator(StandaloneJobCoordinatorFactory.java:29)
> at org.apache.samza.processor.StreamProcessor.<init>(StreamProcessor.java:111)
> at org.apache.samza.processor.StreamProcessor.<init>(StreamProcessor.java:94)
> at org.apache.samza.runtime.LocalApplicationRunner.createStreamProcessor(LocalApplicationRunner.java:231)
> at org.apache.samza.runtime.LocalApplicationRunner.lambda$run$0(LocalApplicationRunner.java:125)
> at org.apache.samza.runtime.LocalApplicationRunner$$Lambda$35/1940982718.accept(Unknown Source)
> at java.util.ArrayList.forEach(ArrayList.java:1249)
> at org.apache.samza.runtime.LocalApplicationRunner.run(LocalApplicationRunner.java:121)
> ... 2 more
> {noFormat}
> It should be straightforward to defend against this case and provide better
> feedback in the logs. E.g. if the locality manager is null, then host
> affinity is not enabled and we could just defer to group().