It's not possible to tell. You would have to look into the job
manager's logs to check what happened. The task manager that was not
killed could have re-connected to the job manager if it was restarted
quickly after the failure. Why do you think that the task manager
would influence the job result, though?

On Mon, Jul 4, 2016 at 12:23 PM, Flavio Pompermaier
<pomperma...@okkam.it> wrote:
> No, I haven't.
> I fear that the unkilled task manager could have been the cause of this
> problem. The other day I ran the job and discovered that on some nodes
> there were zombie task managers that weren't terminated during
> stop-cluster. What do you think? What happens in this situation? Are old
> task managers still able to interfere with the new job manager?
> I didn't see them in the web dashboard, so I thought they weren't
> problematic at all and just killed them.
>
> On 4 Jul 2016 12:07 p.m., "Ufuk Celebi" <u...@apache.org> wrote:
>
> I guess Aljoscha was referring to whether you also have broadcast
> input or something like it?
>
> On Fri, Jul 1, 2016 at 7:05 PM, Flavio Pompermaier <pomperma...@okkam.it>
> wrote:
>> What do you mean exactly?
>>
>> On 1 Jul 2016 18:58, "Aljoscha Krettek" <aljos...@apache.org> wrote:
>>>
>>> Hi,
>>> do you have any data in the coGroup/groupBy operators that you use,
>>> besides the input data?
>>>
>>> Cheers,
>>> Aljoscha
>>>
>>> On Fri, 1 Jul 2016 at 14:17 Flavio Pompermaier <pomperma...@okkam.it>
>>> wrote:
>>>>
>>>> Hi to all,
>>>> I have a Flink job that computes data correctly when launched locally
>>>> from my IDE while it doesn't when launched on the cluster.
>>>>
>>>> Is there any suggestion/example for identifying the problematic
>>>> operators in such a case?
>>>> I think the root cause is that some operator (e.g.
>>>> coGroup/groupBy, etc.), which I assume sees all the data for a key,
>>>> maybe does not (because the data is partitioned among the nodes).
>>>>
>>>> Any help is appreciated,
>>>> Flavio
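On the partitioning concern in the quoted question: a keyed shuffle, which is what groupBy/coGroup perform, routes every record with the same key to the same downstream operator instance, so the group function does see all data for its key. Below is a minimal sketch of that routing in plain Python (not the Flink API; the function and names are illustrative, assuming a simple hash-partitioned shuffle):

```python
from collections import defaultdict

def partition(records, num_partitions, key_fn):
    """Hash-partition records by key, as a keyed shuffle (groupBy) would."""
    partitions = defaultdict(list)
    for rec in records:
        partitions[hash(key_fn(rec)) % num_partitions].append(rec)
    return partitions

records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
parts = partition(records, num_partitions=4, key_fn=lambda r: r[0])

# Every key's records land in exactly one partition, so a grouping
# operator downstream always sees the complete group for its keys.
for key in {r[0] for r in records}:
    owners = [p for p, recs in parts.items()
              if any(r[0] == key for r in recs)]
    assert len(owners) == 1
```

If results differ between local and cluster runs, the culprit is usually not the shuffle itself but key semantics (e.g. an unstable hashCode/equals on a custom key type), which only shows up once data is actually distributed across nodes.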
