karound.
> >> Thanks
> >>
> >> Thomas
> >>
> >>
> >> From: Maximilian Michels [m...@apache.org]
> >> Sent: Tuesday, March 15, 2016 16:51
> >> To: user@flink.apache.org
> >> Cc:
> Sent: Tuesday, March 15, 2016 16:51
> To: user@flink.apache.org
> Cc: Niels Basjes
> Subject: Re: Flink job on secure Yarn fails after many hours
>
> Hi Thomas,
>
> Niels (CC) and I found out that you need at least Hadoop version 2.6.1
> to properly run Kerberos applications on Hadoop clust
0 AM, Thomas Lamirault
> <thomas.lamira...@ericsson.com> wrote:
>>
>> Hi Max,
>>
>> I will try these workarounds.
>> Thanks
>>
>> Thomas
>>
>>
>> From: Maximilian Michels [m...@apache.org]
>
homas
>
>
>
>
>
> From: ni...@basj.es [ni...@basj.es] on behalf of Niels Basjes
> [ni...@basjes.nl]
> Sent: Friday, December 4, 2015 10:40
> To: user@flink.apache.org
> Subject: Re: Flink job on secure Yarn fails after many hours
>
> Hi
Hi Niels,
Just got back from our CI. The build above would fail with a
Checkstyle error. I corrected that. Also I have built the binaries for
your Hadoop version 2.6.0.
Binaries:
https://drive.google.com/file/d/0BziY9U_qva1sZ1FVR3RWeVNrNzA/view?usp=sharing
Source:
I mentioned that the exception gets thrown when requesting container
status information. We need this to send a heartbeat to YARN but it is
not very crucial if this fails once for the running job. Possibly, we
could work around this problem by retrying N times in case of an
exception.
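
A minimal sketch of the "retry N times" idea described above, written in Python for brevity rather than taken from Flink's code. The names `call_with_retries`, `max_attempts`, and `delay_seconds` are hypothetical and exist only for illustration; the real change would wrap the YARN container status request.

```python
import time


def call_with_retries(operation, max_attempts=3, delay_seconds=0.0):
    """Call operation(); if it raises, retry until max_attempts is reached.

    Hypothetical helper for illustration only, not Flink or YARN API.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                # Out of attempts: let the last failure propagate.
                raise
            # Wait briefly before the next attempt.
            time.sleep(delay_seconds)
```

With this shape, a single failed status request would not take down the running job; only N consecutive failures would surface the exception.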
Would it be
Hi Niels,
You mentioned you have the option to update Hadoop and redeploy the
job. Would be great if you could do that and let us know how it turns
out.
Cheers,
Max
On Wed, Dec 2, 2015 at 3:45 PM, Niels Basjes wrote:
> Hi,
>
> I posted the entire log from the first log line at
Hi,
I posted the entire log from the first log line at the moment of failure to
the very end of the logfile.
This is all I have.
As far as I understand, the Kerberos and keytab mechanism in Hadoop YARN
catches the "Invalid Token" and then (if keytab) gets a new
Kerberos ticket (or
Great. Here is the commit to try out:
https://github.com/mxm/flink/commit/f49b9635bec703541f19cb8c615f302a07ea88b3
If you already have the Flink repository, check it out using
git fetch https://github.com/mxm/flink/ \
f49b9635bec703541f19cb8c615f302a07ea88b3 && git checkout FETCH_HEAD
Sure, just give me the git repo url to build and I'll give it a try.
Niels
On Wed, Dec 2, 2015 at 4:28 PM, Maximilian Michels wrote:
> I mentioned that the exception gets thrown when requesting container
> status information. We need this to send a heartbeat to YARN but it is
I forgot you're using Flink 0.10.1. The above was for the master.
So here's the commit for Flink 0.10.1:
https://github.com/mxm/flink/commit/a41f3866f4097586a7b2262093088861b62930cd
git fetch https://github.com/mxm/flink/ \
a41f3866f4097586a7b2262093088861b62930cd && git checkout FETCH_HEAD
Hi Niels,
Sorry to hear you experienced this exception. At first glance, it
looks like a bug in Hadoop to me.
> "Not retrying because the invoked method is not idempotent, and unable to
> determine whether it was invoked"
That is nothing to worry about. This is Hadoop's internal retry