Yeah, I checked -- no old YARN processes running. ZK and Kafka are the only 
other two Java processes running on my machine.

Martin

On 20 Feb 2014, at 00:20, Chris Riccomini <[email protected]> wrote:
> Hey Martin,
> 
> Have you checked if you've leaked a NM process?
> 
> I've seen cases in the past where an NM wasn't properly shut down, and the
> pid was overwritten. Could be that.
> 
> Cheers,
> Chris
> 
> On 2/19/14 4:18 PM, "Martin Kleppmann" <[email protected]> wrote:
> 
>> Hi,
>> 
>> I'm suddenly having problems with YARN as set up by hello-samza. It was
>> working fine earlier today and I don't recall changing anything in my
>> setup -- so I just wanted to check if anyone has seen this before.
>> 
>> The YARN resourcemanager seems to start up fine (at least the web UI
>> works, and nothing strange-looking in the log). But when the nodemanager
>> starts, I see a lot of this in its logs:
>> 
>> 14/02/20 00:00:04 INFO ipc.Client: Retrying connect to server:
>> 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); maxRetries=45
>> 14/02/20 00:00:08 INFO ipc.Client: Retrying connect to server:
>> 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is
>> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
>> 14/02/20 00:00:09 INFO ipc.Client: Retrying connect to server:
>> 0.0.0.0/0.0.0.0:8031. Already tried 1 time(s); retry policy is
>> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
>> 14/02/20 00:00:11 INFO ipc.Client: Retrying connect to server:
>> 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is
>> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
>> 14/02/20 00:00:12 INFO ipc.Client: Retrying connect to server:
>> 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is
>> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
>> 
>> ...etc repeating every few seconds, and never connecting. But the RM is
>> listening on localhost:8031 (verified with netcat).
>> 
>> run-job.sh similarly sits there, writing a similar message to
>> hello-samza/deploy/samza/undefined-samza-container-name.log every few
>> seconds (but with port 8032 instead of 8031).
>> 
>> Any ideas?
>> 
>> Thanks,
>> Martin
>> 
> 

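Note for readers of this thread: the netcat check mentioned above (confirming that something is listening on localhost:8031) could also be sketched in Python. This is a minimal sketch and not part of the original messages. The helper name `port_is_open` is hypothetical, and the ports 8031 and 8032 are simply the ones quoted in the logs above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests TCP reachability,
    much like `nc -z host port`.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and resolution failures.
        return False


if __name__ == "__main__":
    # Ports taken from the thread: 8031 (nodemanager log) and 8032 (run-job.sh log).
    for port in (8031, 8032):
        state = "open" if port_is_open("localhost", port) else "closed or unreachable"
        print(f"localhost:{port} is {state}")
```

A True result only shows that the port accepts TCP connections. It says nothing about which address the client is configured to use. The log lines above show the client targeting 0.0.0.0:8031 rather than localhost, so checking that address too may be informative.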