Hi Lewis,

I think I have narrowed the problem down to the SelectorEntryComparator class
nested in GeneratorJob. In the debugger, at the time of the crash, I noticed a
single instance of SelectorEntryComparator shared across multiple reducer
tasks. The class inherits from org.apache.hadoop.io.WritableComparator, which
has a few members that are not protected against concurrent use. At some point
multiple threads may access those members in a WritableComparator.compare
call. I modified SelectorEntryComparator and that seems to have solved the
problem, but I am not sure whether the change is appropriate and/or sufficient
(does it cover only GENERATE?).

Original code:
============================

  public static class SelectorEntryComparator extends WritableComparator {
    public SelectorEntryComparator() {
      super(SelectorEntry.class, true);
    }
  }

Modified code:
============================
  public static class SelectorEntryComparator extends WritableComparator {
    public SelectorEntryComparator() {
      super(SelectorEntry.class, true);
    }
    
    @Override
    synchronized public int compare(byte[] b1, int s1, int l1, byte[] b2, int 
s2, int l2) {
        return super.compare(b1, s1, l1, b2, s2, l2);
    }    
  }
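
To illustrate why synchronizing compare() may help, here is a minimal,
self-contained sketch of the suspected pattern. It does not use Hadoop:
ScratchComparator and ComparatorRaceDemo are hypothetical stand-ins, and the
assumption is only that the real comparator decodes both operands into
instance fields before comparing them, so a shared instance has shared
scratch state.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in (not a Hadoop class): decodes both operands into
// instance fields before comparing, so the fields act as shared scratch space.
class ScratchComparator {
    private int key1;
    private int key2;

    // Without "synchronized", two threads could interleave their writes to
    // key1/key2 and compare values that belong to different calls.
    public synchronized int compare(byte[] b1, int s1, byte[] b2, int s2) {
        key1 = ByteBuffer.wrap(b1, s1, 4).getInt();
        key2 = ByteBuffer.wrap(b2, s2, 4).getInt();
        return Integer.compare(key1, key2);
    }
}

class ComparatorRaceDemo {
    // Runs several workers against ONE shared comparator and returns how many
    // comparisons disagreed with the known ordering of the inputs.
    static int countWrongResults(int threads, int iterations) {
        ScratchComparator shared = new ScratchComparator();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Integer>> results = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int base = t * 1000; // each worker compares its own pair
            Callable<Integer> worker = () -> {
                byte[] small = ByteBuffer.allocate(4).putInt(base).array();
                byte[] large = ByteBuffer.allocate(4).putInt(base + 1).array();
                int wrong = 0;
                for (int i = 0; i < iterations; i++) {
                    if (shared.compare(small, 0, large, 0) >= 0) wrong++;
                    if (shared.compare(large, 0, small, 0) <= 0) wrong++;
                }
                return wrong;
            };
            results.add(pool.submit(worker));
        }
        int total = 0;
        try {
            for (Future<Integer> f : results) total += f.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("wrong results: " + countWrongResults(4, 100_000));
    }
}
```

With the synchronized keyword in place every call sees only its own operands.
Removing it from the stand-in lets calls interleave, which can produce wrong
orderings. An alternative that avoids lock contention would be one comparator
instance per thread, but whether that is practical in GeneratorJob is not
something this sketch shows.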

Regards,

Vyacheslav Pascarel


-----Original Message-----
From: lewis john mcgibbney [mailto:lewi...@apache.org] 
Sent: Wednesday, June 21, 2017 1:41 PM
To: user@nutch.apache.org
Subject: [EXTERNAL] - Re: ERROR: Cannot run job worker!

Hi Vyacheslav,

Which version of Nutch are you using? 2.x?
lewis

On Wed, Jun 21, 2017 at 10:32 AM, <user-digest-h...@nutch.apache.org> wrote:

>
>
> From: Vyacheslav Pascarel <vpasc...@opentext.com>
> To: "user@nutch.apache.org" <user@nutch.apache.org>
> Cc:
> Bcc:
> Date: Wed, 21 Jun 2017 17:32:15 +0000
> Subject: ERROR: Cannot run job worker!
> Hello,
>
> I am writing an application that performs web site crawling using 
> Nutch REST services. The application:
>
>
>
