Hi Vyacheslav,

Thanks for the update. Could you please open a ticket at
https://issues.apache.org/jira/projects/NUTCH? If you are able to submit a
pull request at https://github.com/apache/nutch/, it would be appreciated.

Lewis
On Sat, Jun 24, 2017 at 9:36 AM, <user-digest-h...@nutch.apache.org> wrote:
>
> From: Vyacheslav Pascarel <vpasc...@opentext.com>
> To: "user@nutch.apache.org" <user@nutch.apache.org>
> Cc:
> Bcc:
> Date: Fri, 23 Jun 2017 13:07:39 +0000
> Subject: RE: [EXTERNAL] - Re: ERROR: Cannot run job worker!
>
> Hi Lewis,
>
> I think I have narrowed the problem down to the SelectorEntryComparator
> class nested in GeneratorJob. While debugging the crash I noticed a single
> instance of SelectorEntryComparator shared across multiple reducer tasks.
> The class inherits from org.apache.hadoop.io.WritableComparator, which has
> a few members that are unprotected against concurrent use. At some point
> multiple threads may access those members inside a
> WritableComparator.compare call. I modified SelectorEntryComparator and
> that seems to have solved the problem, but I am not sure whether the change
> is appropriate and/or sufficient (does it cover GENERATE only?).
>
> Original code:
> ============================
>
> public static class SelectorEntryComparator extends WritableComparator {
>   public SelectorEntryComparator() {
>     super(SelectorEntry.class, true);
>   }
> }
>
> Modified code:
> ============================
>
> public static class SelectorEntryComparator extends WritableComparator {
>   public SelectorEntryComparator() {
>     super(SelectorEntry.class, true);
>   }
>
>   @Override
>   synchronized public int compare(byte[] b1, int s1, int l1, byte[] b2,
>       int s2, int l2) {
>     return super.compare(b1, s1, l1, b2, s2, l2);
>   }
> }
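To illustrate the race the thread describes, here is a minimal, self-contained Java sketch of the same pattern. It does not use Hadoop: `SharedStateComparator` is a hypothetical stand-in for `WritableComparator`, which (like the real class) deserializes both byte ranges into shared instance fields before comparing, so concurrent calls on one shared instance can interleave and corrupt state. The sketch applies the same fix proposed above for SelectorEntryComparator: marking `compare` as `synchronized` so only one thread touches the shared buffers at a time.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for WritableComparator: it copies both inputs into
// *shared* scratch buffers before comparing, mimicking the unprotected
// members the original mail refers to.
class SharedStateComparator {
    private final byte[] left = new byte[8];   // shared scratch state,
    private final byte[] right = new byte[8];  // unsafe without locking

    // The fix from the thread: serialize access with `synchronized`.
    public synchronized int compare(byte[] b1, int s1, int l1,
                                    byte[] b2, int s2, int l2) {
        System.arraycopy(b1, s1, left, 0, l1);
        System.arraycopy(b2, s2, right, 0, l2);
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int d = (left[i] & 0xff) - (right[i] & 0xff);
            if (d != 0) return d;
        }
        return l1 - l2;
    }
}

public class Main {
    public static void main(String[] args) throws Exception {
        // One instance shared by several worker threads, like the single
        // SelectorEntryComparator shared across reducer tasks.
        SharedStateComparator cmp = new SharedStateComparator();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger misorderings = new AtomicInteger();
        byte[] lo = {1, 2, 3};
        byte[] hi = {1, 2, 9};
        for (int t = 0; t < 4; t++) {
            pool.submit(() -> {
                for (int i = 0; i < 100_000; i++) {
                    // lo must always compare below hi; any other result
                    // would indicate corrupted shared state.
                    if (cmp.compare(lo, 0, 3, hi, 0, 3) >= 0) {
                        misorderings.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        System.out.println("misorderings=" + misorderings.get());
    }
}
```

With `synchronized` in place the comparisons stay consistent under concurrency; removing the keyword reintroduces the data race. Note that serializing every `compare` call can cost reduce-side sort throughput, which is why a per-thread (rather than shared) comparator instance would be an alternative fix.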