Which protocol are you using, Amit?

On Wed, Dec 4, 2013 at 10:46 PM, Amit Sela <[email protected]> wrote:

> In my case, the fetch got to that point in 45 minutes and has been stuck for
> another 75 minutes with those mappers.
> The log just keeps printing:
>
> org.apache.nutch.fetcher.Fetcher: -activeThreads=2, spinWaiting=0, fetchQueues.totalSize=0
>
> org.apache.nutch.fetcher.Fetcher: -activeThreads=2, spinWaiting=0, fetchQueues.totalSize=0
>
> org.apache.nutch.fetcher.Fetcher: -activeThreads=2, spinWaiting=0, fetchQueues.totalSize=0
>
> ....
>
>
>
> On Wed, Dec 4, 2013 at 4:31 PM, feng lu <[email protected]> wrote:
>
> > I see that it uses a do-while loop to wait for the threads to exit, sleeping
> > 1 second between checks. So even after a fetcher thread has finished, the
> > whole fetcher process will take a little longer to exit.
> >
> > The code is structured like this:
> >
> >     do {                                      // wait for threads to exit
> >       pagesLastSec = pages.get();
> >       bytesLastSec = (int)bytes.get();
> >
> >       try {
> >         Thread.sleep(1000);
> >       } catch (InterruptedException e) {}
> >
> >       ....
> >       reportStatus(pagesLastSec, bytesLastSec);
> >
> >       // your log output is coming from here
> >       LOG.info("-activeThreads=" + activeThreads + ", spinWaiting="
> >           + spinWaiting.get()
> >           + ", fetchQueues.totalSize=" + fetchQueues.getTotalSize());
> >
> >       if (!feeder.isAlive() && fetchQueues.getTotalSize() < 5) {
> >         fetchQueues.dump();
> >       }
> >       ....
> >       // check timelimit
> >       if (!feeder.isAlive()) {
> >         int hitByTimeLimit = fetchQueues.checkTimelimit();
> >         if (hitByTimeLimit != 0) reporter.incrCounter("FetcherStatus",
> >             "hitByTimeLimit", hitByTimeLimit);
> >       }
> >
> >       // some requests seem to hang, despite all intentions
> >       if ((System.currentTimeMillis() - lastRequestStart.get()) > timeout) {
> >         if (LOG.isWarnEnabled()) {
> >           LOG.warn("Aborting with "+activeThreads+" hung threads.");
> >         }
> >         return;
> >       }
> >
> >     } while (activeThreads.get() > 0);
> >
> >
> > On Wed, Dec 4, 2013 at 7:57 PM, Amit Sela <[email protected]> wrote:
> >
> > > In the fetch phase, I notice that some of the mappers take much longer
> > > to finish.
> > > For those tasks, the running-task screen of the MapReduce admin UI shows:
> > >
> > > *1 threads, 1 queues, 0 URLs queued, *
> > >
> > > So why are those tasks not complete?
> > >
> >
> >
> >
> > --
> > Don't Grow Old, Grow Up... :-)
> >
>
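For anyone who wants to see the behaviour in isolation, here is a minimal sketch of the wait loop quoted above. It is a simplified stand-in, not the actual Nutch source: the class name FetcherWaitSketch, the method waitForThreads, and the pollMs parameter are hypothetical and exist only for illustration. It shows why a task with active threads and an empty queue keeps logging once per poll until either the threads exit or the last request is older than the timeout.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Simplified stand-in for the fetcher's wait loop; not the actual Nutch source.
class FetcherWaitSketch {

  /**
   * Polls until no worker threads remain active, or until no request has been
   * started for longer than timeoutMs.
   *
   * @return true if all threads exited, false if the loop aborted on timeout
   */
  static boolean waitForThreads(AtomicInteger activeThreads,
                                AtomicLong lastRequestStart,
                                long timeoutMs,
                                long pollMs) throws InterruptedException {
    do {
      Thread.sleep(pollMs); // the quoted loop sleeps 1000 ms per pass

      // One status line per pass, like the repeated log output in this thread.
      System.out.println("-activeThreads=" + activeThreads.get());

      // Abort if the most recent request started too long ago.
      if (System.currentTimeMillis() - lastRequestStart.get() > timeoutMs) {
        System.out.println("Aborting with " + activeThreads.get() + " hung threads.");
        return false;
      }
    } while (activeThreads.get() > 0);
    return true;
  }

  public static void main(String[] args) throws InterruptedException {
    // Two threads that never finish, as in the log output in this thread.
    AtomicInteger active = new AtomicInteger(2);
    AtomicLong lastStart = new AtomicLong(System.currentTimeMillis());

    // Short timeout and poll interval so the demo ends quickly.
    boolean finished = waitForThreads(active, lastStart, 300, 100);
    System.out.println("finished cleanly: " + finished);
  }
}
```

With two threads that never decrement the counter, the loop only ends through the timeout branch, which matches a mapper that sits at activeThreads=2 with nothing queued.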
