Hi Mathieu,

It is a bug indeed. As Feng suggested, please open an issue on
https://issues.apache.org/jira/browse/NUTCH and attach a patch if you can.

Thanks

Julien


On 20 August 2014 02:59, feng lu <[email protected]> wrote:

> Yes, I think this is a bug in the bin/crawl script. It needs to store the exit
> status of the previously executed command.
>
> I think you can open an issue and add your patch.
>
>
>
>
> On Tue, Aug 19, 2014 at 8:14 PM, Bouchard Mathieu (DGTT) <
> [email protected]> wrote:
>
> > Hi,
> >
> > We are using Solr with Nutch to provide a complete search engine for our
> > website.
> >
> > I created a cron job that would use Nutch to crawl and update the Solr
> > index each night. This cron job is trying to automatically correct some
> > errors that could result in a corrupt crawldb. However, it seems that the
> > bin/crawl command doesn't correctly propagate errors coming from
> > bin/nutch.
> >
> > Here is an example from the bin/crawl script:
> >     $bin/nutch inject $CRAWL_PATH/crawldb $SEEDDIR
> >
> >     if [ $? -ne 0 ]
> >       then exit $?
> >     fi
> >
> > Even if there is an error in the nutch inject command, the crawl script
> > always returns 0. The way I understand it, the exit code returned is the
> > result of the shell test and not the result of the nutch inject command.
> >
> > To correct this, we would need to modify the script with something like:
> >     $bin/nutch inject $CRAWL_PATH/crawldb $SEEDDIR
> >     RETCODE=$?
> >
> >     if [ $RETCODE -ne 0 ]
> >       then exit $RETCODE
> >     fi
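
To see the difference between the two patterns in isolation, here is a minimal sketch. It assumes a hypothetical function failing_step standing in for "bin/nutch inject" and always failing with status 3; the names broken_check and fixed_check are also made up for illustration and are not part of bin/crawl.

```shell
#!/bin/sh
# failing_step is a hypothetical stand-in for "bin/nutch inject";
# it always fails with exit status 3.
failing_step() {
  return 3
}

# Pattern from the original script: by the time "exit $?" runs,
# $? holds the status of the [ ... ] test (0), not of failing_step.
broken_check() {
  (
    failing_step
    if [ $? -ne 0 ]
      then exit $?
    fi
  )
}

# Corrected pattern: save the status before anything else can overwrite it.
fixed_check() {
  (
    failing_step
    RETCODE=$?
    if [ "$RETCODE" -ne 0 ]
      then exit "$RETCODE"
    fi
  )
}

broken_check
BROKEN_STATUS=$?
fixed_check
FIXED_STATUS=$?

echo "broken: $BROKEN_STATUS"
echo "fixed: $FIXED_STATUS"
```

Run with sh, this prints "broken: 0" and "fixed: 3": the first pattern swallows the failure, while the second one propagates it to the caller.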
> >
> > I also have a problem with the bin/nutch generate command. This command
> > returns the same error code whether there is a real error or simply no new
> > segment to process, so there is no way to tell the two cases apart.
> >
> > I'm thinking of opening a ticket for these issues, but I'm wondering if
> > there was a reason the script was written this way?
> >
> > Thanks,
> >
> >
>
>
>
> --
> Don't Grow Old, Grow Up... :-)
>



-- 

Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
