Adriano,

        The email that you sent earlier seemed to have [EMAIL PROTECTED]
on a separate line.  As I understand it, each line needs to start with a
+ for a regular expression of things that should be included in the crawl,
a - for a regular expression of things that should not be included, or a #
for a comment.
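
        To make the +, - and # convention concrete, here is a minimal
Python sketch of how such a filter file could be read and applied. It is
not Nutch's own code. The function names, the two sample rules, the
example.org host, the "first matching rule decides" order and the
"reject when nothing matches" default are all assumptions made for
illustration.

```python
import re

def parse_rules(text):
    """Parse filter lines into (include, compiled_regex) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and # comments are skipped
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            # Assumption: this sketch refuses any other first character.
            raise ValueError(f"rule must start with +, - or #: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(rules, url):
    """Assumed semantics: the first matching rule decides, default is reject."""
    for include, regex in rules:
        if regex.search(url):
            return include
    return False

# Hypothetical rules, not taken from any real crawl-urlfilter.txt.
RULES = parse_rules(r"""
# skip image files
-\.(gif|jpg|png)$
# accept anything under example.org
+^http://([a-z0-9]*\.)*example\.org/
""")

print(accepts(RULES, "http://www.example.org/index.html"))  # True
print(accepts(RULES, "http://www.example.org/logo.png"))    # False
```

        In this sketch a line that begins with anything other than +, -
or # raises an error. Whether Nutch reports such a line or ignores it is
not something this example confirms.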

        I'm not sure what having [EMAIL PROTECTED] on its own line would
do, but in a regular expression the []s define a character class
that matches any one of the characters between them.
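
        As a small illustration of a character class, the Python sketch
below uses a made-up class, [?=&], which is not the line from your file.
The pattern and the function name are hypothetical.

```python
import re

# Hypothetical pattern: a class of three characters, ?, = and &.
char_class = re.compile(r"[?=&]")

def contains_any(url):
    """True if the URL contains at least one character from the class."""
    return char_class.search(url) is not None

print(contains_any("http://example.org/page?id=1"))  # True
print(contains_any("http://example.org/page.html"))  # False
```

        The class matches a single character at a time, so "[?=&]" on its
own finds any URL containing ?, = or & anywhere in it.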

        I hope that helps.

Jake.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Thursday, September 15, 2005 3:54 AM
To: [email protected]
Subject: Re: RE: crawl-urlfilter.txt


I'm sorry but I don't understand very well.
You said: "you try commenting that line out", but out??? Where??? In
what way???
 Thanks
                    Adriano

>       I'm fairly new to nutch myself, but this line doesn't look right to me:
>
># skip URLs containing certain characters as probable queries, etc.
>[EMAIL PROTECTED]
>
>       I'd try commenting that line out and try the crawl again.
>
>Jake.
>


