Hi Earl,

Please see my responses below.


--- Earl Cahill <[EMAIL PROTECTED]> wrote:


As you probably saw in the OutlinkExtractor class, the links are
extracted with a regexp.  I'm no expert in the matter, but that
should answer your questions below...

> So, three open questions
> 
> 1.  Why doesn't my link (<a
> href=/sitemap.html>browse</a>) get parsed?

Because it doesn't match the aforementioned regexp.

> 2.  Why does my style get followed?

Because it matches the regexp.
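
To illustrate both answers, here is a minimal sketch in Python (not
the Java code from Nutch). The pattern below is hypothetical and is
NOT the one used in OutlinkExtractor; I am only assuming that the
extractor looks for text shaped like an absolute URL, and that your
stylesheet is referenced with an absolute URL. The example.com address
and the names URL_PATTERN and extract_links are made up for the
example.

```python
import re

# Hypothetical pattern, NOT the one in OutlinkExtractor: it only accepts
# text that starts with a scheme followed by "://", e.g. "http://".
URL_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*://[^\s\"'<>]+")


def extract_links(text):
    """Return every substring of text that looks like an absolute URL."""
    return URL_PATTERN.findall(text)


# Assumed page content: an absolute stylesheet URL and the relative,
# unquoted link from question 1.
page = (
    '<link rel="stylesheet" href="http://example.com/style.css">\n'
    '<a href=/sitemap.html>browse</a>\n'
)

links = extract_links(page)
print(links)
```

With a pattern like this, the stylesheet URL is picked up because it
looks like a URL, while /sitemap.html is skipped because it has no
scheme. The extractor never looks at the HTML tags themselves.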

> 3.  Where do I look for a list of all the failed
> links?

I don't think there is one.

I have just created an issue in JIRA about this:
http://issues.apache.org/jira/browse/NUTCH-119


Regards,
Sébastien.




        

        
                