Thanks. Could you please be more specific about how to set up the URL filter?
Would it be something like http://mysite.doc? And how can I get all the .doc
files on mysite if a document is at, say, http://mysite/1/2/~user/a.doc?
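
To make the question concrete: I assume the filter needs a pattern that
matches on the .doc suffix anywhere under the host, rather than a literal URL.
A minimal sketch of that matching logic in plain Python follows. It is not
Nutch configuration; the host name "mysite" comes from my example above, and
the names DOC_URL and is_doc_url are made up for illustration.

```python
import re

# Hypothetical pattern: any URL on host "mysite" whose path ends in ".doc".
# IGNORECASE also lets ".DOC" through.
DOC_URL = re.compile(r"^http://mysite/.*\.doc$", re.IGNORECASE)


def is_doc_url(url: str) -> bool:
    """Return True if the URL is on mysite and its path ends in .doc."""
    return DOC_URL.match(url) is not None


if __name__ == "__main__":
    for url in (
        "http://mysite/1/2/~user/a.doc",  # nested path, should be accepted
        "http://mysite/index.html",       # not a .doc file
        "http://mysite.doc",              # a host name, not a .doc file
    ):
        print(url, is_doc_url(url))
```

If a filter really accepted only such URLs, I suppose the crawler could no
longer follow the HTML pages that link to the documents, so the pattern alone
may not be the whole answer.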

Is there any reference for the Word parser? I don't know how to use it. Thank you.


On Mon, 28 Mar 2005 14:59:57 +0200, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
> Set up a URL filter for any *.doc and install and use the Word parser;
> that is all you need to do...
> 
> On 28.03.2005 at 07:12, Eric Money wrote:
> 
> > Hi all,
> >
> > If I want to search a site but am only interested in the
> > files with the .doc suffix, how should I rewrite Nutch to
> > get all these files? Any comments and experiences
> > are appreciated. Thanks in advance.
> >
