It's OK, I found it.

But now I get some strange errors:

050406 155957 fetch okay, but can't parse http://localhost:8080/testsIndex/file.doc, reason: Content truncated at 70954 bytes. Parser can't handle incomplete msword file.
050406 155958 fetch okay, but can't parse http://localhost:8080/testsIndex/file.pdf, reason: Content truncated at 70957 bytes. Parser can't handle incomplete pdf file.
050406 160001 fetch okay, but can't parse http://localhost:8080/testsIndex/file.rtf, reason: Exception parsing RTF document

Thank you for your help.

Guillaume

> How do I enable these parsers? Which files must be
> modified, and how?
>
> Thank you very much!
>
> Guillaume
>
>
> > You have to enable these parsers in your plugin configuration. I know
> > pdf and doc work great myself; not sure whether the others are
> > supported.
> >
> > -byron
> >
> > -----Original Message-----
> > From: "guillaume lefebvre" <[EMAIL PROTECTED]>
> > To: "nutch-user" <[email protected]>
> > Date: Wed,  6 Apr 2005 13:41:43 +0200
> > Subject: PDF, XML, DOC, RTF Parsing
> >
> > > Hi,
> > >
> > > I'm a new user of Nutch.
> > >
> > > I have some problems indexing PDF, XML, DOC and RTF files. Is that
> > > normal? Does Nutch support PDF, XML, DOC and RTF parsing?
> > >
> > > Thank you!
> > > Guillaume
> > >
> > >
> >
> >
>

Access La Poste e-mail at www.laposte.net;
3615 LAPOSTENET (€0.34/min); tel.: 08 92 68 13 50 (€0.34/min)


