> > Sorry, I'm able to parse doc, docx, sxw, odt and rtf as well. After I
> removed the plugins.folder I changed in order to run Nutch inside Eclipse,
> everything works.
>

Good

>
> BTW, I see the following in my log file:
> 2010-10-08 13:56:32,555 WARN more.MoreIndexingFilter -
> http://ridder.uio.no/test1.xlsx: can't parse erroneous date:
> 2010-10-08T13:55:54Z
> 2010-10-08 13:56:32,558 WARN more.MoreIndexingFilter -
> http://ridder.uio.no/wtest1.docx: can't parse erroneous date:
> 2010-10-08T13:55:49Z
>
> Should I report this as an IndexingFilter bug? It seems that I need to
> rewrite it in order to parse the date correctly, but not a big issue right
> now.
>

Yes, please.

Thanks

Julien

--
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
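
For reference, the two values rejected in the warnings above (2010-10-08T13:55:54Z and 2010-10-08T13:55:49Z) have the shape of ISO 8601 UTC timestamps. The following is only a minimal sketch of parsing that shape with the standard java.time API (Java 8 or later). It is not the MoreIndexingFilter code and makes no claim about how that filter handles dates; the class name IsoDateSketch and the helper parseIsoUtc are hypothetical names used for illustration.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;

// Hypothetical illustration only; not part of Nutch.
public class IsoDateSketch {

    // Assumed helper: returns milliseconds since the epoch for an
    // ISO 8601 UTC timestamp such as "2010-10-08T13:55:54Z",
    // or -1 if the value does not match that format.
    static long parseIsoUtc(String value) {
        try {
            return Instant.parse(value).toEpochMilli();
        } catch (DateTimeParseException e) {
            return -1L;
        }
    }

    public static void main(String[] args) {
        // The two date strings taken from the log lines quoted above.
        System.out.println(parseIsoUtc("2010-10-08T13:55:54Z"));
        System.out.println(parseIsoUtc("2010-10-08T13:55:49Z"));
    }
}
```

If a rewritten filter needs to accept several date formats, a fallback like this could be tried after the existing patterns fail, so that already-working formats are unaffected.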

