Hello,

What is the current status of the RTF parser? I saw that there was a licensing problem, and I was unable to find the code for the RTFParser.
When I crawl RTF files, I almost always get the following error:

Error parsing: xxx.rtf : failed(2,0): Can't be handled as Microsoft document. java.io.IOException: Invalid header signature; read 7015536635646467195, expected -2226271756974174256

This error was also pointed out by V. Shridar in an earlier mail, but unfortunately there was no response. Should I give up on indexing RTF for good?

-- 
View this message in context: http://www.nabble.com/rtf-parser-status-tp20223773p20223773.html
Sent from the Nutch - User mailing list archive at Nabble.com.
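
For reference, the two numbers in the error above can be turned back into the bytes they were read from. The sketch below assumes the 8-byte header was read as a little-endian long; the class and method names are made up for illustration and are not part of Nutch or any other library. Under that assumption, the "expected" value is the usual OLE2 compound-document magic number, and the "read" value is plain ASCII from the start of an RTF file. That suggests (this is only an inference from the message) that the files are valid RTF but are being handed to the Microsoft document parser.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class SignatureDecode {
    // Turn a signed 64-bit value back into its 8 bytes,
    // assuming (hypothetically) it was read in little-endian order.
    static byte[] toLittleEndianBytes(long value) {
        return ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(value)
                .array();
    }

    // Render bytes as space-separated uppercase hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long read = 7015536635646467195L;      // "read" value from the error
        long expected = -2226271756974174256L; // "expected" value from the error

        // Signature the Microsoft document parser was looking for.
        System.out.println(toHex(toLittleEndianBytes(expected)));
        // prints: D0 CF 11 E0 A1 B1 1A E1

        // What was actually found at the start of the file, as ASCII.
        System.out.println(new String(toLittleEndianBytes(read), StandardCharsets.US_ASCII));
        // prints: {\rtf1\a
    }
}
```

If that reading is right, the thing to check would be which parser the rtf content type is mapped to in the crawl configuration, rather than the files themselves.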
