On Fri, 30 Jan 2009, ahammad wrote:
> Nutch doesn't do that, so I came to the conclusion that I'll have to change the msword parsers. From searching the web, I found that the best way to do this is to use the CustomProperties and the DocumentSummaryInformation classes.
I think we might already have a class to do basically what you need:

http://poi.apache.org/apidocs/org/apache/poi/hpsf/extractor/HPSFPropertiesExtractor.html

Nick
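
For illustration, here is a minimal sketch of how HPSFPropertiesExtractor might be used to dump the properties of a Word document. It assumes a reasonably recent POI release on the classpath, where the extractor can be closed. The class name WordPropertiesDump and the command-line path argument are placeholders made up for this example, not part of POI or Nutch.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hpsf.extractor.HPSFPropertiesExtractor;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Hypothetical helper class, named only for this example.
public class WordPropertiesDump {
    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: WordPropertiesDump <file.doc>");
            return;
        }

        // Read the OLE2 container (.doc, .xls, .ppt) into memory,
        // then hand it to the properties extractor.
        try (InputStream in = new FileInputStream(args[0]);
             HPSFPropertiesExtractor extractor =
                     new HPSFPropertiesExtractor(new POIFSFileSystem(in))) {

            // Properties from the SummaryInformation stream
            // (title, author, and so on), as plain text.
            System.out.println(extractor.getSummaryInformationText());

            // Properties from the DocumentSummaryInformation stream, as plain text.
            System.out.println(extractor.getDocumentSummaryInformationText());
        }
    }
}
```

Since the extractor returns plain text, a parser plugin could probably pass that text along as extra content or split it into metadata fields, but check the output on a few real documents before relying on its exact layout.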
