Hi Jukka,

On 10/7/07 1:01 PM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:

> Author: jukka
> Date: Sun Oct 7 13:01:46 2007
> New Revision: 582674
>
> URL: http://svn.apache.org/viewvc?rev=582674&view=rev
> Log:
> TIKA-46 - Use Metadata in Parser
> - With improvements by Chris Mattmann
>
> Modified:
> incubator/tika/trunk/src/main/java/org/apache/tika/utils/ParseUtils.java
> URL:
> http://svn.apache.org/viewvc/incubator/tika/trunk/src/main/java/org/apache/tika/utils/ParseUtils.java?rev=582674&r1=582673&r2=582674&view=diff
> ==============================================================================
> --- incubator/tika/trunk/src/main/java/org/apache/tika/utils/ParseUtils.java (original)
> +++ incubator/tika/trunk/src/main/java/org/apache/tika/utils/ParseUtils.java

I'm not sure I get these changes for this file: did you just remove it and
add it back? Was it formatting that changed?

> Modified: incubator/tika/trunk/src/test/java/org/apache/tika/TestParsers.java
> URL:
> http://svn.apache.org/viewvc/incubator/tika/trunk/src/test/java/org/apache/tika/TestParsers.java?rev=582674&r1=582673&r2=582674&view=diff
> ==============================================================================
> --- incubator/tika/trunk/src/test/java/org/apache/tika/TestParsers.java (original)
> +++ incubator/tika/trunk/src/test/java/org/apache/tika/TestParsers.java Sun
> - assertEquals("Sample Powerpoint Slide", contents.get("title")
> - .getValue());
> + assertEquals("Sample Powerpoint Slide", metadata.get("title"));

Your commit didn't include my updates to the above, which changed it to use
Metadata.TITLE instead of the literal string "title".

> }
>
> public void testWORDxtraction() throws Exception {
> @@ -130,15 +131,16 @@
> assertEquals(s1, s2);
> ParserConfig config = tc.getParserConfig("application/msword");
> Parser parser = ParserFactory.getParser(config);
> - Map<String, Content> contents = config.getContents();
> + Collection<Content> contents = config.getContents();
> assertNotNull(contents);
> + Metadata metadata = new Metadata();
> InputStream stream = new FileInputStream(file);
> try {
> - parser.parse(stream, contents.values());
> + parser.parse(stream, contents, metadata);
> } finally {
> stream.close();
> }
> - assertEquals("Sample Word Document", contents.get("title").getValue());
> + assertEquals("Sample Word Document", metadata.get("title"));

Same here.

> }
>
> public void testEXCELExtraction() throws Exception {
> @@ -156,15 +158,16 @@
> .contains(expected));
> ParserConfig config = tc.getParserConfig("application/vnd.ms-excel");
> Parser parser = ParserFactory.getParser(config);
> - Map<String, Content> contents = config.getContents();
> + Collection<Content> contents = config.getContents();
> assertNotNull(contents);
> + Metadata metadata = new Metadata();
> InputStream stream = new FileInputStream(file);
> try {
> - parser.parse(stream, contents.values());
> + parser.parse(stream, contents, metadata);
> } finally {
> stream.close();
> }
> - assertEquals("Simple Excel document", contents.get("title").getValue());
> + assertEquals("Simple Excel document", metadata.get("title"));

And here.

> }
>
> public void testOOExtraction() throws Exception {
> @@ -185,18 +188,18 @@
> Parser parser = ParserFactory.getParser(config);
> assertNotNull(parser);
>
> - Map<String, Content> contents = config.getContents();
> + Collection<Content> contents = config.getContents();
> assertNotNull(contents);
> + Metadata metadata = new Metadata();
> InputStream stream = new FileInputStream(file);
> try {
> - parser.parse(stream, contents.values());
> + parser.parse(stream, contents, metadata);
> } finally {
> stream.close();
> }
> - assertEquals("Title : Test Indexation Html", contents.get("title")
> - .getValue());
> + assertEquals("Title : Test Indexation Html", metadata.get("title"));

And here. Probably just an omission, but could you update it?

Thanks!

Cheers,
Chris

______________________________________________
Chris Mattmann, Ph.D.
[EMAIL PROTECTED]
Cognizant Development Engineer
Early Detection Research Network Project
_________________________________________________
Jet Propulsion Laboratory
Pasadena, CA
Office: 171-266B
Mailstop: 171-246
_______________________________________________________

Disclaimer: The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.
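
P.S. For reference, here is a minimal sketch of the kind of change meant by using Metadata.TITLE in place of the literal "title". The Metadata class below is a hypothetical stand-in written only for this illustration, not Tika's actual class, and it assumes that TITLE is a String constant whose value is "title". The parseSample() helper is likewise made up and simply mimics a parser filling in the title.

```java
import java.util.HashMap;
import java.util.Map;

public class MetadataTitleSketch {

    /** Hypothetical stand-in for the Metadata class; not the real Tika API. */
    static final class Metadata {
        // Assumption: the constant's value matches the literal used in the tests.
        public static final String TITLE = "title";

        private final Map<String, String> values = new HashMap<>();

        public void set(String name, String value) {
            values.put(name, value);
        }

        public String get(String name) {
            return values.get(name);
        }
    }

    /** Hypothetical helper that mimics a parser populating the metadata. */
    static Metadata parseSample() {
        Metadata metadata = new Metadata();
        metadata.set(Metadata.TITLE, "Sample Word Document");
        return metadata;
    }

    public static void main(String[] args) {
        Metadata metadata = parseSample();

        // Before: metadata.get("title")
        // After: look the value up through the named constant, so a typo in
        // the key becomes a compile error instead of a null at runtime.
        String title = metadata.get(Metadata.TITLE);

        if (!"Sample Word Document".equals(title)) {
            throw new AssertionError("unexpected title: " + title);
        }
        System.out.println(title);
    }
}
```

In the tests quoted above, the equivalent change would be replacing metadata.get("title") with metadata.get(Metadata.TITLE) in each assertEquals call.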
