Thanks Martin. I’ll contact Gautham, who did the original ISO 19115 parser, and see if he has time to take a look.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Martin Desruisseaux <[email protected]>
Organization: Geomatys
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, November 4, 2015 at 11:33 AM
To: "[email protected]" <[email protected]>
Subject: Re: ISO 19115 as a metadata model for Tika?

>Hello Chris
>
>On 03/11/15 19:02, Mattmann, Chris A (3980) wrote:
>> I think having some specific patches of how this would look
>> would help to take it less away from the abstract and more
>> into the concrete area. I encourage you to try it out MartinD,
>> and see if there is a good overlap there.
>
>I attached to TIKA-443 a demo extracting some
>org.apache.tika.metadata.DublinCore properties from an
>org.opengis.metadata.Metadata object. This is not a patch that can be
>included in Tika, however, since I do not know how to integrate those
>properties in Tika (I would leave this work to volunteers).
>
>This demo tries to give some tips about only one aspect of the
>discussion: adding an ISO 19115 parser in Tika. There is another aspect
>of the discussion which is not covered by this demo: whether the Tika
>metadata model should be extended to support the richness of more
>complex models like ISO 19115.
>
>More specifically, if one looks at the demo, one can see that there are
>many loops. An "Identification" object can contain many "Citation", which
>in turn can contain many "ResponsibleParty", etc. For this demo I just
>mapped e.g. the title of the first "Identification" instance to the
>DublinCore's "title" property, then broke out of the loop. Obviously
>information is lost, so the question is whether it is a goal for Tika
>to capture that information, or if it is considered too specific.
>
>If Tika chooses to capture such information, then a tree structure will
>become necessary. So the next question would be how to do that, whether a
>"tree structure" and a "flat structure" should coexist, etc. But we do not
>need to answer those questions now (a simple ISO 19115 parser mapping to
>the current Dublin Core properties could be done).
>
>    Martin
>
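
To make the trade-off in the quoted message a bit more concrete, here is a rough sketch in Python. It is not the demo attached to TIKA-443 and it does not use the Tika or GeoAPI classes: Citation, Identification and IsoMetadata are hypothetical stand-ins, the "title" key is just a plain dictionary key, and the path syntax in the second function is invented for illustration. The first function does the lossy "take the first title and leave the loop" mapping; the second keeps every title by encoding its position in the tree as a key.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


# Hypothetical stand-ins for the ISO 19115 objects named in the thread.
# They are NOT the org.opengis.metadata or org.apache.tika classes.
@dataclass
class Citation:
    title: str


@dataclass
class Identification:
    citations: List[Citation] = field(default_factory=list)


@dataclass
class IsoMetadata:
    identifications: List[Identification] = field(default_factory=list)


def first_title_only(md: IsoMetadata) -> Dict[str, str]:
    """Flat, lossy mapping: return the first title found and stop looping."""
    for ident in md.identifications:
        for citation in ident.citations:
            return {"title": citation.title}
    return {}


def all_titles(md: IsoMetadata) -> List[Tuple[str, str]]:
    """Keep every title, using an invented path string to record its position."""
    return [
        (f"identification[{i}]/citation[{j}]/title", citation.title)
        for i, ident in enumerate(md.identifications)
        for j, citation in enumerate(ident.citations)
    ]


if __name__ == "__main__":
    md = IsoMetadata([
        Identification([Citation("Sea surface temperature"), Citation("SST atlas")]),
        Identification([Citation("Bathymetry grid")]),
    ])
    print(first_title_only(md))
    for path, title in all_titles(md):
        print(path, "=", title)
```

With two identifications holding three citations in total, the first function reports a single title while the second reports all three, which is the information loss the message refers to. Path-style keys are only one possible way to let a tree and a flat structure coexist; a real design would need to be agreed on the list.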
