Hi Chris,

Thanks for this information. I will take a look at it and get back to you.

Regards.

2007/10/10, Chris Mattmann <[EMAIL PROTECTED]>:
>
> Hi Rida,
>
> I agree totally! You should take a look at the MarkupLanguageProposal
> (within Nutch, http://wiki.apache.org/nutch/MarkupLanguageParserProposal)
> and the work done in Frutch
> (http://www.krugle.com/kse/files?query=frutch%20parse%20out) on the
> ParseXml plugin.
>
> I'd love to chat with you more about this. Let me know what you think.
>
> Thanks,
> Chris
>
> On 10/10/07 9:28 AM, "Rida Benjelloun" <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> > Do you think that we should have an XmlOutputter that saves the
> > extracted content and metadata in an XML file? This would simplify
> > integration with other technologies, Solr for example.
> > The XmlOutputter would process a File (a single file, or a directory
> > recursively) or a URL. It would use XSLT as a filter to mask or
> > display the elements needed, and take an output encoding.
> > Example:
> > TikaXmlOutputter txo = new TikaXmlOutputter();
> > txo.output(File|URL input, File xmlOutput, File xsltFilter, String encoding);
> >
> > Regards.
>
> ______________________________________________
> Chris Mattmann, Ph.D.
> [EMAIL PROTECTED]
> Cognizant Development Engineer
> Early Detection Research Network Project
> _________________________________________________
> Jet Propulsion Laboratory            Pasadena, CA
> Office: 171-266B                     Mailstop: 171-246
> _______________________________________________________
>
> Disclaimer: The opinions presented within are my own and do not reflect
> those of either NASA, JPL, or the California Institute of Technology.
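
For reference, the XSLT filtering step of the XmlOutputter idea quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing Tika API: the class name XmlOutputterSketch and the method filter are hypothetical, the extracted content is assumed to be available as an XML string already, and only the standard javax.xml.transform (JAXP) classes are used. File and URL handling, and the extraction itself, are left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: not part of Tika, names chosen for illustration only.
class XmlOutputterSketch {

    /**
     * Applies an XSLT stylesheet to already-extracted XML and returns the
     * result serialized in the requested output encoding.
     */
    static byte[] filter(String extractedXml, String xsltFilter, String encoding)
            throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Compile the stylesheet that decides which elements are kept.
        Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xsltFilter)));
        // The encoding set here overrides any xsl:output encoding in the stylesheet.
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(
                new StreamSource(new StringReader(extractedXml)),
                new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document standing in for extracted content and metadata.
        String xml = "<doc><title>Report</title><author>Rida</author></doc>";

        // Identity transform plus an empty template that masks <author>.
        String xslt =
                "<xsl:stylesheet version=\"1.0\""
                + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:template match=\"@*|node()\">"
                + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
                + "</xsl:template>"
                + "<xsl:template match=\"author\"/>"
                + "</xsl:stylesheet>";

        byte[] result = filter(xml, xslt, "UTF-8");
        System.out.println(new String(result, "UTF-8"));
    }
}
```

Using a stylesheet as the filter keeps the choice of which elements to mask or display outside the Java code, so a Solr-oriented output and a plain archival output could differ only in the XSLT file passed in.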
