Hi Philippe,

Thanks for your comments. I have already added multiple values for a single field in Lucene before. I will try the same thing with the Nutch plugin.
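
To make the symptom concrete, here is a minimal sketch in plain Java (JDK classes only). It is not the plugin's code, and the class and method names are hypothetical. It assumes, without having confirmed it, that only one value is kept per field name somewhere between the XPath match and the index. Lucene itself accepts several fields with the same name on one document, so keeping a list of values per field name is the direction to try.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of the parse-xml plugin.
final class MultiValueSketch {

    // Evaluates an XPath expression and returns the text of every matching node,
    // in document order.
    static List<String> allValues(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate: " + expression, e);
        }
    }

    // One value per field name: each put overwrites the previous value,
    // so only the last match survives (the assumed cause of the symptom).
    static Map<String, String> singleValued(String field, List<String> values) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String value : values) {
            fields.put(field, value);
        }
        return fields;
    }

    // A list of values per field name: every match is kept.
    static Map<String, List<String>> multiValued(String field, List<String> values) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        for (String value : values) {
            fields.computeIfAbsent(field, key -> new ArrayList<>()).add(value);
        }
        return fields;
    }

    public static void main(String[] args) {
        String xml = "<authors><author>author1</author>"
                + "<author>author2</author><author>author3</author></authors>";
        List<String> authors = allValues(xml, "//author");
        System.out.println(singleValued("author", authors)); // {author=author3}
        System.out.println(multiValued("author", authors));  // {author=[author1, author2, author3]}
    }
}
```

With the multi-valued map, each entry in the list would become its own Lucene field carrying the same field name.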

Best regards.

On 1/26/06, Philippe EUGENE (JIRA) <[EMAIL PROTECTED]> wrote:
>
>     [ http://issues.apache.org/jira/browse/NUTCH-185?page=comments#action_12364087 ]
>
> Philippe EUGENE commented on NUTCH-185:
> ---------------------------------------
>
> Great plugin. Thanks!
> I successfully tested this plugin on Nutch 0.7.1.
> I just have a problem with some structures like this:
> <authors>
>   <author>author1</author>
>   <author>author2</author>
>   <author>author3</author>
> </authors>
>
> In my Lucene index I only see the author3 value for this field.
> I'm not sure that the problem is in the plugin.
> I don't know if it's possible to have multiple values for a field in
> Nutch 0.7.1.
>
> > XMLParser is a configurable plugin. It uses XPath and namespaces to do
> > the mapping between the XML elements and Lucene fields.
> > ------------------------------------------------------------------------
> >
> >          Key: NUTCH-185
> >          URL: http://issues.apache.org/jira/browse/NUTCH-185
> >      Project: Nutch
> >         Type: New Feature
> >   Components: fetcher, indexer
> >     Versions: 0.7.2-dev
> >  Environment: OS Independent
> >     Reporter: Rida Benjelloun
> >  Attachments: parse-xml.zip
> >
> > XMLParser is a configurable plugin. It uses XPath and namespaces to do
> > the mapping between the XML elements and Lucene fields.
> > Information:
> > 1- Copy "xmlparser-conf.xml" to the nutch/conf dir.
> > 2- To index your custom XML file, you have to modify
> > "xmlparser-conf.xml".
> > This parser uses namespaces and XPath to parse XML content.
> > The config file does the mapping between the XML nodes (using XPath)
> > and Lucene fields.
> > Example : <field name="dctitle" xpath="//dc:title" type="Text" boost="1.4" />
> > 3- The xmlIndexerProperties element encapsulates a set of fields
> > associated with a namespace.
> > If the namespace is found in the XML document, the fields represented
> > by the namespace will be indexed.
> > Example :
> > <xmlIndexerProperties type="filePerDocument" namespace="http://purl.org/dc/elements/1.1/">
> >   <field name="dctitle" xpath="//dc:title" type="Text" boost="1.4" />
> >   <field name="dccreator" xpath="//dc:creator" type="keyword" boost="1.0" />
> > </xmlIndexerProperties>
> > 4- It is possible to define a default namespace that will be applied
> > when the parser doesn't find any namespace in the document, or when the
> > namespace found in the XML document doesn't match the namespace defined
> > in the xmlIndexerProperties.
> > Example :
> > <xmlIndexerProperties type="filePerDocument" namespace="default">
> >   <field name="xmlcontent" xpath="//*" type="Unstored" boost="1.0" />
> > </xmlIndexerProperties>
>
> --
> This message is automatically generated by JIRA.
> -
> If you think it was sent incorrectly contact one of the administrators:
>    http://issues.apache.org/jira/secure/Administrators.jspa
> -
> For more information on JIRA, see:
>    http://www.atlassian.com/software/jira
>

--
----------------------------------------
Rida Benjelloun
President and CEO
DocuLibre inc.
Telephone: (418) 262-3222
Website: http://www.doculibre.com
Email: [EMAIL PROTECTED]
----------------------------------------
