Hi Lance,

You are right:
XPathEntityProcessor has the attribute "xsl", so I can use XSLT to generate an
XML file "in the form of the standard Solr update schema".
I will check the performance of this.
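
Roughly what I have in mind, as an untested sketch. The file names and paths
are placeholders I made up, and I am assuming the input documents carry no XML
namespace (otherwise the match paths would need a namespace prefix). The
stylesheet writes "title" and "alltext" into one Solr add document; value-of
on the body element returns all descendant text in document order, so the
original token order is kept:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- to-solr-add.xsl (hypothetical name): turns /html/body into a Solr <add> document -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <add>
      <doc>
        <!-- text of the first h1 below body -->
        <field name="title"><xsl:value-of select="/html/body/h1"/></field>
        <!-- string value of body: all text nodes, concatenated in document order -->
        <field name="alltext"><xsl:value-of select="/html/body"/></field>
      </doc>
    </add>
  </xsl:template>
</xsl:stylesheet>
```

The entity in data-config.xml would then point to the stylesheet and read the
result as Solr update XML, so no field/xpath declarations should be needed:

```xml
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- url and xsl values are placeholders -->
    <entity name="page"
            processor="XPathEntityProcessor"
            url="/path/to/page.xml"
            xsl="xslt/to-solr-add.xsl"
            useSolrAddSchema="true"/>
  </document>
</dataConfig>
```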

Best regards

btw. "flatten" is an attribute of the "field" tag, not of XPathEntityProcessor
(as the wiki wrongly states)

-------- Lance
> There is an option somewhere to use the full XML DOM implementation
> for using xpaths. The purpose of the XPathEP is to be as simple and
> dumb as possible and handle most cases: RSS feeds and other open
> standards.
> Search for xsl(optional)
> http://wiki.apache.org/solr/DataImportHandler#Configuration_in_data-config.xml-1
-------- Karsten
> On Sat, Apr 9, 2011 at 5:32 AM
> > Hi Folks,
> >
> > has anyone improved DIH XPathRecordReader to deal with nested xpaths?
> > e.g.
> > data-config.xml with
> >  <entity .. processor="XPathEntityProcessor" ..
> >  <field column="title" xpath="//body/h1"/>
> >  <field column="alltext" xpath="//body" flatten="true"/>
> > and the XML stream contains
> >  /html/body/h1...
> > will only fill field “alltext” but field “title” will be empty.
> >
> > This is a known issue from 2009
> >
> > https://issues.apache.org/jira/browse/SOLR-1437#commentauthor_12756469_verbose
> >
> > So three questions:
> > 1. How can I fill a “search over all” field without nested xpaths?
> >   (schema.xml  <copyField source="*" dest="alltext"/> will not help,
> >   because we lose the original token order)
> > 2. Is anyone trying to improve XPathRecordReader to deal with nested
> >   xpaths?
> > 3. Does anyone else need this feature?
> >
> >
> > Best regards
> >  Karsten
> >
