Re: DIH XPathEntityProcessor question

2014-12-08 Thread Dan Davis
Yes, that worked quite well. I still need the "//tagname" but that is the only DIH incantation I need. This will substantially accelerate things. On Mon, Dec 8, 2014 at 5:37 PM, Dan Davis wrote: > The problem is that XPathEntityProcessor implements XPath on its own, and > implements a subset

Re: DIH XPathEntityProcessor question

2014-12-08 Thread Dan Davis
The problem is that XPathEntityProcessor implements XPath on its own, and implements only a subset of XPath. So, if the input document is small enough, it makes no sense to fight it. One possibility is to apply an XSLT to the file before processing it. This blog post
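A minimal sketch of the XSLT pre-processing idea, assuming the `xsl` attribute of XPathEntityProcessor is used to run a stylesheet before the XPath matching. The file paths, the stylesheet name `flatten-topics.xsl`, and the flattened `/topics/topic` output shape are hypothetical, not taken from the thread:

```xml
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- url and xsl are placeholder paths (assumptions).
         The stylesheet is assumed to rewrite the input into a flat
         /topics/topic structure with simple child elements. -->
    <entity name="topic"
            processor="XPathEntityProcessor"
            url="/path/to/medical-topics.xml"
            xsl="/path/to/flatten-topics.xsl"
            forEach="/topics/topic">
      <!-- After flattening, short descendant paths such as //tagname
           are enough, as the follow-up message reports. -->
      <field column="url" xpath="//url"/>
      <field column="title" xpath="//title"/>
    </entity>
  </document>
</dataConfig>
```

The design choice here is to move the complicated selection logic into a full XSLT processor and leave DIH with only trivial paths, which sidesteps the limits of its XPath subset.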

Re: DIH XPathEntityProcessor question

2014-12-08 Thread Alexandre Rafalovitch
I don't believe there are any alternatives. At least I could not get anything but the full path to work. Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedi

Re: DIH XPathEntityProcessor question

2014-12-08 Thread Dan Davis
In experimentation with a much simpler and smaller XML file, it doesn't look like "//health-topic/@url" will work, nor will "//@url", etc. So far, only spelling it all out will work. With child elements, an xpath of "//title" works fine, but it is beginning to seem dangerous. Is t

DIH XPathEntityProcessor question

2014-12-08 Thread Dan Davis
When I have a forEach attribute like the following: forEach="/medical-topics/medical-topic/health-topic[@language='English']" And then need to match an attribute of that, is there any alternative to spelling it all out: I suppose I could do "//health-topic/@url" since the document should
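For illustration, the "spelled out" form might look like the following data-config sketch. The forEach expression is taken from the message above; the file path, entity name, and column name are assumptions, and the field's full path is a guess at what the elided markup contained:

```xml
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- url is a placeholder path; the entity name is hypothetical. -->
    <entity name="healthTopic"
            processor="XPathEntityProcessor"
            url="/path/to/medical-topics.xml"
            forEach="/medical-topics/medical-topic/health-topic[@language='English']">
      <!-- Attribute of the forEach element, addressed by its full path
           from the document root (assumed form of the original field). -->
      <field column="url"
             xpath="/medical-topics/medical-topic/health-topic/@url"/>
    </entity>
  </document>
</dataConfig>
```

The field xpath is written from the document root rather than relative to the forEach element, which is the repetition the question is asking how to avoid.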