shenzhuxi created SOLR-5086:
-------------------------------
Summary: The OR operator works incorrectly in XPathEntityProcessor
Key: SOLR-5086
URL: https://issues.apache.org/jira/browse/SOLR-5086
Project: Solr
Issue Type: Bug
Components: contrib - DataImportHandler
Affects Versions: 4.4
Reporter: shenzhuxi
I'm trying to use DataImportHandler to index RSS/ATOM feeds and found bizarre
behaviour of the OR operator (|) in XPathEntityProcessor.
Here is the configuration:
<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource type="FileDataSource"/>
  <document>
    <entity name="rss" processor="FileListEntityProcessor"
            baseDir="${solr.solr.home}/feed/rss" fileName="^.*\.xml$" recursive="true"
            rootEntity="false" dataSource="null">
      <entity name="feed" url="${rss.fileAbsolutePath}"
              processor="XPathEntityProcessor" forEach="/rss/channel/item|/feed/entry"
              transformer="DateFormatTransformer">
        <field column="link"
               xpath="/rss/channel/item/link|/feed/entry/link/@href"/>
      </entity>
    </entity>
  </document>
</dataConfig>
The first OR operator, in the forEach attribute
"/rss/channel/item|/feed/entry", works correctly.
But the second one, in the field xpath attribute
"/rss/channel/item/link|/feed/entry/link/@href", doesn't work.
If I rewrite the xpath to either "/rss/channel/item/link" or
"/feed/entry/link/@href" alone, it works correctly.
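
A possible workaround sketch, built only from the two single-path expressions
reported to work above. It assumes DataImportHandler accepts two field elements
that share the same column name, one per XPath; that assumption is not
confirmed in this report and has not been verified against 4.4.

```xml
<entity name="feed" url="${rss.fileAbsolutePath}"
        processor="XPathEntityProcessor" forEach="/rss/channel/item|/feed/entry"
        transformer="DateFormatTransformer">
  <!-- Assumption: two fields may map to the same column, each with a single XPath. -->
  <!-- RSS items: the link is the element's text. -->
  <field column="link" xpath="/rss/channel/item/link"/>
  <!-- ATOM entries: the link is carried in the href attribute. -->
  <field column="link" xpath="/feed/entry/link/@href"/>
</entity>
```

If that assumption holds, each record would match only one of the two paths,
so the column would be filled from whichever feed format the file uses.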