[ https://issues.apache.org/jira/browse/SOLR-7383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15424714#comment-15424714 ]
Alexandre Rafalovitch commented on SOLR-7383:
---------------------------------------------
After some testing, even with the corrected (RDF) field definitions, the
commonField mapping is a problem for several reasons:
* commonFields are set in postTransform, which is not called if there are no
(normal) fields. So the /RDF/channel record, which does not match any normal
fields, is skipped.
* Even if normal fields are matched for some reason, they will still be skipped
unless the primary key is matched.
* The commonField attribute is not really described anywhere and is not tested.
In summary, we have a dead example on our hands.
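To make the first two points concrete, here is a minimal sketch of the skipping
behaviour as I understand it. It is a toy model in Python, not the
DataImportHandler source: the function name, its parameters and the sample
records are all made up for illustration, and the order of the two checks is an
assumption.

```python
# Toy model of the skipping behaviour described above.
# This is NOT the DataImportHandler source; every name below is hypothetical.

def import_rows(records, normal_columns, common_columns, pk):
    """Return the rows a simplified importer would emit.

    records: one dict (column -> value) per node matched by forEach.
    """
    common = {}  # common-field values remembered across records
    rows = []
    for record in records:
        normal = {c: v for c, v in record.items() if c in normal_columns}
        if not normal:
            # No normal field matched: the post-transform step is never
            # reached, so this record's common fields are lost.
            continue
        if pk not in normal:
            # Assumption: a record without the primary key is dropped
            # before its common fields are remembered.
            continue
        # Stand-in for the post-transform step: remember common fields,
        # then copy the remembered values into the emitted row.
        common.update({c: v for c, v in record.items() if c in common_columns})
        rows.append({**common, **normal})
    return rows


records = [
    # channel-like record: only common fields match
    {"source": "Example feed", "source-link": "http://example.org/"},
    # item-like record: only normal fields match
    {"title": "A story", "link": "http://example.org/1"},
]

rows = import_rows(
    records,
    normal_columns={"title", "link", "description"},
    common_columns={"source", "source-link", "subject"},
    pk="link",
)
print(rows)
```

In this model the channel-like record is dropped before its common fields are
remembered, so the one emitted row carries only title and link and has no
source value.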
> DIH rss example is broken again
> -------------------------------
>
> Key: SOLR-7383
> URL: https://issues.apache.org/jira/browse/SOLR-7383
> Project: Solr
> Issue Type: Bug
> Components: contrib - DataImportHandler
> Affects Versions: 5.0, 6.0
> Reporter: Upayavira
> Assignee: Alexandre Rafalovitch
> Priority: Minor
>
> The DIH example (solr/example/example-DIH/solr/rss/conf/rss-data-config.xml)
> is broken again. See associated issues.
> Below is a config that should work.
> This is caused by Slashdot seemingly oscillating between RDF/RSS and pure
> RSS. Perhaps we should depend upon something more static, rather than an
> external service that is free to change as it desires.
> <dataConfig>
>   <dataSource type="URLDataSource" />
>   <document>
>     <entity name="slashdot"
>             pk="link"
>             url="http://rss.slashdot.org/Slashdot/slashdot"
>             processor="XPathEntityProcessor"
>             forEach="/RDF/item"
>             transformer="DateFormatTransformer">
>
>       <field column="source" xpath="/RDF/channel/title"
>              commonField="true" />
>       <field column="source-link" xpath="/RDF/channel/link"
>              commonField="true" />
>       <field column="subject" xpath="/RDF/channel/subject"
>              commonField="true" />
>
>       <field column="title" xpath="/RDF/item/title" />
>       <field column="link" xpath="/RDF/item/link" />
>       <field column="description" xpath="/RDF/item/description" />
>       <field column="creator" xpath="/RDF/item/creator" />
>       <field column="item-subject" xpath="/RDF/item/subject" />
>       <field column="date" xpath="/RDF/item/date"
>              dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss" />
>       <field column="slash-department" xpath="/RDF/item/department" />
>       <field column="slash-section" xpath="/RDF/item/section" />
>       <field column="slash-comments" xpath="/RDF/item/comments" />
>     </entity>
>   </document>
> </dataConfig>