[
https://issues.apache.org/jira/browse/CONNECTORS-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13536171#comment-13536171
]
David Morana edited comment on CONNECTORS-589 at 12/19/12 5:31 PM:
-------------------------------------------------------------------
OK, what variable name do I use in the schema to capture the links?
Are the links contained in an array, or in individual variables?
With my current schema, only one link shows up in the index, as the id (the
unique key).
Here's a sample from the index:
{code:xml}
<doc>
<arr name="attr_solr.title">
<str>Profiles</str>
<str>Acevedo-Aviles, Joel</str>
</arr>
<arr name="attr_stream_source_info">
<str>myfile</str>
</arr>
<arr name="attr_stream_content_type">
<str>application/octet-stream</str>
</arr>
<arr name="attr_stream_size">
<str>63162</str>
</arr>
<arr name="attr_content_encoding">
<str>UTF-8</str>
</arr>
<arr name="attr_stream_name">
<str>docname</str>
</arr>
<arr name="content_type">
<str>text/html; charset=UTF-8</str>
</arr>
<arr name="attr_dc_title">
<str>Profiles</str>
</arr>
<arr name="attr_source">
<str>https://[redacted]/profiles/atom/search.do?email=ll.mit.edu&amp;ps=500</str>
</arr>
<str name="category">profile</str>
<str name="id">https://[redacted]/profiles/html/profileView.do?key=2ebd85f8-7411-4eed-a839-a0c03255cd76</str>
<arr name="attr_pubdate">
<str>1348674235000</str>
</arr>
<long name="_version_">1421801364636303360</long>
</doc>
{code}
This is Profile (user) data; the previous example I sent was File data. That
shouldn't matter, though, because you're sending all the links found, right?
> For compatibility with IBM portal, feed parser should allow multiple links to
> be added to the queue, per entry
> --------------------------------------------------------------------------------------------------------------
>
> Key: CONNECTORS-589
> URL: https://issues.apache.org/jira/browse/CONNECTORS-589
> Project: ManifoldCF
> Issue Type: Improvement
> Components: RSS connector
> Affects Versions: ManifoldCF 1.1
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.1
>
>
> The IBM portal apparently generates feeds that have multiple links per entry,
> as follows:
> {code}
> <link
> href="https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/entry"
> rel="self"></link>
> <link
> href="https://[redacted]/files/app/file/1adf16d8-bbe4-4e70-be09-b002ce5cd816"
> rel="alternate" type="text/html"></link>
> <link
> href="https://[redacted]/files/basic/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/entry"
> rel="edit"></link>
> <link
> href="https://[redacted]/files/basic/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/media"
> rel="edit-media"></link>
> <link
> href="https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/media/web.png"
> rel="enclosure" type="image/png" title="web.png" hreflang="en"
> length="4297"></link>
> <link
> href="https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/thumbnail"
> rel="thumbnail"></link>
> <category term="document" scheme="tag:ibm.com,2006:td/type"
> label="document"></category>
> <link
> href="https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/feed"
> rel="replies" type="application/atom+xml" thr:count="0"></link>
> {code}
> Right now, only the last link is processed. It would be better if all of
> them were processed.
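
To make the difference between "only the last link" and "all of them" concrete, here is a minimal Python sketch. It is not the RSS connector's code (ManifoldCF is written in Java), and it makes no claim about how the connector parses feeds. The sample entry is a shortened, hypothetical version of the one above, with `example.invalid` standing in for the redacted host, and the function names `collect_links` and `last_link_only` are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of Atom 1.0 elements (RFC 4287).
ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical, shortened entry; example.invalid replaces the redacted host.
SAMPLE_ENTRY = """\
<entry xmlns="http://www.w3.org/2005/Atom">
  <link href="https://example.invalid/files/entry" rel="self"></link>
  <link href="https://example.invalid/files/app/file/1" rel="alternate" type="text/html"></link>
  <link href="https://example.invalid/files/media/web.png" rel="enclosure" type="image/png"></link>
</entry>
"""


def collect_links(entry_xml):
    """Return a (rel, href) pair for every <link> in the entry, in document order."""
    entry = ET.fromstring(entry_xml)
    links = []
    for link in entry.iter(ATOM_NS + "link"):
        href = link.get("href")
        if href:
            # RFC 4287: a link without a rel attribute is treated as "alternate".
            links.append((link.get("rel", "alternate"), href))
    return links


def last_link_only(entry_xml):
    """Mimic the behaviour described in the issue: keep only the final link."""
    return collect_links(entry_xml)[-1:]
```

Calling `collect_links(SAMPLE_ENTRY)` yields one pair per `<link>` element, so each href could be queued separately, whereas `last_link_only(SAMPLE_ENTRY)` keeps only the enclosure link. Keeping the `rel` value alongside each href would also let a caller skip relations it does not want to crawl, such as `edit`.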