[
https://jira.duraspace.org/browse/DS-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=22767#comment-22767
]
Àlex Magaz Graça commented on DS-1050:
--------------------------------------
I had already seen the RDF statements, but I find it odd that the disseminator
exposes bundles which, at least as I understood it, are used internally by
DSpace. I think I can agree with the LICENSE one, but I'm not sure about the
others. I mean, in some of the tests I've made harvesting with ORE, I get some
items harvested with somedocument.pdf.txt and somedocument.pdf.jpg. Maybe I'm
wrong, but I thought the .pdf.txt was used to allow full-text searches and the
.pdf.jpg to show thumbnails. To me this seems like a DSpace implementation
detail that the user should never see and that therefore should not be exposed
through ORE.
If, for whatever reason, this behaviour is really intended, then I think the
harvester should ingest these bitstreams in such a way that they are hidden
from the user (XMLUI/JSPUI), as they are in the source repository. In that
case, should I open a new bug report?
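
As a rough illustration of the distinction above, here is a minimal sketch of
how the links in an ORE record could be split into user-facing files and files
that look like DSpace-internal derivatives. It is only a filename heuristic,
not how DSpace or its harvester actually decides: the function name
split_links, the trimmed sample record, the example.org URLs, and the original
file name 1945333.pdf are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical, heavily trimmed record: only the title attributes matter here.
SAMPLE = """<atom:entry xmlns:atom="http://www.w3.org/2005/Atom">
  <atom:link href="https://example.org/1945333.pdf"
             title="1945333.pdf" type="application/pdf"/>
  <atom:link href="https://example.org/1945333.pdf.txt"
             title="1945333.pdf.txt" type="text/plain"/>
  <atom:link href="https://example.org/license.txt"
             title="license.txt" type="text/plain; charset=utf-8"/>
</atom:entry>"""


def split_links(atom_xml):
    """Return (originals, internal) lists of link titles.

    Assumption: a file named "<other file>.txt" or "<other file>.jpg" is an
    extracted-text or thumbnail derivative, and "license.txt" is the license.
    """
    root = ET.fromstring(atom_xml)
    titles = [
        link.get("title")
        for link in root.iter(f"{{{ATOM_NS}}}link")
        if link.get("title")
    ]
    names = set(titles)
    originals, internal = [], []
    for title in titles:
        base, _, ext = title.rpartition(".")
        # A derivative carries the full name of another file plus one suffix.
        is_derivative = ext in ("txt", "jpg") and base in names
        if is_derivative or title == "license.txt":
            internal.append(title)
        else:
            originals.append(title)
    return originals, internal


originals, internal = split_links(SAMPLE)
print("shown to the user:", originals)
print("probably internal:", internal)
```

A name-based guess like this would misclassify a genuine upload that happens to
be called something like report.pdf.txt, which is one reason filtering by
bundle on the disseminator side seems the more reliable fix.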
> ORE disseminator should only export bitstreams from the ORIGINAL bundle
> -----------------------------------------------------------------------
>
> Key: DS-1050
> URL: https://jira.duraspace.org/browse/DS-1050
> Project: DSpace
> Issue Type: Bug
> Components: OAI-PMH
> Reporter: Àlex Magaz Graça
>
> If a collection is harvested with references to bitstreams, the bitstreams
> used internally by DSpace are also linked in the files section of the
> harvested items. This is because the ORE disseminator exports bitstreams from
> all bundles in the item. For example, plain-text versions of PDFs, thumbnails,
> and license files are exported, although they aren't shown in the JSPUI and
> XMLUI interfaces.
> Here is an example item where this problem occurs:
> https://buleria.unileon.es/oai/request?verb=GetRecord&metadataPrefix=ore&identifier=oai:buleria.unileon.es:10612/793
> In the output two links appear to files used internally by DSpace:
> [...]
> <atom:link [...]
> href="https://buleria.unileon.es/xmlui/bitstream/handle/10612/793/1945333.pdf.txt?sequence=4"
> title="1945333.pdf.txt" type="text/plain" length="61150"/>
> <atom:link [...]
> href="https://buleria.unileon.es/xmlui/bitstream/handle/10612/793/license.txt?sequence=3"
> title="license.txt" type="text/plain; charset=utf-8" length="1487"/>
> [...]
> When harvested, the item appears with 3 bitstreams instead of the one shown in
> the source:
> https://buleria.unileon.es/handle/10612/793