[
https://jira.duraspace.org/browse/DS-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=25377#comment-25377
]
DSpace @ Lyncode edited comment on DS-1202 at 7/2/12 2:36 AM:
--------------------------------------------------------------
Hi Mark Diggory,
1) Yes. The current OAI interface (OAICAT based) has some issues that cannot be
easily solved, mainly:
- It doesn't support virtual contexts (this is the major development in XOAI).
For interoperability under the Driver and OpenAIRE guidelines, the current OAI
interface is simply not enough. Driver and OpenAIRE have specific metadata
value formatting requirements that are slightly different from each other. It
is incorrect (under the OAI-PMH protocol) to have a single interface that
returns metadata in different formats (e.g. date format, prefixed values,
suffixed values, and so on); one must have distinct interfaces, each of which
outputs metadata in the desired value format.
- OAICAT makes some incorrect assumptions (with respect to the OAI-PMH
protocol): https://jira.duraspace.org/browse/DS-1195
2) The new Solr core exists because of the "addon development policy". It could
use the search core instead, but then the search core would need to answer a
specific query, namely:
- Which items have all their bitstreams freely available for download?
Another thing (I didn't look at the actual search core implementation, only the
schema.xml): does it index all metadata fields? Even the user-created ones?
This is an important requirement for this OAI implementation.
3 & 4) The XOAI architecture is based on the idea that all OAI data providers
have the same core functionality, that is, they implement the OAI-PMH protocol
[1]. So one just needs to implement the DSpace-specific data source, and the
XOAI core does the rest. A Spring-based solution is a possibility, but I think
this approach (also used by OAICAT) is a good way of keeping things simple.
Considering future OAI-PMH protocol versions, this approach also seems to be
the best: one just needs to update the core library to reflect those changes.
--- Getting into the core
XOAI uses a 2-phase pipeline transformation (configurable) using XSL:
1 - Metadata values transformation (none, driver, openaire, ...)
2 - Metadata schema transformation (oai_dc, mets, didl, ...)
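The two phases above can be sketched with the standard javax.xml.transform API.
This is only an illustration of the chaining idea, not the XOAI code: the
<item>/<field> input format, both stylesheets, and the class and method names
are made up for the example (the real input schema is the attached XSD, and a
real oai_dc output would be namespaced).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: chains a "values" stylesheet and a "schema" stylesheet.
class TwoPhasePipeline {
    static final String XSL_NS = "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'";

    // Phase 1 (values): identity copy, except date values are cut to YYYY-MM-DD.
    static final String VALUES_XSL =
        "<xsl:stylesheet version='1.0' " + XSL_NS + ">"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='field[@name=\"date\"]/text()'>"
      + "<xsl:value-of select='substring(., 1, 10)'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Phase 2 (schema): turns each <field name="x"> into an <x> element.
    static final String SCHEMA_XSL =
        "<xsl:stylesheet version='1.0' " + XSL_NS + ">"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:template match='/item'>"
      + "<record><xsl:for-each select='field'>"
      + "<xsl:element name='{@name}'><xsl:value-of select='.'/></xsl:element>"
      + "</xsl:for-each></record>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Made-up input document standing in for the data source output.
    static final String SAMPLE_ITEM =
        "<item><field name='date'>2012-07-02T02:36:00Z</field>"
      + "<field name='title'>Example</field></item>";

    // Applies one stylesheet to an XML string and returns the result as a string.
    static String apply(String xsl, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }

    // Runs phase 1, then feeds its output into phase 2.
    static String run(String valuesXsl, String schemaXsl, String itemXml) {
        return apply(schemaXsl, apply(valuesXsl, itemXml));
    }

    public static void main(String[] args) {
        System.out.println(run(VALUES_XSL, SCHEMA_XSL, SAMPLE_ITEM));
    }
}
```

Keeping the phases separate means a new context (say, another value
convention) only needs a new phase 1 stylesheet, while the schema stylesheets
stay untouched.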
-- XSLT Input
These XSL transformers receive as input an XML file that uses a specific (and
flexible) schema, which allows the DSpace data source implementation to output
any kind of information [attached XSD].
-- Data Sources
The DSpace data source is one specific data source implementation. It must
provide access to all the OAI-PMH information needed:
> Repository Name, Email (Identify)
> Communities & Collections (ListSets)
> Items (ListRecords & ListIdentifiers)
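The contract listed above could look roughly like the interface below. This is
a hedged sketch, not the XOAI API: OaiDataSource, InMemoryDataSource, the
method names, the identifiers and the set names are all hypothetical and only
mirror the three groups of information the core needs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical contract: one group of methods per OAI-PMH verb to be answered.
interface OaiDataSource {
    String repositoryName();                      // Identify
    String adminEmail();                          // Identify
    List<String> setSpecs();                      // ListSets
    List<String> itemIdentifiers(String setSpec); // ListRecords / ListIdentifiers
}

// Toy implementation backed by a fixed list; a DSpace one would query the repository.
class InMemoryDataSource implements OaiDataSource {
    // Each row is {identifier, setSpec}; all values are made up.
    private final List<String[]> items = Arrays.asList(
        new String[] {"oai:example.org:123/10", "col_123_1"},
        new String[] {"oai:example.org:123/11", "col_123_1"},
        new String[] {"oai:example.org:123/12", "col_123_2"});

    public String repositoryName() { return "Example Repository"; }

    public String adminEmail() { return "admin@example.org"; }

    public List<String> setSpecs() { return Arrays.asList("col_123_1", "col_123_2"); }

    // A null setSpec means "no set restriction", so every item is returned.
    public List<String> itemIdentifiers(String setSpec) {
        List<String> ids = new ArrayList<>();
        for (String[] row : items) {
            if (setSpec == null || setSpec.equals(row[1])) {
                ids.add(row[0]);
            }
        }
        return ids;
    }
}

class DataSourceDemo {
    public static void main(String[] args) {
        OaiDataSource ds = new InMemoryDataSource();
        System.out.println(ds.repositoryName() + " <" + ds.adminEmail() + ">");
        System.out.println(ds.setSpecs());
        System.out.println(ds.itemIdentifiers("col_123_1"));
    }
}
```

The protocol handling would be written once against the interface, so
supporting another repository only means writing another implementation.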
-- Configuration
XOAI provides the concept of a Filter: one can associate filters with sets,
and when a set is requested (set=<setSpec>) those filters are applied,
resulting in a specific data source query. Filters are class implementations
that extend the AbstractFilter class [2]; the DSpace-specific filters can be
found at [3]. Filters can also be associated with metadata formats and
contexts.
The actual configuration can be found at [4] (Spring-based concept).
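To make the filter idea concrete, here is a small sketch of associating
filters with a set and applying them when that set is requested. It does not
reproduce the real AbstractFilter signature; ItemFilter, SketchItem,
FilteredSets, the "openaire" set name and the in-memory filtering are
assumptions for illustration (the actual implementation turns filters into a
data source query instead).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical item: an id plus one "openly downloadable" flag per bitstream.
class SketchItem {
    final String id;
    final boolean[] bitstreamOpen;

    SketchItem(String id, boolean... bitstreamOpen) {
        this.id = id;
        this.bitstreamOpen = bitstreamOpen;
    }
}

// Hypothetical stand-in for a filter base class.
abstract class ItemFilter {
    abstract boolean accepts(SketchItem item);
}

// Accepts items whose bitstreams are all open (items with no bitstreams pass too).
class AllBitstreamsOpenFilter extends ItemFilter {
    boolean accepts(SketchItem item) {
        for (boolean open : item.bitstreamOpen) {
            if (!open) {
                return false;
            }
        }
        return true;
    }
}

class FilteredSets {
    private final Map<String, List<ItemFilter>> filtersBySet = new HashMap<>();

    // Registers a filter that applies whenever the given set is requested.
    void associate(String setSpec, ItemFilter filter) {
        filtersBySet.computeIfAbsent(setSpec, k -> new ArrayList<>()).add(filter);
    }

    // Returns the ids of the items that pass every filter attached to the set.
    List<String> list(String setSpec, List<SketchItem> all) {
        List<ItemFilter> filters = filtersBySet.getOrDefault(setSpec, new ArrayList<>());
        List<String> ids = new ArrayList<>();
        for (SketchItem item : all) {
            boolean keep = true;
            for (ItemFilter f : filters) {
                keep = keep && f.accepts(item);
            }
            if (keep) {
                ids.add(item.id);
            }
        }
        return ids;
    }
}

class FilterDemo {
    public static void main(String[] args) {
        FilteredSets sets = new FilteredSets();
        sets.associate("openaire", new AllBitstreamsOpenFilter());
        List<SketchItem> all = Arrays.asList(
            new SketchItem("a", true, true),
            new SketchItem("b", true, false));
        System.out.println(sets.list("openaire", all));
    }
}
```

A set without any associated filter returns every item, which matches the
idea of a default context that behaves like the original interface.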
--- Resources
[1] http://www.openarchives.org/OAI/openarchivesprotocol.html
[2]
https://github.com/lyncode/xoai-common/blob/master/src/main/java/com/lyncode/xoai/common/dataprovider/filter/AbstractFilter.java
[3]
https://github.com/lyncode/DSpace/tree/dspace-with-xoai/dspace-xoai/dspace-xoai-api/src/main/java/org/dspace/xoai/filter
[4]
https://github.com/lyncode/DSpace/blob/dspace-with-xoai/dspace/config/modules/xoai/xoai.xml
PS - I would like to discuss a specific OpenAIRE requirement with you. OpenAIRE
is aware of the embargo end date, but Dublin Core does not provide a specific
field for it. The DSpace embargo feature, for example, requires the user to
define such a field. I think it's important for the DSpace community to
somehow have more "control" over the possible metadata fields. Just a
suggestion: why not produce (like OAI-PMH does with oai_dc) a specific DC
extension schema? DSpace development is limited (by default) to the DC schema,
which I think is a huge limitation (DC is getting older, and there are some
needs that could be fulfilled by extending DC).
> DSpace XOAI Data Provider
> -------------------------
>
> Key: DS-1202
> URL: https://jira.duraspace.org/browse/DS-1202
> Project: DSpace
> Issue Type: New Feature
> Components: OAI-PMH
> Reporter: DSpace @ Lyncode
> Priority: Major
> Labels: oai
>
> DSpace XOAI Data Provider is an OAI-PMH interface for DSpace based upon XOAI
> (an OAI-PMH Java toolkit), with the following characteristics:
> - OpenAIRE compliant
> - Driver compliant
> - Default context (same behavior as the original DSpace OAI interface)
> - Completely configurable
> - Fast (based on solr, also with cache)
> - Extendable
_______________________________________________
Dspace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-devel