Hi Kim,

I came across this explanation from Tim Donohue while searching for a 
possible solution to my query: OAI harvest not processing dc.identifier.uri 
correctly 
<https://groups.google.com/d/msg/dspace-tech/y2vVi2HhXlU/_myJwC30BgAJ>. 
Based on Tim's explanation, even if I use 'oai_dc' to harvest items via 
OAI-PMH, the harvesting repository should not create a new 
dc.identifier.uri for a harvested item. What I'm experiencing now is 
that the setting oai.harvester.acceptedHandleServer = hdl.handle.net in 
oai.cfg has no effect whatsoever. I tried harvesting from demo.dspace.org 
(col_10673_2), which clearly uses the hdl.handle.net Handle Server, but 
new handles are still minted for the incoming harvested items. I tried all 
three metadata formats (oai_dc, qdc, dim), and in every case the completed 
harvest had created new handles for the harvested items.
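
For anyone who wants to check what a source record actually exposes, here 
is a minimal sketch (Python, standard library only) that lists the 
dc:identifier values in a single oai_dc record. The sample record and the 
identifiers() helper are made up for illustration and are not taken from 
demo.dspace.org. Note that oai_dc is unqualified, so a handle URL arrives as 
plain dc:identifier rather than dc.identifier.uri.

```python
import xml.etree.ElementTree as ET

# Namespace of the unqualified Dublin Core elements used inside oai_dc.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample record, NOT copied from any real repository.
SAMPLE_RECORD = """
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example item</dc:title>
  <dc:identifier>http://hdl.handle.net/10673/123</dc:identifier>
  <dc:identifier>ISBN 000-0-00-000000-0</dc:identifier>
</oai_dc:dc>
"""


def identifiers(record_xml):
    """Return the text of every non-empty dc:identifier in an oai_dc record."""
    root = ET.fromstring(record_xml.strip())
    tag = "{%s}identifier" % DC_NS
    return [el.text.strip() for el in root.iter(tag) if el.text]


if __name__ == "__main__":
    for value in identifiers(SAMPLE_RECORD):
        print(value)
```

Running it on a record copied from a ListRecords response should show 
whether the handle URL is present in the harvested metadata at all.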

I wonder whether I missed some configuration. I hope this is not a bug, 
since it is not the behavior described in the oai.cfg comments for the 
oai.harvester.acceptedHandleServer setting.
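
To make my reading of those comments concrete, this is roughly the matching 
I expected. It is only a sketch of the behavior the oai.cfg comments 
describe, not DSpace's actual code: the function names, the regular 
expression, and the assumption that a handle URL has the form 
server/prefix/suffix are all mine.

```python
import re


def parse_accepted_servers(value):
    """Split a comma-separated property value into a list of server names."""
    return [s.strip() for s in value.split(",") if s.strip()]


def extract_handle(identifier_uri, accepted_servers):
    """Return 'prefix/suffix' if the URI looks like a handle hosted on one of
    the accepted servers, otherwise None (meaning a new handle is minted)."""
    for server in accepted_servers:
        # Assumed shape: optional http(s) scheme, server name, prefix/suffix.
        pattern = r"^(?:https?://)?" + re.escape(server) + r"/([^/\s]+/[^/\s]+)$"
        match = re.match(pattern, identifier_uri.strip())
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    servers = parse_accepted_servers("hdl.handle.net")
    print(extract_handle("http://hdl.handle.net/10673/123", servers))
```

With this reading, an incoming dc.identifier.uri on hdl.handle.net would 
keep its original handle, and any other value would fall through to minting.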

Thanks in advance,
euler

On Monday, January 23, 2017 at 9:39:21 AM UTC+8, Kim Shepherd wrote:
>
> Hi euler, 
>
> Regarding the handle baseURL, yes that's correct - perhaps that will fix 
> everything up for you without having to tinker further.
>
> Regarding DIM ingest crosswalk... you'll see that out of the box the OAI 
> harvester comes with 3 metadata formats configured, including DIM
>
> # Crosswalk settings; the {name} value must correspond to a declared ingestion crosswalk
> # oai.harvester.metadataformats.{name} = {namespace},{optional display name}
> # The display name is only used in the xmlui; for the jspui there are entries in the
> # Messages.properties in the form jsp.tools.edit-collection.form.label21.select.{name}
> oai.harvester.metadataformats.dc = http://www.openarchives.org/OAI/2.0/oai_dc/\, Simple Dublin Core
> oai.harvester.metadataformats.qdc = http://purl.org/dc/terms/\, Qualified Dublin Core
> oai.harvester.metadataformats.dim = http://www.dspace.org/xmlns/dspace/dim\, DSpace Intermediate Metadata
>
> If you look in the "Crosswalk Plugin Configuration" in the main dspace.cfg 
> where packager/ingest crosswalks are defined, you'll see
>
>   org.dspace.content.crosswalk.DIMIngestionCrosswalk = dim, \
>
> You'll see it's pointing the 'dim' metadataformat name to 
> DIMIngestionCrosswalk, which is a really simple java crosswalk that 
> bypasses any other XSLT (
> https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/content/crosswalk/DIMIngestionCrosswalk.java
> )
>
> I've seen this 'sample' XSL floating around in the config dir which maybe 
> could be turned into something more useful for you if you can plug it into 
> a new dim ingestion crosswalk declaration
>
> /home/vagrant/dspace/config/crosswalks/sample-crosswalk-DIM2DC.xsl
>
> But I think, overall, given the small number of repositories out there 
> offering DIM as a serious harvestable metadataFormat, and the much better 
> support for DC everywhere in OAI dissemination, harvest, the configuration 
> rules you're seeing there, etc., I'd stick to at least qdc wherever 
> possible if simple dc is not enough.
>
> Sorry that's all a bit vague, 
>
> Hope this helps!
>
> Cheers
>
> Kim
>
> On Thursday, January 19, 2017 at 10:20:09 PM UTC+13, euler wrote:
>>
>> Hi Kim,
>>
>> Thanks for the response. Can you please point me to where I can find the 
>> ingest crosswalk? I only found the file sword-swap-ingest.xsl in the 
>> [dspace]/config/crosswalks directory. Regarding my second question, 
>> does that mean I can use a property setting like the one below?
>>
>> oai.harvester.acceptedHandleServer = hdl.handle.net, repository.university.edu
>>
>> Thanks again,
>> euler
>>
>> On Thursday, January 19, 2017 at 3:33:54 PM UTC+8, Kim Shepherd wrote:
>>>
>>> Hi euler,
>>>
>>> I haven't done harvesting in DIM format, I'd probably need to see your 
>>> ingest crosswalk to know exactly what to expect from that method, but 
>>> you're onto something with your second question - if your source repository 
>>> is serving up identifier URIs prefixed with repository.university.edu, 
>>> you'll want to add this to acceptedHandleServer
>>>
>>> Cheers
>>>
>>> Kim
>>>
>>> On Wednesday, January 18, 2017 at 6:00:32 PM UTC+13, euler wrote:
>>>>
>>>> Dear All,
>>>>
>>>> Reposting my query from last December. I would really appreciate any 
>>>> comments on this.
>>>>
>>>> Thanks in advance,
>>>> euler
>>>>
>>>> On Monday, December 5, 2016 at 3:08:06 PM UTC+8, euler wrote:
>>>>>
>>>>> Dear All,
>>>>>
>>>>> I am testing the harvesting of my new DSpace 6.0 installation. I 
>>>>> wonder if it's possible to prevent DSpace from creating a new 
>>>>> dc.identifier.uri for incoming harvested items? It seems the 
>>>>> setting oai.harvester.acceptedHandleServer = hdl.handle.net in 
>>>>> oai.cfg has no effect when using DIM Metadata Format. It says in oai.cfg 
>>>>> that:
>>>>>
>>>>> # A harvest process will attempt to scan the metadata of the incoming items
>>>>> # (dc.identifier.uri field, to be exact) to see if it looks like a handle.
>>>>> # If so, it matches the pattern against the values of this parameter.
>>>>> # If there is a match the new item is assigned the handle from the metadata value
>>>>> # instead of minting a new one. Default value: hdl.handle.net
>>>>> oai.harvester.acceptedHandleServer = hdl.handle.net
>>>>>
>>>>> I tried several repositories to test this, including the Demo server 
>>>>> of DSpace but it is just minting a new dc.identifier.uri for incoming 
>>>>> items. If I use Simple Dublin Core or QDC as the Metadata format, it will 
>>>>> save the original handle of the item in dc.identifier but it is still 
>>>>> creating new dc.identifier.uri for incoming items. Is this the expected 
>>>>> behaviour? Or is this a bug? I just want to assign the original handle of 
>>>>> the item in dc.identifier.uri.
>>>>>
>>>>> Lastly, if the source repository is not registered with CNRI's handle 
>>>>> service, what should I put in oai.harvester.acceptedHandleServer? Can I 
>>>>> just use the source repository's URL, e.g. repository.university.edu 
>>>>> instead of hdl.handle.net?
>>>>>
>>>>> Thanks in advance and best regards,
>>>>> euler
>>>>>
>>>>
