[ 
https://jira.duraspace.org/browse/DS-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=27668#comment-27668
 ] 

Mark H. Wood commented on DS-1481:
----------------------------------

Maybe we need to be able to set per-Collection validators for some fields.  A 
Collection for things being published by submitting them to DSpace would have a 
dc.date.issued validator which sets the field to "now" if it is null; a 
Collection of previously-published objects would use a validator which demands 
a non-null date (whatever we think "dates" look like).

Validation should be embedded in *all* submission paths.  Pushing an object in 
via SWORD or LNI should fail if any validator returns failure.
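
As a rough illustration of the two paragraphs above, here is a minimal Java
sketch of a per-Collection validator registry that every submission path
could consult. Everything in it is hypothetical: the class
CollectionValidators, its methods, and the example handles are invented for
this message and are not existing DSpace API. Validators are modelled as
plain java.util.function.Predicate<String> to keep the sketch short, which
assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical registry of per-Collection field validators (not DSpace API). */
class CollectionValidators {
    // collection handle -> (metadata field name -> validator for that field)
    private final Map<String, Map<String, Predicate<String>>> byCollection = new HashMap<>();

    /** Declare a validator for one field in one Collection. */
    void register(String collection, String field, Predicate<String> validator) {
        byCollection.computeIfAbsent(collection, k -> new HashMap<>()).put(field, validator);
    }

    /**
     * Returns the names of the fields whose validators rejected the submitted
     * value. An empty list means the submission may proceed. Fields with no
     * declared validator are never checked, i.e. "accept anything".
     */
    List<String> failures(String collection, Map<String, String> metadata) {
        List<String> failed = new ArrayList<>();
        Map<String, Predicate<String>> validators =
                byCollection.getOrDefault(collection, Collections.emptyMap());
        for (Map.Entry<String, Predicate<String>> e : validators.entrySet()) {
            // A missing field is passed to the validator as null.
            if (!e.getValue().test(metadata.get(e.getKey()))) {
                failed.add(e.getKey());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        CollectionValidators registry = new CollectionValidators();
        // Hypothetical handle for a Collection of previously-published objects:
        // dc.date.issued must be supplied by the submitter.
        registry.register("123456789/2", "dc.date.issued", v -> v != null && !v.isEmpty());

        Map<String, String> item = new HashMap<>();
        item.put("dc.title", "An example item");

        // A SWORD, LNI or GUI submission would all make this same call and
        // refuse the deposit if the list is not empty.
        System.out.println(registry.failures("123456789/2", item));
    }
}
```

The point of returning the list of failing fields, rather than a bare
boolean, is that a SWORD or LNI client can be told which fields to fix.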

We can probably get by with a small set of generic validators (must be 
nonempty, must be a date, perhaps a few others) and make them pluggable so 
that more can be readily added.  "Accept anything" does not need a validator 
of its own; it should simply correspond to "no validator declared".

The validator interface should include a "suggest a suitable value" method, so 
that "looks like a date" can supply "now" to e.g. the GUI submission plumbing 
as a default.
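
To make the interface idea concrete, here is a minimal Java sketch. The
names FieldValidator, NonEmptyValidator and IsoDateValidator are invented
for this message and do not exist in DSpace. The date check assumes that
"dates" are ISO-8601 calendar dates (yyyy-MM-dd) and uses java.time, which
requires Java 8 or later; what DSpace should actually accept as a date is
still an open question.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

/** Small demo driver for the hypothetical validators below. */
class ValidatorSketch {
    public static void main(String[] args) {
        FieldValidator issued = new IsoDateValidator();
        System.out.println(issued.isValid("2013-02-20"));
        System.out.println(issued.isValid("last Tuesday"));
        // The GUI submission plumbing could pre-fill the field with this value.
        System.out.println(issued.suggest().orElse("(no suggestion)"));
    }
}

/** Hypothetical validator interface; not part of the DSpace API. */
interface FieldValidator {
    /** Returns true if the value is acceptable for the field. */
    boolean isValid(String value);

    /** Suggests a suitable default value; by default there is none. */
    default Optional<String> suggest() {
        return Optional.empty();
    }
}

/** "Must be nonempty": rejects null and whitespace-only values. */
class NonEmptyValidator implements FieldValidator {
    @Override
    public boolean isValid(String value) {
        return value != null && !value.trim().isEmpty();
    }
}

/** "Must be a date": assumes yyyy-MM-dd and suggests today's date. */
class IsoDateValidator implements FieldValidator {
    @Override
    public boolean isValid(String value) {
        if (value == null) {
            return false;
        }
        try {
            LocalDate.parse(value);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    @Override
    public Optional<String> suggest() {
        return Optional.of(LocalDate.now().toString());
    }
}
```

Returning an Optional from suggest() lets validators such as "must be
nonempty" decline to offer a default without resorting to null.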
                
> "dc.date.issued" is often incorrectly set (reported from Google)
> ----------------------------------------------------------------
>
>                 Key: DS-1481
>                 URL: https://jira.duraspace.org/browse/DS-1481
>             Project: DSpace
>          Issue Type: Improvement
>          Components: DSpace API
>    Affects Versions: 1.7.0, 1.7.1, 1.7.2, 1.8.0, 1.8.1, 1.8.2, 3.0, 3.1
>            Reporter: Tim Donohue
>             Fix For: 4.0
>
>
> Google (Anurag Acharya and Darcy Darpa) has contacted DuraSpace about a 
> common indexing issue affecting all DSpace sites.
> When Google & Google Scholar index DSpace content (from a variety of 
> institutions), the "dc.date.issued" value is incorrect the majority of the 
> time. The reason is that, if unspecified, DSpace sets this issued date to the 
> *date of accession* (i.e. date that it was submitted to DSpace), see:
> https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/content/InstallItem.java#L130
> Google says this causes their crawlers (for both Google & Google Scholar) to 
> assume that the date of accession is actually the formal publication date.
> Rather than defaulting the 'dc.date.issued' to the accession date, Google 
> recommends we leave it blank.  DSpace is already tracking the accession date 
> separately (in 'dc.date.accessioned'), so it seems odd to set 
> 'dc.date.issued' to the same value by default.
> Google will be sending along some examples of this. They said they have seen 
> repositories where 30-50% of the items all have the same "dc.date.issued", 
> because those items were all imported on the same date.
> This seems like a very reasonable recommendation to me as well.  I'm not sure 
> we should be setting 'dc.date.issued' by default, as it really is meant to be 
> the date of *formal publication*, and not the date that something is made 
> available on the web.  
> This also seems like a small fix (remove a few lines from InstallItem).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
