[
https://jira.duraspace.org/browse/DS-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=27690#comment-27690
]
Tim Donohue commented on DS-1482:
---------------------------------
A brief follow-up to Anurag's points (in the previous comment).
We make recommendations similar to what he states on our wiki at:
https://wiki.duraspace.org/display/DSPACE/Ensuring+your+instance+is+indexed
(And we do embed an invisible link to HTML sitemaps in JSPUI and our various
XMLUI themes)
However, he does make a good point: we currently have no way to enable
sitemaps by default (as they are generated/refreshed by a recommended cron
job). So, even though Google Scholar can index the sitemaps, they are often
not enabled, and the Scholar crawler cannot really depend on them.
So, there may be a couple of options here:
(1) Look into whether we can auto-update sitemaps (perhaps via a new event
consumer or similar) so that Google / Google Scholar can use those.
AND/OR
(2) Potentially add a way to browse content by the date it was added (this may
even be useful / interesting to repo managers as a sort of "report" of recently
added content)
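
To make the two options a bit more concrete, here is a rough sketch of the
common idea behind both: order items by the date they were added and expose
the newest ones, either as a browse list or as sitemap entries with a
<lastmod> value. This is illustrative Python only, not DSpace code. The Item
class, the recent_items() and sitemap_xml() helpers, and the example URLs are
all hypothetical, and the "accessioned" field is assumed to mirror
dc.date.accessioned.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from xml.sax.saxutils import escape


@dataclass
class Item:
    """Hypothetical stand-in for a repository item (not a DSpace class)."""
    url: str
    accessioned: datetime  # assumed timezone-aware, like dc.date.accessioned


def recent_items(items, limit=100):
    """Return up to `limit` items, most recently accessioned first."""
    return sorted(items, key=lambda i: i.accessioned, reverse=True)[:limit]


def sitemap_xml(items):
    """Render items as a sitemaps.org <urlset> with one <lastmod> per URL."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for item in items:
        # Normalize to UTC and format as a W3C datetime.
        lastmod = item.accessioned.astimezone(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%SZ"
        )
        lines.append(
            f"  <url><loc>{escape(item.url)}</loc>"
            f"<lastmod>{lastmod}</lastmod></url>"
        )
    lines.append("</urlset>")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        Item("http://example.org/handle/1/1",
             datetime(2013, 1, 1, tzinfo=timezone.utc)),
        Item("http://example.org/handle/1/2",
             datetime(2013, 2, 1, tzinfo=timezone.utc)),
    ]
    print(sitemap_xml(recent_items(demo)))
```

For option (1), something like sitemap_xml() would be re-run whenever an item
is added; for option (2), the output of recent_items() would back a "recently
added" browse page instead.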
> Add a way for harvesters to find recently added items (request from Google)
> ---------------------------------------------------------------------------
>
> Key: DS-1482
> URL: https://jira.duraspace.org/browse/DS-1482
> Project: DSpace
> Issue Type: New Feature
> Reporter: Tim Donohue
>
> This request came out of a discussion I had with Anurag Acharya and Darcy
> Darpa at Google / Google Scholar.
> Anurag mentioned that often the Google harvesters seem to need to do a lot of
> "paging / clicking" in order to find new items in a DSpace instance. This
> can cause both a performance hit in DSpace (as the crawler keeps requesting
> pages), and also can result in delays where items may not appear in Google
> for some time (if the crawler gives up or moves on before it ever finds the
> item).
> Anurag mentioned that it'd be much easier (both on DSpace performance and on
> the Google crawlers), if DSpace provided some way to easily locate recently
> added items.
> This could be something like a "Browse Recently Added Items" (i.e. browse by
> dc.date.accessioned), or similar. It was noted that EPrints has such a
> feature (called "Latest Additions"). For example, see their demo site:
> http://demoprints.eprints.org/cgi/latest
> It's also worth noting this could be as simple as adding a "More..."
> option to our existing "Recently Added" list (of 5 items), so that you can
> see other recently added items.
_______________________________________________
Dspace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-devel