I would say DSpace does a good job of producing Google Scholar (Highwire
Press) tags for the most part. There are some edge cases, as others
mentioned above, that other systems may handle better. I don't know
enough about Scholar support in EPrints or bepress to weigh in. There is a
config file,
https://github.com/DSpace/DSpace/blob/master/dspace/config/crosswalks/google-metadata.properties
that you will NEED to modify to map your custom metadata profile to the
Google Scholar (Highwire Press) metadata fields.
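
To make the mapping idea concrete, here is a minimal Java sketch of how a
"tag = field | field" rule could be resolved against an item's metadata. It
is not DSpace's crosswalk code. The class ScholarMappingSketch, the '|'
"first non-empty field wins" semantics, and the custom field
local.date.published are all assumptions for illustration; check the
comments in your own copy of google-metadata.properties for the real key
names and syntax.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

// Hypothetical helper, NOT DSpace code: resolves one Scholar tag from a
// "tag = field | field" style mapping against an item's metadata.
class ScholarMappingSketch {

    // Parse mapping text written in java.util.Properties syntax.
    static Properties load(String text) {
        Properties mapping = new Properties();
        try {
            mapping.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("could not read mapping", e);
        }
        return mapping;
    }

    // Assumed semantics: fields separated by '|' are alternatives, and the
    // first one that has a non-empty value on the item wins.
    static Optional<String> resolve(Properties mapping, String tag,
                                    Map<String, String> itemMetadata) {
        String rule = mapping.getProperty(tag);
        if (rule == null) {
            return Optional.empty();
        }
        for (String field : rule.split("\\|")) {
            String value = itemMetadata.get(field.trim());
            if (value != null && !value.isEmpty()) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // "local.date.published" is a made-up custom field for illustration.
        Properties mapping = load(
            "google.citation_date = local.date.published | dc.date.issued\n");
        Map<String, String> item = Map.of("dc.date.issued", "2013-05-01");
        System.out.println(
            resolve(mapping, "google.citation_date", item).orElse("(none)"));
    }
}
```

The point of listing a local field first is that a repository with its own
"real publication date" field can prefer it and still fall back to
dc.date.issued when it is empty.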

To cite a specific example: out of the box, DSpace only emits a
citation_pdf_url when the item has exactly one bitstream, it is a PDF, and
it is in the ORIGINAL bundle. In any other circumstance it will punt and
not add a citation_pdf_url.

The reason is that if you have multiple PDFs, DSpace doesn't have
enough information to know which one is the "best" PDF that contains your
article. In other cases, people use multiple bundles to store their
content, or offer multiple formats such as Word, plain text, or LaTeX,
and again DSpace can't say which one is best. So if you deviate
from the simple use case, you'll need to customize the logic for
determining the citation_pdf_url, likely by altering some Java code.
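
As a rough illustration of the out-of-the-box rule described above, here is
a small Java sketch. It is a literal reading of that rule, not the actual
DSpace implementation, which may count files differently. FileInfo is a
simplified stand-in for the real bitstream and bundle classes, and the
class and method names are hypothetical.

```java
import java.util.List;
import java.util.Optional;

// Simplified stand-in, NOT the real DSpace Bitstream/Bundle API.
class PdfUrlSketch {

    static final class FileInfo {
        final String bundle;    // e.g. "ORIGINAL", "TEXT", "LICENSE"
        final String mimeType;  // e.g. "application/pdf"
        final String url;       // public download URL of the file

        FileInfo(String bundle, String mimeType, String url) {
            this.bundle = bundle;
            this.mimeType = mimeType;
            this.url = url;
        }
    }

    // Returns a citation_pdf_url only in the unambiguous case: exactly one
    // file, which is a PDF in the ORIGINAL bundle. Otherwise it punts.
    static Optional<String> citationPdfUrl(List<FileInfo> files) {
        if (files.size() != 1) {
            return Optional.empty(); // none, or several candidates: no "best" file
        }
        FileInfo only = files.get(0);
        boolean isOriginalPdf = "ORIGINAL".equals(only.bundle)
                && "application/pdf".equals(only.mimeType);
        return isOriginalPdf ? Optional.of(only.url) : Optional.empty();
    }
}
```

A local customization would replace the size check with whatever tells you
which file is the article at your institution, for example a naming
convention or a "primary bitstream" flag.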

Another thing Scholar doesn't like is dc.date.issued
being set to the submission date (i.e. today's date, if you just submitted).
If the article you just submitted was actually published elsewhere a
few months ago, but the version in your IR carries today's date,
then Scholar has conflicting information about the date of that article
and doesn't treat them as multiple versions/sources of the same content.
DSpace 4.0 has some changes here: it tries not to set
date.issued to today for anything that you mark as previously published.
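
The idea behind that change can be sketched in a few lines of Java. This is
only my reading of the behaviour described above, not the DSpace 4.0 code;
the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.util.Optional;

// Hypothetical sketch, NOT the DSpace 4.0 submission code.
class IssuedDateSketch {

    // enteredDate is the publication date typed in by the submitter, or null.
    static Optional<LocalDate> issuedDate(boolean previouslyPublished,
                                          LocalDate enteredDate,
                                          LocalDate today) {
        if (previouslyPublished) {
            // Never fall back to today: a missing date is better than a
            // wrong one that conflicts with the publisher's copy.
            return Optional.ofNullable(enteredDate);
        }
        // First published in the repository: today is the true issue date
        // unless the submitter supplied one.
        return Optional.of(enteredDate != null ? enteredDate : today);
    }
}
```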

Peter Dietz


On Wed, Nov 6, 2013 at 9:50 AM, Calloni, Rodrigo <[email protected]> wrote:

> Thanks a lot, Tim. It is very important to know the differences as we
> work toward the best integration we can have with all search tools,
> especially Scholar.
>
>
>
> Rodrigo
>
>
>
> *From:* Tim Donohue [mailto:[email protected]]
> *Sent:* Tuesday, November 05, 2013 10:50 AM
> *To:* Calloni, Rodrigo; [email protected]
>
> *Subject:* Re: [Dspace-tech] DSpace and Google Scholar
>
>
>
> Hi Rodrigo,
>
>
> DuraSpace has been in contact with the Google Scholar team frequently over
> the past few years with regards to DSpace and Google Scholar. We have been
> providing feedback/requests back to DSpace developers directly from the
> Google Scholar team.
>
> So, we've been in ongoing discussions with Google Scholar around making
> DSpace more easily indexed/searched by Google Scholar.  Nearly every new
> version of DSpace includes some search engine improvements (more are coming
> in the upcoming 4.0).  Google Scholar has changed its own "best practices"
> over time (as they improve their system), and as such DSpace has been
> changing its functionality to better support these new  best practices.
>
> Because of that, it is very important to stay up-to-date with DSpace in
> order to get all of these Google Scholar enhancements.  This is another
> difference between DSpace and EPrints & bepress.  Although it's not always
> the case, EPrints and bepress often are "hosted" solutions -- meaning that
> the hosting provider keeps the software up-to-date on your behalf.
> Therefore, as EPrints and bepress make GS improvements, you'd get them
> "automatically" in your hosted system.  There are also some DSpace hosting
> options (e.g. DSpaceDirect via DuraSpace, Open Repository via BioMed
> Central, others), but most institutions run DSpace on their own servers.
> This means that, in order to see all the GS improvements in DSpace, you
> need to be sure you are upgrading the software at a relatively regular pace
> (or hiring someone to do it on your behalf)
>
> Currently, DSpace supports embedded Google Scholar metadata (in their
> recommended Highwire Press format). It is also editable, so that you can
> enhance the metadata even more based on any local metadata fields you may
> add. As Richard mentioned, another difference here is that DSpace is built
> to store *any* content you want to put into it (it need not even be
> "scholarly" in nature), which is why we have configurable Google Scholar
> metadata to support multiple use cases.  Finally, DSpace also provides
> "sitemaps" which let search engines (in general) more easily locate content
> in DSpace.
>
> Google Scholar Metadata tags:
> https://wiki.duraspace.org/display/DSDOC4x/Google+Scholar+Metadata+Mappings
> SiteMaps / SEO:
> https://wiki.duraspace.org/pages/viewpage.action?pageId=34642415
>
> I hope this gives you a good overview of how DSpace attempts to stay up to
> date with Google Scholar and other search engine best practices.
>
> Feel free to let us know if you have other questions,
>
> - Tim
>
>  --
>
> Tim Donohue
>
> Technical Lead for DSpace & DSpaceDirect
>
> DuraSpace.org | DSpace.org | DSpaceDirect.org
>
>
>
> On 11/4/2013 4:23 PM, Calloni, Rodrigo wrote:
>
> Hello
>
>
> We are using DSpace 1.8 XMLUI.
>
>
>
> I am in contact with someone at Google Scholar who mentioned that EPrints
> and bepress's Digital Commons are better integrated with Scholar than
> DSpace.
>
>
>
> I wonder if you are aware of this and what these 2 other IR solutions are
> doing to be more acceptable platforms for Scholar. Is it the UI?
>
>
>
> Thanks in advance
>
> Rodrigo
>
>
>
>
>  
> ------------------------------------------------------------------------------
>
> November Webinars for C, C++, Fortran Developers
>
> Accelerate application performance with scalable programming models. Explore
>
> techniques for threading, error checking, porting, and tuning. Get the most
>
> from the latest Intel processors and coprocessors. See abstracts and register
>
> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk
>
>
>
>
>  _______________________________________________
>
> DSpace-tech mailing list
>
> [email protected]
>
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>
> List Etiquette: 
> https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
>
>
>
>