Hello,

Putting all this "tail wagging the dog" aside, I think it would be very
good to get the appropriate "metadata" added to the PDF.

I wanted to contribute that we recently had a "non-coverpage" case where
the title of a paper was correct on the first page of the PDF and in the
DSpace metadata, but the PDF had the incorrect title in its internal
metadata. This caused Google Scholar to show the incorrect title in its
search results, which caused much confusion for the owner of that document.
Changing the internal metadata resulted in the GS record changing. From
this, it is clear that GS is leaning heavily on PDF internal metadata as
its primary source for its records.

I think that if the appropriate metadata were populated in the PDF during
cover page generation, it would take precedence over the cover page in GS.
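
As a rough illustration of the kind of check that would have caught our
case, here is a minimal Python sketch that compares the title stored in a
PDF's internal Info dictionary with the title held in the repository. It
is only a sketch under assumptions: the function names are hypothetical,
the regex is a crude heuristic that only sees an uncompressed /Title
literal string (it can also match a bookmark title, and it ignores XMP
metadata), and a real implementation would use a proper PDF library.

```python
import re


def pdf_info_title(pdf_bytes: bytes):
    """Return the first literal-string /Title found in raw PDF bytes, or None.

    Heuristic only: handles an uncompressed dictionary entry whose value is
    a simple literal string without nested parentheses or escapes.
    """
    match = re.search(rb"/Title\s*\(([^()\\]*)\)", pdf_bytes)
    if match is None:
        return None
    return match.group(1).decode("latin-1")


def normalize(title: str) -> str:
    """Collapse whitespace and ignore case so trivial differences don't count."""
    return " ".join(title.split()).casefold()


def title_mismatch(pdf_bytes: bytes, repository_title: str) -> bool:
    """True when the PDF carries an internal title that differs from the repository's."""
    internal = pdf_info_title(pdf_bytes)
    return internal is not None and normalize(internal) != normalize(repository_title)
```

Running something like title_mismatch() over bitstreams at ingest or during
cover page generation would flag documents whose internal title disagrees
with the DSpace record before Google Scholar picks them up.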

Mark


On Fri, Jun 19, 2015 at 7:00 AM, Tim Donohue <tdono...@duraspace.org> wrote:

> Hi Peter,
>
> Thanks for your thoughts on this. I do definitely see that many times,
> things still "work" in Google Scholar (and that's good to see).
>
> However, I will mention that I've immediately found examples from
> kb.OSU.edu where things didn't "work" as expected, and the "abstract"
> reported by Google Scholar is actually text from the PDF coverpage.
>
> For example:
>
> "Mycorrhizae and establishment of trees on strip-mined land"
>
> https://scholar.google.com/scholar?cluster=3612920260665530599&hl=en&as_sdt=0,36
>
> This becomes even more visible if you search for text that appears in
> the cover page itself. For example "Downloaded from the Knowledge Bank"
> appears in your PDF cover page:
>
> Searching for it finds ~243,000 results with this text in the "abstract" field:
>
> https://scholar.google.com/scholar?q=Downloaded+from+the+Knowledge+Bank&btnG=&hl=en&as_sdt=0%2C36
>
> It does look like Google Scholar is smart enough to generally grab the
> Title, Author and even date from the cover page itself (so assuming they
> are spelled correctly in the DSpace metadata, that is great!). But, the
> abstract seems to be problematic, so some of your articles may not have
> as much "visibility" in that they may be only searchable by Title,
> Author and Date.
>
> This is just something to consider if Google Scholar visibility is of
> high importance to your researchers/users. There definitely are some
> issues here, even with a well-formatted Cover Page.
>
> - Tim
>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Dspace-general mailing list
> Dspace-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dspace-general
>



-- 
*Mark Diggory*
*2888 Loker Avenue East, Suite 315, Carlsbad, CA. 92010*
*Esperantolaan 4, Heverlee 3001, Belgium*
http://www.atmire.com