Hi,

Redirects of PDF URLs are also a problem with our 5.8 installation.

Can the configuration

google.citation_pdf_url = $simple-pdf

in the file "google-metadata.properties" be changed to provide the direct PDF URLs?

Best regards
Franziska Rapp


On 05.01.2021 at 13:44, Alex Fletcher wrote:
We've recently found that the Google Scholar crawler has a lot of trouble matching PDF URLs to the correct repository record (the landing page with the associated metadata) because of redirects in place for the PDF URLs.

For example, https://qspace.library.queensu.ca/handle/1974/28134

lists in the metatags

<meta content="https://qspace.library.queensu.ca/bitstream/1974/28134/10/Public-Water-Covid-19.pdf" name="citation_pdf_url">

But for the crawler, this PDF URL redirects to another URL:

Fetched Header
Permanent redirect (301) to https://qspace.library.queensu.ca/bitstream/handle/1974/28134/Public-Water-Covid-19.pdf;jsessionid=3681375205557337FE2EA1F5058CED47?sequence=10

This is being flagged as suspicious behavior (cloaking) by the indexing system, so this redirect is not followed.
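
To check other records for the same behavior, something like the following might help. It is only a rough sketch in Python using the standard library: the function and class names are made up for illustration, and it assumes the landing page HTML has already been saved or fetched. It pulls every citation_pdf_url out of the metatags and requests a URL once without following redirects, so a 301 like the one above becomes visible.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit
import http.client


class CitationPdfParser(HTMLParser):
    """Collects the content of every <meta name="citation_pdf_url"> tag."""

    def __init__(self):
        super().__init__()
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if attr.get("name") == "citation_pdf_url" and attr.get("content"):
            self.pdf_urls.append(attr["content"])


def citation_pdf_urls(html):
    """Return the PDF URLs advertised in the metatags of a landing page."""
    parser = CitationPdfParser()
    parser.feed(html)
    return parser.pdf_urls


def is_redirect(status):
    """True for the HTTP status codes that send the client to another URL."""
    return status in (301, 302, 303, 307, 308)


def fetch_status(url):
    """Request url once, without following redirects.

    Returns (status code, Location header or None). Uses HEAD, which is an
    assumption: a server may answer HEAD and GET differently.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=30)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=30)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

For each URL returned by citation_pdf_urls(), calling fetch_status(url) and passing the status to is_redirect() shows whether the advertised PDF URL is served directly. The crawler may send different headers than this script does, so the result is an indication rather than proof of what Google Scholar sees.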

I looked through the DSpace documentation and found this page

https://wiki.lyrasis.org/display/DSDOC5x/Search+Engine+Optimization#SearchEngineOptimization-AvoidredirectingfiledownloadstoItemlandingpages

Which states:

Make sure that you never redirect "direct file downloads" (i.e. users who directly jump to downloading a file, often from a search engine) to the associated Item's splash/landing page.  In the past, some DSpace sites have added these custom URL redirects in order to facilitate capturing statistics via Google Analytics or similar.

We've never put those in, and we are using the stock code for that part. Is there a configuration variable we've missed that disables this automatic URL redirect?

Alex


--
All messages to this mailing list should adhere to the DuraSpace Code of Conduct: https://duraspace.org/about/policies/code-of-conduct/
---
You received this message because you are subscribed to the Google Groups "DSpace Technical Support" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected] <mailto:[email protected]>. To view this discussion on the web visit https://groups.google.com/d/msgid/dspace-tech/02f60d0d-8aad-4e10-9565-f2fa84950c75n%40googlegroups.com <https://groups.google.com/d/msgid/dspace-tech/02f60d0d-8aad-4e10-9565-f2fa84950c75n%40googlegroups.com?utm_medium=email&utm_source=footer>.

--
Franziska Rapp
Communication and Information Center (kiz)
Ulm University
