Agree that we want to preserve (and enhance) this cover page capability
for reasons already cited in the thread: it helps identify the source of
the content and adds a bit of polish to student papers and items that
haven't already been published elsewhere.
The cover page is generated on the fly. It isn't stored, so aren't we in
fact preserving the original PDF?
Are people saying that Google rejects all PDFs with a
generated-on-the-fly cover page, or just some small percentage--those
that fail to present the expected result from the on-the-fly process
(e.g., something went wrong with it this time around)? A Google
harvest isn't a one-time event, so perhaps an error code that reports
failed on-the-fly conversions would be a way to track down the PDFs that
are routinely failing to present proper content to Google.
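
As a rough sketch of that tracking idea (Python purely for illustration; the class, method names, and thresholds below are hypothetical and not part of DSpace), each on-the-fly conversion could report success or failure, and PDFs that fail often enough get flagged for a closer look:

```python
from collections import Counter


class CoverPageFailureLog:
    """Hypothetical in-memory tally of on-the-fly cover page conversions."""

    def __init__(self):
        self._attempts = Counter()
        self._failures = Counter()

    def record(self, pdf_id, succeeded):
        # Count every conversion attempt, and separately count the failures.
        self._attempts[pdf_id] += 1
        if not succeeded:
            self._failures[pdf_id] += 1

    def routinely_failing(self, min_attempts=3, min_failure_rate=0.5):
        # Assumption: "routinely failing" means at least min_attempts tries
        # with a failure rate at or above min_failure_rate.
        flagged = []
        for pdf_id, attempts in self._attempts.items():
            if attempts < min_attempts:
                continue
            if self._failures[pdf_id] / attempts >= min_failure_rate:
                flagged.append(pdf_id)
        return sorted(flagged)
```

Because a harvest repeats, the tally would build up over successive crawls; a real version would presumably persist the counts rather than hold them in memory.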
- Wally
Wally Grotophorst
Associate University Librarian
George Mason University
Fairfax, Virginia 22030
(703) 993-9005
Kim Shepherd wrote:
[...]
So, to get to my question. In DSpace 5.0, we actually added a basic PDF
Cover Page capability (which was requested by DCAT and others):
https://wiki.duraspace.org/display/DSDOC5x/PDF+Citation+Cover+Page
As this may have strong implications for inclusion in Google Scholar,
should we consider removing this functionality from DSpace?
For the time being, I've placed warnings in the Documentation for this
feature to try to dissuade institutions from enabling it if Google
Scholar inclusion is of high importance.
An alternative idea I had was to do some kind of conditional inclusion
-- we already skip cover page generation if the user is authenticated
as an administrator, perhaps we could also inspect user agent, IP
block, etc. to determine if the 'user' is a Google Scholar crawler,
and send the raw PDF without cover page if so.
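
The conditional-inclusion idea above might look roughly like the following. This is only a sketch in Python for illustration, not DSpace code: the function names are hypothetical, the "googlebot" user-agent token is an assumption about how the crawler identifies itself, and the network listed is a documentation-only placeholder rather than a real crawler IP block.

```python
import ipaddress

# Assumption: lowercase substrings expected in the crawler's User-Agent header.
CRAWLER_AGENT_TOKENS = ("googlebot",)

# Placeholder only: 192.0.2.0/24 is reserved for documentation, not a real crawler range.
CRAWLER_NETWORKS = (ipaddress.ip_network("192.0.2.0/24"),)


def is_crawler(user_agent, remote_ip):
    """Return True if the request looks like it comes from the crawler."""
    agent = (user_agent or "").lower()
    if any(token in agent for token in CRAWLER_AGENT_TOKENS):
        return True
    try:
        address = ipaddress.ip_address(remote_ip or "")
    except ValueError:
        # Unparseable or missing address: fall back to the user-agent result.
        return False
    return any(address in network for network in CRAWLER_NETWORKS)


def should_add_cover_page(is_admin, user_agent, remote_ip):
    """Skip the cover page for administrators (as today) and for crawlers."""
    if is_admin:
        return False
    return not is_crawler(user_agent, remote_ip)
```

User-agent strings are easy to spoof, so the worst case here is that an ordinary visitor posing as the crawler receives the raw PDF without a cover page.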
Cheers
Kim (not getting involved in the PDF debate, but a bit sad that Google
Scholar have so much power - we are more than just free, compliant
data sources for Google...)
M: k...@shepherd.nz
T: @kimshepherd
P: +6421883635
0CCB D957 0C35 F5C1 497E CDCF FC4B ABA3 2A1A FAEC
https://keybase.io/kshepherd
------------------------------------------------------------------------------
_______________________________________________
Dspace-general mailing list
Dspace-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-general