Tim Donohue created DS-1387:
-------------------------------

             Summary: Reports that Google Scholar is linking to DSpace 
extracted text (*.pdf.txt) files instead of original PDF
                 Key: DS-1387
                 URL: https://jira.duraspace.org/browse/DS-1387
             Project: DSpace
          Issue Type: Bug
          Components: XMLUI
            Reporter: Tim Donohue


This ticket is a placeholder for several recent reports about PDF indexing 
oddities with Google Scholar and DSpace (seemingly XMLUI specific, though that 
is unconfirmed).  

In several cases, users have reported that Google Scholar is mistakenly linking 
to the internal extracted PDF text files (*.pdf.txt files).  These internal 
".pdf.txt" files are automatically generated by DSpace for its own indexing, 
and are not meant to be utilized by external search engines.

Although the "*.pdf.txt" files are technically publicly accessible, they are 
currently not linked to from the main Item "splash page", so it's uncertain how 
they are being located by web spiders. (Some have speculated that they are 
found via the OAI interface, or through indexing of the XMLUI's "mets.xml" file.)
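
One way to test the "mets.xml" speculation above is to scan a saved copy of an 
Item's mets.xml for links that end in ".pdf.txt". The following is only a 
minimal sketch in Python: the function name, the sample document, and the 
bitstream paths in it are hypothetical, and a real mets.xml may be laid out 
differently.

```python
import xml.etree.ElementTree as ET


def find_extracted_text_links(xml_text, suffix=".pdf.txt"):
    """Return every attribute value in the XML whose path ends with `suffix`."""
    root = ET.fromstring(xml_text)
    hits = []
    for element in root.iter():
        for value in element.attrib.values():
            # Drop any query string (e.g. "?sequence=2") before comparing.
            path = value.split("?", 1)[0]
            if path.lower().endswith(suffix):
                hits.append(value)
    return hits


# Hypothetical sample, NOT taken from a real repository.
SAMPLE = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileSec>
    <fileGrp USE="CONTENT">
      <file>
        <FLocat xlink:href="/bitstream/handle/123456789/1/paper.pdf?sequence=1"/>
      </file>
    </fileGrp>
    <fileGrp USE="TEXT">
      <file>
        <FLocat xlink:href="/bitstream/handle/123456789/1/paper.pdf.txt?sequence=2"/>
      </file>
    </fileGrp>
  </fileSec>
</mets>"""

if __name__ == "__main__":
    for link in find_extracted_text_links(SAMPLE):
        print(link)
```

If a scan like this turns up ".pdf.txt" links in your own mets.xml, that would 
be a useful example to attach to this ticket. An empty result would not rule 
out other discovery paths, such as the OAI interface.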

Here are a few threads describing this issue on the dspace-tech mailing list:
* http://www.mail-archive.com/[email protected]/msg19303.html
* http://www.mail-archive.com/[email protected]/msg18831.html

If anyone else has noticed this issue, we'd encourage you to provide examples 
in this JIRA ticket.  It may help us to better track down whether this is a 
DSpace issue, a Google Scholar issue, or perhaps even a bit of both.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

_______________________________________________
Dspace-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-devel
