Hi All,

Thanks for the additional details, everyone. This sounds like it's occurring at several institutions, which strongly suggests a more widespread issue in Google Scholar's indexing of DSpace sites.

Regarding the TEXT bitstream URL appearing in the <script> tag: I see what you mean, Bill. Now that I look closely, I'm seeing it on our demo site as well. There does seem to be some extraneous JSON in that <script> tag that looks like cached responses from the REST backend. I'm not exactly sure where that's coming from, and it *does* seem to sometimes include the URL of the TEXT bundle file.

My guess would be that this *might be* where Google Scholar is finding the link, but I cannot say with any certainty. They obviously don't share all the information on how they index sites. But I do know that Google Scholar uses the SSR (server-side rendered) HTML page. Their bots don't use OAI or anything else like that.

I'll bring this up in tomorrow's DSpace Developers Meeting to see if anyone has ideas for a possible fix. It sounds like either we need to find what is adding that extraneous JSON (and it could be something in Angular), or we may need to re-prioritize a fix for the TEXT bundle permissions discussion (that Sascha noted), which was logged in https://github.com/DSpace/DSpace/issues/11681.

Tim

On Wednesday, February 4, 2026 at 10:09:34 AM UTC-6 Andrew K wrote:
> Hello,
>
> It looks like the extracted text is intended for internal search, right?
> Then it should never be exposed.
>
> WBR,
> Andrew
>
> On Wednesday, February 4, 2026 at 10:02:32 UTC+2 Sascha Szott wrote:
>
> Hello everyone,
>
> just a small note regarding the discussion: we already talked about the
> topic of bitstreams in the TEXT bundle in a developer meeting last year.
> This resulted in the GitHub ticket
> https://github.com/DSpace/DSpace/issues/11681.
>
> Presumably, we can restrict access to the bitstreams in the TEXT bundle.
> Ideally, the URLs should not appear in the SSR output at all.
>
> Best
> Sascha
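
For anyone who wants to check whether their own site shows the same thing, here is a rough sketch of one way to list the bitstream URLs embedded in the <script> tags of a server-side rendered page. It is only an illustration, not something from the DSpace codebase: the function name `find_bitstream_urls` is made up, and the URL shapes it looks for (`/bitstreams/<uuid>/content` and `/bitstreams/<uuid>/download`) are assumptions that may need adjusting for your installation. It also cannot tell which bundle a bitstream belongs to, so any UUIDs it reports would still have to be compared against the TEXT bundle of the item by hand.

```python
import re
from html.parser import HTMLParser

# Assumed shape of bitstream links (REST "content" or UI "download").
# This pattern is a guess for illustration; adjust it for your own site.
BITSTREAM_PATH = re.compile(
    r"/bitstreams/[0-9a-fA-F-]{36}/(?:content|download)"
)


class ScriptCollector(HTMLParser):
    """Collects the raw text content of every <script> element."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script content may arrive in several chunks, so append each one.
        if self._in_script:
            self.scripts[-1] += data


def find_bitstream_urls(html):
    """Return the sorted, de-duplicated bitstream paths found inside <script> tags."""
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    found = set()
    for body in collector.scripts:
        found.update(match.group(0) for match in BITSTREAM_PATH.finditer(body))
    return sorted(found)
```

To try it, save the HTML of an item page as returned to a client that does not execute JavaScript (for example with a plain HTTP download), read the file into a string, and pass it to `find_bitstream_urls`. Links in the visible page body are deliberately ignored; only text inside <script> elements is searched.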
