Hi Bill,

I have to admit, I find this confusing too. I'm not aware of anywhere in the UI where we provide a *publicly available* link to files in the TEXT bundle. If we are somehow "exposing" the TEXT bundle to crawlers, then it's accidental. Files in that TEXT bundle are not meant for public download.
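If it helps with checking whether a page really does expose those links, here is a rough Python sketch (standard library only) that splits the bitstream references in an Item page's HTML into two groups: ones that appear as ordinary `<a href>` links, and ones that only appear inside `<script>` content. The URL pattern (`bitstreams/<uuid>`) and the names `BitstreamLinkScanner` and `scan_item_html` are my own assumptions for illustration, not anything DSpace provides, so adjust them to match what you see in your page source.

```python
import re
from html.parser import HTMLParser

# Assumption: bitstream references look like ".../bitstreams/<uuid>..." in both
# UI links and embedded REST URLs. Adjust the pattern if your site differs.
BITSTREAM_RE = re.compile(
    r"bitstreams/([0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12})"
)


class BitstreamLinkScanner(HTMLParser):
    """Collect bitstream UUIDs seen in <a href> links vs. inside <script> bodies."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.anchor_ids = set()  # UUIDs found in ordinary links on the page
        self.script_ids = set()  # UUIDs found in embedded script content

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
        elif tag == "a":
            # An href attribute may be missing or have no value.
            href = dict(attrs).get("href") or ""
            self.anchor_ids.update(BITSTREAM_RE.findall(href))

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Only text that sits inside a <script> element is inspected here.
        if self.in_script:
            self.script_ids.update(BITSTREAM_RE.findall(data))


def scan_item_html(html):
    """Return (anchor_ids, script_ids) for one page of saved HTML source."""
    scanner = BitstreamLinkScanner()
    scanner.feed(html)
    scanner.close()
    return scanner.anchor_ids, scanner.script_ids
```

You would save the page source of an Item page, pass it to `scan_item_html`, and look at `script_ids - anchor_ids`: those are bitstreams referenced only in script content. You would still need to look each one up to confirm which bundle it belongs to.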
Are you able to get any clues from Google Scholar regarding which Items are linking to TEXT bundles? Are they all newer content, or older content? If it's older content, it's always possible this was a bug in an older version of DSpace. If it's newer content, that implies we may be missing a place where these are exposed in recent DSpace versions. That would imply, though, that they'd be in the HTML *somewhere*, likely either on the Item page or the "Full" Item page. (I'm not seeing them on either of those pages on our demo site though, e.g. https://demo.dspace.org/items/bb3eb3d2-9796-4a6b-b08e-af914e2438a9 or https://demo.dspace.org/items/bb3eb3d2-9796-4a6b-b08e-af914e2438a9/full ). Either that, or Google Scholar's bot is finding links to them elsewhere on the web (which would be odd).

Overall, I think this might require digging for more clues, or (as you've already done) seeing if others have seen this behavior as well. Either one might help us narrow things down.

Tim

On Tuesday, February 3, 2026 at 11:04:49 AM UTC-6 [email protected] wrote:

> We are discovering extracted text from the TEXT bundle indexed in Google
> Scholar. I'm not sure how this is happening. Bitstreams in the TEXT
> bundle are referenced numerous times in the <script> element of the source
> code, but not in the UI so far as I can tell.
>
> Is there a way to prevent these bitstreams from being indexed?
>
> Thanks for any tips!
> ~~Bill
>
> --
> ______________________________________
> Bill Tantzen University of Minnesota Libraries
> 612-626-9949 (U of M) 612-325-1777
> (mobile)

--
All messages to this mailing list should adhere to the Code of Conduct: https://lyrasis.org/code-of-conduct/
---
You received this message because you are subscribed to the Google Groups "DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/dspace-tech/e642e6b8-7e83-4460-bb71-0879627bf17dn%40googlegroups.com.
