Same here. Nothing in the UI, but when I view the page source, I see 14 instances of each TEXT bitstream there. Perhaps Google has learned to parse them from there? I cannot find a trace of them anywhere else.

~~Bill
On Tue, Feb 3, 2026 at 4:44 PM DSpace Technical Support <[email protected]> wrote:

> Hi Bill,
>
> I have to admit, I find this confusing too. I'm also not aware of anywhere in the UI where we provide a *publicly available* link to files in the TEXT bundle. If we are somehow "exposing" the TEXT bundle to crawlers, then it's accidental. Files in that TEXT bundle are not meant for public download.
>
> Are you able to get any clues from Google Scholar regarding which Items are linking to TEXT bundles? Are they all newer content, or older content? If older content, it's always possible this was a bug in an older version of DSpace. If newer content, that implies we may be missing a place where these are exposed in recent DSpace versions. That would imply, though, that they'd be in the HTML *somewhere*, likely either on the Item page or the "Full" Item page. (I'm not seeing them on either of those pages on our demo site though, e.g. https://demo.dspace.org/items/bb3eb3d2-9796-4a6b-b08e-af914e2438a9 or https://demo.dspace.org/items/bb3eb3d2-9796-4a6b-b08e-af914e2438a9/full.) Either that, or Google Scholar's bot is finding links to them elsewhere on the web (which would be odd).
>
> Overall, I think this might require digging for more clues, or (as you've already done) seeing if others have seen this behavior as well. Either one might help us narrow things down.
>
> Tim
>
> On Tuesday, February 3, 2026 at 11:04:49 AM UTC-6 [email protected] wrote:
>
>> We are discovering extracted text from the TEXT bundle indexed in Google Scholar. I'm not sure how this is happening. Bitstreams in the TEXT bundle are referenced numerous times in the <script> element of the source code, but not in the UI so far as I can tell.
>>
>> Is there a way to prevent these bitstreams from being indexed?
>>
>> Thanks for any tips!
>> ~~Bill
>>
>> --
>> ______________________________________
>> Bill Tantzen University of Minnesota Libraries
>> 612-626-9949 (U of M) 612-325-1777 (mobile)
>
> --
> All messages to this mailing list should adhere to the Code of Conduct: https://lyrasis.org/code-of-conduct/
> ---
> You received this message because you are subscribed to the Google Groups "DSpace Technical Support" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To view this discussion visit https://groups.google.com/d/msgid/dspace-tech/e642e6b8-7e83-4460-bb71-0879627bf17dn%40googlegroups.com.

--
______________________________________
Bill Tantzen University of Minnesota Libraries
612-626-9949 (U of M) 612-325-1777 (mobile)

To view this discussion visit https://groups.google.com/d/msgid/dspace-tech/CADgrb7GuNVv3RLKjmXhnxmekfvZ-Y7zCPqzh%2BhnnGcE_VNeE9g%40mail.gmail.com.
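
One way to quantify the observation in this thread (bitstreams referenced in the page's <script> element but not in the visible UI) is to count bitstream references inside the <script> elements of a saved Item page. The sketch below is only an illustration and is not something DSpace provides: the names `ScriptCollector` and `count_bitstream_refs` are hypothetical, and so is the assumption that references look like `bitstreams/<uuid>`, which should be checked against your own page source. It also cannot tell which bundle a bitstream belongs to, so the UUIDs it reports would still need to be compared with the bitstreams known to be in the TEXT bundle.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Assumption: bitstreams show up in the page source as ".../bitstreams/<uuid>".
BITSTREAM_RE = re.compile(
    r"bitstreams/([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})",
    re.IGNORECASE,
)


class ScriptCollector(HTMLParser):
    """Collects the raw text found inside <script> elements only."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Text outside <script> (visible links, headings, etc.) is ignored.
        if self._in_script:
            self.chunks.append(data)


def count_bitstream_refs(html: str) -> Counter:
    """Return a Counter mapping bitstream UUID -> occurrences inside <script>."""
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    script_text = "".join(collector.chunks)
    return Counter(uuid.lower() for uuid in BITSTREAM_RE.findall(script_text))
```

Saving an Item page via "view source" and passing its contents to `count_bitstream_refs` gives a per-UUID count. A UUID that appears here but in no visible link on the page is a candidate for the kind of reference described above.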
