There is a companion setting in discovery.cfg (discovery.solr.fulltext.charLimit) that limits the number of characters actually stored in the fulltext field of the Solr index; by default, it is also set to 100000 characters. Simply set this to a higher count, or -1 for unlimited.
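As a rough sketch, the two limits discussed in this thread would look something like this side by side. The property names are the ones mentioned in the thread; which file each one lives in (or whether you override them in local.cfg) depends on your installation, so treat the placement as an assumption:

```properties
# Limit on characters extracted from each file by the text extractor
# (already set to -1 by the original poster)
textextractor.max-chars = -1

# Companion limit on characters stored in the Solr fulltext field
# (-1 for unlimited, or any count higher than the default 100000)
discovery.solr.fulltext.charLimit = -1
```

The new index limit presumably only applies to items as they are (re)indexed, so you will likely need to rebuild the discovery index (for example with "dspace index-discovery -b") before the search results change.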
Hope that helps!
~~Bill

On Thu, Mar 2, 2023 at 10:50 AM 'Tim Donohue' via DSpace Community <[email protected]> wrote:

> Hi Erivelto,
>
> I'd recommend looking more closely at the 5 items which were matched in
> DSpace 6.3 but not in 7.5. Is there something in common among those 5
> items? Is the search results match occurring in the metadata of those
> items or in the full text?
>
> If you can narrow things down, it'd be much easier to provide
> support/ideas. There have been a lot of changes in the search engine of
> DSpace 7.5... including a move to a later version of Solr. It's possible
> you've found a bug, or it could be a misconfiguration, or simply a change
> in the behavior of Solr. It's difficult to narrow down without more
> information about the differences in the results that you are seeing.
>
> If you can send more information to this list or your email to dspace-tech
> (as I see you sent the same email to both lists), that might provide others
> with more clues as to what might be going on.
>
> Tim
> ------------------------------
> *From:* [email protected] <[email protected]> on behalf of Erivelto Henrique <[email protected]>
> *Sent:* Thursday, March 2, 2023 7:41 AM
> *To:* DSpace Community <[email protected]>
> *Subject:* [dspace-community] Differences in search result on items
> between DSpace 6.3 / DSpace 7.5
>
> Hi everyone;
>
> I have a DSpace 6.3 installation and am deploying a new document
> repository with version 7.5
> We have already installed the new version 7.5 on a new server, and we have
> imported some documents for this new installation.
> We did some search tests and noticed a very big difference in search
> results between the two versions.
> When I search for a term in version 6.3, I get 14 results found for the
> search, and when I search in version 7.5, I only get 9 returns.
> Version 6.3 search result
> [image: Screenshot_8.png]
> Search result in version 7.5
> [image: Screenshot_10.png]
>
> With some PDF documents that are very large and the Text Extractor
> settings were set to 100k characters, the file was not converting 100% to
> TXT. I changed it to textextractor.max-chars = -1 but still the search
> result remains the same.
>
> Anyone can help with this?
>
> Thanks
>
> Erivelto
>
> --
> All messages to this mailing list should adhere to the Code of Conduct:
> https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
> ---
> You received this message because you are subscribed to the Google Groups
> "DSpace Community" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/dspace-community/9ba0b682-523b-4709-a6cb-de2284f1f90bn%40googlegroups.com
> <https://groups.google.com/d/msgid/dspace-community/9ba0b682-523b-4709-a6cb-de2284f1f90bn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .

--
Human wheels spin round and round
While the clock keeps the pace...
-- John Mellencamp
________________________________________________________________
Bill Tantzen
University of Minnesota Libraries
612-626-9949 (U of M)
612-325-1777 (cell)

To view this discussion on the web visit https://groups.google.com/d/msgid/dspace-community/CADgrb7EdJyMkL7%2BN0vkQ-eMECT_6P0pgT_bgtKtgZ_XaJLc10g%40mail.gmail.com.
