Hi Bill & Tim! This is the server spec:

Server DSpace App
Ubuntu Server 22.04.2 LTS
openjdk version "11.0.18" 2023-01-17
OpenJDK Runtime Environment (build 11.0.18+10-post-Ubuntu-0ubuntu122.04)
OpenJDK 64-Bit Server VM (build 11.0.18+10-post-Ubuntu-0ubuntu122.04, mixed mode, sharing)
Tomcat 9.0.72
Solr 8.11.2
Apache Maven 3.6.3
Apache Ant 1.10.12
NodeJS v16.19.1
npm 8.19.3
-----------------------------
Server Database
Ubuntu Server 22.04.2 LTS
PostgreSQL 14.6 (Ubuntu 14.6-0ubuntu0.22.04.1)

DSpace installed without errors.
The PDF files compared in DSpace 6.3 and DSpace 7.5 are the same, and they were batch imported from a SAF (Simple Archive Format) package. There were no import errors.
I changed the text extractor setting in dspace.cfg to textextractor.max-chars = -1. After that, all files were 100% converted to TXT.

Bill, the companion setting (discovery.solr.fulltext.charLimit) does not exist in my discovery.cfg.

The search problem remains.

Thanks.

Erivelto

On Thu, Mar 2, 2023 at 14:27, Bill Tantzen <[email protected]> wrote:

> There is a companion setting in discovery.cfg
> (discovery.solr.fulltext.charLimit) which limits the number of characters
> that are actually stored in the solr index in the fulltext field;
> initially, that is also set to 100000 characters. Simply set this to a
> higher count, or -1 for unlimited.
>
> Hope that helps!
> ~~Bill
>
> On Thu, Mar 2, 2023 at 10:50 AM 'Tim Donohue' via DSpace Community <
> [email protected]> wrote:
>
>> Hi Erivelto,
>>
>> I'd recommend looking more closely at the 5 items which were matched in
>> DSpace 6.3 but not in 7.5. Is there something in common among those 5
>> items? Is the search results match occurring in the metadata of those
>> items or in the full text?
>>
>> If you can narrow things down, it'd be much easier to provide
>> support/ideas. There have been a lot of changes in the search engine of
>> DSpace 7.5... including a move to a later version of Solr.
>> It's possible you've found a bug, or it could be a misconfiguration, or
>> simply a change in the behavior of Solr. It's difficult to narrow down
>> without more information about the differences in the results that you
>> are seeing.
>>
>> If you can send more information to this list or your email to
>> dspace-tech (as I see you sent the same email to both lists), that might
>> provide others with more clues as to what might be going on.
>>
>> Tim
>> ------------------------------
>> *From:* [email protected] <
>> [email protected]> on behalf of Erivelto Henrique <
>> [email protected]>
>> *Sent:* Thursday, March 2, 2023 7:41 AM
>> *To:* DSpace Community <[email protected]>
>> *Subject:* [dspace-community] Differences in search result on items
>> between DSpace 6.3 / DSpace 7.5
>>
>> Hi everyone;
>>
>> I have a DSpace 6.3 installation and am deploying a new document
>> repository with version 7.5
>> We have already installed the new version 7.5 on a new server, and we
>> have imported some documents for this new installation.
>> We did some search tests and noticed a very big difference in search
>> results between the two versions.
>> When I search for a term in version 6.3, I get 14 results found for the
>> search, and when I search in version 7.5, I only get 9 returns.
>> Version 6.3 search result
>> [image: Screenshot_8.png]
>> Search result in version 7.5
>> [image: Screenshot_10.png]
>>
>> With some PDF documents that are very large and the Text Extractor
>> settings were set to 100k characters, the file was not converting 100% to
>> TXT. I changed it to textextractor.max-chars = -1 but still the search
>> result remains the same.
>>
>> Anyone can help with this?
>>
>> Thanks
>>
>> Erivelto
>>
>> --
>> All messages to this mailing list should adhere to the Code of Conduct:
>> https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "DSpace Community" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/dspace-community/9ba0b682-523b-4709-a6cb-de2284f1f90bn%40googlegroups.com
>>
>
>
> --
> Human wheels spin round and round
> While the clock keeps the pace... -- John Mellencamp
> ________________________________________________________________
> Bill Tantzen University of Minnesota Libraries
> 612-626-9949 (U of M) 612-325-1777 (cell)
>
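
Two checks discussed in this thread can be sketched in a few lines of Python: finding which items the old instance returns but the new one does not (Tim's suggestion to look at the 5 unmatched items), and checking whether a key such as discovery.solr.fulltext.charLimit is actually set in a config file. This is only a rough sketch under stated assumptions: the function names `missing_from_new` and `find_setting` are made up for illustration, the handles are placeholders, and the parser handles only plain `key = value` lines, not the full properties syntax DSpace accepts. It does not call any DSpace or Solr API; the result lists would have to be copied in by hand from each search page.

```python
def missing_from_new(old_handles, new_handles):
    """Return handles present in the old results but absent from the new ones.

    The order of the old result list is preserved.
    """
    new_set = set(new_handles)
    return [h for h in old_handles if h not in new_set]


def find_setting(cfg_text, key):
    """Return every value assigned to `key` in properties-style text.

    Simplified on purpose: only `key = value` lines are recognised, and
    lines starting with '#' or '!' are treated as comments, so a
    commented-out setting counts as not set.
    """
    values = []
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            values.append(value.strip())
    return values


if __name__ == "__main__":
    # Placeholder handles, not real items from either repository.
    old_results = ["123456789/10", "123456789/11", "123456789/12"]
    new_results = ["123456789/11"]
    print("Only in old instance:", missing_from_new(old_results, new_results))

    # Hypothetical config excerpt; in practice, read the real file contents.
    sample_cfg = "# text extraction\ntextextractor.max-chars = -1\n"
    for key in ("textextractor.max-chars", "discovery.solr.fulltext.charLimit"):
        found = find_setting(sample_cfg, key)
        print(key, "->", found if found else "not set in this file")
```

An empty list from `find_setting` only means the key is not set in the text that was scanned; it says nothing about a value set in another file or about the built-in default.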
