Hi Bill & Tim!

This is the server spec:

Server DSpace App
Ubuntu Server 22.04.2 LTS
openjdk version "11.0.18" 2023-01-17
OpenJDK Runtime Environment (build 11.0.18+10-post-Ubuntu-0ubuntu122.04)
OpenJDK 64-Bit Server VM (build 11.0.18+10-post-Ubuntu-0ubuntu122.04, mixed
mode, sharing)
Tomcat 9.0.72
Solr 8.11.2
Apache Maven 3.6.3
Apache Ant 1.10.12
NodeJS v16.19.1
npm 8.19.3
-----------------------------
Server Database
Ubuntu Server 22.04.2 LTS
PostgreSQL 14.6 (Ubuntu 14.6-0ubuntu0.22.04.1)

DSpace installed without errors.
The PDF files compared in DSpace 6.3 and DSpace 7.5 are the same and were
batch imported using a SAF (Simple Archive Format) package, with no import
errors.
I changed the text extractor configuration in dspace.cfg to
textextractor.max-chars = -1. All files were then converted 100% to TXT.

Bill, my discovery.cfg does not contain the companion setting
(discovery.solr.fulltext.charLimit).
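
For anyone checking the same thing, here is a minimal sketch of how one might
confirm whether a property is actually set (rather than missing or only
commented out) in a properties-style config file. The function name
find_property and the sample text are hypothetical and are not part of DSpace.
The sketch assumes simple "key = value" lines and does not handle includes or
line continuations.

```python
from typing import Optional


def find_property(cfg_text: str, key: str) -> Optional[str]:
    """Return the last uncommented value of `key`, or None if it is not set."""
    value = None
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        name, sep, rest = line.partition("=")
        # The key must match exactly; a stray space inside the key will not match.
        if sep and name.strip() == key:
            value = rest.strip()
    return value


# Hypothetical sample content, for illustration only.
sample = """\
# discovery.solr.fulltext.charLimit = 100000
textextractor.max-chars = -1
"""

print(find_property(sample, "textextractor.max-chars"))
print(find_property(sample, "discovery.solr.fulltext.charLimit"))
```

In real use, the text would be read from the actual config files instead of
the sample string. A result of None means the key is absent or only present
in a comment.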

The search problem remains.
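
To narrow this down along the lines Tim suggests in the quoted message below,
one option is to list which items are found by 6.3 but not by 7.5. This is
only a sketch: the function name missing_results and the identifiers are
hypothetical placeholders, and it assumes the result identifiers from each
version have already been collected by hand or by export.

```python
from typing import Iterable, List


def missing_results(old_hits: Iterable[str], new_hits: Iterable[str]) -> List[str]:
    """Return identifiers present in old_hits but absent from new_hits, sorted."""
    return sorted(set(old_hits) - set(new_hits))


# Hypothetical identifiers, for illustration only.
hits_63 = ["item-a", "item-b", "item-c", "item-d"]
hits_75 = ["item-a", "item-c"]

for identifier in missing_results(hits_63, hits_75):
    print(identifier)
```

With the missing items identified, it becomes easier to check whether the
term matches in their metadata or only in the full text.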

Thanks.

Erivelto


On Thu, Mar 2, 2023 at 14:27, Bill Tantzen <[email protected]>
wrote:

> There is a companion setting in discovery.cfg
> (discovery.solr.fulltext.charLimit) which limits the number of characters
> that are actually stored in the solr index in the fulltext field;
> initially, that is also set to 100000 characters.  Simply set this to a
> higher count, or -1 for unlimited.
>
> Hope that helps!
> ~~Bill
>
> On Thu, Mar 2, 2023 at 10:50 AM 'Tim Donohue' via DSpace Community <
> [email protected]> wrote:
>
>> Hi Erivelto,
>>
>> I'd recommend looking more closely at the 5 items which were matched in
>> DSpace 6.3 but not in 7.5.  Is there something in common among those 5
>> items?  Is the search match occurring in the metadata of those items or
>> in the full text?
>>
>> If you can narrow things down, it'd be much easier to provide
>> support/ideas.  There have been a lot of changes in the search engine of
>> DSpace 7.5... including a move to a later version of Solr.  It's possible
>> you've found a bug, or it could be a misconfiguration, or simply a change
>> in the behavior of Solr.  It's difficult to narrow down without more
>> information about the differences in the results that you are seeing.
>>
>> If you can send more information to this list or to dspace-tech (as I
>> see you sent the same email to both lists), that might provide others
>> with more clues as to what might be going on.
>>
>> Tim
>> ------------------------------
>> *From:* [email protected] <
>> [email protected]> on behalf of Erivelto Henrique <
>> [email protected]>
>> *Sent:* Thursday, March 2, 2023 7:41 AM
>> *To:* DSpace Community <[email protected]>
>> *Subject:* [dspace-community] Differences in search result on items
>> between DSpace 6.3 / DSpace 7.5
>>
>> Hi everyone;
>>
>> I have a DSpace 6.3 installation and am deploying a new document
>> repository with version 7.5.
>> We have already installed version 7.5 on a new server and imported some
>> documents into it.
>> We ran some search tests and noticed a large difference in search
>> results between the two versions.
>> When I search for a term in version 6.3, I get 14 results, and when I
>> search in version 7.5, I get only 9.
>> Version 6.3 search result
>> [image: Screenshot_8.png]
>> Search result in version 7.5
>> [image: Screenshot_10.png]
>>
>> Some of the PDF documents are very large, and with the Text Extractor
>> limit set to 100k characters they were not being converted 100% to TXT.
>> I changed the setting to textextractor.max-chars = -1, but the search
>> results remain the same.
>>
>> Can anyone help with this?
>>
>> Thanks
>>
>> Erivelto
>>
>> --
>> All messages to this mailing list should adhere to the Code of Conduct:
>> https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "DSpace Community" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/dspace-community/9ba0b682-523b-4709-a6cb-de2284f1f90bn%40googlegroups.com
>> <https://groups.google.com/d/msgid/dspace-community/9ba0b682-523b-4709-a6cb-de2284f1f90bn%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>
>
> --
> Human wheels spin round and round
> While the clock keeps the pace... -- John Mellencamp
> ________________________________________________________________
> Bill Tantzen    University of Minnesota Libraries
> 612-626-9949 (U of M)    612-325-1777 (cell)
>
