Hi Bill!

I was wrong: the companion setting does exist in discovery.cfg
(discovery.solr.fulltext.charLimit). I changed it to -1 and now DSpace
finds all occurrences in the search.
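
For anyone following the thread, here is a rough sketch of why changing only
textextractor.max-chars was not enough. It is a toy model in Python, not
DSpace code: the function names and the sample document are made up for
illustration, and it assumes that each limit simply cuts the text at N
characters and that -1 means unlimited for both settings.

```python
def truncate(text: str, limit: int) -> str:
    """Cut text to `limit` characters; a negative limit is treated as unlimited."""
    return text if limit < 0 else text[:limit]


def indexed_fulltext(pdf_text: str, extractor_max_chars: int, solr_char_limit: int) -> str:
    """Model the two stages: text extraction first, then the Solr fulltext field."""
    extracted = truncate(pdf_text, extractor_max_chars)  # textextractor.max-chars
    return truncate(extracted, solr_char_limit)          # discovery.solr.fulltext.charLimit


def is_searchable(term: str, pdf_text: str, extractor_max_chars: int, solr_char_limit: int) -> bool:
    """True if `term` survives both truncation stages."""
    return term in indexed_fulltext(pdf_text, extractor_max_chars, solr_char_limit)


# A made-up document whose only occurrence of the term lies past character 100000.
document = "x" * 150_000 + " needle"

for extractor_max, solr_limit in [(100_000, 100_000), (-1, 100_000), (-1, -1)]:
    found = is_searchable("needle", document, extractor_max, solr_limit)
    print(extractor_max, solr_limit, found)
```

In this model the term is found only when both limits are raised, which
matches what I saw: the TXT files were complete, but the search results did
not change until the Solr limit was changed as well.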

But now I have found another problem. I uploaded more PDF files, and when I
run the command ./dspace index-discovery, some of those files are erased.

Thank you.

Erivelto





On Thu, Mar 2, 2023 at 22:43, Erivelto Alves <[email protected]>
wrote:

> Hi Bill & Tim!
>
> This is the server spec:
>
> Server DSpace App
> Ubuntu Server 22.04.2 LTS
> openjdk version "11.0.18" 2023-01-17
> OpenJDK Runtime Environment (build 11.0.18+10-post-Ubuntu-0ubuntu122.04)
> OpenJDK 64-Bit Server VM (build 11.0.18+10-post-Ubuntu-0ubuntu122.04,
> mixed mode, sharing)
> Tomcat 9.0.72
> Solr 8.11.2
> Apache Maven 3.6.3
> Apache Ant 1.10.12
> NodeJS v16.19.1
> npm 8.19.3
> -----------------------------
> Server Database
> Ubuntu Server 22.04.2 LTS
> PostgreSQL 14.6 (Ubuntu 14.6-0ubuntu0.22.04.1)
>
> The DSpace installation completed without errors.
> The PDF files compared in DSpace 6.3 and DSpace 7.5 are the same and were
> batch imported from a SAF (Simple Archive Format) package, with no import
> errors.
> In dspace.cfg I set the text extractor option textextractor.max-chars = -1.
> All files were then 100% converted to TXT.
>
> Bill, there is no companion setting in discovery.cfg
> (discovery.solr.fulltext.charLimit).
>
> The search problem remains.
>
> Thanks.
>
> Erivelto
>
>
> On Thu, Mar 2, 2023 at 14:27, Bill Tantzen <[email protected]>
> wrote:
>
>> There is a companion setting in discovery.cfg
>> (discovery.solr.fulltext.charLimit) which limits the number of characters
>> that are actually stored in the solr index in the fulltext field;
>> initially, that is also set to 100000 characters.  Simply set this to a
>> higher count, or -1 for unlimited.
>>
>> Hope that helps!
>> ~~Bill
>>
>> On Thu, Mar 2, 2023 at 10:50 AM 'Tim Donohue' via DSpace Community <
>> [email protected]> wrote:
>>
>>> Hi Erivelto,
>>>
>>> I'd recommend looking more closely at the 5 items which were matched in
>>> DSpace 6.3 but not in 7.5.  Is there something in common among those 5
>>> items?  Are the search matches occurring in the metadata of those
>>> items or in the full text?
>>>
>>> If you can narrow things down, it'd be much easier to provide
>>> support/ideas.  There have been a lot of changes in the search engine of
>>> DSpace 7.5... including a move to a later version of Solr.  It's possible
>>> you've found a bug, or it could be a misconfiguration, or simply a change
>>> in the behavior of Solr.  It's difficult to narrow down without more
>>> information about the differences in the results that you are seeing.
>>>
>>> If you can send more information to this list or to your thread on
>>> dspace-tech (as I see you sent the same email to both lists), that might
>>> provide others with more clues as to what might be going on.
>>>
>>> Tim
>>> ------------------------------
>>> *From:* [email protected] <
>>> [email protected]> on behalf of Erivelto Henrique <
>>> [email protected]>
>>> *Sent:* Thursday, March 2, 2023 7:41 AM
>>> *To:* DSpace Community <[email protected]>
>>> *Subject:* [dspace-community] Differences in search result on items
>>> between DSpace 6.3 / DSpace 7.5
>>>
>>> Hi everyone;
>>>
>>> I have a DSpace 6.3 installation and am deploying a new document
>>> repository with version 7.5.
>>> We have already installed version 7.5 on a new server and imported
>>> some documents into it.
>>> We ran some search tests and noticed a large difference in search
>>> results between the two versions.
>>> When I search for a term in version 6.3, I get 14 results, and when I
>>> run the same search in version 7.5, I get only 9.
>>> Version 6.3 search result
>>> [image: Screenshot_8.png]
>>> Search result in version 7.5
>>> [image: Screenshot_10.png]
>>>
>>> Some of the PDF documents are very large, and with the Text Extractor
>>> limit set to 100k characters they were not being 100% converted to
>>> TXT. I changed the setting to textextractor.max-chars = -1, but the
>>> search result remains the same.
>>>
>>> Can anyone help with this?
>>>
>>> Thanks
>>>
>>> Erivelto
>>>
>>> --
>>> All messages to this mailing list should adhere to the Code of Conduct:
>>> https://www.lyrasis.org/about/Pages/Code-of-Conduct.aspx
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "DSpace Community" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/dspace-community/9ba0b682-523b-4709-a6cb-de2284f1f90bn%40googlegroups.com
>>> <https://groups.google.com/d/msgid/dspace-community/9ba0b682-523b-4709-a6cb-de2284f1f90bn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>
>>
>> --
>> Human wheels spin round and round
>> While the clock keeps the pace... -- John Mellencamp
>> ________________________________________________________________
>> Bill Tantzen    University of Minnesota Libraries
>> 612-626-9949 (U of M)    612-325-1777 (cell)
>>
>
