By and large, storing data will not affect search speed as much as you might 
think. Finding the top N results (say 10) doesn’t use stored data at all. Only 
_after_ that point are the stored fields read, and highlighting runs on just 
those 10 docs.
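
A minimal sketch of a request that keeps highlighting confined to one small 
page of results. The parameters (`q`, `start`, `rows`, `fl`, `hl`, `hl.fl`, 
`hl.snippets`, `hl.fragsize`) are standard Solr query parameters; the host, 
the collection name `documents`, the field name `content` and the helper 
`build_highlight_query` are assumptions made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical host and collection name; adjust to your own setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/documents/select"


def build_highlight_query(user_query, page=0, page_size=10):
    """Build a select URL that highlights only the current page of hits."""
    params = {
        "q": user_query,
        "start": page * page_size,  # paging offset
        "rows": page_size,          # only this many docs get highlighted
        "fl": "id",                 # return just the id, not the full doc
        "hl": "true",
        "hl.fl": "content",         # hypothetical stored field to highlight
        "hl.snippets": 2,
        "hl.fragsize": 150,
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_highlight_query("running", page=2))
```

Keeping `rows` small and paging with `start` bounds the highlighting work 
per request, however large the collection is.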

As far as needing the full doc, Jörn is right, it must be stored. The problem 
is that what’s in the index, aside from being very expensive to use to 
reconstruct the doc (think tens of seconds at least per doc), is lossy. Say 
you stem and one of your words is ‘running’. All that’s in the index is ‘run’, 
so using that to highlight, even if it were fast, wouldn’t be satisfactory.
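
The lossiness can be illustrated with a toy example. This is only a sketch: 
`toy_stem` is a made-up suffix stripper, not the stemmer Solr uses, and the 
dictionary below is a simplified stand-in for an inverted index.

```python
def toy_stem(word):
    """Toy stemmer for illustration only (NOT Solr's real stemming)."""
    w = word.lower()
    for suffix in ("ning", "ing", "s"):
        # Strip the suffix only if at least 3 characters remain.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def build_index(text):
    """Map each stemmed term to the token positions where it occurs."""
    index = {}
    for position, token in enumerate(text.split()):
        index.setdefault(toy_stem(token), []).append(position)
    return index


def reconstruct(index):
    """Rebuild the text from the index alone; original word forms are gone."""
    by_position = {}
    for term, positions in index.items():
        for position in positions:
            by_position[position] = term
    return " ".join(by_position[p] for p in sorted(by_position))


if __name__ == "__main__":
    original = "She was running while he runs"
    index = build_index(original)
    print(index["run"])        # positions of both 'running' and 'runs'
    print(reconstruct(index))  # the stemmed text, not the original
```

Both ‘running’ and ‘runs’ collapse to the single term ‘run’, so the text 
rebuilt from the index no longer matches what the user would expect to see 
highlighted.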

Best,
Erick

> On Mar 21, 2019, at 9:32 AM, Jörn Franke <jornfra...@gmail.com> wrote:
> 
> Hi,
> 
> Then you have to go for the full documents. I recommend reducing the number 
> of returned results, using paging (if it is a web UI) and, if those measures 
> are not enough, splitting the documents across several nodes.
> 
> Best regards 
> 
>> On 21.03.2019 at 17:15, Martin Frank Hansen (MHQ) <m...@kmd.dk> wrote:
>> 
>> Hi Jörn,
>> 
>> Thanks for your answer.
>> 
>> Unfortunately, there is no summary included in the documents, and I would 
>> like it to work for all documents.
>> 
>> Best regards
>> 
>> Martin
>> 
>> 
>> 
>> -----Original Message-----
>> From: Jörn Franke <jornfra...@gmail.com>
>> Sent: 21 March 2019 17:11
>> To: solr-user@lucene.apache.org
>> Subject: Re: highlighter, stored documents and performance
>> 
>> I don’t think so - to highlight any possible query you need the full 
>> document.
>> 
>> You could optimize it by storing only a subset of the document and 
>> highlighting only in this subset.
>> 
>> Alternatively you can store a summary and show only the summary without 
>> highlighting.
>> 
>>> On 21.03.2019 at 17:05, Martin Frank Hansen (MHQ) <m...@kmd.dk> wrote:
>>> 
>>> Hi,
>>> 
>>> I am wondering how highlighting in Solr performs when the number of 
>>> documents gets large?
>>> 
>>> Right now we have about 1 TB of data in all sorts of file types and I was 
>>> wondering how storing these documents within Solr (for highlighting 
>>> purposes) will affect performance?
>>> 
>>> Is it possible to use highlighting without storing the documents?
>>> 
>>> Best regards
>>> 
>>> Martin
>>> 
>>> 
>>> 
>>> 
