Hi,
Thanks for the responses.

It's a soft boundary, determined by the dynamic query syntax of our 
application, so it may vary between user searches: one user can search for 
"word1" in the first 30 words, and another can search for "word2" in the 
first 10 words. The use case is to match terms/phrases in specific places in 
the document in order to identify scripts/specific word occurrences.
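
To make the requirement concrete, here is a minimal sketch of the matching 
semantics we are after. It is plain Python, not Solr code; the function name is 
made up for illustration, and splitting on whitespace is only an assumption 
about what counts as a word.

```python
def term_in_window(text, term, n, from_end=False):
    """Return True if `term` is one of the first (or last) `n` words of `text`.

    Assumption: words are whitespace-separated and matching is
    case-insensitive. A real analyzer would tokenize differently.
    """
    if n <= 0:
        return False
    words = text.lower().split()
    # Take the leading or trailing slice of n words, depending on the query.
    window = words[-n:] if from_end else words[:n]
    return term.lower() in window


doc = "hello and welcome to the support line how can I help you today"

# Each search supplies its own window size, which is why the boundary is soft.
print(term_in_window(doc, "welcome", 3))               # first 3 words
print(term_in_window(doc, "support", 3))               # first 3 words
print(term_in_window(doc, "today", 2, from_end=True))  # last 2 words
```

The point is that the window size `n` arrives with each query, so it cannot be 
baked into the index.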

So I guess a copy field won't work here, since the cutoff would have to be 
fixed at index time.

Any other suggestions/thoughts?
Maybe some hidden position filters at the native level to limit matches to the 
start/end of the document?

Thanks,
Adi

-----Original Message-----
From: Tim Casey <tca...@gmail.com>
Sent: Tuesday, October 15, 2019 11:05 PM
To: solr-user@lucene.apache.org
Subject: Re: Position search

If this is about a normalized query, I would put the normalization text into a 
specific field.  The reason for this is you may want to search the overall text 
during any form of expansion phase of searching for data.
That is, maybe you want to know the context of up to the 120th word.  At least 
you have both.
Also, you may want to note which normalized fields were truncated and which 
were simply too short to need it. This would give some guidance as to the bias 
of the normalization.  If 95% of the fields were not truncated, there is a 
chance you are not normalizing well because you have a set of particularly 
short messages.  So I would expect a small set of side fields recording this.  
This would allow you to carry the measures along with the data.
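
As a rough sketch of the side fields I mean: the field names below are 
hypothetical, the 100-word limit is just the number from the original question, 
and whitespace splitting is an assumption about what a word is. This would run 
on the client before the document is sent for indexing.

```python
def build_head_fields(text, limit=100):
    """Build a normalized 'head' field plus side fields describing it.

    All field names here are made up for illustration.
    """
    words = text.split()  # assumption: whitespace-separated words
    return {
        "text": text,                              # full text, kept for expansion
        "text_head": " ".join(words[:limit]),      # normalized (trimmed) text
        "text_head_truncated": len(words) > limit, # was anything cut off?
        "text_word_count": len(words),             # original length in words
    }


def truncated_share(docs):
    """Fraction of documents whose head field was actually truncated."""
    if not docs:
        return 0.0
    return sum(d["text_head_truncated"] for d in docs) / len(docs)


docs = [
    build_head_fields("a b c d e", limit=3),
    build_head_fields("a b", limit=3),
]
print(docs[0]["text_head"])
print(truncated_share(docs))
```

A low truncated share would be the hint that most messages are shorter than 
the limit.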

tim

On Tue, Oct 15, 2019 at 12:19 PM Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> Is the 100 words a hard boundary or a soft one?
>
> If it is a hard one (always 100 words), the easiest is probably copy
> field and in the (unstored) copy, trim off whatever you don't want to
> search. Possibly using regular expressions. Of course, "what's a word"
> is an important question here.
>
> Similarly, you could do that with Update Request Processors and
> clone/process field even before it hits the schema. Then you could
> store the extract for highlighting purposes.
>
> Regards,
>    Alex.
>
> On Tue, 15 Oct 2019 at 02:25, Kaminski, Adi <adi.kamin...@verint.com>
> wrote:
> >
> > Hi,
> > What's the recommended way to search in Solr (assuming 8.2 is used) for
> > specific terms/phrases/expressions while limiting the search by position?
> > For example, to search only in the first/last 100 words of the document?
> >
> > Is there any built-in functionality for that ?
> >
> > Thanks in advance,
> > Adi
> >
> >
> > This electronic message may contain proprietary and confidential
> information of Verint Systems Inc., its affiliates and/or
> subsidiaries. The information is intended to be for the use of the
> individual(s) or
> entity(ies) named above. If you are not the intended recipient (or
> authorized to receive this e-mail for the intended recipient), you may
> not use, copy, disclose or distribute to anyone this message or any
> information contained in this message. If you have received this
> electronic message in error, please notify us by replying to this e-mail.
>


