bq: this is not a typical case that one searches for a keyword but
highlights something else

This isn't really an unusual case; apparently I misled you.

What I was trying to convey is that the analysis chain used is firmly
attached to a particular _field_. There's no way to say "use one
analysis chain for the query and another for highlighting on the
_same_ field".

You can use two different fields with different analysis chains, one
for each purpose. So something like

q=f1:something&hl.fl=f2,f3&hl.q=other

is certainly reasonable. It'll search for "something" in f1 and
highlight "other" in f2 and f3.

Each field processes its input with the analysis chain defined for it in the schema.
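
For anyone who wants to try the request above from a script, here is a
minimal sketch in Python that only builds the URL. The host, port and
the "trans" core are taken from the original question; f1, f2 and f3 are
placeholder field names, build_highlight_url is a hypothetical helper,
and hl=true is added on the assumption that highlighting is not already
switched on in the request handler defaults.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Assumed for illustration: a local Solr on the default port and the
# "trans" core from the original question.
SOLR_SELECT = "http://localhost:8983/solr/trans/select"


def build_highlight_url(search_field, search_term, hl_fields, hl_term):
    """Build a select URL that searches one field and highlights a
    different term in other fields (hypothetical helper)."""
    params = {
        "q": f"{search_field}:{search_term}",  # what to search for
        "hl": "true",                          # assumption: enable highlighting explicitly
        "hl.fl": ",".join(hl_fields),          # fields to highlight in
        "hl.q": hl_term,                       # what to highlight
    }
    # urlencode percent-encodes reserved and non-ASCII characters (UTF-8).
    return f"{SOLR_SELECT}?{urlencode(params)}"


url = build_highlight_url("f1", "something", ["f2", "f3"], "other")
print(url)
```

The helper does no analysis of its own: whatever is passed as hl.q still
goes through the analysis chain of each field listed in hl.fl, as
described above.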

The rest about stored="true" can be ignored; it's just me wandering
off into the weeds about an optimization that stores the data only
once rather than redundantly in multiple fields.

Best,
Erick

On Fri, Mar 23, 2018 at 4:37 AM, Arturas Mazeika <maze...@gmail.com> wrote:
> Hi Mathesis (Stefan),
>
> Thanks for the questions. This made me look at the problem from a distance
> and re-frame the situation. Good questions indeed.
>
> Trying to go around: consider a user who describes herself as being a BMW
> fan, being convinced that all BMW need to be the blackest color possible
> (for a sake of argument) who would like to search and later browse the
> entries in the discussion forum (of course not everything but BMW of the
> blackest color), and what interest her are the snippets that have
> understood, craziest as keywords or the like (because she is looking for a
> dozen of discussions that she saw before).
>
> What I was not able to achieve so far is: (i) combine query term for
> filtering and highlighting, (ii) using the analyzer-chain from the
> attribute to rewrite the highlight query (or define one in the search)
>
> CTRL+F technique is a very powerful one, indeed. Works most of the time. The
> difficulties with it are query rewriting, enriching, etc.
>
> Cheers,
> Arturas
>
> On Fri, Mar 23, 2018 at 11:29 AM, Stefan Matheis <matheis.ste...@gmail.com>
> wrote:
>
>> Perhaps we try it the other way round .. what's your use case for this? I'm
>> trying to think of a situation where I'd need this as a user?
>>
>> The only reason I see myself doing this is CTRL+F in a page when the search
>> result is not  immediately visible for me ;)
>>
>> On Mar 23, 2018 9:41 AM, "Arturas Mazeika" <maze...@gmail.com> wrote:
>>
>> > Hi Erick et al,
>> >
>> > From your answer I understand that this is not a typical case that one
>> > searches for a keyword but highlights something else. Since we have two
>> > parameters (q vs hl.q) I thought they are freely combinable. From your
>> > answer I understand that this is not really the case. My current
>> > understanding came from [1] that says:
>> >
>> > hl.q
>> >
>> > A query to use for highlighting. This parameter allows you to highlight
>> > different terms than those being used to retrieve documents.
>> >
>> > What I hear from you is something different: i.e., that this is not enough
>> > just to combine the q with hl.q, that there are caveats to achieve the task
>> > (multiple fields, FastVectorHighlighter).
>> >
>> > Your infos are very helpful.
>> >
>> > Cheers,
>> > Arturas
>> >
>> > [1]  https://lucene.apache.org/solr/guide/7_2/highlighting.html
>> >
>> > On Thu, Mar 22, 2018 at 4:07 PM, Erick Erickson <erickerick...@gmail.com>
>> > wrote:
>> >
>> > > Basically you need to use a copyField, but in several variants:
>> > >
>> > > If you use the field _exclusively_ for highlighting then store the raw
>> > > content there and have the field use whatever analyzer you want. You
>> > > do _not_ need to have indexed="true" set for the field if you're
>> > > highlighting on the fly. So you're searching against field1 (which has
>> > > indexed="true" stored="false" set) but highlighting against field2
>> > > (which has indexed="false" stored="true" set). Of course any time you
>> > > want to return the contents in a doc your fl needs to specify
>> > > field2...
>> > >
>> > > The above does not bloat your index at all since the cost of
>> > > stored="true" indexed="true" is the same as if you use two fields,
>> > > each with only one option turned on.
>> > >
>> > > The second approach if you want to use FastVectorHighlighter or the
>> > > like is simply to index both fields.
>> > >
>> > > Best,
>> > > Erick
>> > >
>> > > On Thu, Mar 22, 2018 at 2:18 AM, Arturas Mazeika <maze...@gmail.com>
>> > > wrote:
>> > > > Hi Solr-Users,
>> > > >
>> > > > I've been playing with a German collection of documents, where I tried to
>> > > > search for one word (q=Tag) and highlighted another: (hl.q=Kundigung). Is
>> > > > this a "legal" use case? My key question is how can I tell Solr which query
>> > > > analyzer to use for highlighting? Strictly speaking, I should use
>> > > > hl.q=Kündigung to conceptually look for relevant information, but in this
>> > > > case, no highlighting is returned (as all umlauts are left out in the
>> > > > index).
>> > > >
>> > > > Additional infos:
>> > > >
>> > > > solr version: 7.2
>> > > > urls to query:
>> > > >
>> > > > http://localhost:8983/solr/trans/select?q=trans:Zeit&hl=true&hl.fl=trans&hl.q=Kundigung&hl.snippets=3&wt=xml&rows=1
>> > > >
>> > > > http://localhost:8983/solr/trans/select?q=trans:Zeit&hl=true&hl.fl=trans&hl.q=K%C3%BCndigung&hl.snippets=3&wt=xml&rows=1
>> > > >
>> > > > Managed-schema:
>> > > >
>> > > >   <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
>> > > >     <analyzer>
>> > > >       <tokenizer class="solr.StandardTokenizerFactory"/>
>> > > >       <filter class="solr.LowerCaseFilterFactory"/>
>> > > >       <filter class="solr.StopFilterFactory" format="snowball" words="lang/stopwords_de.txt" ignoreCase="true"/>
>> > > >       <filter class="solr.GermanNormalizationFilterFactory"/>
>> > > >       <filter class="solr.GermanLightStemFilterFactory"/>
>> > > >     </analyzer>
>> > > >   </fieldType>
>> > > >
>> > > >
>> > > > Other additional infos:
>> > > > https://stackoverflow.com/questions/49276093/solr-highlighting-terms-with-umlaut-not-found-not-highlighted
>> > > >
>> > > > Cheers,
>> > > > Arturas
>> > >
>> >
>>
