Doug, I think it would be wonderful if a FieldType had N analyzer chains instead of exactly 3 (index, query, multiTerm). Each chain could simply have a name. The query parser could be configured to pick a particular chain by name.
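To make the named-chain idea concrete, here is a rough sketch, in Python for brevity rather than Solr's Java. Everything in it is hypothetical: `NamedChainFieldType`, the chain names, and the toy synonym table are illustrations only, and none of it is existing Solr or Lucene API. It only shows the lookup a query parser would do, with a fallback to a default chain when no name is given.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for an analyzer chain: text in, tokens out.
Analyzer = Callable[[str], List[str]]

# Toy synonym table, purely illustrative.
SYNONYMS: Dict[str, List[str]] = {"movie": ["film"]}


def plain_chain(text: str) -> List[str]:
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def synonym_chain(text: str) -> List[str]:
    """Like plain_chain, but emit each token followed by its synonyms."""
    tokens: List[str] = []
    for token in plain_chain(text):
        tokens.append(token)
        tokens.extend(SYNONYMS.get(token, []))
    return tokens


class NamedChainFieldType:
    """Hypothetical field type holding any number of named analyzer chains."""

    def __init__(self, chains: Dict[str, Analyzer], default: str) -> None:
        if default not in chains:
            raise ValueError(f"unknown default chain: {default!r}")
        self._chains = dict(chains)
        self._default = default

    def analyzer(self, name: Optional[str] = None) -> Analyzer:
        """Return the chain registered under name, or the default chain if name is None."""
        key = self._default if name is None else name
        if key not in self._chains:
            raise KeyError(f"no analyzer chain named {key!r}")
        return self._chains[key]


# One field type, three named chains; the query parser picks one by name.
text_type = NamedChainFieldType(
    {
        "index": plain_chain,
        "with_synonyms": synonym_chain,
        "without_synonyms": plain_chain,
    },
    default="with_synonyms",
)

default_tokens = text_type.analyzer()("Action Movie")
nosyn_tokens = text_type.analyzer("without_synonyms")("Action Movie")
```

In this sketch the parser needs nothing beyond a chain name per query field, so adding another chain does not require another copy of the field.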
I worked on a search project that had about half a dozen query analyzers, which were also machine generated in code on the custom FieldType. The query parser, also custom, could then communicate with the FieldType to get the particular analyzer that was appropriate for the use. It's annoying (hard to maintain) to see repeated chains that are slightly different. I've wondered if it would be more maintainable to have one chain, with some qualifier on each element to say which named chains it applies to (if not all)? I dunno; trade-offs, trade-offs.

~ David

On Thu, Nov 23, 2017 at 11:03 AM Doug Turnbull <[email protected]> wrote:

> An alternate solution could be to create a fieldType that was a
> "FacadeTextField" that searches a real TextField field with a different
> query time analyzer. I.e., it would not have a physical representation in
> the index, but just provide a handle to a "field" that is searched with a
> different query time analyzer.
>
> For example, actor_nosyn is really a facade for searching "actor" with a
> different analyzer:
>
> <!-- search actor field without synonyms -->
> <field name="actor_nosyn" type="text_nosyn" facadeOf="actor"/>
>
> <!-- searches actor field as normal text field -->
> <field name="actor" type="text" indexed="true" stored="true"/>
>
> <!-- Facade field type that places a different query time analyzer in
>      front of another field -->
> <fieldType name="text_nosyn" class="solr.FacadeTextField">
>   <analyzer type="query">...</analyzer>
> </fieldType>
>
> <!-- fully fledged text field type -->
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="query">...</analyzer>
>   <analyzer type="index">...</analyzer>
> </fieldType>
>
> This would allow edismax and other query parsers to remain unchanged
> when searching, i.e.:
>
> q=action movies&qf=actor actor_nosyn title text&defType=edismax
>
> On Thu, Nov 23, 2017 at 10:50 AM Doug Turnbull <[email protected]> wrote:
>
>> I wonder if there's been any thought by the community to refactoring
>> fieldTypes to allow multiple query-time analyzers per indexed field?
>> Currently, to get different query-time analysis behavior you have to
>> duplicate a field. This is unfortunate duplication if, for example, I want
>> to search a field with query time synonyms on/off. For higher scale search
>> cases, allowing multiple query time analyzers against a single index field
>> can be invaluable. It's one reason I created the Match Query Parser
>> (https://github.com/o19s/match-query-parser) and a major feature of
>> hon-lucene-synonyms (https://github.com/healthonnet/hon-lucene-synonyms).
>>
>> What I would propose is the ability to place multiple analyzers under a
>> field type. For example:
>>
>> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>>   <analyzer type="query" default="true" name="with_synonyms">...</analyzer>
>>   <analyzer type="query" name="without_synonyms">...</analyzer>
>>   <analyzer type="index">...</analyzer>
>> </fieldType>
>>
>> Notice how one query-time analyzer is "default" (and including only one
>> would make it the default).
>>
>> This would require allowing query parsers to pass the analyzer to use at
>> query time. I would propose introducing a syntax for configuring query
>> behavior per-field in edismax. Omitting this would continue to use the
>> default behavior/analyzer.
>>
>> For example, one could query title and text as usual:
>>
>> q=action movies&qf=actor title text&defType=edismax
>>
>> I would propose introducing a syntax whereby qf could refer to a kind of
>> pseudo field, configurable with a syntax similar to per-field facet
>> settings.
>>
>> For example, below "actor_nosyn" and "actor_syn" actually search the same
>> physical field, but are configured with different analyzers:
>>
>> q=action movies&qf=actor_syn actor_nosyn^10 title text&defType=edismax&qf.actor_nosyn.field=actor&qf.actor_nosyn.analyzer=without_synonyms&qf.actor_syn.field=actor&qf.actor_syn.analyzer=with_synonyms
>>
>> Indeed, I would propose extending this syntax to control some of the
>> query-specific properties that currently are tied to the fieldType, such
>> as autoGeneratePhraseQueries:
>>
>> q=action movies&qf=actor_syn actor_nosyn^10 title text&defType=edismax&qf.actor_nosyn.field=actor&qf.actor_nosyn.analyzer=without_synonyms&qf.actor_syn.field=actor&qf.actor_syn.analyzer=with_synonyms&qf.actor_nosyn.autoGeneratePhraseQueries=false
>>
>> I think this could be a pretty powerful syntax, but it would require
>> refactoring the field type and edismax (and possibly other query
>> parsers) quite a bit.
>>
>> Any thoughts?
>>
>> Best
>> -Doug
>> --
>> Consultant, OpenSource Connections. Contact info at
>> http://o19s.com/about-us/doug-turnbull/; Free/Busy
>> (http://bit.ly/dougs_cal)
>
> --
> Consultant, OpenSource Connections. Contact info at
> http://o19s.com/about-us/doug-turnbull/; Free/Busy
> (http://bit.ly/dougs_cal)

--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
