[ 
https://issues.apache.org/jira/browse/SOLR-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343254#comment-15343254
 ] 

Trey Grainger commented on SOLR-6492:
-------------------------------------

Hi [~krantiparisa] and [~dannytei1]. Apologies for the long lapse without a 
response on this issue. I won't get into the reasons here (combination of 
personal and professional commitments), but I just wanted to say that I expect 
to pick this issue back up in the near future and continue work on this patch.

In the meantime, I have added an ASL 2.0 license to the current code (from Solr 
in Action) so that folks can feel free to use what's there now: 
https://github.com/treygrainger/solr-in-action/tree/master/src/main/java/sia/ch14

I'll turn what's there now into a patch, update it to Solr trunk, and keep 
iterating on it until the folks commenting on this issue are satisfied with the 
design and capabilities. Stay tuned...

> Solr field type that supports multiple, dynamic analyzers
> ---------------------------------------------------------
>
>                 Key: SOLR-6492
>                 URL: https://issues.apache.org/jira/browse/SOLR-6492
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Trey Grainger
>             Fix For: 5.0
>
>
> A common request - particularly for multilingual search - is to be able to 
> support one or more dynamically-selected analyzers for a field. For example, 
> someone may have a "content" field and pass in a document in Greek (using an 
> Analyzer with Tokenizer/Filters for Greek), a separate document in English 
> (using an English Analyzer), and possibly even a field with mixed-language 
> content in Greek and English. This latter case could pass the content 
> separately through both an analyzer defined for Greek and another Analyzer 
> defined for English, stacking or concatenating the token streams based upon 
> the use-case.
> There are some distinct advantages in terms of index size and query 
> performance which can be obtained by stacking terms from multiple analyzers 
> in the same field instead of duplicating content in separate fields and 
> searching across multiple fields. 
> Other non-multilingual use cases may include things like switching to a 
> different analyzer for the same field to remove a feature (e.g., turning 
> on/off query-time synonyms against the same field on a per-query basis).
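
To make the "stacking or concatenating" distinction in the description above
concrete, here is a minimal sketch in plain Java. It deliberately uses no
Lucene or Solr APIs: the class name, the two toy analyzers, and the
"position:term" output format are all hypothetical and exist only for
illustration. It also assumes each toy analyzer emits exactly one term per
input word, which real analyzers do not guarantee. In Lucene terms, stacking
roughly corresponds to emitting a token with a position increment of 0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical illustration only; this is not the patch attached to the issue.
class MultiAnalyzerSketch {

    // Toy analyzer 1: lowercase and split on whitespace.
    static List<String> lowercase(String text) {
        return List.of(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Toy analyzer 2: lowercase, split, and strip a trailing "s".
    static List<String> stripPlural(String text) {
        List<String> out = new ArrayList<>();
        for (String t : lowercase(text)) {
            out.add(t.length() > 1 && t.endsWith("s") ? t.substring(0, t.length() - 1) : t);
        }
        return out;
    }

    // Stacking: terms from every analyzer share the position of the word
    // they came from. Identical terms at the same position are kept once.
    @SafeVarargs
    static List<String> stack(String text, Function<String, List<String>>... analyzers) {
        List<List<String>> streams = new ArrayList<>();
        int longest = 0;
        for (Function<String, List<String>> analyzer : analyzers) {
            List<String> stream = analyzer.apply(text);
            streams.add(stream);
            longest = Math.max(longest, stream.size());
        }
        List<String> out = new ArrayList<>();
        for (int pos = 0; pos < longest; pos++) {
            for (List<String> stream : streams) {
                if (pos < stream.size()) {
                    String token = pos + ":" + stream.get(pos);
                    if (!out.contains(token)) {
                        out.add(token);
                    }
                }
            }
        }
        return out;
    }

    // Concatenating: each analyzer's terms are appended after the previous
    // analyzer's terms, so positions keep increasing.
    @SafeVarargs
    static List<String> concatenate(String text, Function<String, List<String>>... analyzers) {
        List<String> out = new ArrayList<>();
        int offset = 0;
        for (Function<String, List<String>> analyzer : analyzers) {
            List<String> stream = analyzer.apply(text);
            for (int i = 0; i < stream.size(); i++) {
                out.add((offset + i) + ":" + stream.get(i));
            }
            offset += stream.size();
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "Red Cars";
        System.out.println(stack(text, MultiAnalyzerSketch::lowercase, MultiAnalyzerSketch::stripPlural));
        System.out.println(concatenate(text, MultiAnalyzerSketch::lowercase, MultiAnalyzerSketch::stripPlural));
    }
}
```

In this sketch the stacked form stores three terms for the two-word input,
while the concatenated form stores four, which hints at the index-size
argument made in the description. Real savings depend on the analyzers and
content involved.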



