[jira] [Commented] (LUCENE-6212) Remove IndexWriter's per-document analyzer add/updateDocument APIs

Shai Erera (JIRA) Sun, 01 Feb 2015 01:59:48 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14300135#comment-14300135
 ]


Shai Erera commented on LUCENE-6212:
------------------------------------

How do you index multi-lingual documents in one index then? We used to do it by 
picking the right Analyzer for the document's language and calling addDoc(doc, 
langAnalyzer). What's the alternative without that API? Is there an easy one, 
or do we have to add every field to the document with a language-specific 
TokenStream? That is much less convenient, but it is still an alternative.
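
To make the TokenStream alternative concrete, here is a minimal, library-free 
sketch of the pattern: look up an analyzer by the document's language and 
pre-analyze each field before it is handed to the writer. Nothing in it is 
Lucene API. SimpleAnalyzer, ANALYZERS, tokenize and analyzeDocument are 
hypothetical names, and the stop-word lists are toy data. In Lucene the 
corresponding step would presumably be building each field from 
Analyzer.tokenStream(fieldName, text), e.g. via the TextField(String, 
TokenStream) constructor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Library-free model of per-document, per-language analysis.
// Every name below is hypothetical; none of this is Lucene API.
final class MultiLingualFields {

    // Stand-in for an Analyzer: raw text in, list of tokens out.
    interface SimpleAnalyzer extends Function<String, List<String>> {}

    // Lowercase, split on anything that is not a letter or digit, drop stop words.
    static List<String> tokenize(String text, List<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty() && !stopWords.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // One analyzer per language; the stop-word lists are toy examples.
    static final Map<String, SimpleAnalyzer> ANALYZERS = new LinkedHashMap<>();
    static {
        ANALYZERS.put("en", text -> tokenize(text, Arrays.asList("the", "a", "of")));
        ANALYZERS.put("fr", text -> tokenize(text, Arrays.asList("le", "la", "de")));
    }

    // Analyze every field of one document with the analyzer for its language.
    // The result plays the role of the pre-built TokenStream fields.
    static Map<String, List<String>> analyzeDocument(String lang, Map<String, String> fields) {
        SimpleAnalyzer analyzer = ANALYZERS.get(lang);
        if (analyzer == null) {
            throw new IllegalArgumentException("no analyzer for language: " + lang);
        }
        Map<String, List<String>> analyzed = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            analyzed.put(field.getKey(), analyzer.apply(field.getValue()));
        }
        return analyzed;
    }
}
```

The inconvenience is visible even in the sketch: the caller has to run the 
analysis for every field itself instead of passing one analyzer per document, 
and the query side has to apply the same language choice to match the tokens.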

Is it worth having a CHANGES / MIGRATION entry for this? I think if users 
depend on that API for good reasons (i.e. it's not a 'trap' for them), the 
removal should be mentioned somewhere.

> Remove IndexWriter's per-document analyzer add/updateDocument APIs
> ------------------------------------------------------------------
>
>                 Key: LUCENE-6212
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6212
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 5.0, Trunk, 5.1
>
>         Attachments: LUCENE-6212.patch
>
>
> IndexWriter already takes an analyzer up-front (via
> IndexWriterConfig), but it also allows you to specify a different one
> for each add/updateDocument.
> I think this is quite dangerous/trappy since it means you can easily
> index tokens for that document that don't match at search-time based
> on the search-time analyzer.
> I think we should remove this trap in 5.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

