+1 to make this less trappy.

It looks like KoreanPartOfSpeechStopFilterFactory will fall back to the
default stop tags if no args are provided.  I think we should indeed make
JapanesePartOfSpeechStopFilterFactory consistent.

Maybe we fix this only in the next major release (9.0), add an entry to
MIGRATE.txt explaining the change, and go with option 2?  And possibly
option 1 for 8.x releases?  (Or maybe we don't fix it in 8.x at all... not
sure.)
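
To make the difference between the two options concrete, here is a rough,
self-contained sketch. It is not Lucene code: the class StopTagArgs, the
MissingTagsPolicy enum, and the placeholder default tags are all made up for
illustration, and treating "tags" as a comma-separated list is an assumption
of the example, not a description of how the real factory reads it.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the factory's argument handling; not Lucene code.
final class StopTagArgs {

    // Option 1 = THROW, option 2 = LOAD_DEFAULTS.
    enum MissingTagsPolicy { THROW, LOAD_DEFAULTS }

    // Placeholder values only; the real defaults ship in the Kuromoji analyzer JAR.
    static final Set<String> DEFAULT_STOP_TAGS =
        Set.of("placeholder-tag-a", "placeholder-tag-b");

    // Decide which stop tags to use when the "tags" argument may be absent.
    static Set<String> resolve(Map<String, String> args, MissingTagsPolicy policy) {
        String tags = args.get("tags");
        if (tags == null || tags.trim().isEmpty()) {
            if (policy == MissingTagsPolicy.THROW) {
                // Option 1: fail loudly instead of silently filtering nothing.
                throw new IllegalArgumentException("Missing required argument: tags");
            }
            // Option 2: fall back to the default stop tags.
            return DEFAULT_STOP_TAGS;
        }
        // Assumption for this sketch: explicit tags are comma-separated.
        Set<String> result = new LinkedHashSet<>();
        for (String tag : tags.split(",")) {
            String trimmed = tag.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }
}
```

Either policy removes the current trap, where empty args quietly produce a
filter that does nothing. The only question is whether the user finds out
via an exception or via a changed token stream.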

Mike McCandless

http://blog.mikemccandless.com


On Fri, Oct 2, 2020 at 12:10 PM Michael Froh <msf...@gmail.com> wrote:

> I am currently working on migrating a project from an old version of Solr
> to Elasticsearch, and came across a funny (to me at least) difference in
> the "default" behavior of JapanesePartOfSpeechStopFilterFactory.
>
> If JapanesePartOfSpeechStopFilterFactory is given empty args, it does
> nothing. It doesn't load any stop tags, and just passes along the
> TokenStream passed to create(). (By comparison, the Elasticsearch filter
> will default to loading the stop tags shipped in the Kuromoji analyzer
> JAR.) So, for many years, my project was not using
> JapanesePartOfSpeechStopFilter, when I thought that it was.
>
> I would like to create an issue and submit a patch, in case other users
> out there are failing to use the filter factory correctly, but I'm not sure
> what the best approach is, between:
>
> 1. If someone doesn't specify the tags argument, then throw an exception
> (because the user probably doesn't know what they're doing).
> 2. If someone doesn't specify the tags argument, then load the default
> stop tags (like JapaneseAnalyzer does).
>
> I would lean more toward 1, to avoid a silent change in behavior.
>
