magibney commented on issue #892: LUCENE-8972: Add ICUTransformCharFilter, to support pre-tokenizer ICU text transformation URL: https://github.com/apache/lucene-solr/pull/892#issuecomment-564217729 > Regarding further automatic performance optimization, I think that is very ambitious and maybe too much magic. Agreed. The only issue I have with the change you recently posted is the randomization of `assumeExternalUnicodeNormalization`. The tests _shouldn't_ generally pass with that arg randomly flipped, unless accompanied by a complementary, transliterator-specific top-level normalization change. If the tests pass with that arg randomized, that's dependent on the specific transliterator being used, and I'm concerned that even if the tests happen pass as written, randomly flipping that arg in tests conveys the impression that that's something one should be able to generally. Would you object to reverting/removing (from the proposed patch) the randomized form of `getTransliterator(String)` in `TestICUTransformCharFilter`? Could be replaced with additional test variants that _explicitly_ disable bundled normalization (and specifically add corresponding top-level -- i.e., `ICUNormalizer2CharFilter` -- normalization if necessary)?
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org