Hi all,
We are using the UAX29URLEmailTokenizer so we can use the token types in our plugins. However, I noticed that the tokenizer is not detecting certain URLs as <URL> but <ALPHANUM> instead. Examples that are not working: example.com is <ALPHANUM> example.net is <ALPHANUM> But: https://example.com is <URL> as is https://example.net. Examples that work: example.ch is <URL> example.co.uk is <URL> example.nl is <URL> I have checked the JIRA, and could not find an issue. I have tested this on Lucene (Solr) 6.4.1 and 7.3. Could someone confirm my findings and advise what I could do to (help) resolve this issue? /JZ <https://example.com>