Taewoo Kim created ASTERIXDB-2443:
-------------------------------------

             Summary: The current word tokenizer is too restricted.
                 Key: ASTERIXDB-2443
                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2443
             Project: Apache AsterixDB
          Issue Type: Improvement
            Reporter: Taewoo Kim
            Assignee: Taewoo Kim

The current word tokenizer is too restrictive. It treats every character other than the ASCII alphanumeric characters (A-Za-z0-9) as a delimiter. As a consequence, all international (non-ASCII) characters are treated as delimiters, so words that contain them are split at those characters or produce no tokens at all.
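
The behavior described above can be illustrated with a minimal sketch. This is not the actual AsterixDB tokenizer code: the class name `AsciiTokenizerSketch` and the methods `isAsciiAlphanumeric` and `tokenize` are hypothetical, and the sketch assumes only the delimiter rule stated in this issue (no lowercasing or other normalization).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the delimiter rule described in this issue.
public class AsciiTokenizerSketch {

    // True only for A-Z, a-z and 0-9; every other character counts as a delimiter.
    static boolean isAsciiAlphanumeric(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
    }

    // Splits the input into maximal runs of ASCII alphanumeric characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isAsciiAlphanumeric(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A delimiter ends the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("big data 2018"));      // prints [big, data, 2018]
        // "caf\u00e9" is "cafe" with an accented final letter: the accent is dropped.
        System.out.println(tokenize("caf\u00e9"));          // prints [caf]
        // Three Hangul syllables: every character is a delimiter, so no tokens remain.
        System.out.println(tokenize("\uB370\uC774\uD130")); // prints []
    }
}
```

Under this rule, a word with an accented letter is truncated and a word written entirely in a non-Latin script disappears, which is the restriction this issue reports.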