Taewoo Kim created ASTERIXDB-2443:
-------------------------------------

             Summary: The current word tokenizer is too restricted.
                 Key: ASTERIXDB-2443
                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2443
             Project: Apache AsterixDB
          Issue Type: Improvement
            Reporter: Taewoo Kim
            Assignee: Taewoo Kim


The current word tokenizer is too restrictive. It treats every character 
other than ASCII alphanumerics (A-Za-z0-9) as a delimiter. As a consequence, 
all international (non-ASCII) characters are treated as delimiters, so text 
written in those scripts produces broken tokens or no tokens at all. 
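
To illustrate the behavior described above, here is a minimal sketch in
Python. It is not AsterixDB's tokenizer code; the function names
(tokenize, is_ascii_alnum, is_unicode_alnum) are hypothetical and exist
only for this example. It contrasts an ASCII-only delimiter rule with one
possible relaxation that accepts any Unicode letter or digit.

```python
def is_ascii_alnum(ch):
    # Restricted rule from the issue: only A-Za-z0-9 are token characters.
    return ch.isascii() and ch.isalnum()


def is_unicode_alnum(ch):
    # One possible relaxation (an assumption, not the project's chosen fix):
    # any Unicode letter or digit is a token character.
    return ch.isalnum()


def tokenize(text, is_token_char):
    """Split text into tokens, treating every non-token character as a delimiter."""
    tokens, current = [], []
    for ch in text:
        if is_token_char(ch):
            current.append(ch)
        elif current:
            # A delimiter ends the token being built.
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


sample = "Straße 데이터 db2"
ascii_tokens = tokenize(sample, is_ascii_alnum)
unicode_tokens = tokenize(sample, is_unicode_alnum)
print(ascii_tokens)
print(unicode_tokens)
```

Under the ASCII-only rule, the German word is split at the "ß" and the
Korean word disappears entirely, because every one of its characters is
treated as a delimiter. Under the relaxed rule, all three words survive as
whole tokens.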



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
