[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-13 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929036#comment-16929036 ] ASF subversion and git services commented on LUCENE-8966: Commit

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-13 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929035#comment-16929035 ] ASF subversion and git services commented on LUCENE-8966: Commit

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-13 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929032#comment-16929032 ] ASF subversion and git services commented on LUCENE-8966: Commit

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-11 Thread Namgyu Kim (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927933#comment-16927933 ] Namgyu Kim commented on LUCENE-8966: Oh, thank you for your reply, [~jim.ferenczi] :D I checked

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-09 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16925588#comment-16925588 ] Jim Ferenczi commented on LUCENE-8966: I don't think it's a bug [~danmuzi] or at least that it's

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-07 Thread Namgyu Kim (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924773#comment-16924773 ] Namgyu Kim commented on LUCENE-8966: But there is a bug I just checked :( Input: "4..4사이즈"

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-07 Thread Namgyu Kim (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924768#comment-16924768 ] Namgyu Kim commented on LUCENE-8966: Good job! [~jim.ferenczi] :D It can be serious enough for Nori

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-06 Thread Mike Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924206#comment-16924206 ] Mike Sokolov commented on LUCENE-8966: > For complex number grouping and normalization, Namgyu Kim

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Lucene/Solr QA (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923404#comment-16923404 ] Lucene/Solr QA commented on LUCENE-8966: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923394#comment-16923394 ] Jim Ferenczi commented on LUCENE-8966: > Would you consider grouping numbers and (at least

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Mike Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923378#comment-16923378 ] Mike Sokolov commented on LUCENE-8966: Would you consider grouping numbers and (at least some)

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923357#comment-16923357 ] Jim Ferenczi commented on LUCENE-8966: Thanks for looking [~thetaphi]. These two private static

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923335#comment-16923335 ] Uwe Schindler commented on LUCENE-8966: Also you are using a private isDigit() at one place, the

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923332#comment-16923332 ] Uwe Schindler commented on LUCENE-8966: isNumber() is dead code? > KoreanTokenizer should split

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Lucene/Solr QA (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923279#comment-16923279 ] Lucene/Solr QA commented on LUCENE-8966: | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote

[jira] [Commented] (LUCENE-8966) KoreanTokenizer should split unknown words on digits

2019-09-05 Thread Jim Ferenczi (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923222#comment-16923222 ] Jim Ferenczi commented on LUCENE-8966: Here is a patch that breaks unknown words on digits instead