[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-08-13 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906844#comment-16906844 ] ASF subversion and git services commented on LUCENE-8933: - Commit

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-08-13 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906841#comment-16906841 ] ASF subversion and git services commented on LUCENE-8933: - Commit

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-08-13 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906820#comment-16906820 ] ASF subversion and git services commented on LUCENE-8933: - Commit

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-08-13 Thread ASF subversion and git services (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906819#comment-16906819 ] ASF subversion and git services commented on LUCENE-8933: - Commit

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-27 Thread Tomoko Uchida (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894349#comment-16894349 ] Tomoko Uchida commented on LUCENE-8933: --- I opened pull requests. Could you review them? For

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-27 Thread Tomoko Uchida (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16894298#comment-16894298 ] Tomoko Uchida commented on LUCENE-8933: --- {quote}Should we go further and check that the

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-26 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893677#comment-16893677 ] Jim Ferenczi commented on LUCENE-8933: -- {quote} Should we go further and check that the

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-26 Thread Adrien Grand (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893662#comment-16893662 ] Adrien Grand commented on LUCENE-8933: -- Should we go further and check that the concatenation of

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-26 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893478#comment-16893478 ] Jim Ferenczi commented on LUCENE-8933: -- {quote} If there are no other opinions or objections, I'd

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-25 Thread Tomoko Uchida (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893288#comment-16893288 ] Tomoko Uchida commented on LUCENE-8933: --- Just for clarification, let me wrap up the problem here.

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-25 Thread Tomoko Uchida (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892941#comment-16892941 ] Tomoko Uchida commented on LUCENE-8933: --- [~danmuzi]: thanks for confirmation, sorry it relates

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-25 Thread Namgyu Kim (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892876#comment-16892876 ] Namgyu Kim commented on LUCENE-8933: Great analysis! :D I checked KoreanTokenizer and there is no

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-25 Thread Tomoko Uchida (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892709#comment-16892709 ] Tomoko Uchida commented on LUCENE-8933: --- If there are no other opinions or objections, I'd like to

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-25 Thread Tomoko Uchida (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892708#comment-16892708 ] Tomoko Uchida commented on LUCENE-8933: --- Thanks for your explanation and investigation, I agree

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-25 Thread Jim Ferenczi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892671#comment-16892671 ] Jim Ferenczi commented on LUCENE-8933: -- The first argument of the dictionary rule is the original

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-25 Thread Adrien Grand (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892644#comment-16892644 ] Adrien Grand commented on LUCENE-8933: -- Ah, thanks for digging [~tomoko] and [~danmuzi]. I

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-25 Thread Tomoko Uchida (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892574#comment-16892574 ] Tomoko Uchida commented on LUCENE-8933: --- The surrogate pair Emoji character  included in the

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-25 Thread Tomoko Uchida (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892562#comment-16892562 ] Tomoko Uchida commented on LUCENE-8933: --- This pure Lucene code produces a very similar error and

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-24 Thread Tomoko Uchida (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892126#comment-16892126 ] Tomoko Uchida commented on LUCENE-8933: --- [~danmuzi]: thank you for telling me about the issue. The

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-24 Thread Namgyu Kim (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892041#comment-16892041 ] Namgyu Kim commented on LUCENE-8933: Elasticsearch Issue Link : 

[jira] [Commented] (LUCENE-8933) JapaneseTokenizer creates Token objects with corrupt offsets

2019-07-24 Thread Tomoko Uchida (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-8933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891940#comment-16891940 ] Tomoko Uchida commented on LUCENE-8933: --- I once encountered similar errors but it's not related to