[jira] [Updated] (LUCENE-4656) Fix EmptyTokenizer
[ https://issues.apache.org/jira/browse/LUCENE-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand updated LUCENE-4656:
    Attachment: LUCENE-4656.patch

Patch. I wasn't sure whether to add a CharTermAttribute to EmptyTokenizer or to try fixing BaseTokenStreamTestCase, but I couldn't think of a non-trivial tokenizer that wouldn't have a CharTermAttribute, so I left the assertion that checks that a token stream always has a CharTermAttribute.

Fix EmptyTokenizer
    Key: LUCENE-4656
    URL: https://issues.apache.org/jira/browse/LUCENE-4656
    Project: Lucene - Core
    Issue Type: Bug
    Components: modules/analysis
    Reporter: Adrien Grand
    Assignee: Adrien Grand
    Priority: Trivial
    Attachments: LUCENE-4656.patch

TestRandomChains can fail because EmptyTokenizer doesn't have a CharTermAttribute and doesn't compute the end offset (if the offset attribute was added by a filter).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-4656) Fix EmptyTokenizer
Adrien Grand updated LUCENE-4656:
    Attachment: LUCENE-4656.patch

Alternative patch that fixes BaseTokenStreamTestCase. I needed to add a quick hack to add a TermToBytesRefAttribute when the tokenstream doesn't have one so that TermsHashPerField doesn't complain that it can't find this attribute when indexing.

    Attachments: LUCENE-4656.patch, LUCENE-4656.patch
[jira] [Updated] (LUCENE-4656) Fix EmptyTokenizer
Uwe Schindler updated LUCENE-4656:
    Attachment: LUCENE-4656-IW-bug.patch

Here is a patch showing the bug when the public class EmptyTokenStream from analysis-common is used together with IndexWriter. It also has a test that assertTokenStreamContents actually works.

    Attachments: LUCENE-4656-IW-bug.patch, LUCENE-4656.patch, LUCENE-4656.patch
[jira] [Updated] (LUCENE-4656) Fix EmptyTokenizer
Uwe Schindler updated LUCENE-4656:
    Attachment: LUCENE-4656-IW-fix.patch

Here is the fix that solves the DocInverterPerField issue (it also removes the horrible for(;;) loop whose first statement is an if ... break). Now only BaseTokenStreamTestCase still needs to handle the missing attribute. It should *only* complain when tokens are actually emitted.

    Attachments: LUCENE-4656-IW-bug.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656.patch, LUCENE-4656.patch
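
The idea that a test helper should complain about a missing term attribute only when tokens are actually emitted can be sketched roughly as below. This is not the code from BaseTokenStreamTestCase or from any attached patch; `SketchStream`, `LazyCheckSketch`, and the string-keyed attribute map are hypothetical stand-ins that avoid any dependency on Lucene.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, minimal attribute holder; not Lucene's AttributeSource.
final class SketchStream {
    private final Map<String, Object> attributes = new HashMap<>();
    private int remaining;

    SketchStream(int tokenCount, boolean withTermAttribute) {
        this.remaining = tokenCount;
        if (withTermAttribute) {
            attributes.put("term", new StringBuilder());
        }
    }

    boolean hasAttribute(String name) {
        return attributes.containsKey(name);
    }

    // Returns true while tokens remain, false once the stream is exhausted.
    boolean incrementToken() {
        if (remaining == 0) {
            return false;
        }
        remaining--;
        return true;
    }
}

final class LazyCheckSketch {
    // Counts tokens; the attribute check runs per emitted token, so an
    // empty stream without a term attribute passes without complaint.
    static int countTokens(SketchStream stream) {
        int count = 0;
        while (stream.incrementToken()) {
            if (!stream.hasAttribute("term")) {
                throw new IllegalStateException(
                    "token emitted but stream has no term attribute");
            }
            count++;
        }
        return count;
    }
}
```

With this shape, an empty stream that lacks the attribute is accepted, while a stream that emits even one token without it is rejected.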
[jira] [Updated] (LUCENE-4656) Fix EmptyTokenizer
Uwe Schindler updated LUCENE-4656:
    Attachment: LUCENE-4656-IW-fix.patch

Better patch, uses do...while, which is more readable.

    Attachments: LUCENE-4656-IW-bug.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656.patch, LUCENE-4656.patch
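
For readers following the thread, the difference between a for(;;) loop that starts with an if ... break and the do...while form can be sketched as follows. This is an illustration only, not the DocInverterPerField code from the patch; `SimpleStream`, `ListStream`, and `LoopSketch` are hypothetical names, and `incrementToken()` merely mirrors the name of the TokenStream method.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token stream; not a Lucene class.
interface SimpleStream {
    boolean incrementToken(); // advances to the next token, false when exhausted
    String term();            // text of the current token
}

// Hypothetical list-backed stream used only for this illustration.
final class ListStream implements SimpleStream {
    private final Iterator<String> it;
    private String current;

    ListStream(List<String> tokens) {
        this.it = tokens.iterator();
    }

    @Override
    public boolean incrementToken() {
        if (!it.hasNext()) {
            return false;
        }
        current = it.next();
        return true;
    }

    @Override
    public String term() {
        return current;
    }
}

final class LoopSketch {
    // Endless loop whose first statement is the exit test.
    static List<String> consumeWithForever(SimpleStream stream) {
        List<String> seen = new ArrayList<>();
        boolean hasMoreTokens = stream.incrementToken();
        for (;;) {
            if (!hasMoreTokens) break;
            seen.add(stream.term());
            hasMoreTokens = stream.incrementToken();
        }
        return seen;
    }

    // Equivalent do...while form: the per-token body is never entered
    // for a stream that emits no tokens.
    static List<String> consumeWithDoWhile(SimpleStream stream) {
        List<String> seen = new ArrayList<>();
        if (stream.incrementToken()) {
            do {
                seen.add(stream.term());
            } while (stream.incrementToken());
        }
        return seen;
    }
}
```

Both methods visit the same tokens; the second makes it explicit that any per-token work, such as reading a term attribute, happens only after the first successful `incrementToken()` call.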
[jira] [Updated] (LUCENE-4656) Fix EmptyTokenizer
Robert Muir updated LUCENE-4656:
    Attachment: LUCENE-4656_bttc.patch

Here's a patch for BaseTokenStreamTestCase. I think it should work for this EmptyTokenizer too.

    Attachments: LUCENE-4656_bttc.patch, LUCENE-4656-IW-bug.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656.patch, LUCENE-4656.patch
[jira] [Updated] (LUCENE-4656) Fix EmptyTokenizer
Uwe Schindler updated LUCENE-4656:
    Attachment: LUCENE-4656.patch

New patch merged with Adrien's. I am not sure if the fix in BaseTokenStreamTestCase is correct, because if you pass the String[] you expect tokens, and the fix is different from the one for offsets or position increments.

    Attachments: LUCENE-4656_bttc.patch, LUCENE-4656-IW-bug.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656.patch, LUCENE-4656.patch, LUCENE-4656.patch
[jira] [Updated] (LUCENE-4656) Fix EmptyTokenizer
Uwe Schindler updated LUCENE-4656:
    Attachment: LUCENE-4656.patch

Patch merged with Robert's.

    Assignee: Uwe Schindler
    Attachments: LUCENE-4656_bttc.patch, LUCENE-4656-IW-bug.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656.patch, LUCENE-4656.patch, LUCENE-4656.patch, LUCENE-4656.patch
[jira] [Updated] (LUCENE-4656) Fix EmptyTokenizer
Uwe Schindler updated LUCENE-4656:
    Attachment: LUCENE-4656.patch

Add a check that the document is really in IW after indexing.

    Assignee: Uwe Schindler
    Attachments: LUCENE-4656_bttc.patch, LUCENE-4656-IW-bug.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656-IW-fix.patch, LUCENE-4656.patch, LUCENE-4656.patch, LUCENE-4656.patch, LUCENE-4656.patch, LUCENE-4656.patch