[jira] [Updated] (LUCENE-4880) Difference in offset handling between IndexReader created by MemoryIndex and one created by RAMDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-4880:
    Attachment: LUCENE-4880.patch

Attached is a fix with tests.

Difference in offset handling between IndexReader created by MemoryIndex and one created by RAMDirectory
    Key: LUCENE-4880
    URL: https://issues.apache.org/jira/browse/LUCENE-4880
    Project: Lucene - Core
    Issue Type: Bug
    Components: core/index
    Affects Versions: 4.2
    Environment: Windows 7 (probably irrelevant)
    Reporter: Timothy Allison
    Attachments: LUCENE-4880.patch, MemoryIndexVsRamDirZeroLengthTermTest.java

MemoryIndex skips tokens that have length == 0 when building the index; the result is that it does not increment the token offset (nor does it store the position offsets if that option is set) for tokens of length == 0. A regular index (via, say, RAMDirectory) does not appear to do this.

When using the ICUFoldingFilter, it is possible to have a term of zero length (the \u0640 character separated by spaces). If that occurs in a document, the offsets returned at search time differ between the MemoryIndex and a regular index.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
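The mismatch described in this issue can be illustrated with a small toy model. The sketch below is not Lucene code and does not use any Lucene API: the class ZeroLengthTermSketch, the Token and Posting types, and the skipEmptyTerms flag are all hypothetical names made up for illustration. It assumes a token stream for the text "a", U+0640, "b" in which the middle token has already been folded to an empty term, and compares an indexer that skips empty terms before advancing its counter with one that records every token.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes are Lucene APIs.
public class ZeroLengthTermSketch {

    /** A token as an analysis chain might emit it: term text plus character offsets. */
    static final class Token {
        final String term;
        final int startOffset;
        final int endOffset;

        Token(String term, int startOffset, int endOffset) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** One recorded occurrence of a term: token position plus character offsets. */
    static final class Posting {
        final int position;
        final int startOffset;
        final int endOffset;

        Posting(int position, int startOffset, int endOffset) {
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /**
     * Builds a term -> postings map. When skipEmptyTerms is true, a zero-length
     * term is dropped before the position counter advances, which is the
     * behavior the report attributes to MemoryIndex (an assumption of this model).
     */
    static Map<String, List<Posting>> index(List<Token> tokens, boolean skipEmptyTerms) {
        Map<String, List<Posting>> postings = new LinkedHashMap<>();
        int position = -1;
        for (Token t : tokens) {
            if (skipEmptyTerms && t.term.isEmpty()) {
                continue; // skipped: neither counted nor stored
            }
            position++;
            postings.computeIfAbsent(t.term, k -> new ArrayList<>())
                    .add(new Posting(position, t.startOffset, t.endOffset));
        }
        return postings;
    }

    /** Hand-built tokens for "a", U+0640, "b", assuming the middle token was folded to "". */
    static List<Token> sampleTokens() {
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("a", 0, 1));
        tokens.add(new Token("", 2, 3));
        tokens.add(new Token("b", 4, 5));
        return tokens;
    }

    public static void main(String[] args) {
        List<Token> tokens = sampleTokens();
        Posting skipping = index(tokens, true).get("b").get(0);
        Posting keeping = index(tokens, false).get("b").get(0);
        System.out.println("skip empty terms: b at position " + skipping.position);
        System.out.println("keep empty terms: b at position " + keeping.position);
    }
}
```

In this model the term "b" is recorded at position 1 when empty terms are skipped and at position 2 when they are kept, so anything computed from those values at search time would disagree between the two indexes. Which values actually differ in Lucene 4.2 depends on the MemoryIndex implementation, which this message does not show.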
[jira] [Updated] (LUCENE-4880) Difference in offset handling between IndexReader created by MemoryIndex and one created by RAMDirectory
[ https://issues.apache.org/jira/browse/LUCENE-4880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timothy Allison updated LUCENE-4880:
    Attachment: MemoryIndexVsRamDirZeroLengthTermTest.java

Difference in offset handling between IndexReader created by MemoryIndex and one created by RAMDirectory
    Key: LUCENE-4880
    URL: https://issues.apache.org/jira/browse/LUCENE-4880
    Project: Lucene - Core
    Issue Type: Bug
    Components: core/index
    Affects Versions: 4.2
    Environment: Windows 7 (probably irrelevant)
    Reporter: Timothy Allison
    Attachments: MemoryIndexVsRamDirZeroLengthTermTest.java