Timothy Allison created LUCENE-4880:
---------------------------------------

             Summary: Difference in offset handling between IndexReader created 
by MemoryIndex and one created by RAMDirectory
                 Key: LUCENE-4880
                 URL: https://issues.apache.org/jira/browse/LUCENE-4880
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/index
    Affects Versions: 4.2
         Environment: Windows 7 (probably irrelevant)
            Reporter: Timothy Allison
         Attachments: MemoryIndexVsRamDirZeroLengthTermTest.java

MemoryIndex skips tokens that have length == 0 when building the index.  As a 
result, it does not increment the token offset for such tokens, nor does it 
store their position offsets if that option is set.  A regular index (built 
in, say, a RAMDirectory) does not appear to skip them.

When using the ICUFoldingFilter, it is possible to end up with a term of zero 
length (for example, the \u0640 character separated by spaces).  If that 
occurs in a document, the offsets returned at search time differ between the 
MemoryIndex and a regular index.
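
The mismatch can be illustrated with a small toy model in plain Java.  This is 
only a sketch and not Lucene code: the class ZeroLengthTermPositions and the 
methods fold() and index() are hypothetical names, fold() is a crude stand-in 
for ICUFoldingFilter that only strips U+0640, and it assumes that the "token 
offset" above refers to the token position.  The attached test exercises the 
real classes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the behavior described in this report. NOT Lucene code;
// every name here is hypothetical and exists only for illustration.
public class ZeroLengthTermPositions {

    // Crude stand-in for a folding filter: strips U+0640 (ARABIC TATWEEL),
    // so a token made up only of that character folds to an empty term.
    static String fold(String token) {
        return token.replace("\u0640", "");
    }

    // Splits on whitespace and returns "term@position" entries.
    // skipEmptyTerms == true models the reported MemoryIndex behavior:
    // an empty term is dropped before the position counter advances.
    // skipEmptyTerms == false models a regular index, which is assumed
    // here to count every token, including empty ones.
    static List<String> index(String text, boolean skipEmptyTerms) {
        List<String> postings = new ArrayList<>();
        int position = -1;
        for (String raw : text.trim().split("\\s+")) {
            String term = fold(raw);
            if (skipEmptyTerms && term.isEmpty()) {
                continue; // position is never advanced for this token
            }
            position++;
            postings.add(term + "@" + position);
        }
        return postings;
    }

    public static void main(String[] args) {
        String doc = "a \u0640 b";
        System.out.println("regular index model: " + index(doc, false));
        System.out.println("MemoryIndex model:   " + index(doc, true));
    }
}
```

In this model the term "b" lands at position 2 when every token is counted 
and at position 1 when the empty term is skipped first, which is the kind of 
disagreement described above.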

