[ 
https://issues.apache.org/jira/browse/BLUR-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron McCurry updated BLUR-61:
------------------------------
    Priority: Trivial  (was: Major)

> Remove sessions from the code
> -----------------------------
>
>                 Key: BLUR-61
>                 URL: https://issues.apache.org/jira/browse/BLUR-61
>             Project: Apache Blur
>          Issue Type: Bug
>    Affects Versions: experimental-dev
>            Reporter: Aaron McCurry
>            Priority: Trivial
>
> There was a discussion on the mailing list about maintaining sessions in 
> the 0.2 code.
> http://mail-archives.apache.org/mod_mbox/incubator-blur-dev/201302.mbox/%3ccag_bhoy3_vdtv1jmfbscu-7mob4i9pm6dlof5di6ousgmpj...@mail.gmail.com%3E
> I would like to remove the need for sessions from the code.  I propose that 
> we accomplish this by including the segment in the document location 
> throughout the API.
> Background: this is really an issue with Lucene and how it deals with 
> mutations on the index.  Let me provide an example:
> 1. Document A gets added to the index, and let's say that it lands in the 
> Lucene segment "aa".  Through a bit of math it becomes document id 3570586 
> in the overall index, but it is actually document id 304 in the "aa" 
> segment.
> 2. A search gets executed, an index snapshot is created, and Document A is 
> reported in the search results as a hit at 3570586.
> 3. Now say that the document id is reported to another system, and later 
> that system wants to fetch the data for the hit.
> 4. Now a merge occurs and the "aa" segment is merged with one or more other 
> segments.
> 5. Then the other system asks to fetch document 3570586.  A new snapshot of 
> the index is created and document id 3570586 is requested from it.  But 
> it's very likely that the wrong document will be fetched (only by blind 
> luck will it be the right one).
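
To make the arithmetic in the example concrete, here is a minimal sketch of how an index-wide document id could map onto a segment. It is a toy model, not Lucene's code: the function name `resolve`, the segment names other than "aa", the document counts, and the segment order after the merge are all assumptions chosen so that the numbers match the example above.

```python
from bisect import bisect_right
from itertools import accumulate


def resolve(segments, global_id):
    """Map an index-wide document id to (segment name, id within segment).

    `segments` is an ordered list of (name, doc_count) pairs.  In this toy
    model a segment's base offset is the total doc count of the segments
    that come before it.
    """
    counts = [count for _, count in segments]
    if not 0 <= global_id < sum(counts):
        raise IndexError(global_id)
    bases = [0] + list(accumulate(counts))[:-1]
    i = bisect_right(bases, global_id) - 1  # last segment whose base <= id
    return segments[i][0], global_id - bases[i]


# Hypothetical layout at search time: "aa" starts at offset 3570282.
before_merge = [("a7", 3570282), ("aa", 1000), ("ab", 500)]

# Hypothetical layout after "aa" and "ab" are merged into "ac".
after_merge = [("ac", 1500), ("a7", 3570282)]
```

With these layouts, `resolve(before_merge, 3570586)` gives `("aa", 304)`, while the same id against `after_merge` lands in segment "a7" at a different local id, which is the wrong document.
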
> Currently in the Blur 0.2 code we get around this problem by storing the 
> index snapshot in a session on each server.  So during a session the index 
> that the session sees cannot change.
> Back to my proposed change of adding the segment to the document location.  
> The new document location will include [ shard index / segment name / 
> document id in the segment (not the overall index document id) ].  On the 
> server side we would keep old segments around for a certain amount of time 
> after their last access, basically an LRU cache.  That way, if a segment is 
> deleted and another system still asks for data from that old segment, the 
> data can still be retrieved.
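
A rough sketch of the proposed document location and of the server-side retention of old segments follows. It only illustrates the idea and is not Blur code: `DocumentLocation`, `RetiredSegmentCache`, the `ttl` parameter, and the use of a plain dict to stand in for a segment's stored documents are all hypothetical. Lookups against live segments are left out.

```python
import time
from collections import OrderedDict, namedtuple

# Hypothetical type; the three fields follow the proposal above.
DocumentLocation = namedtuple("DocumentLocation", "shard segment doc_id")


class RetiredSegmentCache:
    """Keeps merged-away segments readable until idle for `ttl` seconds."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        # (shard, segment) -> (last_access, docs), least recently used first
        self._entries = OrderedDict()

    def retire(self, shard, segment, docs):
        """Register a segment that a merge has just replaced."""
        key = (shard, segment)
        self._entries[key] = (self.clock(), docs)
        self._entries.move_to_end(key)

    def fetch(self, loc):
        """Return the document at `loc`; KeyError if the segment expired."""
        self.evict_idle()
        key = (loc.shard, loc.segment)
        _, docs = self._entries[key]
        self._entries[key] = (self.clock(), docs)  # refresh last access
        self._entries.move_to_end(key)
        return docs[loc.doc_id]

    def evict_idle(self):
        """Drop segments whose last access is at least `ttl` seconds old."""
        now = self.clock()
        while self._entries:
            key, (last_access, _) = next(iter(self._entries.items()))
            if now - last_access < self.ttl:
                break
            del self._entries[key]
```

Because the location names the segment and the id within it, a fetch for `DocumentLocation(0, "aa", 304)` keeps returning Document A after a merge for as long as "aa" stays in the cache, and each access pushes its expiry back.
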



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
