[
https://issues.apache.org/jira/browse/BLUR-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aaron McCurry updated BLUR-61:
------------------------------
Priority: Trivial (was: Major)
> Remove sessions from the code
> -----------------------------
>
> Key: BLUR-61
> URL: https://issues.apache.org/jira/browse/BLUR-61
> Project: Apache Blur
> Issue Type: Bug
> Affects Versions: experimental-dev
> Reporter: Aaron McCurry
> Priority: Trivial
>
> There was a discussion on the mailing list about maintaining sessions in
> the 0.2 code.
> http://mail-archives.apache.org/mod_mbox/incubator-blur-dev/201302.mbox/%3ccag_bhoy3_vdtv1jmfbscu-7mob4i9pm6dlof5di6ousgmpj...@mail.gmail.com%3E
> I would like to remove the need for sessions from the code. I propose that
> we accomplish this by including the segment in the document location
> throughout the API.
> Background: this is really an issue with Lucene and how it deals with
> mutations on the index. Let me provide an example:
> 1. Document A gets added to the index and, let's say, lands in the Lucene
> segment "aa". Through a bit of math it becomes document id 3570586 in the
> overall index, but it is actually document id 304 in the "aa" segment.
> 2. A search gets executed, an index snapshot is created, and Document A is
> reported in the search results as a hit at 3570586.
> 3. Now say that the document id is reported to another system, and later
> that system wants to fetch the data for the hit.
> 4. Now a merge occurs and "aa" is merged with one or more other segments.
> 5. Then the other system asks for document 3570586. A new snapshot of the
> index is created and document id 3570586 is requested from it. It is very
> likely to fetch the wrong document (only by blind luck will it be the right
> one).
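The five steps above can be sketched in a few lines. This is only an illustration of the arithmetic, not Lucene or Blur code: the `resolve` helper, the segment names "a0" and "ac", and the document counts are all assumptions, chosen so that segment "aa" starts at base 3570282 and local id 304 becomes overall id 3570586.

```python
from bisect import bisect_right


def resolve(segments, global_id):
    """Map an index-wide document id to (segment name, id within segment).

    `segments` is an ordered list of (name, doc_count) pairs. Each segment's
    base is the sum of the document counts of the segments before it.
    """
    bases, total = [], 0
    for _name, count in segments:
        bases.append(total)
        total += count
    if not 0 <= global_id < total:
        raise IndexError(global_id)
    i = bisect_right(bases, global_id) - 1
    return segments[i][0], global_id - bases[i]


# Hypothetical snapshot taken by the search in step 2.
before = [("a0", 3570282), ("aa", 1000)]

# Hypothetical snapshot after the merge in step 4: "a0" and "aa" were merged
# into "ac", and 82 deleted documents from "a0" were dropped along the way.
after = [("ac", 3570200 + 1000)]

print(resolve(before, 3570586))  # the hit reported in step 2
print(resolve(after, 3570586))   # the same id looked up in step 5
```

In the first snapshot the id resolves to document 304 of "aa". In the second, Document A has moved to position 3570200 + 304 = 3570504, so the old id 3570586 now lands on some other document.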
> Currently in the Blur 0.2 code we get around this problem by storing the
> index snapshot in a session on each server. So during a session the index
> cannot change.
> Back to my proposed change of adding the segment to the document location.
> The new document location will include [ shard index / segment name /
> document id in the segment (not the overall index document id) ]. On the
> server side, keep old segments around for a certain amount of time after
> their last access, basically an LRU cache. That way, if a segment is deleted
> and another system still asks for data from an old segment, the data can
> still be retrieved.
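A minimal sketch of the proposal, under stated assumptions: `DocLocation` and `RetiredSegments` are hypothetical names, not part of the Blur API, and a retired segment is modeled as a plain mapping from segment-local document id to stored data. Entries expire a fixed time after their last access, as the description suggests; the clock is injectable so the behavior is easy to check.

```python
from collections import OrderedDict
from dataclasses import dataclass
import time


@dataclass(frozen=True)
class DocLocation:
    shard: int    # shard index
    segment: str  # segment name
    doc_id: int   # document id within the segment, not the overall index


class RetiredSegments:
    """Keeps merged-away segments of one shard readable for a while."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # segment name -> (last access time, docs); least recently used first
        self._entries = OrderedDict()

    def _touch(self, name, docs):
        self._entries[name] = (self._clock(), docs)
        self._entries.move_to_end(name)

    def retire(self, name, docs):
        """Register a segment that a merge has just replaced."""
        self._touch(name, docs)

    def fetch(self, loc):
        """Return the stored data for `loc`, or None if the segment expired."""
        now = self._clock()
        # Drop segments whose last access is older than the time-to-live.
        while self._entries:
            name, (last, _docs) = next(iter(self._entries.items()))
            if now - last < self._ttl:
                break
            del self._entries[name]
        entry = self._entries.get(loc.segment)
        if entry is None:
            return None
        docs = entry[1]
        self._touch(loc.segment, docs)
        return docs[loc.doc_id]
```

With this shape, a client that still holds (shard 0, "aa", 304) after "aa" has been merged away gets Document A back as long as it asks within the time-to-live of the last access. Once the entry has expired, `fetch` returns `None` and the client would have to search again.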