[jira] Commented: (LUCENE-2575) Concurrent byte and int block implementations

Jason Rutherglen (JIRA) Sun, 26 Sep 2010 22:44:59 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915130#action_12915130
 ]


Jason Rutherglen commented on LUCENE-2575:
------------------------------------------

Here are the new parallel arrays. It looks like there are too many, as if
something went wrong, but I think each one is required.

{code}
final int[] skipStarts; // address where the term's skip list starts (for reading)
final int[] skipAddrs; // where writing left off
final int[] sliceAddrs; // the start addr of the last posting slice
final byte[] sliceLevels; // posting slice levels
final int[] skipLastDoc; // last skip doc written
final int[] skipLastAddr; // last skip addr written
{code}
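
As a rough illustration of how arrays like these might be used together, here
is a minimal sketch, not the code from the patch. It assumes the arrays are
indexed by a term ID and that, because the fields are final, growing means
building a larger copy. The class name SkipPostingsArrays and the methods
size() and grow() are hypothetical.

```java
import java.util.Arrays;

// Illustrative sketch only: per-term skip/slice state kept in parallel
// arrays, where index i in every array refers to the same term (assumption).
final class SkipPostingsArrays {
    final int[] skipStarts;   // address where the term's skip list starts (for reading)
    final int[] skipAddrs;    // where writing left off
    final int[] sliceAddrs;   // the start addr of the last posting slice
    final byte[] sliceLevels; // posting slice levels
    final int[] skipLastDoc;  // last skip doc written
    final int[] skipLastAddr; // last skip addr written

    SkipPostingsArrays(int size) {
        this(new int[size], new int[size], new int[size],
             new byte[size], new int[size], new int[size]);
    }

    private SkipPostingsArrays(int[] skipStarts, int[] skipAddrs, int[] sliceAddrs,
                               byte[] sliceLevels, int[] skipLastDoc, int[] skipLastAddr) {
        this.skipStarts = skipStarts;
        this.skipAddrs = skipAddrs;
        this.sliceAddrs = sliceAddrs;
        this.sliceLevels = sliceLevels;
        this.skipLastDoc = skipLastDoc;
        this.skipLastAddr = skipLastAddr;
    }

    // Number of term slots currently available.
    int size() {
        return skipStarts.length;
    }

    // The fields are final, so growing returns a new, larger instance
    // whose arrays start with a copy of the existing values.
    SkipPostingsArrays grow(int newSize) {
        return new SkipPostingsArrays(
            Arrays.copyOf(skipStarts, newSize),
            Arrays.copyOf(skipAddrs, newSize),
            Arrays.copyOf(sliceAddrs, newSize),
            Arrays.copyOf(sliceLevels, newSize),
            Arrays.copyOf(skipLastDoc, newSize),
            Arrays.copyOf(skipLastAddr, newSize));
    }
}
```

Every additional array adds one more copy per grow() call and one more slot
per term, which is the cost behind the "too many" concern.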

Regarding writing the start address of the first level 9 posting
slice into the skip list: because we're writing vints into the
posting slices, and a vint may span more than 1 byte, we may (and
this has happened in testing) write a vint that spans two slices.
If we record the last slice address and read a vint from that
point, we'll get an incorrect vint. If we instead start 1+ bytes
into a slice, we won't know where the slice ends (because we are
assuming slices are 200 bytes in length). Perhaps we can somehow
encode the first slice's length in the slice address parallel
array, or add yet another parallel array for the length of the
first slice.  Something to think about.
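
The failure mode can be reproduced in a few lines. The following is a minimal
sketch under stated assumptions, not the patch's code: it uses the usual
7-bits-per-byte vint encoding, a hypothetical SLICE_SIZE of 4 bytes instead of
200 to keep the example small, and it treats the buffer as contiguous, ignoring
any forwarding bytes real slices carry. The class and method names are made up
for illustration.

```java
// Illustrative sketch only; names and the tiny slice size are assumptions.
final class VIntBoundaryDemo {
    static final int SLICE_SIZE = 4; // hypothetical; the text above assumes 200

    // Writes v as a vint (7 data bits per byte, high bit set means
    // "more bytes follow") at pos and returns the next write position.
    static int writeVInt(byte[] buf, int pos, int v) {
        while ((v & ~0x7F) != 0) {
            buf[pos++] = (byte) ((v & 0x7F) | 0x80);
            v >>>= 7;
        }
        buf[pos++] = (byte) v;
        return pos;
    }

    // Reads a vint that starts at pos.
    static int readVInt(byte[] buf, int pos) {
        byte b = buf[pos++];
        int v = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = buf[pos++];
            v |= (b & 0x7F) << shift;
        }
        return v;
    }

    // True if the bytes in [start, end) fall into more than one slice.
    static boolean spansSlices(int start, int end) {
        return start / SLICE_SIZE != (end - 1) / SLICE_SIZE;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[2 * SLICE_SIZE];
        int pos = 0;
        pos = writeVInt(buf, pos, 5);     // 1 byte at offset 0
        pos = writeVInt(buf, pos, 300);   // 2 bytes at offsets 1..2
        int start = pos;                  // offset 3, the last byte of slice 0
        pos = writeVInt(buf, pos, 70000); // 3 bytes at offsets 3..5, into slice 1

        System.out.println(spansSlices(start, pos));   // the vint straddles two slices
        System.out.println(readVInt(buf, start));      // read from its true start
        System.out.println(readVInt(buf, SLICE_SIZE)); // read from the slice start
    }
}
```

Reading from the recorded slice start lands in the middle of the 3-byte vint,
so the decoder returns a different value than the one written. Recording the
vint's true start avoids that, but then the distance to the end of the slice
is no longer the fixed slice length, which is why the first slice's length
would have to be stored somewhere.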

We can't simply read ahead 200 bytes (ie, level 9), nor can

> Concurrent byte and int block implementations
> ---------------------------------------------
>
>                 Key: LUCENE-2575
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2575
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: Realtime Branch
>            Reporter: Jason Rutherglen
>             Fix For: Realtime Branch
>
>         Attachments: LUCENE-2575.patch, LUCENE-2575.patch, LUCENE-2575.patch, 
> LUCENE-2575.patch
>
>
> The current *BlockPool implementations aren't quite concurrent.
> We really need something that has a locking flush method, where
> flush is called at the end of adding a document. Once flushed,
> the newly written data would be available to all other reading
> threads (ie, postings etc). I'm not sure I understand the slices
> concept, it seems like it'd be easier to implement a seekable
> random access file like API. One'd seek to a given position,
> then read or write from there. The underlying management of byte
> arrays could then be hidden?
