[ https://issues.apache.org/jira/browse/LUCENE-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915130#action_12915130 ]
Jason Rutherglen commented on LUCENE-2575:
------------------------------------------

Here are the new parallel arrays. It seems like something went wrong and there are too many; however, I think each is required.

{code}
final int[] skipStarts; // address where the term's skip list starts (for reading)
final int[] skipAddrs; // where writing left off
final int[] sliceAddrs; // the start addr of the last posting slice
final byte[] sliceLevels; // posting slice levels
final int[] skipLastDoc; // last skip doc written
final int[] skipLastAddr; // last skip addr written
{code}

Regarding writing the start address of the first level 9 posting slice into the skip list: because we're writing vints into the posting slices, and vints may span more than 1 byte, we may (and this has happened in testing) write a vint that spans slices. If we then record the last slice address and read a vint from that point, we'll get an incorrect vint. If we start 1+ bytes into a slice, we will not know where the slice ends (because we are assuming they're 200 bytes in length). Perhaps in the slice address parallel array we can somehow encode the first slice's length, or add yet another parallel array for the length of the first slice. Something to think about. We can't simply read ahead 200 bytes (i.e., level 9), nor can

> Concurrent byte and int block implementations
> ---------------------------------------------
>
> Key: LUCENE-2575
> URL: https://issues.apache.org/jira/browse/LUCENE-2575
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: Realtime Branch
> Reporter: Jason Rutherglen
> Fix For: Realtime Branch
>
> Attachments: LUCENE-2575.patch, LUCENE-2575.patch, LUCENE-2575.patch,
> LUCENE-2575.patch
>
>
> The current *BlockPool implementations aren't quite concurrent.
> We really need something that has a locking flush method, where
> flush is called at the end of adding a document.
> Once flushed,
> the newly written data would be available to all other reading
> threads (ie, postings etc). I'm not sure I understand the slices
> concept, it seems like it'd be easier to implement a seekable
> random access file like API. One'd seek to a given position,
> then read or write from there. The underlying management of byte
> arrays could then be hidden?
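
The vint problem described in the comment can be reproduced with a small standalone sketch. This is not code from the patch: the class name `VIntSliceDemo`, the 4-byte `SLICE_SIZE`, and the contiguous buffer are assumptions made for illustration (the comment assumes 200-byte level 9 slices, and real slices also carry forwarding addresses, which are ignored here). The encoding is the usual 7-bits-per-byte vint with the high bit as a continuation flag.

```java
import java.util.Arrays;

public class VIntSliceDemo {
    // Hypothetical slice size for this demo; the comment assumes 200 bytes for level 9.
    static final int SLICE_SIZE = 4;

    // Writes value as a vint (7 data bits per byte, low bits first,
    // high bit set on every byte except the last). Returns the next write position.
    static int writeVInt(byte[] buf, int pos, int value) {
        while ((value & ~0x7F) != 0) {
            buf[pos++] = (byte) ((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        buf[pos++] = (byte) value;
        return pos;
    }

    // Reads one vint starting at pos. Only correct if pos is the first byte of a vint.
    static int readVInt(byte[] buf, int pos) {
        byte b = buf[pos++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = buf[pos++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    // Encodes the values back to back, as if slices were laid out contiguously.
    static byte[] encode(int... values) {
        byte[] buf = new byte[values.length * 5]; // a vint takes at most 5 bytes
        int pos = 0;
        for (int v : values) {
            pos = writeVInt(buf, pos, v);
        }
        return Arrays.copyOf(buf, pos);
    }

    public static void main(String[] args) {
        // 10, 20, 30 take one byte each; 300 takes two bytes (0xAC, 0x02),
        // so it occupies positions 3 and 4 and straddles the boundary at SLICE_SIZE.
        byte[] buf = encode(10, 20, 30, 300, 7);
        System.out.println(readVInt(buf, 3));          // 300: read from the vint's real start
        System.out.println(readVInt(buf, SLICE_SIZE)); // 2: read from the slice start, mid-vint
    }
}
```

Reading from the recorded slice address returns the trailing byte of the previous vint rather than a value that was written, which is why a slice address alone is not a safe entry point for a skip list.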