Marvin Humphrey <[EMAIL PROTECTED]> wrote:
> > Container is only aware of the single inStream, while codec can still
> > think it's operating on 3 even if it's really 1 or 2.
> >
>
> I don't understand. If you have three streams, all of them are going to
> have to get skipped, right?
For the "all
On Apr 27, 2008, at 3:28 AM, Michael McCandless wrote:
Actually, I was picturing that the container does the seeking itself
(using skip data), to get "close" to the right point, and then it uses
the codec to step through single docs at a time until it's at or
beyond the right one.
I believe i
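
For illustration, here is a minimal sketch of the seek strategy described in the message above: the container uses skip data to get close to the target, then asks the codec to step one doc at a time until it is at or beyond the target. Everything here is an assumption made for the example (the DeltaCodec and Container classes, the delta encoding, the layout of the skip entries); none of it is taken from Lucene or KinoSearch.

```python
import bisect


class DeltaCodec:
    """Hypothetical codec: decodes one delta-encoded doc ID at a time."""

    def __init__(self, deltas):
        self.deltas = deltas

    def __len__(self):
        return len(self.deltas)

    def read_doc(self, pos, prev_doc):
        # Each entry stores the gap from the previous doc ID.
        return prev_doc + self.deltas[pos]


class Container:
    """Hypothetical container: owns the skip data, delegates decoding."""

    def __init__(self, codec, skip_entries):
        # skip_entries: sorted (doc_id, next_pos) pairs, meaning
        # "doc_id was the last doc decoded before position next_pos".
        self.codec = codec
        self.skip_docs = [doc for doc, _ in skip_entries]
        self.skip_pos = [pos for _, pos in skip_entries]

    def seek(self, target):
        """Return the first doc ID >= target, or None if exhausted."""
        # Step 1: use skip data to land just before the target.
        i = bisect.bisect_left(self.skip_docs, target) - 1
        if i >= 0:
            doc, pos = self.skip_docs[i], self.skip_pos[i]
        else:
            doc, pos = 0, 0
        # Step 2: let the codec step through single docs.
        while pos < len(self.codec):
            doc = self.codec.read_doc(pos, doc)
            pos += 1
            if doc >= target:
                return doc
        return None


# Deltas below encode doc IDs 2, 5, 9, 14, 20, 27, 35.
codec = DeltaCodec([2, 3, 4, 5, 6, 7, 8])
container = Container(codec, [(9, 3), (27, 6)])
print(container.seek(10))
```

In this sketch the codec never sees the skip data, so a different codec could be swapped in as long as it can decode the entry at a given position.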
Marvin Humphrey <[EMAIL PROTECTED]> wrote:
> > > Seeking might get a little weird, I suppose.
> >
> > Maybe not?: if the container is only aware of the single InStream, and
> > say it's "indexed" with a multi-skip index, then when you ask
> > container to seek, it forwards the request to multi-ski
On Apr 24, 2008, at 4:47 AM, Michael McCandless wrote:
Seeking might get a little weird, I suppose.
Maybe not?: if the container is only aware of the single InStream, and
say it's "indexed" with a multi-skip index, then when you ask
container to seek, it forwards the request to multi-skip whic
Marvin Humphrey <[EMAIL PROTECTED]> wrote:
>
> On Apr 17, 2008, at 11:57 AM, Michael McCandless wrote:
>
>
> > If I have a pluggable indexer,
> > then on the querying side I need something (I'm not sure what/how)
> > that knows how to create the right demuxer (container) and codec
> > (decoder) to
On Apr 17, 2008, at 11:57 AM, Michael McCandless wrote:
If I have a pluggable indexer,
then on the querying side I need something (I'm not sure what/how)
that knows how to create the right demuxer (container) and codec
(decoder) to interact with whatever my indexing plugins wrote.
So I don't t
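
One possible shape for the "something" on the querying side, sketched under stated assumptions: the indexing plugin records a format name in the segment metadata, and the reader looks that name up in a registry to build the matching container and codec. FormatRegistry, the format name "frq-prx-v1", and the two stand-in classes are all hypothetical; this is not an existing Lucene or KinoSearch API.

```python
class FormatRegistry:
    """Hypothetical registry mapping a format name to reader factories."""

    def __init__(self):
        self._factories = {}

    def register(self, name, container_factory, codec_factory):
        self._factories[name] = (container_factory, codec_factory)

    def open(self, name):
        """Build the container (demuxer) wired to the codec (decoder)."""
        try:
            container_factory, codec_factory = self._factories[name]
        except KeyError:
            raise ValueError(f"no reader registered for format {name!r}")
        return container_factory(codec_factory())


class FrqPrxCodec:
    """Stand-in decoder; a real one would parse postings."""


class SingleStreamContainer:
    """Stand-in demuxer holding the codec it was built with."""

    def __init__(self, codec):
        self.codec = codec


registry = FormatRegistry()
registry.register("frq-prx-v1", SingleStreamContainer, FrqPrxCodec)

# Assumption: the format name was written by the indexing plugin.
reader = registry.open("frq-prx-v1")
print(type(reader).__name__, type(reader.codec).__name__)
```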
Marvin Humphrey <[EMAIL PROTECTED]> wrote:
> On Apr 13, 2008, at 2:35 AM, Michael McCandless wrote:
>
>
> > I think the major difference is locality? In a compound file, you
> > have to seek "far away" to reach the prx & skip data (if they are
> > separate).
>
> There's another item worth mentio
On Apr 13, 2008, at 2:35 AM, Michael McCandless wrote:
I think the major difference is locality? In a compound file, you
have to seek "far away" to reach the prx & skip data (if they are
separate).
There's another item worth mentioning, something that Doug, Grant and
I discussed when this
Marvin Humphrey <[EMAIL PROTECTED]> wrote:
>
> On Apr 10, 2008, at 3:10 AM, Michael McCandless wrote:
>
>
> > Can't you compartmentalize while still serializing skip data into the
> > single frq/prx file?
> >
>
> Yes, that's possible.
>
> The way KS is set up right now, PostingList objects maintai
On Apr 10, 2008, at 3:10 AM, Michael McCandless wrote:
Can't you compartmentalize while still serializing skip data into the
single frq/prx file?
Yes, that's possible.
The way KS is set up right now, PostingList objects maintain i/o
state, and Posting's Read_Record() method just deals with
Marvin Humphrey <[EMAIL PROTECTED]> wrote:
> On Apr 9, 2008, at 6:35 AM, Michael Busch wrote:
>
>
> > We also need to come up with a good solution for the dictionary, because a
> > term with frq/prx postings needs to store two (or three for skiplist) file
> > pointers in the dictionary, whereas e.g. a
Michael Busch <[EMAIL PROTECTED]> wrote:
> > I agree we would have an abstract base Posting class that just tracks
> > the term text.
> >
> > Then, DocumentsWriter manages inverting each field, maintaining the
> > per-field hash of term Text -> abstract Posting instances, exposing
> > the methods
On Apr 9, 2008, at 6:35 AM, Michael Busch wrote:
We also need to come up with a good solution for the dictionary,
because a term with frq/prx postings needs to store two (or three
for skiplist) file pointers in the dictionary, whereas e.g. a
"binary" posting list only needs one pointer.
Thanks for your quick answers.
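
To make the dictionary problem concrete, here is a rough sketch of one way an entry could carry a variable number of file pointers: prefix the pointers with a count. The byte layout and the encode_entry/decode_entry helpers are invented for this example and do not describe any existing index format.

```python
import struct


def encode_entry(term, pointers):
    """Pack term length, pointer count, term bytes, then the pointers."""
    data = term.encode("utf-8")
    fmt = f"<HB{len(data)}s{len(pointers)}Q"
    return struct.pack(fmt, len(data), len(pointers), data, *pointers)


def decode_entry(buf):
    """Inverse of encode_entry: returns (term, list of file pointers)."""
    term_len, count = struct.unpack_from("<HB", buf, 0)
    start = struct.calcsize("<HB")
    term = buf[start:start + term_len].decode("utf-8")
    pointers = struct.unpack_from(f"<{count}Q", buf, start + term_len)
    return term, list(pointers)


# A frq/prx term with a skip list stores three pointers ...
three = encode_entry("lucene", [1024, 4096, 8192])
# ... while a "binary" posting list stores only one.
one = encode_entry("payload", [512])

print(decode_entry(three))
print(decode_entry(one))
```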
Michael McCandless wrote:
Hi Michael,
I've actually been working on factoring DocumentsWriter, as a first
step towards flexible indexing.
Cool, yeah, separating the DocumentsWriter into multiple classes
certainly helped in understanding the complex code better.