Thank you very much, Michael, for the information!
-John
On Fri, Sep 18, 2009 at 6:01 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> > Say you have a type of field with fixed-length data per doc, e.g.
> > 8 bytes.
>
> OK this makes sense -- thanks for the example! This sounds like
> Say you have a type of field with fixed-length data per doc, e.g.
> 8 bytes.
OK this makes sense -- thanks for the example! This sounds like
getting column-stride-fields before that feature is added to Lucene
"for real".
For flushing, you can plug your own indexing chain into IndexWriter.
Th
I bet custom per-segment files could very well be used for per-segment
userdata/debuginfo we introduced earlier.
With them it could be stored neatly in a separate file instead of
being grafted onto current ones.
On Thu, Sep 17, 2009 at 18:35, Michael McCandless
wrote:
> I'm actively working on LU
Yes, I guess you could branch the code? It probably doesn't need to
be final, Mike?
On Thu, Sep 17, 2009 at 7:16 PM, John Wang wrote:
> Hi Michael:
>
> Is there a wiki or some sort of write-up on LUCENE-1458? It looks
> extremely cool!
>
> Re: Jason: isn't flush final?
>
> -John
>
> On Fri,
On Fri, Sep 18, 2009 at 08:14:24AM +0800, John Wang wrote:
> Say you have a type of field with fixed-length data per doc, e.g. 8 bytes.
> It might be good to store in a segment:
>
Heh. You've just described this proof of concept class:
http://www.rectangular.com/kinosearch/docs/deve
Hi Michael:
Is there a wiki or some sort of write-up on LUCENE-1458? It looks
extremely cool!
Re: Jason: isn't flush final?
-John
On Fri, Sep 18, 2009 at 9:09 AM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:
> I believe you could override the IW.flush and IW.mergeSuccess
> method
I believe you could override the IW.flush and IW.mergeSuccess
methods. Unfortunately, flush doesn't expose the new SegmentInfo;
however, it could be obtained via
IW.getReader().getSequentialSubReaders (by comparing the before
and after).
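A minimal sketch of the "compare before and after" step, using only the
JDK. It does not call Lucene: plain strings stand in for whatever
identifies each sub-reader's segment, and the class and method names
(NewSegmentDetector, newSegments) are hypothetical, not part of any
Lucene API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: finds segments that appeared between two snapshots. */
final class NewSegmentDetector {

    /**
     * Returns the entries present in 'after' but not in 'before',
     * keeping the order in which they appear in 'after'.
     * Assumption: each string identifies one segment (e.g. "_0", "_1").
     */
    static Set<String> newSegments(List<String> before, List<String> after) {
        Set<String> added = new LinkedHashSet<>(after);
        added.removeAll(before);
        return added;
    }
}
```

In a real setup the two lists would be built from the sub-readers taken
before and after the flush; how a segment is identified from a
sub-reader is left out here on purpose.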
Adjacent segment files could then be maintained without hackin
Sure.
A simple example:
Say you have a type of field with fixed-length data per doc, e.g. 8 bytes.
It might be good to store in a segment:
so if you have 1000 docs, your seg file is 8k+4 bytes.
Merging would be rather trivial as well.
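To make the size arithmetic concrete, here is a rough sketch of such a
file in plain Java. The exact layout is not spelled out above, so this
assumes the extra 4 bytes are an int doc count followed by one 8-byte
value per doc. The class and method names (FixedWidthSegmentFile, write,
read, merge) are hypothetical, and the merge simply appends without
handling deleted docs.

```java
import java.nio.ByteBuffer;

/** Hypothetical per-segment file: [4-byte doc count][8 bytes per doc]. */
final class FixedWidthSegmentFile {
    static final int HEADER_BYTES = 4;   // assumption: header is an int doc count
    static final int BYTES_PER_DOC = 8;  // from the example: 8 bytes per doc

    /** Serializes one 8-byte value per doc, indexed by doc id. */
    static byte[] write(long[] valuesByDocId) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + BYTES_PER_DOC * valuesByDocId.length);
        buf.putInt(valuesByDocId.length);
        for (long value : valuesByDocId) {
            buf.putLong(value);
        }
        return buf.array();
    }

    /** Reads one doc's value directly from its fixed offset. */
    static long read(byte[] file, int docId) {
        return ByteBuffer.wrap(file).getLong(HEADER_BYTES + docId * BYTES_PER_DOC);
    }

    /** Merges two segments by appending the second's records to the first's. */
    static byte[] merge(byte[] first, byte[] second) {
        int firstCount = ByteBuffer.wrap(first).getInt(0);
        int secondCount = ByteBuffer.wrap(second).getInt(0);
        ByteBuffer out = ByteBuffer.allocate(HEADER_BYTES + BYTES_PER_DOC * (firstCount + secondCount));
        out.putInt(firstCount + secondCount);
        out.put(first, HEADER_BYTES, firstCount * BYTES_PER_DOC);
        out.put(second, HEADER_BYTES, secondCount * BYTES_PER_DOC);
        return out.array();
    }
}
```

With 1000 docs, write() yields 4 + 8 * 1000 = 8004 bytes, matching the
8k+4 figure, and a lookup is a single offset computation.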
Doing this right now involves storing into payload,
I'm actively working on LUCENE-1458, to enable different codecs for
reading/writing the terms dict and doc/freq/prox/payload postings.
I'm working now towards getting PforDelta working...
However, that change doesn't [yet] do anything for norms, stored
fields, or term vectors.
Can you describe m