Hey Karl,

I haven't actually looked at the source code in your branch, but I've always 
wondered how changes in your personal branch off of Lucene 2 will ever get into 
the main trunk, since your efforts are, I think, mostly out of sync with the 
patches and changes people are making in HEAD.

My mom says hi to your mom.

Otis

----- Original Message ----
From: Karl Wettin (JIRA) <[EMAIL PROTECTED]>
To: java-dev@lucene.apache.org
Sent: Thursday, July 20, 2006 4:05:21 AM
Subject: [jira] Commented: (LUCENE-550) InstanciatedIndex - faster but memory 
consuming index

    [ 
http://issues.apache.org/jira/browse/LUCENE-550?page=comments#action_12422359 ] 
            
Karl Wettin commented on LUCENE-550:
------------------------------------

To make this index work flawlessly (I hope), remove the if-statement around the 
following row in InstanciatedIndexWriter (row 477 or so):

termDocumentInformation.termPositions.add(fieldSettings.position);

This will fix the termposition bug noted in an earlier comment.
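
For anyone who wants to see the shape of the change without opening the 
patch, here is a minimal, self-contained sketch. It is not the code from 
InstanciatedIndexWriter: the class TermPositionSketch and the method 
collectPositions are hypothetical, and only the names termDocumentInformation, 
termPositions and position are borrowed from the row above. What the removed 
if-statement tested is not shown here, so the sketch only illustrates the 
unconditional form, where every occurrence of a term records its position.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TermPositionSketch {

    // Hypothetical stand-in for the per-term, per-document record in the patch.
    static final class TermDocumentInformation {
        final List<Integer> termPositions = new ArrayList<>();
    }

    // Walks the tokens of one field and records a position for every occurrence.
    static Map<String, TermDocumentInformation> collectPositions(String[] tokens) {
        Map<String, TermDocumentInformation> byTerm = new LinkedHashMap<>();
        int position = 0; // plays the role of fieldSettings.position
        for (String token : tokens) {
            TermDocumentInformation termDocumentInformation =
                byTerm.computeIfAbsent(token, t -> new TermDocumentInformation());
            // No surrounding if-statement: the position is always added.
            termDocumentInformation.termPositions.add(position);
            position++;
        }
        return byTerm;
    }

    public static void main(String[] args) {
        Map<String, TermDocumentInformation> result =
            collectPositions("to be or not to be".split(" "));
        // Print each term with the positions recorded for it.
        result.forEach((term, info) ->
            System.out.println(term + " -> " + info.termPositions));
    }
}
```

With the input "to be or not to be", the term "to" ends up with positions 0 
and 4, so the term frequency can be read directly from the length of the 
position list.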

I'll keep posting bugfixes as comments here, but the actual work happens in my 
branch of Lucene 2.0.0, available here: 
http://www.ginandtonique.org/trac/snigel/wiki/Lucene2-karl

If someone feels that this layer is an interesting thing to add to Lucene, let 
me know what is required for commit and I'll make those changes. It still seems 
to be about 40 times faster than RAMDirectory (mean value on a "normal" index 
with a "normal" number of terms; I have seen 20x-200x) when comparing search 
time and document retrieval time combined.

> InstanciatedIndex - faster but memory consuming index
> -----------------------------------------------------
>
>                 Key: LUCENE-550
>                 URL: http://issues.apache.org/jira/browse/LUCENE-550
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Store
>    Affects Versions: 1.9
>            Reporter: Karl Wettin
>         Attachments: class_diagram.png, class_diagram.png, 
> instanciated_20060527.tar, InstanciatedIndexTermEnum.java, 
> lucene.1.9-karl1.jpg
>
>
> After fixing the bugs, it's now 4.5 -> 5 times the speed. This is true at 
> both index and query time. Sorry if I got your hopes up too much. There 
> are still things to be done though. Might not have time to do anything with 
> this until next month, so here is the code if anyone wants a peek.
> Not good enough for Jira yet, but if someone wants to fool around with it, 
> here it is. The implementation passes a TermEnum -> TermDocs -> Fields -> 
> TermVector comparison against the same data in a Directory.
> When it comes to features, offsets don't exist, and positions are stored in 
> an ugly way and have bugs.
> You might notice that norms are float[] and not byte[]. I refactored that 
> to see if it would do any good. Bit shifting doesn't take many ticks, so I 
> might just revert that.
> I believe the code is quite self-explanatory.
> InstanciatedIndex ii = ..
> ii.new InstanciatedIndexReader();
> ii.addDocument(s).. replace IndexWriter for now.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




