[jira] Commented: (LUCENE-550) InstanciatedIndex - faster but memory consuming index

Karl Wettin (JIRA) Tue, 21 Nov 2006 19:06:25 -0800

    [ 
http://issues.apache.org/jira/browse/LUCENE-550?page=comments#action_12451726 ] 
            
Karl Wettin commented on LUCENE-550:
------------------------------------


Here is what I just sent to Wolfgang. I've adapted his benchmark test case to 
also work with InstantiatedIndex. Note that this test uses only one document, 
and speed does not scale linearly with index size according to my previous 
tests. InstantiatedIndex is much more than 3x faster than RAMDirectory in a 
larger index. So this is really only a comparison of MemoryIndex with 
InstantiatedIndex, not a benchmark against RAMDirectory.

RAMDirectory:

secs = 95.159
queries/sec= 315.26184
MB/sec = 9.900338
Done benchmarking (without checking correctness).


MemoryIndex:

secs = 26.692
queries/sec= 1123.9323
MB/sec = 35.295456
Done benchmarking (without checking correctness).



InstantiatedIndex:

secs = 27.44
queries/sec= 1093.2944
MB/sec = 34.333317
Done benchmarking (without checking correctness).


MemoryIndex is a bit faster than InstantiatedIndex, but I'm aware of a couple 
of small optimizations I can still make.
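
To put the three results side by side, here is a small sketch that derives the relative speedups from the reported wall-clock times. The class and method names are made up for illustration and are not part of the benchmark code. Multiplying each run's seconds by its queries/sec suggests all three runs executed roughly the same number of queries, which the second helper computes.

```java
class BenchRatios {
    // Wall-clock seconds copied from the results above.
    static final double RAM_DIRECTORY_SECS = 95.159;
    static final double MEMORY_INDEX_SECS = 26.692;
    static final double INSTANTIATED_INDEX_SECS = 27.44;

    /** How many times faster a run is than the baseline run. */
    static double speedup(double baselineSecs, double secs) {
        return baselineSecs / secs;
    }

    /** Query count implied by a reported duration and throughput. */
    static long impliedQueries(double secs, double queriesPerSec) {
        return Math.round(secs * queriesPerSec);
    }

    public static void main(String[] args) {
        System.out.printf("MemoryIndex vs RAMDirectory:       %.2fx%n",
                speedup(RAM_DIRECTORY_SECS, MEMORY_INDEX_SECS));
        System.out.printf("InstantiatedIndex vs RAMDirectory: %.2fx%n",
                speedup(RAM_DIRECTORY_SECS, INSTANTIATED_INDEX_SECS));
        System.out.printf("MemoryIndex vs InstantiatedIndex:  %.3fx%n",
                speedup(INSTANTIATED_INDEX_SECS, MEMORY_INDEX_SECS));
        System.out.println("Implied queries per run: "
                + impliedQueries(RAM_DIRECTORY_SECS, 315.26184));
    }
}
```

These ratios only describe this single-document run; as noted above, they should not be extrapolated to larger indexes.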

> InstanciatedIndex - faster but memory consuming index
> -----------------------------------------------------
>
>                 Key: LUCENE-550
>                 URL: http://issues.apache.org/jira/browse/LUCENE-550
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Store
>    Affects Versions: 1.9
>            Reporter: Karl Wettin
>         Attachments: class_diagram.png, class_diagram.png, 
> instanciated_20060527.tar, InstanciatedIndexTermEnum.java, 
> lucene.1.9-karl1.jpg, lucene2-karl_20060722.tar.gz, 
> lucene2-karl_20060723.tar.gz
>
>
> After fixing the bugs, it's now 4.5 to 5 times the speed. This is true at 
> both index and query time. Sorry if I got your hopes up too much. There 
> are still things to be done though. I might not have time to do anything with 
> this until next month, so here is the code if anyone wants a peek.
> It's not good enough for Jira yet, but if someone wants to fool around with it, 
> here it is. The implementation passes a TermEnum -> TermDocs -> Fields -> 
> TermVector comparison against the same data in a Directory.
> When it comes to features, offsets don't exist, and positions are stored in an 
> ugly way and have bugs.
> You might notice that norms are float[] and not byte[]. I refactored that 
> to see if it would do any good. Bit shifting doesn't take many 
> ticks, so I might just revert that.
> I believe the code is quite self-explanatory.
> InstanciatedIndex ii = ..
> ii.new InstanciatedIndexReader();
> ii.addDocument(s).. replace IndexWriter for now.
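
The `ii.new InstanciatedIndexReader()` expression in the quoted description is Java's syntax for creating an inner class instance bound to a specific outer object. The sketch below illustrates only that language feature. `ToyIndex` and `ToyReader` are hypothetical stand-ins, not the classes from the attachments, and their behavior is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an index that holds everything in memory.
class ToyIndex {
    private final List<String> documents = new ArrayList<>();

    void addDocument(String text) {
        documents.add(text);
    }

    // Inner (non-static) class: each reader is tied to one ToyIndex
    // instance and can read that instance's private fields directly.
    class ToyReader {
        int numDocs() {
            return documents.size();
        }
    }
}

class InnerClassDemo {
    static int demo() {
        ToyIndex index = new ToyIndex();
        // Qualified creation: the reader is bound to this particular index.
        ToyIndex.ToyReader reader = index.new ToyReader();
        index.addDocument("hello");
        return reader.numDocs();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this toy version the reader shares the outer object's list, so it sees documents added after it was created. Whether the real reader behaves the same way is not stated in the description.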


        
