What a neat search engine (searching stack traces)! Unfortunately, loading stored fields is slowish -- it entails 2 disk seeks per document under the hood. Really, you should retrieve at most a page's worth of docs in the serial path of a query. How many are you retrieving per query?

That said, you shouldn't use LAZY_LOAD if you know you will need the value. Also, it's possible that sorting the docIDs (ascending) first may get you better performance, since your load is then a single scan of the 2 files in the index.

You may want to use FieldCache.DEFAULT.getStrings instead -- this gives you a very fast String[], but may suck up tons of memory depending on how many unique frames there are (how do you index each frame?).

Mike

On Thu, Sep 9, 2010 at 4:01 AM, Johannes Lerch <lerch.johan...@googlemail.com> wrote:
> Hi,
>
> i am working on a search for stacktraces. To do this i implemented my own
> Query, Weight and Scorer. I save exception, method and the frames as fields
> in the index and am able to pick relevant documents by matching those fields
> with my query stacktrace (using IndexReader.termDocs()). I implemented my
> own scoring which is calculated pairwise for stacktraces (the one of the
> query and each of the relevant documents). For this scoring i calculate a
> similarity between both traces by comparing the frames if they exist in both
> and also check for ordering. This works similar as diff on text/source code.
> My problem is, that i need all frames contained in both stacktraces, so i
> have to retrieve all frame fields of the stored stacktraces. For now i do
> this with:
> Document document = reader.document(doc, new FieldSelector() {
>     @Override
>     public FieldSelectorResult accept(String fieldName) {
>         if(Indexer.FIELD_FRAMES.equals(fieldName))
>             return FieldSelectorResult.LAZY_LOAD;
>         else
>             return FieldSelectorResult.NO_LOAD;
>     }
> });
> Fieldable[] fieldables = document.getFieldables(Indexer.FIELD_FRAMES);
>
> But this call really decreases performance to something which is not
> agreeable for me (>10 times slower on 100000 stacktraces in index). So my
> question is, are there are other ways to get stored fields or do you have
> ideas for workarounds. Would it be better to store all stacktraces in a
> database and retrieve them from there? If so how do i get the docId of
> stacktraces i wrote to the index?
>
> Regards,
> Johannes
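
For reference, the idea of sorting the docIDs (ascending) before loading could look roughly like the sketch below. It is only an illustration under stated assumptions: SortedDocLoader and loadInDocIdOrder are hypothetical names, not part of Lucene, and the loader argument stands in for whatever actually reads the stored fields for one docID.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical helper (not a Lucene class): visits docIDs in ascending order. */
final class SortedDocLoader {

    /**
     * Calls loader once per docID, in ascending docID order, and returns the
     * loaded values in the same order as the input array.
     */
    static <T> List<T> loadInDocIdOrder(int[] docIds, IntFunction<T> loader) {
        // Sort positions instead of the docIDs themselves, so that each
        // result can be put back at the caller's original position.
        Integer[] order = new Integer[docIds.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Integer.compare(docIds[a], docIds[b]));

        // Pre-fill the result list so it can be written at arbitrary positions.
        List<T> results = new ArrayList<>(docIds.length);
        for (int i = 0; i < docIds.length; i++) {
            results.add(null);
        }

        // Load in ascending docID order; the reads then move forward only.
        for (int pos : order) {
            results.set(pos, loader.apply(docIds[pos]));
        }
        return results;
    }
}
```

In the setup from the quoted message, the loader would presumably wrap the reader.document(doc, selector) call, with the selector returning FieldSelectorResult.LOAD rather than LAZY_LOAD for the frames field, and with the IOException handled inside the wrapper. Whether the sorted order actually helps depends on the index and the OS cache, so it is worth measuring.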