Compared to caching a List and passing it in to the Document constructor, I imagine a clear()-based solution would be slower... there's more work to do. clear() needs to null the pointers, and then one needs to add the fields again, one-by-one. But I doubt we'd be able to measure a difference anyway, given that Document construction time (as opposed to Field construction) is insignificant compared to indexing.

-Yonik
http://www.lucidimagination.com

On Wed, May 20, 2009 at 4:10 PM, Shai Erera <ser...@gmail.com> wrote:
> I came across this while working on 1595 (changes to benchmark). I noticed
> LineDocMaker reuses Document and Fields, and I wanted to pull that up to a
> base DocMaker since I got the impression it yields better (even if not
> significant) performance.
>
> With the addition of the Field ctor which accepts a boolean for interning,
> and with the changes to String.intern() which are to come, I agree this
> will have less impact, but is still convenient. Today, I can already call
> doc.getFields(), iterate on them and call doc.remove(Field).
> Document.clear() will just save me the trouble.
>
> Besides all the above changes, reusing Document and Field saves object
> allocations. For the documents in the benchmark package this may mean
> millions of Document objects + many more Field objects. Even if it always
> avoided interning, this means saving lots of allocations, which are really
> not necessary.
>
> For other applications, the number of fields may be much larger than in the
> current benchmark impls, where it becomes even more important.
>
> Passing a list of Fields will save the Field allocations (assuming the app
> caches them on the outside) but still require Document allocation. Why not
> save that too?
>
> On Wed, May 20, 2009 at 11:01 PM, Yonik Seeley <yo...@lucidimagination.com>
> wrote:
>>
>> On Wed, May 20, 2009 at 3:27 PM, Shai Erera <ser...@gmail.com> wrote:
>> > I noticed Document does not have a clear() method, to remove all the
>> > Fields
>> > set on it.
>>
>> Document's state is so simple (a List and a boost), reuse doesn't seem
>> worth it.
>> What if, instead, we allowed the List to be passed in via Document's
>> constructor?
>>
>> To put it into perspective, the Document object then becomes lighter
>> weight than the String object (provided the user is caching the List
>> of fields). And really, I think caching the list of fields is even
>> overboard for pretty much all of the applications out there - I doubt
>> it would ever be significant given how much relative work is needed to
>> index a document.
>>
>> -Yonik
>> http://www.lucidimagination.com
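
For readers following the thread, the two reuse strategies being compared can be sketched roughly as below. This is only an illustration under stated assumptions: SketchField and SketchDocument are hypothetical stand-ins, not the Lucene classes, and both clear() and the List-accepting constructor are the proposals under discussion rather than existing API. The sketch shows where the work happens in each approach; it is not a benchmark.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, NOT the Lucene classes: just enough state to
// compare the two reuse strategies discussed in the thread.
final class SketchField {
    final String name;
    String value; // mutable so a cached field can be reused

    SketchField(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

final class SketchDocument {
    private final List<SketchField> fields; // "a List and a boost"
    private float boost = 1.0f;

    SketchDocument() {
        this(new ArrayList<SketchField>());
    }

    // Assumed constructor: wraps a caller-owned list without copying it.
    SketchDocument(List<SketchField> fields) {
        this.fields = fields;
    }

    void add(SketchField f) {
        fields.add(f);
    }

    // Assumed clear(): empties the field list and resets the boost.
    void clear() {
        fields.clear();
        boost = 1.0f;
    }

    List<SketchField> getFields() {
        return fields;
    }

    float getBoost() {
        return boost;
    }
}

final class DocumentReuseSketch {
    // Option A: one document reused for every input via clear() + re-add.
    static int reuseWithClear(String[] bodies) {
        SketchDocument doc = new SketchDocument();
        SketchField body = new SketchField("body", "");
        int fieldsSeen = 0;
        for (String text : bodies) {
            doc.clear();      // empties the list
            body.value = text;
            doc.add(body);    // fields are added back one by one
            fieldsSeen += doc.getFields().size();
        }
        return fieldsSeen;
    }

    // Option B: cached field list, one small document allocated per input.
    static int reuseWithCachedList(String[] bodies) {
        SketchField body = new SketchField("body", "");
        List<SketchField> cached = new ArrayList<SketchField>();
        cached.add(body);
        int fieldsSeen = 0;
        for (String text : bodies) {
            body.value = text;
            SketchDocument doc = new SketchDocument(cached); // no clearing, no re-adding
            fieldsSeen += doc.getFields().size();
        }
        return fieldsSeen;
    }

    public static void main(String[] args) {
        String[] bodies = {"first", "second", "third"};
        System.out.println(reuseWithClear(bodies));
        System.out.println(reuseWithCachedList(bodies));
    }
}
```

Option A avoids allocating a document per input but pays for clearing and re-adding on every iteration; Option B allocates one small wrapper object per input and touches the list only once. Which of the two costs matters in practice, if either does, would have to be measured against real indexing work.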