After reading into the source code, I think Lucene doeesn't use B+tree or other tree structure for index.
A possible reason is that, since Lucene aims at handling gigabytes , it has to be cautious about the index file's size. B+tree may grow rapidly when the number of leaves grows. Hence, B+tree doesn't scale well with gigabytes data. Seems to me Lucene's search is implemented by "hopping" along the sorted index. For example, suppose there is a field "first name" whose values are the first names of all California residents. Those values are listed sequentially but sorted. Given a certain query, the search can start from the first name, if the first name is smaller than the query, hop to the 10'th one, if still smaller, continue hop forward, otherwise, hop back. so on. By this, we don't need B+tree. Of course, there are tricks to improve the hopping algorithm. However, my doubt is that the search performance may not so competitive as B+tree. Is my understanding correct? Kan --- Kan Deng <[EMAIL PROTECTED]> wrote: > > I have similar problem about the internal indexing > data structure > > According to Paolo Ferragina of Univ Pisa, B+tree > with > cluster is best for sorting. However, referring to > the > implementation of > org.apache.lucene.search.IndexSearch, it looks like > the impl doesn't take B+tree, never mention cluster. > > > Can anyone explain this issue in more depth? > > thanks, > Kan > > > > --- shailesh kumar <[EMAIL PROTECTED]> wrote: > > > I had looked at the document you had listed as > > well as used a Hex editor to look at the segment > > files. .That is how I came to know about the > > lexicographic sorting. But was not sure if BTree > is > > used. If I understand correctly a Binary tree > (i.e > > each node only 2 children) or a high order > > Balanced tree (where in a range of values are > stored > > in the node and each node can have more than 2 > > children) is the best way to search. > > So wantted to know if Lucene implements it that > > way ( if not in the data storage of the index in > > the file, atleast in the memory during lookups??) > > > > Shailesh > > > > > > > > > > > > > > --------------------------------- > > Yahoo! Photos > > Ring in the New Year with Photo Calendars. Add > > photos, events, holidays, whatever. > > > __________________________________________________ > Do You Yahoo!? > Tired of spam? Yahoo! Mail has the best spam > protection around > http://mail.yahoo.com > > --------------------------------------------------------------------- > To unsubscribe, e-mail: > [EMAIL PROTECTED] > For additional commands, e-mail: > [EMAIL PROTECTED] > > __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]