We tend to use tries, which have very good performance characteristics. See "bits of unicode" on my site: www.macchiato.com.
Mark __________________________________ http://www.macchiato.com â ààààààààààààààààààààà â ----- Original Message ----- From: "Theodore H. Smith" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Mon, 2003 Nov 17 18:21 Subject: Re: Ternary search trees for Unicode dictionaries > I've looked into the TST thing. > > I'm not sure that it is optimal, despite how popular they are! > > Look at this, if I add "1", "2","3", "4", "5", "6", "7", "8", "9" to a > TST, they will all be in a line, in the tree. All will be reference via > the "high" node. > > So, to find "9", I have to read through 9 items! > > Now, I'm not sure if this is a bad thing. It's just that compared to a > binary search on an array, in this example, a TST does more work. > > When reading in data, that has already been sorted, I think this could > be a problem. It's fine for randomly ordered data, but with sorted > data... I can't tell how much of a problem it could be. > > This is the kind of structure that punishes neat people. An unexpected > payback for the effort of being neat and ordered. > > I'm quite new to this, and so I don't know if there is a good solution > to making the tree's balanced without imposing huge overhead, or if in > practice, this will be a problem. > > >