In the not-so-distant past I was part of a three-man
effort to write a web site indexer / search engine
generator.  My job was to take the indexed files / URLs
(the others sucked them down with Java) and create a
suffix tree database that could be searched via CGI.  I
don't have any specific numbers, but it was quite fast.

This was when Google was just becoming known, and once
we realized we could point Google at a website, the
project was abandoned.

The whole point of using suffix trees is that search
time is linear in the size of the search string (note:
not the size of the searched text).  Seems like a good
candidate for this task.
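
To illustrate that property, here is a minimal Python sketch.  It is
not the code from the project described above, and it is a naive,
uncompressed suffix trie rather than a true suffix tree, so building
it takes quadratic time and space.  The names Node, build_suffix_trie
and find are made up for this example.  What it does show is that a
lookup walks one node per character of the search string, no matter
how long the indexed text is.

```python
class Node:
    """One trie node: outgoing edges plus the start offsets of every suffix passing through."""
    __slots__ = ("children", "starts")

    def __init__(self):
        self.children = {}
        self.starts = []


def build_suffix_trie(text):
    """Insert every suffix of text, one character per edge (naive O(n^2) construction)."""
    root = Node()
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.children.setdefault(ch, Node())
            node.starts.append(start)
    return root


def find(root, pattern):
    """Return the offsets where pattern occurs; follows at most len(pattern) edges."""
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return []
    return node.starts


trie = build_suffix_trie("banana")
print(find(trie, "ana"))
print(find(trie, "xyz"))
```

A real suffix tree collapses chains of single-child nodes into one
edge and can be built in linear time (Ukkonen's algorithm is the usual
choice), but the lookup follows the same idea as find() here.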

Sam
