> >If you can get to old CACMs see `Minimal Perfect Hash Functions Made Simple'
> >by Richard J. Cichelli, Comm. of ACM, Jan 1980.  AFAIK gperf uses some
> >variation of that algorithm and may have some details.  A minimal perfect hash
> >function is only worth it (IMHO) when the set of input keys is mostly fixed and
> >the hash function is used many many times (e.g. programming language keywords).
> 
> And even then it's seldom worth it according to the people behind the LCC
> compiler...

I'd be interested in a reference if you have one [I don't doubt you, just
curious).

I used gperf to generate such a function in a verilog parser and came to
the same conclusion but it can't be generalized to the syscall (or for
that matter database) case.  The reason it doesn't buy much in a compiler
is because what is not a keyword is an identifier and you end up doing a symbol
table lookup on it.  So you may as well just do a symbol table search for
everything (also, typically there are more identifiers than keywords in a
program so it all comes out in a wash).  This is not the case for a simple
lookup (database or syscall) -- you don't do another lookup on the same
string if the first search fails.

I agree that for a very small number of syscalls a simple hash or even
a strcmp is good enough but that doesn't scale well.  Min. perfect hash
function is about as good as it gets *if* one wants to allow for a large
number of loadable syscalls *without* apriori assignment of numbers.  With
an IANA like central number assigning authority, preassignment of syscall
numbers is far better.

-- bakul


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message

Reply via email to