Peter McMahan <[EMAIL PROTECTED]> writes:

> That's a good point.

What's a good point?  [This is why top-posting isn't so helpful.]

> What's the overhead on digests like that?

It depends on the digest algorithm, the implementation, etc.  To some
extent you can just try it and see, or you can compute the digest of an
average-sized subgraph node label list in a loop and estimate the cost
that way.
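Untested, but a loop along these lines should give you a ballpark
per-call cost (it assumes the digest package, and the labels below are
made up -- substitute your actual subgraph node labels):

    library(digest)

    ## Made-up node labels, roughly the size you expect per subgraph.
    labels <- paste("node", seq_len(50), sep = "")

    ## Average over many repetitions to get a per-digest estimate.
    n <- 10000
    elapsed <- system.time(for (i in seq_len(n)) digest(labels))["elapsed"]
    elapsed / n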
> Also, does that open up the possibility, exceedingly small though it
> may be, of misidentifying a branch as already searched and missing a
> qualifying subgraph?

Yes, and the size of "exceedingly small" depends on the digest; with a
128-bit digest like MD5, an accidental collision among even millions of
label lists is vanishingly unlikely.  I don't think this is worth
worrying about.

>>> Also, is it better to over-estimate or under-estimate the
>>> size parameter?

I perhaps should have stressed that over-estimating is better.  The way
hashed environments work is that a vector is initialized to the desired
size and collisions are resolved using chaining.  To reduce collisions
and get a more efficient hashtable, you want more slots in the vector
than items, since the hash function is rarely perfect for your data.
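For example (the item count here is made up):

    ## If you expect to store roughly 5000 keys, give the hashed
    ## environment at least that many slots up front; collisions within
    ## a slot are resolved by chaining.
    n.items <- 5000
    e <- new.env(hash = TRUE, size = as.integer(1.5 * n.items))

    ## Assignment and lookup work as usual.
    assign("node1", TRUE, envir = e)
    exists("node1", envir = e, inherits = FALSE)

Under-sizing isn't an error, it just means longer chains and slower
lookups, so erring on the large side costs little.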
+ seth

--
Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center
http://bioconductor.org