[EMAIL PROTECTED] wrote:
>
> Those who've answered my previous questions about hashes have been
> most helpful. I especially appreciated John W. Kennedy's excellent
> card catalogue cabinet analogy.
>
> I have one last (I hope) question on the topic: I understand from
> previous answers that not all data sets will hash "politely." Some
> will have a very nice one-item-to-one-bucket outcome, while others
> can have several, even many, items in just a few buckets. And that
> the worst case could be "10,000 items all in one bucket." Is there
> a way of determining how well my data goes into its hash? As an
> example of my question:
>
> $h{$_} = $_ for 'a'..'zzz';
> print scalar( %h ), "\n";
>
> prints out
>
> 14056/32768
>
> This says that the 18,278 values generated used only 14,056 of the
> reserved 32,768 buckets. I understand that permutations would hash to
> the same key (er, the same bucket?), for example "cab" and "abc"
> (although I suspect this would depend on the hashing algorithm,
> wouldn't it?). My question here is simply if there's a way to see how
> "well-behaved" my data set is. Some of my scalars come out to:
>
> 59/128 -- for 69 values
> 78/128 -- for 120 values
>
> In fact, none of them come out "okay" -- that is, x buckets for x
> values.
>
> I guess a second question might be whether the answer to the first is
> even meaningful--at least insofar as this has any effect on
> performance. If it doesn't, I suppose the question's almost
> pointless, except as a point of knowledge.

I'll admit that I don't know how Perl allocates storage for its
hashes, but let me point out two things:
1) As long as you can access all of the elements, does it really
matter?
and
2) Some sets will hash well and some sets won't, but unless you are
willing to change your data to accommodate the hashing algorithm,
you can't do anything about it.
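
That said, one rough way to judge "well-behaved" is to compare the
used-bucket counts in the question with what an idealized hash would
give. The sketch below assumes every key lands in a bucket uniformly
at random. That is a textbook model, not a description of Perl's
actual hashing function, and expected_used_buckets is a name made up
for this illustration. The input figures are the ones quoted above.

```perl
use strict;
use warnings;

# Assumption: keys fall into buckets uniformly at random.
# Under that model, the expected number of non-empty buckets for
# $n keys and $m buckets is  m * (1 - (1 - 1/m)**n).
sub expected_used_buckets {
    my ($n, $m) = @_;
    return $m * (1 - (1 - 1 / $m) ** $n);
}

# (keys, total buckets, used buckets) as reported in the question.
my @observed = (
    [ 18_278, 32_768, 14_056 ],
    [     69,    128,     59 ],
    [    120,    128,     78 ],
);

for my $row (@observed) {
    my ($n, $m, $used) = @$row;
    printf "%6d keys, %6d buckets: used %6d, random model predicts %.1f\n",
        $n, $m, $used, expected_used_buckets($n, $m);
}
```

Under that assumption the model predicts roughly 14,010, 53.5 and 78
used buckets, against the reported 14,056, 59 and 78. In other words,
these data sets seem to be spreading about as well as random
placement would, and "x buckets for x values" is not something to
expect from a general-purpose hash.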
--
Bowie
_______________________________________________
ActivePerl mailing list