Re: hash speed

Ken Gillett Wed, 16 Feb 2005 00:14:15 -0800


On 15 Feb 2005, at 17:43, Jay wrote:

On Tue, 15 Feb 2005 09:48:23 -0500, Wiggins d'Anconia
<[EMAIL PROTECTED]> wrote:
Please bottom post, and reply-all so that everyone can help and be helped.
Nope. That's a "here are the tools; you should be able to determine
on your own which is faster." ".... teach a man to fish ...."
http://danconia.org
Ken Gillett wrote:
That's a no then?
On 14 Feb 2005, at 15:00, Wiggins d'Anconia wrote:
Ken Gillett wrote:
I have a script that creates a hash, up to several thousand key=>value pairs. Each value is a string that is created by adding to it repeatedly, maybe hundreds of times, each addition probably about 10 bytes. I can do this in (at least) 2 ways. One is to repeatedly concatenate ( .= ) the additional string onto a scalar variable and then, once it has been fully created, to add this variable to the hash with its appropriate key. The other is to directly add onto the hash value itself, no other variable involved. My question is:- Which is faster in operation? I don't know enough about the internal workings of perl's memory structures (actually I know nothing about that:-) to be able to hazard a guess at this. Maybe it makes no measurable difference, but maybe one is definitely the better modus operandi. Can anyone answer this?
perldoc Benchmark
Check the list archives for examples of usage on other problems.
To elaborate on the "no" a little: there are a host of things that
affect execution.  Different processes use system resources in
different ways.  Some take more memory, some take more processor time.
Some scale linearly, some have a high initial overhead.  Different
architectures and processors are optimised for different tasks.  The
list goes on.  Often, you'll find that the fastest solution for a
small data set isn't the fastest for a larger one, and vice versa.  No
one on this list is you, sitting at your computer, with your data.  So
none of us can tell you what's best.  Here are some things to think
about.  Generally, assigning to a temporary variable is slower than
modifying a value directly (because it incurs the overhead of creating
the variable).  On the other hand, hashes take up a lot of memory, and
modifying large hashes can be time-consuming.  So it would be
reasonable to expect that for a "small" data set, modifying the hash
directly would be faster, but that on a "large"--in terms of
bytes--data set assigning to a temp variable might possibly be faster,
because it only modifies the hash once.  But only Benchmark will tell
you what large and small mean for your system in this situation, and
where the tradeoff happens for you.  Remember, too, that your results
will change depending on what the rest of your program is doing, and
the available system resources.  Each platform also has its own
quirks.
Look into the archives for some of the recent threads on Benchmarking,
and you'll see why the answer to your original question has to be no.
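
For anyone following along, here is a minimal sketch of how the two
approaches from the original question might be compared with the
Benchmark module. The sizes ($KEYS, $APPENDS, $CHUNK) and the sub names
are assumptions made up for illustration, loosely based on the numbers
in the question; substitute your own data before trusting any result.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed workload, loosely taken from the question; adjust to your data.
my $KEYS    = 2_000;       # "several thousand" keys
my $APPENDS = 200;         # "hundreds" of additions per value
my $CHUNK   = 'x' x 10;    # "about 10 bytes" per addition

# Build each value in a temporary scalar, then store it in the hash once.
sub via_temp {
    my %h;
    for my $k (1 .. $KEYS) {
        my $buf = '';
        $buf .= $CHUNK for 1 .. $APPENDS;
        $h{$k} = $buf;
    }
    return \%h;
}

# Append straight onto the hash value, no other variable involved.
sub via_hash {
    my %h;
    for my $k (1 .. $KEYS) {
        $h{$k} .= $CHUNK for 1 .. $APPENDS;
    }
    return \%h;
}

# A negative count asks Benchmark to run each sub for at least that
# many CPU seconds, then prints a comparison table.
cmpthese(-1, {
    temp_scalar => \&via_temp,
    direct_hash => \&via_hash,
});
```

Try it with several values of $KEYS and $APPENDS, since the point made
above is that the ranking may change with the size of the data set.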

Brilliant, thank you for that informative reply. It's what I was trying to allow for in my original question, that there are vagaries involved which can affect the outcome. It's not the answer I wanted, but it IS an answer. Thanks.

As an extension to my question, what about when repeatedly adding to a data set that needs to be written to a file? Will it be quicker to write each line directly to the file, or repeatedly add to a variable then write that to the file in one hit?

My guess is that this will have a more definitive answer, since the speed difference between writing to a variable and writing to a file should make the outcome more obvious, and indeed my experience indicates that writing to a file is measurably slower. But does anyone have any in-depth knowledge of these processes?
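
The file question can be measured the same way. Below is a rough
sketch, assuming a made-up line count and line content ($LINES, $LINE)
and hypothetical sub names, that writes to a temporary file. Note that
Perl buffers output to files by default, so printing line by line does
not normally mean one system write per line, and the gap may be smaller
than expected.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Assumed workload; replace with something resembling your real output.
my $LINES = 50_000;
my $LINE  = "some line of output data\n";

# Print each line as it is produced.
sub write_each {
    my ($path) = @_;
    open my $fh, '>', $path or die "Cannot open $path: $!";
    print {$fh} $LINE for 1 .. $LINES;
    close $fh or die "Cannot close $path: $!";
    return;
}

# Accumulate everything in a scalar, then write it in one hit.
sub write_once {
    my ($path) = @_;
    my $buf = '';
    $buf .= $LINE for 1 .. $LINES;
    open my $fh, '>', $path or die "Cannot open $path: $!";
    print {$fh} $buf;
    close $fh or die "Cannot close $path: $!";
    return;
}

# Scratch file, removed automatically when the program exits.
my (undef, $path) = tempfile(UNLINK => 1);

cmpthese(-1, {
    per_line  => sub { write_each($path) },
    one_write => sub { write_once($path) },
});
```

The single-write version holds the whole output in memory, which is
worth keeping in mind if the data set is large.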


Ken  G i l l e t t

_/_/_/_/_/_/_/_/_/


