On Wed, 16 Feb 2005 08:13:02 +0000, Ken Gillett <[EMAIL PROTECTED]> wrote:
> As an extension to my question, what about when repeatedly adding to a
> data set that needs to be written to a file? Will it be quicker to
> write each line directly to the file, or repeatedly add to a variable
> then write that to the file in one hit?
>
> My guess is that this will have a more definitive answer since the
> speed difference between writing to a variable and writing to a file
> will make it a more obvious outcome, and indeed my experience indicates
> that writing to a file is measurably slower. But does anyone have any
> in-depth knowledge of these processes?

Ken,

Again, it will really depend.

How "big" is the hit going to be? Big enough that storing the data in
memory will eat up your RAM and force you to swap? If you're dealing
with thousands of lines, you may not want to store them all in memory,
especially if you're like me, running a database server on a PII/133
with 16M of RAM. On the other hand, if you're on a brand new P4 with 2G
of DDR2, who really cares? Of course, if your data set is multiple
terabytes, even 2G isn't going to be enough.

What does your I/O look like? Does your system support buffered disk
writes? If so, passing the I/O off to the kernel's buffer won't cost
you much on any current desktop drive, and 15Krpm SCSI drives in
high-end servers live for this. On the other hand, are you performing
unbuffered I/O on an old 2400rpm disk? Then you might want to think a
little. Does it even matter at what point you write to a 56K dialup FTP
connection?

Next question: what else is running on the machine? Are other processes
eating up RAM and forcing you to swap? Then write. Is some long-running
find or A/V operation flooding the IDE or SCSI bus? Are you writing to
a network drive that's responding slowly? Write out a chunk.

But the truth is that unless you're running a server dedicated to just
this one Perl script, the environment is going to change from minute to
minute. Tomorrow, the state may reverse itself.
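For your particular machine, a quick comparison with the core Benchmark
module is worth more than any rule of thumb. Here's a minimal sketch of the
two strategies you asked about (the temp file names and the 10,000-line data
set are arbitrary assumptions, not anything from your setup):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Some throwaway data to write out.
my @lines = map { "record $_\n" } 1 .. 10_000;

timethese( 50, {
    # Strategy 1: one print call per line, straight to the file.
    per_line => sub {
        open my $fh, '>', 'per_line.tmp' or die "open: $!";
        print {$fh} $_ for @lines;
        close $fh or die "close: $!";
    },
    # Strategy 2: accumulate everything in a scalar, write in one hit.
    one_shot => sub {
        open my $fh, '>', 'one_shot.tmp' or die "open: $!";
        my $buf = join '', @lines;
        print {$fh} $buf;
        close $fh or die "close: $!";
    },
});

# Sanity check: both strategies produce byte-identical output.
my $slurp = sub { local $/; open my $fh, '<', shift or die $!; <$fh> };
die "outputs differ" unless $slurp->('per_line.tmp') eq $slurp->('one_shot.tmp');
unlink 'per_line.tmp', 'one_shot.tmp';
```

Run it a few times, under a realistic load, and trust the numbers you get
over anything anyone on this list tells you.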
If you know some things about the quirks of your system, by all means,
code for them. But on the whole, your best bet is to code in a way
that's readable and makes sense to you and your maintenance
programmers. Your time is far more valuable than the few milliseconds
of CPU or I/O you'll save in most cases.

If you find yourself staring at the screen thinking "why is this taking
so long?", then that's the time to look at your I/O and memory usage
and think about what you can do differently in that case. Then you can
benchmark a couple of methods and see what's going on. But there still
isn't going to be a hard and fast rule that this is always faster than
that, or that some operations are always to be avoided.

You know two things about any piece of data your program is currently
processing: it's in memory now, and it needs to get written to a device
(screen, disk, socket) or discarded eventually. What steps make the
most sense to get any particular bit to any particular final
destination will depend heavily on circumstance.

HTH,

--jay

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>