I wrote a Perl script today to analyze Folding@home logs.  Then I
wrote the exact same script in Python.  The scripts are attached.
Here are my observations.

Perl Pluses

        Parsing the input was easier in Perl.  I used the
        exact same regular expressions in both, but the Perl
        code around them seemed a lot simpler.

        Autovivification (hashes and arrays created automatically
        on first use) made things easier.

        Automatic string<->int conversions saved time and thought.

        Worked everywhere.  Some machines still have Perl 5.003,
        but I didn't use any new language features.  The Python
        version required Python 2.2, which was installed on only
        a couple of machines.
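
To make the autovivification and conversion points concrete, here is
a rough sketch of what Python asks for explicitly.  It is not taken
from either script: the sample data and the group_by_host helper are
made up for illustration.

```python
# Hypothetical data: (host, seconds-as-text) pairs, as they might come
# out of a regular expression match.  Not from the real logs.
samples = [("alpha", "754"), ("beta", "802"), ("alpha", "761")]

def group_by_host(pairs):
    """Collect frame times per host, converting text to int by hand."""
    by_host = {}
    for host, seconds in pairs:
        # Perl would create the inner array on first use
        # (autovivification); in Python the empty list has to be
        # supplied, here via setdefault.  Perl would also treat "754"
        # as a number in arithmetic; Python needs the explicit int().
        by_host.setdefault(host, []).append(int(seconds))
    return by_host

times = group_by_host(samples)
```

In Perl the loop body would be roughly
push @{$by_host{$host}}, $seconds; with no setup and no conversion.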

Python Positives

        The StatSample class just happened.  It seemed easier, so I
        wrote a class.  I didn't want to go back and do the same in
        Perl because class declarations are so messy there.

        The Python is a lot easier to read.  Perl's punctuation
        clutters the text and makes it hard to read.

        First time I've ever used the print statement in a nontrivial
        way.  Worked pretty well in this case.  Wouldn't be as clean
        if I didn't want whitespace around each calculated value,
        since print puts a space between comma-separated items.
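
For output where the values should not be surrounded by spaces, the
% operator (already used in time_format) is one alternative.  A
minimal sketch; summary_line and its sample values are hypothetical
and not part of the attached script.

```python
def summary_line(count, fastest, slowest):
    """Build one report line with exact control over spacing."""
    # Unlike print with commas, nothing is inserted between the pieces,
    # so the punctuation can sit right next to the values.
    return "%d frames (fastest=%s, slowest=%s)" % (count, fastest, slowest)

line = summary_line(12, "12:34", "15:02")
```

The result can then be handed to print as a single item.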

I'm still new to Python.  If you can see ways to write it better,
I'd appreciate the criticism.  If you can see ways to write the
Perl better, that's cool too.

-- 
Bob Miller                              K<bob>
kbobsoft software consulting
http://kbobsoft.com                     [EMAIL PROTECTED]

Attachment: folding-stats.pl
Description: Perl program

#!/usr/local/bin/python

import fileinput
import math
import re

def time_format(interval):
    "Format a number of seconds as h:mm:ss, or m:ss when under an hour."
    min, sec = divmod(int(interval), 60)
    hr, min = divmod(min, 60)
    if hr:
        return "%d:%02d:%02d" % (hr, min, sec)
    else:
        return "%d:%02d" % (min, sec)

class StatSample:

    "List of numbers that can calculate simple statistics."

    def __init__(self):
        self.__data = []

    def __len__(self):
        return len(self.__data)

    def __getitem__(self, index):
        return self.__data[index]

    def __setitem__(self, index, datum):
        self.__data[index] = datum

    def append(self, datum):
        return self.__data.append(datum)

    def minimum(self):
        min = None
        for d in self:
            if min is None or min > d:
                min = d
        return min

    def maximum(self):
        max = None
        for d in self:
            if max is None or max < d:
                max = d
        return max

    def mean(self):
        if len(self) == 0:
            raise ValueError("mean of an empty sample")
        sum = 0.0    # float, so the division below doesn't truncate
        for d in self:
            sum += d
        return sum / len(self)

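    def variance(self):
        # Sample variance (n - 1 in the denominator), which needs at
        # least two data points.
        if len(self) < 2:
            return 0
        sum, sum2 = 0.0, 0.0    # floats, so the divisions don't truncate
        for d in self:
            sum += d
            sum2 += d * d
        return (sum2 - (sum * sum) / len(self)) / (len(self) - 1)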

    def std_deviation(self):
        return math.sqrt(self.variance())


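# Read the file.  The time for a frame is taken to be the gap between
# the timestamp on its 'Finished a frame' line and the one before it.

ftimes = StatSample()
prev, cur = None, None      # previous and current timestamps, in seconds
for line in fileinput.input():
    timestamp = re.match(r'\[(\d+):(\d+):(\d+)\]', line)
    if timestamp:
        hr, min, sec = [int(n) for n in timestamp.groups()]
        prev, cur = cur, (hr * 60 + min) * 60 + sec
        if re.search('Finished a frame', line) and prev is not None:
            # Timestamps carry no date, so wrap around midnight.
            ftimes.append((cur - prev + 86400) % 86400)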


# Calculate and print statistics.

print len(ftimes), "frames"
if len(ftimes):     # the statistics are undefined with no frames
    print "fastest:", time_format(ftimes.minimum()),
    print " slowest:", time_format(ftimes.maximum())
    print "mean:", time_format(ftimes.mean())
    print "standard deviation:", time_format(ftimes.std_deviation())
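
The modulo-86400 step in the read loop is the least obvious part of
the script.  The log timestamps carry no date, so a frame that
finishes just after midnight would otherwise come out with a negative
time.  A small worked example, using made-up timestamps and
hypothetical helper names (to_seconds, elapsed):

```python
DAY = 24 * 60 * 60  # 86400 seconds

def to_seconds(hr, min, sec):
    """Seconds since midnight for a [hh:mm:ss] timestamp."""
    return (hr * 60 + min) * 60 + sec

def elapsed(start, end):
    """Seconds from start to end, assuming under a day between them."""
    # Adding a full day before taking the remainder keeps the result
    # non-negative when end has wrapped past midnight.
    return (end - start + DAY) % DAY

same_day = elapsed(to_seconds(10, 0, 0), to_seconds(10, 12, 30))
over_midnight = elapsed(to_seconds(23, 55, 0), to_seconds(0, 7, 30))
```

Both come out as 750 seconds (12:30).  A gap of a day or more would be
silently folded into the same 0 to 86399 range, which seems acceptable
for frame times.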
