<snip>


> This is all fun, but what about the context?  Your original function
> took an integer, not a string, and thus wasn't charged with measuring
> string length, possibly multiple times.  Even so, each of these tests is
> taking around a microsecond.  So are you expecting to do anything with
> the result?  Just calling ljust() method more than doubled the time.  If
> you actually have some code to generate the string, and/or if you're
> going to take the result and write it to a file, then pretty soon this
> function is negligible time.
> 
> If it were my code, I think I'd use something like "      "[-sz%8:]
> and either prepend or append that to my string.  But if I had to do
> something more complex, I'd tune it to the way the string were being used.
> 

Yeah, maybe I got a little carried away. ;-) Knuth's 'premature
optimization is the root of all evil' comes to mind. Then again, it was
fun from an educational point of view.

The context: the code is part of a program that writes SPSS system files
(binary, .sav). These may hold anything from a few hundred to millions of
records. SPSS knows two types of data: character and numerical. The code
is only relevant for character data. If a 5M-record dataset has 8
character variables, that means 40M executions of the posted code
snippet, or, at about a microsecond each, around 40 seconds devoted to
padding. But in general there are fewer values. I'll use cProfile later
to see if there are more urgent pain spots. This seemed a good candidate
as the function is used so often. FWIW, here is the function, plus some
of the init:

    def __init__(self, *args, **kwargs):
        self.strRange = range(1, MAXLENGTHS['SPSS_MAX_LONGSTRING'][0] + 1)
        # Map each string width to the next multiple of 8 (ceiling division).
        self.pad_8_lookup = dict([(i, -8 * (i // -8)) for i in self.strRange])

    def writerow(self, record):
        """ This function writes one record, which is a Python list."""
        convertedRecord = []
        for value, varName in zip(record, self.varNames):
            charlen = self.varNamesTypes[varName]
            if charlen > 0:
                value = value.ljust( self.pad_8_lookup[charlen] )
            else:
                try:
                    value = float(value)
                except ValueError:
                    value = self.sysmis
            convertedRecord.append(value)
        caseOut = struct.pack(self.structFmt, *convertedRecord)
        retcode = self.spssio.spssWholeCaseOut(self.fh, caseOut)
        if retcodes.get(retcode) != "SPSS_OK":
            raise SPSSIOError("Problem writing row:\n%s" % " ".join(record),
                              retcode)
        return retcode
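
For anyone puzzled by the -8 * (i // -8) expression in the lookup: it
rounds a width up to the next multiple of 8, because floor division by a
negative number rounds towards minus infinity. Below is a minimal,
standalone sketch of just that step. The names pad_to_8, pad_value and
MAX_WIDTH are made up for illustration, and MAX_WIDTH = 64 is only a
small stand-in for MAXLENGTHS['SPSS_MAX_LONGSTRING'][0], not the real
SPSS limit.

```python
def pad_to_8(width):
    """Round width up to the nearest multiple of 8 (hypothetical helper)."""
    # Floor division by -8 rounds towards minus infinity, so negating the
    # result gives ceiling division by 8.
    return -8 * (width // -8)

# Assumed stand-in for MAXLENGTHS['SPSS_MAX_LONGSTRING'][0].
MAX_WIDTH = 64

# Precompute the padded width once per possible string width.
pad_8_lookup = dict((i, pad_to_8(i)) for i in range(1, MAX_WIDTH + 1))

def pad_value(value, charlen):
    """Left-justify value to the padded width for a declared charlen."""
    return value.ljust(pad_8_lookup[charlen])

padded_widths = [pad_to_8(n) for n in (1, 7, 8, 9, 16, 17)]
print(padded_widths)  # [8, 8, 8, 16, 16, 24]
```

Whether the dict lookup really beats computing the width on the fly is
something the profiler should decide; the sketch only shows that both
give the same widths.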
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
