I have a question about optimization.
I am helping a local candidate with their database. It is a large
county election database which I have imported into a field within
Rev.
This is a voter database in which we would like to identify a
single address for all voters within a given household, so that we
do not have to send multiple letters to individual voters within the
same household. This makes a big difference in mailing costs.
I found that my original program runs prohibitively slowly,
but when I break the data up into smaller blocks, things run much
more rapidly. For example, I use the following code:
repeat with k = 0 to 8
   -- chunk k covers lines k*1000 + 1 through (k+1)*1000
   put line k*1000 + 1 to (k+1)*1000 of tField into temp
   put identifyUniqueAddresses(temp) into a[k]
end repeat
so that the data in the variable tField is broken up into 9
chunks of 1000 lines each. Later I reassemble the results from the
array, a[k].
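The reassembly itself is nothing fancy; something along these
lines (just a sketch, and tResult is only an illustrative name):

   put empty into tResult
   repeat with k = 0 to 8
      put a[k] & return after tResult
   end repeat
   delete last char of tResult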
If instead I try to run the whole field at once, using:

   put identifyUniqueAddresses(tField) into tResult

I would have to wait all day for the data in tField to process.
(I have not found a way to use the fast

   repeat for each line tLine in tField

form here, because I have to be able to discover whether *successive*
lines in the sorted data share the same address.)
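For concreteness, the handler amounts to something like the sketch
below. This is not my actual code; for illustration I am assuming
tab-delimited lines with the address in item 2:

   function identifyUniqueAddresses tData
      set the itemDelimiter to tab
      put empty into tResult
      put empty into tPrevAddress
      repeat with i = 1 to the number of lines of tData
         -- each "line i of tData" chunk lookup scans tData from the top
         put line i of tData into tLine
         if item 2 of tLine is not tPrevAddress then
            put tLine & return after tResult
            put item 2 of tLine into tPrevAddress
         end if
      end repeat
      return tResult
   end identifyUniqueAddresses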
Now I'm sure my handler, identifyUniqueAddresses, is not the most
efficient code, but my question is this: why does the handler run so
much more rapidly working on several smaller chunks, which are later
reassembled, than it does working on the whole field at once?
I suspect the problem may be the cost of successively pulling
lines of text out of a very long list of lines. Would it help if I
first put the lines into an array and then worked with the array?
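Something like this is what I have in mind (again only a sketch,
untested, with the same assumed field layout as above):

   put tField into tData
   put the number of lines of tData into tCount
   split tData by return  -- tData[1], tData[2], ... now hold one line each
   set the itemDelimiter to tab
   put empty into tResult
   put empty into tPrevAddress
   repeat with i = 1 to tCount
      -- tData[i] is a direct array lookup, not a scan through a long text
      if item 2 of tData[i] is not tPrevAddress then
         put tData[i] & return after tResult
         put item 2 of tData[i] into tPrevAddress
      end if
   end repeat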
Is there an optimizer out there? Gentlemen and gentle ladies,
start your engines.
--
Jim Hurley
