Sorry for the delay, real world work took me away ...

everything was global, ... how do you guys handle a modern structured
language



Don't worry, this is one of the hardest bad habits to break.
You are not alone. The easiest way is just to pass the data
from function to function in the function parameters. It's not
at all unusual for functions to have lots of parameters; "global"
programmers tend to panic when they have more than a couple,


yep !

but it's not at all bad to have 5 or 6 - more than that gets
unwieldy, I admit, and is usually time to start thinking about
classes and objects.
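
For instance, here's a rough sketch of the difference (the function
names and data are invented, just to show the idea):

def add_reading(total, value):
    # Parameter style: data goes in as arguments, the result comes
    # back out; nothing outside the function is touched.
    return total + value

def average(total, count):
    if count == 0:
        return 0.0
    return float(total) / count

# Global style would instead have each function do
# "global total; total = total + value" against shared state.

total = 0
for value in (3, 5, 7):
    total = add_reading(total, value)
print(average(total, 3))    # prints 5.0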



I have ended up with my application in several separate directories.



Separate modules are good. Separate directories for anything
other than big programs (say 20 or more files?) are more hassle
than they're worth. The files are better kept in a single directory,
IMHO. The exception being modules designed for reuse...
It just makes life simpler!


I've tried to be hyper-organized and added my dirs in
/usr/lib/python2.3/site-packages/mypath.pth:

/home/dave/mygg/gg1.3/live_datad
/home/dave/mygg/gg1.3/logger
/home/dave/mygg/gg1.3/utils
/home/dave/mygg/gg1.3/datacore
/home/dave/mygg/gg1.3
/home/dave/mygg/gg1.3/configs

This works OK but I sometimes have to search around a bit to find where the modules are.

Probably part of the problem is that I tend to write lots of small modules, debug them, and then import them into one controlling script. It works OK but I start to drown in files, e.g. my live_datad contains ...

exact_sleep.py garbage_collect.py gg ftsed.e3p html_strip.py live_datad.py valid_day.pyc
exact_sleep.pyc garbage_collect.pyc gg ftsed.e3s html_strip.pyc valid_day.py


When I get more experienced I will try & write fewer, bigger modules :-)



My problem is that pretty much all the modules need to fix where
they are when they exit and pick up from that point later on,



There are two "classic" approaches to this kind of problem:

1) batch oriented - each step of the process produces its own
output file or data structure and this gets picked up by the
next stage. Tis usually involved processing data in chunks
- writing the first dump after every 10th set of input say.
This is a very efficient way of processing large chuinks of
data and avoids any problems of synchronisation since the
output chunks form the self contained input to the next step.
And the input stage can run ahead of the processing or the
processing aghead of the input. This is classic mainframe
strategy, ideal for big volumes. BUT it introduces delays
in the end to end process time, its not instant.
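
A bare-bones sketch of that batch shape (the chunk size, file names
and the process() step are all invented for illustration, not taken
from your program):

CHUNK = 10   # dump an output file after every 10th input record

def process(record):
    # stand-in for the real per-record work
    return record.strip().upper()

def dump(lines, filename):
    # each dump is a self-contained input for the next stage
    f = open(filename, "w")
    f.write("\n".join(lines) + "\n")
    f.close()

def run_batch(records, prefix):
    chunk = []
    chunk_no = 0
    for record in records:
        chunk.append(process(record))
        if len(chunk) == CHUNK:
            dump(chunk, "%s_%03d.txt" % (prefix, chunk_no))
            chunk = []
            chunk_no = chunk_no + 1
    if chunk:                      # flush the last partial chunk
        dump(chunk, "%s_%03d.txt" % (prefix, chunk_no))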


I see your point - like a static chain, one link calling the next and passing data. The problem being that the links of the chain will need to remember their previous state when called again, so their output is a function of previous data plus fresh data. I guess their state could be written to a file, then re-read.
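
Something like this, maybe, using pickle (the file name and the
contents of the state dict are just made up):

import pickle

STATE_FILE = "stage_state.pkl"   # made-up name

def load_state():
    # pick up where we left off last time, or start fresh
    try:
        f = open(STATE_FILE, "rb")
        state = pickle.load(f)
        f.close()
    except IOError:
        state = {"last_record": 0, "running_total": 0}
    return state

def save_state(state):
    # remember where we got to, ready for the next run
    f = open(STATE_FILE, "wb")
    pickle.dump(state, f)
    f.close()

state = load_state()
state["last_record"] = state["last_record"] + 1
save_state(state)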

2) Real-time serial processing - typically constructs a
processing chain in a single process. It has a separate thread
reading the input data


Got that working in live_datad ...

and kicks off a separate processing
thread (or process) for each bit of data received. Each
thread then processes the data to completion and writes
the output.

OK

A third process or thread then assembles the
outputs into a single report.



Interesting ...

This produces results quickly but can overload the computer
if data starts to arrive so fast that the threads start to
back up on each other. Also, error handling is harder, since
with the batch job data errors can be fixed at the
intermediate files, but with this an error anywhere means
the whole data processing chain will be broken, with no way
to fix it other than resubmitting the initial data.
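
Here's a skeleton of that shape using the threading and queue modules
(one worker thread instead of one per item, and the reader and report
parts are dummies, just to show the plumbing):

import threading
import queue          # the module is called "Queue" in Python 2

work_q = queue.Queue()      # reader -> worker
result_q = queue.Queue()    # worker -> report writer
N_ITEMS = 5                 # dummy amount of input, for illustration

def reader():
    # stands in for the thread reading the live input data
    for i in range(N_ITEMS):
        work_q.put("record %d" % i)
    work_q.put(None)            # sentinel: no more work coming

def worker():
    # each item is processed to completion and its output queued
    while True:
        item = work_q.get()
        if item is None:
            break
        result_q.put(item.upper())

def report():
    # a third thread assembles the outputs into one report
    lines = [result_q.get() for _ in range(N_ITEMS)]
    print("\n".join(lines))

threads = [threading.Thread(target=f) for f in (reader, worker, report)]
for t in threads:
    t.start()
for t in threads:
    t.join()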



An interesting idea, I had not thought of this approach as an option even with its stated drawbacks. It's given me an idea for some scripting I have to do later on ...

With my code now running to a few hundred lines
(Don't laugh this is BIG for me :-D )



It's big for me in Python - I've only written one program with
more than a thousand lines of Python, whereas I've written
many C/C++ programs in excess of 10,000 lines



Boy am I glad I chose to learn Python rather than C++; I'd probably still be at 'hello world' ;-)


and worked
on several of more than a million lines. But few if any
Python programs get to those sizes.

HTH,

Alan G
Author of the Learn to Program web tutor
http://www.freenetpages.co.uk/hp/alan.gauld






