On 11/8/05, Frank Bax <[EMAIL PROTECTED]> wrote:
> I have script that takes a very long time to run - hours, sometimes even
> days (even on a P4 2.8GHz machine).  After loading some data from a database
> at the beginning (less than a second), the script does no i/o until results
> are output at the end of script.  I'd like to know how the script is
> progressing through its data, so I added some code to update the database
> at regular data intervals, but this has some problems:
>         - database is remote, so script goes much slower with these status 
> updates.
>         - updates are based on data instead of clock.
>
> I read something about threads in perl and was wondering if these status
> updates should be coded inside a thread so they have less impact on overall
> script performance.  The number crunching could still go on while database
> update happens in separate thread.
>
> To get updates based on clock instead of data...  Are there tools within
> perl for using clock/timer information?  Do I have to parse clock/timer
> info myself to make something happen every hour inside an existing loop?
>
> I realise that my subject line might suggest use of cron, but this is not
> workable unless there is some way for two scripts to communicate with each
> other.  If this could work, the processing script would probably still need
> a thread to do communication with timer script anyway.
>
> Frank

Frank,

See perldoc -f alarm and perlipc for details, but the normal idiom for
this sort of thing is to trap SIGALRM. A little pseudocode:

my $timeout = 3600; # 1 hour

while ( @your_data ) {
    eval {
        local $SIG{ALRM} = sub {
            # do something to save state
            die "alarm\n";  # NB: the "\n" is required for the eq test below
        };

        alarm $timeout;

        while ( @your_data ) {
            my $data = shift @your_data;
            # process your data, calling a subroutine
            # for the heavy lifting makes sense
        }
        alarm 0; # cancel the alarm once the data is exhausted
    };

    if ($@ && $@ eq "alarm\n" ) {
        # timed out
        your_log_sub();  # perform your logging here
        recover_state();
            # e.g. push interrupted procedure back onto stack
    } elsif ($@) {
        # do something about other errors
    }
}
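To see the idiom in action, here is a minimal, self-contained version you can run as-is. It substitutes a busy loop for your number crunching (perldoc warns against mixing sleep with alarm, since sleep may be implemented with alarm on some systems), and a short 1-second timeout so it finishes quickly; the names are placeholders, not anything from your script:

```perl
use strict;
use warnings;

my $timed_out = 0;

eval {
    local $SIG{ALRM} = sub { die "alarm\n" };  # NB: \n required
    alarm 1;                      # time out after 1 second

    # busy-wait stand-in for CPU-bound number crunching
    my $start = time;
    1 while time() - $start < 3;

    alarm 0;                      # not reached here; the alarm fires first
};

if ( $@ && $@ eq "alarm\n" ) {
    $timed_out = 1;               # this is where you'd log progress
} elsif ($@) {
    die $@;                       # propagate unexpected errors
}

print $timed_out ? "timed out\n" : "finished\n";
```

Running it prints "timed out" after about a second, confirming the handler interrupted the loop.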

As for how you do your logging, it's certainly simplest to just attach
to the database and log as you go along. It's hard to believe that the
time taken for a database connection every couple of hours would be
that inhibiting on a process that runs for days.

If it really is a big deal, though, your log subroutine can use fork()
to spawn a subprocess to log. See perldoc -f fork and perlipc for
details.
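A sketch of that approach, with the DB insert simulated by a print (the subroutine name and message are made up for illustration):

```perl
use strict;
use warnings;

# Fork a child to do the slow logging so the parent can keep crunching.
sub log_status {
    my ($msg) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        # child: this is where you'd connect to the remote database
        # and write the status row; here we just simulate it
        print STDERR "status: $msg\n";
        exit 0;
    }
    return $pid;    # parent returns immediately
}

my $pid = log_status("halfway done");

# Reap the child so it doesn't linger as a zombie; in a long-running
# script you might instead set $SIG{CHLD} = 'IGNORE' and never block.
waitpid( $pid, 0 );
```

Because the parent only pays for the fork itself, the cost of the remote database connection lands entirely in the child.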

Threading probably isn't what you want here, since creating a thread
will, among other things, copy your entire data structure at the
moment the thread is created. You could create a logging thread early
in your program, before you load your data set, but to my mind that
seems like overkill for simple logging in a non-daemon process.
perlthrtut is a good place to start learning about threads.
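For completeness, the early-logging-thread variant might look like the following, assuming a threads-enabled perl and the core Thread::Queue module (the messages and thread body are illustrative only):

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Create the logging thread *before* loading the big data set, so the
# new thread's copy of the interpreter stays small.
my $queue = Thread::Queue->new();

my $logger = threads->create(sub {
    my $count = 0;
    # dequeue() blocks until a message arrives; undef signals shutdown
    while ( defined( my $msg = $queue->dequeue() ) ) {
        # this is where a real DB insert would go
        print STDERR "log: $msg\n";
        $count++;
    }
    return $count;
});

# ... load data and crunch numbers; from the main thread, just:
$queue->enqueue("processed 1000 records");

$queue->enqueue(undef);         # tell the logger to shut down
my $logged = $logger->join();   # wait for it and collect its count
```

The main loop only pays for an enqueue; all the logging latency lives in the other thread.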

HTH

-- jay
--------------------------------------------------
This email and attachment(s): [  ] blogable; [ x ] ask first; [  ]
private and confidential

daggerquill [at] gmail [dot] com
http://www.tuaw.com  http://www.dpguru.com  http://www.engatiki.org

values of β will give rise to dom!