I'm using Solaris 2.8 and Perl v5.8.7 built for sun4-solaris-thread-multi.
I have a small script that is given a collection of email addresses
(anywhere from 1 to 800+) along with a message, and it sends the message
to each address in turn. If the address is a standard email address, the
script simply emails it. If it's an address hosted by my2way.com, the
script uses the LWP / HTTP modules to post the message to their web page.
When the program starts up, it spawns up to 25 worker threads to work
through the collection of addresses. On rare occasions a thread create
fails, and my program goes downhill from there. Not being a Perl expert,
I'm struggling to find out how to recover gracefully.
I could post the entire script if needed, but here are the key parts.
#!/apps/soc/bin/perl
use threads;
use threads::shared;

sub worker_thread ($)
{
    my $thread_number = $_[0];
    my $single_address;
    while (1)
    {
        {   # used for locking
            lock (@temp_address_list);
            # get the next address
            if (!defined ($single_address = shift @temp_address_list))
            {
                # print LOGFILE format_log_header () . "Thread: $thread_number NULL addresses left - exiting: $thread_number\n";
                return;   # terminate thread
            }
        }   # unlock here
        if (!send_page ($single_address, $thread_number))
        {
            return;   # terminate this thread if it errors
        }
    }
}
## Now start a loop of each address to send to.
##
if ($address_count > $max_thread_count) {
    $threads_started = $max_thread_count;
}
else {
    $threads_started = $address_count;
}
for ($loop = 0; $loop < $threads_started; $loop++)
{
    print LOGFILE format_log_header () . "Thread Starting: $loop\n";
    print format_log_header () . "Thread Starting: $loop\n";
    $threads_list[$loop] = threads->new(\&worker_thread, $loop);
}
for ($loop = 0; $loop < $threads_started; $loop++)
{
    $threads_list[$loop]->join;
    print "Thread completed\n";
    print LOGFILE format_log_header () . "Reaping thread: $loop\n";
}
The problem is that I occasionally get this error:
FATAL: Callback called exit at /apps/soc/paging/send_page.pl line 447, <STDIN> line 15.
where line 447 is the threads->new statement.
After this error, I may also get this error further into the script:
FATAL: END failed--call queue aborted at /apps/soc/paging/send_page.pl line 11
which points to the header comments of the script.
My problem is probably as simple as not checking whether the thread create
fails, but what is the best way to do this? Also, any ideas on how to
reduce or eliminate the failure in the first place? It happens in maybe
only 1 in 500 executions.
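To make the question concrete, here is a minimal sketch of the kind of check
I have in mind. It assumes that a failed threads->new either dies (so eval
can trap it) or returns undef. I have not confirmed that this covers the
"Callback called exit" case. The start_worker helper, the placeholder
addresses and the $sent counter are made up for illustration, and the worker
only counts addresses instead of calling send_page.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;

# Placeholder data standing in for the real address list.
my @temp_address_list : shared;
@temp_address_list = map { "user$_\@example.com" } 1 .. 10;
my $sent : shared = 0;

# Stand-in worker: takes addresses off the shared list and counts them.
sub worker_thread {
    my ($thread_number) = @_;
    while (1) {
        my $single_address;
        {
            lock(@temp_address_list);
            $single_address = shift @temp_address_list;
        }
        return unless defined $single_address;
        { lock($sent); $sent++; }
    }
}

# Hypothetical helper: returns the thread object, or nothing if the
# create died or returned undef (both are assumptions about failure modes).
sub start_worker {
    my ($thread_number) = @_;
    my $thr = eval { threads->new(\&worker_thread, $thread_number) };
    if ($@ || !defined $thr) {
        warn "Thread $thread_number failed to start: " . ($@ || "unknown error\n");
        return;
    }
    return $thr;
}

# Keep only the threads that actually started.
my @threads_list = grep { defined } map { start_worker($_) } 0 .. 3;

# If no thread could be created, drain the list in the main thread instead.
worker_thread(-1) unless @threads_list;

# Join only what was really created, so a failed create is never joined.
$_->join for @threads_list;
print "Processed $sent addresses with " . scalar(@threads_list) . " threads\n";
```

Since every worker pulls from the same shared list, running with fewer
threads than planned should still get through all the addresses, only more
slowly. What I can't tell is whether eval is enough to trap the failure I am
seeing.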
Thanks for any help. Again, if people need the full script, I can easily
post it.
Brian