Jerry,
Jerry D. Hedden wrote:
(``$SIG{CHLD} = \&REAPER'' is an interprocess construct put forth in
perlipc, not an interthread construct. Not sure what effect you intend it
to have.)
I usually add this to take into account the situation where a thread
dies without a proper "return". It's not necessary here.
When a thread dies, it does not cause a SIGCHLD to be generated. That
only occurs with forked processes. Therefore, if you are not using
'fork', you do not need a $SIG{CHLD} handler.
Well, I never really understood why this was. But, after doing some
additional research, I did find it out:
The program I am writing starts 8 ssh-sessions running in parallel, each
reading the output of a "tail -f some-logfile" on one of 8 different servers.
When one of these threads died, it produced the "SIGCHLD" in the main
thread. But it turns out that this is because these subthreads actually
open an ssh-session to a server, using the Net::Telnet module.
And THAT module actually does a 'spawn' of an external ssh-process,
using a fork.
OK. Mystery solved. ;-)
But now I have another problem. I get some strange behaviour, so I'm not
really sure whether I am doing something wrong or not.
So I have 8 threads that each open an ssh-session to a server and get
log-information from a log-file by doing a "tail -f ...".
That information then needs to be processed and aggregated, but it also
needs to be used by several different applications (each running as a
thread inside the same program).
So, I wrote a small module called the "message-replicator",
which -as the name implies- receives information from one source (in
this case, a message-queue containing the log-file data from the 8
ssh-threads) and "distributes" (i.e. replicates) that info to each
"listener" thread via an "outbound" message-queue per listener.
The message-replicator works dynamically, so it can be used to service
listeners which appear and disappear.
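To make the idea concrete, here is a minimal sketch of the fan-out step only. It is not the actual module: the names $inbound, %outbound, "monitor", "archive" and replicate() are made up for illustration, the set of listeners is fixed instead of dynamic, and an undef message is assumed as the end marker.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical names throughout; this is not the module from this post.
my $inbound  = Thread::Queue->new();    # would be fed by the ssh-threads
my %outbound = (                        # one outbound queue per listener
    monitor => Thread::Queue->new(),
    archive => Thread::Queue->new(),
);

# Copy every inbound message to every outbound queue.
# An undef message is the (assumed) end marker and is passed on as well.
sub replicate {
    while (defined(my $msg = $inbound->dequeue())) {
        $_->enqueue($msg) for values %outbound;
    }
    $_->enqueue(undef) for values %outbound;
    return;
}

my $replicator = threads->create(\&replicate);

# Stand-in for the log lines coming from the servers.
$inbound->enqueue("line from server $_") for 1 .. 3;
$inbound->enqueue(undef);
$replicator->join();

# A listener simply drains its own queue until the end marker.
my @seen;
while (defined(my $msg = $outbound{monitor}->dequeue())) {
    push @seen, $msg;
}
print scalar(@seen), " messages received by the monitor listener\n";
```

Each listener only ever touches its own queue, so a slow listener does not hold up the others; the price is that an abandoned queue keeps growing.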
One example is a tcp-server running a tcp-port which can be used to
'monitor' changes in the log-files in real-time by just doing a telnet
to the monitoring-server at tcp-port 7778.
(handy for day-to-day trouble-shooting if you have a problem somewhere).
The creation of outgoing-queues in the message-replicator
("subscription") is done via a message-queue (the "command-queue"). A
listener just sends a message to that queue containing the name of the
message-queue which is to be used to send data from the replicator to
the listener-thread.
The listener then sends a "HUP" signal to the replicator to notify it
that there is something waiting for it in the command-queue.
(The reason I do it this way is that the replicator is not able
to listen to more than one queue at the same time without the use of
"non-blocking" reads, sleeps, etc.)
In any case, it works great. I have a thread that opens up tcp-port 7778
on my machine and, when you connect to that port, fires up a
sub-thread that
- subscribes to the message-replicator
- spits out everything it receives from the message-replicator (which is
a copy of what the replicator receives, i.e. what the 8 ssh-sessions get
from the "tail -f <logfile>" on the servers).
All works well.
But, my problem is this.
- when the "listening-process" dies (e.g. because somebody shuts down
the telnet-session to port 7778) the thread that services that
tcp-connection dies.
But the "replicator" does not know this, so it will continue sending
data into the queue which was used by this particular listener, and
those messages will never get dequeued.
So, no problem, I think, I'll just add a small piece of code in the
replicator which checks from time to time whether the listener-threads
are still running and, if they are not, deletes the queues that were
used to communicate with that listener.
When subscribing, the listener-thread just needs to send its own
thread-id to the replicator and that's it.
But that's where the problems start.
First I try this
(piece of code of the listener-thread to subscribe to a replicator):
$Qs{$replicatorcontrolqueuename}->enqueue("STARTNEW",$queuename,threads->self());
$$repicatorid->kill('HUP');
} else {
...
(%Qs is a shared hash holding all queues)
But, when I try this, I get this error:
"Thread 7 terminated abnormally: Invalid value for shared scalar at
/usr/local/lib/perl5/5.8.8/Thread/Queue.pm line 90."
(Note that the error is raised in the thread-library itself, not in my program!)
So, I then try to send the thread object as a reference:
$mythread = threads->self();
$Qs{$replicatorcontrolqueuename}->enqueue("STARTNEW",$queuename,\$mythread);
Same result!
Next step:
I try to send the thread using its tid.
$Qs{$replicatorcontrolqueuename}->enqueue("STARTNEW",$queuename,threads->self->tid());
This passes OK, but ... I then have a problem in the "replicator".
When I do a
$thread = threads->object($tid);
if ($thread->is_running()) {
...
I get this error:
Thread 2 terminated abnormally: Can't call method "is_running" on an
undefined value at (eval 11) line 19.
It turns out that threads->object($tid) is always undefined,
even though $tid is a legal value.
Now, the fact is that this thread with tid 8 is not a subthread of the
replicator but of a completely different thread.
Is it possible that you can only use threads->object($tid) for your own
sub-threads?
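For reference while debugging this: the threads documentation states that threads->object($tid) returns undef when no thread has that TID, and also when the thread has already been joined or detached. Whether that is what happens in my program I can't say, but either way the result needs to be checked before a method is called on it. A minimal sketch of such a guard, where listener_alive() is a made-up helper name and the worker thread is only a stand-in for a listener:

```perl
use strict;
use warnings;
use threads;

# Hypothetical helper: return 1 if the thread with this tid is still
# running, 0 otherwise. threads->object() can return undef (unknown,
# joined or detached thread), so check it before calling a method on it.
sub listener_alive {
    my ($tid) = @_;
    my $thr = threads->object($tid);
    return 0 unless defined $thr;
    return $thr->is_running() ? 1 : 0;
}

# Stand-in for a listener-thread that lives for about one second.
my $worker = threads->create(sub { sleep 1; return; });
my $tid    = $worker->tid();

my $while_running = listener_alive($tid);   # thread is still sleeping
$worker->join();
my $after_join    = listener_alive($tid);   # object() now returns undef

print "while running: $while_running, after join: $after_join\n";
```

With a guard like this the replicator would treat an undef result the same as "listener gone" instead of dying with the "undefined value" error.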
I know these are a lot of questions at the same time, but I'm kind of
stuck here. I don't know if this is "normal behaviour" or a bug
(especially the error in "Thread/Queue.pm").
Any help appreciated!
Some additional info:
This runs on a then3.
The perl is perl 5.8.8, which I compiled from source myself (as then3
only has rpm's for perl 5.8.0).
The thread-library is threads-1.37 from cpan.
Cheerio! Kr. Bonne.