Hi all, I haven't written in a while, but I've come up against something very 
unusual.
We recently switched over to MICO 2.3.13 with some home-grown multi-threaded 
queue handlers.  Everything was working grand until one service hung.  We 
killed it and everything worked fine again for a week, then the same thing 
happened again.  After that, some other services started experiencing the same 
problem.
As a quick fix, we put in a 30-minute timer: if no CORBA functions are called 
in that time, the service kills itself.  This works fine because we use the 
IMR, so when a client asks MICO where the service is, the MICO daemon 
automatically starts the service up again.  This fixed one of the services and 
we haven't had any problems with it since.
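
For anyone wanting to try the same workaround, here is a minimal sketch of such 
an idle timer.  It is not MICO code: the class name IdleWatchdog and its 
methods are hypothetical, and it assumes a C++11 compiler and that every 
servant method (or some request hook) can call touch().

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdlib>
#include <thread>

// Hypothetical helper (not part of MICO): remembers when the last request
// arrived and reports whether the service has been idle for too long.
class IdleWatchdog {
public:
    using Clock = std::chrono::steady_clock;

    explicit IdleWatchdog(Clock::duration timeout)
        : timeout_(timeout),
          last_(Clock::now().time_since_epoch().count()) {
        assert(timeout > Clock::duration::zero());
    }

    // Call this at the start of every CORBA method implementation.
    void touch() { last_.store(Clock::now().time_since_epoch().count()); }

    // True once at least `timeout` has passed since the last touch().
    bool expired(Clock::time_point now = Clock::now()) const {
        Clock::time_point last{Clock::duration{last_.load()}};
        return now - last >= timeout_;
    }

    // Polls until the service has been idle too long, then terminates the
    // process so that the IMR can start a fresh copy on the next request.
    void run(Clock::duration poll_interval) const {
        while (!expired()) {
            std::this_thread::sleep_for(poll_interval);
        }
        std::_Exit(EXIT_FAILURE);
    }

private:
    Clock::duration timeout_;
    std::atomic<Clock::rep> last_;
};
```

One way to use it would be to create the watchdog with 
std::chrono::minutes(30) and call run() from a detached std::thread.  The 
sketch uses std::_Exit rather than exit() on the assumption that a hung thread 
might be holding a lock that normal shutdown code would wait on.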
However, the original service still gets stuck, so after much debugging and 
rooting around in the MICO code, we found an unusual situation.  The service 
receives the CORBA header telling it how long the incoming message is (the 
header is 12 bytes, and the incoming message is approximately 8300 bytes).  
The socket is read numerous times to get the entire 8300 bytes, but after 
about 5200 bytes, the reading stops.  This wouldn't be so bad, but all the 
threads in the service stop receiving messages (none of them respond to any 
new incoming messages).  The home-grown queue thread is still running and 
watching for new messages, but none come in.
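
To illustrate the kind of read loop involved, here is a rough sketch of reading 
a fixed number of bytes from a socket with a per-chunk timeout, so that a peer 
that goes silent partway through a message cannot block the reader forever.  
This is not how MICO reads its transport; read_exact is a hypothetical name, 
and the sketch assumes a POSIX system with poll().

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <poll.h>
#include <unistd.h>

// Hypothetical helper: tries to read exactly `len` bytes from `fd`, waiting
// at most `timeout_ms` milliseconds for each chunk to arrive.
// Returns the number of bytes actually read.  A result smaller than `len`
// means the peer stalled (timeout), closed the connection, or an error hit.
std::size_t read_exact(int fd, void* buf, std::size_t len, int timeout_ms) {
    assert(fd >= 0 && buf != nullptr);
    char* out = static_cast<char*>(buf);
    std::size_t got = 0;

    while (got < len) {
        struct pollfd pfd;
        pfd.fd = fd;
        pfd.events = POLLIN;
        pfd.revents = 0;

        int ready = poll(&pfd, 1, timeout_ms);
        if (ready < 0 && errno == EINTR) continue;  // interrupted, retry
        if (ready <= 0) break;                      // timeout or poll error

        ssize_t n = read(fd, out + got, len - got);
        if (n < 0 && (errno == EINTR || errno == EAGAIN)) continue;
        if (n <= 0) break;                          // EOF or read error

        got += static_cast<std::size_t>(n);
    }
    return got;
}
```

With a loop like this, a message that stops at 5200 of 8300 bytes would come 
back as a short read after the timeout, and the caller could close the 
connection instead of waiting indefinitely.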
The strange part is that if I use "lsof" and compare its file descriptor with 
the one shown in the debugger, I see that the socket connection is 
"ESTABLISHED".  But if I go to the other machine (the one sending the message), 
I don't see any socket connection: no FIN_WAIT, no CLOSED, nothing!  So the 
service will never receive the last 3100 bytes.
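
A connection that one side sees as ESTABLISHED while the other side has no 
record of it is the sort of thing TCP keepalive is meant to detect.  As a 
hedged sketch only: assuming the service can get at the raw socket descriptor 
(whether and how MICO exposes it is not something I have checked), the standard 
SO_KEEPALIVE option could be switched on like this.  enable_keepalive is a 
hypothetical name.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/socket.h>

// Hypothetical helper: asks the kernel to probe an idle TCP connection so
// that a vanished peer eventually surfaces as an error on the socket.
// Returns true if the option was set successfully.
bool enable_keepalive(int fd) {
    assert(fd >= 0);
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == 0;
}
```

The probe timing is a system-wide setting on most platforms and the default is 
often on the order of hours, so this alone may not help without also tuning the 
OS keepalive parameters.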
I've tried duplicating this in a test environment and I've hammered the service 
with no problems.  The service resides on HP-UX 11i and the client normally 
resides on an SGI.  I have done all my testing with the client on HP-UX 11i and 
Linux, but never SGI (that's coming next).  Does anybody know of inherent 
socket issues with SGI, or between HP-UX 11i and SGI?
I can give specific debug info and lines of code if anyone is interested.  Any 
help is appreciated!
Thanks,
Mark
_______________________________________________
Mico-devel mailing list
Mico-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mico-devel
