Hello,

We posted some messages about our "vanishing bridge" (ARISTOTE) some time ago. We have exactly the same issue with Ubuntu 10.04 and found nothing in the logs either.
Regards,
Ph d'Anfray

-------- Original Message --------
From: ag-tech-boun...@lists.mcs.anl.gov on behalf of John I. Quebedeaux, Jr
List-Post: accessgrid-tech@lists.sourceforge.net
Date: Wed, 09 Jun 2010 17:48
To: Matthew Leszczenski; ag-t...@lists.mcs.anl.gov
Subject: Re: [AG-TECH] Bridge Registry Timeout

Ahhhhh... I can at least report I'm seeing the same with the LSU bridge (FC 12, 3.2beta). It started happening after we upgraded to FC12 and 3.2beta. I haven't been able to identify what's different; I've been comparing logs between the old and new...

-John

From: Matthew Leszczenski <mxl9...@rit.edu>
List-Post: accessgrid-tech@lists.sourceforge.net
Date: Wed, 09 Jun 2010 11:44:04 -0400
To: <ag-t...@lists.mcs.anl.gov>
Subject: [AG-TECH] Bridge Registry Timeout

Hello all,

My apologies if this has been covered by other people in the past; I have spent considerable time searching the archives for instructions on how to fix this issue (or even exactly where it stems from) without success.

Here at RIT I have been working on setting up a Unicast Bridge, but I have run into a snag. The bridge itself is up and working fine: two of our own nodes are consistently connected through it whenever they are up, so it already works as a bridge.

Our problem is that the bridge shows up in the registry list as an option for the nodes for only about 5 minutes. After those 5 minutes it disappears from the registry list if a registry purge is used, and a node that logs into AG after the registry timeout does not see it at all. If a node picked up the bridge during those 5 minutes and its list has not been purged, that node can still connect to and disconnect from the bridge server without a problem, so the bridge is still up and working.

For details, I am running Fedora 12 and using the Bridge Python script that is installed with AG3.2 (it has a created date of 2005/12/06, in case it has been updated since).
When running the script, I use the following command:

    ./Bridge -n "RIT Brooklyn" -l RIT

I have been watching the log file that I direct all the output to, and near the beginning I found an interesting entry, but it appears only when there are no clients connected:

    reached inactivity timeout and have no clients; exiting
    Traceback (most recent call last):
      File "/usr/lib/python2.6/site-packages/AccessGrid3/AccessGrid/AGXMLRPCServer.py", line 63, in run
        self.handle_request()
      File "/usr/lib/python2.6/SocketServer.py", line 262, in handle_request
        fd_sets = select.select([self], [], [], timeout)
    error: (4, 'Interrupted system call')

However, when there are clients connected, every so often it just prints out the connection information as follows:

    max_unicast_mem is 32
    myhostname=brooklyn
    myhostipaddress=129.21.x.x
    using multicast
    ucport [data]=51390
    ucport [rtcp]=51391
    mcport [data]=56384
    mcport [rtcp]=56385
    making multicast port [0]
    making multicast port [1]
    No bridge.acl file found, no ACL set

If anyone has information that could help me track down where this problem is, it would be a great help.

Thank you in advance,
Matthew Leszczenski
- Collaborations Technology Specialist @ RIT Research Computing Department
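
A note on the traceback quoted above: errno 4 is EINTR, which is raised when a signal arrives while the process is blocked in select(). In Python 2 that surfaces as select.error and, if nothing catches it, ends the loop that was calling handle_request(). Whether this has anything to do with the registry timeout is not established anywhere in this thread; the sketch below only illustrates the usual way of retrying such a call. The helper name retry_on_eintr is hypothetical and is not part of AccessGrid or the standard library.

```python
import errno
import select


def retry_on_eintr(func, *args, **kwargs):
    """Call func(*args, **kwargs) again whenever it fails with EINTR.

    Hypothetical helper for illustration only; not part of AccessGrid.
    """
    while True:
        try:
            return func(*args, **kwargs)
        except (OSError, select.error) as exc:
            # EINTR means "Interrupted system call": a signal arrived
            # while the call was blocked. Any other error is re-raised.
            if exc.args and exc.args[0] == errno.EINTR:
                continue
            raise
```

It would be used along the lines of retry_on_eintr(select.select, [sock], [], [], timeout), where sock and timeout are whatever the caller already has. Note that the full timeout starts over on each retry. From Python 3.5 onward the interpreter retries select.select() on EINTR by itself (PEP 475), so a wrapper like this is only relevant to older interpreters such as the Python 2.6 shown in the traceback.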