I don't know what's going on with this, but it's not power cycling. There's a bunch of errors about no data received from EDT, then ring buffer overruns. Last time we saw the EDT errors, the MTU was set incorrectly on the 10 gig iface, but that's fixed. Excerpt below. Ideas?

Thu May 5 23:23:28 2011 : No data from EDT 0 in last 35 seconds! DoneCount is 922
Thu May 5 23:23:29 2011 : DSK[0] : RING BUFFER OVERRUN !! 1304652209.369000 1304652170.949000 0.209715
Thu May 5 23:23:30 2011 : DSK[0] : RING BUFFER OVERRUN !! 1304652210.835000 1304652209.369000 0.209715
Thu May 5 23:23:33 2011 : DSK[0] : RING BUFFER OVERRUN !! 1304652213.000000 1304652211.866000 0.209715
Thu May 5 23:23:36 2011 : DSK[0] : RING BUFFER OVERRUN !! 1304652216.223000 1304652214.557000 0.209715
Thu May 5 23:23:39 2011 : DSK[0] : RING BUFFER OVERRUN !! 1304652219.026000 1304652217.976000 0.209715
Thu May 5 23:23:43 2011 : DSK[0] : RING BUFFER OVERRUN !! 1304652223.956000 1304652222.906000 0.209715
Thu May 5 23:23:58 2011 : No data from EDT 0 in last 5 seconds! DoneCount is 949
Thu May 5 23:24:03 2011 : No data from EDT 0 in last 10 seconds! DoneCount is 949
Thu May 5 23:24:08 2011 : No data from EDT 0 in last 15 seconds! DoneCount is 949
Thu May 5 23:24:13 2011 : No data from EDT 0 in last 20 seconds! DoneCount is 949
Thu May 5 23:24:18 2011 : No data from EDT 0 in last 25 seconds! DoneCount is 949
Thu May 5 23:24:23 2011 : No data from EDT 0 in last 30 seconds! DoneCount is 949
Thu May 5 23:24:28 2011 : No data from EDT 0 in last 35 seconds! DoneCount is 949
Thu May 5 23:24:33 2011 : No data from EDT 0 in last 40 seconds! DoneCount is 949
Thu May 5 23:24:38 2011 : No data from EDT 0 in last 45 seconds! DoneCount is 949
Thu May 5 23:24:43 2011 : No data from EDT 0 in last 50 seconds! DoneCount is 949
Thu May 5 23:24:48 2011 : No data from EDT 0 in last 55 seconds! DoneCount is 949
Thu May 5 23:24:53 2011 : No data from EDT 0 in last 60 seconds! DoneCount is 949
Thu May 5 23:24:53 2011 : DSK[0] : RING BUFFER OVERRUN !! 1304652293.747000 1304652230.231000 0.209715
Thu May 5 23:39:13 2011 : No data from EDT 0 in last 5 seconds! DoneCount is 2222
Thu May 5 23:39:18 2011 : No data from EDT 0 in last 10 seconds! DoneCount is 2222
Thu May 5 23:39:23 2011 : No data from EDT 0 in last 15 seconds! DoneCount is 2222
Thu May 5 23:39:28 2011 : No data from EDT 0 in last 20 seconds! DoneCount is 2222
Thu May 5 23:39:33 2011 : No data from EDT 0 in last 25 seconds! DoneCount is 2222
Thu May 5 23:39:33 2011 : DSK[0] : RING BUFFER OVERRUN !! 1304653173.409000 1304653147.327000 0.209715

On 5/5/11 8:08 PM, "[email protected]" <[email protected]> wrote:

>
>SERENDIP V.5 CRITICAL ERROR REPORT
>
>beam switcher appears to be working...
>Same file
>
>dr2 tail looks ok:
>Thu May 5 22:56:31 2011 : Idle Watch Thread has rejoined
>The BEE2 is responding to pings:
>PING beecourageous (192.168.1.86) 56(84) bytes of data.
>64 bytes from beecourageous (192.168.1.86): icmp_seq=1 ttl=64 time=0.348 ms
>
>--- beecourageous ping statistics ---
>1 packets transmitted, 1 received, 0% packet loss, time 0ms
>rtt min/avg/max/mdev = 0.348/0.348/0.348/0.000 ms
>The iBOB is responding to pings:
>PING ddc (192.168.2.6) 56(84) bytes of data.
>64 bytes from ddc (192.168.2.6): icmp_seq=1 ttl=64 time=4.61 ms
>
>--- ddc ping statistics ---
>1 packets transmitted, 1 received, 0% packet loss, time 0ms
>rtt min/avg/max/mdev = 4.612/4.612/4.612/0.000 ms
>
>
>*****The data collection process is **NOT RUNNING** as of Thursday 05th May 2011 11:00:01 PM
>Attempting to restart the BEE...
>
> Power off command output:
>
> Power on command output:
>Attempting to restart the IBOB...
>
> Power off command output:
>
> Power on command output:
>
> Current Status: 1 ON
>2 OFF
>3 ON
>4 ON
>5 ON
>6 OFF
>7 ON
>8 OFF
>
>Sleeping 60 seconds to wait for fpga devices to come back up...
>
>!!Success!! The iBOB has awoken:
>PING ddc (192.168.2.6) 56(84) bytes of data.
>64 bytes from ddc (192.168.2.6): icmp_seq=1 ttl=64 time=5.12 ms
>
>--- ddc ping statistics ---
>1 packets transmitted, 1 received, 0% packet loss, time 0ms
>rtt min/avg/max/mdev = 5.127/5.127/5.127/0.000 ms
>!!Success!! The BEE2 lives!
>Sleeping 5 minutes to let the BEE2 boot...
>Attempting to restart the whole shebang...
>Check output below:
>root
>copying dr2_config_short into dr2_config
>copying old output files for safekeeping
>killing any previous setispec_dr runs
>killing old disk buf collector...
>killing previous ssh instances to run sendstatus on obs@beecourageous
>initalizing the iBOB
>
>7 beam, 2 pol
>Number of beams: 7
> Number of pols: 2
> Number of cycles: 1
>Dwelltime: 1
>Max address: 14
>
>0x0000 / 00000 -> 0x000000ED / 0b00000000000000000000000011101101 / 0000000237
>0x0001 / 00001 -> 0x000000EB / 0b00000000000000000000000011101011 / 0000000235
>0x0002 / 00002 -> 0x000000E7 / 0b00000000000000000000000011100111 / 0000000231
>0x0003 / 00003 -> 0x000000DE / 0b00000000000000000000000011011110 / 0000000222
>0x0004 / 00004 -> 0x000000DD / 0b00000000000000000000000011011101 / 0000000221
>0x0005 / 00005 -> 0x000000DB / 0b00000000000000000000000011011011 / 0000000219
>0x0006 / 00006 -> 0x000000D7 / 0b00000000000000000000000011010111 / 0000000215
>0x0007 / 00007 -> 0x000000BE / 0b00000000000000000000000010111110 / 0000000190
>0x0008 / 00008 -> 0x000000BD / 0b00000000000000000000000010111101 / 0000000189
>0x0009 / 00009 -> 0x000000BB / 0b00000000000000000000000010111011 / 0000000187
>0x000A / 00010 -> 0x000000B7 / 0b00000000000000000000000010110111 / 0000000183
>0x000B / 00011 -> 0x0000007E / 0b00000000000000000000000001111110 / 0000000126
>0x000C / 00012 -> 0x0000007D / 0b00000000000000000000000001111101 / 0000000125
>0x000D / 00013 -> 0x0000007B / 0b00000000000000000000000001111011 / 0000000123
>0x000E / 00014 -> 0x00000000 / 0b00000000000000000000000000000000 / 0000000000
>
>rebooting the bee2
>tcgetattr: Inappropriate ioctl for device
>killing previous sendstatus
>deleting previous nohup file
>killing running bofs
>loading new bofs
>nohup: appending output to `nohup.out'
>nohup: nohup: nohup: appending output to `nohup.out'
>appending output to `nohup.out'
>appending output to `nohup.out'
>configuring chips
>Setting the maximum number of hits to: 25
>Setting the scale threshold to 80
>done
>Connection to beecourageous closed.
>starting sendstatus
>starting diskbuf cleaner
>starting new run
>imdonenow
>
>
>Disk Usage:
>Filesystem  Size  Used  Avail  Use%  Mounted on
>/dev/md1    226G  179G  47G    80%   /
>tmpfs       2.0G  0     2.0G   0%    /lib/init/rw
>udev        10M   108K  9.9M   2%    /dev
>tmpfs       2.0G  0     2.0G   0%    /dev/shm
>/dev/md0    236M  42M   182M   19%   /boot
>
>
>
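
In case it helps with triage, here is a rough Python sketch for condensing a dr2 log excerpt like the one at the top of this message. It is only an illustration: `summarize` is a made-up helper, the regexes are guessed from the pasted lines, and treating the two epoch values on each overrun line as a pair whose difference is worth looking at is an assumption, not something the dr2 code is known to define.

```python
import re

# Line shapes guessed from the pasted excerpt (assumption, not taken from dr2 source).
NO_DATA = re.compile(
    r"No data from EDT (\d+) in last (\d+) seconds! DoneCount is (\d+)"
)
OVERRUN = re.compile(
    r"DSK\[(\d+)\] : RING BUFFER OVERRUN !! (\d+\.\d+) (\d+\.\d+)"
)


def summarize(text):
    """Return (longest reported stall per DoneCount, overrun timestamp gaps).

    Hypothetical helper: works on flattened or line-per-entry text because
    it scans with findall instead of splitting on newlines.
    """
    stalls = {}
    for _edt, secs, done in NO_DATA.findall(text):
        key = int(done)
        stalls[key] = max(stalls.get(key, 0), int(secs))

    # Difference between the two epoch values on each overrun line.
    # What that difference means is an assumption on my part.
    gaps = [
        round(float(first) - float(second), 3)
        for _dsk, first, second in OVERRUN.findall(text)
    ]
    return stalls, gaps


if __name__ == "__main__":
    import sys

    stalls, gaps = summarize(sys.stdin.read())
    for done, secs in sorted(stalls.items()):
        print(f"DoneCount {done}: stalled at least {secs} s")
    for gap in gaps:
        print(f"overrun gap: {gap} s")
```

Piping the dr2 log into the script on stdin prints one line per stalled DoneCount and one per overrun, which makes it easier to see whether the overruns follow the EDT stalls or show up on their own.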
