I don't know what's going on with this, but it's not power cycling.  A
bunch of errors about no data received from EDT, then ring buffer
overruns.  Last time we saw the EDT errors, the MTU was set incorrectly
on the 10 gig iface, but that's fixed.  Excerpt below.  Ideas?


Thu May  5 23:23:28 2011 : No data from EDT 0 in last 35 seconds!
DoneCount is 922
Thu May  5 23:23:29 2011 : DSK[0] : RING BUFFER OVERRUN !!
1304652209.369000  1304652170.949000 0.209715
Thu May  5 23:23:30 2011 : DSK[0] : RING BUFFER OVERRUN !!
1304652210.835000  1304652209.369000 0.209715
Thu May  5 23:23:33 2011 : DSK[0] : RING BUFFER OVERRUN !!
1304652213.000000  1304652211.866000 0.209715
Thu May  5 23:23:36 2011 : DSK[0] : RING BUFFER OVERRUN !!
1304652216.223000  1304652214.557000 0.209715
Thu May  5 23:23:39 2011 : DSK[0] : RING BUFFER OVERRUN !!
1304652219.026000  1304652217.976000 0.209715
Thu May  5 23:23:43 2011 : DSK[0] : RING BUFFER OVERRUN !!
1304652223.956000  1304652222.906000 0.209715
Thu May  5 23:23:58 2011 : No data from EDT 0 in last 5 seconds!
DoneCount is 949
Thu May  5 23:24:03 2011 : No data from EDT 0 in last 10 seconds!
DoneCount is 949
Thu May  5 23:24:08 2011 : No data from EDT 0 in last 15 seconds!
DoneCount is 949
Thu May  5 23:24:13 2011 : No data from EDT 0 in last 20 seconds!
DoneCount is 949
Thu May  5 23:24:18 2011 : No data from EDT 0 in last 25 seconds!
DoneCount is 949
Thu May  5 23:24:23 2011 : No data from EDT 0 in last 30 seconds!
DoneCount is 949
Thu May  5 23:24:28 2011 : No data from EDT 0 in last 35 seconds!
DoneCount is 949
Thu May  5 23:24:33 2011 : No data from EDT 0 in last 40 seconds!
DoneCount is 949
Thu May  5 23:24:38 2011 : No data from EDT 0 in last 45 seconds!
DoneCount is 949
Thu May  5 23:24:43 2011 : No data from EDT 0 in last 50 seconds!
DoneCount is 949
Thu May  5 23:24:48 2011 : No data from EDT 0 in last 55 seconds!
DoneCount is 949
Thu May  5 23:24:53 2011 : No data from EDT 0 in last 60 seconds!
DoneCount is 949
Thu May  5 23:24:53 2011 : DSK[0] : RING BUFFER OVERRUN !!
1304652293.747000  1304652230.231000 0.209715
Thu May  5 23:39:13 2011 : No data from EDT 0 in last 5 seconds!
DoneCount is 2222
Thu May  5 23:39:18 2011 : No data from EDT 0 in last 10 seconds!
DoneCount is 2222
Thu May  5 23:39:23 2011 : No data from EDT 0 in last 15 seconds!
DoneCount is 2222
Thu May  5 23:39:28 2011 : No data from EDT 0 in last 20 seconds!
DoneCount is 2222
Thu May  5 23:39:33 2011 : No data from EDT 0 in last 25 seconds!
DoneCount is 2222
Thu May  5 23:39:33 2011 : DSK[0] : RING BUFFER OVERRUN !!
1304653173.409000  1304653147.327000 0.209715
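
To compare a longer stretch of this log against the excerpt above, here
is a rough Python sketch that tallies the stalls and overruns.  It is
only a guess at the format based on these lines: it assumes the first
two numbers after each overrun are Unix epoch timestamps and reports
their difference, and it leaves the third number (0.209715)
uninterpreted.  The function name summarize is made up for
illustration and is not part of the data recorder software.

```python
import re

# Patterns inferred only from the excerpt above (assumptions, not a spec).
STALL = re.compile(r"No data from EDT (\d+) in last (\d+) seconds!")
DONE = re.compile(r"DoneCount is (\d+)")
OVERRUN = re.compile(r"DSK\[(\d+)\] : RING BUFFER OVERRUN")
STAMPS = re.compile(r"^(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)$")


def summarize(lines):
    """Tally stall and overrun messages from an iterable of log lines."""
    longest_stall = 0   # largest "in last N seconds" value seen
    overruns = 0        # number of RING BUFFER OVERRUN messages
    gaps = []           # first stamp minus second stamp, in seconds
    done_counts = []    # distinct DoneCount values, in order of appearance

    for raw in lines:
        line = raw.strip()
        m = STALL.search(line)
        if m:
            longest_stall = max(longest_stall, int(m.group(2)))
            continue
        m = DONE.search(line)
        if m:
            count = int(m.group(1))
            if not done_counts or done_counts[-1] != count:
                done_counts.append(count)
            continue
        if OVERRUN.search(line):
            overruns += 1
            continue
        m = STAMPS.match(line)
        if m:
            # Assumes both fields are epoch seconds; the third is ignored.
            gaps.append(round(float(m.group(1)) - float(m.group(2)), 3))

    return {
        "longest_stall_s": longest_stall,
        "overruns": overruns,
        "gaps_s": gaps,
        "done_counts": done_counts,
    }


if __name__ == "__main__":
    sample = [
        "Thu May  5 23:23:28 2011 : No data from EDT 0 in last 35 seconds!",
        "DoneCount is 922",
        "Thu May  5 23:23:29 2011 : DSK[0] : RING BUFFER OVERRUN !!",
        "1304652209.369000  1304652170.949000 0.209715",
    ]
    print(summarize(sample))
```

If the epoch assumption holds, the first overrun's pair of stamps is
about 38 seconds apart, which lines up with the 35-second stall
reported just before it.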




On 5/5/11 8:08 PM, "[email protected]" <[email protected]> wrote:

>
>SERENDIP V.5 CRITICAL ERROR REPORT
>
>beam switcher appears to be working...
>Same file
>
>dr2 tail looks ok:
>Thu May  5 22:56:31 2011 : Idle Watch Thread has rejoined
>The BEE2 is responding to pings:
>PING beecourageous (192.168.1.86) 56(84) bytes of data.
>64 bytes from beecourageous (192.168.1.86): icmp_seq=1 ttl=64 time=0.348
>ms
>
>--- beecourageous ping statistics ---
>1 packets transmitted, 1 received, 0% packet loss, time 0ms
>rtt min/avg/max/mdev = 0.348/0.348/0.348/0.000 ms
>The iBOB is responding to pings:
>PING ddc (192.168.2.6) 56(84) bytes of data.
>64 bytes from ddc (192.168.2.6): icmp_seq=1 ttl=64 time=4.61 ms
>
>--- ddc ping statistics ---
>1 packets transmitted, 1 received, 0% packet loss, time 0ms
>rtt min/avg/max/mdev = 4.612/4.612/4.612/0.000 ms
>
>
>*****The data collection process is **NOT RUNNING** as of Thursday 05th
>May 2011 11:00:01 PM
>Attempting to restart the BEE...
>
> Power off command output:
>
> Power on command output:
>Attempting to restart the IBOB...
>
> Power off command output:
>
> Power on command output:
>
> Current Status: 1 ON
>2 OFF
>3 ON
>4 ON
>5 ON
>6 OFF
>7 ON
>8 OFF
>
>Sleeping 60 seconds to wait for fpga devices to come back up...
>
>!!Success!! The iBOB has awoken:
>PING ddc (192.168.2.6) 56(84) bytes of data.
>64 bytes from ddc (192.168.2.6): icmp_seq=1 ttl=64 time=5.12 ms
>
>--- ddc ping statistics ---
>1 packets transmitted, 1 received, 0% packet loss, time 0ms
>rtt min/avg/max/mdev = 5.127/5.127/5.127/0.000 ms
>!!Success!! The BEE2 lives!
>Sleeping 5 minutes to let the BEE2 boot...
>Attempting to restart the whole shebang...
>Check output below:
>root
>copying dr2_config_short into dr2_config
>copying old output files for safekeeping
>killing any previous setispec_dr runs
>killing old disk buf collector...
>killing previous ssh instances to run sendstatus on obs@beecourageous
>initalizing the iBOB
>
>7 beam, 2 pol
>Number of beams: 7
> Number of pols: 2
> Number of cycles: 1
>Dwelltime: 1
>Max address: 14
>
>
>0x0000 / 00000 -> 0x000000ED / 0b00000000000000000000000011101101 /
>0000000237
>
>
>
>0x0001 / 00001 -> 0x000000EB / 0b00000000000000000000000011101011 /
>0000000235
>
>
>
>0x0002 / 00002 -> 0x000000E7 / 0b00000000000000000000000011100111 /
>0000000231
>
>
>
>0x0003 / 00003 -> 0x000000DE / 0b00000000000000000000000011011110 /
>0000000222
>
>
>
>0x0004 / 00004 -> 0x000000DD / 0b00000000000000000000000011011101 /
>0000000221
>
>
>
>0x0005 / 00005 -> 0x000000DB / 0b00000000000000000000000011011011 /
>0000000219
>
>
>
>0x0006 / 00006 -> 0x000000D7 / 0b00000000000000000000000011010111 /
>0000000215
>
>
>
>0x0007 / 00007 -> 0x000000BE / 0b00000000000000000000000010111110 /
>0000000190
>
>
>
>0x0008 / 00008 -> 0x000000BD / 0b00000000000000000000000010111101 /
>0000000189
>
>
>
>0x0009 / 00009 -> 0x000000BB / 0b00000000000000000000000010111011 /
>0000000187
>
>
>
>0x000A / 00010 -> 0x000000B7 / 0b00000000000000000000000010110111 /
>0000000183
>
>
>
>0x000B / 00011 -> 0x0000007E / 0b00000000000000000000000001111110 /
>0000000126
>
>
>
>0x000C / 00012 -> 0x0000007D / 0b00000000000000000000000001111101 /
>0000000125
>
>
>
>0x000D / 00013 -> 0x0000007B / 0b00000000000000000000000001111011 /
>0000000123
>
>
>
>0x000E / 00014 -> 0x00000000 / 0b00000000000000000000000000000000 /
>0000000000
>
>
>rebooting the bee2
>tcgetattr: Inappropriate ioctl for device
>killing previous sendstatus
>deleting previous nohup file
>killing running bofs
>loading new bofs
>nohup: appending output to `nohup.out'
>nohup: nohup: nohup: appending output to `nohup.out'
>appending output to `nohup.out'
>appending output to `nohup.out'
>configuring chips
>Setting the maximum number of hits to: 25
>Setting the scale threshold to 80
>done
>Connection to beecourageous closed.
>starting sendstatus
>starting diskbuf cleaner
>starting new run
>imdonenow
>
>
>Disk Usage: 
>Filesystem            Size  Used Avail Use% Mounted on
>/dev/md1              226G  179G   47G  80% /
>tmpfs                 2.0G     0  2.0G   0% /lib/init/rw
>udev                   10M  108K  9.9M   2% /dev
>tmpfs                 2.0G     0  2.0G   0% /dev/shm
>/dev/md0              236M   42M  182M  19% /boot
>
>
>


