It reboots everything under any circumstance now. Should we change this?
sent from a mobile device

-----Original Message-----
From: Laura Spitler <[email protected]>
Date: Fri, 23 Apr 2010 16:00:01
To: Andrew Siemion<[email protected]>
Cc: Dan Werthimer<[email protected]>; <[email protected]>
Subject: Re: SERENDIP V.5 Critical Error

We should be able to tell with the data. The other day we killed the data recorder process and rebooted the iBOB, and everything went back to normal. This time I only killed the data recorder process, so the script shouldn't reboot the iBOB (unless it can't ping it). Therefore I would say that if restarting only the data recorder code solves it, it's a software and not a hardware problem. Observations start around 7 tonight Arecibo time, so we don't have to wait long to find out.

Laura

On Fri, Apr 23, 2010 at 3:56 PM, Andrew Siemion <[email protected]> wrote:
> Hi Laura,
>
> Shoot. Can you tell if the beam switcher is actually getting stuck, or if
> the reporting is just screwed up?
>
> - Andrew
>
> On 4/23/10 12:53 PM, "Laura Spitler" <[email protected]> wrote:
>
>> Hi everyone,
>> It looks like sometime during observations on Thursday the beam got
>> stuck again. I killed the data recorder process, which you will soon
>> receive an email about. I'll take a look at the new data we collect
>> tonight to see if everything is back to normal. Clearly this is a
>> recurring problem that we need to look into.
>>
>> Laura
>>
>>
>> On Tue, Apr 20, 2010 at 7:30 PM, Andrew Siemion <[email protected]> wrote:
>>> Hi Dan,
>>>
>>> Thanks, a bottle of Ron del Barillitos would be great!
>>>
>>> - Andrew
>>>
>>> On 4/20/10 4:23 PM, "Dan Werthimer" <[email protected]> wrote:
>>>
>>>> andrew and laura,
>>>>
>>>> i'm at arecibo for another 13 hours, so let
>>>> me know if you want me to do anything.
>>>>
>>>> dan
>>>>
>>>>
>>>> On 4/20/2010 4:12 PM, Andrew Siemion wrote:
>>>>> Hi all,
>>>>>
>>>>> Laura noticed that sV.v had stopped switching beams.
>>>>> We have made some changes to the status check script to:
>>>>>
>>>>> 1. check to make sure the ibob is responding to pings
>>>>> 2. include the ibob in the power off - power on sequence
>>>>> 3. power off - power on both the ibob and bee2 upon any error condition
>>>>>
>>>>> - Andrew
>>>>>
>>>>>
>>>>> On 4/20/10 4:08 PM, "[email protected]"<[email protected]> wrote:
>>>>>
>>>>>> SERENDIP V.5 CRITICAL ERROR REPORT
>>>>>>
>>>>>> The BEE2 is responding to pings:
>>>>>> PING beecourageous (192.168.1.86) 56(84) bytes of data.
>>>>>> 64 bytes from beecourageous (192.168.1.86): icmp_seq=1 ttl=64 time=0.369 ms
>>>>>>
>>>>>> --- beecourageous ping statistics ---
>>>>>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>>>>>> rtt min/avg/max/mdev = 0.369/0.369/0.369/0.000 ms
>>>>>> The iBOB is responding to pings:
>>>>>> PING ddc (192.168.2.6) 56(84) bytes of data.
>>>>>> 64 bytes from ddc (192.168.2.6): icmp_seq=1 ttl=64 time=5.56 ms
>>>>>>
>>>>>> --- ddc ping statistics ---
>>>>>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>>>>>> rtt min/avg/max/mdev = 5.565/5.565/5.565/0.000 ms
>>>>>>
>>>>>>
>>>>>> *****The data collection process is **NOT RUNNING** as of Tuesday 20th April 2010 07:00:01 PM
>>>>>> Attempting to restart the BEE...
>>>>>>
>>>>>> Power off command output:
>>>>>>
>>>>>> Power on command output:
>>>>>> Attempting to restart the IBOB...
>>>>>>
>>>>>> Power off command output:
>>>>>>
>>>>>> Power on command output:
>>>>>>
>>>>>> Current Status: 1 ON
>>>>>> 2 OFF
>>>>>> 3 ON
>>>>>> 4 ON
>>>>>> 5 ON
>>>>>> 6 OFF
>>>>>> 7 ON
>>>>>> 8 OFF
>>>>>>
>>>>>> Sleeping 60 seconds to wait for fpga devices to come back up...
>>>>>>
>>>>>> !!Success!! The iBOB has awoken:
>>>>>> PING ddc (192.168.2.6) 56(84) bytes of data.
>>>>>> 64 bytes from ddc (192.168.2.6): icmp_seq=1 ttl=64 time=6.43 ms
>>>>>>
>>>>>> --- ddc ping statistics ---
>>>>>> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
>>>>>> rtt min/avg/max/mdev = 6.434/6.434/6.434/0.000 ms
>>>>>> !!Success!! The BEE2 lives!
>>>>>> Sleeping 5 minutes to let the BEE2 boot...
>>>>>> Attempting to restart the whole shebang...
>>>>>> Check output below:
>>>>>> root
>>>>>> copying dr2_config_short into dr2_config
>>>>>> copying old output files for safekeeping
>>>>>> killing any previous setispec_dr runs
>>>>>> killing old disk buf collector...
>>>>>> killing previous ssh instances to run sendstatus on o...@beecourageous
>>>>>> initalizing the iBOB
>>>>>>
>>>>>> 7 beam, 2 pol
>>>>>> Number of beams: 7
>>>>>> Number of pols: 2
>>>>>> Number of cycles: 1
>>>>>> Dwelltime: 1
>>>>>> Max address: 14
>>>>>>
>>>>>> 0x0000 / 00000 -> 0x000000ED / 0b00000000000000000000000011101101 / 0000000237
>>>>>> 0x0001 / 00001 -> 0x000000EB / 0b00000000000000000000000011101011 / 0000000235
>>>>>> 0x0002 / 00002 -> 0x000000E7 / 0b00000000000000000000000011100111 / 0000000231
>>>>>> 0x0003 / 00003 -> 0x000000DE / 0b00000000000000000000000011011110 / 0000000222
>>>>>> 0x0004 / 00004 -> 0x000000DD / 0b00000000000000000000000011011101 / 0000000221
>>>>>> 0x0005 / 00005 -> 0x000000DB / 0b00000000000000000000000011011011 / 0000000219
>>>>>> 0x0006 / 00006 -> 0x000000D7 / 0b00000000000000000000000011010111 / 0000000215
>>>>>> 0x0007 / 00007 -> 0x000000BE / 0b00000000000000000000000010111110 / 0000000190
>>>>>> 0x0008 / 00008 -> 0x000000BD / 0b00000000000000000000000010111101 / 0000000189
>>>>>> 0x0009 / 00009 -> 0x000000BB / 0b00000000000000000000000010111011 / 0000000187
>>>>>> 0x000A / 00010 -> 0x000000B7 / 0b00000000000000000000000010110111 / 0000000183
>>>>>> 0x000B / 00011 -> 0x0000007E / 0b00000000000000000000000001111110 / 0000000126
>>>>>> 0x000C / 00012 -> 0x0000007D / 0b00000000000000000000000001111101 / 0000000125
>>>>>> 0x000D / 00013 -> 0x0000007B / 0b00000000000000000000000001111011 / 0000000123
>>>>>> 0x000E / 00014 -> 0x00000000 / 0b00000000000000000000000000000000 / 0000000000
>>>>>>
>>>>>> rebooting the bee2
>>>>>> tcgetattr: Inappropriate ioctl for device
>>>>>> killing previous sendstatus
>>>>>> deleting previous nohup file
>>>>>> killing running bofs
>>>>>> loading new bofs
>>>>>> nohup: appending output to `nohup.out'
>>>>>> nohup: appending output to `nohup.out'
>>>>>> nohup: appending output to `nohup.out'
>>>>>> nohup: appending output to `nohup.out'
>>>>>> configuring chips
>>>>>> Setting the maximum number of hits to: 25
>>>>>> Setting the scale threshold to 80
>>>>>> done
>>>>>> Connection to beecourageous closed.
>>>>>> starting sendstatus
>>>>>> starting diskbuf cleaner
>>>>>> starting new run
>>>>>> imdonenow
>>>>>>
>>>>>>
>>>>>> Disk Usage:
>>>>>> Filesystem            Size  Used Avail Use% Mounted on
>>>>>> /dev/md1              226G   71G  155G  32% /
>>>>>> tmpfs                 2.0G     0  2.0G   0% /lib/init/rw
>>>>>> udev                   10M  108K  9.9M   2% /dev
>>>>>> tmpfs                 2.0G     0  2.0G   0% /dev/shm
>>>>>> /dev/md0              236M   35M  190M  16% /boot
>>>>>> /dev/sdc1             1.4T  147G  1.3T  11% /mockdata
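
The question at the top of the thread (power-cycle everything on any error, or only what looks broken) can be sketched as a small decision function. This is only an illustration, not the actual status check script: the names `Status`, `responds_to_ping`, `plan_recovery`, the policy labels and the action strings are all hypothetical, and the "selective" branch assumes a board is power-cycled only when it fails to answer a ping, which is how Laura describes the iBOB handling; applying the same rule to the BEE2 is a guess.

```python
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass
class Status:
    """Snapshot of the three checks that appear in the error report."""
    bee2_pings: bool
    ibob_pings: bool
    recorder_running: bool


def responds_to_ping(host: str, timeout_s: float = 5.0) -> bool:
    """Send a single echo request (the report shows 1 packet transmitted)."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            capture_output=True,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def plan_recovery(status: Status, policy: str = "reboot_all") -> List[str]:
    """Return the ordered recovery actions for a given status and policy.

    Action names are hypothetical labels, not commands from the real script.
    """
    if status.bee2_pings and status.ibob_pings and status.recorder_running:
        return []  # nothing is wrong, so nothing to do

    if policy == "reboot_all":
        # Behaviour after the April 20 change: any error condition
        # power-cycles both boards before restarting the run.
        return ["power_cycle_bee2", "power_cycle_ibob", "restart_recorder"]

    if policy == "selective":
        # Assumed alternative: only power-cycle a board that is unreachable.
        actions = []
        if not status.bee2_pings:
            actions.append("power_cycle_bee2")
        if not status.ibob_pings:
            actions.append("power_cycle_ibob")
        actions.append("restart_recorder")
        return actions

    raise ValueError(f"unknown policy: {policy!r}")


if __name__ == "__main__":
    # Situation in the April 20 report: both boards ping, recorder is down.
    report = Status(bee2_pings=True, ibob_pings=True, recorder_running=False)
    print(plan_recovery(report, "reboot_all"))
    print(plan_recovery(report, "selective"))
```

With the status from the April 20 report, the "selective" policy restarts only the data recorder. That is the case Laura wants to observe, since a recovery without touching the iBOB would point to software rather than hardware.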
