Hi all,
I thought I had lost a ROACH2 after a power issue. It would boot but showed 
some strange behaviour. A reflash of the file system using tftproot from U-Boot 
seemed to fix it. Good luck. MM

-----Original Message-----
From: Jason Manley [mailto:[email protected]] 
Sent: 18 April 2018 08:13 AM
To: Casper Lists
Subject: Re: [casper] temporary ROACH2 faults after power dips and spikes

We have about 100 ROACH2s deployed, both on our site and in various lab 
systems, and have not seen this sort of behaviour in any of our boards. But, we 
don't have ADCs connected to any of them. There're only power, 4x 10G SFP 
copper cables and 1x 1G STP going into the back. The boards are grounded via 
the ground pin of the IEC power connector.

We have observed that after an AC power restore, some of the boards do not 
automatically power back up, and a few have PPCs that periodically stop 
responding on the network and need to be power-cycled to recover. But I 
believe these are all hardware failures, since they happen on the same boards 
each time and represent a small percentage (~5%) of all the boards.

Jason Manley
Functional Manager: DSP
SKA-SA

Cell: +27 82 662 7726
Work: +27 21 506 7300

On 18 Apr 2018, at 2:33, Matt Dexter <[email protected]> wrote:

> Hi Jonathan & Dan,
> 
> I had the same sort of idea regarding the FTDI USB and PPC USB ports.
> 
> I'm not sure how one of those being held in a partially-on, partially-off 
> state could create problems for an ADC, but maybe it somehow corrupts the 
> power-on reset or something connected, even indirectly, to the FTDI IC U33. 
> Or the PPC IC, in the PPC USB port case.
> 
> Is either of the ROACH2's USB ports connected to something that stays 
> powered up during the attempts to power down?
> 
> Matt
> 
> On Tue, 17 Apr 2018, Dan Werthimer wrote:
> 
>> Date: Tue, 17 Apr 2018 17:16:22 -0700
>> From: Dan Werthimer <[email protected]>
>> Reply-To: [email protected]
>> To: CASPER Mailing List <[email protected]>
>> Subject: Re: [casper] temporary ROACH2 faults after power dips and spikes
>> 
>> hi jonathan,
>> 
>> here's a remote possibility that might explain some of the behaviour
>> you are seeing: when the power goes down to your roach2's, does the
>> power also go down on the sample clock or 1 PPS distribution? if the
>> sample clock continues to be fed to the ADC's, then the CMOS adc chips
>> can continue to be powered via the clock, or perhaps via 1 PPS, and
>> because the voltages are low, the adc's can get in a weird mode...
>> 
>> you might need to power off the 1 PPS and sample clock? or after
>> power is restored, issue a reset to the ADC?
>> 
>> dan
>> 
>> On Tue, Apr 17, 2018 at 4:22 PM, Jonathan Weintroub <[email protected]> wrote:
>>      Hi CASPERites,
>> 
>>      We have experience with quite a few ROACH2s in the lab and in the
>>      field over some years, and a pattern has emerged which warrants a
>>      question to the ROACH2 experts on this list. The SAO team has seen
>>      strange faults happen on multiple ROACH2 units after power failures,
>>      dips and lightning storms.  I’ll list the various weirdnesses below,
>>      but the key point is that a full power cycle, including removing
>>      power from the line input, does not reset and cure the units, while
>>      an extended power down (like overnight, or 24 hours, or more) does
>>      seem to bring the units back to life again.  This was discovered
>>      serendipitously, and has happened often enough that the pattern
>>      seems repeatable (though controlled experiments aren’t really
>>      possible, as we try not to stress our equipment this way).
>> 
>>      Has anyone else seen this, and does someone perhaps have a
>>      suggestion as to root cause, or some way to accelerate the reset?
>> 
>>      Example faults have included:
>> 
>>      —ADC5G clock not being correctly received, or not being transmitted
>>      to FPGA, or being transmitted at incorrect speed.
>> 
>>      —A particular ADC would refuse to calibrate its digital interface
>>      to the FPGA.
>> 
>>      —QDRs which don’t calibrate
>> 
>>      —After a lightning storm on Maunakea we have two units with a
>>      single SFP+ port among 8 failing to transmit packets, though we
>>      have yet to see if an extended power down will cure this.
>> 
>>      Again these faults have been distributed across multiple units, and
>>      in all cases have eventually been cleared, after extended power
>>      down.  Which is good, but the pathology worries us.
>> 
>>      Thanks in advance for any light that might be cast on this issue.
>> 
>>      Jonathan and André
>>      EHT/SMA
>> 
>>      --
>>      You received this message because you are subscribed to the Google
>>      Groups "[email protected]" group.
>>      To unsubscribe from this group and stop receiving emails from it,
>>      send an email to [email protected].
>>      To post to this group, send email to [email protected].
