Hi Matt,

Good sleuthing!

The X-ray evidence points to PCB damage under the Actel device. Without that, I would intuitively have suggested that the PPC is the more likely candidate for soldering problems.

I would say that not monitoring the PPC temperature is not a complete show stopper from a hardware point of view, but the device does get warm, so a good check on the heatsink mounting would be required.

I'll leave it to the software guys to comment on the workaround suggestion.

Francois

On Apr 25, 2012 11:40 PM, "Matt Dexter" <[email protected]> wrote:
> Hi,
>
> Yesterday I spent some time debugging a Roach1 with
> something strange with the PPC temperature sensor.
> The PPC temperature reported is about 50 deg higher
> than that reported for the Xilinx FPGA or Actel Fusion
> device (which actually has the ADC).
> As in 80 vs 30 when just the Xport is running.
>
> Could this be caused by the temp sensor having a good
> connection to 1 and only 1 of the 2 PPC's PNP transistor leads ?
>
> Is there any easy software workaround ?
> Would it be OK to use such a workaround and not monitor PPC's temp ?
> Would you accept delivery of a Roach1 that did not monitor
> the PPC's temp but was otherwise AOK ?
>
> Or should we remove the Actel Fusion BGA device, inspect&repair
> as necessary, install a new Actel and continue debugging ?
>
> Thanks
> Matt
>
> ------------------------------------------------------------
>
> Before I had a chance to look at the board the Actel
> Fusion device U60 was replaced. Both the current and
> previous devices were programmed with the latest and greatest
> design. I don't know for certain but suspect the previous
> Actel part behaved identically to the device now
> installed.
>
> We made some ohmmeter measurements and for a while we
> thought that ATRTN1 (J31-7) was 43 ohms to GND vs 7.x ohms
> to ground on a board that reported good temps. Later
> we redid the measurements and the strange board also reported
> 7.x ohms. The reported temps were still bonkers.
>
> The voltages at J31-7 vs J31-8 were about .69 volts and
> that matched the voltages for the Xilinx temps on J31-3 vs J31-4.
> Those voltages decreased and the reported temperatures increased
> as expected after the board was fully powered up. But still
> the reported values were too high by 50 for the PPC.
>
> The board would only stay powered up if we disabled the
> automatic failure condition shutdown function. We checked
> the other reported temps, voltages and currents and they
> were all fine. The board would also stay up if we temporarily shorted
> J31-7 to J31-8 which leads to a reported temp of -271. The -271
> is nonsensical but understandable. As it is higher than the low temp
> shutoff threshold of -280 all the other
> autoshutoff protections can remain enabled.
>
> The Actel Fusion device is a 256 pin BGA so it's virtually impossible
> to probe even though the PPC temp sensor input connections
> are on or near the edge of the package.
> Using an X-Ray inspection device and comparing vs a known
> good board we found the very short length of etch that delivers
> the ATRTN1 signal from a via (from the PPC IC and to J31-7) to the
> PCB pad for the Actel's T6 pin looks to be mainly missing.
>
> So maybe the Actel's ADC is getting a valid version of just one of
> the PPC's PNP transistor leads and thus reporting a value with a
> strange offset ?
>
> We tried blindly pushing in a fine wire to make a connection from J31-7
> to the tiny U60-T6 solder ball but never improved the reported value.
> It would have been a big surprise if that actually worked but hey
> no guts no glory.
>
> Do you have any ideas before we remove the Actel Fusion part at U60
> and inspect&repair the PCB trace from the PCB pad to the breakout
> via ?
>
> other ideas:
> 1) tweak the autoshutdown code running on this 1 Roach1 to deal
> with the ~ 50deg offset in reported PPC temps ?
> 2) jumper the PPC temp signals so Vdiff is 0V (-271 deg) or
> perhaps some other voltage like 1.0VCC so that
> PPC temperature monitoring is lost but the rest of the
> protections are up and running.
> 3) no need to disable entirely the failure mode shutdown function
> so don't want to pursue that path.
> 4) ?
>
> For now, I believe, additional board bringup and debugging will continue
> with J31-7&8 shorted together...
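
For illustration only, idea 1 in the quoted message (a per-board offset applied before the shutdown check) might look roughly like the sketch below. This is not the Roach1 monitoring code: the function names and the high threshold of 75 are hypothetical. The 50 degree offset, the 80 vs 30 readings, the -271 reading with J31-7 and J31-8 shorted, and the -280 low threshold are the figures quoted above.

```python
# Hypothetical sketch; none of these names come from the Roach1 firmware.
PPC_OFFSET_DEG = 50.0      # ~50 deg discrepancy reported in the quoted message
LOW_SHUTOFF_DEG = -280.0   # low temp shutoff threshold quoted in the message
HIGH_SHUTOFF_DEG = 75.0    # ASSUMED value, chosen only for illustration


def corrected_ppc_temp(raw_deg, offset_deg=PPC_OFFSET_DEG):
    """Subtract the per-board offset from the raw PPC reading."""
    return raw_deg - offset_deg


def should_shut_down(temp_deg, low=LOW_SHUTOFF_DEG, high=HIGH_SHUTOFF_DEG):
    """Return True when a reading falls outside the allowed window."""
    return temp_deg < low or temp_deg > high


if __name__ == "__main__":
    raw = 80.0  # PPC reading with only the Xport running, per the message
    print(should_shut_down(raw))                      # uncorrected reading
    print(should_shut_down(corrected_ppc_temp(raw)))  # offset applied
    print(should_shut_down(-271.0))                   # J31-7/8 shorted
```

Note that a fixed offset only hides the symptom: if the sensor path really is open on one lead, the corrected value may not track the true PPC temperature, which is why the heatsink check still matters.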

