Re: [Milkymist-devel] The dungeons of NORia: Meeting the Balrog

Ed Leckie Fri, 28 Oct 2011 18:07:17 -0700


Hi Werner,

Nice job capturing all this!
What about a multi-voltage supervisor for the 1V2, 2V5 and 3V3 rails, such as
the STM6179 [1], rather than relying on an unregulated 5V from a wallwart?
And perhaps you could use a 12V wallwart and an on-board 5V switching (pre-)
regulator? That would allow looser constraints on the wallwart, but would it
create a backwards compatibility issue?

Cheers,
Ed.

[1] 
http://www.st.com/internet/com/TECHNICAL_RESOURCES/TECHNICAL_LITERATURE/DATASHEET/CD00060157.pdf

> Date: Fri, 28 Oct 2011 21:39:44 -0300
> From: [email protected]
> To: [email protected]
> Subject: [Milkymist-devel] The dungeons of NORia: Meeting the Balrog
> 
> The exploration of the dungeons of NORia has finally led to a
> meeting with the supposed arch-enemy: the power-down behaviour of
> the reset circuit.
> 
> 
> Background
> ----------
> 
> M1rc3 has a special reset chip (U24, [1]) that resets FPGA and NOR
> when powering up and that also holds them in reset when the 3.3 V
> rail drops below 2.63 V. The expectation was that this would
> prevent the NOR corruption. Alas, it didn't.
> 
> After poking around for a while, we started to suspect that, when
> powering down, the 3.3 V rail may drop more slowly than some of
> the other rails - particularly any of the power rails supplying
> the FPGA core.
> 
> In this case, the FPGA could get confused, send out weird signals,
> which would then be properly amplified by the FPGA's I/O drivers
> (operating at 3.3 V), received by the NOR (also operating at
> 3.3 V), and finally every once in a while producing a valid
> command the NOR may still have enough time to process before it
> also loses power.
> 
> Power rails can drop at different speeds because each has its own
> regulator and output buffering. It's not trivial to assure that
> rails come up or down in a specific order and it's also difficult
> to measure the exact order, because it can vary a lot with what
> the system is doing at the time of the power cut.
> 
> However, we know that no power rail can drop faster than the power
> input. Because if a rail would drop faster, the regulator could
> simply draw more power from the input to bring the rail back up
> again.
> 
> Thus the idea was born to drive the reset chip not from the
> regulated 3.3 V rail but from the filtered but unregulated 5 V
> input. Also, to make sure we cut out in time, the threshold
> voltage of the reset chip should be closer to 5 V.
> 
> 
> The rework
> ----------
> 
> I removed the old reset chip and replaced it with an
> APX803-44SAG-7 [2] which has a threshold voltage of 4.38 V. To
> isolate the input pin from the 3.3 V pad on the PCB, I placed a 
> piece of single-sided 0.36 mm FR4 board [3] between chip and pad.
> 
> The closest 5 V source I could find is C125, part of the MIDI TX
> circuit.
> 
> This is what it looks like:
> 
> http://downloads.qi-hardware.com/people/werner/m1/nor/d8/u24-to-5V.jpg
> 
> 
> M1 behaviour after rework
> -------------------------
> 
> Immediately after the rework, the M1 behaved a little oddly. It did
> reset and enter standby, but when I tried to get into the BIOS to
> run the CRC test, it just stopped (maybe a spurious reset).
> 
> I'm not sure what happened there. Later, I checked the voltages,
> and they're all good: 4.98 V at the DC jack and 4.94 V at U24 pin
> 3.
> 
> Eventually, it gave in and behaved properly. I then proceeded to
> run the usual power-cycling loop.
> 
> 
> Testing
> -------
> 
> I ran the power-cycling test for 4284 cycles. It did not report a
> single corruption.
> 
> Afterwards, I did a CRC check, which also showed that everything
> was in good health (*). Last but not least, I dumped the lock bits
> and verified that block 0 was indeed unlocked.
> 
> This means that the test seems to be valid. If we assume a
> previous corruption probability of 1/500 per cycle, the
> probability of passing 4284 cycles without hitting a single
> corruption would be about 0.02%.
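
The 0.02% figure quoted above can be sanity-checked with a few lines of
Python. This is only a minimal sketch: it assumes that every power cycle
corrupts the NOR independently and with the same probability, and the
function name is made up for illustration.

```python
def survival_probability(p_corrupt: float, cycles: int) -> float:
    """Chance of seeing zero corruptions in `cycles` power cycles.

    Assumes each cycle is an independent trial with the same corruption
    probability `p_corrupt` (an assumption, not a measured property).
    """
    return (1.0 - p_corrupt) ** cycles


# Figures from the quoted report: 1/500 per cycle, 4284 clean cycles.
p = survival_probability(1 / 500, 4284)
print(f"chance of passing by luck: {p * 100:.3f}%")
```

Under those assumptions the result rounds to the "about 0.02%" stated in
the report.
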
> 
> (*) In case you're checking my log [4]: the rescue BIOS failed the
>     CRC check. I think it's the MAC address that causes the CRC to
>     fail. I never bothered to fix this, so that failure is normal
>     and expected.
> 
> 
> Conclusion
> ----------
> 
> It seems that changing the reset circuit such that it always
> resets FPGA and NOR when power is ramping down does reduce the
> rate of NOR corruptions substantially and may even eliminate the
> problem entirely.
> 
> The instabilities observed immediately after the rework need
> further examination. They may have been caused by residues of the
> rework (e.g., flux that hasn't dried completely), but another
> possible explanation would be short voltage drops on the 5 V rail
> during load changes.
> 
> We may also consider using a reset chip with a lower threshold
> voltage. E.g., the APX803-40SAG-7 with a nominal threshold of
> 4.0 V should still give the 3.3 V regulator [5] enough room to do
> its work, while being less sensitive to small upsets of the 5 V
> supply.
> 
> 
> What's next
> -----------
> 
> I'll play with my M1 in "regular use" for a bit and watch for
> unexplained resets/hangs/etc.
> 
> After that, a longer test run should provide more certainty that
> the corruption is really gone. The chance of passing such a run by
> luck alone shrinks roughly exponentially with the number of cycles,
> and every 5-6 hours of testing cut it by another factor of ten. So
> a couple of days should be sufficient.
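
The factor-of-ten estimate quoted above can be cross-checked the same way.
A rough sketch, again assuming independent cycles with a 1/500 corruption
probability; the cycle time is not stated in the report, so the value
computed here is only what the 5-6 hour figure would imply, and both
function names are hypothetical.

```python
import math


def cycles_per_decade(p_corrupt: float) -> float:
    """Clean cycles needed to cut the pass-by-luck probability tenfold."""
    # (1 - p)^n = 1/10  =>  n = ln(10) / -ln(1 - p)
    return math.log(10) / -math.log1p(-p_corrupt)


def implied_seconds_per_cycle(hours: float, p_corrupt: float) -> float:
    """Cycle time implied if `hours` of testing yield one factor of ten."""
    return hours * 3600.0 / cycles_per_decade(p_corrupt)


n = cycles_per_decade(1 / 500)
print(f"cycles per factor of ten: {n:.0f}")
for hours in (5, 6):
    t = implied_seconds_per_cycle(hours, 1 / 500)
    print(f"{hours} h per decade implies about {t:.1f} s per cycle")
```
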
> 
> Last but not least, this needs testing with the supply voltage at
> its limits, e.g., the 4.75 V to 5.25 V allowed for a USB host.
> 
> 
> [1] http://www.ait-ic.com/uploads//2009-10/21/_1256089836_7ol2c.pdf
> [2] http://www.diodes.com/datasheets/APX803.pdf
> [3] http://search.digikey.com/us/en/products/PC94/PC94-ND/354417
> [4] http://downloads.qi-hardware.com/people/werner/m1/nor/d8/raw.tar.bz2
> [5] http://www.national.com/profile/snip.cgi/openDS=LP38690
> 
> - Werner
> _______________________________________________
> http://lists.milkymist.org/listinfo.cgi/devel-milkymist.org
> IRC: #milkymist@Freenode
