Hi folks -
I made some progress reproducing, and maybe even understanding, the WSOD (White Screen of Death) that the Glamo has always been partial to, as part of a wider deathmatch with it.

It seems that our choice of a divider network (R1813 / R1814) to level-translate the 3.3V CPU GPIO to the 1.8V Glamo RST# might not have been the brightest thing we ever did. I can reset the Glamo (provoking a sticky WSOD) just by touching Glamo RST# with my scope probe. The Glamo docs say you need to assert RST# for 1us to get a reset, but it appears to reset on any old glitch, and the docs are just recommending you hold it for 1us to be sure everything got reset. With my scope probe applied I cannot see any glitch, but the Glamo certainly knows when my probe goes on, since it resets immediately, which is highly illegal behaviour for us. So there is a glitch, and the energy needed for the Glamo to accept it with our driving network is so low that my scope probe capacitance dumping on it is enough. That is highly abnormal.

I finally got put onto this when I found that, during some suspends here, the Glamo registers have been forced back to their reset values by resume time. (I noticed this because we hold a lot of unused Glamo engines in reset to avoid trouble, but the engines do not reset to being in reset, and sometimes they were all out of reset on resume.) Since we currently rely on the bulk of the Glamo regs staying where they were during suspend, this causes death.

The Glamo docs also say that we cannot touch the registers for 4ms after reset, and that matches a lot of the race outcomes I see: PLLs sometimes not started again even when we ran the correct code to start them, and stuff passing the PLL lock tests and then showing brain damage later when it updates the cursor in the framebuffer or brings up the SD Card again. The brain damage is very ugly to debug because one race outcome is that the Glamo just jams nWAIT down forever if it isn't in a state to service your read [1].
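Since we rely on the registers surviving suspend, one way to cope would be to keep a shadow copy and put it back on resume if the chip turns out to have been reset behind our back. A rough sketch of that idea follows, in plain userspace C with a fake register array. This is not the actual driver code: every name, index and size below is made up for illustration, and the delay is a stub where a real driver would sleep. The only figure taken from the docs is the 4ms hands-off time after reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS        8   /* hypothetical size of the register window */
#define RESET_SETTLE_MS 4   /* docs: no register access for 4ms after reset */

/* Stand-in for the memory-mapped register window (hypothetical). */
static uint16_t hw_regs[NUM_REGS];
/* Total time the stub delay was asked to wait, so the sketch is testable. */
static unsigned int waited_ms;

static uint16_t reg_read(size_t idx)
{
    assert(idx < NUM_REGS);
    return hw_regs[idx];
}

static void reg_write(size_t idx, uint16_t val)
{
    assert(idx < NUM_REGS);
    hw_regs[idx] = val;
}

/* Stub: a real driver would actually sleep or busy-wait here. */
static void settle_delay_ms(unsigned int ms)
{
    waited_ms += ms;
}

/* Registers we depend on surviving suspend (hypothetical indices). */
static const size_t saved_idx[] = { 0, 2, 3, 5 };
#define NUM_SAVED (sizeof(saved_idx) / sizeof(saved_idx[0]))
static uint16_t shadow[NUM_SAVED];

/* Suspend path: take a copy of everything we rely on. */
void regs_save(void)
{
    for (size_t i = 0; i < NUM_SAVED; i++)
        shadow[i] = reg_read(saved_idx[i]);
}

/* Returns 1 if any saved register no longer matches its shadow copy. */
int regs_lost(void)
{
    for (size_t i = 0; i < NUM_SAVED; i++)
        if (reg_read(saved_idx[i]) != shadow[i])
            return 1;
    return 0;
}

/* Resume path. We cannot know when a glitch reset the chip, so wait out
 * the settle time before touching any register at all, then reapply the
 * shadow copy if state was lost. Returns 1 if a restore was needed. */
int regs_restore_if_lost(void)
{
    settle_delay_ms(RESET_SETTLE_MS);
    if (!regs_lost())
        return 0;
    for (size_t i = 0; i < NUM_SAVED; i++)
        reg_write(saved_idx[i], shadow[i]);
    return 1;
}
```

The idea would be to call regs_save() last thing on suspend and regs_restore_if_lost() first thing on resume. Waiting unconditionally costs 4ms per resume, but it avoids reading a register during the hands-off window just to find out whether we needed to wait.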
Further, since it is glitchy (due to the 80K source impedance from this divider, no doubt), I think we are racing the glitch and this is part of the general resume instability. But I do not know what provokes the glitching, since it doesn't randomly happen in normal operation. Maybe it is when we transition from MEMLDO to the high current 1.8V reg on resume.

Last night I started on workaround code to dump the registers we are using on suspend and reapply them on resume, but it was still crashy and unstable when I went to bed. Still, this is progress!

-Andy

[1] But I have the medicine for that in most cases: short nWAIT to 1.8V briefly while holding down AUX, having sneaked a deliberate NULL pointer OOPS into the AUX keyboard ISR. That then triggers the emergency panic dump code I added some weeks ago, so I can see where we got locked up even though it is a true hard freeze.
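PS: a back-of-envelope check on why an 80K source impedance makes RST# so easy to disturb. The sketch below just computes the RC time constant for that impedance driving a scope probe. The 80K is the figure quoted above; the 12pF probe capacitance is purely an assumption (a typical ballpark for a 10x passive probe, not something I measured), and the function name is made up.

```c
#include <assert.h>

/* Time constant in microseconds for a source impedance (in ohms)
 * driving a capacitance (in picofarads): tau = R * C. */
double rc_tau_us(double r_ohms, double c_pf)
{
    assert(r_ohms >= 0.0 && c_pf >= 0.0);
    /* pF -> F is 1e-12, s -> us is 1e6 */
    return r_ohms * c_pf * 1e-12 * 1e6;
}

/* Figures for this board: 80K from the mail, 12pF ASSUMED for the probe. */
double rst_probe_tau_us(void)
{
    return rc_tau_us(80e3, 12.0);
}
```

With those numbers tau comes out at roughly 0.96us, which is the same order as the 1us reset pulse the docs talk about. So, if the probe capacitance assumption is anywhere near right, anything that knocks the node takes about as long to recover through the divider as a deliberate reset pulse would last.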
