> -----Original Message-----
> From: [email protected] [mailto:meta-freescale-
> [email protected]] On Behalf Of Nikolay Dimitrov
> Sent: Friday, January 23, 2015 3:11 PM
> To: Fabio Estevam
> Cc: [email protected]
> Subject: Re: [meta-freescale] imx6 silent memory corruption
>
> Hi Fabio,
>
> On 01/23/2015 12:25 AM, Fabio Estevam wrote:
> > On Thu, Jan 22, 2015 at 7:25 PM, Nikolay Dimitrov <[email protected]> wrote:
> >
> >> I would appreciate it if you could share ideas about what could be
> >> wrong with this setup, and I'd also be happy to hear your suggestions
> >> for similar simple tests for system reliability.
> >
> > Maybe you could try to run the 'memtester' utility and see how your
> > board behaves.
>
> Thanks for the idea. I ran the tool and it also reports errors, but this
> happens rarely (just like the hash test) and I'm still looking for a way
> to reproduce the issue easily. Here's an example of a memory error:
>
>
> # memtester 64M 100
> memtester version 4.1.3 (32-bit)
> Copyright (C) 2010 Charles Cazabon.
> Licensed under the GNU General Public License version 2 (only).
>
> pagesize is 4096
> pagesizemask is 0xfffff000
> want 64MB (67108864 bytes)
> got 64MB (67108864 bytes), trying mlock ...locked.
> Loop 1/100:
> Stuck Address : ok
> Random Value : ok
> FAILURE: 0xc3909006 != 0xc3909007 at offset 0x00291fac.
> Compare XOR : Compare SUB : ok
> Compare MUL : ok
> Compare DIV : ok
> Compare OR : ok
> Compare AND : ok
> Sequential Increment: ok
> Solid Bits : ok
> Block Sequential : ok
> Checkerboard : ok
> Bit Spread : ok
> Bit Flip : ok
> Walking Ones : ok
> Walking Zeroes : ok
>
>
> Memtester can run for hours without finding an issue, and sometimes it
> runs for only several minutes before reporting a memory error.
>
> I found another tool, stressapptest
> (http://stressapptest.googlecode.com/svn/trunk/), which also seems to
> trigger the issue. Here's another example of a memory error:
>
>
> # ./stressapptest --no_timestamps --printsec 60 -M 64 -s 300
> Log: Commandline - ./stressapptest --no_timestamps --printsec 60 -M 64 -s 300
> Stats: SAT revision 1.0.7_autoconf, 32 bit binary
> Log: picmaster @ riotboard on Fri Jan 23 20:48:49 EET 2015 from open source release
> Log: 1 nodes, 2 cpus.
> Log: Defaulting to 2 copy threads
> Log: Flooring memory allocation to multiple of 4: 64MB
> Log: Prefer plain malloc memory allocation.
> Log: Using mmap() allocation at 0x72430000.
> Stats: Starting SAT, 64M, 300 seconds
> Log: region number 1 exceeds region count 1
> Log: Region mask: 0x1
> Log: Seconds remaining: 240
> Log: Seconds remaining: 180
> Report Error: miscompare : DIMM Unknown : 1 : 134s
> Hardware Error: miscompare on CPU 1(0x2) at 0x74e93040(0x33f0d040:DIMM Unknown): read:0xaaaaaaaaaaaaaa8a, reread:0xaaaaaaaaaaaaaa8a expected:0xaaaaaaaaaaaaaaaa
> Report Error: miscompare : DIMM Unknown : 1 : 136s
> Hardware Error: miscompare on CPU 0(0x1) at 0x75528710(0x32270710:DIMM Unknown): read:0xffffffbfffffffbe, reread:0xffffffbfffffffbe expected:0xffffffbfffffffbf
> Log: Seconds remaining: 120
> Log: Seconds remaining: 60
> Report Error: miscompare : DIMM Unknown : 1 : 266s
> Hardware Error: miscompare on CPU 0(0x1) at 0x74b979d0(0x358ae9d0:DIMM Unknown): read:0x0000001000000000, reread:0x0000001000000000 expected:0x0000001000000010
> Report Error: miscompare : DIMM Unknown : 1 : 274s
> Hardware Error: miscompare on CPU 0(0x1) at 0x73b4cfd0(0x35e8afd0:DIMM Unknown): read:0x0000001000000000, reread:0x0000001000000000 expected:0x0000001000000010
> Log: Thread 1 found 3 hardware incidents
> Log: Thread 2 found 1 hardware incidents
> Stats: Found 4 hardware incidents
> Stats: Completed: 256346.00M in 300.03s 854.40MB/s, with 4 hardware incidents, 0 errors
> Stats: Memory Copy: 256346.00M at 854.46MB/s
> Stats: File Copy: 0.00M at 0.00MB/s
> Stats: Net Copy: 0.00M at 0.00MB/s
> Stats: Data Check: 0.00M at 0.00MB/s
> Stats: Invert Data: 0.00M at 0.00MB/s
> Stats: Disk: 0.00M at 0.00MB/s
>
> Status: FAIL - test discovered HW problems
>
>
> I plan to run the FSL DDR stress test again to see whether it
> detects issues with my DDR memory. My board uses a SO-DIMM DDR3, and I
> was also thinking of trying another SO-DIMM module to see whether
> there's any difference.
>
> Thanks for the ideas so far. This is a major problem for me, so I need
> to resolve it before doing anything else on this board.
>
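
A quick way to see which bits differ in the miscompares quoted above is to XOR each read value with its expected value. A minimal sketch in Python follows; the helper name flipped_bits is hypothetical and not part of memtester or stressapptest, and the values are copied from the logs above.

```python
def flipped_bits(read: int, expected: int) -> list[int]:
    """Return the positions (0 = least significant) of bits that differ."""
    diff = read ^ expected
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# (read, expected) pairs taken from the memtester and stressapptest output.
miscompares = [
    (0xC3909006, 0xC3909007),                  # memtester FAILURE line
    (0xAAAAAAAAAAAAAA8A, 0xAAAAAAAAAAAAAAAA),  # stressapptest, 134s
    (0xFFFFFFBFFFFFFFBE, 0xFFFFFFBFFFFFFFBF),  # stressapptest, 136s
    (0x0000001000000000, 0x0000001000000010),  # stressapptest, 266s and 274s
]

for read, expected in miscompares:
    bits = flipped_bits(read, expected)
    print(f"read={read:#x} expected={expected:#x} differing bits={bits}")
```

In every pair above exactly one bit differs, and in each case the read value has a 0 where a 1 was expected. This only describes the logged values; it does not by itself identify the cause.
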
Have you read ERR005198 of the Chip Errata for the i.MX 6Dual/6Quad?
http://cache.freescale.com/files/32bit/doc/errata/IMX6DQCE.pdf

-Doug Schwanke

> Kind regards,
> Nikolay
> --
> _______________________________________________
> meta-freescale mailing list
> [email protected]
> https://lists.yoctoproject.org/listinfo/meta-freescale
