Hello,

Yesterday I was trying to debug what is causing the XFCE desktop
background artefacts on my A10-Lime, which look like this:
    http://people.freedesktop.org/~siamashka/files/20140504/a10-l2-cache-fail-artefacts-in-xfce.png

I narrowed them down to ARM Cortex-A8 L2 cache failures, which
are reproducible with JPEG decoding:

$ djpeg -v       
libjpeg-turbo version 1.3.0 (build 20130811)

$ wget http://linux-sunxi.org/images/8/83/A10-LIME.jpg

$ djpeg A10-LIME.jpg | md5sum
691497bd2e5d36976c1ea3150de89df6  -

$ djpeg A10-LIME.jpg | md5sum
6a874af750f92e1e3c019f2df7edf3f7  -

$ djpeg A10-LIME.jpg | md5sum
297b98ba10233cbbcea2566e1c4fd7c7  -

Please note that the md5sum of the decoded JPEG file is different for
each run.
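
The comparison above is easy to automate. Here is a minimal sketch,
assuming a POSIX shell with md5sum available. The function name
check_stable and the run count are only my illustration and not part
of any existing tool:

```shell
#!/bin/sh
# check_stable N CMD [ARGS...]
# Runs CMD N times and compares the md5sum of its standard output.
# Returns 0 if every run produced the same hash, 1 otherwise.
# Note: the exit status of CMD itself is not checked.
check_stable() {
    runs=$1
    shift
    first=""
    i=0
    while [ "$i" -lt "$runs" ]; do
        sum=$("$@" | md5sum | cut -d' ' -f1)
        if [ -z "$first" ]; then
            # Remember the hash of the first run as the reference.
            first=$sum
        elif [ "$sum" != "$first" ]; then
            echo "MISMATCH on run $((i + 1)): $sum != $first"
            return 1
        fi
        i=$((i + 1))
    done
    echo "OK: $runs identical runs ($first)"
    return 0
}
```

Something like "check_stable 100 djpeg A10-LIME.jpg" would then stop at
the first run whose hash differs from the first one. Identical hashes
only show that the runs agree with each other, not that the decoded
image is correct.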

There are other ways to reproduce it (the FFmpeg test suite can detect
this problem too), but the djpeg test is very simple and fast to do.
In case somebody does not have the djpeg tool from libjpeg-turbo
in their distro, I have a static djpeg binary here for
convenience:
    http://people.freedesktop.org/~siamashka/files/20140504/djpeg-static
It has been built using:
    http://people.freedesktop.org/~siamashka/files/20140504/build-static-djpeg.sh

On my collection of just three Allwinner A10 based devices, I get the
following results with the libjpeg-turbo djpeg test (and the default
CPU core voltage):
    A10-Lime    - fails at 1008MHz (960MHz is fine)
    Mele A2000  - fails at 1152MHz (1104MHz is fine)
    Cubieboard1 - fails at 1152MHz (1104MHz is fine)

Why is it likely related to the L2 cache? Because this problem goes
away if we disable the L2 cache by adding something like
        mrc     p15, 0, r10, c1, c0, 1    @ read Auxiliary Control Register
        bic     r10, r10, #(1 << 1)       @ clear L2EN (bit 1) to disable L2
        mcr     p15, 0, r10, c1, c0, 1    @ write it back
to the code around
    https://github.com/linux-sunxi/linux-sunxi/blob/sunxi-v3.4.86-r0/arch/arm/mm/proc-v7.S#L248

It is also interesting that sun4i and sun5i have different L2 cache
latency parameters configured there. I have tried increasing the
latencies in the L2 Cache Auxiliary Control Register, but these
changes did not seem to affect anything. It looks like the only
important factors are the CPU clock speed and the CPU core
voltage (increasing it to 1.45V from 1.4V also fixes the problem
on my A10-Lime).

Anyway, with a sample size of just 3 devices, 33% of them appear to
be unable to run stably at 1GHz and 1.4V core voltage. I wonder how
common this problem is in general. Are there any other Allwinner A10
devices failing the libjpeg-turbo djpeg test at 1GHz?

Also it would make sense to run reliability tests for all the cpufreq
operating points, because any frequency+voltage pair can be a weak link.
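
Such a sweep could look roughly like the sketch below. It assumes the
kernel exposes the standard cpufreq sysfs files
(scaling_available_frequencies, scaling_governor, scaling_setspeed),
that the "userspace" governor is available, and that it is run as root.
The function name test_all_freqs and the CPUFREQ_DIR override are
hypothetical names of mine. The voltage used at each frequency is
whatever the kernel selects for it; this script does not touch it.

```shell
#!/bin/sh
# Sketch: run a test command at every frequency listed by cpufreq.
# Assumes the standard cpufreq sysfs layout and the "userspace" governor.
# CPUFREQ_DIR can be overridden, e.g. to point at another CPU.
CPUFREQ_DIR=${CPUFREQ_DIR:-/sys/devices/system/cpu/cpu0/cpufreq}

# test_all_freqs CMD [ARGS...]
# Prints one PASS/FAIL line per frequency (in kHz, as sysfs reports it).
# Returns 0 if CMD succeeded everywhere, 1 if it failed at any frequency,
# 2 if the governor or the frequency could not be set.
test_all_freqs() {
    failed=0
    echo userspace > "$CPUFREQ_DIR/scaling_governor" || return 2
    for freq in $(cat "$CPUFREQ_DIR/scaling_available_frequencies"); do
        echo "$freq" > "$CPUFREQ_DIR/scaling_setspeed" || return 2
        if "$@" >/dev/null 2>&1; then
            echo "PASS $freq"
        else
            echo "FAIL $freq"
            failed=1
        fi
    done
    return "$failed"
}
```

CMD can be any command that exits non-zero on failure. For example,
this compares the hashes of two decoding runs at each frequency:

    test_all_freqs sh -c '[ "$(djpeg A10-LIME.jpg | md5sum)" = "$(djpeg A10-LIME.jpg | md5sum)" ]'

Two runs per frequency is only an illustration. A real test would need
many more iterations to catch intermittent failures.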

-- 
Best regards,
Siarhei Siamashka
