Hi,

I have a NanoPC-T4 (amongst others), which shipped with:

DDR Version 1.15 20181010
Channel 0: LPDDR3, 933MHz
Bus Width=32 Col=10 Bank=8 Row=15/15 CS=2 Die Bus-Width=32 Size=2048MB
..

I have since upgraded to more recent u-boot versions:

U-Boot TPL 2020.07 (Sep 27 2020 - 12:34:15)
Channel 0: LPDDR3, 933MHz
BW=32 Col=10 Bk=8 CS0 Row=15 CS1 Row=15 CS=2 Die BW=16 Size=2048MB

U-Boot TPL 2020.10 (Nov 10 2020 - 13:37:45)
Channel 0: LPDDR3, 933MHz
BW=32 Col=10 Bk=8 CS0 Row=15 CS1 Row=15 CS=2 Die BW=16 Size=2048MB


The machine was highly unstable, showing memory and locking issues.
When using only two little cores, it was a lot more stable.

I went and also tried:

DDR Version 1.24 20191016
Channel 0: LPDDR3, 933MHz
Bus Width=32 Col=10 Bank=8 Row=15/15 CS=2 Die Bus-Width=16 Size=2048MB


which seems to match recent u-boot, but all of them differ from the original Die BW of 32, which I currently assume to be correct for the Samsung K4E6E304EC-EGCG (so possibly the error also migrated into rockchip-linux/rkbin?).



Looking at sdram_common.c::sdram_detect_dbw()

    cs_cap = (1 << (row + col + bk + bw - 20));
    if (bw == 2) {
            if (cs_cap <= 0x2000000) /* 256Mb */
                    die_bw_0 = (col < 9) ? 2 : 1;
            else if (cs_cap <= 0x10000000) /* 2Gb */
                    die_bw_0 = (col < 10) ? 2 : 1;
            else if (cs_cap <= 0x40000000) /* 8Gb */
                    die_bw_0 = (col < 11) ? 2 : 1;
            else
                    die_bw_0 = (col < 12) ? 2 : 1;
            if (cs > 1) {
                    row = cap_info->cs1_row;
                    cs_cap = (1 << (row + col + bk + bw - 20));
                    if (cs_cap <= 0x2000000) /* 256Mb */
                            die_bw_0 = (col < 9) ? 2 : 1;
                    else if (cs_cap <= 0x10000000) /* 2Gb */
                            die_bw_0 = (col < 10) ? 2 : 1;
                    else if (cs_cap <= 0x40000000) /* 8Gb */
                            die_bw_0 = (col < 11) ? 2 : 1;
                    else
                            die_bw_0 = (col < 12) ? 2 : 1;
            }
    } else {


cs_cap is off by 20 bits compared to the values it is being compared against; in my case it is 0x400 and not 0x40000000:

        type 6 row 15 col 10 bk 3 cs 2 bw 2 cs_cap 8 cs1_row 15
    1 << (15 + 10 + 3 + 2 - 20) == 1 << 10 == 0x400

And similar in the 2nd case with cs1_row given cs > 1.
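
To make the mismatch easier to reproduce off-target, here is a minimal sketch of the arithmetic. The helper names cs_cap_for() and die_bw_for() are made up for this mail and do not exist in u-boot; die_bw_for() only copies the bw == 2 threshold chain quoted above. Going by the comments, the thresholds look like byte counts (0x2000000 bytes = 256Mb), while the "- 20" turns cs_cap into MiB.

```c
/* Hypothetical helper: per-CS capacity, with a configurable shift.
 * shift = 20 mirrors the current code, shift = 0 gives plain bytes. */
static unsigned int cs_cap_for(unsigned int row, unsigned int col,
                               unsigned int bk, unsigned int bw,
                               unsigned int shift)
{
        return 1u << (row + col + bk + bw - shift);
}

/* Hypothetical helper: the bw == 2 threshold chain from the excerpt. */
static unsigned int die_bw_for(unsigned int cs_cap, unsigned int col)
{
        if (cs_cap <= 0x2000000)        /* 256Mb */
                return (col < 9) ? 2 : 1;
        else if (cs_cap <= 0x10000000)  /* 2Gb */
                return (col < 10) ? 2 : 1;
        else if (cs_cap <= 0x40000000)  /* 8Gb */
                return (col < 11) ? 2 : 1;
        return (col < 12) ? 2 : 1;
}
```

With the values from the debug line above, die_bw_for(cs_cap_for(15, 10, 3, 2, 20), 10) always takes the first branch and returns 1, while die_bw_for(cs_cap_for(15, 10, 3, 2, 0), 10) reaches the 8Gb branch and returns 2. If the TPL prints Die BW as 8 << dbw (I believe it does, but treat that as an assumption), 1 corresponds to the "Die BW=16" lines and 2 to the original "Die Bus-Width=32".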

Now I know very little about all the memory chips out there but it seems very unlikely to regain these 20 bits in these calculations.
So either the “-20” goes or the cs_cap <= values need adjustment.
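
The two options can be sketched side by side as follows. This is not a tested patch, just an illustration under my assumptions; die_bw_bytes() and die_bw_mib() are hypothetical names. The first drops the "-20" and uses a 64-bit value so that larger geometries do not overflow the shift (possibly the reason the "-20" was there in the first place, but that is a guess). The second keeps the "-20" and scales the three thresholds down by 20 bits.

```c
#include <stdint.h>

/* Option A (assumed): keep the thresholds, compute the capacity in bytes. */
static unsigned int die_bw_bytes(unsigned int row, unsigned int col,
                                 unsigned int bk, unsigned int bw)
{
        uint64_t cs_cap = 1ULL << (row + col + bk + bw);

        if (cs_cap <= 0x2000000)        /* 256Mb */
                return (col < 9) ? 2 : 1;
        else if (cs_cap <= 0x10000000)  /* 2Gb */
                return (col < 10) ? 2 : 1;
        else if (cs_cap <= 0x40000000)  /* 8Gb */
                return (col < 11) ? 2 : 1;
        return (col < 12) ? 2 : 1;
}

/* Option B (assumed): keep the "-20", compare against MiB thresholds.
 * Only valid while row + col + bk + bw >= 20. */
static unsigned int die_bw_mib(unsigned int row, unsigned int col,
                               unsigned int bk, unsigned int bw)
{
        uint32_t cs_cap = 1u << (row + col + bk + bw - 20);

        if (cs_cap <= 0x20)             /* 32 MiB = 256Mb */
                return (col < 9) ? 2 : 1;
        else if (cs_cap <= 0x100)       /* 256 MiB = 2Gb */
                return (col < 10) ? 2 : 1;
        else if (cs_cap <= 0x400)       /* 1 GiB = 8Gb */
                return (col < 11) ? 2 : 1;
        return (col < 12) ? 2 : 1;
}
```

Both variants should pick the same branch for any geometry, since the thresholds are powers of two at or above 1 MiB; for row 15, col 10, bk 3, bw 2 both return 2. Which of the two fits the rest of the driver better I cannot judge.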

The problem now comes from the fact that cap_info->dbw gets the wrong value from die_bw_0 this way and given it is LPDDR3 I assume that set_cap_relate_config() in sdram_rk3399.c later restores the wrong values for the “memdata_ratio”.


There might be more problems lingering, but after changing this my machine got a lot more reliable. I still see memory errors when I push it to its (temperature) limits running on all 6 cores, even with decent cooling, but that might be a secondary problem.


Can someone with a lot more insight into this magic have a look and if needed please fix it?


/bz

