On 09/12/15 09:40, Siarhei Siamashka wrote: > On Fri, 13 Nov 2015 18:22:10 +0100 > Jens Kuske <jensku...@gmail.com> wrote: > >> On 13/11/15 13:36, Siarhei Siamashka wrote: >>> On Wed, 11 Nov 2015 18:26:54 +0100 >>> Jens Kuske <jensku...@gmail.com> wrote: >>> >>>> Based on some guessing and comparing with the parts I initially started >>>> disassembling (before this SDK appeared), I think it's I20. The I20 dram >>>> init code matched the disassembled parts exactly, except the ZQ >>>> calibration part... >>>> So, looks like it might not be correct to assume I20 == H3. >>> >>> The magic constants in mctl_set_master_priority() seem to be a bit >>> different too. This explains why I could not find the matching code >>> in the SDK. >> >> Indeed, didn't remember that. >> >>> >>>> This is the ZQ calibration function I got by disassembling, but I dropped >>>> it since the SDK version worked well too. Maybe I should have done the >>>> opposite and drop the SDK... >>> >>> Agreed, it's safer to prefer the variant of code that is actually >>> used in production on real devices. >>> >>>> It looks a bit strange, like if they are always calibrating the >>>> first channel and then copying to the others, maybe there is some >>>> hardware bug they work around. >>>> >>>> static void mctl_zq_calibration(struct dram_para *para) >>>> { >>>> struct sunxi_mctl_ctl_reg * const mctl_ctl = >>>> (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE; >>>> >>>> int i; >>>> u16 zq_val[6]; >>>> u32 reg_val, val; >>>> >>>> mctl_dbg("ZQ calibration... "); >>>> >>>> writel(0x0a0a0a0a, &mctl_ctl->zqdr[2]); >>>> >>>> for (i = 0; i < 6; i++) >>>> { >>>> u8 zq = (CONFIG_DRAM_ZQ >> (i * 4)) & 0xf; >>>> >>>> writel((zq << 20) | (zq << 16) | (zq << 12) | >>>> (zq << 8) | (zq << 4) | (zq << 0), >>>> &mctl_ctl->zqcr); >>>> >>>> writel(PIR_CLRSR, &mctl_ctl->pir); >>>> mctl_phy_init(PIR_ZCAL); >>>> >>>> val = readl(&mctl_ctl->zqdr[0]) & 0xff; >>>> writel((val << 24) | (val << 16) | (val << 8) | (val << 0), >>>> &mctl_ctl->zqdr[2]); >>>> >>>> writel(PIR_CLRSR, &mctl_ctl->pir); >>>> mctl_phy_init(PIR_ZCAL); >>>> >>>> zq_val[i] = val | >>>> (bin_to_mgray(mgray_to_bin(readl(&mctl_ctl->zqdr[0]) >> 24) - 1) << 8); >>>> } >>>> >>>> writel((zq_val[1] << 16) | zq_val[0], &mctl_ctl->zqdr[0]); >>>> writel((zq_val[3] << 16) | zq_val[2], &mctl_ctl->zqdr[1]); >>>> writel((zq_val[5] << 16) | zq_val[4], &mctl_ctl->zqdr[2]); >>>> >>>> mctl_dbg((mctl_ctl->zqsr & (1 << 30)) ? "ERROR\n" : "DONE\n"); >>>> } >>> >>> Thanks, using this implementation fixes the reliability problems at >>> 672MHz on my Orange Pi PC board. >> >> Ok, I'll replace the SDK version then. >> >>> >>>> And while trying to figure out what's the reason for this I used the >>>> following debug function, it might help you if you want to debug ZQ cal: >>>> >>>> static void dump_zq(void) >>>> { >>>> struct sunxi_mctl_ctl_reg * const mctl_ctl = >>>> (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE; >>>> >>>> int i; >>>> static const char *mod[3] = { "control", "DX0/DX1", "DX2/DX3" }; >>>> static const char *error[4] = { "\t", "overflow", >>>> "underflow", "in progress" }; >>>> >>>> printf("== ZQ calibration %s %s ==\n", >>>> (mctl_ctl->zqsr & (1 << 31)) ? "DONE" : "", >>>> (mctl_ctl->zqsr & (1 << 30)) ? "ERROR" : ""); >>>> >>>> printf("\tODT pull-up\tODT pull-down\tDRV pull-up\tDRV pull-down\n"); >>>> >>>> for (i = 0; i < 3; i++) >>>> printf("%s\t%2u %s\t%2u %s\t%2u %s\t%2u %s\n", mod[i], >>>> mgray_to_bin((mctl_ctl->zqdr[i] >> 24) & 0x1f), >>>> error[(mctl_ctl->zqsr >> (i * 8 + 6)) & 0x3], >>>> mgray_to_bin((mctl_ctl->zqdr[i] >> 16) & 0x1f), >>>> error[(mctl_ctl->zqsr >> (i * 8 + 4)) & 0x3], >>>> mgray_to_bin((mctl_ctl->zqdr[i] >> 8) & 0x1f), >>>> error[(mctl_ctl->zqsr >> (i * 8 + 2)) & 0x3], >>>> mgray_to_bin((mctl_ctl->zqdr[i] >> 0) & 0x1f), >>>> error[(mctl_ctl->zqsr >> (i * 8 + 0)) & 0x3]); >>>> } >>> >>> With your original code from github I get: >>> >>> == ZQ calibration DONE ERROR == >>> ODT pull-up ODT pull-down DRV pull-up DRV pull-down >>> control 16 underflow 16 12 underflow 12 >>> DX0/DX1 12 underflow 12 12 underflow 12 >>> DX2/DX3 12 12 12 underflow 12 >>> >>> With the implementation from your e-mail I get: >>> >>> == ZQ calibration DONE == >>> ODT pull-up ODT pull-down DRV pull-up DRV pull-down >>> control 20 17 15 13 >>> DX0/DX1 6 5 15 13 >>> DX2/DX3 6 5 15 13 >>> >> >> That looks *much* better :) >> >>> >>> I'm going to try running lima-memtester on the device and also will >>> do a detailed review of your code by the end of the weekend. BTW, which >>> boot0 binary did you use as a reference? It might help to make sure >>> that we are actually looking at exactly the same thing. >> >> If I remember correctly the libdram came from the SDK offered at the >> orange pi downloads. I roughly compared it to the preinstalled android >> boot0, it looked the same but had some symbols. >> But I dropped most unused codepaths and tried to cleaned up a lot of magic. > > Thanks for the explanations. I finally got lima-memtester up and > running on H3 hardware (not that it was difficult, but just the amount > of unnecessary compatibility breaks in the H3 SDK kernel was a bit > unexpected and really discouraging, they are really doing almost > *everything* in a different way compared to A10/A20 SDK kernel just > for the sake of making things different): > > > https://github.com/ssvb/lima-memtester/releases/tag/20151207-orange-pi-pc-fel-test > > Currently U-Boot v2016.01-rc2 fails the lima-memtester check unless > the DRAM clock speed is reduced to 456 MHz on my Orange Pi PC. But > if I use the boot0 bootloader to boot the same kernel image > (uImage-3.4-sun8i-h3-lima-memtester), then the test works fine. > This means that there must be still something wrong with the H3 > DRAM init code in U-Boot, or maybe some clocks are wrong >
I tried to compare boot0 and the SDK source again and actually found another difference. The read delays (except for dqs) are doubled in boot0 before writing them to the registers. Looks like they suddenly needed higher values than possible with 4 bit. The register seems to take 6 bit wide values. After fixing that, lima-memtester doesn't fail at 672 MHz anymore on my Orange Pi Plus. Before it failed at 552 and higher. Patch below. Maybe we could also replaced the setbits with writel, because I don't see a good reason not to overwrite the registers, they are all zero after reset and have 0x00003f3f mask. Jens diff --git a/arch/arm/cpu/armv7/sunxi/dram_sun8i_h3.c b/arch/arm/cpu/armv7/sunxi/dram_sun8i_h3.c index b721d60..03bd013 100644 --- a/arch/arm/cpu/armv7/sunxi/dram_sun8i_h3.c +++ b/arch/arm/cpu/armv7/sunxi/dram_sun8i_h3.c @@ -73,10 +73,10 @@ static void mctl_dq_delay(u32 read, u32 write) for (i = 0; i < 4; i++) { val = DATX_IOCR_WRITE_DELAY((write >> (i * 4)) & 0xf) | - DATX_IOCR_READ_DELAY((read >> (i * 4)) & 0xf); + DATX_IOCR_READ_DELAY(((read >> (i * 4)) & 0xf) * 2); for (j = DATX_IOCR_DQ(0); j <= DATX_IOCR_DM; j++) - setbits_le32(&mctl_ctl->datx[i].iocr[j], val); + writel(val, &mctl_ctl->datx[i].iocr[j]); } clrbits_le32(&mctl_ctl->pgcr[0], 1 << 26); @@ -85,8 +85,8 @@ static void mctl_dq_delay(u32 read, u32 write) val = DATX_IOCR_WRITE_DELAY((write >> (16 + i * 4)) & 0xf) | DATX_IOCR_READ_DELAY((read >> (16 + i * 4)) & 0xf); - setbits_le32(&mctl_ctl->datx[i].iocr[DATX_IOCR_DQS], val); - setbits_le32(&mctl_ctl->datx[i].iocr[DATX_IOCR_DQSN], val); + writel(val, &mctl_ctl->datx[i].iocr[DATX_IOCR_DQS]); + writel(val, &mctl_ctl->datx[i].iocr[DATX_IOCR_DQSN]); } setbits_le32(&mctl_ctl->pgcr[0], 1 << 26); -- You received this message because you are subscribed to the Google Groups "linux-sunxi" group. To unsubscribe from this group and stop receiving emails from it, send an email to linux-sunxi+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.