On Tue, 02 Mar 2021 21:50:49 +0800 Icenowy Zheng <icen...@aosc.io> wrote:
Hi Icenowy, > 于 2021年3月2日 GMT+08:00 下午9:40:44, Andre Przywara <andre.przyw...@arm.com> 写到: > >On Fri, 26 Feb 2021 00:13:25 +0800 > >Icenowy Zheng <icen...@aosc.io> wrote: > > > >> Previously we do not have proper dual rank memory detection on R40 > >> (because we omitted PIR_QSGATE, which does not work on R40 with our > >> configuration), and dual rank memory is just simply disabled as early > >> R40 boards available (Banana Pi M2 Ultra and Berry) have single rank > >> memory. > >> > >> As a board with dual rank memory (Forlinx OKA40i-C) is now known to > >us, > >> we need to have a way to do memory rank detection to support that > >board. > >> > >> Add some routine to detect memory rank by trying to access the memory > >> in rank 1 and check for error status of the memory controller, and > >then > >> enable dual rank memory on R40. > >> > >> Similar routine can be used to detect half DQ width (which is also > >> detected by PIR_QSGATE on other SoCs), but it's left unimplemented > >> because there's no known R40 board with half DQ width now. > > > >So when looking at the SPL size I realised that this patch breaks the > >socid constant parameter propagation, especially for mctl_set_cr(). I > >don't see immediately why, this whole code here should be compiled out > >on any A64 builds, for instance. Instead GCC turns > >"<mctl_set_cr.constprop.0>:" into "<mctl_set_cr>:", and passes 0x1689 > >around everytime. I tried GCC 10.2.0 and 9.2.0, also tried adding const > >everywhere, to amplify the constant nature of this value. Patch > >1/2 added to the code size, but kept the const propagation (only one > >instance of 0x1689 in the disassembly). This patch here should be for > >free on A64, but adds 104 bytes. > > > >Does anyone have a clue why this is happening? Is this a compiler > >issue? > > Maybe the issue is that I assume this codepath is only for R40 and > I didn't add socid to it? But that's clearly visible by this function only being called inside "if (socid == SOCID_R40)". And that works very well for the H3 ZQ calibration quirk, for instance. > Could you try to add a socid parameter to mctl_r40_detect_rank_count()? I just tried that, and apart from looking weird it didn't do anything. To be clear: Your code is absolutely fine, exactly the way it should be written. It's just that the compiler is playing stupid suddenly. I was thinking that your dummy readl might have upset it, but even with that commented out it's still happening. > Or maybe it makes mctl_calc_rank_size() not inlined? So the assembly looks like everything apart from mctl_set_cr() and mctl_auto_detect_dram_size_rank() is inlined. Those are extra because they are called multiple times and we are using -Os. > (Well I think the code looks noop on non-R40) Exactly that was my thinking: when compiling for the A64, it should be totally compiled out, and the object file should be the same before and after. > BTW, original rank/width detection part won't get called on R40. But > R40 is not where we are tight on code size, so I didn't bother to ignore > it on R40. I see. Yeah, I am not concerned about R40 either, but I want to avoid the A64 bloating up. > Or should we switch back to #if's and do not rely on compiler behavior any > longer? I'd rather not do that. We are doing the right thing, and it works marvellously so far. To be clear: it's not a show stopper, I was just curious why this happens. The code size is not really critical on the A64 at the moment, so I'd merge it as it, even if we don't find a solution. Maybe just leave a hint about it in the code should people need to come back to this. I also asked some compiler buffs about it, but it's not exactly the simple reproducer case they like to deal with ;-) Cheers, Andre > > > > >Cheers, > >Andre > > > >> Signed-off-by: Icenowy Zheng <icen...@aosc.io> > >> --- > >> arch/arm/mach-sunxi/dram_sunxi_dw.c | 55 > >+++++++++++++++++++++++++---- > >> 1 file changed, 49 insertions(+), 6 deletions(-) > >> > >> diff --git a/arch/arm/mach-sunxi/dram_sunxi_dw.c > >b/arch/arm/mach-sunxi/dram_sunxi_dw.c > >> index 2b9d631d49..b86ae7cdf3 100644 > >> --- a/arch/arm/mach-sunxi/dram_sunxi_dw.c > >> +++ b/arch/arm/mach-sunxi/dram_sunxi_dw.c > >> @@ -414,11 +414,9 @@ static void mctl_set_cr(uint16_t socid, struct > >dram_para *para) > >> } > >> > >> if (socid == SOCID_R40) { > >> - if (para->dual_rank) > >> - panic("Dual rank memory not supported\n"); > >> - > >> /* Mux pin to A15 address line for single rank memory. */ > >> - setbits_le32(&mctl_com->cr_r1, MCTL_CR_R1_MUX_A15); > >> + if (!para->dual_rank) > >> + setbits_le32(&mctl_com->cr_r1, MCTL_CR_R1_MUX_A15); > >> } > >> } > >> > >> @@ -702,8 +700,55 @@ static unsigned long mctl_calc_rank_size(struct > >rank_para *rank) > >> return (1UL << (rank->row_bits + rank->bank_bits)) * > >rank->page_size; > >> } > >> > >> +/* > >> + * Because we cannot do mctl_phy_init(PIR_QSGATE) on R40 now (which > >leads > >> + * to failure), it's needed to detect the rank count of R40 in > >another way. > >> + * > >> + * The code here is modelled after time_out_detect() in BSP, which > >tries to > >> + * access the memory and check for error code. > >> + * > >> + * TODO: auto detect half DQ width here > >> + */ > >> +static void mctl_r40_detect_rank_count(struct dram_para *para) > >> +{ > >> + ulong rank1_base = (ulong) CONFIG_SYS_SDRAM_BASE + > >> + mctl_calc_rank_size(¶->ranks[0]); > >> + struct sunxi_mctl_ctl_reg * const mctl_ctl = > >> + (struct sunxi_mctl_ctl_reg *)SUNXI_DRAM_CTL0_BASE; > >> + > >> + /* Enable read time out */ > >> + setbits_le32(&mctl_ctl->pgcr[0], 0x1 << 25); > >> + > >> + (void) readl((void *) rank1_base); > >> + udelay(10); > >> + > >> + if (readl(&mctl_ctl->pgsr[0]) & (0x1 << 13)) { > >> + clrsetbits_le32(&mctl_ctl->dtcr, 0xf << 24, 0x1 << 24); > >> + para->dual_rank = 0; > >> + } > >> + > >> + /* Reset PHY FIFO to clear it */ > >> + clrbits_le32(&mctl_ctl->pgcr[0], 0x1 << 26); > >> + udelay(100); > >> + setbits_le32(&mctl_ctl->pgcr[0], 0x1 << 26); > >> + > >> + /* Clear error status */ > >> + setbits_le32(&mctl_ctl->pgcr[0], 0x1 << 24); > >> + > >> + /* Clear time out flag */ > >> + clrbits_le32(&mctl_ctl->pgsr[0], 0x1 << 13); > >> + > >> + /* Disable read time out */ > >> + clrbits_le32(&mctl_ctl->pgcr[0], 0x1 << 25); > >> +} > >> + > >> static void mctl_auto_detect_dram_size(uint16_t socid, struct > >dram_para *para) > >> { > >> + if (socid == SOCID_R40) { > >> + mctl_r40_detect_rank_count(para); > >> + mctl_set_cr(socid, para); > >> + } > >> + > >> mctl_auto_detect_dram_size_rank(socid, para, > >(ulong)CONFIG_SYS_SDRAM_BASE, ¶->ranks[0]); > >> > >> if ((socid == SOCID_A64 || socid == SOCID_R40) && para->dual_rank) > >{ > >> @@ -854,8 +899,6 @@ unsigned long sunxi_dram_init(void) > >> uint16_t socid = SOCID_H3; > >> #elif defined(CONFIG_MACH_SUN8I_R40) > >> uint16_t socid = SOCID_R40; > >> - /* Currently we cannot support R40 with dual rank memory */ > >> - para.dual_rank = 0; > >> #elif defined(CONFIG_MACH_SUN8I_V3S) > >> uint16_t socid = SOCID_V3S; > >> #elif defined(CONFIG_MACH_SUN50I) -- You received this message because you are subscribed to the Google Groups "linux-sunxi" group. To unsubscribe from this group and stop receiving emails from it, send an email to linux-sunxi+unsubscr...@googlegroups.com. To view this discussion on the web, visit https://groups.google.com/d/msgid/linux-sunxi/20210302151904.0d628797%40slackpad.fritz.box.