Code:

---------------------------------------------------------------------

#include <libopencm3/stm32/rcc.h>
#include <libopencm3/stm32/timer.h>

int main(void)
{
        rcc_periph_clock_enable(RCC_TIM3);
        timer_set_period(TIM3, 1);
        return 0;
}

---------------------------------------------------------------------

Using default compiler options for f4:

---------------------------------------------------------------------

080001ac <main>:
 80001ac:    b508          push    {r3, lr}
 80001ae:    f640 0001     movw    r0, #2049    ; 0x801
 80001b2:    f000 f80b     bl    80001cc <rcc_periph_clock_enable>
 80001b6:    2101          movs    r1, #1
 80001b8:    4802          ldr    r0, [pc, #8]    ; (80001c4 <main+0x18>)
 80001ba:    f000 f805     bl    80001c8 <timer_set_period>
 80001be:    2000          movs    r0, #0
 80001c0:    bd08          pop    {r3, pc}
 80001c2:    bf00          nop
 80001c4:    40000400     .word    0x40000400

080001c8 <timer_set_period>:
 80001c8:    62c1          str    r1, [r0, #44]    ; 0x2c
 80001ca:    4770          bx    lr

080001cc <rcc_periph_clock_enable>:
 80001cc:    0943          lsrs    r3, r0, #5
 80001ce:    f103 4380     add.w    r3, r3, #1073741824    ; 0x40000000
 80001d2:    f503 330e     add.w    r3, r3, #145408    ; 0x23800
 80001d6:    f000 021f     and.w    r2, r0, #31
 80001da:    6819          ldr    r1, [r3, #0]
 80001dc:    2001          movs    r0, #1
 80001de:    4090          lsls    r0, r2
 80001e0:    4308          orrs    r0, r1
 80001e2:    6018          str    r0, [r3, #0]
 80001e4:    4770          bx    lr

---------------------------------------------------------------------

The store in rcc_periph_clock_enable happens at 80001e2. After returning to main(), we take 2 cycles to load arguments to timer_set_period, then calls it, and timer_set_period writes to the timer register in the first instruction.

If my math is correct, the second write happens on the 5th cycle after the first write. This violates the errata sheet constraint by one cycle. For APB peripherals, the errata sheet says we need to wait 1 + (AHB/APB) cycles, which in this case would be 5 (most people use (AHB/APB = 4).

Now this is with default compiler options on f4, with the lowest APB prescaler. With a higher prescaler, it would be even worse. With -flto, it would be even worse. On f7, we are very very far from the 8 cycles required (keeping in mind that Cortex-M7 can execute 2 instructions per cycle).

So I believe we are hitting it on all series (that require this delay), at default compiler settings and clock settings provided by libopencm3.

Matthew

On 03/13/2017 08:15 PM, Chuck McManis wrote:
Matthew,

I'm really grateful you brought up the warning you read in the F7 manual. Clearly if someone tried to read or write a register and it wasn't ready the system would hard fault and that would be bad.

I have reviewed the code and the warning in the datasheet and believe it is impossible to hit this situation with the current library. Certainly on the F0-F4 series and unlikely on the F7 series.

It sounded like you still had concerns. If you do, it would be super helpful if you had an example that demonstrated this problem and especially helpful if that example could be recreated on one of the ST demo boards.

--Chuck

On Mon, Mar 13, 2017 at 1:01 PM, Matthew Lai <[email protected] <mailto:[email protected]>> wrote:

    Chuck, I'm not sure why you feel that way, but having
    disagreements and discussions about technical aspects of a project
    is usually a good thing.

    I am also not sure how I am not being helpful by pointing out
    something on the errata sheets that would be a potential source of
    bugs, some ways it can happen, and some ways to mitigate it, for
    different chips, based on the chip manufacturer's recommendations.

    Use any combination of compiler flags to do what? Prove that there
    will be a less than 16 instructions delay between clock enable and
    the first peripheral register write? Does that really need
    proving? You don't even need to change compiler flags for that.

    I also have no idea what ST's supplied software has anything to do
    with any of this.

    Can we go back to technical discussions now?

    If so, here is how Chromium OS fixed it:
    https://chromium-review.googlesource.com/c/246181/
    <https://chromium-review.googlesource.com/c/246181/>

    Matthew

    On 03/13/2017 07:41 PM, Chuck McManis wrote:
    Matthew, it feels like you are trying to create an argument here.

    If you want to be helpful, and this is actually a problem given
    the way libopencm3 is built, then *reproduce the problem* with
    libopencm3. You can use any combination of compiler flags you
    want, you just can't change the source inside of libopencm3.

    *If* you can reproduce the problem then that will show what the
    window of opportunity is. If you cannot, then you will see that
    Karl is correct in that it won't occur in libopencm3.

    If having this issue is such a problem for you, then by all means
    use the ST Micro supplied software which is written presumably by
    their experts.

    ---Chuck

    On Mon, Mar 13, 2017 at 10:57 AM, Matthew Lai <[email protected]
    <mailto:[email protected]>> wrote:

        On 03/13/2017 01:25 PM, Karl Palsson wrote:

            Matthew Lai <[email protected] <mailto:[email protected]>> wrote:

                    I would ignore it.  There's very few places where
                    it will really matter, and the people who will
                    need it, will be handling it themselves anyway,
                    not using high level helper functions.

                It would mean things like:
                rcc_periph_clock_enable(RCC_TIM3);
                timer_set_period(TIM3, 1024);

                Would violate this, at high prescalar settings (or low
                prescalar settings with LTO). That is actually the
                example
                given here:
                
https://github.com/libopencm3/libopencm3/blob/master/lib/stm32/common/timer_common_all.c#L70
                
<https://github.com/libopencm3/libopencm3/blob/master/lib/stm32/common/timer_common_all.c#L70>

                It would be a potential source of very difficult to
                find bugs.

            There's plenty of edge cases if you really want to go
            looking.
            I'm not really entirely sure how far to go trying to protect
            people. [1]

        That is true, but in this case, an user that uses libopencm3
        functions to setup clocks, enables clock for a timer, and
        uses it right away will be bitten.

        I'd say we should protect the user in this case, since 1)
        it's a very common sequence, 2) most users probably don't
        know about it (it's not mentioned at all in reference manuals
        until f7), 3) resulting bugs will be very hard to find, and
        4) I think there's a pretty easy fix (see below).

        APB1 is usually AHB/4 on f2, f4, and f7 at least. That means
        5 cycles are required for f2 and f4, and 8 cycles on f7. I
        wouldn't want to rely on the compiler to happen to generate
        that much delay, especially on f7 which is dual-issue (8
        cycles can mean 16 instructions).

                Since I imagine rcc_periph_clock_enable() to not be
                performance-critical in most applications, wouldn't
                adding the
                wait states be better default behaviour, and people
                who really
                need to enable clocks very quickly can go to registers
                directly?

            It's more that I'm not sure you can do it neatly, and be
            robust
            enough to be worth trying. You'll need to interpret what
            bus a
            peripheral is on, as well as what speed it's currently
            running
            at. Doing this using the rcc_axx_speed globals is... ok, but
            they're not set if people didn't use clock helpers to set
            them
            up.

            Adding your peripheral speed as a parameter is a gross
            user API.
            For the vast majority of use cases, this is simply never
            going to
            be a problem. Doing any sort of meaningful cycle delay is
            going
            to just add code that most people will never need.

            In my opinion at least :)

        According to the errata sheets, it's very easy on pre-f7
        chips. All that's required is a DSB barrier instruction. We
        don't actually have to do any cycle counting, or determining
        which bus the peripheral is on.

        Unfortunately that option is not mentioned for f7, which
        requires 2x peripheral clock cycles (could be because of the
        new AXIM bus). However, there's still no need to figure out
        which bus a peripheral is on - we can always just do enough
        delay for APB1. There's no harm in delaying for longer than
        necessary, unless someone needs to enable clocks really
        quickly. Figuring out APB1 frequency is not easy. I would
        just use the globals and add a note in the documentation that
        users are on their own if they are setting up their own clocks.

        Matthew





------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
libopencm3-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/libopencm3-devel

Reply via email to