On Fri, May 27, 2011 at 4:33 PM, J. William Campbell <jwilliamcampb...@comcast.net> wrote: > On 5/26/2011 9:33 PM, Graeme Russ wrote: >> >> Hi Bill, >> > <snip> >>
[massive snip] OK, you have my ears pricked - Can you give me code samples for: - get_ticks() - sync_timbase() (no need to implement the whole lot if that is too much effort right now) - timer_isr() that handle the following hardware tick counter scenarios: a) 64-bit up counter b) 64-bit down counter c) 32-bit up counter, wraps at 65000 d) 16-bit microsecond up counter 0-999 which wraps and triggers a 16-bit millisecond up counter. Reading milliseconds latched microseconds and clears milliseconds (look in arch/x86/cpu/sc520/timer.c for example) e) 24-bit counter occupying bits 2-25 of a 32-bit word (just to be difficult) f) Any other option anyone cares to throw ;) All of these options must be covered using: - Minimal global data (we would like it to work before relocation, but not mandatory - GD footprint would be nice) - All use the same sync_timebase function - Demonstrate using an ISR NOT synced to the hardware tick counter and an ISR that is - Parameters to get_ticks() and sync_timer() are permitted, but not for timer_isr() (naturally) > I don't' see any reason to push this down to a lower level. It is just one > more thing to get messed up across implementations. Agreed >>> detection in the non-interrupt case to sync_timebase as well. >>> Sync_timebase >>> can also invert the down-counting counters, removing that from get_ticks. >>> The wrap detection code can be #ifdef out if one is using interrupts and >> >> Urghh - Why are you adding unnecessary ugliness - #ifdef in the middle of >> code is usually a sign you are doing something wrong > > As I said, this is an optional optimization. I do not agree that an #ifdef > in the middle of code indicates you have a bad design. Lots and Lots of > ifdefs certainly indicates a bad design. An ifdef to eliminate code if some > option is not selected is hardly such a strange thing, especially only a > single #ifdef. However, feel free to not have it if you so desire. OK, I'll let this slide for the moment - please include in above example >>> offended by it's presence. Thanks for pointing this out and compelling me >>> to >>> reduce the number of cases! Making get_ticks more lightweight is a good >>> idea >>> in my opinion. >>>> >>>> Lets say you have a platform with a 32-bit tick counter running at a >>>> reasonably long rollover time so you decide not to use interrupts. Then >>>> you create a new platform with the same tick counter, but it runs much >>>> faster and you realise you need to implement the interrupt routine to >>>> make get_timer() work for long enough periods - Fine, you add an ISR >>>> for the new platform that calls sync_timebase - No other changes are >>>> required. >>>> >>>> The last thing we want is for the 64-bit tick counter to be conceptually >>>> different across platforms >>>> >>>> I just realised - the ISR _does not need to call the sync_timebase at >>>> all_ >>>> The ISR only needs to call get_ticks(), so it will be fast anyway >>> >>> The problem is that the way we previously detected wrapping does not work >>> if >>> the interrupt rate is == to the counter wrap time, which it essentially >>> always is. If get_ticks is trying to update the wrap count when an >>> interrupt >> >> Is it, always, on every platform? > > Yes, pretty much. You get a terminal count/counter underflow interrupt and > that is it. Not on sc520 - The micro/millisecond counter cannot be used to driver an interrupt - you need to setup a seperate timer. I think the x86 internal performance counters are the same >>> comes in, it will do it wrong. If the interrupt rate was faster, it would >>> work, because get_ticks would always know the correct answer. Consider >>> the >>> following. Get ticks is called by the counter overflowing (which resets >>> it >>> to 0) and executes with the counter value at 0. Later, at the next >>> rollover, >>> with no intervening calls to get ticks, the interrupt routine calls get >>> ticks. Assuming negligible interrupt latency, get_ticks just sees the >>> counter is still 0, so it will not detect a wrap condition. So now you >>> loose >> >> But wait a minute, don't you KNOW that the interrupt gets triggered on a >> rollover so therefore there MUST have been a rollover, therefore there has >> been 2^32 (or 2^16 or 2^whatever) ticks which you can just subtract from >> the >> last known tick count (which would be zero if there had been no get_timer >> calls in between) > > Except if this interrupt was delayed because get_ticks turned off > interrupts, get_ticks may have already compensated the number. When we then > get the (delayed) interrupt, we will do it twice. That would mean get_timer() got called EXACTLY on the rollover - A race condition that should be avoided. But still, a race can still occur through using disable/enable interrupts which could push the timer isr out >>> a period. I thought by passing in a flag we could force get_ticks to do >>> the >>> right thing in this case, but since we must disable interrupts when we >>> call >>> get ticks from the get_timer routine, the outer call to get ticks may >>> have >>> already detected the rollover and the flag will force an additional >>> rollover >>> to be detected. It is a classic race condition. If we detect rollover >>> only >> >> Aha! - In nearly all situations, a race is better avoided than run! > > Yes, that is why the new approach does NOT turn off interrupts when > get_ticks is called, but rather loops if the overflow count was changed > while reading the count lsb. That ensures get_ticks always gets both words > either before the counter wrapped or after the counter wrapped, but not > halfway in between. Some ISYNCs may be required to make sure there are no > outstanding changes pending to memory, depending on your architecture. OK, lets have a closer look >>> in the interrupt routine, we do not need to protect the rollover variable >>> from access by the main program, so we can be sure the results of >>> get_ticks >>> is coherent. If we try do do it in both places, we will have trouble. If >>> the >>> interrupt occurred at a faster rate, like say twice the rollover, we >>> wouldn't have a problem. But it doesn't. In most cases it happens at the >>> rollover rate, or just a bit more sometimes due to interrupt latency not >> >> Again does it really - for all arches? > > Yep, pretty much. Almost all timers I know of only give one interrupt per > cycle. A bold assumption - and one that is wrong ;) >>> being a constant. It may appear to happen at somewhat less than a >>> rollover >>> as well, if the interrupt latency was long at time n-1 and short at time >>> n. >>> I think this problem can be overcome in the old design by keeping track >>> of >>> whether the last update was by a non-interrupt or interrupt call to >>> get_ticks , but I think the new design is better anyway. >> >> I disagree - The whole point of a HAL is to Abstract the Hardware > > The HAL should abstract what needs abstracting, and no more. We are really > talking about literally 4 lines of code here, but I think those four lines > certainly belong in sync_timer. All the math in one place is a good thing. > In most cases, get_ticks becomes trivial. Only in cases where the timer is > wrong endian or embedded in another bit field (both things being possible > but not all that common) does get_ticks need to do anything other than just > read the registers and loop until the msb is stable. Simple here is good, > because this is what new implementers need to add/change. The less here, the > better! In fact, the looping until the msb is stable can also be moved up to > sync_timer. That is even simpler. OK, you got me - "show me the money^H^H^H^H^Hcode" Regards, Graeme _______________________________________________ U-Boot mailing list U-Boot@lists.denx.de http://lists.denx.de/mailman/listinfo/u-boot