On Sep 15, 2014, at 12:00 AM, Tian, Feng <[email protected]> wrote:
> Hi, Laszlo > > Here is our intention for the logic: > > If Microseconds is greater than or equal to 0x1999999999999999ULL(about > 58,494 years), then the divide operation is performed first, which means we > will lose some accuracy, but this is a very large delay request, so waiting a > little bit extra in this extreme case is an acceptable compromise. > > So I would prefer not to add this patch if there is no real impact. > Was this issue caught by some static analysis tool? Why do we need to have code to deal with something that could never happen? Thanks, Andrew Fish > Thanks > Feng > > -----Original Message----- > From: Laszlo Ersek [mailto:[email protected]] > Sent: Sunday, September 14, 2014 21:07 > To: [email protected] > Subject: [edk2] [PATCH 2/2] MdeModulePkg: CoreStall: improve accuracy in > iterating branch > > The iterating branch of CoreStall() adds an extra tick for each iteration if > the division returned a nonzero remainder; effectively rounding up to whole > ticks separately per iteration. This has the potential to waste up to 9 ticks. > > While the current code is technically correct ("the Stall() function stalls > execution on the processor for *at least* the requested number of > microseconds", according to the UEFI spec, emphasis mine), we can improve the > accuracy by rounding up to the exact whole number of ticks. We accumulate > fractional ticks during the iteration, and flush whole ticks as they become > available. > > Contributed-under: TianoCore Contribution Agreement 1.0 > Signed-off-by: Laszlo Ersek <[email protected]> > --- > MdeModulePkg/Core/Dxe/Misc/Stall.c | 44 +++++++++++++++++++++++++++++--------- > 1 file changed, 34 insertions(+), 10 deletions(-) > > diff --git a/MdeModulePkg/Core/Dxe/Misc/Stall.c > b/MdeModulePkg/Core/Dxe/Misc/Stall.c > index aae54fd..4c63624 100644 > --- a/MdeModulePkg/Core/Dxe/Misc/Stall.c > +++ b/MdeModulePkg/Core/Dxe/Misc/Stall.c > @@ -51,51 +51,75 @@ CoreInternalWaitForTick ( **/ EFI_STATUS EFIAPI > CoreStall ( > IN UINTN Microseconds > ) > { > UINT64 Counter; > UINT32 Remainder; > UINTN Index; > + UINT64 Accumulator; > > if (gMetronome == NULL) { > return EFI_NOT_AVAILABLE_YET; > } > > // > // Counter = Microseconds * 10 / gMetronome->TickPeriod > // 0x1999999999999999 = (2^64 - 1) / 10 > // > if (Microseconds > 0x1999999999999999ULL) { > // > // Microseconds is too large to multiple by 10 first. Perform the divide > // operation first and loop 10 times to avoid 64-bit math overflow. > // > Counter = DivU64x32Remainder ( > Microseconds, > gMetronome->TickPeriod, > &Remainder > ); > + > + Accumulator = 0; > + // > + // In each iteration we wait for Counter whole ticks, and > + // (Remainder/TickPeriod) fractional ticks (the latter being strictly > less > + // than one). Clearly we can only pass whole ticks to > + // CoreInternalWaitForTick(), therefore we keep a running account of the > + // fractional ticks. Whenever enough fractional ticks accumulate, we > flush > + // a whole tick from them. > + // > + // Note that Accumulator will never reach or exceed 2*TickPeriod, because > + // Remainder < TickPeriod. This has two consequences: > + // - We never have to flush more than 1 whole tick. > + // - Accumulator is always expressible as UINT64 (TickPeriod being > UINT32). > + // > for (Index = 0; Index < 10; Index++) { > - CoreInternalWaitForTick (Counter); > - } > + Accumulator += Remainder; > + if (Accumulator < gMetronome->TickPeriod) { > + CoreInternalWaitForTick (Counter); > + } else { > + Accumulator -= gMetronome->TickPeriod; > + if (Counter == MAX_UINT64) { > + CoreInternalWaitForTick (Counter); > + CoreInternalWaitForTick (1); > + } else { > + CoreInternalWaitForTick (Counter + 1); > + } > + } > + } > > - if (Remainder != 0) { > - // > - // If Remainder was not zero, then normally, Counter would be rounded > - // up by 1 tick. In this case, since a loop for 10 counts was used > - // to emulate the multiply by 10 operation, Counter needs to be rounded > - // up by 10 counts. > - // > - CoreInternalWaitForTick (10); > + // > + // Flush any remaining fractional ticks. > + // > + if (Accumulator > 0) { > + CoreInternalWaitForTick (1); > } > } else { > // > // Calculate the number of ticks by dividing the number of microseconds by > // the TickPeriod. Calculation is based on 100ns unit. > // > Counter = DivU64x32Remainder ( > MultU64x32 (Microseconds, 10), > gMetronome->TickPeriod, > &Remainder > -- > 1.8.3.1 > > > ------------------------------------------------------------------------------ > Want excitement? > Manually upgrade your production database. > When you want reliability, choose Perforce > Perforce version control. Predictably reliable. > http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.clktrk > _______________________________________________ > edk2-devel mailing list > [email protected] > https://lists.sourceforge.net/lists/listinfo/edk2-devel > > ------------------------------------------------------------------------------ > Want excitement? > Manually upgrade your production database. > When you want reliability, choose Perforce > Perforce version control. Predictably reliable. > http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.clktrk > _______________________________________________ > edk2-devel mailing list > [email protected] > https://lists.sourceforge.net/lists/listinfo/edk2-devel ------------------------------------------------------------------------------ Want excitement? Manually upgrade your production database. When you want reliability, choose Perforce Perforce version control. Predictably reliable. http://pubads.g.doubleclick.net/gampad/clk?id=157508191&iu=/4140/ostg.clktrk _______________________________________________ edk2-devel mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/edk2-devel
