On Mon, 28 Sep 2009 16:27:10 +0200 Roland Mainz wrote:
> Glenn Fowler wrote:
> > it would be nice to see the improvements all of these ifdef sun* actually
> > produce
> For a test file with 24884706 bytes in /tmp (=tmpfs/ramdisk):
> - GNU "cksum" currently takes 181 seconds for 1000 iterations
> - AST "cksum" (called as external program) currently takes 244 seconds
>   for 1000 iterations (partially caused by dragging more shared libraries
>   around, startup time issue with libast-based applications and some other
>   things)
> - AST "cksum" called as ksh93 builtin takes 216 seconds for 1000
>   iterations (e.g. ~24 seconds are saved compared to the external
>   application)

thanks for the data

can you provide some base case data for a standalone ast app
that exits after the optget() loop
this will take into account the runtime shared lib loads
plus i18n/l10n initialization
that will provide a lower bound on optimizations for any ast app

> The basic idea of the patch is to prefetch memory block x+1, then cksum
> block x, then prefetch block x+2, then cksum block x+1 etc. This reduces
> the time the code has to spend waiting for data becoming "ready" (e.g.
> loaded in the L1 cache or similar).

I'm not a chip designer but aren't such optimizations supposed to be done
in the hw/fw?
and wouldn't any hard-coded prefetch optimizations be sensitive to the
L1 cache size?
and wouldn't that be sensitive to the data blocking done by the algorithms?
e.g., suppose sum(1) used sizeof(L1) as its blocking size
would that effectively disable the hard-coded L1 prefetch calls?
is there a relationship between sizeof(L1) and the optimal sizes for
{ mmap() read() write() } ?
if there are performance conflicts between these sizes
how do you decide which ones to hard code
among a range of hw configurations?

> BTW: I have a new patch queued for "cksum" which further improves the
> performance (primarily by using a static table for "cksum"'s CRC data
> and other stuff).

those kinds of changes will easily make it into the upstream

thanks