Re: [Alsa-devel] Re: dmix plugin

Abramo Bagnara Fri, 21 Feb 2003 02:27:16 -0800

Jaroslav Kysela wrote:
> 
> On Thu, 20 Feb 2003, Abramo Bagnara wrote:
> 
> > Now I'm able to get the same results you see.
> >
> > However I think that we need to extract some results from this data.
> >
> > I'll leave alone MMX optimizations because I want to compare apples with
> > apples.
> >
> > The distributed saturation (also when it's missing the check/repeat
> > concurrency correctness part) costs more than 4 times the ticks needed
> > for a (fully correct wrt concurrency) saturate once approach for the
> > case 2048 8 32768.
> >
> > CPU clock: 1460477150.884593
> > mix_areas0: 86747 0.031975%
> > mix_areas1: 259424 0.095623% (0)
> > mix_areas1_mmx: 253894 0.093585% (0)
> > mix_areas2: 132321 0.048773% (365)
> > mix_areas3: 332411 0.122526% (0)
> >
> > The server based approach has an added cost of an extra context switch
> > every period (about 1500 cycles on my machine i.e.), but this is fully
> > amortized by such an huge difference.
> >
> > What's your opinion?
> 
> Interesting is that my Intel P3 CPU has slightly different times:
> 
> pnote:/home/perex/alsa/alsa-lib/test # ./code 2048 8 32768
> Scheduler set to Round Robin with priority 99...
> CPU clock: 847.292487Mhz (UP)
> 
> Summary (the best times):
> mix_areas_srv : 576382 0.366206%
> mix_areas0    : 556852 0.353798%
> mix_areas1    : 867989 0.551480%
> mix_areas1_mmx: 625144 0.397187%
> mix_areas2    : 903335 0.573937%
> 
> areas1/srv ratio     : 1.505927
> areas1_mmx/srv ratio : 1.084600


This is due to the cache poisoning effect, which I find quite surprising.
With a warm cache mix_areas_srv is 3 times faster than with a cold cache,
while the difference is smaller for the other alternatives.

I've modified code.c so that you can test this effect too.

However, I think that the realistic scenario is neither 0 KB nor 1024 KB
of cache poisoning.
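
To make the cold-cache case concrete, here is a minimal sketch of how a
benchmark could poison the cache before each timed run. It is not the code
from code.c: the name poison_cache, the caller-supplied junk buffer and the
64-byte line size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64  /* assumed cache line size; adjust for the CPU under test */

/* Hypothetical helper: write one byte per cache line across `bytes` bytes of
 * a scratch buffer, so that the buffers used by the mixing routine are likely
 * to be evicted before the timed run. Returns the number of lines touched. */
static size_t poison_cache(volatile unsigned char *junk, size_t bytes)
{
    size_t lines = 0;

    assert(junk != NULL || bytes == 0);
    for (size_t i = 0; i < bytes; i += CACHE_LINE) {
        junk[i] = (unsigned char)i;   /* the write pulls this line into cache */
        lines++;
    }
    return lines;
}
```

Calling it with 0 bytes leaves the cache warm, and calling it with a buffer
larger than the last-level cache approximates the fully cold case; sizes in
between would model the intermediate scenario.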

> I think that we can lose more in the client/server model. Also, note that
> we can use even futexes (if there's a hope that the possible context
> switch is acceptable) and then we can remove the cmpxchg trick and
> write-retry trick and use MMX for parallel saturation of two samples (this
> last can be used in the client/server model, too, indeed).

I really doubt that a futex would help, as it's very difficult to choose
the unit it protects. I also very much like the fact that the concurrent
processes are totally independent. With a futex, if one process exits
badly while holding it, you're screwed.
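
As an illustration of what "totally independent" means here, the following
is a simplified sketch of a lock-free add with a check/repeat step, written
with C11 atomics. It is not the alsa-lib implementation: the name
mix_lockfree and the buffer layout are assumptions, and the cmpxchg trick for
the first writer of a period is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Clip a 32-bit sum to the signed 16-bit sample range. */
static int16_t clip16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Hypothetical sketch: add one client's samples into a shared 32-bit sum
 * buffer, then publish the saturated value to the shared 16-bit output.
 * No lock is held, so a client that dies mid-way cannot block the others. */
static void mix_lockfree(_Atomic int16_t *dst, _Atomic int32_t *sum,
                         const int16_t *src, size_t n)
{
    assert(dst != NULL && sum != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        int32_t seen;

        atomic_fetch_add(&sum[i], src[i]);
        do {
            /* Saturate the sum we observed and write it out... */
            seen = atomic_load(&sum[i]);
            atomic_store(&dst[i], clip16(seen));
            /* ...and repeat if another client changed the sum meanwhile. */
        } while (atomic_load(&sum[i]) != seen);
    }
}
```

Every client saturates on every write, which is the repeated cost measured
above, but no client ever waits on another.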

What seems more interesting to me in the dmix approach is (as Tomasz
has pointed out) the exceptionally good latency, which is the other side
of the repeated saturation cost.

However, we will enjoy this benefit *only* if pcm_dmix is the last PCM in
the chain.
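
For comparison, a saturate-once mixer of the kind a server could run might
look like the sketch below. The function name and the array-of-streams
interface are assumptions for illustration only. It clips each sample exactly
once, but it can only do so after it has collected the data of every client
for the period, which is presumably where the latency trade-off comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: mix `nclients` streams of `n` 16-bit samples by
 * accumulating in 32 bits and saturating once per output sample. */
static void mix_saturate_once(int16_t *dst, const int16_t *const *clients,
                              size_t nclients, size_t n)
{
    assert(dst != NULL && clients != NULL);
    for (size_t i = 0; i < n; i++) {
        int32_t acc = 0;

        for (size_t c = 0; c < nclients; c++)
            acc += clients[c][i];

        /* Single saturation step, after all clients have been summed. */
        if (acc > INT16_MAX)
            acc = INT16_MAX;
        else if (acc < INT16_MIN)
            acc = INT16_MIN;
        dst[i] = (int16_t)acc;
    }
}
```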

-- 
Abramo Bagnara                       mailto:[EMAIL PROTECTED]

Opera Unica                          Phone: +39.546.656023
Via Emilia Interna, 140
48014 Castel Bolognese (RA) - Italy

