Thanks for your answers.
>Apparently the soundfonts I used were not polyphonic enough
>Using FluidR3_GM.sf2 the cpu load looks better, but I'm still quite far from the
>"perfect" scalability that your profiling interface gives you JJC.
Indeed, your machine is fast, and in this case playing a MIDI file to
simulate a note (voice) generator is not efficient. This is why the
profiling interface has its own note generator (although it is still limited
to 256 x 16 notes!).
The most important thing is that fluidsynth is able to play a constant number
of voices during a measurement. This gives consecutive measurements the same
CPU load result. This makes any future performance measurements easily much more
Note: During my experiments, I initially noticed that the results of
consecutive measurements were not constant. I quickly realized that a
background process was running. The job of this process was to save energy,
and it did so by stealing CPU cycles! Of course, no performance measurement is
possible with this kind of job or service running silently behind the scenes.
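To make the "same CPU load on consecutive measurements" check concrete, here is a minimal sketch in C. It assumes you have already collected one CPU load value (in percent) per run; the names load_mean(), load_relative_spread() and load_is_repeatable() are hypothetical helpers of mine, not part of fluidsynth or of the profiling interface.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n CPU-load samples (one sample per measurement run). */
static double load_mean(const double *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return sum / (double)n;
}

/* Relative spread of the samples: (max - min) / mean.
   Returns 0 when the mean is not positive, to avoid dividing by zero. */
static double load_relative_spread(const double *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    double lo = samples[0];
    double hi = samples[0];
    for (size_t i = 1; i < n; i++) {
        if (samples[i] < lo) lo = samples[i];
        if (samples[i] > hi) hi = samples[i];
    }
    double mean = load_mean(samples, n);
    return mean > 0.0 ? (hi - lo) / mean : 0.0;
}

/* Returns 1 when consecutive runs agree within the given tolerance
   (e.g. 0.05 for 5 %), 0 otherwise. The tolerance is an arbitrary choice. */
static int load_is_repeatable(const double *samples, size_t n, double tolerance)
{
    return load_relative_spread(samples, n) <= tolerance;
}
```

If load_is_repeatable() returns 0 for runs that play the same constant number of voices, something else on the machine (such as a power-saving service) is probably disturbing the measurement.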
>Additionally I want to revise the current implementation, like using a
>parallel logarithmic buffer reduction to mix audio between threads or
>rethinking data layout and memory accesses in general, hoping this makes it
>more efficient.
Interesting. Looking at the code (in the past), I noticed that a lot of
things could perhaps be improved around the following subjects:
1) avoiding shared access to the "active list of voices" between "primary
tasks" and the pool of "extra tasks".
- breaking the single list into a local list for each task.
- load balancing (same number of voices in each local list).
2) optimizing the mixing of buffers between the "primary" task and the "extra
tasks" (to avoid the current, possibly dominant, synchronization overhead).
3) optimizing fluid_cond_signal() and fluid_cond_wait() wherever the
associated mutex is pointless.
Of course all this is easier to say than to do :).
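To illustrate points 1) and 2), here is a rough single-threaded sketch in C of the data layout I have in mind. It is only an assumption of how it could look, not the current fluidsynth code: sketch_voice_t, sketch_task_t and the three functions are hypothetical names, and the "rendering" is a placeholder. Each task gets its own voice list (filled round-robin, so list sizes differ by at most one) and its own buffer, and the buffers are then summed pairwise in about log2(number of tasks) rounds.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS  8
#define MAX_VOICES 256
#define BUF_LEN    64

/* Hypothetical stand-in for a voice; not fluidsynth's real voice type. */
typedef struct { float gain; } sketch_voice_t;

/* One private voice list and one private mix buffer per task. */
typedef struct {
    const sketch_voice_t *voices[MAX_VOICES];
    size_t count;
    float buf[BUF_LEN];
} sketch_task_t;

/* Deal the voices round-robin so the local lists are balanced. */
static void distribute_voices(const sketch_voice_t *all, size_t n_voices,
                              sketch_task_t *tasks, size_t n_tasks)
{
    assert(n_tasks > 0 && n_tasks <= MAX_TASKS && n_voices <= MAX_VOICES);
    for (size_t t = 0; t < n_tasks; t++)
        tasks[t].count = 0;
    for (size_t v = 0; v < n_voices; v++) {
        sketch_task_t *task = &tasks[v % n_tasks];
        task->voices[task->count++] = &all[v];
    }
}

/* A task touches only its own list and buffer, so no lock is needed here. */
static void render_task(sketch_task_t *task)
{
    for (size_t i = 0; i < BUF_LEN; i++)
        task->buf[i] = 0.0f;
    for (size_t v = 0; v < task->count; v++)
        for (size_t i = 0; i < BUF_LEN; i++)
            task->buf[i] += task->voices[v]->gain; /* placeholder for real DSP */
}

/* Pairwise (tree) reduction: in each round, buffer t absorbs buffer
   t + stride. The pairs of one round are independent of each other, so a
   threaded version could process them in parallel and would only need to
   synchronize between rounds. The final mix ends up in tasks[0].buf. */
static void reduce_buffers(sketch_task_t *tasks, size_t n_tasks)
{
    for (size_t stride = 1; stride < n_tasks; stride *= 2)
        for (size_t t = 0; t + stride < n_tasks; t += 2 * stride)
            for (size_t i = 0; i < BUF_LEN; i++)
                tasks[t].buf[i] += tasks[t + stride].buf[i];
}
```

Typical use would be to call distribute_voices() once per block, render_task() for each task (one per thread in a real version), then reduce_buffers(). Whether this actually beats the current mixing scheme would have to be measured.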
> Message of 14/04/18 17:58
> From: "Tom M."
> To: firstname.lastname@example.org
> Cc:
> Subject: Re: [fluid-dev] Parallelize rendering using openMP
> Thanks for the feedback so far.
> > Please are you using a very fast machine ? did you ask to fluidsynth to
> > play sufficient number of notes ?
> I'm on an Intel Core i5-3570K @ 3.40GHz. I tested several midi files that have
> instruments playing on all 16 channels. Apparently the soundfonts I used were
> not polyphonic enough, you're right. Using FluidR3_GM.sf2 the cpu load looks
> better, but I'm still quite far from the "perfect" scalability that your
> profiling interface gives you JJC.
> > How did you come to the conclusion that the synchronization overhead
> > dominates?
> Admittedly this might be a wrong/premature conclusion based on my
> observations + looking at the source code. I took a look at the callgraph
> generated with valgrind --tool=callgrind ./fluidsynth. Synchronization
> functions like g_mutex_lock() or g_cond_wait() are called quite often by
> fluid_mixer_thread_func(). Although it also reports to be not that expensive.
> Still I think it's worth evaluating what job openMP and other refactorings
> can do here. David Henningsson once told me that the parallel renderer was
> more like a (failed) experiment. So please see this current work as my little
> > I do wonder though why OpenMP can do a better job than the current code.
> openMP provides different scheduling strategies to process for loops. Also
> this restriction VOICES_PER_THREAD (==8) to avoid thread overhead seems quite
> magic to me (it probably worked well when David tested it, still, why is 8
> the right number?). Overall I'm not sure whether openMP alone can do a better
> job. It definitely reduces complexity of the code. Additionally I want to
> revise the current implementation, like using a parallel logarithmic buffer
> reduction to mix audio between threads or rethinking data layout and memory
> accesses in general, hoping this makes it more efficient.
> fluid-dev mailing list