> On 1 Dec 2017, at 20:15, Albert Graef <aggr...@gmail.com> wrote:
> 
> I'd tend to agree with what Stéphane said, but there are two gotchas with the 
> LLVM-based embedded Faust compiler that I see.
> 
> First, last time I checked, clang/gcc -O3 still generated faster code in some 
> cases than the LLVM backend alone. In the past I've noticed this with some 
> rather simple Faust examples such as my organ.dsp, but I realize that many of 
> the internals in both the Faust compiler and the Faust standard library have 
> changed since then, so this might not even be true any more.

This can be tested accurately with the bench tools described here:
http://faust.grame.fr/news/2017/04/26/optimizing-compilation-parameters.html

Use faustbench foo.dsp to test the C++ path and faustbench-llvm foo.dsp to 
test the LLVM path.

faustbench organ.dsp
Selected compiler is g++ with CXXFLAGS = -Ofast -march=native -fbracket-depth=512
DSP bench of ./faust.bTU/organ/organ compiled in C++ running with FAUSTFLOAT = float
duration 0.00914
-scal : 907.914 (DSP CPU % : 0.0384849)
duration 0.009423
-scal -exp10 : 907.755 (DSP CPU % : 0.0384788)
duration 0.024908
-vec -lv 0 -vs 4 : 325.032 (DSP CPU % : 0.104947)
duration 0.015622
-vec -lv 0 -vs 8 : 518.956 (DSP CPU % : 0.0656819)
duration 0.016802
-vec -lv 0 -vs 16 : 512.683 (DSP CPU % : 0.066705)
duration 0.016715
-vec -lv 0 -vs 32 : 496.94 (DSP CPU % : 0.0686769)
duration 0.018433
-vec -lv 0 -vs 64 : 444.006 (DSP CPU % : 0.0772959)
duration 0.019201
-vec -lv 0 -vs 128 : 424.956 (DSP CPU % : 0.0805368)
duration 0.019548
-vec -lv 0 -vs 256 : 419.543 (DSP CPU % : 0.0821961)
duration 0.02215
-vec -lv 0 -vs 512 : 414.592 (DSP CPU % : 0.084326)
duration 0.027425
-vec -lv 0 -vs 4 -g : 318.365 (DSP CPU % : 0.108883)
duration 0.016379
-vec -lv 0 -vs 8 -g : 520.232 (DSP CPU % : 0.0660098)
duration 0.016741
-vec -lv 0 -vs 16 -g : 513.927 (DSP CPU % : 0.0678274)
duration 0.016844
-vec -lv 0 -vs 32 -g : 501.489 (DSP CPU % : 0.0685877)
duration 0.01844
-vec -lv 0 -vs 64 -g : 442.581 (DSP CPU % : 0.0772858)
duration 0.019287
-vec -lv 0 -vs 128 -g : 425.529 (DSP CPU % : 0.0806568)
duration 0.019526
-vec -lv 0 -vs 256 -g : 418.026 (DSP CPU % : 0.0828758)
duration 0.019909
-vec -lv 0 -vs 512 -g : 415.567 (DSP CPU % : 0.0837813)
duration 0.027128
-vec -lv 1 -vs 4 : 320.13 (DSP CPU % : 0.106808)
duration 0.019699
-vec -lv 1 -vs 8 : 404.64 (DSP CPU % : 0.0849282)
duration 0.018441
-vec -lv 1 -vs 16 : 433.368 (DSP CPU % : 0.0794095)
duration 0.016663
-vec -lv 1 -vs 32 : 479.978 (DSP CPU % : 0.0710757)
duration 0.017045
-vec -lv 1 -vs 64 : 476.172 (DSP CPU % : 0.072545)
duration 0.01778
-vec -lv 1 -vs 128 : 465.504 (DSP CPU % : 0.0736911)
duration 0.017254
-vec -lv 1 -vs 256 : 463.708 (DSP CPU % : 0.0746453)
duration 0.017769
-vec -lv 1 -vs 512 : 461.023 (DSP CPU % : 0.0756531)
duration 0.025339
-vec -lv 1 -vs 4 -g : 318.678 (DSP CPU % : 0.107433)
duration 0.020131
-vec -lv 1 -vs 8 -g : 397.835 (DSP CPU % : 0.0862622)
duration 0.018462
-vec -lv 1 -vs 16 -g : 432.537 (DSP CPU % : 0.0801232)
duration 0.01801
-vec -lv 1 -vs 32 -g : 479.149 (DSP CPU % : 0.0719457)
duration 0.017515
-vec -lv 1 -vs 64 -g : 473.516 (DSP CPU % : 0.0736981)
duration 0.0177
-vec -lv 1 -vs 128 -g : 463.753 (DSP CPU % : 0.0748671)
duration 0.01743

faustbench-llvm  organ.dsp
Libfaust version : 2.5.9
Compiled with additional options : 
Estimate timing parameters
 -scal -I /usr/local/share/faust : duration 0.010527
Discover best parameters option
 -scal -I /usr/local/share/faust : 777.881 (DSP CPU % : 0.0446929)
 -scal -exp10 -I /usr/local/share/faust : 777.471 (DSP CPU % : 0.044369)
 -vec -lv 0 -vs 4 -I /usr/local/share/faust : 617.697 (DSP CPU % : 0.0556883)
 -vec -lv 0 -vs 8 -I /usr/local/share/faust : 576.244 (DSP CPU % : 0.059282)
 -vec -lv 0 -vs 16 -I /usr/local/share/faust : 518.256 (DSP CPU % : 0.0657592)
 -vec -lv 0 -vs 32 -I /usr/local/share/faust : 497.535 (DSP CPU % : 0.0686641)
 -vec -lv 0 -vs 64 -I /usr/local/share/faust : 487.839 (DSP CPU % : 0.0699512)
 -vec -lv 0 -vs 128 -I /usr/local/share/faust : 465.157 (DSP CPU % : 0.0742011)
 -vec -lv 0 -vs 256 -I /usr/local/share/faust : 457.488 (DSP CPU % : 0.0761608)
 -vec -lv 0 -vs 512 -I /usr/local/share/faust : 451.775 (DSP CPU % : 0.0765935)
 -vec -lv 0 -vs 1024 -I /usr/local/share/faust : 251.154 (DSP CPU % : 0.137605)
 -vec -fun -lv 0 -vs 4 -I /usr/local/share/faust : 761.114 (DSP CPU % : 0.0450477)
 -vec -fun -lv 0 -vs 8 -I /usr/local/share/faust : 650.206 (DSP CPU % : 0.0526827)
 -vec -fun -lv 0 -vs 16 -I /usr/local/share/faust : 613.235 (DSP CPU % : 0.0559572)
 -vec -fun -lv 0 -vs 32 -I /usr/local/share/faust : 558.533 (DSP CPU % : 0.0612515)
 -vec -fun -lv 0 -vs 64 -I /usr/local/share/faust : 573.086 (DSP CPU % : 0.0596367)
 -vec -fun -lv 0 -vs 128 -I /usr/local/share/faust : 542.66 (DSP CPU % : 0.06368)
 -vec -fun -lv 0 -vs 256 -I /usr/local/share/faust : 534.568 (DSP CPU % : 0.0654364)
 -vec -fun -lv 0 -vs 512 -I /usr/local/share/faust : 522.227 (DSP CPU % : 0.0666158)
 -vec -fun -lv 0 -vs 1024 -I /usr/local/share/faust : 253.841 (DSP CPU % : 0.136031)
 -vec -lv 0 -vs 4 -g -I /usr/local/share/faust : 569.817 (DSP CPU % : 0.060622)
 -vec -lv 0 -vs 8 -g -I /usr/local/share/faust : 550.635 (DSP CPU % : 0.0623821)
 -vec -lv 0 -vs 16 -g -I /usr/local/share/faust : 518.219 (DSP CPU % : 0.0658174)
 -vec -lv 0 -vs 32 -g -I /usr/local/share/faust : 497.175 (DSP CPU % : 0.0688576)
 -vec -lv 0 -vs 64 -g -I /usr/local/share/faust : 490.308 (DSP CPU % : 0.0701619)
 -vec -lv 0 -vs 128 -g -I /usr/local/share/faust : 464.595 (DSP CPU % : 0.0742094)

My recent tests on OS X with clang/LLVM 4.0 and 5.0 show much less 
difference than previously: there are winners and losers on either side, 
depending on the DSP and the Faust compilation parameters. Some of those 
benchmarks are here: 
http://faust.grame.fr/news/2017/09/15/backend-benchmarks.html

> 
> Second, last time I checked, the Faust compiler's internal memory 
> management didn't do any garbage collection on Faust's internal data 
> structures such as the deBruijn term trees. Is that still true? Because if it 
> is, it might become a problem in a live coding environment if you do a lot of 
> compilations and the environment keeps running for a *really* long time (say, 
> like in a DAW or sequencer application that might be kept open for hours or 
> even days if you're working on a session), since the Faust compiler may start 
> eating away more and more of your copious main memory. ;-)

libfaust is « obviously » supposed to handle memory correctly. We use a global 
memory allocator for all dynamic data structures, which deallocates all of them 
once the DSP factory has been produced. I fixed some remaining memory leaks a 
few weeks/months ago. It should be OK now, and if not, then *please* send bug 
reports ((-;.
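
To illustrate the idea of "allocate everything through one global allocator, 
then free it all once the factory exists", here is a minimal sketch. It is not 
the libfaust code: Tracked, Arena, Node and compile_once are hypothetical names 
made up for this example, and the "compilation" is only a stand-in that builds 
a few temporary nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical illustration, NOT libfaust code: every compiler-internal
// object derives from Tracked and is owned by a single Arena.
struct Tracked {
    virtual ~Tracked() = default;
};

class Arena {
public:
    // Allocate an object whose lifetime is tied to the arena.
    template <typename T, typename... Args>
    T* make(Args&&... args) {
        auto obj = std::make_unique<T>(std::forward<Args>(args)...);
        T* raw = obj.get();
        objects_.push_back(std::move(obj));
        return raw;
    }

    // Free every tracked object in one go.
    void release_all() { objects_.clear(); }

    // Number of objects currently owned by the arena.
    std::size_t live() const { return objects_.size(); }

private:
    std::vector<std::unique_ptr<Tracked>> objects_;
};

// Stand-in for an internal tree node (hypothetical).
struct Node : Tracked {
    int tag;
    Node* child;
    Node(int t, Node* c) : tag(t), child(c) {}
};

// Hypothetical "compilation": builds temporary nodes, extracts a plain
// result, then releases all internal data before returning.
int compile_once(Arena& arena, int depth) {
    assert(depth >= 0);
    Node* n = nullptr;
    for (int i = 0; i < depth; ++i) n = arena.make<Node>(i, n);
    int result = n ? n->tag : -1;
    arena.release_all();
    return result;
}
```

With this pattern, calling compile_once() any number of times on the same 
Arena leaves arena.live() at 0 afterwards, which is the property a 
long-running host application cares about.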

> 
> So there may be good reasons to go the more traditional route of just doing 
> the compilation in a separate process and then simply reloading the resulting 
> shared libraries. My own pd-faust still does it that way, and it works great, 
> also on macOS and Windows. However, you *have* to make sure that you properly 
> unload the dlls using dlclose() before reloading them using dlopen(). At 
> least it will do the trick with the portable libtool equivalents, lt_dlopen() 
> and lt_dlclose(), which is what pd-faust actually uses.
> 
> To do this properly, you may have to do your own reference counting on the 
> loaded libraries, and maybe also keep track of the timestamps of the shared 
> library modules so that you only reload them when they actually changed. This 
> is all done in pd-faust (or rather in pure-faust, the Pure module which 
> pd-faust uses to handle compiled Faust modules in Pure). If you're interested 
> then you may want to look at and grab some of the code around `void 
> module_t::reload()` in the faust.cc module here: 
> https://github.com/agraef/pure-lang/blob/master/pure-faust/faust.cc
> 
> HTH,
> Albert
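
For readers who want to try the shared-library route described above, a 
minimal sketch of the "dlclose() before dlopen(), and only when the file 
changed" logic could look as follows. This is not the pure-faust code: the 
Module class and its member names are hypothetical, it uses plain POSIX 
dlopen()/dlclose()/stat() rather than the libtool wrappers, and it leaves out 
the reference counting Albert mentions.

```cpp
#include <cassert>
#include <ctime>
#include <dlfcn.h>
#include <string>
#include <sys/stat.h>
#include <utility>

// Hypothetical helper, NOT taken from pure-faust: wraps one shared library
// and reloads it only when the file on disk has a new modification time.
class Module {
public:
    explicit Module(std::string path) : path_(std::move(path)) {}
    ~Module() { unload(); }
    Module(const Module&) = delete;
    Module& operator=(const Module&) = delete;

    // Returns true if the library is loaded and up to date after the call.
    bool reload() {
        struct stat st;
        if (stat(path_.c_str(), &st) != 0) return false;    // file missing
        if (handle_ && st.st_mtime == mtime_) return true;  // unchanged
        unload();                                           // dlclose() first
        handle_ = dlopen(path_.c_str(), RTLD_NOW | RTLD_LOCAL);
        if (!handle_) return false;
        mtime_ = st.st_mtime;
        return true;
    }

    // Looks up a symbol in the currently loaded library, or returns nullptr.
    void* symbol(const char* name) const {
        assert(name != nullptr);
        return handle_ ? dlsym(handle_, name) : nullptr;
    }

    bool loaded() const { return handle_ != nullptr; }

private:
    void unload() {
        if (handle_) {
            dlclose(handle_);
            handle_ = nullptr;
        }
    }

    std::string path_;
    void* handle_ = nullptr;
    time_t mtime_ = 0;
};
```

A host would call reload() after each external compilation and fetch its entry 
points again with symbol(), since pointers obtained before the dlclose() are 
no longer valid. On older glibc versions this needs to be linked with -ldl.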

Libfaust also does factory caching, so that the same DSP (same SHA key computed 
from the DSP source code) will not be recompiled if the application tries it 
again in the same session. Note that the compiled code can be saved/restored as 
LLVM bitcode or even as machine code (see the 
writeDSPFactoryToMachine/readDSPFactoryFromMachine kind of API here: 
http://faust.grame.fr/news/2016/01/12/using-dynamic-compiler.html). We use 
that in faustgen~ for instance, saving the compiled factory as machine code in 
the Max/MSP patch and restoring it at load time. This works like a charm and 
speeds up the load time of already compiled patches a lot...
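
The caching idea itself (key the factory on a digest of the DSP source, 
compile only on a miss) can be sketched in a few lines. This is only an 
illustration, not the libfaust implementation: Factory and FactoryCache are 
hypothetical names, the "compilation" just stores the source, and std::hash is 
swapped in for the SHA key to keep the example self-contained.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a compiled DSP factory (NOT the libfaust type).
struct Factory {
    std::string source;
};

class FactoryCache {
public:
    // Returns the cached factory for this source, compiling only on a miss.
    std::shared_ptr<Factory> get(const std::string& source) {
        const std::string key = make_key(source);
        auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;
        std::shared_ptr<Factory> factory = compile(source);
        assert(factory);
        cache_.emplace(key, factory);
        return factory;
    }

    // Number of real compilations performed so far.
    int compilations() const { return compilations_; }

private:
    // Assumption: std::hash replaces the SHA key here. It can collide, so a
    // real cache would use a cryptographic digest instead.
    static std::string make_key(const std::string& source) {
        return std::to_string(std::hash<std::string>{}(source));
    }

    // Placeholder for the expensive compilation step.
    std::shared_ptr<Factory> compile(const std::string& source) {
        ++compilations_;
        return std::make_shared<Factory>(Factory{source});
    }

    std::unordered_map<std::string, std::shared_ptr<Factory>> cache_;
    int compilations_ = 0;
};
```

Asking the cache twice for the same source string returns the same shared 
factory and increments compilations() only once. Persisting the factory across 
sessions is a separate step, which is what the machine-code save/restore 
functions mentioned above are for.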

So I strongly recommend looking at the libfaust/LLVM path ((-; but you are free 
to test any solution in this open-source world…

Stéphane 

