Juan, I would expect one thread to remain as the controller thread somehow, but I don't know the internals of OpenMP. To verify the hypothesis, you could simply increase the number of threads and check that you always get N-1 results.
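To make that check less tedious, here is a minimal sketch of a script that tallies the per-thread result lines pfmon prints and reports how many tasks have non-zero counts. It assumes each result line has the form "<count> <EVENT> <command> (<pid>,<tid>,<ppid>)" and that the middle number in the tuple identifies the thread; the function names are hypothetical, and the command path in the sample is shortened for readability. The counts are the ones from the 2-thread run quoted below.

```python
import re
from collections import defaultdict

# Assumed layout of a pfmon result line:
#   <count> <EVENT> <command> (<pid>,<tid>,<ppid>)
RESULT_RE = re.compile(
    r"^\s*(\d+)\s+([A-Z0-9_]+)\s+\S+\s+\((\d+),(\d+),(\d+)\)\s*$"
)


def counts_per_thread(lines):
    """Return {tid: {event: count}} for every result line that matches."""
    per_thread = defaultdict(dict)
    for line in lines:
        # Drop mail quote markers ("> ") if the output was pasted from an email.
        match = RESULT_RE.match(line.lstrip("> "))
        if match is None:
            continue  # not a result line (verbose log output, blank line, ...)
        count, event, _pid, tid, _ppid = match.groups()
        per_thread[int(tid)][event] = int(count)
    return dict(per_thread)


def active_threads(per_thread):
    """Return the sorted thread ids that have at least one non-zero count."""
    return sorted(
        tid for tid, events in per_thread.items()
        if any(value > 0 for value in events.values())
    )


# Counts from the 2-thread run; the command path is shortened here.
SAMPLE = [
    "72193836 IA64_INST_RETIRED ./msxm_ijk_MP_COL (22051,22119,22050)",
    "32430200 CPU_OP_CYCLES_ALL ./msxm_ijk_MP_COL (22051,22119,22050)",
    "0 IA64_INST_RETIRED ./msxm_ijk_MP_COL (22051,22118,22050)",
    "0 CPU_OP_CYCLES_ALL ./msxm_ijk_MP_COL (22051,22118,22050)",
    "71767775 IA64_INST_RETIRED ./msxm_ijk_MP_COL (22051,22051,22050)",
    "27399361 CPU_OP_CYCLES_ALL ./msxm_ijk_MP_COL (22051,22051,22050)",
]

if __name__ == "__main__":
    per_thread = counts_per_thread(SAMPLE)
    active = active_threads(per_thread)
    print(f"{len(per_thread)} tasks reported, {len(active)} with non-zero counts")
```

If the hypothesis holds, rerunning with more threads should always leave exactly one task with all-zero counts.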
On Tue, Sep 2, 2008 at 4:55 PM, Juan Ángel Lorenzo <[EMAIL PROTECTED]> wrote:
> Hi again,
>
> Thanks for fixing the problem so quickly. pfmon now works without
> problems. However, coming back to the issue I originally wrote about, I
> still cannot properly monitor just one function in my code. I repeated
> the one-thread test with the new pfmon version and got 0's again:
>
> ./pfmon -uk --verb --follow-pthread --trigger-code-start=producto
> --trigger-code-stop=producto --trigger-code-follow
> -eIA64_INST_RETIRED,CPU_OP_CYCLES_ALL ./msxm_ijk_MP_COL
> MATRICES/bcsstm36.rcs 25
>
> loaded 18828 text symbols 13203 data symbols from /proc/kallsyms
> 1 event set(s) defined
> long sampling periods(val/mask/seed): 0/0x0/00/0x0/0
> short sampling periods(val/mask/seed): 0/0x0/00/0x0/0
> [PMC4(pmc4)=0x2000809 m=0 e=0 s=0 i=0 thres=0 all=0 es=0x08 plm=9 umask=0x0 pm=0 ism=0x2 oi=0] IA64_INST_RETIRED
> [PMD4(pmd4)]
> [PMC5(pmc5)=0x2001209 m=0 e=0 s=0 i=0 thres=0 all=0 es=0x12 plm=9 umask=0x0 pm=0 ism=0x2 oi=0] CPU_OP_CYCLES_ALL
> [PMD5(pmd5)]
> pmd setup for event set0:
> [pmd4 set=0 ival=0x0 long_rate=0x0 short_rate=0x0 mask=0x0 seed=0 randomize=n]
> [pmd5 set=0 ival=0x0 long_rate=0x0 short_rate=0x0 mask=0x0 seed=0 randomize=n]
> exec-pattern=*
> [19467] started task: ./msxm_ijk_MP_COL MATRICES/bcsstm36.rcs 25
> [19467] load pid map version 1, flushing samples
> follow_exec=n follow_vfork=n follow_fork=n follow_pthread=y
> [19467] using 64-bit ABI
> entry point @ 0x4000000000000f40 for /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
> [19467] deactivated the deactivation of symbol resolution because of multiple exec() ;-)
> [19467] load pid map version 2, flushing samples
> [19467] monitoring /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL MATRICES/bcsstm36.rcs 25...
> [19467] results are on terminal
> [19467] monitoring not activated
> [19467] installed entry code breakpoint (db0) at 0x4000000000000f40
> measurements started at Tue Sep 2 16:22:39 2008
>
> [19467] reached entry code breakpoint @0x4000000000000f40
> [19467] loaded 181 text symbols and 81 data symbols, ELF file /lib/ld-2.4.so
> [19467] loaded 2361 text symbols and 4642 data symbols, ELF file /opt/cesga/intel/intel9.0/cc/9.1.052/lib/libimf.so.6
> [19467] loaded 1069 text symbols and 438 data symbols, ELF file /lib/libm-2.4.so
> [19467] loaded 669 text symbols and 161 data symbols, ELF file /opt/cesga/intel/intel9.0/cc/9.1.052/lib/libguide.so
> [19467] loaded 5 text symbols and 8 data symbols, ELF file /opt/cesga/intel/intel9.0/cc/9.1.052/lib/libipr.so.6
> [19467] loaded 99 text symbols and 0 data symbols, ELF file /lib/libgcc_s.so.1
> [19467] loaded 282 text symbols and 64 data symbols, ELF file /lib/libpthread-2.4.so
> [19467] loaded 2640 text symbols and 647 data symbols, ELF file /lib/libc-2.4.so
> [19467] loaded 32 text symbols and 3 data symbols, ELF file /lib/libdl-2.4.so
> [19467] loaded 58 text symbols and 1 data symbols, ELF file /lib/libunwind.so.7.0.0
> [19467] loaded 30 text symbols and 33 data symbols, ELF file /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
> using hardware breakpoints
> [19467] dlopen hook on /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL MATRICES/bcsstm36.rcs 25
> [19467] dlopen hook found @0x200000000001dbe0
> [19467] clearing code breakpoint(db0) @0x4000000000000f40
> [19467] installed start code breakpoint (db1) at 0x4000000000002200
> [19467] installed dlopen code breakpoint (db3) at 0x200000000001dbe0
> [19467] resume after code breakpoint
> [19467] cloned [19534]
> [19467] load pid map version 1, flushing samples
> [19534] monitoring /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL MATRICES/bcsstm36.rcs 25...
> [19534] results are on terminal
> [19534] monitoring not activated
> [19534] installed start code breakpoint (db1) at 0x4000000000002200
> Tiempo de ejecucion: 0.028932
> [19534] task exited
> 0 IA64_INST_RETIRED /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL (19467,19534,19466)
> 0 CPU_OP_CYCLES_ALL /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL (19467,19534,19466)
> [19534] detached
> [19467] task exited
> 0 IA64_INST_RETIRED /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL (19467,19467,19466)
> 0 CPU_OP_CYCLES_ALL /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL (19467,19467,19466)
> [19467] detached
> Task chain registered for processing:
> task 19467 ()
> task 19467 (/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL MATRICES/bcsstm36.rcs 25)
> created tasks : 2
> maximum tasks : 2
> maximum active tasks : 2
> measurements completed at Tue Sep 2 16:22:41 2008
>
> Maybe I misunderstood the symbols the compiler (icc) created. If I run
> the code with 2 threads and inspect the object "L_producto_21__par_loop1",
> I also get 0's in all 3 threads. However, if I inspect the object
> "L_main_21__par_loop0", I get data in 2 of the 3 threads:
>
> 72193836 IA64_INST_RETIRED /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL (22051,22119,22050)
>
> 32430200 CPU_OP_CYCLES_ALL /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL (22051,22119,22050)
>
> 0 IA64_INST_RETIRED /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL (22051,22118,22050)
>
> 0 CPU_OP_CYCLES_ALL /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL (22051,22118,22050)
>
> 71767775 IA64_INST_RETIRED /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL (22051,22051,22050)
>
> 27399361 CPU_OP_CYCLES_ALL /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL (22051,22051,22050)
>
> So, taking into account that I only requested 2 threads, maybe the
> results are correct, and I'm getting the information about the 2
> threads in a different symbol (L_main_21__par_loop0) because the
> compiler placed the parallel OpenMP code there instead of inside the
> "producto" function. What I don't understand is why pfmon doesn't follow
> execution from "producto" into "L_main_21__par_loop0".
>
> Regards,
> Juan Ángel
>
> On Tue, 02-09-2008 at 13:52 +0200, stephane eranian wrote:
>> Juan,
>>
>> Ok, I found two issues with libpfm and pfmon on Montecito hardware.
>> One of them was causing the problem you saw.
>>
>> I have updated the CVS tree. You need to pull the latest releases from the
>> CVS tree. Follow the instructions on the following page:
>>
>> http://sourceforge.net/cvs/?group_id=144822
>>
>> The modules you need are: libpfm and pfmon
>>
>> Then recompile and try again.
>>
>> Thanks.
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel