Hi again,

Thanks for fixing the problem so fast. pfmon now runs without problems.
However, coming back to the issue I originally wrote about, I still
cannot properly monitor a single function in my code. I repeated the
one-thread test with the new pfmon version and got 0's again:

./pfmon -uk --verb --follow-pthread --trigger-code-start=producto \
    --trigger-code-stop=producto --trigger-code-follow \
    -eIA64_INST_RETIRED,CPU_OP_CYCLES_ALL ./msxm_ijk_MP_COL \
    MATRICES/bcsstm36.rcs 25


loaded 18828 text symbols 13203 data symbols from /proc/kallsyms
1 event set(s) defined
long  sampling periods(val/mask/seed): 0/0x0/00/0x0/0
short sampling periods(val/mask/seed): 0/0x0/00/0x0/0
[PMC4(pmc4)=0x2000809 m=0 e=0 s=0 i=0 thres=0 all=0 es=0x08 plm=9
umask=0x0 pm=0 ism=0x2 oi=0] IA64_INST_RETIRED
[PMD4(pmd4)]
[PMC5(pmc5)=0x2001209 m=0 e=0 s=0 i=0 thres=0 all=0 es=0x12 plm=9
umask=0x0 pm=0 ism=0x2 oi=0] CPU_OP_CYCLES_ALL
[PMD5(pmd5)]
pmd setup for event set0:
[pmd4 set=0 ival=0x0 long_rate=0x0 short_rate=0x0 mask=0x0 seed=0
randomize=n]
[pmd5 set=0 ival=0x0 long_rate=0x0 short_rate=0x0 mask=0x0 seed=0
randomize=n]
exec-pattern=*
[19467] started task: ./msxm_ijk_MP_COL MATRICES/bcsstm36.rcs 25 
[19467] load pid map version 1, flushing samples
follow_exec=n follow_vfork=n follow_fork=n follow_pthread=y
[19467] using 64-bit ABI
entry point @ 0x4000000000000f40
for 
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
[19467] deactivated the deactivation of symbol resolution because of
multiple exec() ;-)
[19467] load pid map version 2, flushing samples
[19467]
monitoring 
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
 MATRICES/bcsstm36.rcs 25...
[19467] results are on terminal
[19467] monitoring not activated
[19467] installed entry code breakpoint (db0) at 0x4000000000000f40
measurements started at Tue Sep  2 16:22:39 2008

[19467] reached entry code breakpoint @0x4000000000000f40
[19467] loaded 181 text symbols and 81 data symbols, ELF
file /lib/ld-2.4.so
[19467] loaded 2361 text symbols and 4642 data symbols, ELF
file /opt/cesga/intel/intel9.0/cc/9.1.052/lib/libimf.so.6
[19467] loaded 1069 text symbols and 438 data symbols, ELF
file /lib/libm-2.4.so
[19467] loaded 669 text symbols and 161 data symbols, ELF
file /opt/cesga/intel/intel9.0/cc/9.1.052/lib/libguide.so
[19467] loaded 5 text symbols and 8 data symbols, ELF
file /opt/cesga/intel/intel9.0/cc/9.1.052/lib/libipr.so.6
[19467] loaded 99 text symbols and 0 data symbols, ELF
file /lib/libgcc_s.so.1
[19467] loaded 282 text symbols and 64 data symbols, ELF
file /lib/libpthread-2.4.so
[19467] loaded 2640 text symbols and 647 data symbols, ELF
file /lib/libc-2.4.so
[19467] loaded 32 text symbols and 3 data symbols, ELF
file /lib/libdl-2.4.so
[19467] loaded 58 text symbols and 1 data symbols, ELF
file /lib/libunwind.so.7.0.0
[19467] loaded 30 text symbols and 33 data symbols, ELF
file 
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
using hardware breakpoints
[19467] dlopen hook
on 
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
 MATRICES/bcsstm36.rcs 25
[19467] dlopen hook found @0x200000000001dbe0
[19467] clearing code breakpoint(db0) @0x4000000000000f40
[19467] installed start code breakpoint (db1) at 0x4000000000002200
[19467] installed dlopen code breakpoint (db3) at 0x200000000001dbe0
[19467] resume after code breakpoint
[19467] cloned [19534]
[19467] load pid map version 1, flushing samples
[19534]
monitoring 
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
 MATRICES/bcsstm36.rcs 25...
[19534] results are on terminal
[19534] monitoring not activated
[19534] installed start code breakpoint (db1) at 0x4000000000002200
 Tiempo de ejecucion: 0.028932
[19534] task exited
                         0
IA64_INST_RETIRED 
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
 (19467,19534,19466)
                         0
CPU_OP_CYCLES_ALL 
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
 (19467,19534,19466)
[19534] detached
[19467] task exited
                         0
IA64_INST_RETIRED 
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
 (19467,19467,19466)
                         0
CPU_OP_CYCLES_ALL 
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
 (19467,19467,19466)
[19467] detached
Task chain registered for processing:
        task 19467 ()
        task 19467
(/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
 MATRICES/bcsstm36.rcs 25)
created tasks        : 2
maximum tasks        : 2
maximum active tasks : 2
measurements completed at Tue Sep  2 16:22:41 2008
 

Maybe I misunderstood the symbols that the compiler (icc) created. If I
run the code with 2 threads and trigger on the symbol
"L_producto_21__par_loop1", I also get 0's in all 3 threads. However, if
I trigger on the symbol "L_main_21__par_loop0", I get data in 2 of the 3
threads:

72193836 IA64_INST_RETIRED
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
(22051,22119,22050)

32430200 CPU_OP_CYCLES_ALL
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
(22051,22119,22050)

0 IA64_INST_RETIRED
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
(22051,22118,22050)

0 CPU_OP_CYCLES_ALL
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
(22051,22118,22050)

71767775 IA64_INST_RETIRED
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
(22051,22051,22050)

27399361 CPU_OP_CYCLES_ALL
/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
(22051,22051,22050)

So, given that I only requested 2 threads, maybe the results are
correct, and I am getting the counts for the 2 threads under a different
symbol (L_main_21__par_loop0) because the compiler placed the parallel
OpenMP code there instead of inside the "producto" function. What I
don't understand is why pfmon doesn't follow execution from "producto"
into "L_main_21__par_loop0".
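
To make it easier to compare runs with different trigger symbols, here is a
minimal sketch of how the per-thread results could be tallied. It is not part
of pfmon: parse_counts, ENTRY and SAMPLE are names made up for this sketch, and
reading the triple in parentheses as (pid, tid, ppid) is an assumption based
only on the listings in this mail. The sample reuses the counts from the
two-thread run, with the binary path shortened.

```python
import re
from collections import defaultdict

# Assumed shape of one pfmon result entry (read off the listings in this mail):
#   <count> <EVENT_NAME> <command path> (<pid>,<tid>,<ppid>)
# An entry may be wrapped over several lines, so whitespace is collapsed first.
ENTRY = re.compile(r"(\d+)\s+([A-Z][A-Z0-9_]*)\s+(\S+)\s*\((\d+),(\d+),(\d+)\)")


def parse_counts(text):
    """Return {event: {tid: count}} for every result entry found in text."""
    counts = defaultdict(dict)
    flat = " ".join(text.split())  # undo line wrapping
    for match in ENTRY.finditer(flat):
        count, event, _path, _pid, tid, _ppid = match.groups()
        counts[event][int(tid)] = int(count)
    return dict(counts)


# Counts from the two-thread run, with the binary path shortened.
SAMPLE = """
72193836 IA64_INST_RETIRED ./msxm_ijk_MP_COL (22051,22119,22050)
32430200 CPU_OP_CYCLES_ALL ./msxm_ijk_MP_COL (22051,22119,22050)
0 IA64_INST_RETIRED ./msxm_ijk_MP_COL (22051,22118,22050)
0 CPU_OP_CYCLES_ALL ./msxm_ijk_MP_COL (22051,22118,22050)
71767775 IA64_INST_RETIRED ./msxm_ijk_MP_COL (22051,22051,22050)
27399361 CPU_OP_CYCLES_ALL ./msxm_ijk_MP_COL (22051,22051,22050)
"""

counts = parse_counts(SAMPLE)
for event, per_thread in counts.items():
    # Threads that never hit the trigger symbol report a count of 0.
    active = sorted(tid for tid, n in per_thread.items() if n > 0)
    print(event, "total:", sum(per_thread.values()), "threads with data:", active)
```

Listing the threads with non-zero counts next to the total shows at a glance
which threads actually entered the triggered code range.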

Regards,
Juan Ángel



On Tue, 02-09-2008 at 13:52 +0200, stephane eranian wrote:
> Juan,
> 
> Ok, I found two issues with libpfm and pfmon on Montecito hardware.
> One of them was causing the problem you saw.
> 
> I have updated the CVS tree. You need to pull the latest releases from the
> CVS tree. Follow the instructions on the following page:
> 
>     http://sourceforge.net/cvs/?group_id=144822
> 
> The modules you need are: libpfm and pfmon
> 
> Then recompile and try again.
> 
> Thanks.
> 



_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
