Juan,

I would expect one thread to remain as the controller thread somehow, but I
don't know the internals of OpenMP. To verify the hypothesis, you could simply
increase the number of threads and check that you always get non-zero counts
in N-1 of the N threads.
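
To make that check less tedious, the per-thread result lines can be tallied
with a small script. This is only a sketch: it assumes result lines have the
shape "count EVENT path (a,b,c)" as in the output quoted below, and that the
second number in the parentheses identifies the thread. The function names
tally() and active_threads() are made up for this example and are not part of
pfmon.

```python
import re
from collections import defaultdict

# One result line, after re-joining lines wrapped by the mail client, e.g.
#   "72193836 IA64_INST_RETIRED /path/to/binary (22051,22119,22050)"
# Assumption: the second number in the parentheses is the thread id.
RESULT_RE = re.compile(
    r"(?<!\S)(\d+)\s+([A-Z][A-Z0-9_]*)\s+\S.*?\((\d+),(\d+),(\d+)\)"
)


def tally(text):
    """Return {thread_id: {event_name: count}} for all result lines in text."""
    # Collapse every run of whitespace so wrapped lines become one line.
    flat = " ".join(text.split())
    counts = defaultdict(dict)
    for match in RESULT_RE.finditer(flat):
        count, event, _first, tid, _third = match.groups()
        counts[int(tid)][event] = int(count)
    return dict(counts)


def active_threads(counts):
    """Return the sorted thread ids that have at least one non-zero count."""
    return sorted(tid for tid, events in counts.items() if any(events.values()))
```

Running the program with 2, 3, 4, ... threads and passing each saved pfmon
output to tally() would show whether len(active_threads(...)) is always one
less than the number of threads reported.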


On Tue, Sep 2, 2008 at 4:55 PM, Juan Ángel Lorenzo
<[EMAIL PROTECTED]> wrote:
> Hi again,
>
> Thanks for fixing the problem so fast. I could make pfmon work without
> problems. However, coming back to the issue I first wrote about, I still
> cannot properly monitor a single function in my code. I repeated the
> one-thread test with the new pfmon version and I got 0's again:
>
> ./pfmon -uk --verb  --follow-pthread --trigger-code-start=producto
> --trigger-code-stop=producto --trigger-code-follow
> -eIA64_INST_RETIRED,CPU_OP_CYCLES_ALL ./msxm_ijk_MP_COL
> MATRICES/bcsstm36.rcs 25
>
>
> loaded 18828 text symbols 13203 data symbols from /proc/kallsyms
> 1 event set(s) defined
> long  sampling periods(val/mask/seed): 0/0x0/00/0x0/0
> short sampling periods(val/mask/seed): 0/0x0/00/0x0/0
> [PMC4(pmc4)=0x2000809 m=0 e=0 s=0 i=0 thres=0 all=0 es=0x08 plm=9
> umask=0x0 pm=0 ism=0x2 oi=0] IA64_INST_RETIRED
> [PMD4(pmd4)]
> [PMC5(pmc5)=0x2001209 m=0 e=0 s=0 i=0 thres=0 all=0 es=0x12 plm=9
> umask=0x0 pm=0 ism=0x2 oi=0] CPU_OP_CYCLES_ALL
> [PMD5(pmd5)]
> pmd setup for event set0:
> [pmd4 set=0 ival=0x0 long_rate=0x0 short_rate=0x0 mask=0x0 seed=0
> randomize=n]
> [pmd5 set=0 ival=0x0 long_rate=0x0 short_rate=0x0 mask=0x0 seed=0
> randomize=n]
> exec-pattern=*
> [19467] started task: ./msxm_ijk_MP_COL MATRICES/bcsstm36.rcs 25
> [19467] load pid map version 1, flushing samples
> follow_exec=n follow_vfork=n follow_fork=n follow_pthread=y
> [19467] using 64-bit ABI
> entry point @ 0x4000000000000f40
> for 
> /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
> [19467] deactivated the deactivation of symbol resolution because of
> multiple exec() ;-)
> [19467] load pid map version 2, flushing samples
> [19467]
> monitoring 
> /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
>  MATRICES/bcsstm36.rcs 25...
> [19467] results are on terminal
> [19467] monitoring not activated
> [19467] installed entry code breakpoint (db0) at 0x4000000000000f40
> measurements started at Tue Sep  2 16:22:39 2008
>
> [19467] reached entry code breakpoint @0x4000000000000f40
> [19467] loaded 181 text symbols and 81 data symbols, ELF
> file /lib/ld-2.4.so
> [19467] loaded 2361 text symbols and 4642 data symbols, ELF
> file /opt/cesga/intel/intel9.0/cc/9.1.052/lib/libimf.so.6
> [19467] loaded 1069 text symbols and 438 data symbols, ELF
> file /lib/libm-2.4.so
> [19467] loaded 669 text symbols and 161 data symbols, ELF
> file /opt/cesga/intel/intel9.0/cc/9.1.052/lib/libguide.so
> [19467] loaded 5 text symbols and 8 data symbols, ELF
> file /opt/cesga/intel/intel9.0/cc/9.1.052/lib/libipr.so.6
> [19467] loaded 99 text symbols and 0 data symbols, ELF
> file /lib/libgcc_s.so.1
> [19467] loaded 282 text symbols and 64 data symbols, ELF
> file /lib/libpthread-2.4.so
> [19467] loaded 2640 text symbols and 647 data symbols, ELF
> file /lib/libc-2.4.so
> [19467] loaded 32 text symbols and 3 data symbols, ELF
> file /lib/libdl-2.4.so
> [19467] loaded 58 text symbols and 1 data symbols, ELF
> file /lib/libunwind.so.7.0.0
> [19467] loaded 30 text symbols and 33 data symbols, ELF
> file 
> /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
> using hardware breakpoints
> [19467] dlopen hook
> on 
> /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
>  MATRICES/bcsstm36.rcs 25
> [19467] dlopen hook found @0x200000000001dbe0
> [19467] clearing code breakpoint(db0) @0x4000000000000f40
> [19467] installed start code breakpoint (db1) at 0x4000000000002200
> [19467] installed dlopen code breakpoint (db3) at 0x200000000001dbe0
> [19467] resume after code breakpoint
> [19467] cloned [19534]
> [19467] load pid map version 1, flushing samples
> [19534]
> monitoring 
> /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
>  MATRICES/bcsstm36.rcs 25...
> [19534] results are on terminal
> [19534] monitoring not activated
> [19534] installed start code breakpoint (db1) at 0x4000000000002200
>  Tiempo de ejecucion: 0.028932
> [19534] task exited
>                         0
> IA64_INST_RETIRED 
> /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
>  (19467,19534,19466)
>                         0
> CPU_OP_CYCLES_ALL 
> /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
>  (19467,19534,19466)
> [19534] detached
> [19467] task exited
>                         0
> IA64_INST_RETIRED 
> /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
>  (19467,19467,19466)
>                         0
> CPU_OP_CYCLES_ALL 
> /home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
>  (19467,19467,19466)
> [19467] detached
> Task chain registered for processing:
>        task 19467 ()
>        task 19467
> (/home/usc/el/jlc/HWcounters/hwcproject/Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
>  MATRICES/bcsstm36.rcs 25)
> created tasks        : 2
> maximum tasks        : 2
> maximum active tasks : 2
> measurements completed at Tue Sep  2 16:22:41 2008
>
>
> Maybe I misunderstood the symbols the compiler (icc) created. If I run
> the code with 2 threads and inspect the symbol "L_producto_21__par_loop1",
> I also get 0's in all 3 threads. However, if I inspect the symbol
> "L_main_21__par_loop0", I get data in 2 of the 3 threads:
>
> 72193836 IA64_INST_RETIRED /home/usc/el/jlc/HWcounters/hwcproject/
> Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
> (22051,22119,22050)
>
> 32430200 CPU_OP_CYCLES_ALL /home/usc/el/jlc/HWcounters/hwcproject/
> Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
> (22051,22119,22050)
>
> 0 IA64_INST_RETIRED /home/usc/el/jlc/HWcounters/hwcproject/
> Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
> (22051,22118,22050)
>
> 0 CPU_OP_CYCLES_ALL /home/usc/el/jlc/HWcounters/hwcproject/
> Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
> (22051,22118,22050)
>
> 71767775 IA64_INST_RETIRED /home/usc/el/jlc/HWcounters/hwcproject/
> Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
> (22051,22051,22050)
>
> 27399361 CPU_OP_CYCLES_ALL /home/usc/el/jlc/HWcounters/hwcproject/
> Fase_1/PichelSrc/DISTANCES/codigosCache/msxm_ijk_MP_COL
> (22051,22051,22050)
>
> So, taking into account that I only requested 2 threads, maybe the
> results are correct, and I'm getting the information about the 2
> threads under a different symbol (L_main_21__par_loop0) because the
> compiler placed the parallel OpenMP code there instead of inside the
> "producto" function. What I don't understand is why pfmon doesn't follow
> execution from "producto" into "L_main_21__par_loop0".
>
> Regards,
> Juan Ángel
>
>
>
> On Tue, 02-09-2008 at 13:52 +0200, stephane eranian wrote:
>> Juan,
>>
>> Ok, I found two issues with libpfm and pfmon on Montecito hardware.
>> One of them was causing the problem you saw.
>>
>> I have updated the CVS tree. You need to pull the latest releases from
>> it. Follow the instructions on the following page:
>>
>>     http://sourceforge.net/cvs/?group_id=144822
>>
>> The modules you need are: libpfm and pfmon
>>
>> Then recompile and try again.
>>
>> Thanks.
>>
>
>
>
_______________________________________________
perfmon2-devel mailing list
perfmon2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/perfmon2-devel