On Wednesday, February 19, 2014 4:42 PM, Chinmaya Dwibedy <[email protected]>
wrote:
Hi Martin,
I hope this email finds you in the best of health and spirits. Please go through the email below and share your feedback if possible.
Thank you in advance; your support so far has really helped me.
Regards,
Chinmaya
Hi,
I changed the strongSwan (5.0.4) code to implement the diffie_hellman_t interface in order to use the Octeon Core Crypto Library APIs. I ran the IPsec scenario under high load with DH group 1 (encryption algorithm: AES, integrity algorithm: SHA1) on Wind River Linux (Octeon platform) and found the tunnel setup rate to be 165-170. Note that with the gmp library (using the same set of parameters), the setup rate was 120-125. I profiled the code to identify the hotspots at both ends (IKE responder and IKE initiator). The overall CPU utilization at both ends was below 10%. Here are the profiling results.
IKE Initiator
   PerfTop: 2508649 irqs/sec  kernel:96.0%  [1000Hz cpu-clock-msecs],  (all, 16 CPUs)
-------------------------------------------------------------------------------

             samples  pcnt function                  DSO
             _______ _____ ________________________ _____________________

          3874309.00 89.5% r4k_wait                 [kernel.kallsyms]
            73973.00  1.7% dso__find_symbol         /usr/bin/perf
            39131.00  0.9% pthread_mutex_lock       libpthread-2.11.1.so
            37251.00  0.9% event__preprocess_sample /usr/bin/perf
            17659.00  0.4% pthread_rwlock_rdlock    libpthread-2.11.1.so
            17010.00  0.4% __pthread_rwlock_unlock  libpthread-2.11.1.so
            15596.00  0.4% __libc_malloc            /lib64/libc-2.11.1.so
            13918.00  0.3% maps__find               /usr/bin/perf
            13066.00  0.3% cfree                    /lib64/libc-2.11.1.so
            12733.00  0.3% vfprintf                 /lib64/libc-2.11.1.so
            12684.00  0.3% POSTLOOP1                libstrongswan-gmp.so
            11746.00  0.3% dump_printf              /usr/bin/perf
            11257.00  0.3% MM$L2                    libstrongswan-gmp.so
            10115.00  0.2% LOOP1                    libstrongswan-gmp.so
             8556.00  0.2% SHA1Transform            libstrongswan-sha1.so
IKE Responder
-------------------------------------------------------------------------------
   PerfTop:  859932 irqs/sec  kernel:88.4%  [1000Hz cpu-clock-msecs],  (all, 16 CPUs)
-------------------------------------------------------------------------------

             samples  pcnt function                  DSO
             _______ _____ ________________________ ____________________

          2783544.00 89.8% r4k_wait                 [kernel.kallsyms]
            49292.00  1.6% dso__find_symbol         /usr/bin/perf
            27989.00  0.9% pthread_mutex_lock       libpthread-2.11.1.so
            25644.00  0.8% event__preprocess_sample /usr/bin/perf
            14973.00  0.5% pthread_rwlock_rdlock    libpthread-2.11.1.so
            14370.00  0.5% __pthread_rwlock_unlock  libpthread-2.11.1.so
            10068.00  0.3% __libc_malloc            libc-2.11.1.so
             9905.00  0.3% maps__find               /usr/bin/perf
             9845.00  0.3% vfprintf                 libc-2.11.1.so
             9244.00  0.3% POSTLOOP1                libstrongswan-gmp.so
             8480.00  0.3% cfree                    libc-2.11.1.so
             8279.00  0.3% MM$L2                    libstrongswan-gmp.so
             7978.00  0.3% dump_printf              /usr/bin/perf
             7297.00  0.2% LOOP1                    libstrongswan-gmp.so
             6902.00  0.2% perf_session__findnew    /usr/bin/perf
It seems that, aside from the kernel idle routine (r4k_wait) and perf's own symbols, pthread_mutex_lock() is at the top of the profile and thus consumes the most CPU cycles and CPU time. Is there any option to find out the exact piece of code that causes this performance issue?
Regards,
Chinmaya
_______________________________________________
Users mailing list
[email protected]
https://lists.strongswan.org/mailman/listinfo/users