Hey Dude,
I pulled down a copy of your test program and ran a few experiments.
$ time ./xml
100000 iter in 22.715982 sec
real 0m22.721s
user 0m22.694s
sys 0m0.007s
This seems to indicate that all of our time is being spent in usermode, so
whatever it is in Solaris that is slower than Linux, it's not the operating
system.
Just to double check, I used the following dtrace to look at the time this
application is spending in syscalls:
# dtrace -n 'syscall:::entry /execname == "xml"/ { self->traced = timestamp;
self->func = probefunc;} syscall:::return /self->traced/ { @a[self->func] =
sum(timestamp - self->traced); @b[self->func] = count(); self->func = (char
*)0; self->timestamp = 0;}'
(This returns the time xml spends in syscalls as well as the number of syscalls
made)
Time (nanoseconds):
getpid 1978
sysconfig 2229
sigpending 3071
sysi86 3529
getrlimit 4234
setcontext 5763
fstat64 7042
close 16606
write 19898
getcwd 21302
ioctl 23668
munmap 25757
brk 28445
open 40712
resolvepath 51777
xstat 57616
mmap 159275
memcntl 267070
Number of invocations:
fstat64 1
getcwd 1
getpid 1
getrlimit 1
ioctl 1
sigpending 1
sysconfig 1
sysi86 1
write 1
setcontext 2
memcntl 6
close 7
munmap 8
open 8
resolvepath 9
xstat 9
brk 10
mmap 32
This agrees with the output from time. You're spending a miniscule amount
of system time performing memory operations.
The next place to look would be at libxml2. What library versions do
you have installed on the different machines. If the library versions
don't match, or you've compiled with a different set of options, or used
a different compiler to build the library, you'll get different results
for the benchmark.
I ran my test on snv_84. It has version 2.6.23 installed.
Another approach would be to use DTrace's profile provider. I obtained
usable results by performing the following:
# dtrace -n 'profile-1001 /execname == "xml"/ {...@a[ustack()] = count();} END
{ trunc(@a, 20);}' -c /tmp/xml
This returns the 20 most frequent stack frames (and the number of times
the occurred) when DTrace runs a profile probe at 1001hz.
xmlDictLookup shows up as a popular function, but that may be a red
herring. I'll include the top 5 here, but you can run this command, or adjust
the truncation to view any amount you'd like:
libc.so.1`_free_unlocked+0x4c
libc.so.1`free+0x2b
libxml2.so.2`xmlFreeNodeList+0x19e
libxml2.so.2`xmlFreeNodeList+0xb6
libxml2.so.2`xmlFreeNodeList+0xb6
libxml2.so.2`xmlFreeDoc+0xac
xml`main+0x55
xml`_start+0x80
61
libxml2.so.2`xmlDictLookup+0x60
libxml2.so.2`xmlParseNCName+0xab
libxml2.so.2`xmlParseQName+0x44
libxml2.so.2`xmlParseAttribute2+0x54
libxml2.so.2`xmlParseStartTag2+0x246
libxml2.so.2`xmlParseElement+0x99
libxml2.so.2`xmlParseContent+0x12c
libxml2.so.2`xmlParseElement+0x134
libxml2.so.2`xmlParseContent+0x12c
libxml2.so.2`xmlParseElement+0x134
libxml2.so.2`xmlParseDocument+0x33b
libxml2.so.2`xmlDoRead+0x6b
libxml2.so.2`xmlReadMemory+0x36
xml`main+0x4c
xml`_start+0x80
66
libxml2.so.2`xmlParseStartTag2+0xb7e
libxml2.so.2`xmlParseElement+0x99
libxml2.so.2`xmlParseContent+0x12c
libxml2.so.2`xmlParseElement+0x134
libxml2.so.2`xmlParseContent+0x12c
libxml2.so.2`xmlParseElement+0x134
libxml2.so.2`xmlParseDocument+0x33b
libxml2.so.2`xmlDoRead+0x6b
libxml2.so.2`xmlReadMemory+0x36
xml`main+0x4c
xml`_start+0x80
66
libc.so.1`_free_unlocked+0x53
libc.so.1`free+0x2b
libxml2.so.2`xmlFreeNodeList+0x19e
libxml2.so.2`xmlFreeNodeList+0xb6
libxml2.so.2`xmlFreeNodeList+0xb6
libxml2.so.2`xmlFreeDoc+0xac
xml`main+0x55
xml`_start+0x80
72
libxml2.so.2`xmlDictLookup+0x60
libxml2.so.2`xmlParseNCName+0xab
libxml2.so.2`xmlParseQName+0x44
libxml2.so.2`xmlParseStartTag2+0x14c
libxml2.so.2`xmlParseElement+0x99
libxml2.so.2`xmlParseContent+0x12c
libxml2.so.2`xmlParseElement+0x134
libxml2.so.2`xmlParseContent+0x12c
libxml2.so.2`xmlParseElement+0x134
libxml2.so.2`xmlParseDocument+0x33b
libxml2.so.2`xmlDoRead+0x6b
libxml2.so.2`xmlReadMemory+0x36
xml`main+0x4c
xml`_start+0x80
80
Hope this helps. It may be that you've got a newer / more optimized
version of libxml2 on your Linux box.
-j
On Mon, Apr 28, 2008 at 04:21:19PM -0400, Matty wrote:
> Howdy,
>
> I have been working with one of our developers to port a Linux application to
> opensolaris. While benchmarking the app, we noticed that it ran 2x slower on
> a Nevada build 85 host than it did on Linux. The application utilizes libxml
> to
> transform XML documents, and I think we have narrowed down the discrepancy
> to the xmlReadMemory libxml function. We created a test program to call
> xmlReadMemory100k times, and measured the program execution time. Here
> are the results we got on the same hardware (Sun X2200):
>
> CentOS Linux 5:
>
> % ./xml
> 100000 iter in 9.581637 sec
>
> Nevada build 85:
>
> % ./xml
> 100000 iter in 15.983286 sec
>
> When we ran the collect / er_print tools to generate execution
> profiles, we noticed that
> the top functions make heavy use of memory. Based on a number of documents I
> read on the Sun developer site, I tried different compiler flags,
> memory allocators
> (mtmalloc, libumem) and compilers (Sun studio 12). These items didn't
> help, and we
> are currently a bit mystified. The test program our developer wrote is
> available here:
>
> http://prefetch.net/xml.c
>
> And the following command line can be used to compile it;
>
> $ gcc -O3 -o xml `/usr/bin/xml2-config --libs --cflags` xml.c
>
> That said, does anyone have any thoughts on why the same code would be slower
> on Nevada? I really want to get our app running on opensolaris, but
> it's hard to
> justify the port when it when it's not running as well under Linux. :(
>
> Thanks for any insight,
> - Ryan
> _______________________________________________
> perf-discuss mailing list
> perf-discuss at opensolaris.org