It's not obvious is it?
I've created KB with an example of how to do it, which should be published
soon.
Mark
Preview Knowledge Article
Content Entry Fields
Title
How to Convert Thread IDs (TID) in AR Server Logs to Linux LWP IDs.
Problem
On Solaris the TID used in the AR Server logs is the same as that used by the
OS to identify individual threads. This makes it easy to take the output of
commands such as pstack/top and map the activity of a particular thread to
SQL/API being run by the server. This is not the case on Linux as the OS
utilities may show different LWP Ids that do no match the TID values used in
the server logs. For example, on startup the server thread log records the
creation of each thread;
<THRD> /* Tue Feb 26 2013 07:14:36.3408 */ Thread Id 1106151744 (thread number
9) on LIST queue started.
<THRD> /* Tue Feb 26 2013 07:14:36.3409 */ Thread Id 1086691648 (thread number
10) on PRIVATE queue 390680 started.
The server logs use the Thread Id value;
<API > <TID: 1086691648> <RPC ID: 0000001180> <Queue: Prv:390680> <Client-RPC:
390680 > <USER: Remedy Application Service > /* Tue Feb 26
2013 07:24:39.8051 */+GES ARGetEntryStatistics
However, OS utilities show LWP Ids;
$ top
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3533 remedy 15 0 564m 160m 21m S 6.7 8.0 0:00.20 arserverd
3469 remedy 15 0 564m 160m 21m S 0.0 8.0 0:00.48 arserverd
3470 remedy 15 0 564m 160m 21m S 0.0 8.0 0:00.00 arserverd
3471 remedy 15 0 564m 160m 21m S 0.0 8.0 0:00.00 arserverd
$ pstack
Thread 11 (Thread 0x40c59940 (LWP 3533)):
#0 0x0000003742e0a899 in pthread_cond_wait@@GLIBC_2.3.2 ()
#1 0x00000000009df414 in WaitForRxSemaphore ()
#2 0x00000000006c40a4 in GetCall ()
#3 0x000000000045490b in HandleRPCs ()
#4 0x00000000007e9266 in WorkerThread ()
#5 0x00000000009df95c in RestartableThreadMain ()
#6 0x00000000009df8e7 in UnixThreadStartRoutine ()
#7 0x0000003742e06367 in start_thread () from /lib64/libpthread.so.0
#8 0x00000037422d30ad in clone () from /lib64/libc.so.6
Solution
It is possible to find the mapping between the different Ids being used by
taking the hexadecimal thread number from the pstack output and converting it
to decimal.
In the example above we can see that Thread 0x40c59940 is also known as LWP
3533. The decimal value of 0x40c59940 can be found by using this Linux command
line, or a similar converter;
$ echo $((0x40c59940))
1086691648
LWP 3533 is the thread logged as TID: 1086691648 in the server logs. Note that
the Thread 11 in the pstack output does NOT match the thread number recorded in
the arthread.log file.
A full list of mappings for an arserverd process can be found using the gdb
debugger or pstack/gstack command;
WARNING - although the risk is low please be careful running this command
against a production system. These are debugging tools and it will suspend the
arserverd process for a moment which may cause problems on a very busy server.
If possible run this command when the server is not heavily loaded.
$ echo quit | gdb arserverd `pgrep arserverd` | grep LWP | sort -k5 | awk
'BEGIN {print "TID\t\tLWP\t\tHEX";}{ printf "%i\t%i\t\t%s\n", strtonum($3), $5,
$3}'
TID LWP HEX
47883904271504 3469 0x2b8cd71ac490
1088616768 3470 0x40e2f940
1089669440 3471 0x40f30940
1092467008 3472 0x411db940
1095625024 3527 0x414de940
1102993728 3528 0x41be5940
1100888384 3529 0x419e3940
1101941056 3530 0x41ae4940
1084586304 3531 0x40a57940
1106151744 3532 0x41ee8940
1086691648 3533 0x40c59940
1094572352 3534 0x413dd940
1104046400 3535 0x41ce6940
1097730368 3536 0x416e0940
1107204416 3537 0x41fe9940
1093519680 3538 0x412dc940
1105099072 3539 0x41de7940
1085638976 3540 0x40b58940
1083533632 3542 0x40956940
1099835712 3547 0x418e2940
The same output can be found using pstack/gstack;
$ pstack `pgrep arserverd` | grep '(LWP ' | sort -k6 | awk 'BEGIN {print
"TID\t\tLWP\t\tHEX";}{ printf "%i\t%i\t\t%s\n", strtonum($4), $6, $4}'
TID LWP HEX
47883904271504 3469 0x2b8cd71ac490
1088616768 3470 0x40e2f940
1089669440 3471 0x40f30940
1092467008 3472 0x411db940
1095625024 3527 0x414de940
1102993728 3528 0x41be5940
1100888384 3529 0x419e3940
1101941056 3530 0x41ae4940
1084586304 3531 0x40a57940
1106151744 3532 0x41ee8940
1086691648 3533 0x40c59940
1094572352 3534 0x413dd940
1104046400 3535 0x41ce6940
1097730368 3536 0x416e0940
1107204416 3537 0x41fe9940
1093519680 3538 0x412dc940
1105099072 3539 0x41de7940
1085638976 3540 0x40b58940
1083533632 3542 0x40956940
1099835712 3547 0x418e2940
In both examples a list of lines containing the hexadecimal thread and LWP
numbers is printed and then processed using common Linux utilities to produce
the formatted output.
From: Action Request System discussion list(ARSList)
[mailto:[email protected]] On Behalf Of Campbell, Paul (Paul)
Sent: 25 February 2013 20:05
To: [email protected]
Subject: Help with identifying process id for a specific arserverd Thread on
linux
**
OK ARSystem Linux gurus out there, I am moving to RHEL6 from Solaris 9 and on
Solaris I could run a prstat -L -p <arserverd pid> and it would give me the
arserverd/<threadID> of the top cpu using thread, I could then look at the
arthread log and see which tid that thread would be in the filter/sql log file
to try and find out what that thread was doing. Now that we are on RHEL Linux,
we don't have prstat, I have to use top and press <shift>H to get the per
thread cpu usage, but on Linux, each LWP has its own process ID, so I can't
figure out how to tie a high running thread pid with the thread tid in arthread
log or filter/sql logs. I tried ps -C arserverd -m -o pid,tid,pcpu, but the
tid column showed me the process id from top. I even looked at
/proc/<pid>/status and it showed pid and tgid as the same and nothing matched
what I saw in the arthread.log. Ideas?
_ARSlist: "Where the Answers Are" and have been for 20 years_
_______________________________________________________________________________
UNSUBSCRIBE or access ARSlist Archives at www.arslist.org
"Where the Answers Are, and have been for 20 years"