I just stumbled upon OProfile, a relatively new profiling tool for Linux/x86 that works well with Apache:
http://oprofile.sourceforge.net/

It's a sampling profiler, like gprof, but it's completely external to
the app being measured--no recompilation needed. It's implemented as
a kernel module that samples the values of hardware performance counters
and associates the data with the currently active function. Depending
on the underlying CPU, it also can report on things like icache misses
and branch prediction success rate on a per-function basis.

From a test run of 2.0.37-dev, OProfile's profile of CPU usage in
the httpd and its libraries yielded the following "top 10" lists.

--Brian

httpd:

vma       samples  %-age    symbol name
0808188c  32       5.71429  ap_rgetline_core
08076f58  25       4.46429  ap_merge_per_dir_configs
08088e38  22       3.92857  ap_directory_walk
08086f50  22       3.92857  core_input_filter
08080680  20       3.57143  add_any_filter_handle
080628ac  20       3.57143  config_log_transaction
08087328  15       2.67857  core_output_filter
080807e4  15       2.67857  add_any_filter
0806868c  15       2.67857  analyze_ct
08086e48  14       2.5      net_time_filter

libapr.so:

vma       samples  %-age    symbol name
00017304  98       16.7521  apr_palloc
000087e4  53       9.05983  apr_vformatter
0000c234  40       6.83761  apr_table_get
00017fa8  23       3.93162  apr_pool_cleanup_register
00018a00  22       3.76068  __divdi3
0000bc5c  21       3.58974  make_array_core
0000daa0  19       3.24786  apr_filepath_merge
0000c49c  19       3.24786  apr_table_setn
00011cbc  16       2.73504  apr_setsocketopt
0000ce64  14       2.39316  apr_table_overlap

libaprutil.so:

vma       samples  %-age    symbol name
00005e84  91       27.1642  apr_brigade_puts
00005840  41       12.2388  apr_brigade_create
000057bc  29       8.65672  apr_brigade_cleanup
00004f28  23       6.86567  apr_bucket_simple_copy
0000614c  17       5.07463  apr_bucket_alloc
0000e394  14       4.1791   match_boyer_moore_horspool
00005d5c  14       4.1791   apr_brigade_write
00005b50  12       3.58209  apr_brigade_split_line
00004dec  11       3.28358  heap_bucket_destroy
00004dc8  11       3.28358  heap_bucket_read
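For anyone wondering how a sampling profiler gets from raw samples to a
list like the ones above, here is a rough Python sketch of the general
idea: each sampled address is attributed to the symbol with the nearest
start address at or below it, and the counts are then turned into
percentages. This is not OProfile's actual code. The three symbol
addresses are copied from the httpd listing, but the real binary has
many more symbols in between, and the sampled addresses and the helper
names (symbol_for, profile) are invented for illustration.

```python
from bisect import bisect_right
from collections import Counter

# Symbol start addresses copied from the httpd listing. A real symbol
# table has many more entries between these, so this is illustrative only.
SYMBOLS = sorted([
    (0x08086e48, "net_time_filter"),
    (0x08086f50, "core_input_filter"),
    (0x08087328, "core_output_filter"),
])
STARTS = [start for start, _ in SYMBOLS]


def symbol_for(address):
    """Return the symbol with the greatest start <= address, or None."""
    index = bisect_right(STARTS, address) - 1
    return SYMBOLS[index][1] if index >= 0 else None


def profile(sampled_addresses):
    """Tally samples per symbol; return rows of (name, count, percent)."""
    counts = Counter(symbol_for(addr) for addr in sampled_addresses)
    total = sum(counts.values())
    return [(name, n, 100.0 * n / total) for name, n in counts.most_common()]


# Invented program-counter samples, NOT data from the test run.
SAMPLES = [0x08086e50, 0x08086f60, 0x08086f70, 0x08087330, 0x08086f54]

if __name__ == "__main__":
    for name, count, percent in profile(SAMPLES):
        print(f"{name:<20} {count:>3} {percent:8.4f}")
```

The binary search keeps the per-sample lookup cheap, which matters when
the table holds every symbol in a binary rather than three.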
