Hi,

On 2017-01-24 18:57:47 +0100, Tobias Oberstein wrote:
> On 24.01.2017 at 18:41, Andres Freund wrote:
> > On 2017-01-24 18:37:14 +0100, Tobias Oberstein wrote:
> > > The syscall overhead is visible in production too .. I watched PG using
> > > perf live, and lseeks regularly appear at the top of the list.
> >
> > Could you show such perf profiles? That'll help us.
>
> oberstet@bvr-sql18:~$ psql -U postgres -d adr
> psql (9.5.4)
> Type "help" for help.
>
> adr=# select * from svc_sqlbalancer.f_perf_syscalls();
> NOTICE:  starting Linux perf syscalls sampling - be patient, this can take some time ..
> NOTICE:  sudo /usr/bin/perf stat -e "syscalls:sys_enter_*" -x ";" -a sleep 30 2>&1
>  pid |                syscall                |   cnt   | cnt_per_sec
> -----+---------------------------------------+---------+-------------
>      | syscalls:sys_enter_lseek              | 4091584 |      136386
>      | syscalls:sys_enter_newfstat           | 2054988 |       68500
>      | syscalls:sys_enter_read               |  767990 |       25600
>      | syscalls:sys_enter_close              |  503803 |       16793
>      | syscalls:sys_enter_newstat            |  434080 |       14469
>      | syscalls:sys_enter_open               |  380382 |       12679
>      | syscalls:sys_enter_mmap               |  301491 |       10050
>      | syscalls:sys_enter_munmap             |  182313 |        6077
>      | syscalls:sys_enter_getdents           |  162443 |        5415
>      | syscalls:sys_enter_rt_sigaction       |  158947 |        5298
>      | syscalls:sys_enter_openat             |   85325 |        2844
>      | syscalls:sys_enter_readlink           |   77439 |        2581
>      | syscalls:sys_enter_rt_sigprocmask     |   60929 |        2031
>      | syscalls:sys_enter_mprotect           |   58372 |        1946
>      | syscalls:sys_enter_futex              |   49726 |        1658
>      | syscalls:sys_enter_access             |   40845 |        1362
>      | syscalls:sys_enter_write              |   39513 |        1317
>      | syscalls:sys_enter_brk                |   33656 |        1122
>      | syscalls:sys_enter_epoll_wait         |   23776 |         793
>      | syscalls:sys_enter_ioctl              |   19764 |         659
>      | syscalls:sys_enter_wait4              |   17371 |         579
>      | syscalls:sys_enter_newlstat           |   13008 |         434
>      | syscalls:sys_enter_exit_group         |   10135 |         338
>      | syscalls:sys_enter_recvfrom           |    8595 |         286
>      | syscalls:sys_enter_sendto             |    8448 |         282
>      | syscalls:sys_enter_poll               |    7200 |         240
>      | syscalls:sys_enter_lgetxattr          |    6477 |         216
>      | syscalls:sys_enter_dup2               |    5790 |         193
>
> <snip>
>
> Note: there isn't a lot of load currently (this is from production).

That doesn't really mean that much - sure, it shows that lseek is
frequent, but it doesn't tell you how much impact this has on the overall
workload.  For that you'd need a generic (i.e. not syscall tracepoint, but
CPU cycle) perf profile, and to look in the call graph (via perf report
--children) at how much of it is below the lseek syscall.

> > > > I'm much less against this change than Tom, but doing artificial syscall
> > > > microbenchmark seems unlikely to make a big case for using it in
> > >
> > > This isn't a syscall benchmark, but FIO.
> >
> > There's not really a difference between those, when you use fio to
> > benchmark seek vs pseek.
>
> Sorry, I don't understand what you are talking about.

Fio, as you appear to have used it, is a microbenchmark benchmarking
individual syscalls.

> > > > postgres, where it's part of vastly more expensive operations (like
> > > > actually reading data afterwards, exclusive locks, ...).
> > >
> > > PG is very CPU hungry, yes.
> >
> > Indeed - working on it ;)
> >
> > > But there are quite some system related effects
> > > too .. eg we've managed to get down the system load with huge pages (big
> > > improvement).
> >
> > Glad to hear it.
>
> With 3TB RAM, huge pages is absolutely essential (otherwise, the system bogs
> down in TLB etc overhead).

I was one of the people working on adding hugepage support to pg, that's
why I was glad ;)

Regards,

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
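
For readers unfamiliar with what is being compared in the "seek vs pseek"
exchange, here is a minimal sketch. It assumes the comparison is between a
seek followed by a read (two syscalls) and a single positioned read
(pread). The function names and the temporary file are made up for
illustration; nothing here is taken from PostgreSQL's source, and the sketch
measures nothing. It only shows that both paths return the same bytes.

```python
import os
import tempfile


def read_with_lseek(fd: int, offset: int, length: int) -> bytes:
    """Two syscalls: move the fd's file offset, then read from it."""
    os.lseek(fd, offset, os.SEEK_SET)
    return os.read(fd, length)


def read_with_pread(fd: int, offset: int, length: int) -> bytes:
    """One syscall: read at an explicit offset, leaving the fd's offset alone."""
    return os.pread(fd, length, offset)


def demo():
    # Hypothetical 16 kB scratch file standing in for a data file.
    with tempfile.TemporaryFile() as f:
        f.write(bytes(range(256)) * 64)
        f.flush()
        fd = f.fileno()
        a = read_with_lseek(fd, 8192, 16)
        b = read_with_pread(fd, 8192, 16)
        return a, b


if __name__ == "__main__":
    a, b = demo()
    print(a == b)
```

Timing these two functions in a loop would itself be a microbenchmark of
individual syscalls, so by the argument above it would not show how much the
lseek calls matter to a real workload; a CPU-cycle profile with call graphs
is still needed for that.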