Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-06-22 Thread Andres Freund
On 2017-06-22 12:43:16 -0400, Robert Haas wrote:
> On Wed, Jan 25, 2017 at 2:52 PM, Andres Freund wrote:
> > You'll, depending on your workload, still have a lot of lseeks even if
> > we were to use pread/pwrite because we do lseek(SEEK_END) to get file
> > sizes.
>
> I'm

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-06-22 Thread Robert Haas
On Wed, Jan 25, 2017 at 2:52 PM, Andres Freund wrote:
> You'll, depending on your workload, still have a lot of lseeks even if
> we were to use pread/pwrite because we do lseek(SEEK_END) to get file
> sizes.

I'm pretty convinced that the lseek overhead that we're incurring
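
For readers following along, here is a minimal sketch of the file-size pattern Andres refers to: asking for a file's size by seeking to its end. It uses Python's thin wrappers over the same syscalls purely for illustration. The helper names are hypothetical and this is not PostgreSQL code. The fstat variant is shown only for comparison, not as something proposed in the thread.

```python
import os
import tempfile


def size_via_lseek(fd):
    """Return the file size as the offset reported by lseek(fd, 0, SEEK_END).

    Side effect: the descriptor's file offset now points at end of file.
    """
    return os.lseek(fd, 0, os.SEEK_END)


def size_via_fstat(fd):
    """Return the file size from fstat(); the file offset is not touched."""
    return os.fstat(fd).st_size


with tempfile.TemporaryFile() as f:
    f.write(b"x" * 8192 * 3)  # three 8kB blocks of filler data
    f.flush()
    fd = f.fileno()
    lseek_size = size_via_lseek(fd)
    fstat_size = size_via_fstat(fd)
```

Either way the size query is its own syscall, separate from the read path, which is why switching reads to pread would not remove these lseek calls.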

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-25 Thread Tobias Oberstein
Hi, Synthetic PG workload or real world production workload? Both might work, production-like has bigger pull, but I'd guess synthetic is good enough. Thanks! The box should get PostgreSQL in the not too distant future. It'll get a backup from prod, but will act as new prod, so it might

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-25 Thread Andres Freund
Hi,

On 2017-01-25 10:16:32 +0100, Tobias Oberstein wrote:
> > > Using pread instead of lseek+read halves the syscalls.
> > >
> > > I really don't understand what you are fighting here ..
> >
> > Sure, there's some overhead. And as I said upthread, I'm much less
> > against this change than Tom.

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-25 Thread Tobias Oberstein
Hi Andres,

> > Using pread instead of lseek+read halves the syscalls.
> >
> > I really don't understand what you are fighting here ..
>
> Sure, there's some overhead. And as I said upthread, I'm much less
> against this change than Tom. What I'm saying is that your benchmarks
> haven't shown a benefit in a
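
To make the "halves the syscalls" point concrete, here is a small sketch comparing the two read paths for one 8kB block. It is illustrative only: the function names and `BLOCK_SIZE` are assumptions made for this example, and Python's `os` wrappers stand in for the C calls discussed in the thread.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # 8kB, chosen to match the read size mentioned in the thread


def read_block_seek(fd, block_no):
    """Two syscalls: move the file offset with lseek, then read from it."""
    os.lseek(fd, block_no * BLOCK_SIZE, os.SEEK_SET)
    return os.read(fd, BLOCK_SIZE)


def read_block_pread(fd, block_no):
    """One syscall: pread takes the offset as an argument and leaves the
    descriptor's own file offset unchanged."""
    return os.pread(fd, BLOCK_SIZE, block_no * BLOCK_SIZE)


with tempfile.TemporaryFile() as f:
    for i in range(4):
        f.write(bytes([i]) * BLOCK_SIZE)  # block i is filled with byte value i
    f.flush()
    fd = f.fileno()
    via_seek = read_block_seek(fd, 2)
    via_pread = read_block_pread(fd, 2)
```

Both paths return the same block. The difference is one kernel entry per read instead of two, which only matters once the read rate is high enough for syscall overhead to show up.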

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-25 Thread Tobias Oberstein
Hi Alvaro,

On 24.01.2017 at 19:36, Alvaro Herrera wrote:
> Tobias Oberstein wrote:
> > I am benchmarking IOPS, and while doing so, it becomes apparent that at
> > these scales it does matter _how_ IO is done.
> >
> > The most efficient way is libaio. I get 9.7 million/sec IOPS with low CPU
> > load. Using any

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-24 Thread Andres Freund
On 2017-01-24 19:25:52 +0100, Tobias Oberstein wrote:
> Hi,
>
> > >  pid |           syscall           |   cnt   | cnt_per_sec
> > > -----+-----------------------------+---------+-------------
> > >      | syscalls:sys_enter_lseek    | 4091584 |      136386
> > >

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-24 Thread Andres Freund
On 2017-01-24 15:36:13 -0300, Alvaro Herrera wrote:
> Tobias Oberstein wrote:
>
> > I am benchmarking IOPS, and while doing so, it becomes apparent that at
> > these scales it does matter _how_ IO is done.
> >
> > The most efficient way is libaio. I get 9.7 million/sec IOPS with low CPU
> >

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-24 Thread Alvaro Herrera
Tobias Oberstein wrote:
> I am benchmarking IOPS, and while doing so, it becomes apparent that at
> these scales it does matter _how_ IO is done.
>
> The most efficient way is libaio. I get 9.7 million/sec IOPS with low CPU
> load. Using any synchronous IO engine is slower and produces higher

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-24 Thread Tobias Oberstein
Hi,

> >  pid |           syscall           |   cnt   | cnt_per_sec
> > -----+-----------------------------+---------+-------------
> >      | syscalls:sys_enter_lseek    | 4091584 |      136386
> >      | syscalls:sys_enter_newfstat | 2054988 |       68500
> >      |
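
A quick way to read the two visible rows, as a rough sketch: it assumes `cnt_per_sec` is simply `cnt` divided by the length of the sampling window, which the excerpt does not state. Under that assumption both rows imply the same window of about 30 seconds, and roughly two lseek calls for every newfstat call.

```python
# (cnt, cnt_per_sec) pairs copied from the two rows visible in the excerpt.
counts = {
    "lseek": (4091584, 136386),
    "newfstat": (2054988, 68500),
}

# Implied sampling window in seconds, ASSUMING cnt_per_sec = cnt / window.
windows = {name: cnt / per_sec for name, (cnt, per_sec) in counts.items()}

# How many lseek calls were recorded per newfstat call.
lseek_per_fstat = counts["lseek"][0] / counts["newfstat"][0]

for name, seconds in windows.items():
    print(f"{name}: implied window of {seconds:.2f} s")
print(f"lseek calls per newfstat call: {lseek_per_fstat:.2f}")
```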

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-24 Thread Andres Freund
Hi,

On 2017-01-24 18:57:47 +0100, Tobias Oberstein wrote:
> On 24.01.2017 at 18:41, Andres Freund wrote:
> > On 2017-01-24 18:37:14 +0100, Tobias Oberstein wrote:
> > > The syscall overhead is visible in production too .. I watched PG using
> > > perf live, and lseeks regularly appear at

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-24 Thread Tobias Oberstein
Hi,

On 24.01.2017 at 18:41, Andres Freund wrote:
> Hi,
>
> On 2017-01-24 18:37:14 +0100, Tobias Oberstein wrote:
> > > assume that it'd get more than swamped with doing actual work, and with
> > > buffering the frequently accessed stuff in memory.
> > >
> > > > What I am trying to say is: the syscall overhead of doing

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-24 Thread Andres Freund
Hi,

On 2017-01-24 18:37:14 +0100, Tobias Oberstein wrote:
> > assume that it'd get more than swamped with doing actual work, and with
> > buffering the frequently accessed stuff in memory.
> >
> >
> > > What I am trying to say is: the syscall overhead of doing lseek/read/write
> > > instead of

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-24 Thread Tobias Oberstein
Hi,

> > Switching to sync engine, it drops to 9.1 mio - but the system load then
> > is also much higher!
>
> I doubt those have very much to do with postgres - I'd quite strongly

In the machine in production, we see 8kB reads in the 300k-650k/s range. In
spikes, because, yes, due to the 3TB RAM, we

Re: [HACKERS] lseek/read/write overhead becomes visible at scale ..

2017-01-24 Thread Andres Freund
Hi,

On 2017-01-24 18:11:09 +0100, Tobias Oberstein wrote:
> I have done lots of benchmarking over the last days on a massive box, and I
> can provide numbers that I think show that the impact can be significant.
> Above number was using psync FIO engine .. with libaio, it's at 9.7 mio with
>