Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-10 Thread Jim Nasby
Just to add a data point (and sorry, I can't find where someone was talking about numbers in the thread)... For a while earlier this year we were running a 3.x kernel and saw a very modest (1-2%) improvement in overall performance. This would be on a server with 512G RAM running ext4. -- Jim

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-10 Thread Jeff Janes
On Tue, Dec 3, 2013 at 11:39 PM, Claudio Freire klaussfre...@gmail.com wrote: On Wed, Dec 4, 2013 at 4:28 AM, Tatsuo Ishii is...@postgresql.org wrote: Can we avoid the Linux kernel problem by simply increasing our shared buffer size, say up to 80% of memory? It will be swap more easier.

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-10 Thread Andres Freund
On 2013-12-04 05:39:23 -0200, Claudio Freire wrote: Problem is, Postgres relies on a working kernel cache for checkpoints. Checkpoint logic would have to be heavily reworked to account for an impaired kernel cache. I don't think checkpoints are the critical problem with that, they are nicely

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-10 Thread Claudio Freire
On Tue, Dec 10, 2013 at 9:22 PM, Jeff Janes jeff.ja...@gmail.com wrote: Communicating more with the kernel (through posix_fadvise, fallocate, aio, iovec, etc...) would probably be good, but it does expose more kernel issues. posix_fadvise, for instance, is a double-edged sword ATM. I do
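
For readers unfamiliar with the posix_fadvise call mentioned above, here is a minimal sketch of what "communicating more with the kernel" can look like from user space. It is written in Python for brevity and is not taken from the thread or from PostgreSQL; the function name read_with_hints is hypothetical. The hints are advisory only, and how the kernel acts on them varies by version, which is part of why the call is described as double-edged.

```python
import os

def read_with_hints(path, chunk_size=1 << 16):
    """Read a file front to back while passing access-pattern hints to the kernel.

    Illustrative only: the hints are advisory and the kernel may ignore them.
    Returns the number of bytes read.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        has_fadvise = hasattr(os, "posix_fadvise")  # not available on every platform
        if has_fadvise:
            # Offset 0 with length 0 means "the whole file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        total = 0
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
        if has_fadvise:
            # Hint that the cached pages for this file are no longer needed.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        return total
    finally:
        os.close(fd)
```

Calling read_with_hints on a large file returns its size in bytes; whether the page cache actually behaves differently depends on the kernel in use.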

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-10 Thread Tom Lane
Jeff Janes jeff.ja...@gmail.com writes: On Tue, Dec 3, 2013 at 11:39 PM, Claudio Freire klaussfre...@gmail.com wrote: Problem is, Postgres relies on a working kernel cache for checkpoints. Checkpoint logic would have to be heavily reworked to account for an impaired kernel cache. I don't

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-10 Thread Jeff Janes
On Tuesday, December 10, 2013, Tom Lane wrote: Jeff Janes jeff.ja...@gmail.com writes: On Tue, Dec 3, 2013 at 11:39 PM, Claudio Freire klaussfre...@gmail.com wrote: Problem is, Postgres relies on a working kernel cache for checkpoints. Checkpoint logic would

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-10 Thread Claudio Freire
On Tue, Dec 10, 2013 at 11:33 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Tuesday, December 10, 2013, Tom Lane wrote: Jeff Janes jeff.ja...@gmail.com writes: On Tue, Dec 3, 2013 at 11:39 PM, Claudio Freire klaussfre...@gmail.com wrote: Problem is, Postgres relies on a working kernel

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-10 Thread KONDO Mitsumasa
(2013/12/11 10:25), Tom Lane wrote: Jeff Janes jeff.ja...@gmail.com writes: On Tue, Dec 3, 2013 at 11:39 PM, Claudio Freire klaussfre...@gmail.com wrote: Problem is, Postgres relies on a working kernel cache for checkpoints. Checkpoint logic would have to be heavily reworked to account for an

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-08 Thread Jim Nasby
On 12/5/13 9:59 AM, Tom Lane wrote: Greg Stark st...@mit.edu writes: I think the way to use mmap would be to mmap very large chunks, possibly whole tables. We would need some way to control page flushes that doesn't involve splitting mappings and can be efficiently controlled without having the

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-08 Thread KONDO Mitsumasa
(2013/12/05 23:42), Greg Stark wrote: On Thu, Dec 5, 2013 at 8:35 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: Yes. And using something efficiently DirectIO is more difficult than BufferedIO. If we change write() flag with direct IO in PostgreSQL, it will execute hardest ugly

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread KONDO Mitsumasa
(2013/12/04 16:39), Claudio Freire wrote: On Wed, Dec 4, 2013 at 4:28 AM, Tatsuo Ishii is...@postgresql.org wrote: Can we avoid the Linux kernel problem by simply increasing our shared buffer size, say up to 80% of memory? It will be swap more easier. Is that the case? If the system has not

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread Peter Geoghegan
On Wed, Dec 4, 2013 at 11:07 AM, Josh Berkus j...@agliodbs.com wrote: I also wasn't exaggerating the reception I got when I tried to talk about IO and PostgreSQL at LinuxCon and other events. The majority of Linux hackers I've talked to simply don't want to be bothered with PostgreSQL's

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread Stephen Frost
* Peter Geoghegan (p...@heroku.com) wrote: On Wed, Dec 4, 2013 at 11:07 AM, Josh Berkus j...@agliodbs.com wrote: But you know what? 2.6, overall, still performs better than any kernel in the 3.X series, at least for Postgres. What about the fseek() scalability issue? Not to mention that

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread Greg Stark
On Thu, Dec 5, 2013 at 8:35 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: Yes. And using something efficiently DirectIO is more difficult than BufferedIO. If we change write() flag with direct IO in PostgreSQL, it will execute hardest ugly randomIO. Using DirectIO presumes you're

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread Claudio Freire
On Thu, Dec 5, 2013 at 11:42 AM, Greg Stark st...@mit.edu wrote: (b) is the way more interesting research project though. I don't think anyone's tried it and the kernel interface to provide the kinds of information Postgres needs requires a lot of thought. If it's done right then Postgres

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread Greg Stark
On Thu, Dec 5, 2013 at 2:54 PM, Claudio Freire klaussfre...@gmail.com wrote: That's a bad idea in the current state of affairs. MM files haven't been designed for that usage, and getting stable performance out of that will be way too difficult. I'm talking about long-term goals here. Either of

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread Tom Lane
Greg Stark st...@mit.edu writes: I think the way to use mmap would be to mmap very large chunks, possibly whole tables. We would need some way to control page flushes that doesn't involve splitting mappings and can be efficiently controlled without having the kernel storing arbitrarily large
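
As a rough illustration of the idea under discussion (one large mapping, with flushes requested for individual page ranges rather than by splitting the mapping), here is a minimal Python sketch. It is an assumption-laden toy, not a proposal from the thread: the function name write_and_flush_page is hypothetical, and it does not address the harder problem raised here of preventing the kernel from writing a dirty page before the application wants it written.

```python
import mmap

def write_and_flush_page(path, page_index, payload):
    """Modify one page of a file through a whole-file mapping, then flush only that page.

    Assumes the file already exists and is at least (page_index + 1) pages long.
    """
    page = mmap.PAGESIZE
    if len(payload) > page:
        raise ValueError("payload must fit within a single page")
    with open(path, "r+b") as f:
        # Length 0 maps the entire file as a single mapping.
        with mmap.mmap(f.fileno(), 0) as m:
            start = page_index * page
            m[start:start + len(payload)] = payload
            # Ask for write-back of just this page-aligned range (msync on Unix).
            m.flush(start, page)
```

The flush call only forces write-back of the chosen range; it gives no control over when the kernel writes other dirty pages in the mapping on its own.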

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread Josh Berkus
On 12/05/2013 07:40 AM, Greg Stark wrote: On Thu, Dec 5, 2013 at 2:54 PM, Claudio Freire klaussfre...@gmail.com wrote: That's a bad idea in the current state of affairs. MM files haven't been designed for that usage, and getting stable performance out of that will be way too difficult. I'm

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread Josh Berkus
On 12/05/2013 05:48 AM, Stephen Frost wrote: * Peter Geoghegan (p...@heroku.com) wrote: On Wed, Dec 4, 2013 at 11:07 AM, Josh Berkus j...@agliodbs.com wrote: But you know what? 2.6, overall, still performs better than any kernel in the 3.X series, at least for Postgres. What about the

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread Robert Haas
On Thu, Dec 5, 2013 at 12:54 PM, Josh Berkus j...@agliodbs.com wrote: Actually, I've been able to do 35K TPS on commodity hardware on Ubuntu 10.04. I have yet to go about 15K on any Ubuntu running a 3.X Kernel. The CPU scheduling on 2.6 just seems to be far better tuned, aside from the IO

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread Josh Berkus
On 12/05/2013 12:41 PM, Robert Haas wrote: On Thu, Dec 5, 2013 at 12:54 PM, Josh Berkus j...@agliodbs.com wrote: Actually, I've been able to do 35K TPS on commodity hardware on Ubuntu 10.04. I have yet to go about 15K on any Ubuntu running a 3.X Kernel. The CPU scheduling on 2.6 just seems

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread bricklen
On Thu, Dec 5, 2013 at 12:43 PM, Josh Berkus j...@agliodbs.com wrote: On 12/05/2013 12:41 PM, Robert Haas wrote: Do drunks lurch differently in cathedrals than they do elsewhere? Yeah, because they lurch from one column to another. Row by row?

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Heikki Linnakangas
On 12/04/2013 01:08 AM, Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: On Tue, Dec 3, 2013 at 11:44 PM, Josh Berkus j...@agliodbs.com wrote: Would certainly be nice. Realistically, getting good automated performance tests will require paying someone like Greg S., Mark or me for 6

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Peter Eisentraut
On 12/4/13, 2:14 AM, Stefan Kaltenbrunner wrote: running a few kvm instances that get bootstrapped automatically is something that is a solved problem. Is it sound to run performance tests on kvm?

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Jonathan Corbet
On Tue, 03 Dec 2013 10:44:15 -0800 Josh Berkus j...@agliodbs.com wrote: It seems clear that Kernel.org, since 2.6, has been in the business of pushing major, hackish, changes to the IO stack without testing them or even thinking too hard about what the side-effects might be. This is perhaps

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Stefan Kaltenbrunner
On 12/04/2013 04:30 PM, Peter Eisentraut wrote: On 12/4/13, 2:14 AM, Stefan Kaltenbrunner wrote: running a few kvm instances that get bootstrapped automatically is something that is a solved problem. Is it sound to run performance tests on kvm? as sound as on any other platform imho, the

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Stefan Kaltenbrunner
On 12/04/2013 04:33 PM, Jonathan Corbet wrote: On Tue, 03 Dec 2013 10:44:15 -0800 Josh Berkus j...@agliodbs.com wrote: It seems clear that Kernel.org, since 2.6, has been in the business of pushing major, hackish, changes to the IO stack without testing them or even thinking too hard about

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Joshua D. Drake
On 12/04/2013 07:32 AM, Stefan Kaltenbrunner wrote: On 12/04/2013 04:30 PM, Peter Eisentraut wrote: On 12/4/13, 2:14 AM, Stefan Kaltenbrunner wrote: running a few kvm instances that get bootstrapped automatically is something that is a solved problem. Is it sound to run performance tests

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Stefan Kaltenbrunner
On 12/04/2013 07:30 PM, Joshua D. Drake wrote: On 12/04/2013 07:32 AM, Stefan Kaltenbrunner wrote: On 12/04/2013 04:30 PM, Peter Eisentraut wrote: On 12/4/13, 2:14 AM, Stefan Kaltenbrunner wrote: running a few kvm instances that get bootstrapped automatically is something that is a solved

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Josh Berkus
On 12/04/2013 07:33 AM, Jonathan Corbet wrote: Wow, Josh, I'm surprised to hear this from you. Well, I figured it was too angry to propose for an LWN article. ;-) The active/inactive list mechanism works great for the vast majority of users. The second-use algorithm prevents a lot of

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Joshua D. Drake
On 12/04/2013 07:33 AM, Jonathan Corbet wrote: Wow, Josh, I'm surprised to hear this from you. The active/inactive list mechanism works great for the vast majority of users. The second-use algorithm prevents a lot of pathological behavior, like wiping out your entire cache by copying a big

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Jonathan Corbet
On Wed, 04 Dec 2013 11:07:04 -0800 Josh Berkus j...@agliodbs.com wrote: On 12/04/2013 07:33 AM, Jonathan Corbet wrote: Wow, Josh, I'm surprised to hear this from you. Well, I figured it was too angry to propose for an LWN article. ;-) So you're going to make us write it for you :) The

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Magnus Hagander
On Wed, Dec 4, 2013 at 9:31 PM, Jonathan Corbet cor...@lwn.net wrote: I also wasn't exaggerating the reception I got when I tried to talk about IO and PostgreSQL at LinuxCon and other events. The majority of Linux hackers I've talked to simply don't want to be bothered with PostgreSQL's

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Stephen Frost
* Magnus Hagander (mag...@hagander.net) wrote: I think that's an excellent idea. If one of our developers could find the time to attend that, I think that could be very productive. While I'm not on the funds team, I'd definitely vote for funding such participation out of community funds if

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Josh Berkus
Jonathan, For those interested in the details... (1) It's not quite 50/50, that's one bound for how the balance is allowed to go. (2) Anybody trying to add tunables to the kernel tends to run into resistance. Exposing thousands of knobs tends to lead to a situation where you *have* to be an

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Merlin Moncure
On Wed, Dec 4, 2013 at 2:31 PM, Jonathan Corbet cor...@lwn.net wrote: For those interested in the details... (1) It's not quite 50/50, that's one bound for how the balance is allowed to go. (2) Anybody trying to add tunables to the kernel tends to run into resistance. Exposing thousands of

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Jonathan Corbet
On Wed, 04 Dec 2013 13:01:37 -0800 Josh Berkus j...@agliodbs.com wrote: Perhaps even better: the next filesystem, storage, and memory management summit is March 24-25. Link? I can't find anything Googling by that name. I'm pretty sure we can get at least one person there. It looks

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-04 Thread Andres Freund
Hi, On 2013-12-03 10:44:15 -0800, Josh Berkus wrote: I don't know where we'll get the resources to implement our own storage, but it's looking like we don't have a choice. As long as our storage layer is as suboptimal as it is today, I think it's a purely detractory to primarily blame the

[HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Josh Berkus
All, https://lkml.org/lkml/2013/11/24/133 What this means for us: http://citusdata.com/blog/72-linux-memory-manager-and-your-big-data It seems clear that Kernel.org, since 2.6, has been in the business of pushing major, hackish, changes to the IO stack without testing them or even thinking too

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Robert Haas
On Tue, Dec 3, 2013 at 1:44 PM, Josh Berkus j...@agliodbs.com wrote: All, https://lkml.org/lkml/2013/11/24/133 What this means for us: http://citusdata.com/blog/72-linux-memory-manager-and-your-big-data It seems clear that Kernel.org, since 2.6, has been in the business of pushing major,

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Joshua D. Drake
On 12/03/2013 10:44 AM, Josh Berkus wrote: All, https://lkml.org/lkml/2013/11/24/133 What this means for us: http://citusdata.com/blog/72-linux-memory-manager-and-your-big-data It seems clear that Kernel.org, since 2.6, has been in the business of pushing major, hackish, changes to the IO

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Josh Berkus
On 12/03/2013 10:59 AM, Joshua D. Drake wrote: This seems rather half cocked. I read the article. They found a problem, that really will only affect a reasonably small percentage of users, created a test case, reported it, and a patch was produced. Users with at least one file bigger than 50%

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Stefan Kaltenbrunner
On 12/03/2013 08:23 PM, Josh Berkus wrote: On 12/03/2013 10:59 AM, Joshua D. Drake wrote: This seems rather half cocked. I read the article. They found a problem, that really will only affect a reasonably small percentage of users, created a test case, reported it, and a patch was produced.

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Tom Lane
Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes: If we care about our performance on various operating systems it is _OUR_ responsibility to track that closely and automated and report back and only if that feedback loop fails to work we are actually in a real position to consider

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Joshua D. Drake
On 12/03/2013 12:35 PM, Tom Lane wrote: Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes: If we care about our performance on various operating systems it is _OUR_ responsibility to track that closely and automated and report back and only if that feedback loop fails to work we are actually

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Josh Berkus
On 12/03/2013 12:15 PM, Stefan Kaltenbrunner wrote: We are in no way different and I would like to note that we do not have any form of sensible performance related regression testing either. I would even argue that there is ton more regression testing (be it performance or otherwise) going

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Magnus Hagander
On Tue, Dec 3, 2013 at 11:44 PM, Josh Berkus j...@agliodbs.com wrote: On 12/03/2013 12:15 PM, Stefan Kaltenbrunner wrote: We are in no way different and I would like to note that we do not have any form of sensible performance related regression testing either. I would even argue that

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes: On Tue, Dec 3, 2013 at 11:44 PM, Josh Berkus j...@agliodbs.com wrote: Would certainly be nice. Realistically, getting good automated performance tests will require paying someone like Greg S., Mark or me for 6 solid months to develop them, since

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Alvaro Herrera
Magnus Hagander wrote: On Tue, Dec 3, 2013 at 11:44 PM, Josh Berkus j...@agliodbs.com wrote: Would certainly be nice. Realistically, getting good automated performance tests will require paying someone like Greg S., Mark or me for 6 solid months to develop them, since worthwhile open

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Josh Berkus
Magnus, So in order to get *testing* we need to pay somebody. But to build a great database server, we can rely on volunteer efforts or sponsorship from companies who are interested in moving the project forward? That hardly seems right... Either it's just not high enough on people's priority

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Joshua D. Drake
On 12/03/2013 03:02 PM, Magnus Hagander wrote: On Tue, Dec 3, 2013 at 11:44 PM, Josh Berkus j...@agliodbs.com Would certainly be nice. Realistically, getting good automated performance tests will require paying someone like Greg S., Mark or me for 6 solid months to develop them,

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Joshua D. Drake
On 12/03/2013 03:15 PM, Josh Berkus wrote: It's *always* much easier to get money for features than for other things. Earlier this year I was really hoping that our new corporate community members, who seemed to be interested in testing, would put some serious resources behind this. When

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Tatsuo Ishii
Magnus Hagander mag...@hagander.net writes: On Tue, Dec 3, 2013 at 11:44 PM, Josh Berkus j...@agliodbs.com wrote: Would certainly be nice. Realistically, getting good automated performance tests will require paying someone like Greg S., Mark or me for 6 solid months to develop them, since

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread KONDO Mitsumasa
(2013/12/04 11:28), Tatsuo Ishii wrote: Magnus Hagander mag...@hagander.net writes: On Tue, Dec 3, 2013 at 11:44 PM, Josh Berkus j...@agliodbs.com wrote: Would certainly be nice. Realistically, getting good automated performance tests will require paying someone like Greg S., Mark or me for 6

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Peter Eisentraut
On Tue, 2013-12-03 at 14:44 -0800, Josh Berkus wrote: Would certainly be nice. Realistically, getting good automated performance tests will require paying someone like Greg S., Mark or me for 6 solid months to develop them, since worthwhile open source performance test platforms currently

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Stefan Kaltenbrunner
On 12/04/2013 05:40 AM, Peter Eisentraut wrote: On Tue, 2013-12-03 at 14:44 -0800, Josh Berkus wrote: Would certainly be nice. Realistically, getting good automated performance tests will require paying someone like Greg S., Mark or me for 6 solid months to develop them, since worthwhile open

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Tatsuo Ishii
Can we avoid the Linux kernel problem by simply increasing our shared buffer size, say up to 80% of memory? It will be swap more easier. Is that the case? If the system has not enough memory, the kernel buffer will be used for other purpose, and the kernel cache will not work very well anyway.

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Claudio Freire
On Wed, Dec 4, 2013 at 4:28 AM, Tatsuo Ishii is...@postgresql.org wrote: Can we avoid the Linux kernel problem by simply increasing our shared buffer size, say up to 80% of memory? It will be swap more easier. Is that the case? If the system has not enough memory, the kernel buffer will be