Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On 28 February 2011 15:02:30 -0500, Tom Lane wrote:

> Because it's fifty times more mechanism than we need here? We don't
> want a SQL interface (not even a lightweight one) and it's unclear that
> we ever want the data to go to disk at all.

I wonder whether a library like librrd would be a solution for this.

--
Thanks
Bernd

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
Tom Lane writes:
> The ideal solution would likely be for the stats collector to expose its
> data structures as shared memory, but I don't think we get to do that
> under SysV shmem --- it doesn't like variable-size shmem much. Maybe
> that's another argument for looking harder into mmap or POSIX shmem,
> although it's not clear to me how well either of those fixes that.

We could certainly use a message-passing style atop pgpipe.c here, right? After all, we already have a protocol and know how to represent complex data structures in it, and all components of PostgreSQL should be able to leverage this, I'd think.

Or this fever ain't really gone yet :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Mon, Feb 28, 2011 at 2:31 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Mon, Feb 28, 2011 at 1:50 PM, Tom Lane wrote:
>>> Ultimately we need to think of a reporting mechanism that's a bit
>>> smarter than "rewrite the whole file for any update" ...
>
>> Well, we have these things called "tables". Any chance of using those?
>
> Having the stats collector write tables would violate the classical form
> of the heisenberg principle (thou shalt avoid having thy measurement
> tools affect that which is measured), not to mention assorted practical
> problems like not wanting the stats collector to take locks or run
> transactions.
>
> The ideal solution would likely be for the stats collector to expose its
> data structures as shared memory, but I don't think we get to do that
> under SysV shmem --- it doesn't like variable-size shmem much. Maybe
> that's another argument for looking harder into mmap or POSIX shmem,
> although it's not clear to me how well either of those fixes that.

Well, certainly, you could make it work with mmap() - you could arrange a mechanism whereby anyone who tries to reference off the end of the portion they've mapped calls stat() on the file and remaps it at its now-increased size. But you'd need to think carefully about locking and free-space management, which is where it starts to sound an awful lot like you're reinventing the idea of a heap. Maybe there's a way to design some kind of lighter-weight mechanism, but the complexity of the problem is not obviously a lot less than the general problem of storing frequently updated tabular data.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
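A minimal sketch of the remap-on-growth idea described in the message above, written in Python rather than the C a real backend would use. The class name `GrowableMapping` and its methods are hypothetical, invented for illustration; nothing like this exists in PostgreSQL. It assumes a non-empty file that only ever grows, and it deliberately leaves out locking and free-space management, which the message identifies as the hard part.

```python
import mmap
import os


class GrowableMapping:
    """Read-only view of a file that remaps itself when the file has grown.

    Hypothetical illustration only: assumes the file is non-empty and
    append-only, and does no locking at all.
    """

    def __init__(self, path):
        self._fd = os.open(path, os.O_RDONLY)
        self._map = None
        self._remap()

    def _remap(self):
        # Ask the OS for the file's current size and map exactly that much.
        size = os.fstat(self._fd).st_size
        if self._map is not None:
            self._map.close()
        self._map = mmap.mmap(self._fd, size, access=mmap.ACCESS_READ)

    def read(self, offset, length):
        # A reference past the end of the mapped portion triggers a remap.
        if offset + length > len(self._map):
            self._remap()
        # If the file still is not large enough, the reference is invalid.
        if offset + length > len(self._map):
            raise IndexError("read past end of file")
        return self._map[offset:offset + length]

    def close(self):
        self._map.close()
        os.close(self._fd)
```

A reader that holds a `GrowableMapping` pays for the `fstat()` call only when it steps past what it has already mapped. A writer that rewrites existing bytes in place would still need some form of synchronization with readers, which this sketch does not attempt.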
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
Euler Taveira de Oliveira writes:
> On 28-02-2011 15:50, Tom Lane wrote:
>> Ultimately we need to think of a reporting mechanism that's a bit
>> smarter than "rewrite the whole file for any update" ...

> What about splitting statistic file per database?

That would improve matters for some usage patterns, but I'm afraid only a minority.

regards, tom lane
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
"Joshua D. Drake" writes: > On Mon, 2011-02-28 at 11:39 -0800, Josh Berkus wrote: > Spitballing here, but could sqlite be an intermediate, compromise solution? >> >> For a core PostgreSQL component ?!?!? > Sure, why not? Because it's fifty times more mechanism than we need here? We don't want a SQL interface (not even a lightweight one) and it's unclear that we ever want the data to go to disk at all. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On 28-02-2011 15:50, Tom Lane wrote:
> Ultimately we need to think of a reporting mechanism that's a bit
> smarter than "rewrite the whole file for any update" ...

What about splitting the statistics file per database?

--
Euler Taveira de Oliveira
http://www.timbira.com/
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Mon, 2011-02-28 at 11:39 -0800, Josh Berkus wrote:
> > Spitballing here, but could sqlite be an intermediate, compromise solution?
>
> For a core PostgreSQL component ?!?!?

Sure, why not? It is ACID compliant, has the right kind of license, and has a standard API that we are all used to. It seems like a pretty decent solution to consider. We don't need MVCC for this problem.

JD

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
> Spitballing here, but could sqlite be an intermediate, compromise solution?

For a core PostgreSQL component ?!?!?

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Feb 28, 2011, at 14:31, Tom Lane wrote:
> Robert Haas writes:
>> On Mon, Feb 28, 2011 at 1:50 PM, Tom Lane wrote:
>>> Ultimately we need to think of a reporting mechanism that's a bit
>>> smarter than "rewrite the whole file for any update" ...
>
>> Well, we have these things called "tables". Any chance of using those?
>
> Having the stats collector write tables would violate the classical form
> of the heisenberg principle (thou shalt avoid having thy measurement
> tools affect that which is measured), not to mention assorted practical
> problems like not wanting the stats collector to take locks or run
> transactions.
>
> The ideal solution would likely be for the stats collector to expose its
> data structures as shared memory, but I don't think we get to do that
> under SysV shmem --- it doesn't like variable-size shmem much. Maybe
> that's another argument for looking harder into mmap or POSIX shmem,
> although it's not clear to me how well either of those fixes that.

Spitballing here, but could sqlite be an intermediate, compromise solution?

Michael Glaesemann
grzm seespotcode net
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
Robert Haas writes:
> On Mon, Feb 28, 2011 at 1:50 PM, Tom Lane wrote:
>> Ultimately we need to think of a reporting mechanism that's a bit
>> smarter than "rewrite the whole file for any update" ...

> Well, we have these things called "tables". Any chance of using those?

Having the stats collector write tables would violate the classical form of the heisenberg principle (thou shalt avoid having thy measurement tools affect that which is measured), not to mention assorted practical problems like not wanting the stats collector to take locks or run transactions.

The ideal solution would likely be for the stats collector to expose its data structures as shared memory, but I don't think we get to do that under SysV shmem --- it doesn't like variable-size shmem much. Maybe that's another argument for looking harder into mmap or POSIX shmem, although it's not clear to me how well either of those fixes that.

regards, tom lane
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Mon, Feb 28, 2011 at 1:50 PM, Tom Lane wrote:
> Josh Berkus writes:
>> On 2/28/11 10:24 AM, Robert Haas wrote:
>>> On Mon, Feb 28, 2011 at 1:04 PM, Josh Berkus wrote:
>>>> On the other hand, anything which increases the size of pg_statistic
>>>> would be a nightmare.
>
>>> Hmm?
>
>> Like replacing each statistic with a series of time-based buckets, which
>> would then increase the size of the table by 5X to 10X. That was the
>> first solution I thought of, and rejected.
>
> I think Josh is thinking of the stats collector's dump file, not
> pg_statistic.

Yeah.

> Ultimately we need to think of a reporting mechanism that's a bit
> smarter than "rewrite the whole file for any update" ...

Well, we have these things called "tables". Any chance of using those?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
Josh Berkus writes:
> On 2/28/11 10:24 AM, Robert Haas wrote:
>> On Mon, Feb 28, 2011 at 1:04 PM, Josh Berkus wrote:
>>> On the other hand, anything which increases the size of pg_statistic
>>> would be a nightmare.

>> Hmm?

> Like replacing each statistic with a series of time-based buckets, which
> would then increase the size of the table by 5X to 10X. That was the
> first solution I thought of, and rejected.

I think Josh is thinking of the stats collector's dump file, not pg_statistic. Ultimately we need to think of a reporting mechanism that's a bit smarter than "rewrite the whole file for any update" ...

regards, tom lane
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On 2/28/11 10:24 AM, Robert Haas wrote:
> On Mon, Feb 28, 2011 at 1:04 PM, Josh Berkus wrote:
>> On the other hand, anything which increases the size of pg_statistic
>> would be a nightmare.
>
> Hmm?

Like replacing each statistic with a series of time-based buckets, which would then increase the size of the table by 5X to 10X. That was the first solution I thought of, and rejected.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Mon, Feb 28, 2011 at 1:04 PM, Josh Berkus wrote:
> On the other hand, anything which increases the size of pg_statistic
> would be a nightmare.

Hmm?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
On Mon, Feb 28, 2011 at 10:04:54AM -0800, Josh Berkus wrote:
> Take, for example, a problem I was recently grappling with for Nagios.
> I'd like to do a check as to whether or not tables are getting
> autoanalyzed often enough. After all, autovac can fall behind, and we'd
> want to be alerted of that.
>
> The problem is, in order to measure whether or not autoanalyze is
> behind, you need to count how many inserts,updates,deletes have happened
> since the last autoanalyze. pg_stat_user_tables just gives us the
> counters since the last reset ... and the reset time isn't even stored
> in PostgreSQL.

The solution I use for that is to use munin to monitor everything and let it generate alerts based on the levels. It's not great, but better than nothing.

The problem, as you say, is that you want to know the rates rather than the absolute values. The problem with rates is that you can get wildly different results depending on the time interval you're looking at.

For the concrete example above, autoanalyse has to be able to determine if there is work to do, so the information must be somewhere. I'm guessing it's not easily available? If you had a function is_autovacuumcandidate you'd be done, of course. But there are of course lots of stats people want; it's just not clear how to get them.

What you really need is to store the stats every few minutes, but that's what munin does. I doubt it's worth building RRD-like capabilities into postgres.

Have a nice day,
--
Martijn van Oosterhout http://svana.org/kleptog/
> Patriotism is when love of your own people comes first; nationalism,
> when hate for people other than your own comes first.
> - Charles de Gaulle
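To make the "is there work to do" check from the message above concrete, here is a rough Python sketch of the documented autoanalyze trigger rule: a table is due once the rows changed since its last analyze exceed autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * reltuples. The function name `is_autoanalyze_candidate` is hypothetical (no such function exists in PostgreSQL), the defaults shown are the stock settings of 50 and 0.1, and the caller is assumed to already have the changes-since-last-analyze count, which is exactly the number the thread says is hard to get from outside.

```python
def is_autoanalyze_candidate(changes_since_analyze, reltuples,
                             base_threshold=50, scale_factor=0.1):
    """Return True if a table is due for autoanalyze.

    Hypothetical helper, not a PostgreSQL function.
    changes_since_analyze: rows inserted, updated, or deleted since the
        last analyze (assumed to be known to the caller).
    reltuples: estimated number of rows in the table.
    base_threshold / scale_factor: stand-ins for
        autovacuum_analyze_threshold and autovacuum_analyze_scale_factor.
    """
    threshold = base_threshold + scale_factor * reltuples
    return changes_since_analyze > threshold
```

For a table with an estimated 1000 rows the threshold under the default settings is 150 changed rows, so `is_autoanalyze_candidate(200, 1000)` is true and `is_autoanalyze_candidate(100, 1000)` is false.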
Re: [HACKERS] Why our counters need to be time-based WAS: WIP: cross column correlation ...
> Well, what we have now is a bunch of counters in pg_stat_all_tables
> and pg_statio_all_tables.

Right. What I'm saying is those aren't good enough, and have never been good enough. Counters without a time basis are pretty much useless for performance monitoring/management (Baron Schwartz has a blog post talking about this, but I can't find it right now).

Take, for example, a problem I was recently grappling with for Nagios. I'd like to do a check as to whether or not tables are getting autoanalyzed often enough. After all, autovac can fall behind, and we'd want to be alerted of that.

The problem is, in order to measure whether or not autoanalyze is behind, you need to count how many inserts, updates, and deletes have happened since the last autoanalyze. pg_stat_user_tables just gives us the counters since the last reset ... and the reset time isn't even stored in PostgreSQL.

This means that, without adding external tools like pg_statsinfo, we can't autotune autoanalyze at all.

There are quite a few other examples where the counters could contribute to autotuning and DBA performance monitoring if only they were time-based. As it is, they're useful for finding unused indexes and that's about it.

One possibility, of course, would be to take pg_statsinfo and make it part of core. There are a couple of disadvantages to that: (1) is the storage and extra objects required, which would then require us to add extra management routines as well. (2) is that pg_statsinfo only stores top-level view history, meaning that it wouldn't be very adaptable to improvements we make in system views in the future.

On the other hand, anything which increases the size of pg_statistic would be a nightmare.

One possible compromise solution might be to implement code for the stats collector to automatically reset the stats at a given clock interval. If we combined this with keeping the reset time, and keeping a snapshot of the stats from the last clock tick (and their reset time), that would be "good enough" for most monitoring.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
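A rough Python sketch of the compromise proposed above: counters that are reset on a clock tick, with the reset time kept and one snapshot retained from the previous tick. The class `TickingCounters` and its methods are hypothetical, invented here to illustrate the idea; this is not how the stats collector works. The counter name `n_tup_ins` is borrowed from pg_stat_user_tables only as a familiar example, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
class TickingCounters:
    """Counters reset at each clock tick, keeping the last tick's snapshot.

    Hypothetical illustration of the proposal, not PostgreSQL code.
    """

    def __init__(self, now):
        self.reset_time = now   # when the live counters were last zeroed
        self.counters = {}      # live counters since reset_time
        self.last_tick = None   # (reset_time, end_time, counters) of previous interval

    def bump(self, name, n=1):
        # Increment a live counter.
        self.counters[name] = self.counters.get(name, 0) + n

    def tick(self, now):
        # Keep the finished interval as the snapshot, then start a fresh one.
        self.last_tick = (self.reset_time, now, self.counters)
        self.counters = {}
        self.reset_time = now

    def last_rates(self):
        # Per-second rates over the most recently completed interval.
        if self.last_tick is None:
            return {}
        start, end, counters = self.last_tick
        elapsed = end - start
        if elapsed <= 0:
            return {}
        return {name: value / elapsed for name, value in counters.items()}
```

Because each snapshot carries both its reset time and its end time, a monitoring check can turn raw counts into rates without any external history store, at the cost of keeping only one interval of history.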