Re: [HACKERS] pg_stat_*_columns?

2015-07-06 Thread Joel Jacobson
On Mon, Jun 29, 2015 at 11:14 PM, Jim Nasby jim.na...@bluetreble.com wrote: What might be interesting is setting things up so the collector simply inserted into history tables every X seconds and then had a separate process to prune that data. The big problem with that is I see no way for

Re: [HACKERS] pg_stat_*_columns?

2015-07-06 Thread Andres Freund
On 2015-06-29 16:14:34 -0500, Jim Nasby wrote: What might be interesting is setting things up so the collector simply inserted into history tables every X seconds and then had a separate process to prune that data. The big problem with that is I see no way for that to easily allow access to

Re: [HACKERS] pg_stat_*_columns?

2015-07-06 Thread Tom Lane
Joel Jacobson j...@trustly.com writes: On Mon, Jun 29, 2015 at 11:14 PM, Jim Nasby jim.na...@bluetreble.com wrote: What might be interesting is setting things up so the collector simply inserted into history tables every X seconds and then had a separate process to prune that data. The big

Re: [HACKERS] pg_stat_*_columns?

2015-06-29 Thread Jim Nasby
On 6/26/15 6:09 PM, Joel Jacobson wrote: Can't we just use the infrastructure of PostgreSQL to handle the few megabytes of data we are talking about here? Why not just store the data in a regular table? Why bother with special files and special data structures? If it's just a table we want to

Re: [HACKERS] pg_stat_*_columns?

2015-06-26 Thread Tomas Vondra
On 06/27/2015 12:30 AM, Jim Nasby wrote: On 6/24/15 6:41 PM, Tomas Vondra wrote: Were the stories (or the experience which lead to the stories) on 9.3 or later? Do they have a good way to reproduce it for testing purposes? The per-db split can only improve things if there actually are

Re: [HACKERS] pg_stat_*_columns?

2015-06-26 Thread Jim Nasby
On 6/24/15 6:41 PM, Tomas Vondra wrote: Were the stories (or the experience which lead to the stories) on 9.3 or later? Do they have a good way to reproduce it for testing purposes? The per-db split can only improve things if there actually are multiple databases, and if the objects are

Re: [HACKERS] pg_stat_*_columns?

2015-06-24 Thread Andres Freund
On 2015-06-23 16:32:54 -0400, Robert Haas wrote: On Tue, Jun 23, 2015 at 3:47 PM, Andres Freund and...@anarazel.de wrote: On 2015-06-22 21:05:52 -0400, Robert Haas wrote: Interesting idea. We could consider the set of stats files a database unto itself and reserve a low-numbered OID for

Re: [HACKERS] pg_stat_*_columns?

2015-06-24 Thread Tomas Vondra
On 06/24/2015 07:54 PM, Jeff Janes wrote: Were the stories (or the experience which lead to the stories) on 9.3 or later? Do they have a good way to reproduce it for testing purposes? The per-db split can only improve things if there actually are multiple databases, and if the objects are

Re: [HACKERS] pg_stat_*_columns?

2015-06-24 Thread Jeff Janes
On Sat, Jun 20, 2015 at 7:20 AM, Robert Haas robertmh...@gmail.com wrote: On Sat, Jun 20, 2015 at 10:12 AM, Joel Jacobson j...@trustly.com wrote: Is there any chance the project would accept a patch which adds the pg_stat_*_columns-feature without first optimizing the collector? I doubt

Re: [HACKERS] pg_stat_*_columns?

2015-06-24 Thread Jeff Janes
On Mon, Jun 15, 2015 at 7:47 PM, Jim Nasby jim.na...@bluetreble.com wrote: On 6/8/15 3:26 PM, Joel Jacobson wrote: So I've heard from Magnus Hagander today IRL at our Stockholm PostgreSQL User Group meeting where we discussed this idea. He told me the overhead in the statistics collector is

Re: [HACKERS] pg_stat_*_columns?

2015-06-24 Thread Robert Haas
On Wed, Jun 24, 2015 at 2:03 PM, Jeff Janes jeff.ja...@gmail.com wrote: On Sat, Jun 20, 2015 at 7:20 AM, Robert Haas robertmh...@gmail.com wrote: On Sat, Jun 20, 2015 at 10:12 AM, Joel Jacobson j...@trustly.com wrote: Is there any chance the project would accept a patch which adds the

Re: [HACKERS] pg_stat_*_columns?

2015-06-23 Thread Magnus Hagander
On Tue, Jun 23, 2015 at 3:01 AM, Robert Haas robertmh...@gmail.com wrote: On Sun, Jun 21, 2015 at 11:43 AM, Magnus Hagander mag...@hagander.net wrote: On Sat, Jun 20, 2015 at 11:55 PM, Robert Haas robertmh...@gmail.com wrote: On Sat, Jun 20, 2015 at 7:01 PM, Tom Lane t...@sss.pgh.pa.us

Re: [HACKERS] pg_stat_*_columns?

2015-06-23 Thread Andres Freund
On 2015-06-22 21:05:52 -0400, Robert Haas wrote: Interesting idea. We could consider the set of stats files a database unto itself and reserve a low-numbered OID for it. The obvious thing to do is use the database's OID as the relfilenode, but then how do you replace the stats file? The

Re: [HACKERS] pg_stat_*_columns?

2015-06-23 Thread Jim Nasby
On 6/22/15 8:05 PM, Robert Haas wrote: In totally different crazy way we could just use the existing buffer manager we have and simply put the stats file in shared_buffers. Inventing a per-database relfilenode that doesn't conflict doesn't seem impossible. With some care it shouldn't be hard to

Re: [HACKERS] pg_stat_*_columns?

2015-06-23 Thread Robert Haas
On Tue, Jun 23, 2015 at 3:47 PM, Andres Freund and...@anarazel.de wrote: On 2015-06-22 21:05:52 -0400, Robert Haas wrote: Interesting idea. We could consider the set of stats files a database unto itself and reserve a low-numbered OID for it. The obvious thing to do is use the database's OID

Re: [HACKERS] pg_stat_*_columns?

2015-06-22 Thread Robert Haas
On Sun, Jun 21, 2015 at 12:52 PM, Andres Freund and...@anarazel.de wrote: On 2015-06-21 12:40:50 -0400, Tom Lane wrote: Andres Freund and...@anarazel.de writes: We could also just mmap() the stats file into memory in various processes. With a bit care it should be quite possible to only mmap

Re: [HACKERS] pg_stat_*_columns?

2015-06-22 Thread Robert Haas
On Sun, Jun 21, 2015 at 11:43 AM, Magnus Hagander mag...@hagander.net wrote: On Sat, Jun 20, 2015 at 11:55 PM, Robert Haas robertmh...@gmail.com wrote: On Sat, Jun 20, 2015 at 7:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: But if the structure got too big to map (on a 32-bit system), then you'd

Re: [HACKERS] pg_stat_*_columns?

2015-06-21 Thread Andres Freund
On 2015-06-20 10:55:03 -0400, Tom Lane wrote: I dunno that tweaking the format would accomplish much. Where I'd love to get to is to not have to write the data to disk at all (except at shutdown). But that seems to require an adjustable-size shared memory block, and I'm not sure how to do

Re: [HACKERS] pg_stat_*_columns?

2015-06-21 Thread Magnus Hagander
On Sat, Jun 20, 2015 at 11:55 PM, Robert Haas robertmh...@gmail.com wrote: On Sat, Jun 20, 2015 at 7:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: But if the structure got too big to map (on a 32-bit system), then you'd be sort of hosed, because there's no way to attach just part of it. That

Re: [HACKERS] pg_stat_*_columns?

2015-06-21 Thread Tom Lane
Andres Freund and...@anarazel.de writes: On 2015-06-20 10:55:03 -0400, Tom Lane wrote: I dunno that tweaking the format would accomplish much. Where I'd love to get to is to not have to write the data to disk at all (except at shutdown). But that seems to require an adjustable-size shared

Re: [HACKERS] pg_stat_*_columns?

2015-06-21 Thread Andres Freund
On 2015-06-21 12:40:50 -0400, Tom Lane wrote: Andres Freund and...@anarazel.de writes: We could also just mmap() the stats file into memory in various processes. With a bit care it should be quite possible to only mmap a subsets of the file at once, taking care of the address space issues.

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Sat, Jun 20, 2015 at 10:12 AM, Joel Jacobson j...@trustly.com wrote: I guess it primarily depends on how much of the new code that would need to be rewritten, if the collector is optimized/rewritten in the future? I don't think that's really the

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Magnus Hagander
On Sat, Jun 20, 2015 at 10:55 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, Jun 20, 2015 at 10:12 AM, Joel Jacobson j...@trustly.com wrote: I guess it primarily depends on how much of the new code that would need to be rewritten, if the

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Robert Haas
On Sat, Jun 20, 2015 at 10:12 AM, Joel Jacobson j...@trustly.com wrote: Is there any chance the project would accept a patch which adds the pg_stat_*_columns-feature without first optimizing the collector? I doubt it. It's such a pain point already that massively increasing the amount of data

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes: On Sat, Jun 20, 2015 at 10:55 AM, Tom Lane t...@sss.pgh.pa.us wrote: I dunno that tweaking the format would accomplish much. Where I'd love to get to is to not have to write the data to disk at all (except at shutdown). But that seems to require an

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Joel Jacobson
On Tue, Jun 16, 2015 at 4:47 AM, Jim Nasby jim.na...@bluetreble.com wrote: Magnus idea was to first optimize the collector to make it less of a problem to collect more data. Sounds like a good thing to do, but maybe more data in it wouldn't be a problem as long as you don't read too often

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Heikki Linnakangas
On 06/20/2015 11:32 AM, Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: On Sat, Jun 20, 2015 at 10:55 AM, Tom Lane t...@sss.pgh.pa.us wrote: I dunno that tweaking the format would accomplish much. Where I'd love to get to is to not have to write the data to disk at all (except at

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Robert Haas
On Sat, Jun 20, 2015 at 7:01 PM, Tom Lane t...@sss.pgh.pa.us wrote: If we arranged things so that the processes could use the data in the DSM directly rather than having to copy it out, we'd presumably save quite a bit of memory, since the whole structure would be shared rather than each

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Robert Haas
On Sat, Jun 20, 2015 at 6:15 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: One idea would be to advertise a DSM ID in the main shared memory segment, and have the individual backends read that value and attach to it. When new stats are generated, the stats

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Alvaro Herrera
Robert Haas wrote: If we arranged things so that the processes could use the data in the DSM directly rather than having to copy it out, we'd presumably save quite a bit of memory, since the whole structure would be shared rather than each backend having its own copy. But if the structure

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Robert Haas
On Sat, Jun 20, 2015 at 11:15 AM, Magnus Hagander mag...@hagander.net wrote: I've considered both that and to perhaps use a shared memory message queue to communicate. Basically, have a backend send a request when it needs a snapshot of the stats data and get a copy back through that method

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: One idea would be to advertise a DSM ID in the main shared memory segment, and have the individual backends read that value and attach to it. When new stats are generated, the stats collector creates a new DSM (which might be bigger or smaller than

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Tomas Vondra
Hi, On 06/21/2015 12:15 AM, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: One idea would be to advertise a DSM ID in the main shared memory segment, and have the individual backends read that value and attach to it. When new stats are generated, the stats collector creates a new

Re: [HACKERS] pg_stat_*_columns?

2015-06-20 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Sat, Jun 20, 2015 at 6:15 PM, Tom Lane t...@sss.pgh.pa.us wrote: Hmmm. This sounds attractive, but what happens if we fail to create a new DSM when needed? Presumably you keep using the old one and retry later. I mean, out of memory is out of

Re: [HACKERS] pg_stat_*_columns?

2015-06-16 Thread Jim Nasby
On 6/8/15 3:26 PM, Joel Jacobson wrote: So I've heard from Magnus Hagander today IRL at our Stockholm PostgreSQL User Group meeting where we discussed this idea. He told me the overhead in the statistics collector is mainly when reading from it, not that much when writing to it. I've heard

Re: [HACKERS] pg_stat_*_columns?

2015-06-08 Thread Joel Jacobson
So I've heard from Magnus Hagander today IRL at our Stockholm PostgreSQL User Group meeting where we discussed this idea. He told me the overhead in the statistics collector is mainly when reading from it, not that much when writing to it. Magnus idea was to first optimize the collector to make it

Re: [HACKERS] pg_stat_*_columns?

2015-06-08 Thread Robert Haas
On Fri, Jun 5, 2015 at 7:51 AM, Joel Jacobson j...@trustly.com wrote: Would others find it useful to see per column statistics about accesses to specific columns? A probably serious and maybe not entirely obvious problem with this is that it would increase the amount of statistical information

[HACKERS] pg_stat_*_columns?

2015-06-05 Thread Joel Jacobson
Would others find it useful to see per column statistics about accesses to specific columns? Two possible use-cases: (maybe there are more?) 1. I think it would be helpful for DBAs to better understand their own system. Finding unused *tables* is today easy thanks to pg_stat_*_tables, but