Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-10 Thread Jan Wieck
On 11/8/2004 5:32 PM, Tom Lane wrote:
Another relevant question is why you are expecting to get this
information through pgstats and not by looking in the postmaster log.
The pgstats were originally designed to give hints for tuning. That's 
why they cover cache hits vs. misses per table and numbers that can be 
used to point out missing as well as obsolete indexes. That was what led 
to the design of the pgstats file, the UDP communication and those fixed 
sizes. The goal was to let it have as little impact on the server 
performance as possible. The whole current query stuff was added later 
on request.

In my opinion it is quite pointless to attempt to transmit the last byte 
of every single query sent to the backend, when all you can get out of 
that view is a random query every 500 milliseconds.

Jan
--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-10 Thread Greg Stark

Simon Riggs [EMAIL PROTECTED] writes:

 My intention was towards a data warehouse situation, and my comments are
 only relevant in that context. Possibly 25+% of the user base use this
 style of processing. In that case, I expect queries to run for minutes
 or hours.

I come from the opposite angle but have also ended up with the same
conclusion. 

In an OLTP environment you can't be trying to save every single SQL query in
the log file. And saving only queries that take longer than some arbitrary
amount of time might not be capturing enough to give a good picture of what's
going on.

I like the idea of a stats daemon that's isolated from the server by something
like UDP and keeps statistics. It would let me turn off logging while still
being able to peek into what queries are running, which take the longest,
which are being executed the most often, and which are taking the most
cumulative time (which isn't necessarily the same thing as either of the other
two).

The idea of tracking cache misses is great, though in the current design a
postgres buffer cache miss doesn't necessarily mean a cache miss. If Postgres
moves to O_DIRECT then it would be a valuable statistic, or if instrumentation
to test for timing of cache hits and misses is added then it could be a good
statistic to have.

I can say that with Oracle it was *incredibly* useful to have the queries
being executed and cached queryable in the cache. The ora_explain tool that
comes with DBD::Oracle makes it extremely easy to identify queries consuming
resources, experiment with rewrites, and then copy the new query into the
application.

It would be great to have something equivalent for Postgres. It would be
extremely kludgy by comparison to have to dig through the log files for
queries. Much better would be to have an interface to access the data pgstats
gathers. But that only works if the entire query is there.

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-09 Thread Andreas Pflug
Josh Berkus wrote:
Tom,

Another relevant question is why you are expecting to get this
information through pgstats and not by looking in the postmaster log.
I don't know about you, but I don't have any tools that are designed to
cope nicely with looking at tables that have columns that might be many
K wide.  Looking in the log seems a much nicer way of examining the full
text of extremely long queries.  So I think it's actually a good thing
that pgstats truncates the queries at some reasonable width.

Because pg_stat_activity can be queried dynamically, and the log can't. 
I've been planning to post a lengthy mail after 8.0 release, but it 
seems a good idea to do it now.

When comparing pgsql to MSSQL in practice, I encounter a similar problem 
as Josh. I got a server hammered by countless queries, some of them not 
too well constructed and thus soaking CPU from all users. On MSSQL, I'd 
be using the Profiler, which lets me tap one or more connections, and 
log whatever I think is important to trace down the problem. This lets 
me filter out those uninteresting 99.9 % of queries which would make my 
log unreadable. Additionally, some performance measures are recorded for 
each query, enabling me to spot the bad guys, analyze and improve them.

On pgsql, all logging goes unstructured into one file, I even can't 
start and stop a new log on demand on my observation period (somebody 
refused to implement a manual log rotation function, nobody needs 
that...) On a server addressed by 100 users, with several dozens of 
queries fired every second, it's hard work to locate the offending query.

It appears to me that simple increasing the max query length won't do 
the deal (and 16k would not be enough). What I'd like to see is the 
possibility to tap one or more backends (this is superuser only, of 
course), and put them in a logging mode, which will record the complete 
query including performance counters to some process in a lossless way. 
When I say tapping I mean that the backend configuration switch is *not* 
set by the very same backend, but from a different superuser backend.

Regards,
Andreas
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-08 Thread Gaetano Mendola
Tom Lane wrote:
 Greg Stark [EMAIL PROTECTED] writes:

What do you think is broken about fragmented UDP packets?


Fragmentation happens at the IP protocol level, the kernel is responsible for
reassembly. There's nothing for the application level to handle.


 And, by the same token, on platforms where it is broken there is nothing
 we can do about it.
Like what? If the OS can not handle UDP reassembly then we have some other 
problems
around
I think the OS breakage is a non issue here.

Regards
Gaetano Mendola
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-08 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes:

 Greg Stark [EMAIL PROTECTED] writes:
  What do you think is broken about fragmented UDP packets?
 
  Fragmentation happens at the IP protocol level, the kernel is responsible 
  for
  reassembly. There's nothing for the application level to handle.
 
 And, by the same token, on platforms where it is broken there is nothing
 we can do about it.

Ok having read the source I think I see why we're talking past each other.

The pgstat messages are indeed fixed size. So bumping the size up to 8k would
mean *every* udp packet would be bumped up to 8k.

However there's no good reason for that to be the case. recv(2) always returns
exactly one packet and tells you how large it was. And in fact the
PgStat_MsgHdr even has a redundant message size field that could serve the
same purpose. So we actually have all the machinery needed twice over to avoid
the fixed size messages.

In fact looking over the code I think it would be simple to change this. I
think it would be reasonable to make just the PgStat_MsgActivity variable
sized. I'll look at it a bit more, I think it's well worth the slight code
complexity, especially if it were only done for the one message type.

This has potential to reduce the average size of these messages quite a lot.
Potentially reducing the data being pumped through udp and the pipe buffer
(?!) quite a lot.

-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-08 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 The pgstat messages are indeed fixed size.

No, there's a fixed maximum size.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-08 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes:

 Greg Stark [EMAIL PROTECTED] writes:
  The pgstat messages are indeed fixed size.
 
 No, there's a fixed maximum size.

Hm. *rereads source*

It's true, pgstat_report_activity only sends the actual size of the query, not
the full payload size.

The only problem I see in raising the size of PGSTAT_MSG_PAYLOAD is that it
also governs the size of PGSTAT_NUM_TABPURGE and PGSTAT_NUM_TABENTRIES.
There's no need to grow those arrays and risk losing them. But these message
sizes could just be left based on the 1k value while boosting the maximum size
of PGSTAT_ACTIVITY_SIZE.

That would have no downside and only benefits. The worst case is that a
machine that didn't handle UDP fragment reassembly would drop the packets that
postgres is currently dropping preemptively. Shorter queries and other packets
would be unaffected.

-- 
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-08 Thread Greg Stark
Greg Stark [EMAIL PROTECTED] writes:

 The only problem I see in raising the size of PGSTAT_MSG_PAYLOAD is that it
 also governs the size of PGSTAT_NUM_TABPURGE and PGSTAT_NUM_TABENTRIES.
 There's no need to grow those arrays and risk losing them. But these message
 sizes could just be left based on the 1k value while boosting the maximum size
 of PGSTAT_ACTIVITY_SIZE.

Just to be clear, I'm talking about something as simple as this:
(Haven't finished compiling it yet)

--- pgstat.h.~1.26.~2004-08-29 00:13:03.0 -0400
+++ pgstat.h2004-11-08 17:17:17.0 -0500
@@ -57,12 +57,13 @@
 } PgStat_MsgHdr;
 
 /* --
- * Space available in a message.  This will keep the UDP packets below 1K,
- * which should fit unfragmented into the MTU of the lo interface on most
- * platforms. Does anybody care for platforms where it doesn't?
+ * Space used by a message ideally and maximum space used. We try to not to go
+ * over 1k unless necessary to avoid UDP packets that don't fit into the MTU
+ * of the loopback interface on very old systems and need to be fragmented.
  * --
  */
-#define PGSTAT_MSG_PAYLOAD (1000 - sizeof(PgStat_MsgHdr))
+#define PGSTAT_MSG_PAYLOAD_IDEAL   (1000  - sizeof(PgStat_MsgHdr))
+#define PGSTAT_MSG_PAYLOAD (32740 - sizeof(PgStat_MsgHdr))
 
 /* --
  * PgStat_TableEntry   Per-table info in a MsgTabstat
@@ -131,7 +132,7 @@
  * and buffer 
access statistics.
  * --
  */
-#define PGSTAT_NUM_TABENTRIES  ((PGSTAT_MSG_PAYLOAD - 3 * sizeof(int)) 
\
+#define PGSTAT_NUM_TABENTRIES  ((PGSTAT_MSG_PAYLOAD_IDEAL - 3 * sizeof(int))   
\
/ 
sizeof(PgStat_TableEntry))
 
 typedef struct PgStat_MsgTabstat
@@ -148,7 +149,7 @@
  * about dead 
tables.
  * --
  */
-#define PGSTAT_NUM_TABPURGE((PGSTAT_MSG_PAYLOAD - sizeof(int)) 
\
+#define PGSTAT_NUM_TABPURGE((PGSTAT_MSG_PAYLOAD_IDEAL - 
sizeof(int))   \
/ sizeof(Oid))
 
 typedef struct PgStat_MsgTabpurge

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-08 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 That would have no downside and only benefits. The worst case is that a
 machine that didn't handle UDP fragment reassembly would drop the packets that
 postgres is currently dropping preemptively.

Huh?  We're not dropping the query *entirely*, which is what I would
expect to happen if the kernel doesn't want to deal with UDP packet
fragmentation.

However, after rereading the RFCs I think this discussion may be based
on false premises.  In a network stack designed per the RFCs, both TCP
and UDP use the same IP-level fragmentation logic, and so it's unlikely
that there would be no fragmentation support at all.  It's really a
performance issue: do you want to pay the penalty associated with
reassembling messages that exceed the loopback MTU, and do you want to
risk the possibility that the kernel will drop stuff on the floor rather
than fragment or reassemble it?  Remember that UDP is non-guaranteed
delivery, and the cases you are most interested in are likely to be
exactly the same cases where the kernel is under stress and may decide
to shed load that way.

BTW, although the transmitted packets might not be fixed-size, the
per-backend entries written to the stats file are.  Cranking
PGSTAT_ACTIVITY_SIZE up to the moon without loss of performance
will take more than just changing one #define.

Another relevant question is why you are expecting to get this
information through pgstats and not by looking in the postmaster log.
I don't know about you, but I don't have any tools that are designed to
cope nicely with looking at tables that have columns that might be many
K wide.  Looking in the log seems a much nicer way of examining the full
text of extremely long queries.  So I think it's actually a good thing
that pgstats truncates the queries at some reasonable width.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-08 Thread Oliver Jowett
Tom Lane wrote:
It's really a
performance issue: do you want to pay the penalty associated with
reassembling messages that exceed the loopback MTU [...]
BTW, the loopback MTU here is quite large:
[EMAIL PROTECTED]:~$ /sbin/ifconfig lo | grep MTU
  UP LOOPBACK RUNNING  MTU:16436  Metric:1
[EMAIL PROTECTED]:~$ uname -a
Linux flood 2.6.8.1-flood #1 Wed Sep 29 21:58:09 NZST 2004 i686 GNU/Linux
so at least on Linux 2.6 it seems like the risk of fragmentation is minimal.
-O
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] [pgsql-hackers] Increasing the length of pg_stat_activity.current_query...

2004-11-08 Thread Josh Berkus
Tom,

 Another relevant question is why you are expecting to get this
 information through pgstats and not by looking in the postmaster log.
 I don't know about you, but I don't have any tools that are designed to
 cope nicely with looking at tables that have columns that might be many
 K wide. Looking in the log seems a much nicer way of examining the full
 text of extremely long queries. So I think it's actually a good thing
 that pgstats truncates the queries at some reasonable width.

Because pg_stat_activity can be queried dynamically, and the log can't.   I'm 
currently dealing with this at a clients site who is having elusive bad 
queries hammer the CPU.  

In order to find a bad query by PID, I have to:
1) turn on log_statement, log_timestamp and log_pid;
2) HUP the postmaster;
3) watch top and record the time and pid of the bad query;
4) cp the log off to a file;
5) turn back off log_statement and log_pid;
6) grep the log for the time/pid, using a regexp to deal with minor variations 
in timestamp.

It's a big PITA to retrieve the text of one bad query.   And that's assuming 
that the bad query re-occurs within a reasonable window of time from when I 
spotted it so that I don't end up watching top for the rest of the afternoon.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-07 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

  I'd vote in favour of relaxing the limit entirely, as Sean suggests.
 
 The choice is not between limit and no limit, it is between
 limit and broken.

What do you think is broken about fragmented UDP packets?

Once Upon a Time fragmented UDP packets basically didn't work at all. But
that's going on 20 years now. These days you can reliably send large packets
up to 32k certainly over local connections and even over long-haul connections
when you don't have packet loss problems. Even when you do the worst case
scenario is your packet doesn't make it, so what's the harm in at least trying
to send it?

I'm assuming the packets aren't a fixed size. As long as we aren't bumping up
the other packets to 8k then there's no danger to sending the occasional 8k
packet. 

The reason people don't like fragmented UDP packets is that there's no
retransmission facility and a packet is lost if a single fragment is lost. So
if you're sending an 8k packet with an MTU of 1500 you'll have 5 fragments.
With 10% packet loss that gives your 8k fragmented packet a 50/50 chance of
getting through.

But if you're having 10% packet loss on your local area network you already
have a problem. Even then you're losing 10% of your smaller queries and 50% of
your larger queries whereas currently you would be losing 10% of your smaller
queries and 100% of your larger queries...

-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-07 Thread Greg Stark

Simon Riggs [EMAIL PROTECTED] writes:

 On Sun, 2004-11-07 at 20:59, Greg Stark wrote:
  
  What do you think is broken about fragmented UDP packets?
 
 ...probably that pgstat.c doesn't handle them at all, so if they occur
 then you've lost data. Until that is fixed, we have a limit.

Fragmentation happens at the IP protocol level, the kernel is responsible for
reassembly. There's nothing for the application level to handle.

-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-07 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 What do you think is broken about fragmented UDP packets?

 Fragmentation happens at the IP protocol level, the kernel is responsible for
 reassembly. There's nothing for the application level to handle.

And, by the same token, on platforms where it is broken there is nothing
we can do about it.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-06 Thread Bruce Momjian
Sean Chittenden wrote:
 Is there any reason the length of 
 pg_catalog.pg_stat_activity.current_query is capped at 255 characters?  
 Why can't it be a pointer to the currently running query?
 
 Seems silly to me and is a PITA to try and use as a debugging tool only 
 to find out that the query in question, has a logical break right at 
 character 255 so the query in pg_stat_query looks like it's the 
 complete query, but it's not (extra foo at the end of the query is 
 causing it to run dog slow, but it's difficult to see that without 
 going to the logs and digging through them to find the problem 
 statement).
 
 Anyway, is there any good reason for this or can this be increased 
 somehow?  -sc

I think it is limited because the queries are stored in shared memory,
maybe.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-06 Thread Tom Lane
Sean Chittenden [EMAIL PROTECTED] writes:
 Is there any reason the length of 
 pg_catalog.pg_stat_activity.current_query is capped at 255 characters?  

The reason for a limit is to avoid fragmentation of UDP messages.
I believe we've set it at 1K for 8.0, though, and if you are on
a platform with a higher message size limit you could raise it more.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-06 Thread Sean Chittenden
Is there any reason the length of
pg_catalog.pg_stat_activity.current_query is capped at 255 characters?
The reason for a limit is to avoid fragmentation of UDP messages.
I believe we've set it at 1K for 8.0, though, and if you are on
a platform with a higher message size limit you could raise it more.
I'm confused... UDP as in the UDP/IP?  RPC caps UDP messages at 8K and 
NFS over UDP often runs at 32K...  where is UDP used in the backend?  
-sc

--
Sean Chittenden
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-06 Thread Tom Lane
Sean Chittenden [EMAIL PROTECTED] writes:
 I'm confused... UDP as in the UDP/IP?  RPC caps UDP messages at 8K and 
 NFS over UDP often runs at 32K...  where is UDP used in the backend?  

pgstat messages travel over UDP/IP.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-06 Thread Sean Chittenden
I'm confused... UDP as in the UDP/IP?  RPC caps UDP messages at 8K and
NFS over UDP often runs at 32K...  where is UDP used in the backend?
pgstat messages travel over UDP/IP.
Over the loopback interface, right?  Then why worry about 
fragmentation?  This seems like premature optimization/prevention.  A 
lost packet over lo0 is symptom of a bigger problem.  The contents of 
pgstat messages are probably the least of an admins concerns if that's 
happening.

Having a 1K query isn't uncommon on some of the stuff I work on, an 8K 
query... that's a tad different and would stick out like a sore thumb.  
Would you be open to increasing this further after the 8.0 release?  I 
haven't heard of anyone complaining about dropped/fragmented pgstat 
messages.  :)  -sc

--
Sean Chittenden
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-06 Thread Tom Lane
Sean Chittenden [EMAIL PROTECTED] writes:
 Would you be open to increasing this further after the 8.0 release?

Nope.

 I haven't heard of anyone complaining about dropped/fragmented pgstat 
 messages.  :)  -sc

That's because we keep 'em small enough to not fragment.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-06 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Not having the whole query is painful. Raising it to 1K doesn't get
 round the fact that it's the longer queries that tend to be the more
 painful ones, and so they are the ones you want to trap in full and
 EXPLAIN, so you can find out if they are *ever* coming back. 

... so look in the postmaster log ...

 I'd vote in favour of relaxing the limit entirely, as Sean suggests.

The choice is not between limit and no limit, it is between
limit and broken.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Increasing the length of pg_stat_activity.current_query...

2004-11-06 Thread Greg Stark

Sean Chittenden [EMAIL PROTECTED] writes:

 Having a 1K query isn't uncommon on some of the stuff I work on, an 8K 
 query...
 that's a tad different and would stick out like a sore thumb.  

Just as a point of reference, I've been processing my logs to see how large my
queries work out to. They seem to max out at just over 5k, (5330 bytes to be
exact). 

This is excluding CREATE FUNCTION calls where the body of the function can
of course be much larger but isn't interesting for stats.

 Would you be open to increasing this further after the 8.0 release? I
 haven't heard of anyone complaining about dropped/fragmented pgstat
 messages. :) -sc

From my own experience I would suggest 8k. If it's good enough for NFS
defaults it ought to be good enough for Postgres.

Realistically, you shouldn't be expecting any real quantities of dropped
packets on a local area network, so fragmented UDP packets aren't really a
problem. Anyone running their stats collector over a long haul internet
connection with dropped packets is probably doing something pretty unusual.

I think historically implementations didn't handle fragmented UDP packets at
all, or perhaps not over 32k. But any platform today ought to be happy with
packets at least up to 32k and any modern platform quite a bit larger.

-- 
greg


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html