Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
Josh Berkus writes:
> However, it would really be useful to have an extra tag (in addition to
> the ERROR or FATAL) for "If you're seeing this message, something has
> gone seriously wrong on the server." Just stuff like corruption
> messages, backend crashes, etc.

Right, we've discussed that idea elsewhere; there's a basically orthogonal classification that needs to happen. Pretty much all PANICs are high priority from a DBA's perspective, but only a subset of either FATAL or ERROR are. Somebody needs to do the legwork to determine just what kind of classification scheme we want and propose at least an initial set of ereports to be so marked.

One thought I had was that we could probably consider the default behavior (in the absence of any call of an explicit criticality-marking function) to be like this:

    for ereport(), it's critical if a PANIC and otherwise not
    for elog(), it's critical if >= ERROR level, otherwise not.

The rationale for this is that we generally use elog for not-supposed-to-happen cases, so those are probably interesting. If we start getting complaints about some elog not being so interesting, we can convert it to an ereport so it can include an explicit marking call.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
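The default rule described in the message above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the enum names, the `CriticalMark` tri-state, and `is_critical()` are hypothetical and do not exist in PostgreSQL, and the severity values are illustrative rather than those in elog.h.

```c
#include <stdbool.h>

/* Hypothetical severity levels, ordered from least to most severe.
 * The numeric values are illustrative, not PostgreSQL's. */
typedef enum { SEV_LOG, SEV_WARNING, SEV_ERROR, SEV_FATAL, SEV_PANIC } Severity;

/* Which reporting call produced the message. */
typedef enum { VIA_EREPORT, VIA_ELOG } ReportApi;

/* Hypothetical explicit marking; MARK_NONE means "no marking call was made". */
typedef enum { MARK_NONE, MARK_CRITICAL, MARK_NOT_CRITICAL } CriticalMark;

/* Decide whether a message is critical for the DBA. */
bool is_critical(ReportApi api, Severity sev, CriticalMark mark)
{
    /* An explicit marking always wins over the default. */
    if (mark != MARK_NONE)
        return mark == MARK_CRITICAL;

    /* Default for ereport(): critical only if PANIC. */
    if (api == VIA_EREPORT)
        return sev == SEV_PANIC;

    /* Default for elog(): critical if ERROR or above. */
    return sev >= SEV_ERROR;
}
```

Under this sketch an unmarked ereport(FATAL) such as "the database system is starting up" would not be critical, while an unmarked elog(ERROR) would be.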
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On 12/11/2013 08:48 AM, Tom Lane wrote:
> The fundamental problem IMO is that you want to complicate the definition
> of what these things mean as a substitute for DBAs learning something
> about Postgres. That seems like a fool's errand from here. They're going
> to have to learn what FATAL means sooner or later, and making it more
> complicated just raises the height of that barrier.

I don't think it works to change the NOTICE/ERROR/FATAL tags; for one thing, I can hear the screaming about people's log scripts from here.

However, it would really be useful to have an extra tag (in addition to the ERROR or FATAL) for "If you're seeing this message, something has gone seriously wrong on the server." Just stuff like corruption messages, backend crashes, etc.

Otherwise we're requiring users to come up with an alphabet soup of regexes to filter out the noise error messages from the really, really important ones. Speaking as someone who does trainings for new DBAs, the part where I do "what to look for in the logs" requires over an hour and still doesn't cover everything. And doesn't internationalize. That's nasty.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: "Kevin Grittner"
>> 5. FATAL: terminating walreceiver process due to administrator command
>> 6. FATAL: terminating background worker \"%s\" due to administrator command
>
> Those are client connections and their backends terminated for a reason other than the client side of the connection requesting it. If we don't classify those as FATAL then the definition of FATAL becomes much more fuzzy. What would you define it to mean?

I'm sorry to cause you trouble, but my understanding is that those are not client connections. They are just background server processes; walreceiver is started by startup process, and background workers are started by extension modules. Am I misunderstanding something?

According to Table 18-1 in the manual:

http://www.postgresql.org/docs/current/static/runtime-config-logging.html

the definition of FATAL is:

    FATAL
    Reports an error that caused the current session to abort.

This does not apply to the above messages, because there is no error. The DBA just shut down the database server, and the background processes terminated successfully. If some message output is desired, LOG's definition seems the nearest:

    LOG
    Reports information of interest to administrators, e.g., checkpoint activity.

So, I thought "ereport(LOG, ...); proc_exit(0)" is more appropriate than ereport(FATAL, ...). Is this so strange?

Regards
MauMau
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: "Andres Freund"
> On 2013-12-12 00:31:25 +0900, MauMau wrote:
>> 5. FATAL: terminating walreceiver process due to administrator command
>> 6. FATAL: terminating background worker \"%s\" due to administrator command
>
> Those are important if they happen outside a shutdown. So, if you really want to remove them from there, you'd need to change the signalling to handle the cases differently.

How are they important? If someone mistakenly sends SIGTERM to walreceiver and background workers, they are automatically launched by postmaster or startup process later like other background processes. But other background processes such as walsender, bgwriter, etc. don't emit FATAL messages.

Regards
MauMau
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
"MauMau" writes: > I agree that #1-#3 are of course reasonable when there's any client the user > runs. The problem is that #1 (The database system is starting up) is output > in the server log by pg_ctl. In that case, there's no client the user is > responsible for. Why does a new DBA have to be worried about that FATAL > message? He didn't do anything wrong. FATAL doesn't mean "the DBA did something wrong". It means "we terminated a client session". The fundamental problem IMO is that you want to complicate the definition of what these things mean as a substitute for DBAs learning something about Postgres. That seems like a fool's errand from here. They're going to have to learn what FATAL means sooner or later, and making it more complicated just raises the height of that barrier. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: "Tom Lane"
> Jim Nasby writes:
>> On 12/9/13 5:56 PM, Tom Lane wrote:
>>> How so? "FATAL" means "an error that terminates your session", which is exactly what these are.
>>
>> Except in these cases the user never actually got a working session; their request was denied.
>>
>> To be clear, from the client standpoint it's certainly fatal, but not from the server's point of view. This is fully expected behavior as far as the server is concerned. (Obviously it might be an error that caused the shutdown/recovery, but that's something different.)
>
> Right, but as already pointed out in this thread, these messages are worded from the client's point of view. "The client never got a working connection" seems to me to be an empty distinction. If you got SIGTERM'd before you could issue your first query, should that not be FATAL because you'd not gotten any work done?
>
> More generally, we also say FATAL for all sorts of entirely routine connection failures, like wrong password or mistyped user name. People don't seem to have a problem with those. Even if some do complain, the costs of changing that behavior after fifteen-years-and-counting would certainly exceed any benefit.

I agree that #1-#3 are of course reasonable when there's any client the user runs. The problem is that #1 (The database system is starting up) is output in the server log by pg_ctl. In that case, there's no client the user is responsible for. Why does a new DBA have to be worried about that FATAL message? He didn't do anything wrong.

I thought adding "options='-c log_min_messages=PANIC'" to the connection string for PQping() in pg_ctl.c would make the message vanish, but it didn't. The reason is that connection options take effect in PostgresMain(), which is after checking the FATAL condition in ProcessStartupPacket(). Do you think there is any good solution?

Regards
MauMau
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
MauMau wrote:
> From: "Kevin Grittner"
>> FATAL is used when the problem is severe enough that the process
>> or connection must end. It seems to me to be what should
>> consistently be used when a client connection or its process must
>> be terminated for a reason other than a client-side request to
>> terminate.
>
> What do you think of #5 and #6 when matching the above criteria?
>
> 5. FATAL: terminating walreceiver process due to administrator
> command
> 6. FATAL: terminating background worker \"%s\" due to
> administrator command

Those are client connections and their backends terminated for a reason other than the client side of the connection requesting it. If we don't classify those as FATAL then the definition of FATAL becomes much more fuzzy. What would you define it to mean?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On 2013-12-12 00:31:25 +0900, MauMau wrote:
> What do you think of #5 and #6 when matching the above criteria?
>
> 5. FATAL: terminating walreceiver process due to administrator command
> 6. FATAL: terminating background worker \"%s\" due to administrator command

Those are important if they happen outside a shutdown. So, if you really want to remove them from there, you'd need to change the signalling to handle the cases differently.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: "Kevin Grittner"
> It seems to be a fairly common term of art for a problem which requires a restart or reconnection. FATAL is used when the problem is severe enough that the process or connection must end. It seems to me to be what should consistently be used when a client connection or its process must be terminated for a reason other than a client-side request to terminate.

What do you think of #5 and #6 when matching the above criteria?

5. FATAL: terminating walreceiver process due to administrator command
6. FATAL: terminating background worker \"%s\" due to administrator command

These are output when the DBA shuts down the database server and there's no client connection. That is, these don't meet the criteria. I believe these should be suppressed, or use LOG instead of FATAL.

Regards
MauMau
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On Tue, Dec 10, 2013 at 08:47:22AM +0800, Craig Ringer wrote:
> On 12/05/2013 11:25 PM, MauMau wrote:
> > Hello,
> >
> > My customers and colleagues sometimes (or often?) ask about the
> > following message:
> >
> > FATAL: the database system is starting up
>
> I would LOVE that message to go away, forever.
>
> It's a huge PITA for automated log monitoring, analysis, and alerting.
>
> The other one I'd personally like to change, but realise is harder to
> actually do, is to separate "ERROR"s caused by obvious user input issues
> from internal ERRORs like not finding the backing file for a relation,
> block read errors, etc.
>
> String pattern matching is a crude and awful non-solution, especially
> given the way PostgreSQL loves to output messages to the log in whatever
> language and encoding the current database connection is in.

Yes, this is certainly a challenge.

--
Bruce Momjian http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On 12/05/2013 11:25 PM, MauMau wrote:
> Hello,
>
> My customers and colleagues sometimes (or often?) ask about the
> following message:
>
> FATAL: the database system is starting up

I would LOVE that message to go away, forever.

It's a huge PITA for automated log monitoring, analysis, and alerting.

The other one I'd personally like to change, but realise is harder to actually do, is to separate "ERROR"s caused by obvious user input issues from internal ERRORs like not finding the backing file for a relation, block read errors, etc.

String pattern matching is a crude and awful non-solution, especially given the way PostgreSQL loves to output messages to the log in whatever language and encoding the current database connection is in.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
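One language-independent alternative to matching message text is to key monitoring on the five-character SQLSTATE, which PostgreSQL can write to the log via the %e escape in log_line_prefix (csvlog output also carries it). Nobody in the thread proposes the sketch below; it is a minimal illustration, and the `LogClass` buckets, the function name, and the choice of which SQLSTATE classes count as serious are assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical triage buckets for a log monitor. */
typedef enum { LOG_ROUTINE, LOG_CAPACITY, LOG_SERIOUS } LogClass;

/* Classify a log entry by SQLSTATE instead of by (translated) message text.
 * The mapping of classes to buckets is an assumption, not PostgreSQL policy. */
LogClass classify_sqlstate(const char *sqlstate)
{
    /* A missing or malformed code is treated as needing attention. */
    if (sqlstate == NULL || strlen(sqlstate) != 5)
        return LOG_SERIOUS;

    /* Class XX: internal errors, e.g. XX001 data_corrupted. */
    if (strncmp(sqlstate, "XX", 2) == 0)
        return LOG_SERIOUS;

    /* Class 58: system errors external to PostgreSQL, e.g. 58P01 undefined_file. */
    if (strncmp(sqlstate, "58", 2) == 0)
        return LOG_SERIOUS;

    /* Class 53: insufficient resources, e.g. 53300 too_many_connections. */
    if (strncmp(sqlstate, "53", 2) == 0)
        return LOG_CAPACITY;

    /* Everything else, including 57P03 cannot_connect_now. */
    return LOG_ROUTINE;
}
```

With this mapping, "the database system is starting up" (57P03) lands in the routine bucket regardless of the language the message was logged in, while "sorry, too many clients already" (53300) is still counted for capacity planning.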
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
Jim Nasby writes:
> On 12/9/13 5:56 PM, Tom Lane wrote:
>> How so? "FATAL" means "an error that terminates your session", which
>> is exactly what these are.
>
> Except in these cases the user never actually got a working session; their
> request was denied.
>
> To be clear, from the client standpoint it's certainly fatal, but not from
> the server's point of view. This is fully expected behavior as far as the
> server is concerned. (Obviously it might be an error that caused the
> shutdown/recovery, but that's something different.)

Right, but as already pointed out in this thread, these messages are worded from the client's point of view. "The client never got a working connection" seems to me to be an empty distinction. If you got SIGTERM'd before you could issue your first query, should that not be FATAL because you'd not gotten any work done?

More generally, we also say FATAL for all sorts of entirely routine connection failures, like wrong password or mistyped user name. People don't seem to have a problem with those. Even if some do complain, the costs of changing that behavior after fifteen-years-and-counting would certainly exceed any benefit.

regards, tom lane
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On 12/9/13 5:56 PM, Tom Lane wrote:
> Jim Nasby writes:
>> Arguably 1-3 are inaccurate since they're not really about a backend dying... they occur during the startup phase; you never even get a functioning backend. That's a marked difference from other uses of FATAL.
>
> How so? "FATAL" means "an error that terminates your session", which is exactly what these are.

Except in these cases the user never actually got a working session; their request was denied.

To be clear, from the client standpoint it's certainly fatal, but not from the server's point of view. This is fully expected behavior as far as the server is concerned. (Obviously it might be an error that caused the shutdown/recovery, but that's something different.)

--
Jim C. Nasby, Data Architect j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
Jim Nasby writes:
> Arguably 1-3 are inaccurate since they're not really about a backend dying...
> they occur during the startup phase; you never even get a functioning
> backend. That's a marked difference from other uses of FATAL.

How so? "FATAL" means "an error that terminates your session", which is exactly what these are.

regards, tom lane
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On 12/6/13 7:38 AM, Andres Freund wrote:
> On 2013-12-06 22:35:21 +0900, MauMau wrote:
>> From: "Tom Lane"
>>> No. They are FATAL so far as the individual session is concerned. Possibly some documentation effort is needed here, but I don't think any change in the code behavior would be an improvement.
>>
>> You are suggesting that we should add a note like "Don't worry about the following message. This is a result of normal connectivity checking", don't you?
>>
>> FATAL: the database system is starting up
>
> Uh. An explanation why you cannot connect to the database hardly seems like a superfluous log message.

It is when *you* are not actually trying to connect but rather pg_ctl is (which is one of the use cases here).

Arguably 1-3 are inaccurate since they're not really about a backend dying... they occur during the startup phase; you never even get a functioning backend. That's a marked difference from other uses of FATAL.

--
Jim C. Nasby, Data Architect j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
MauMau wrote:
> From: "Greg Stark"
>> On the client end the FATAL is pretty logical but in the logs it
>> makes it sound like the entire server died.

I agree that is easily misunderstood, especially since a FATAL problem is less severe than a PANIC; while in common English usage panic is what one might feel when faced with the prospect of speaking in public, but fatal generally means something that kills -- like a disease or a plane crash. There is the notion of a "fatal error" in English, though; which means an error which puts an end to what the person who makes such an error is attempting.

>> FATAL is a term of art peculiar to Postgres.

No, it is not; at least not in terms of being "characteristic of only one person, group, or thing". The Java logger from Apache called log4j has a FATAL level which is more serious than ERROR. The distinction is intended to indicate whether the application is likely to be able to continue running. Similarly, Sybase ASE has severity levels, where above a certain point they are described as "fatal" -- meaning that the application must acquire a new connection. It's probably used elsewhere as well; those are just a couple I happened to be familiar with.

> I find it unnatural for a normal administration operation to emit
> a FATAL message.

It seems to be a fairly common term of art for a problem which requires a restart or reconnection. FATAL is used when the problem is severe enough that the process or connection must end. It seems to me to be what should consistently be used when a client connection or its process must be terminated for a reason other than a client-side request to terminate.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: "Greg Stark"
> On the client end the FATAL is pretty logical but in the logs it makes it sound like the entire server died. Especially in this day of multithreaded servers. I was suggesting that that was the origin of the confusion here. Anyone who has seen these messages on the client end many times might interpret them correctly in the server logs but someone who has only been a DBA, not a database user might never have seen them except in the server logs and without the context might not realize that FATAL is a term of art peculiar to Postgres.

I think so, too. My customers and colleagues, who are relatively new to PostgreSQL, asked things like "Is this FATAL message a sign of some problem? What does it mean?" I think it's natural to show concern when finding FATAL messages.

I find it unnatural for a normal administration operation to emit a FATAL message.

Regards
MauMau
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On Sat, Dec 7, 2013 at 3:56 PM, Andres Freund wrote:
>
> I don't really see much vagueness there. FATAL is an unexpected but
> orderly shutdown. PANIC is for the situations where we can't handle the
> problem that occurred in any orderly way.

Sorry, I was unclear. I meant that without context if someone asked you which was more severe, "fatal" or "panic" you would have no particular way to know. After all, for a person it's easier to cure a panic than a fatality :)

On the client end the FATAL is pretty logical but in the logs it makes it sound like the entire server died. Especially in this day of multithreaded servers. I was suggesting that that was the origin of the confusion here. Anyone who has seen these messages on the client end many times might interpret them correctly in the server logs but someone who has only been a DBA, not a database user might never have seen them except in the server logs and without the context might not realize that FATAL is a term of art peculiar to Postgres.

--
greg
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On 2013-12-07 13:58:11 +, Greg Stark wrote:
> FATAL means a backend died. It is kind of vague how FATAL and PANIC
> differ.

I don't really see much vagueness there. FATAL is an unexpected but orderly shutdown. PANIC is for the situations where we can't handle the problem that occurred in any orderly way.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On Sat, Dec 7, 2013 at 12:27 AM, David Johnston wrote:
>>
>> 1. FATAL: the database system is starting up
>
> How about altering the message to tone down the severity by a half-step...
>
> FATAL: (probably) not! - the database system is starting up

Well it is fatal, the backend for that client doesn't continue.

FATAL means a backend died. It is kind of vague how FATAL and PANIC differ. They both sound like the end of the world. If you read FATAL thinking it means the whole service is quitting -- ie what PANIC means -- then these sound like they're noise.

--
greg
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On 2013-12-06 22:35:21 +0900, MauMau wrote:
> From: "Tom Lane"
> >No. They are FATAL so far as the individual session is concerned.
> >Possibly some documentation effort is needed here, but I don't think
> >any change in the code behavior would be an improvement.
>
> You are suggesting that we should add a note like "Don't worry about the
> following message. This is a result of normal connectivity checking", don't
> you?
>
> FATAL: the database system is starting up

Uh. An explanation why you cannot connect to the database hardly seems like a superfluous log message.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: "Tom Lane"
> No. They are FATAL so far as the individual session is concerned. Possibly some documentation effort is needed here, but I don't think any change in the code behavior would be an improvement.

You are suggesting that we should add a note like "Don't worry about the following message. This is a result of normal connectivity checking", don't you?

FATAL: the database system is starting up

But I doubt most users would recognize such notes. Anyway, lots of such messages certainly make monitoring and troubleshooting harder, because valuable messages are buried.

>> 4. FATAL: sorry, too many clients already
>> Report these as FATAL to the client because the client wants to know the reason. But don't output them to server log because they are not necessary for DBAs (4 is subtle.)
>
> The notion that a DBA should not be allowed to find out how often #4 is happening is insane.

I thought someone would point that out. You are right, #4 is a strong hint for the DBA about the max_connections setting or connection pool configuration.

Regards
MauMau
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
From: "Peter Eisentraut"
> Yeah, this is part of a more general problem, which you have characterized correctly: What is fatal (or error, or warning, ...) to the client isn't necessarily fatal (or error, or warning, ...) to the server or DBA.

Thanks. In addition, #5 and #6 in my previous mail are even unnecessary for both the client and the DBA, aren't they?

> Fixing this would need a larger enhancement of the logging infrastructure. It's been discussed before, but it's a bit of work.

How about the easy fix I proposed? The current logging infrastructure seems enough to solve the original problem with small effort without complicating the code. If you don't like "log_min_messages = PANIC", SetConfigOption() can be used instead.

I think we'd better take a step to eliminate the problem we are facing, as well as consider a much richer infrastructure in the long run. I'm also interested in the latter, and want to discuss it after solving the problem in front of me.

Regards
MauMau
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
"MauMau" writes: > Shouldn't we lower the severity or avoiding those messages to server log? No. They are FATAL so far as the individual session is concerned. Possibly some documentation effort is needed here, but I don't think any change in the code behavior would be an improvement. > 1. FATAL: the database system is starting up > 2. FATAL: the database system is shutting down > 3. FATAL: the database system is in recovery mode > 4. FATAL: sorry, too many clients already > Report these as FATAL to the client because the client wants to know the > reason. But don't output them to server log because they are not necessary > for DBAs (4 is subtle.) The notion that a DBA should not be allowed to find out how often #4 is happening is insane. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
On 12/5/13, 10:25 AM, MauMau wrote:
> Report these as FATAL to the client because the client wants to know the
> reason. But don't output them to server log because they are not
> necessary for DBAs

Yeah, this is part of a more general problem, which you have characterized correctly: What is fatal (or error, or warning, ...) to the client isn't necessarily fatal (or error, or warning, ...) to the server or DBA.

Fixing this would need a larger enhancement of the logging infrastructure. It's been discussed before, but it's a bit of work.
[HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?
Hello,

My customers and colleagues sometimes (or often?) ask about the following message:

FATAL: the database system is starting up

This message is often output dozens of times during a failover or PITR. The users seem to be worried because the message level is FATAL and they don't know why such a severe message is output in a successful failover and recovery. I can't blame the users, because the message is merely a by-product of pg_ctl's internal ping.

Similarly, the below message is output when I stop the standby server normally. Why FATAL as a result of a successful operation? I'm afraid DBAs are annoyed by these messages, as system administration software collects ERROR and more severe messages for daily monitoring.

FATAL: terminating walreceiver process due to administrator command

Shouldn't we lower the severity of those messages, or avoid writing them to the server log? How about the following measures?

1. FATAL: the database system is starting up
2. FATAL: the database system is shutting down
3. FATAL: the database system is in recovery mode
4. FATAL: sorry, too many clients already

Report these as FATAL to the client because the client wants to know the reason. But don't output them to server log because they are not necessary for DBAs (4 is subtle.)

5. FATAL: terminating walreceiver process due to administrator command
6. FATAL: terminating background worker \"%s\" due to administrator command

Don't output these to server log. Why are they necessary? For troubleshooting purposes? If necessary, the severity should be LOG (but I wonder why other background processes are not reported...)

To suppress server log output, I think we can do as follows. I guess ereport(FATAL) is still needed for easily handling both client report and process termination.

    log_min_messages = PANIC;
    ereport(FATAL,
            (errcode(ERRCODE_CANNOT_CONNECT_NOW),
             errmsg("the database system is starting up")));

May I hear your opinions?
Regards
MauMau
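The effect the proposal above relies on can be sketched as a simplified routing model: a message goes to the server log when its level reaches the log threshold, and to the client when it reaches the client threshold. This is an illustration only. The `Level` enum, the `Routing` struct, and `route_message()` are hypothetical, and the real checks in PostgreSQL's elog.c are more involved (for example, LOG-level messages are ordered differently for the server log).

```c
#include <stdbool.h>

/* Hypothetical levels, ordered from least to most severe. */
typedef enum { LVL_WARNING, LVL_ERROR, LVL_FATAL, LVL_PANIC } Level;

/* Where a message ends up in this simplified model. */
typedef struct {
    bool to_server_log;
    bool to_client;
} Routing;

/* Compare the message level against two independent thresholds.
 * log_min plays the role of log_min_messages and client_min the role
 * of client_min_messages; both are simplifications. */
Routing route_message(Level msg, Level log_min, Level client_min)
{
    Routing r;

    r.to_server_log = (msg >= log_min);
    r.to_client = (msg >= client_min);
    return r;
}
```

In this model, raising the log threshold to PANIC just before the ereport(FATAL) keeps the message out of the server log while the client still receives it, which is the behavior the proposal is after.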