Re: [HACKERS] Syntax error reporting (was Re: [PATCHES] syntax error position
Dear Tom, The point is that CONTEXT is essentially a record of how we got here. Yes, but for human eyes. In a situation where the actual error occurs inside a couple of levels of nesting, you want to be able to report the outer queries as well as the one that directly caused the error. I agree that there's probably little hope of clients automatically making sense of the CONTEXT info. But we need to provide it to help people debug complex functions. Yes. More over, I have other ideas for CONTEXT, which should really be a stack. It already is a stack. Ok, I agree that there is a push, but I'm still looking fot the pop. Maybe I missed something, but it seemed to me that strings are appended on to the other, and there is no way back. That's what I mean by a real stack. If a student of mine comes to me with his or her stack without a pop, he or she will not have a very good grade;-) So when more information are to be fed into the context, they are to be fed when they are actually needed on the error, maybe with callbacks? They cannot be fed prevently. What I would have looked for, is a stack on which functions could push and pop informations as they want, so that the stack would be always available for any error or warning or debug trace down the callgraph. If it is the case already, as I said above, I haven't seen the pop feature yet. I think a new field is alas necessary. I'm leaning in that direction also. How about 'P': used only for syntax error position in the client-submitted query. 'p': syntax error position in an internally-generated query. 'q': text of an internally-generated query ('p' and 'q' would always appear together). Why not. Taking your very idea a little step further, I would suggest the following, which is pretty similar, but even more general: 1/ field conventions: - upper case letter fields are for humans. they are meant to be shown to the user, even new ones? - lower case letter fields are for machines / clients they are never meant to be shown to the user! but they should give some information to the client on how to process an error report. 2/ new fields: - p: character [length] [reserved for future use] where the parser error occured, with the length of the offending token if available (for better hilighting?) this is basically the current P, but not meant to the user, and with possible additions. - q: select foo frm bla; the raw sql query that was being processed, whatever the case. I think syntax error are rare occurences, so it is no big deal to return the query anyway. this may make some clients code easier because there is no difference in the processing for different kind of errors. - c: context-description some context for the client, not the human, so no localisation. something like: c: lang=sql query c: lang=sql query-part c: lang=sql function 'foo' 4 it could be adapted to other syntax errors, something like. c: lang=plpgsql function 'foo' 4 c: lang=perl function 'bla' 43 or separate fields: l:sql for the language, f: for the function... in such cases, the q: field would not necessarily contains sql. the client may open an editor for plpgsql function of it sees it fits... or whatever. The information is available, that is the point. Have a nice day, -- Fabien Coelho - [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Syntax error reporting (was Re: [PATCHES] syntax error position CREATE FUNCTION bug fix)
Fabien COELHO [EMAIL PROTECTED] writes: More over, I have other ideas for CONTEXT, which should really be a stack. It already is a stack. Ok, I agree that there is a push, but I'm still looking fot the pop. Maybe I missed something, but it seemed to me that strings are appended on to the other, and there is no way back. But the string list is not constructed until the error actually occurs. You don't need a pop at that point --- the call stack is what it is. I think you are imagining that outer-level context hooks should be able to editorialize on what inner-level ones said (or perhaps vice versa?) but I honestly cannot think of a valid use-case for that. What I would have looked for, is a stack on which functions could push and pop informations as they want, so that the stack would be always available for any error or warning or debug trace down the callgraph. Look at the existing examples of adjusting the error_context_stack. They already do all that, they just don't bother to compute the error strings unless actually needed. I'm not willing to push very much cost into the non-error path of control when there's no visible payoff. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Syntax error reporting (was Re: [PATCHES] syntax error position CREATE FUNCTION bug fix)
Fabien COELHO [EMAIL PROTECTED] writes: I agree with you that both reports should not look the same. The good news is that they already do not look the same, thanks to the CONTEXT information. Right, but you quite properly didn't like my quick-hack to psql that assumes that the presence of any CONTEXT means it's not a top-level syntax error. I would like to replace that hack with something cleaner. We have in fact misimplemented it, because it is being set for syntax errors in internally generated queries too. Well, from the parser point of view, it is a little bit messy to have to do different things for error reporting in different context. That why I would try a one-fit-all solution. The parser needn't do anything different. What I'm imagining is that for internally submitted queries, there will always be an error_context_stack routine that can transform the error report into whatever we decide the appropriate format is. However it might be better to invent a new error-message field that carries just the text of the SQL command, rather than stuffing it into CONTEXT. I'm not sure I would like the CONTEXT field for that? as it may be usefull for a lot of things. In your above example, the CONTEXT is filled with 2 different informations that are just messed up for the client. If localisation is used, there would be no way for a client to seperate them. Different information should require different fields. The point is that CONTEXT is essentially a record of how we got here. In a situation where the actual error occurs inside a couple of levels of nesting, you want to be able to report the outer queries as well as the one that directly caused the error. I agree that there's probably little hope of clients automatically making sense of the CONTEXT info. But we need to provide it to help people debug complex functions. More over, I have other ideas for CONTEXT, which should really be a stack. It already is a stack. The other thing to think about is whether we should invent a new field to carry syntax error position for generated queries, rather than making 'P' do double duty as it does now. I think a new field is alas necessary. I'm leaning in that direction also. How about 'P': used only for syntax error position in the client-submitted query. 'p': syntax error position in an internally-generated query. 'q': text of an internally-generated query ('p' and 'q' would always appear together). In the case of a non-syntax error in an internally generated query, we should stick the query text into the CONTEXT stack, since it might not be the most closely nested context item anyway. An existing client will ignore the 'p' and 'q' fields, thus providing behavior that's no worse than what you get with 7.4 now. regards, tom lane ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Syntax error reporting (was Re: [PATCHES] syntax error position
Dear Tom, However, I still stick with my bad simple idea because the simpler the better, and also because of the following example: ... psql SELECT count_tup('pg_shadow'); ERROR: syntax error at or near FRM at character 22 CONTEXT: PL/pgSQL function count_tup line 4 at for over execute statement As you can notice, the extract is not in the submitted query, so there is no point to show it there. Yeah. However, I dislike your solution because it confuses the cases of a syntax error in the actually submitted query, and a syntax error in an internally generated query. We should keep these cases clearly separate because clients may want to do different things. [...] I agree with you that both reports should not look the same. The good news is that they already do not look the same, thanks to the CONTEXT information. However, the context information is informal (it is just plain English), to it is not easy for a client to take that into account. The original design concept for the 'P' (position) error field is that it would be used to locate syntax errors in the *original query*, and so its presence is a cue to the client code to go in the direction of setting the editing cursor. (Note the protocol specification says index into the original query string.) Yes, I noted that. In my proposed patch, I changed it to the processed query, which may or may not be the initial query. We have in fact misimplemented it, because it is being set for syntax errors in internally generated queries too. Well, from the parser point of view, it is a little bit messy to have to do different things for error reporting in different context. That why I would try a one-fit-all solution. Maybe I'm a little bit lazy, but sometimes it is a quality in programming, if it help keep things simple. I was already planning to modify plpgsql to send back the full text of generated queries when there is an error. My intention was to supply this just as part of the CONTEXT stack, that is instead of your example of ERROR: syntax error at or near FRM at character 22 CONTEXT: PL/pgSQL function count_tup line 4 at for over execute statement you'd get something like ERROR: syntax error at or near FRM at character 22 CONTEXT: Executing command SELECT COUNT(*) AS c FRM pg_shadow PL/pgSQL function count_tup line 4 at for over execute statement However it might be better to invent a new error-message field that carries just the text of the SQL command, rather than stuffing it into CONTEXT. I'm not sure I would like the CONTEXT field for that? as it may be usefull for a lot of things. In your above example, the CONTEXT is filled with 2 different informations that are just messed up for the client. If localisation is used, there would be no way for a client to seperate them. Different information should require different fields. More over, I have other ideas for CONTEXT, which should really be a stack. But that is another issue. (This is similar to your original patch, but different in detail because I'm envisioning sending back generated queries, never the submitted query. Regurgitating the submitted query is just a waste of bandwidth.) Well, if it is the same I agree. On the other hand, it is a rare case, during an exception. I would expect submitted queries to work most of the time, because the client is just some program which uses the database. The plus side of that would be that it'd be easy to extract for syntax-error highlighting. The minuses are that existing clients would fail to print such a field (the protocol spec says to ignore unknown fields That's a really good design idea, because it allows to include new fields and still to have old client compatible with the new protocol. I think it is no big deal if existing clients don't make use if the new information. It was not there before anyway, so it is just as is was before. Moreover, one should distinguish between fields for the human and fields for the client. reporting the query and the position is more for the client, giving the context is more for the human. One is use for some automatic processing, the other is a high level semantical information to be interpreted by the user. ), and that there is no good way to cope with nested queries. A possible compromise is to put the text of the generated SQL command into a new field only if the error is a syntax error, and put it into the CONTEXT stack otherwise. Syntax errors couldn't be nested so at, least that problem goes away. This seems a bit messy though. I don't like it, because above comments. The other thing to think about is whether we should invent a new field to carry syntax error position for generated queries, rather than making 'P' do double duty as it does now. If we don't do that then we have to change the protocol specification to reflect reality. In any case I think it has to be possible to tell very easily from the error message