Re: [HACKERS] Syntax error reporting (was Re: [PATCHES] syntax error position

2004-03-22 Thread Fabien COELHO

Dear Tom,

 The point is that CONTEXT is essentially a record of how we got here.

Yes, but for human eyes.

 In a situation where the actual error occurs inside a couple of levels
 of nesting, you want to be able to report the outer queries as well as
 the one that directly caused the error.  I agree that there's probably
 little hope of clients automatically making sense of the CONTEXT info.
 But we need to provide it to help people debug complex functions.

Yes.

  More over, I have other ideas for CONTEXT, which should really be a stack.
 It already is a stack.

Ok, I agree that there is a push, but I'm still looking fot the pop.

Maybe I missed something, but it seemed to me that strings are appended
on to the other, and there is no way back.

That's what I mean by a real stack. If a student of mine comes to me
with his or her stack without a pop, he or she will not have a very
good grade;-)

So when more information are to be fed into the context, they are to be
fed when they are actually needed on the error, maybe with callbacks?
They cannot be fed prevently.

What I would have looked for, is a stack on which functions could push
and pop informations as they want, so that the stack would be always
available for any error or warning or debug trace down the callgraph.

If it is the case already, as I said above, I haven't seen the pop
feature yet.

  I think a new field is alas necessary.
 I'm leaning in that direction also.  How about
 'P': used only for syntax error position in the client-submitted query.
 'p': syntax error position in an internally-generated query.
 'q': text of an internally-generated query ('p' and 'q' would always
  appear together).

Why not.

Taking your very idea a little step further, I would suggest the
following, which is pretty similar, but even more general:

1/ field conventions:

  - upper case letter fields are for humans.
they are meant to be shown to the user, even new ones?

  - lower case letter fields are for machines / clients
they are never meant to be shown to the user!
but they should give some information to the client on how to
process an error report.

2/ new fields:

 - p: character [length] [reserved for future use]
   where the parser error occured, with the length of the
   offending token if available (for better hilighting?)
   this is basically the current P, but not meant to the user,
   and with possible additions.

 - q: select foo frm bla;
  the raw sql query that was being processed, whatever the case.
  I think syntax error are rare occurences, so it is no big deal
  to return the query anyway. this may make some clients code easier
  because there is no difference in the processing for different kind
  of errors.

 - c: context-description
  some context for the client, not the human, so no localisation.
  something like:
   c: lang=sql query
   c: lang=sql query-part
   c: lang=sql function 'foo' 4

  it could be adapted to other syntax errors, something like.

   c: lang=plpgsql function 'foo' 4
   c: lang=perl function 'bla' 43

  or separate fields: l:sql for the language, f: for the function...
  in such cases, the q: field would not necessarily contains sql.

  the client may open an editor for plpgsql function of it sees it fits...
  or whatever. The information is available, that is the point.

Have a nice day,

-- 
Fabien Coelho - [EMAIL PROTECTED]

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Syntax error reporting (was Re: [PATCHES] syntax error position CREATE FUNCTION bug fix)

2004-03-22 Thread Tom Lane
Fabien COELHO [EMAIL PROTECTED] writes:
 More over, I have other ideas for CONTEXT, which should really be a stack.
 It already is a stack.

 Ok, I agree that there is a push, but I'm still looking fot the pop.

 Maybe I missed something, but it seemed to me that strings are appended
 on to the other, and there is no way back.

But the string list is not constructed until the error actually occurs.
You don't need a pop at that point --- the call stack is what it is.

I think you are imagining that outer-level context hooks should be able
to editorialize on what inner-level ones said (or perhaps vice versa?)
but I honestly cannot think of a valid use-case for that.

 What I would have looked for, is a stack on which functions could push
 and pop informations as they want, so that the stack would be always
 available for any error or warning or debug trace down the callgraph.

Look at the existing examples of adjusting the error_context_stack.
They already do all that, they just don't bother to compute the error
strings unless actually needed.  I'm not willing to push very much cost
into the non-error path of control when there's no visible payoff.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Syntax error reporting (was Re: [PATCHES] syntax error position CREATE FUNCTION bug fix)

2004-03-21 Thread Tom Lane
Fabien COELHO [EMAIL PROTECTED] writes:
 I agree with you that both reports should not look the same.

 The good news is that they already do not look the same, thanks
 to the CONTEXT information.

Right, but you quite properly didn't like my quick-hack to psql that
assumes that the presence of any CONTEXT means it's not a top-level
syntax error.  I would like to replace that hack with something cleaner.

 We have in fact misimplemented it, because it is being set for syntax
 errors in internally generated queries too.

 Well, from the parser point of view, it is a little bit messy to have
 to do different things for error reporting in different context. That
 why I would try a one-fit-all solution.

The parser needn't do anything different.  What I'm imagining is that
for internally submitted queries, there will always be an
error_context_stack routine that can transform the error report into
whatever we decide the appropriate format is.

 However it might be better to invent a new error-message field that
 carries just the text of the SQL command, rather than stuffing it into
 CONTEXT.

 I'm not sure I would like the CONTEXT field for that? as it may be
 usefull for a lot of things. In your above example, the CONTEXT is filled
 with 2 different informations that are just messed up for the client.
 If localisation is used, there would be no way for a client to seperate
 them. Different information should require different fields.

The point is that CONTEXT is essentially a record of how we got here.
In a situation where the actual error occurs inside a couple of levels
of nesting, you want to be able to report the outer queries as well as
the one that directly caused the error.  I agree that there's probably
little hope of clients automatically making sense of the CONTEXT info.
But we need to provide it to help people debug complex functions.

 More over, I have other ideas for CONTEXT, which should really be a stack.

It already is a stack.

 The other thing to think about is whether we should invent a new field
 to carry syntax error position for generated queries, rather than making
 'P' do double duty as it does now.

 I think a new field is alas necessary.

I'm leaning in that direction also.  How about

'P': used only for syntax error position in the client-submitted query.

'p': syntax error position in an internally-generated query.

'q': text of an internally-generated query ('p' and 'q' would always
 appear together).

In the case of a non-syntax error in an internally generated query, we
should stick the query text into the CONTEXT stack, since it might not
be the most closely nested context item anyway.

An existing client will ignore the 'p' and 'q' fields, thus providing
behavior that's no worse than what you get with 7.4 now.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Syntax error reporting (was Re: [PATCHES] syntax error position

2004-03-20 Thread Fabien COELHO

Dear Tom,

  However, I still stick with my bad simple idea because the simpler the
  better, and also because of the following example:
  ...
  psql SELECT count_tup('pg_shadow');
  ERROR:  syntax error at or near FRM at character 22
  CONTEXT:  PL/pgSQL function count_tup line 4 at for over execute statement

  As you can notice, the extract is not in the submitted query, so there
  is no point to show it there.

 Yeah.  However, I dislike your solution because it confuses the cases of
 a syntax error in the actually submitted query, and a syntax error in an
 internally generated query.  We should keep these cases clearly separate
 because clients may want to do different things.
 [...]

I agree with you that both reports should not look the same.

The good news is that they already do not look the same, thanks
to the CONTEXT information. However, the context information is
informal (it is just plain English), to it is not easy for a client
to take that into account.

 The original design concept for the 'P' (position) error field is that
 it would be used to locate syntax errors in the *original query*, and
 so its presence is a cue to the client code to go in the direction of
 setting the editing cursor.  (Note the protocol specification says
 index into the original query string.)

Yes, I noted that.

In my proposed patch, I changed it to the processed query, which
may or may not be the initial query.

 We have in fact misimplemented it, because it is being set for syntax
 errors in internally generated queries too.

Well, from the parser point of view, it is a little bit messy to have
to do different things for error reporting in different context. That
why I would try a one-fit-all solution. Maybe I'm a little bit lazy, but
sometimes it is a quality in programming, if it help keep things simple.

 I was already planning to modify plpgsql to send back the full text of
 generated queries when there is an error.  My intention was to supply
 this just as part of the CONTEXT stack, that is instead of your example
 of

 ERROR:  syntax error at or near FRM at character 22
 CONTEXT:  PL/pgSQL function count_tup line 4 at for over execute statement

 you'd get something like

 ERROR:  syntax error at or near FRM at character 22
 CONTEXT:  Executing command SELECT COUNT(*) AS c FRM pg_shadow
 PL/pgSQL function count_tup line 4 at for over execute statement

 However it might be better to invent a new error-message field that
 carries just the text of the SQL command, rather than stuffing it into
 CONTEXT.

I'm not sure I would like the CONTEXT field for that? as it may be
usefull for a lot of things. In your above example, the CONTEXT is filled
with 2 different informations that are just messed up for the client.
If localisation is used, there would be no way for a client to seperate
them. Different information should require different fields.

More over, I have other ideas for CONTEXT, which should really be a stack.
But that is another issue.

 (This is similar to your original patch, but different in detail because
 I'm envisioning sending back generated queries, never the submitted
 query.  Regurgitating the submitted query is just a waste of bandwidth.)

Well, if it is the same I agree. On the other hand, it is a rare case,
during an exception. I would expect submitted queries to work most of the
time, because the client is just some program which uses the database.

 The plus side of that would be that it'd be easy to extract
 for syntax-error highlighting.  The minuses are that existing clients
 would fail to print such a field (the protocol spec says to ignore
 unknown fields

That's a really good design idea, because it allows to include new fields
and still to have old client compatible with the new protocol.

I think it is no big deal if existing clients don't make use if the new
information. It was not there before anyway, so it is just as is was
before.

Moreover, one should distinguish between fields for the human and fields
for the client. reporting the query and the position is more for the
client, giving the context is more for the human. One is use for
some automatic processing, the other is a high level semantical
information to be interpreted by the user.

), and that there is no good way to cope with nested queries.

 A possible compromise is to put the text of the generated SQL command
 into a new field only if the error is a syntax error, and put it into
 the CONTEXT stack otherwise.  Syntax errors couldn't be nested so at,
 least that problem goes away.  This seems a bit messy though.

I don't like it, because above comments.

 The other thing to think about is whether we should invent a new field
 to carry syntax error position for generated queries, rather than making
 'P' do double duty as it does now.  If we don't do that then we have to
 change the protocol specification to reflect reality.  In any case I
 think it has to be possible to tell very easily from the error message