Hi Wendy,

On Thu, Oct 10, 2013 at 12:25 PM, Wendy Roome <[email protected]> wrote:

> I hadn't realized JSON parsers have gotten that sophisticated -- I've just
> used the standard DOM and SAX parsers.
>
> That said, I don't like the idea of returning E_SYNTAX when a field has
> the wrong type (or is missing).
>

I am in a CS department with a strong emphasis on type theory, which in
essence captures semantic errors through syntax (e.g., a mismatch in the
type system). Hence, I am comfortable with a syntax error meaning a wrong
type or a missing field :-) But I agree that this can be confusing.

> That strikes me as very confusing to a client programmer. As currently
> defined, E_SYNTAX doesn't say where the error occurred. Odds are that the
> client programmer will use a generic JSON parser to verify the JSON she
> sent, and that parser will say the JSON is perfectly valid. And then
> programmer will go, "What the ….?"
>
> So I'd reserve E_SYNTAX for pure JSON syntax errors. Use E_MISSING_FIELD
> or E_INVALID_FIELD_TYPE for schema violations.
>

Agree that if the server's parser can identify such schema errors, the
server should be explicit. The question is whether the protocol should
require the server to distinguish all such cases. To avoid limiting the
choice of JSON parsers, we can add a sentence to the spec saying that
E_SYNTAX indicates that the server's JSON parser could not parse the
request. If it helps, one possibility is to rename E_SYNTAX to E_PARSING.
How does this sound?

>
> I didn't go into the details of the JSON parsers you cited, but wouldn't
> the parser distinguish between "JSON syntax error" and "doesn't follow
> schema"? I believe XML parsers make that distinction when validating
> against a DTD.
>

From what I know, the parser may return just a null result.

> Also, having written a few compilers early in my career, I learned the
> hard way that it was better to extend the YACC grammar to accept some
> common errors, and detect them in action routines and give the user a
> detailed explanation of the error.
> When I let YACC give a generic "syntax
> error", users would bang on my door telling me the compiler was broken.
>

Totally agree with your experience. I teach Introduction to Programming,
and I see first-hand how frustrated the students are when they get generic
syntax error messages. I am not sure how extensible the JSON parsers are.

As a best practice, the ALTO Server programmer should make errors as
specific as possible. We can add such a sentence to the spec to encourage
this, but without the interop keywords (e.g., MUST, SHOULD), because it
will not have an impact on interop. What do you think?

It is always a great discussion!

Richard

> - Wendy
>
> From: "Y. Richard Yang" <[email protected]>
> Date: Wed, October 9, 2013 17:31
> To: Wendy Roome <[email protected]>
> Cc: IETF ALTO <[email protected]>
> Subject: Re: [alto] Is E_INVALID_FIELD_TYPE necessary?
>
> On Wed, Oct 9, 2013 at 3:37 PM, Wendy Roome <[email protected]>
> wrote:
>
>> Do we really need a separate E_INVALID_FIELD_TYPE error code? Why not
>> just fold it into E_MISSING_FIELD, by defining that as "The field is
>> missing or it has the wrong type."
>>
>> First reason it's unnecessary: if the protocol says the client must
>> provide a String field named "cost-type", and the client defines
>> "cost-type" as an array, well, the STRING field is missing, isn't it?
>>
>> Second reason: JSON libraries rarely distinguish between "missing" and
>> "wrong type". Eg, getString("foo") usually returns null if "foo" doesn't
>> exist or if it exists but isn't a string. The server has to do additional
>> analysis to distinguish between the two cases.
>>
>>
> Suppose one uses Data Binding (e.g.,
> http://wiki.fasterxml.com/JacksonInFiveMinutes) for deserialization. A
> wrong type, in a strong typed language, will cause the deserialization to
> fail, depending on specified features (
> https://github.com/FasterXML/jackson-databind/wiki/Deserialization-Features).
> Hence, if one uses a strict parser, a wrong type (e.g., a number but should
> be a string) will cause the parser to fail, and the server may only know
> that it is a syntax error E_SYNTAX. I agree that a missing field will cause
> the same. In this sense, they are all syntax errors in a PL sense. Hence, a
> more appropriate coarse error is E_SYNTAX. In this sense, E_SYNTAX is a
> base class of two specific sub types (missing or wrong type). What do you
> think?
>
>
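
To make the three cases in this thread concrete, here is a minimal Python
sketch of a server-side check that reports E_SYNTAX only when the JSON
itself cannot be parsed, and otherwise names the offending field. It is an
illustration under assumptions, not ALTO spec text: the function name
classify_request, the REQUIRED_FIELDS table, and the choice to report a
non-object top-level value as E_INVALID_FIELD_TYPE are all hypothetical.
Only the error code names and the String field "cost-type" come from the
discussion above.

```python
import json

# Hypothetical schema for illustration: required field name -> expected type.
REQUIRED_FIELDS = {"cost-type": str}


def classify_request(body):
    """Return (error_code, field_name); (None, None) means no error found."""
    try:
        doc = json.loads(body)
    except json.JSONDecodeError:
        # Pure JSON syntax error: a generic JSON validator would also reject it.
        return ("E_SYNTAX", None)

    # Assumption: a request body must be a JSON object at the top level.
    if not isinstance(doc, dict):
        return ("E_INVALID_FIELD_TYPE", None)

    for name, expected in REQUIRED_FIELDS.items():
        if name not in doc:
            return ("E_MISSING_FIELD", name)
        if not isinstance(doc[name], expected):
            return ("E_INVALID_FIELD_TYPE", name)

    return (None, None)


if __name__ == "__main__":
    samples = [
        '{"cost-type": "some-value"}',    # well-formed, field present
        '{"cost-type": ["some-value"]}',  # valid JSON, wrong type
        '{}',                             # valid JSON, field missing
        '{"cost-type": ',                 # not valid JSON
    ]
    for body in samples:
        print(body, "->", classify_request(body))
```

The extra analysis costs one membership test and one type test per field. A
server built on a strict data-binding parser may not be able to separate the
cases this way, which is the situation where the coarse E_SYNTAX fallback
would apply.
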
_______________________________________________
alto mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/alto
