Hi Wendy,

On Thu, Oct 10, 2013 at 12:25 PM, Wendy Roome <[email protected]> wrote:

> I hadn't realized JSON parsers have gotten that sophisticated -- I've just
> used the standard DOM and SAX parsers.
>
> That said, I don't like the idea of returning E_SYNTAX when a field has
> the wrong type (or is missing).
>

I am in a CS department with a strong emphasis on type theory, which in
essence captures semantic errors as syntax errors (e.g., a mismatch in the
type system). Hence, I am comfortable with treating a wrong type or a
missing field as a syntax error :-) But I agree that this can be confusing.

> That strikes me as very confusing to a client programmer. As currently
> defined, E_SYNTAX doesn't say where the error occurred. Odds are that the
> client programmer will use a generic JSON parser to verify the JSON she
> sent, and that parser will say the JSON is perfectly valid. And then
> the programmer will go, "What the ….?"
>
> So I'd reserve E_SYNTAX for pure JSON syntax errors.  Use E_MISSING_FIELD
> or E_INVALID_FIELD_TYPE for schema violations.
>

Agree that if the server's parser can identify such schema errors, the
server should be explicit. A question is whether we specify in the protocol
that the server must distinguish all such cases. To avoid limiting the
choice of JSON parsers, we can add a sentence in the spec that E_SYNTAX
indicates that the server's JSON parser could not parse the request. If it
helps, one possibility is to change E_SYNTAX to E_PARSING. How does this
sound?
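
As a rough sketch of what "being explicit" could look like on the server
side, assuming Python's standard json module: classify_request and
REQUIRED_FIELDS are hypothetical names used only for illustration, and the
"cost-type" field is taken from the example earlier in this thread. How to
report a body that is valid JSON but not an object is also my assumption.

```python
import json

# Hypothetical schema for illustration: field name -> required Python type.
REQUIRED_FIELDS = {"cost-type": str}


def classify_request(body):
    """Return (error_code, detail), or (None, parsed_object) on success."""
    try:
        obj = json.loads(body)
    except json.JSONDecodeError as exc:
        # Pure JSON syntax error: a generic JSON validator would reject it too.
        return "E_SYNTAX", f"line {exc.lineno}, column {exc.colno}: {exc.msg}"

    # Assumption: a non-object body is reported as a type error.
    if not isinstance(obj, dict):
        return "E_INVALID_FIELD_TYPE", "request body must be a JSON object"

    for name, expected in REQUIRED_FIELDS.items():
        if name not in obj:
            return "E_MISSING_FIELD", name
        if not isinstance(obj[name], expected):
            return "E_INVALID_FIELD_TYPE", name

    return None, obj
```

With this split, E_SYNTAX is returned only for input that a generic JSON
parser would also reject, which is the case Wendy is worried about.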


>
> I didn't go into the details of the JSON parsers you cited, but wouldn't
> the parser distinguish  between "JSON syntax error" and "doesn't follow
> schema"? I believe XML parsers make that distinction when validating
> against a DTD.
>
>
From what I know, the parser may simply return null as the result.
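
To illustrate the concern, here is a small sketch of a binder that behaves
this way. bind_request is a hypothetical helper written for this example,
not the API of any particular JSON library; it only assumes Python's
standard json module.

```python
import json


def bind_request(body):
    """Hypothetical binder: return the "cost-type" string, or None on failure."""
    try:
        obj = json.loads(body)
    except json.JSONDecodeError:
        return None  # malformed JSON
    if not isinstance(obj, dict):
        return None  # valid JSON, but not an object
    value = obj.get("cost-type")
    # A missing field and a field of the wrong type both end up as None here.
    return value if isinstance(value, str) else None
```

A caller that only sees None cannot tell a JSON syntax error from a missing
field or a wrong type without doing additional analysis of the request.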


> Also, having written a few compilers early in my career, I learned the
> hard way that it was better to extend the YACC grammar to accept some
> common errors, and detect them in action routines and give the user a
> detailed explanation of the error. When I let YACC give a generic "syntax
> error", users would bang on my door telling me the compiler was broken.
>
>
Totally agree with your experience. I teach Introduction to Programming,
and I see first hand how frustrated the students are when they get
generic syntax messages. I am not sure how extensible the JSON parsers are.
As a best practice, the ALTO Server programmer should make errors as
specific as possible. We can add such a sentence in the spec to encourage
this, but without the normative keywords (e.g., MUST, SHOULD), because it
will not have an impact on interop. What do you think?
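
One way to be specific while keeping E_SYNTAX as the coarse fallback is to
model the specific errors as subtypes of the coarse one, as suggested in my
earlier message. The sketch below is only an illustration in Python; the
class names and require_string are hypothetical, and the message reports the
Python type name rather than the JSON type name.

```python
class AltoSyntaxError(Exception):
    """Coarse error, reported as E_SYNTAX (hypothetical class name)."""
    code = "E_SYNTAX"


class MissingFieldError(AltoSyntaxError):
    """Specific subtype: a required field is absent."""
    code = "E_MISSING_FIELD"


class InvalidFieldTypeError(AltoSyntaxError):
    """Specific subtype: a field is present but has the wrong type."""
    code = "E_INVALID_FIELD_TYPE"


def require_string(obj, name):
    """Return obj[name] if it is a string, else raise a specific error."""
    if name not in obj:
        raise MissingFieldError(f"required field '{name}' is missing")
    if not isinstance(obj[name], str):
        actual = type(obj[name]).__name__  # Python type name, e.g. 'list'
        raise InvalidFieldTypeError(
            f"field '{name}' must be a string, got {actual}")
    return obj[name]
```

Code that only handles AltoSyntaxError still catches both subtypes, while
the message tells the client programmer which field is at fault.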

It is always a great discussion!

Richard



> - Wendy
>
> From: "Y. Richard Yang" <[email protected]>
> Date: Wed, October 9, 2013 17:31
> To: Wendy Roome <[email protected]>
> Cc: IETF ALTO <[email protected]>
> Subject: Re: [alto] Is E_INVALID_FIELD_TYPE necessary?
>
> On Wed, Oct 9, 2013 at 3:37 PM, Wendy Roome <[email protected]>
> wrote:
>
>> Do we really need a separate E_INVALID_FIELD_TYPE error code?  Why not
>> just fold it into E_MISSING_FIELD, by defining that as "The field is
>> missing or it has the wrong type."
>>
>> First reason it's unnecessary: if the protocol says the client must
>> provide a String field named "cost-type", and the client defines
>> "cost-type" as an array, well, the STRING field is missing, isn't it?
>>
>> Second reason: JSON libraries rarely distinguish between "missing" and
>> "wrong type". Eg, getString("foo") usually returns null if "foo" doesn't
>> exist or if it exists but isn't a string. The server has to do additional
>> analysis to distinguish between the two cases.
>>
>>
> Suppose one uses Data Binding (e.g.,
> http://wiki.fasterxml.com/JacksonInFiveMinutes) for deserialization. A
> wrong type, in a strongly typed language, will cause the deserialization to
> fail, depending on specified features (
> https://github.com/FasterXML/jackson-databind/wiki/Deserialization-Features).
> Hence, if one uses a strict parser, a wrong type (e.g., a number where a
> string is expected) will cause the parser to fail, and the server may only know
> that it is a syntax error E_SYNTAX. I agree that a missing field will cause
> the same. In this sense, they are all syntax errors in a PL sense. Hence, a
> more appropriate coarse error is E_SYNTAX. In this sense, E_SYNTAX is a
> base class of two specific sub types (missing or wrong type). What do you
> think?
>
>
_______________________________________________
alto mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/alto
