Makes sense.
We have to agree on the scope of this implementation.
Right now the implementation I have in Java handles only the
union {null, [some type]} situation.
Are we OK with this for a start?
Beyond that, what I would like to handle is:
1) union {string, double} (although we have to specify the behavior for NaN and
positive and negative infinity); union {string, boolean}; ….
2) Make decimal an Avro first-class type. The current logical type approach is
not natural in JSON (see https://issues.apache.org/jira/browse/AVRO-2164).
For 1.9.x, 2) is probably a non-starter.
Let me know.
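
To make the scope concrete, here is a minimal Python sketch (not the Java implementation; all function names are hypothetical) of the current and the "natural" encoding for union {null, string}. It also shows why union {string, double} needs an explicit rule: standard JSON has no literal for NaN or the infinities, and writing them as strings such as "NaN" would collide with the string branch.

```python
import json


def encode_current(value, branch):
    """Current Avro JSON encoding of a value in a ["null", branch] union."""
    # null is written as a bare null; any other branch is wrapped in an
    # object keyed by its type name, e.g. {"string": "foo"}.
    return json.dumps(None if value is None else {branch: value})


def encode_natural(value):
    """Hypothetical 'natural' encoding of a ["null", T] union: no wrapper."""
    return json.dumps(value)


def encode_double_strict(value):
    """Encode a double, refusing values that standard JSON cannot represent."""
    # With allow_nan=False, json.dumps raises ValueError for NaN and
    # +/-Infinity instead of emitting the non-standard tokens NaN / Infinity.
    return json.dumps(value, allow_nan=False)


print(encode_current("foo", "string"))  # {"string": "foo"}
print(encode_natural("foo"))            # "foo"
print(encode_natural(None))             # null

try:
    encode_double_strict(float("nan"))
except ValueError as err:
    print("needs a rule in the spec:", err)
```

Whatever the spec decides for NaN and the infinities, a decoder for union {string, double} has to be able to tell them apart from ordinary strings.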
—Z
> On Jan 14, 2020, at 12:09 PM, roger peppe <[email protected]> wrote:
>
>
> On Tue, 14 Jan 2020 at 15:00, Zoltan Farkas <[email protected]
> <mailto:[email protected]>> wrote:
> I can go ahead and create a PR to add the Encoder/Decoder implementations.
> Let me know if anyone else plans to do that (to avoid wasting time).
>
> Hi,
>
> Before you do that, would it be possible to write a specification for exactly
> what the conventions are and publish it somewhere? There are a bunch of edge
> cases that could be done in different ways, I think.
>
> That way people like me that don't use Java can implement the same spec. (and
> also it's useful to know exactly what one is implementing before diving in
> and writing the code :])
>
> cheers,
> rog.
>
>
> thanks
>
> —Z
>
>> On Jan 9, 2020, at 3:51 AM, Driesprong, Fokko <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>> Thanks for chipping in Zoltan and Sean. I did not plan to change the current
>> JSON encoder. My initial suggestion would make this an option that the user
>> can set. The default will be the current situation, so nothing should change
>> when upgrading to a newer version of Avro.
>>
>> Cheers, Fokko
>>
>> Op wo 8 jan. 2020 om 21:39 schreef Sean Busbey <[email protected]
>> <mailto:[email protected]>>:
>> I agree with Zoltan here. We have a really long history of maintaining
>> compatibility for encoders.
>>
>> On Tue, Jan 7, 2020 at 10:06 AM Zoltan Farkas <[email protected]
>> <mailto:[email protected]>> wrote:
>> Fokko,
>>
>> I am not sure we should be changing the existing JSON encoder.
>> I think we should just add another encoder, and devs can use either one of
>> them based on their use case… and stay backward compatible.
>>
>> We should maybe standardize the content types for them… I have seen
>> application/avro being used for binary; for JSON we could have
>> application/avro+json for the current format and application/avro.2+json for
>> the new format….
>>
>> At some point in the future we could deprecate the old one…
>>
>> —Z
>>
>>
>>> On Jan 7, 2020, at 2:41 AM, Driesprong, Fokko <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>> I would be a great fan of this as well. This also bothered me. The tricky
>>> part here is to see when to release this because it will break the existing
>>> JSON structure. We could make this configurable as well.
>>>
>>> Cheers, Fokko
>>>
>>> Op ma 6 jan. 2020 om 22:36 schreef roger peppe <[email protected]
>>> <mailto:[email protected]>>:
>>> That's great, thanks! I thought this would probably have come up before.
>>>
>>> Have you written down your changes in a somewhat more formal specification
>>> document, by any chance?
>>>
>>> cheers,
>>> rog.
>>>
>>>
>>> On Mon, 6 Jan 2020, 18:50 zoly farkas, <[email protected]
>>> <mailto:[email protected]>> wrote:
>>> I think there is consensus that this should be implemented, see [AVRO-1582]
>>> Json serialization of nullable fileds and fields with default values
>>> improvement. - ASF JIRA <https://issues.apache.org/jira/browse/AVRO-1582>
>>>
>>>
>>> Here is a live example to get some sample data in Avro JSON:
>>> https://demo.spf4j.org/example/records/1?_Accept=application/avro%2Bjson
>>> and the "natural" JSON:
>>> https://demo.spf4j.org/example/records/1?_Accept=application/json
>>> using the encoder suggested as the implementation in the JIRA.
>>>
>>> Somebody needs to find the time to do the work to integrate this...
>>>
>>> --Z
>>>
>>>
>>>
>>>
>>> On Monday, January 6, 2020, 12:36:44 PM EST, roger peppe
>>> <[email protected] <mailto:[email protected]>> wrote:
>>>
>>>
>>> Hi,
>>>
>>> The JSON encoding in the specification
>>> <https://avro.apache.org/docs/current/spec.html#json_encoding> includes an
>>> explicit type name for all kinds of object other than null. This means that
>>> a JSON-encoded Avro value with a union is very rarely directly compatible
>>> with normal JSON formats.
>>>
>>> For example, it's very common for a JSON-encoded value to allow a value
>>> that's either null or string. In Avro, that's trivially expressed as the
>>> union type ["null", "string"]. With conventional JSON, a string value "foo"
>>> would be encoded just as "foo", which is easily distinguished from null
>>> when decoding. However when using the Avro JSON format it must be encoded
>>> as {"string": "foo"}.
>>>
>>> This means that Avro JSON-encoded values don't interchange easily with
>>> other JSON-encoded values.
>>>
>>> AFAICS the main reason that the type name is always required in
>>> JSON-encoded unions is to avoid ambiguity. This particularly applies to
>>> record and map types, where it's not possible in general to tell which
>>> member of the union has been specified by looking at the data itself.
>>>
>>> However, that reasoning doesn't apply if all the members of the union can
>>> be distinguished from their JSON token type.
>>>
>>> I am considering using a JSON encoding that omits the type name when all
>>> the members of the union encode to distinct JSON token types (the JSON
>>> token types being: null, boolean, string, number, object and array).
>>>
>>> For example, JSON-encoded values using the Avro schema ["null", "string",
>>> "int"] would encode as the literal values themselves (e.g. null, "foo",
>>> 999), but JSON-encoded values using the Avro schema ["int", "double"] would
>>> require the type name because the JSON lexeme doesn't distinguish between
>>> different kinds of number.
>>>
>>> This would mean that it would be possible to represent a significant subset
>>> of "normal" JSON schemas with Avro. It seems to me that would potentially
>>> be very useful.
>>>
>>> Thoughts? Is this a really bad idea to be contemplating? :)
>>>
>>> cheers,
>>> rog.
>>>
>>>
>>
>
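
For reference, the rule proposed in roger's original message at the bottom of this thread (omit the type name when all members of the union encode to distinct JSON token types) can be sketched in Python as follows. This is only an illustration under stated assumptions, not a specification: the names JSON_TOKEN, token_type and can_omit_type_names are hypothetical, the token mapping follows the JSON encoding table in the Avro spec, and references to named types by name are not handled.

```python
# JSON token type produced by each Avro type under the spec's JSON encoding.
JSON_TOKEN = {
    "null": "null",
    "boolean": "boolean",
    "int": "number", "long": "number", "float": "number", "double": "number",
    "string": "string", "bytes": "string", "enum": "string", "fixed": "string",
    "record": "object", "map": "object",
    "array": "array",
}


def token_type(schema):
    """Return the JSON token type for a primitive name or a {"type": ...} schema."""
    name = schema["type"] if isinstance(schema, dict) else schema
    return JSON_TOKEN[name]


def can_omit_type_names(union):
    """True if every member of the union maps to a different JSON token type."""
    tokens = [token_type(member) for member in union]
    return len(tokens) == len(set(tokens))


print(can_omit_type_names(["null", "string", "int"]))  # True
print(can_omit_type_names(["int", "double"]))          # False
print(can_omit_type_names(["null", {"type": "map", "values": "int"},
                           {"type": "record", "name": "R", "fields": []}]))  # False
```

The last call shows the record/map case mentioned in the original message: both encode to a JSON object, so the type name is still required there.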