Can a logical extension be based on another logical extension? HOCON support might be nice.
-----Original Message-----
From: Micah Kornfield <emkornfi...@gmail.com>
Sent: Monday, November 28, 2022 11:50 AM
To: dev@arrow.apache.org
Subject: Re: [DISCUSS] JSON Canonical Extension Type

This seems like a reasonable definition to me. Since there hasn't been much
feedback, I think maybe following through with an implementation + this
description in a PR would be the next steps.

If there isn't further feedback on this, once the PR is up we can try to
vote (which might bring up some more feedback, but hopefully wouldn't cause
too much implementation churn).

Thanks,
Micah

On Thu, Nov 17, 2022 at 3:58 PM Pradeep Gollakota <pgollak...@google.com.invalid> wrote:

> Hi folks!
>
> I put together this specification for canonicalizing the JSON type in
> Arrow.
>
> ## Introduction
> JSON is a widely used text-based data interchange format. There are
> many use cases where a user has a column whose contents are a JSON
> encoded string. BigQuery's [JSON Type][1] and Parquet’s [JSON Logical
> Type][2] are two such examples.
>
> The JSON specification is defined in [RFC-8259][3]. However, many of
> the most popular parsers support non-standard extensions. Examples of
> non-standard extensions to JSON include comments, unquoted keys,
> trailing commas, etc.
>
> ## Extension Specification
> * The name of the extension is `arrow.json`
> * The storage type of the extension is `utf8`
> * The extension type has no parameters
> * The metadata MUST be either empty or a valid JSON object
>   - There is no canonical metadata
>   - Implementations MAY include implementation-specific metadata by
>     using a namespaced key. For example
>     `{"google.bigquery": {"my": "metadata"}}`
> * Implementations...
>   - MUST produce valid UTF-8 encoded text
>   - SHOULD produce valid standard JSON
>   - MAY produce valid non-standard JSON
>   - MUST support parsing standard JSON
>   - MAY support parsing non-standard JSON
>   - SHOULD pass through contents that they do not understand
>
> ## Forward compatibility
> In the future we might allow this logical type to annotate a byte
> storage type with a different text encoding. Implementations
> consuming JSON logical types should verify this.
>
> [1]: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#json_type
> [2]: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#json
> [3]: https://datatracker.ietf.org/doc/html/rfc8259
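
For anyone who wants to try the rules in the quoted proposal before a PR exists, here is a rough sketch in plain Python (standard library only, no Arrow dependency). It checks two of the stated requirements: that the extension metadata is either empty or a valid JSON object, and that a value parses as standard JSON. The helper names `is_valid_metadata` and `is_standard_json` are made up for illustration and are not part of any Arrow API, and Python's `json` module is only an approximation of a strict RFC 8259 parser.

```python
import json

EXTENSION_NAME = "arrow.json"  # extension name from the proposal


def _reject_constant(token):
    # Python's json.loads accepts NaN, Infinity and -Infinity by default.
    # RFC 8259 does not allow them, so treat them as errors here.
    raise ValueError(f"non-standard JSON constant: {token}")


def is_valid_metadata(serialized: bytes) -> bool:
    """Hypothetical check: metadata MUST be empty or a valid JSON object."""
    if serialized == b"":
        return True
    try:
        parsed = json.loads(serialized.decode("utf-8"),
                            parse_constant=_reject_constant)
    except ValueError:  # covers UnicodeDecodeError and JSONDecodeError
        return False
    # A JSON array, string or number is valid JSON but not a JSON object.
    return isinstance(parsed, dict)


def is_standard_json(value: str) -> bool:
    """Hypothetical check: does the value parse as standard JSON?"""
    try:
        json.loads(value, parse_constant=_reject_constant)
    except ValueError:
        return False
    return True
```

A consumer following the "SHOULD pass through contents that they do not understand" rule would presumably keep a value unchanged when `is_standard_json` returns False, rather than dropping or rewriting it.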