This is an automated email from the ASF dual-hosted git repository.
blue pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/parquet-format.git
The following commit(s) were added to refs/heads/master by this push:
new 855a58c Variant: Remove JSON conversion table (#485)
855a58c is described below
commit 855a58cda2a7b82c527fea2f1e7ee0d4db49114b
Author: Ryan Blue <[email protected]>
AuthorDate: Mon Feb 24 13:39:30 2025 -0800
Variant: Remove JSON conversion table (#485)
---
VariantEncoding.md | 29 -----------------------------
1 file changed, 29 deletions(-)
diff --git a/VariantEncoding.md b/VariantEncoding.md
index b091165..65c8370 100644
--- a/VariantEncoding.md
+++ b/VariantEncoding.md
@@ -473,32 +473,3 @@ A single Variant object may have poor read performance when only a small subset
A better approach is to create separate columns for individual fields, referred to as shredding or subcolumnarization.
[VariantShredding.md](VariantShredding.md) describes the Variant shredding specification in Parquet.
-## Conversion to JSON
-
-Values stored in the Variant encoding are a superset of JSON values.
-For example, a Variant value can be a date that has no equivalent type in JSON.
-To maximize compatibility with readers that can process JSON but not Variant, the following conversions should be used when producing JSON from a Variant:
-
-| Variant type     | JSON type | Representation requirements                              | Example                                  |
-|------------------|-----------|----------------------------------------------------------|------------------------------------------|
-| Null type        | null      | `null`                                                   | `null`                                   |
-| Boolean          | boolean   | `true` or `false`                                        | `true`                                   |
-| Exact Numeric    | number    | Digits in fraction must match scale, no exponent         | `34`, `34.00`                            |
-| Float            | number    | Fraction must be present                                 | `14.20`                                  |
-| Double           | number    | Fraction must be present                                 | `1.0`                                    |
-| Date             | string    | ISO-8601 formatted date                                  | `"2017-11-16"`                           |
-| Time             | string    | ISO-8601 formatted UTC time                              | `"22:31:08.000001"`                      |
-| Timestamp (6)    | string    | ISO-8601 formatted UTC timestamp including +00:00 offset | `"2017-11-16T22:31:08.000001+00:00"`     |
-| Timestamp (9)    | string    | ISO-8601 formatted UTC timestamp including +00:00 offset | `"2017-11-16T22:31:08.000000001+00:00"`  |
-| TimestampNTZ (6) | string    | ISO-8601 formatted UTC timestamp with no offset or zone  | `"2017-11-16T22:31:08.000001"`           |
-| TimestampNTZ (9) | string    | ISO-8601 formatted UTC timestamp with no offset or zone  | `"2017-11-16T22:31:08.000000001"`        |
-| Binary           | string    | Base64 encoded binary                                    | `"dmFyaWFudAo="`                         |
-| String           | string    |                                                          | `"variant"`                              |
-| UUID             | string    |                                                          | `"f79c3e09-677c-4bbd-a479-3f349cb785e7"` |
-| Array            | array     |                                                          | `[34, "abc", "2017-11-16]`               |
-| Object           | object    |                                                          | `{"id": 34, "data": "abc"}`              |
-
-Notes:
-
-* For timestamp and timestampntz, values must use microsecond precision and trailing 0s are required
-* For float and double, infinities and not a number values are encoded as strings: `"Infinity"`, `"-Infinity"`, and `"NaN"`
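
For readers who relied on the removed section, the conversions it listed can be sketched roughly as follows. This is a minimal illustration, not part of the specification or of any Parquet library: the function name `variant_to_json` is hypothetical, and ordinary Python values stand in for decoded Variant values (`Decimal` for exact numerics, `bytes` for binary, aware and naive `datetime` for Timestamp and TimestampNTZ). Nanosecond timestamps and floats that Python prints in exponent form are not handled.

```python
import base64
import datetime as dt
import json
import math
import uuid
from decimal import Decimal


def variant_to_json(value):
    """Render a Python stand-in for a Variant value as JSON text (hypothetical helper)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, Decimal):
        # Fixed-point formatting keeps the fraction digits equal to the scale
        # and never produces an exponent.
        return format(value, "f")
    if isinstance(value, float):
        if math.isnan(value):
            return '"NaN"'
        if math.isinf(value):
            return '"Infinity"' if value > 0 else '"-Infinity"'
        return repr(value)  # keeps a fraction, e.g. 1.0; exponent forms not handled
    if isinstance(value, dt.datetime):  # check before date: datetime is a date subclass
        if value.tzinfo is not None:
            # Aware values are normalized to UTC so the offset renders as +00:00.
            value = value.astimezone(dt.timezone.utc)
        return json.dumps(value.isoformat(timespec="microseconds"))
    if isinstance(value, dt.date):
        return json.dumps(value.isoformat())
    if isinstance(value, dt.time):
        return json.dumps(value.isoformat(timespec="microseconds"))
    if isinstance(value, bytes):
        return json.dumps(base64.b64encode(value).decode("ascii"))
    if isinstance(value, (str, uuid.UUID)):
        return json.dumps(str(value))
    if isinstance(value, list):
        return "[" + ", ".join(variant_to_json(v) for v in value) + "]"
    if isinstance(value, dict):
        items = (json.dumps(k) + ": " + variant_to_json(v) for k, v in value.items())
        return "{" + ", ".join(items) + "}"
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

For example, `variant_to_json(Decimal("34.00"))` keeps both fraction digits, and `variant_to_json(b"variant\n")` yields the Base64 string used in the table's Binary row.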