amoeba commented on code in PR #561:
URL: https://github.com/apache/arrow-go/pull/561#discussion_r2505783511


##########
parquet/doc.go:
##########
@@ -75,6 +75,60 @@
 //
 // Tip: Some platforms don't necessarily support all kinds of encodings. If you're not
 // sure what to use, just use Plain and Dictionary encoding.
+//
+// # Arrow to Parquet Type Mappings
+//
+// When reading and writing Parquet, the parquet package converts between Arrow
+// and Parquet types in the manner described in the table below.
+//
+// When converting a Parquet type where a large and non-large offset Arrow type
+// would work, the non-large variant is chosen. If the Parquet file is written
+// with `WithStoreSchema`, types will be preserved and dictionaries will be
+// restored when round-tripping.
+//
+//     Arrow Type              Parquet Physical Type     Parquet Logical Type
+//     ----------              ---------------------     --------------------
+//     NULL                    Int32                     Null
+//     BOOL                    Boolean                   -
+//     INT8                    Int32                     Int(8, signed)
+//     UINT8                   Int32                     Int(8, unsigned)
+//     INT16                   Int32                     Int(16, signed)
+//     UINT16                  Int32                     Int(16, unsigned)
+//     INT32                   Int32                     Int(32, signed)
+//     UINT32                  Int32                     Int(32, unsigned)
+//     INT64                   Int64                     Int(64, signed)
+//     UINT64                  Int64                     Int(64, unsigned)
+//     FLOAT16                 FixedLenByteArray(2)      Float16
+//     FLOAT32                 Float                     -
+//     FLOAT64                 Double                    -
+//     STRING                  ByteArray                 String
+//     LARGE_STRING            ByteArray                 String
+//     BINARY                  ByteArray                 -
+//     LARGE_BINARY            ByteArray                 -
+//     FIXED_SIZE_BINARY       FixedLenByteArray         -
+//     DECIMAL128              Int32/Int64/FLBA*         Decimal
+//     DECIMAL256              Int32/Int64/FLBA*         Decimal
+//     DATE32                  Int32                     Date
+//     DATE64                  Int32                     Date
+//     TIMESTAMP               Int64 or Int96            Timestamp
+//     TIME32                  Int32                     Time(millis)
+//     TIME64                  Int64                     Time(micros/nanos)
+//     LIST                    Group (LIST)              -
+//     FIXED_SIZE_LIST         Group (LIST)              -
+//     STRUCT                  Group                     -
+//     MAP                     Group (MAP)               -
+//     DICTIONARY              (converted to value type) -
+//     EXTENSION               (depends on storage)      (may be custom)
+//
+// * FLBA means FixedLenByteArray
+//
+// Unsupported Arrow Types (will return arrow.ErrNotImplemented):
+//
+//     DURATION, INTERVAL_MONTHS, INTERVAL_DAY_TIME, INTERVAL_MONTH_DAY_NANO
+//     SPARSE_UNION, DENSE_UNION
+//     STRING_VIEW, BINARY_VIEW, LIST_VIEW, LARGE_LIST_VIEW
+//     LARGE_LIST, RUN_END_ENCODED

Review Comment:
   I think so. This test passes with a `LARGE_LIST` case added to the unsupported types:
   
   ```patch
   diff --git a/parquet/pqarrow/schema_test.go b/parquet/pqarrow/schema_test.go
   index 6f5d14c7..f9817492 100644
   --- a/parquet/pqarrow/schema_test.go
   +++ b/parquet/pqarrow/schema_test.go
   @@ -439,6 +439,7 @@ func TestUnsupportedTypes(t *testing.T) {
                {typ: &arrow.MonthDayNanoIntervalType{}},
                {typ: &arrow.DenseUnionType{}},
                {typ: &arrow.SparseUnionType{}},
   +            {typ: arrow.LargeListOf(arrow.PrimitiveTypes.Int32)},
        }
        for _, tc := range unsupportedTypes {
                t.Run(tc.typ.ID().String(), func(t *testing.T) {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
