Problem Description

Currently, Arrow schemas can only contain columns of types supported by Arrow. In some cases an Arrow schema maps to an external schema, and the Arrow schema may then be unable to represent all of the columns from the external schema.
Consider an external system that contains a column of type UUID. To model the schema in Arrow, the user has two choices:

1. Do not include the UUID column in the Arrow schema.
2. Map the column to an existing Arrow type. This loses the original type information: a UUID can be mapped to a FixedSizeBinary, but consumers of the Arrow schema will be unable to distinguish a FixedSizeBinary field from a UUID field.

Possible Solution

* Add a new type code that represents unsupported types
* Values for the new type are represented as variable length binary

Some drivers can expose data even when they don’t understand the data type. For example, the PostgreSQL driver will return the raw bytes for fields of an unknown type. Using an explicit type lets clients know that they should convert the values if they are able to determine the actual data type.

Questions

* What is the impact on existing clients when they encounter fields of the unsupported type?
* Is it safe to assume that all unsupported values can be converted to variable length binary?
* How can we preserve information about the original type?
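
To make the UUID example and the last question concrete, here is a minimal sketch in plain Python (standard library only). It does not use any Arrow API: the `Field` class, the `"unsupported"` type name, and the `original_type` metadata key are all hypothetical stand-ins chosen for illustration. It assumes that the original type name could travel with the field as string metadata, which is one possible answer to the question above, not a decided design.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Union

# Hypothetical metadata key; not defined by Arrow.
ORIGINAL_TYPE_KEY = "original_type"


@dataclass
class Field:
    """Hypothetical stand-in for a schema field (NOT an Arrow class)."""
    name: str
    type_name: str  # e.g. "fixed_size_binary(16)" or "unsupported"
    metadata: Dict[str, str] = field(default_factory=dict)


def map_as_fixed_size_binary(name: str) -> Field:
    """Choice 2 from the problem description: reuse an existing type.
    Nothing in the result says the column was a UUID."""
    return Field(name, "fixed_size_binary(16)")


def map_as_unsupported(name: str, original_type: str) -> Field:
    """Proposed approach (sketch): an explicit 'unsupported' type whose
    values are raw bytes, with the original type name kept in metadata."""
    return Field(name, "unsupported", {ORIGINAL_TYPE_KEY: original_type})


def decode(fld: Field, raw: bytes) -> Union[uuid.UUID, bytes]:
    """Convert raw bytes only when the field identifies the original type;
    otherwise hand the bytes back unchanged."""
    if fld.type_name == "unsupported" and fld.metadata.get(ORIGINAL_TYPE_KEY) == "uuid":
        return uuid.UUID(bytes=raw)
    return raw


value = uuid.UUID("12345678-1234-5678-1234-567812345678")
plain = map_as_fixed_size_binary("id")
tagged = map_as_unsupported("id", "uuid")

print(type(decode(plain, value.bytes)).__name__)   # consumer only sees bytes
print(type(decode(tagged, value.bytes)).__name__)  # consumer can rebuild the UUID
```

A consumer that does not recognize the recorded type name falls through to the raw bytes, which matches the behavior described for drivers that return raw bytes for unknown types.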