This sounds reasonable from an Arrow perspective. You might want to CC the ORC 
list as well, or ask someone there to co-review your work on the adapter.

Uwe

> On 18.10.2020 at 17:24, Ying Zhou <yzhou7...@gmail.com> wrote:
> 
> Hi,
> 
> I’m developing the adapter that converts Arrow Arrays, ChunkedArrays, 
> RecordBatches and Tables into ORC files.
> 
> Given the ORC Specification and the Arrow Columnar Format, here is my 
> current type mapping:
> 
> Type::type::NA -> nullptr
> Type::type::BOOL -> liborc::TypeKind::BOOLEAN
> Type::type::UINT8 -> liborc::TypeKind::BYTE
> Type::type::INT8 -> liborc::TypeKind::BYTE
> Type::type::UINT16 -> liborc::TypeKind::SHORT
> Type::type::INT16 -> liborc::TypeKind::SHORT
> Type::type::UINT32 -> liborc::TypeKind::INT
> Type::type::INT32 -> liborc::TypeKind::INT
> Type::type::INTERVAL_MONTH -> liborc::TypeKind::INT
> Type::type::UINT64 -> liborc::TypeKind::LONG
> Type::type::INT64 -> liborc::TypeKind::LONG
> Type::type::INTERVAL_DAY_TIME -> liborc::TypeKind::LONG
> Type::type::DURATION -> liborc::TypeKind::LONG
> Type::type::HALF_FLOAT -> liborc::TypeKind::FLOAT
> Type::type::FLOAT -> liborc::TypeKind::FLOAT
> Type::type::DOUBLE -> liborc::TypeKind::DOUBLE
> Type::type::STRING -> liborc::TypeKind::STRING
> Type::type::LARGE_STRING -> liborc::TypeKind::STRING
> Type::type::FIXED_SIZE_BINARY -> liborc::TypeKind::CHAR
> Type::type::BINARY -> liborc::TypeKind::BINARY
> Type::type::LARGE_BINARY -> liborc::TypeKind::BINARY
> Type::type::DATE32 -> liborc::TypeKind::DATE
> Type::type::TIMESTAMP -> liborc::TypeKind::TIMESTAMP
> Type::type::TIME32 -> liborc::TypeKind::TIMESTAMP
> Type::type::TIME64 -> liborc::TypeKind::TIMESTAMP
> Type::type::DATE64 -> liborc::TypeKind::TIMESTAMP
> Type::type::DECIMAL -> liborc::TypeKind::DECIMAL
> Type::type::LIST -> liborc::TypeKind::LIST
> Type::type::FIXED_SIZE_LIST -> liborc::TypeKind::LIST
> Type::type::LARGE_LIST -> liborc::TypeKind::LIST
> Type::type::STRUCT -> liborc::TypeKind::STRUCT
> Type::type::MAP -> liborc::TypeKind::MAP
> Type::type::DENSE_UNION -> liborc::TypeKind::UNION
> Type::type::SPARSE_UNION -> liborc::TypeKind::UNION
> Type::type::DICTIONARY -> the ORC version of its value type
> 
> I have some concerns, particularly about the duration types, which don’t 
> exist in Apache ORC and which I therefore have to convert to integers. Is 
> my current mapping reasonable? Thanks!
> 
> Best,
> Ying Zhou
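
For readers following the thread, the mapping quoted above could be sketched
as a single lookup function. This is only an illustration: ArrowType, OrcKind
and ToOrcKind are hypothetical stand-ins so the snippet compiles on its own,
and a real adapter would presumably switch on arrow::Type::type and build
liborc types instead. The sketch reports NA and DICTIONARY as "no direct
kind", on the assumption that the caller handles NA separately and resolves
DICTIONARY through its value type.

```cpp
// Stand-in enums for illustration only; they mirror the names in the
// mapping above but are NOT the real Arrow or ORC declarations.
enum class ArrowType {
  NA, BOOL, UINT8, INT8, UINT16, INT16, UINT32, INT32, INTERVAL_MONTH,
  UINT64, INT64, INTERVAL_DAY_TIME, DURATION, HALF_FLOAT, FLOAT, DOUBLE,
  STRING, LARGE_STRING, FIXED_SIZE_BINARY, BINARY, LARGE_BINARY, DATE32,
  TIMESTAMP, TIME32, TIME64, DATE64, DECIMAL, LIST, FIXED_SIZE_LIST,
  LARGE_LIST, STRUCT, MAP, DENSE_UNION, SPARSE_UNION, DICTIONARY
};

enum class OrcKind {
  BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, STRING, CHAR, BINARY,
  DATE, TIMESTAMP, DECIMAL, LIST, STRUCT, MAP, UNION
};

// Writes the proposed ORC kind to *out and returns true, or returns false
// when the Arrow type has no direct ORC kind in the proposed mapping.
bool ToOrcKind(ArrowType type, OrcKind* out) {
  switch (type) {
    case ArrowType::BOOL: *out = OrcKind::BOOLEAN; return true;
    case ArrowType::UINT8:
    case ArrowType::INT8: *out = OrcKind::BYTE; return true;
    case ArrowType::UINT16:
    case ArrowType::INT16: *out = OrcKind::SHORT; return true;
    case ArrowType::UINT32:
    case ArrowType::INT32:
    case ArrowType::INTERVAL_MONTH: *out = OrcKind::INT; return true;
    case ArrowType::UINT64:
    case ArrowType::INT64:
    case ArrowType::INTERVAL_DAY_TIME:
    case ArrowType::DURATION: *out = OrcKind::LONG; return true;
    case ArrowType::HALF_FLOAT:
    case ArrowType::FLOAT: *out = OrcKind::FLOAT; return true;
    case ArrowType::DOUBLE: *out = OrcKind::DOUBLE; return true;
    case ArrowType::STRING:
    case ArrowType::LARGE_STRING: *out = OrcKind::STRING; return true;
    case ArrowType::FIXED_SIZE_BINARY: *out = OrcKind::CHAR; return true;
    case ArrowType::BINARY:
    case ArrowType::LARGE_BINARY: *out = OrcKind::BINARY; return true;
    case ArrowType::DATE32: *out = OrcKind::DATE; return true;
    case ArrowType::TIMESTAMP:
    case ArrowType::TIME32:
    case ArrowType::TIME64:
    case ArrowType::DATE64: *out = OrcKind::TIMESTAMP; return true;
    case ArrowType::DECIMAL: *out = OrcKind::DECIMAL; return true;
    case ArrowType::LIST:
    case ArrowType::FIXED_SIZE_LIST:
    case ArrowType::LARGE_LIST: *out = OrcKind::LIST; return true;
    case ArrowType::STRUCT: *out = OrcKind::STRUCT; return true;
    case ArrowType::MAP: *out = OrcKind::MAP; return true;
    case ArrowType::DENSE_UNION:
    case ArrowType::SPARSE_UNION: *out = OrcKind::UNION; return true;
    case ArrowType::NA:          // mapped to nullptr in the proposal
    case ArrowType::DICTIONARY:  // resolved through the value type instead
      return false;
  }
  return false;  // unreachable for valid enumerators
}
```

Written this way, the many-to-one cases are easy to spot: DURATION, both
64-bit integers and INTERVAL_DAY_TIME all land on LONG, so the Arrow type
cannot be recovered from the ORC kind alone.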
