I think the first option is the best, and it is basically what we did for
Apache Arrow in

https://github.com/apache/parquet-cpp/blob/master/src/parquet/arrow/schema.cc

Look at the FromPrimitive function.

If the LogicalType is NONE, then the type to use is simply the physical
type. For example:

static Status FromInt32(const PrimitiveNode* node, TypePtr* out) {
  switch (node->logical_type()) {
    case LogicalType::NONE:
      *out = ::arrow::int32();
      break;
    case LogicalType::UINT_8:
      *out = ::arrow::uint8();
      break;
    case LogicalType::INT_8:
      *out = ::arrow::int8();
      break;
<SNIP>
etc.
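
As a rough, self-contained sketch of the same rule outside of parquet-cpp:
check the logical type first, and fall back to the physical type when it is
NONE. The enums and the ResolveTypeName function below are made up for
illustration (they are not the library's API), and only a few cases are
covered.

```cpp
#include <string>

// Hypothetical stand-ins for the parquet-cpp enums; only a few values shown.
enum class PhysicalType { INT32, INT64 };
enum class LogicalType { NONE, UINT_8, INT_8 };

// Hypothetical helper: returns the name of the type a reader should use.
// The logical type wins when it is set; NONE means "use the physical type".
std::string ResolveTypeName(PhysicalType physical, LogicalType logical) {
  switch (physical) {
    case PhysicalType::INT32:
      switch (logical) {
        case LogicalType::NONE:
          return "int32";  // no annotation: plain physical type
        case LogicalType::UINT_8:
          return "uint8";
        case LogicalType::INT_8:
          return "int8";
      }
      break;
    case PhysicalType::INT64:
      // In this sketch only the un-annotated case is handled for INT64.
      if (logical == LogicalType::NONE) return "int64";
      break;
  }
  return "unsupported";
}
```

With this, an INT32 column whose logical type is NONE (like the one in the
Drill-generated file) resolves to "int32", while the same physical type
annotated with INT_8 resolves to "int8".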

On Thu, Jun 8, 2017 at 10:45 AM, Felipe Aramburu <[email protected]>
wrote:

> OK, to be clear:
>
> Would you say the safest behaviour is:
>
> 1. check for a logical type
>
> 2. if set to none check for a physical type
>
>
> or is it
>
>
> 1. check if the node has is_primitive() set to true
>
> 2. if true use physical type, if false use logical type
>
> Felipe
>
> On Thu, Jun 8, 2017 at 9:40 AM, Wes McKinney <[email protected]> wrote:
>
> > hi Felipe,
> >
> > Yes, that's right. For primitive types it is typical for the
> > LogicalType to be not set in the Thrift metadata. The particular
> > integer logical types were added relatively late to the Parquet format
> > and are not used in all implementations (for example, some databases
> > like Hive and Impala have their own metastores which are used together
> > with Parquet files to cast to the appropriate runtime type, like
> > smallint or tinyint)
> >
> > - Wes
> >
> > On Thu, Jun 8, 2017 at 10:34 AM, Felipe Aramburu <[email protected]>
> > wrote:
> > > I was playing around with some Parquet files that were generated using
> > > Apache Drill, and as I look at the ColumnDescriptors I see that one of
> > > the columns has a logical type of LogicalType::None and a physical type
> > > of Type::Int32.
> > >
> > > Is it normal for this to happen? When the logical type is None and the
> > > ColumnDescriptor's node is_primitive() function returns true, does that
> > > mean I can ignore the logical type and just look at the primitive type
> > > to know how to interpret the data?
> > >
> > > Felipe
> >
>
