Re: MaterializedField

Jason Altekruse Fri, 17 Jul 2015 11:40:06 -0700

It is meant to represent that a data type has been materialized for a given
column. Since a column can have the ANY type at planning time, discovering
the actual type is what distinguishes the point at which we create a
MaterializedField.

Unfortunately there is no explicit concept of a non-MaterializedField. We
currently represent this implicitly: a column lacking both data and type
information is considered equivalent to a column that does not exist. The
main area where this shows up is in reading JSON files. If a file contains
only nulls, we don't know a type. The current behavior is to defer adding
the field to the schema at all until we can create a MaterializedField with
the type information. Later, when the column is requested as part of an
expression, or when we need to send the final list of requested columns to
the client, we will materialize the type as nullable bigint. Unfortunately
this only "solves" a very limited case, and can cause odd behavior in a
number of other cases, pretty much anything where the user expects to
actually operate on a file with typeless nulls.
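
To make the deferral concrete, here is a rough Java sketch of the behavior
described above. It is not Drill's actual code: TypedField, LazySchema,
observe() and resolve() are hypothetical names invented for illustration,
and the type names are plain strings rather than Drill's type objects. It
only assumes what is said above: null values add nothing to the schema, and
a column that is still unknown when it is requested gets nullable bigint.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a column whose type is known (not Drill's class). */
final class TypedField {
    final String name;
    final String type;

    TypedField(String name, String type) {
        this.name = name;
        this.type = type;
    }
}

/** Hypothetical schema that only holds columns whose type has been discovered. */
final class LazySchema {
    private final Map<String, TypedField> fields = new LinkedHashMap<>();

    /** Called by a reader for each value; a null carries no type, so it adds nothing. */
    void observe(String column, Object value) {
        if (value == null) {
            return; // defer: the column stays indistinguishable from a missing one
        }
        String type = (value instanceof Long || value instanceof Integer) ? "BIGINT"
                : (value instanceof String) ? "VARCHAR"
                : "OTHER";
        fields.putIfAbsent(column, new TypedField(column, type));
    }

    boolean contains(String column) {
        return fields.containsKey(column);
    }

    /** Called when a column is requested; unknown columns fall back to nullable bigint. */
    TypedField resolve(String column) {
        return fields.computeIfAbsent(column, c -> new TypedField(c, "NULLABLE BIGINT"));
    }
}

final class TypelessNullDemo {
    public static void main(String[] args) {
        LazySchema schema = new LazySchema();
        schema.observe("a", null);   // file contains only nulls for "a"
        schema.observe("b", "text"); // "b" has a discoverable type
        System.out.println(schema.contains("a"));
        System.out.println(schema.resolve("a").type);
        System.out.println(schema.resolve("b").type);
    }
}
```

In this sketch the fallback happens only inside resolve(), so anything that
looks at the schema before the column is requested sees no column "a" at
all, which is the source of the odd behavior mentioned above.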

On Thu, Jul 16, 2015 at 11:04 PM, Daniel Barclay <[email protected]>
wrote:

> What exactly is materialized about class
> org.apache.drill.exec.record.MaterializedField?
>
> The name gave me the impression that it would be a field/column with
> its data materialized (as a materialized view has copies of data).
>
> However, MaterializedField doesn't seem to contain data values (just
> field metadata like the name/pathname and data type).
>
> So what exactly does the class represent?  (What's materialized,
> and relative to what?)
>
> Daniel
> --
> Daniel Barclay
> MapR Technologies
>
