[ 
https://issues.apache.org/jira/browse/DRILL-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-6361:
-------------------------------
    Description: 
Drill provides a {{typeof()}} function to return the type of a column. The 
returned string, however, has only the base data type. A Drill data type (a 
"major type") also includes a cardinality (a "mode"). For example, {{Optional 
Int}} or {{Required VarChar}}.

This type information is useful for handling data conversions. For example, if 
I could tell that a column value was a {{Nullable Int}}, I could guess that it 
is one Drill invented, and I could merge it, by hand, with the type from 
another file that had actual values.

The two options are equivalent. Either provide a {{modeOf()}} to just return 
cardinality, or a {{dataTypeOf()}} that returns both. (Maybe the {{modeOf()}} 
might be more useful.)

h4. Documentation

Documentation information (extracted from PR):

h5. {{sqlTypeOf()}}

{{sqlTypeOf()}} returns the data type (using the SQL names) whether the column 
is NULL or not. The SQL name is the one that can be used in a CAST statement. 
Thus,

{{sqlTypeOf( CAST(x AS <type> ))}} returns _<type>_ as the type name.

If the type is {{DECIMAL}}, then the type also includes precision and scale. 
Example: {{DECIMAL(6, 3)}}.

h5. {{modeOf()}}

{{modeOf()}} returns the cardinality (mode) of the column as "NOT NULL", 
"NULLABLE" or "ARRAY".

h5. {{drillTypeOf()}}

The {{drillTypeOf()}} function that works just like {{typeOf()}}, but returns 
the internal Drill names even if the value is NULL.

h5. Example

Here is an example usage that highlights our old friend, "nullable int" for a 
missing column:

{noformat}
SELECT sqlTypeOf(a) AS a_type, modeOf(a) AS a_mode FROM `json/all-null.json`;

+----------+-----------+
|  a_type  |  a_mode   |
+----------+-----------+
| INTEGER  | NULLABLE  |
+----------+-----------+
{noformat}

For arrays (repeated) types:

{noformat}
SELECT sqlTypeOf(columns) as col_type, modeOf(columns) as col_mode
FROM `csv/cust.csv`;

+--------------------+-----------+
|      col_type      | col_mode  |
+--------------------+-----------+
| CHARACTER VARYING  | ARRAY     |
+--------------------+-----------+
{noformat}

For non-null types:

{noformat}
SELECT sqlTypeOf(`name`) AS name_type, 
modeOf(`name`) AS name_mode FROM `csvh/cust.csvh`;

+--------------------+------------+
|     name_type      | name_mode  |
+--------------------+------------+
| CHARACTER VARYING  | NOT NULL   |
+--------------------+------------+
{noformat}

  was:
Drill provides a {{typeof()}} function to return the type of a column. The 
returned string, however, has only the base data type. A Drill data type (a 
"major type") also includes a cardinality (a "mode"). For example, {{Optional 
Int}} or {{Required VarChar}}.

This type information is useful for handling data conversions. For example, if 
I could tell that a column value was a {{Nullable Int}}, I could guess that it 
is one Drill invented, and I could merge it, by hand, with the type from 
another file that had actual values.

The two options are equivalent. Either provide a {{modeOf()}} to just return 
cardinality, or a {{dataTypeOf()}} that returns both. (Maybe the {{modeOf()}} 
might be more useful.)


> Provide sqlTypeOf() and modeOf() functions
> ------------------------------------------
>
>                 Key: DRILL-6361
>                 URL: https://issues.apache.org/jira/browse/DRILL-6361
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.13.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>              Labels: doc-impacting, ready-to-commit
>             Fix For: 1.14.0
>
>
> Drill provides a {{typeof()}} function to return the type of a column. The 
> returned string, however, has only the base data type. A Drill data type (a 
> "major type") also includes a cardinality (a "mode"). For example, {{Optional 
> Int}} or {{Required VarChar}}.
> This type information is useful for handling data conversions. For example, 
> if I could tell that a column value was a {{Nullable Int}}, I could guess 
> that it is one Drill invented, and I could merge it, by hand, with the type 
> from another file that had actual values.
> The two options are equivalent. Either provide a {{modeOf()}} to just return 
> cardinality, or a {{dataTypeOf()}} that returns both. (Maybe the {{modeOf()}} 
> might be more useful.)
> h4. Documentation
> Documentation information (extracted from PR):
> h5. {{sqlTypeOf()}}
> {{sqlTypeOf()}} returns the data type (using the SQL names) whether the 
> column is NULL or not. The SQL name is the one that can be used in a CAST 
> statement. Thus,
> {{sqlTypeOf( CAST(x AS <type> ))}} returns _<type>_ as the type name.
> If the type is {{DECIMAL}}, then the type also includes precision and scale. 
> Example: {{DECIMAL(6, 3)}}.
> h5. {{modeOf()}}
> {{modeOf()}} returns the cardinality (mode) of the column as "NOT NULL", 
> "NULLABLE" or "ARRAY".
> h5. {{drillTypeOf()}}
> The {{drillTypeOf()}} function that works just like {{typeOf()}}, but returns 
> the internal Drill names even if the value is NULL.
> h5. Example
> Here is an example usage that highlights our old friend, "nullable int" for a 
> missing column:
> {noformat}
> SELECT sqlTypeOf(a) AS a_type, modeOf(a) AS a_mode FROM `json/all-null.json`;
> +----------+-----------+
> |  a_type  |  a_mode   |
> +----------+-----------+
> | INTEGER  | NULLABLE  |
> +----------+-----------+
> {noformat}
> For arrays (repeated) types:
> {noformat}
> SELECT sqlTypeOf(columns) as col_type, modeOf(columns) as col_mode
> FROM `csv/cust.csv`;
> +--------------------+-----------+
> |      col_type      | col_mode  |
> +--------------------+-----------+
> | CHARACTER VARYING  | ARRAY     |
> +--------------------+-----------+
> {noformat}
> For non-null types:
> {noformat}
> SELECT sqlTypeOf(`name`) AS name_type, 
> modeOf(`name`) AS name_mode FROM `csvh/cust.csvh`;
> +--------------------+------------+
> |     name_type      | name_mode  |
> +--------------------+------------+
> | CHARACTER VARYING  | NOT NULL   |
> +--------------------+------------+
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to