[ https://issues.apache.org/jira/browse/CALCITE-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819373#comment-17819373 ]

Mihai Budiu commented on CALCITE-6275:
--------------------------------------

I was looking at Postgres for guidance, but it turns out that Postgres does 
not even allow a type like `INTEGER NOT NULL ARRAY` for a column, so it is 
the wrong model to base this PR on.

The crux of the problem is the following: consider the type INTEGER ARRAY. 
Taking nullability into account, there are actually 4 different related types:
* INTEGER ARRAY
* INTEGER ARRAY NOT NULL
* INTEGER NOT NULL ARRAY
* INTEGER NOT NULL ARRAY NOT NULL

Mathematically these are all distinct types, and all four are legal.
A similar, but even more complicated, set of types exists for MAP types, which 
have 8 combinations (nullability of the key type, of the value type, and of 
the map itself).
The set of possible types gets even more interesting when you consider 
multi-dimensional arrays, such as INTEGER ARRAY ARRAY, or even types such as 
(STRING, INTEGER ARRAY ARRAY) MAP.
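To make the combinatorics concrete, here is a minimal Rust sketch of one of 
these nested types; Rust is used only because it marks nullability explicitly 
with Option, and the pairing with SQL nullability defaults is an assumption 
for illustration, not a statement about Calcite's actual semantics:

```rust
use std::collections::HashMap;

fn main() {
    // One of the 8 nullability combinations of a MAP type: a nullable map
    // from non-null STRING keys to nullable values of type
    // INTEGER ARRAY ARRAY (inner arrays and elements non-null here).
    let m: Option<HashMap<String, Option<Vec<Vec<i32>>>>> =
        Some(HashMap::from([("xs".to_string(), Some(vec![vec![1, 2]]))]));

    // The 8 combinations arise from independently toggling nullability of
    // the key type, the value type, and the map itself: 2 * 2 * 2 = 8.
    let inner = m.expect("map is non-null in this example");
    assert_eq!(inner.get("xs"), Some(&Some(vec![vec![1, 2]])));
}
```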

The Jira issue is about supporting all these types in the parser, and there is 
a PR which allows the parser to accept such types.

However, once these four kinds of types are supported by the parser, they need 
to be supported in every kind of context, including casts and unparsing. 

My question is: what is the correct way to unparse an expression such as 
CAST(ARRAY [1] AS INTEGER NOT NULL ARRAY NOT NULL)? It looks like the grammar 
for parsing types differs depending on context (sometimes the default is 
nullable, sometimes it is non-nullable), so the unparsing should also change 
depending on context.

(This is a rather unfortunate state of affairs; in a language such as Rust 
these types are always written in the same way: Vec<i32>, Vec<Option<i32>>, 
Option<Vec<i32>>, Option<Vec<Option<i32>>>.)
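The Rust spellings above can be made concrete. This is a minimal sketch that 
pairs each SQL type with a Rust type under the nullable-by-default reading; 
the pairing is illustrative, not a claim about Calcite's semantics:

```rust
// The four nullability variants of INTEGER ARRAY, written as Rust types,
// where Option<T> marks each nullable position explicitly.
fn main() {
    // INTEGER NOT NULL ARRAY NOT NULL: non-null array of non-null elements
    let a: Vec<i32> = vec![1, 2, 3];

    // INTEGER ARRAY NOT NULL: non-null array of nullable elements
    let b: Vec<Option<i32>> = vec![Some(1), None];

    // INTEGER NOT NULL ARRAY: nullable array of non-null elements
    let c: Option<Vec<i32>> = Some(vec![1]);

    // INTEGER ARRAY: nullable array of nullable elements
    let d: Option<Vec<Option<i32>>> = None;

    assert_eq!(a.len(), 3);
    assert_eq!(b[1], None);
    assert!(c.is_some());
    assert!(d.is_none());
}
```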


> Parser for data types ignores element nullability in collections
> ----------------------------------------------------------------
>
>                 Key: CALCITE-6275
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6275
>             Project: Calcite
>          Issue Type: Bug
>          Components: core, server
>    Affects Versions: 1.36.0
>            Reporter: Mihai Budiu
>            Priority: Major
>              Labels: pull-request-available
>
> The parser (Parser.jj) has this production rule for DataType:
> {code}
> // Type name with optional scale and precision.
> SqlDataTypeSpec DataType() :
> {
>     SqlTypeNameSpec typeName;
>     final Span s;
> }
> {
>     typeName = TypeName() {
>         s = Span.of(typeName.getParserPos());
>     }
>     (
>         typeName = CollectionsTypeName(typeName)
>     )*
>     {
>         return new SqlDataTypeSpec(typeName, 
> s.add(typeName.getParserPos()).pos());
>     }
> }
> {code}
> Note that there is no way to specify the nullability for the elements of a 
> collection; they are always assumed to be non-null. This is most pertinent 
> for the server component, where in DDL one cannot specify a table column of 
> type INTEGER ARRAY; one always gets an INTEGER NOT NULL ARRAY instead.
> But note that SqlCollectionTypeNameSpec cannot even represent the nullability 
> of the elements' type: it takes a SqlTypeNameSpec instead of a 
> SqlDataTypeSpec.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
