I personally think the current usage of flatten is very unintuitive and
confusing, and I think the BigQuery usage is much better. If I were
designing this function from scratch, I would not allow using flatten in
the select cause and only allow it as a table function.

For example, take this table:

0: jdbc:drill:drillbit=localhost> select * from t6;
+--------+----+
|   a    | b  |
+--------+----+
| [1,2]  | 1  |
| [3]    | 2  |
| [3]    | 3  |


It's not clear how to flatten the column a, and return all of the columns.
The closest option is:

0: jdbc:drill:drillbit=localhost> select *, flatten(a) as a_flattened from
t6;
+--------+----+--------------+
|   a    | b  | a_flattened  |
+--------+----+--------------+
| [1,2]  | 1  | 1            |
| [1,2]  | 1  | 2            |
| [3]    | 2  | 3            |
| [3]    | 3  | 3            |
+--------+----+--------------+

But that doesn't really make any sense. Why would I want the original,
unflattened column, duplicated multiple times?

Anyway, I think this usage is very confusing. Is there a use case where
this is necessary?



On Mon, Oct 12, 2015 at 5:28 PM, Julian Hyde <[email protected]> wrote:

>
> > On Oct 12, 2015, at 3:42 PM, Jacques Nadeau <[email protected]> wrote:
> >
> > - we have shortcut for a lateral join combined with a table function used
> > in the select clause
>
> It’s funny, Postgres has a short-cut that allows you to use UNNEST in the
> SELECT clause[1]. James and I discussed it for Phoenix Unnest support[2],
> and I’ll recap what I said there.
>
> The semantics of a table expression in the SELECT clause are weird,
> because you get multiple rows out for each row in. It gets even weirder if
> you have more than one table expression in the SELECT clause and some
> non-table expressions too. Presumably it should return the cartesian
> product.
>
> LINQ (and Spark) has “selectMany”, which is like “select” except that the
> expression is a collection and one row is output for each member of the
> collection. Bart de Smet claims that selectMany is powerful enough to
> subsume the other relational operators (see [3] around 48 minutes). So, I’m
> tempted to add “SELECT MANY” to Calcite SQL. But I think the way postgres
> did it — changing the behavior of the SELECT clause if it happens to
> contain an UNNEST function — is wrong. The workaround — learning how to use
> UNNEST or indeed a table function such as FLATTEN in the FROM clause — is
> not too hard.
>
> Julian
>
> [1]
> http://stackoverflow.com/questions/23003601/sql-multiple-unnest-in-single-select-list
> <
> http://stackoverflow.com/questions/23003601/sql-multiple-unnest-in-single-select-list
> >
>
> [2] https://issues.apache.org/jira/browse/PHOENIX-953 <
> https://issues.apache.org/jira/browse/PHOENIX-953>
>
> [3]
> https://channel9.msdn.com/Shows/Going+Deep/Bart-De-Smet-MinLINQ-The-Essence-of-LINQ
> <
> https://channel9.msdn.com/Shows/Going+Deep/Bart-De-Smet-MinLINQ-The-Essence-of-LINQ
> >

Reply via email to