Re: flatten() function, scalar functions, nested ?

Jacques Nadeau Mon, 12 Oct 2015 18:13:33 -0700

You're hitting on a different topic that I was going to bring up soon. In a
schemaless system, we'd really like to have  select *(excluding a) but I
haven't come up with a proposed syntax for that. It is especially important
when working with schemaless data where you might want to coerce one or two
columns. For example, leave the columns alone, except col3, which i want to
cast to varchar.


I'm not clear how any of the other proposals solve this problem. (What I'm
calling * except or * excluding for now.)

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Mon, Oct 12, 2015 at 5:48 PM, Steven Phillips <[email protected]> wrote:

> I personally think the current usage of flatten is very unintuitive and
> confusing, and I think the BigQuery usage is much better. If I were
> designing this function from scratch, I would not allow using flatten in
> the select cause and only allow it as a table function.
>
> For example, take this table:
>
> 0: jdbc:drill:drillbit=localhost> select * from t6;
> +--------+----+
> |   a    | b  |
> +--------+----+
> | [1,2]  | 1  |
> | [3]    | 2  |
> | [3]    | 3  |
>
>
> It's not clear how to flatten the column a, and return all of the columns.
> The closest option is:
>
> 0: jdbc:drill:drillbit=localhost> select *, flatten(a) as a_flattened from
> t6;
> +--------+----+--------------+
> |   a    | b  | a_flattened  |
> +--------+----+--------------+
> | [1,2]  | 1  | 1            |
> | [1,2]  | 1  | 2            |
> | [3]    | 2  | 3            |
> | [3]    | 3  | 3            |
> +--------+----+--------------+
>
> But that doesn't really make any sense. Why would I want the original,
> unflattened column, duplicated multiple times?
>
> Anyway, I think this usage is very confusing. Is there a use case where
> this is necessary?
>
>
>
> On Mon, Oct 12, 2015 at 5:28 PM, Julian Hyde <[email protected]>
> wrote:
>
> >
> > > On Oct 12, 2015, at 3:42 PM, Jacques Nadeau <[email protected]>
> wrote:
> > >
> > > - we have shortcut for a lateral join combined with a table function
> used
> > > in the select clause
> >
> > It’s funny, Postgres has a short-cut that allows you to use UNNEST in the
> > SELECT clause[1]. James and I discussed it for Phoenix Unnest support[2],
> > and I’ll recap what I said there.
> >
> > The semantics of a table expression in the SELECT clause are weird,
> > because you get multiple rows out for each row in. It gets even weirder
> if
> > you have more than one table expression in the SELECT clause and some
> > non-table expressions too. Presumably it should return the cartesian
> > product.
> >
> > LINQ (and Spark) has “selectMany”, which is like “select” except that the
> > expression is a collection and one row is output for each member of the
> > collection. Bart de Smet claims that selectMany is powerful enough to
> > subsume the other relational operators (see [3] around 48 minutes). So,
> I’m
> > tempted to add “SELECT MANY” to Calcite SQL. But I think the way postgres
> > did it — changing the behavior of the SELECT clause if it happens to
> > contain an UNNEST function — is wrong. The workaround — learning how to
> use
> > UNNEST or indeed a table function such as FLATTEN in the FROM clause — is
> > not too hard.
> >
> > Julian
> >
> > [1]
> >
> http://stackoverflow.com/questions/23003601/sql-multiple-unnest-in-single-select-list
> > <
> >
> http://stackoverflow.com/questions/23003601/sql-multiple-unnest-in-single-select-list
> > >
> >
> > [2] https://issues.apache.org/jira/browse/PHOENIX-953 <
> > https://issues.apache.org/jira/browse/PHOENIX-953>
> >
> > [3]
> >
> https://channel9.msdn.com/Shows/Going+Deep/Bart-De-Smet-MinLINQ-The-Essence-of-LINQ
> > <
> >
> https://channel9.msdn.com/Shows/Going+Deep/Bart-De-Smet-MinLINQ-The-Essence-of-LINQ
> > >
>

Re: flatten() function, scalar functions, nested ?

Reply via email to