[jira] [Comment Edited] (CALCITE-5701) Add NAMED_STRUCT function (enabled in Spark library)

Julian Hyde (Jira) Wed, 05 Jul 2023 11:33:04 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17740291#comment-17740291
 ]


Julian Hyde edited comment on CALCITE-5701 at 7/5/23 6:32 PM:
--------------------------------------------------------------

With the addition of {{class NamedStruct}}, the PR seems to be going in the 
wrong direction. While Spark's embedding in Scala allows introspection (e.g. 
what fields does a value have at run time), I believe that Spark SQL does not. 
Certainly Calcite's emulation of Spark SQL should not.

I think that
{code:java}
named_struct('k', 1, 'v', true)
{code}
should basically be syntactic sugar for {{ROW(1, true)}} except that the fields 
are named {{k}} and {{v}} rather than {{EXPR$0}} and {{EXPR$1}}.

The item operator can currently apply to a {{MAP}}. If we want the item 
operator to apply to a {{STRUCT}} or {{ROW}}, we can do that. No 
introspection is required, because the fields are known at compile time. But 
let's make it a separate PR.
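
To illustrate the "known at compile time" point, here is a minimal sketch of how a row type could be derived purely from the call's operands. It is a toy model, not Calcite code: {{NamedStructSketch}}, {{RowType}}, {{deriveRowType}}, {{typeName}} and {{ordinal}} are hypothetical names, and the type names it prints are only illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these classes or methods are Calcite APIs.
final class NamedStructSketch {

  /** Hypothetical row type: ordered field names paired with type names. */
  static final class RowType {
    final List<String> names;
    final List<String> types;

    RowType(List<String> names, List<String> types) {
      this.names = names;
      this.types = types;
    }

    /** Resolves a field name to its ordinal using only the type, no run-time value. */
    int ordinal(String fieldName) {
      int i = names.indexOf(fieldName);
      if (i < 0) {
        throw new IllegalArgumentException("Unknown field: " + fieldName);
      }
      return i;
    }

    @Override public String toString() {
      StringBuilder sb = new StringBuilder("ROW(");
      for (int i = 0; i < names.size(); i++) {
        if (i > 0) {
          sb.append(", ");
        }
        sb.append(names.get(i)).append(' ').append(types.get(i));
      }
      return sb.append(')').toString();
    }
  }

  /** Derives the row type from alternating (name literal, value) operands. */
  static RowType deriveRowType(List<?> operands) {
    if (operands.isEmpty() || operands.size() % 2 != 0) {
      throw new IllegalArgumentException("Expected name/value pairs");
    }
    List<String> names = new ArrayList<>();
    List<String> types = new ArrayList<>();
    for (int i = 0; i < operands.size(); i += 2) {
      Object name = operands.get(i);
      if (!(name instanceof String)) {
        throw new IllegalArgumentException("Field name must be a string literal");
      }
      names.add((String) name);
      types.add(typeName(operands.get(i + 1)));
    }
    return new RowType(names, types);
  }

  /** Maps a literal (or a nested row type) to an illustrative type name. */
  static String typeName(Object value) {
    if (value instanceof RowType) {
      return value.toString(); // nested named_struct
    }
    if (value instanceof Integer) {
      return "INTEGER";
    }
    if (value instanceof Boolean) {
      return "BOOLEAN";
    }
    if (value instanceof String) {
      return "VARCHAR";
    }
    throw new IllegalArgumentException("Unsupported literal: " + value);
  }

  public static void main(String[] args) {
    RowType t = deriveRowType(List.of("k", 1, "v", true));
    System.out.println(t);              // ROW(k INTEGER, v BOOLEAN)
    System.out.println(t.ordinal("v")); // 1
  }
}
```

In this sketch the field lookup in {{ordinal}} reads only the derived type, which is why an item operator on a {{ROW}} would not need a run-time wrapper class.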


was (Author: julianhyde):
With the addition of {{class NamedStruct}}, the PR seems to be going in the 
wrong direction. While Spark's embedding in Scala allows introspection (e.g. 
what fields does a value have at run time), I believe that Spark SQL does not. 
Certainly Calcite's emulation of Spark SQL should not.

I think that {{named_struct{'k', 1, 'v', true} }} should basically be syntactic 
sugar for {{ROW(1, true)}} except that the fields are named {{k}} and {{v}}.

The item operator can currently apply to a {{MAP}}. If we want the item 
operator to apply to a {{STRUCT}} or {{ROW}}, we can do that. No introspection 
is required, because the fields are known at compile time. But let's make it a 
separate PR.

> Add NAMED_STRUCT function (enabled in Spark library)
> ----------------------------------------------------
>
>                 Key: CALCITE-5701
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5701
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>            Reporter: Guillaume Massé
>            Priority: Minor
>              Labels: pull-request-available
>
> [https://spark.apache.org/docs/3.4.0/api/sql/index.html#named_struct]
>  
> {code:java}
> spark.sql("""select named_struct("a", 1, "b", 2)""")
> res4: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, 2): struct<a: 
> int, b: int>]
> Calcite:
> SELECT named_struct('a', 1, 'b', 2);
> type: row(a int not null, b int not null){code}
>  
> It's also possible to be nested:
> {code:java}
> spark.sql("""select named_struct("a", 1, "b", named_struct("c", 2))""")
> res5: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, named_struct(c, 
> 2)): struct<a: int, b: struct<c: int>>] {code}
> {code:java}
> Calcite:
> SELECT named_struct('a', 1, 'b', named_struct('c', 2));
> type: row(a int not null, b row(c int not null) not null){code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
