[ https://issues.apache.org/jira/browse/FLINK-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timo Walther updated FLINK-10448:
---------------------------------
    Description: 
It seems that a SQL VALUES clause uses one operator per value under certain 
conditions, which leads to a complicated job graph. Given that we need to 
compile code for every operator in the open() method and have other overhead as 
well, this looks inefficient to me.

For example, the following query creates and unions 6 operators together:
{code}
SELECT *
  FROM (
    VALUES
      (1, 'Bob', CAST(0 AS BIGINT)),
      (22, 'Alice', CAST(0 AS BIGINT)),
      (42, 'Greg', CAST(0 AS BIGINT)),
      (42, 'Greg', CAST(0 AS BIGINT)),
      (42, 'Greg', CAST(0 AS BIGINT)),
      (1, 'Bob', CAST(0 AS BIGINT)))
    AS UserCountTable(user_id, user_name, user_count)
{code}

  was:
It seems that a SQL VALUES clause uses one operator per value under certain 
conditions which leads to a complicated job graph. Given that we need to 
compile code for every operator in the open method, this looks inefficient to 
me.

For example, the following query creates and unions 6 operators together:
{code}
SELECT *
  FROM (
    VALUES
      (1, 'Bob', CAST(0 AS BIGINT)),
      (22, 'Alice', CAST(0 AS BIGINT)),
      (42, 'Greg', CAST(0 AS BIGINT)),
      (42, 'Greg', CAST(0 AS BIGINT)),
      (42, 'Greg', CAST(0 AS BIGINT)),
      (1, 'Bob', CAST(0 AS BIGINT)))
    AS UserCountTable(user_id, user_name, user_count)
{code}


> VALUES clause is translated into a separate operator per value
> --------------------------------------------------------------
>
>                 Key: FLINK-10448
>                 URL: https://issues.apache.org/jira/browse/FLINK-10448
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>    Affects Versions: 1.6.1
>            Reporter: Timo Walther
>            Priority: Major
>
> It seems that a SQL VALUES clause uses one operator per value under certain 
> conditions, which leads to a complicated job graph. Given that we need to 
> compile code for every operator in the open() method and have other overhead as 
> well, this looks inefficient to me.
> For example, the following query creates and unions 6 operators together:
> {code}
> SELECT *
>   FROM (
>     VALUES
>       (1, 'Bob', CAST(0 AS BIGINT)),
>       (22, 'Alice', CAST(0 AS BIGINT)),
>       (42, 'Greg', CAST(0 AS BIGINT)),
>       (42, 'Greg', CAST(0 AS BIGINT)),
>       (42, 'Greg', CAST(0 AS BIGINT)),
>       (1, 'Bob', CAST(0 AS BIGINT)))
>     AS UserCountTable(user_id, user_name, user_count)
> {code}
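
To make the shape of the problem concrete, here is a minimal toy model in Python. It is not Flink's planner: the names Values, UnionAll, per_row_plan, count_values_operators, collect and merge_values are all hypothetical and exist only for this illustration. It contrasts a plan that unions one single-row source per value with a plan that holds all rows in a single source, assuming both must return the same rows in the same order.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Row = Tuple[int, str, int]


@dataclass
class Values:
    """Toy source operator that emits a fixed list of rows."""
    rows: List[Row]


@dataclass
class UnionAll:
    """Toy operator that concatenates the output of its inputs."""
    inputs: List["Plan"]


Plan = Union[Values, UnionAll]

# The six rows from the query in the issue description.
ROWS: List[Row] = [
    (1, "Bob", 0),
    (22, "Alice", 0),
    (42, "Greg", 0),
    (42, "Greg", 0),
    (42, "Greg", 0),
    (1, "Bob", 0),
]


def per_row_plan(rows: List[Row]) -> Plan:
    """Build the shape reported in the issue: one source per row, unioned."""
    return UnionAll([Values([row]) for row in rows])


def count_values_operators(plan: Plan) -> int:
    """Count the source operators in a plan."""
    if isinstance(plan, Values):
        return 1
    return sum(count_values_operators(child) for child in plan.inputs)


def collect(plan: Plan) -> List[Row]:
    """Evaluate a plan and return its rows in order."""
    if isinstance(plan, Values):
        return list(plan.rows)
    result: List[Row] = []
    for child in plan.inputs:
        result.extend(collect(child))
    return result


def merge_values(plan: Plan) -> Values:
    """Collapse a plan made only of Values and UnionAll into one source."""
    return Values(collect(plan))


if __name__ == "__main__":
    plan = per_row_plan(ROWS)
    merged = merge_values(plan)
    print(count_values_operators(plan), count_values_operators(merged))
    print(collect(plan) == collect(merged))
```

In this model the per-row plan has six sources and the merged plan has one, while both produce identical output. Whether a single source is the right fix inside Flink itself is a separate question that this sketch does not answer.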



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
