[
https://issues.apache.org/jira/browse/FLINK-16361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
godfrey he updated FLINK-16361:
-------------------------------
Description:
As described in the
[FLIP-84|https://cwiki.apache.org/confluence/display/FLINK/FLIP-84%3A+Improve+%26+Refactor+API+of+TableEnvironment]
document, we propose to deprecate the following methods:
{code:java}
TableEnvironment.sqlUpdate(String)
TableEnvironment.insertInto(String, Table)
TableEnvironment.execute(String)
TableEnvironment.explain(boolean)
TableEnvironment.fromTableSource(TableSource<?>)
Table.insertInto(String)
{code}
Meanwhile, we propose to introduce the following new methods:
{code:java}
interface TableEnvironment {
    // execute the given single statement, and return the execution result.
    TableResult executeSql(String statement);

    // get the AST and the execution plan for the given single statement (DQL, DML)
    String explainSql(String statement, ExplainDetail... extraDetails);

    // create a StatementSet instance which can add DML statements or Tables
    // to the set and explain or execute them as a batch.
    StatementSet createStatementSet();
}

interface Table {
    // write the Table to a TableSink that was registered
    // under the specified path.
    TableResult executeInsert(String tablePath);

    // write the Table to a TableSink that was registered
    // under the specified path.
    TableResult executeInsert(String tablePath, boolean overwrite);

    // return the AST and the execution plan to compute
    // the result of the current table.
    String explain(ExplainDetail... extraDetails);

    // get the contents of the current table.
    TableResult execute();
}

interface TableResult {
    // return JobClient if a Flink job is submitted
    // (for DML/DQL statement), else return empty (e.g. for DDL).
    Optional<JobClient> getJobClient();

    // return the schema of the result
    TableSchema getTableSchema();

    // return the ResultKind, which avoids custom parsing of
    // an "OK" row in programs
    ResultKind getResultKind();

    // get the row contents as an iterator of rows
    Iterator<Row> collect();

    // print the result contents
    void print();
}

public enum ResultKind {
    // for DDL, DCL and statements with a simple "OK"
    SUCCESS,

    // rows with important content are available (DML, DQL)
    SUCCESS_WITH_CONTENT
}

interface StatementSet {
    // add a single INSERT statement to the set
    StatementSet addInsertSql(String statement);

    // add a Table with the given sink table name to the set
    StatementSet addInsert(String targetPath, Table table);

    // add a Table with the given sink table name to the set
    StatementSet addInsert(String targetPath, Table table, boolean overwrite);

    // return the AST and the execution plan to compute
    // the result of all statements and Tables
    String explain(ExplainDetail... extraDetails);

    // execute all statements and Tables as a batch
    TableResult execute();
}

public enum ExplainDetail {
    STATE_SIZE_ESTIMATE,
    UID,
    HINTS,
    ...
}
{code}
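For illustration, here is a minimal migration sketch against the proposed API. The table names {{source_t}} and {{sink_t}} and the connector choices are hypothetical, not part of the proposal:
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.StatementSet;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.TableResult;

public class MigrationSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // DDL is executed eagerly; no Flink job is submitted
        TableResult ddl = tEnv.executeSql(
                "CREATE TABLE source_t (a INT) WITH ('connector' = 'datagen')");
        System.out.println(ddl.getResultKind()); // SUCCESS

        tEnv.executeSql(
                "CREATE TABLE sink_t (a INT) WITH ('connector' = 'blackhole')");

        // before: tEnv.sqlUpdate("INSERT INTO ..."); tEnv.execute("job");
        // after: executeSql submits the job and returns a TableResult
        TableResult dml = tEnv.executeSql("INSERT INTO sink_t SELECT a FROM source_t");
        dml.getJobClient().ifPresent(c -> System.out.println(c.getJobID()));

        // before: multiple sqlUpdate() calls buffered until execute(jobName)
        // after: an explicit StatementSet executed together as one job
        StatementSet set = tEnv.createStatementSet();
        set.addInsertSql("INSERT INTO sink_t SELECT a FROM source_t");
        set.addInsert("sink_t", tEnv.from("source_t"));
        System.out.println(set.explain());
        set.execute();
    }
}
{code}
Note that for an INSERT statement {{executeSql}} submits the job and returns immediately, so the returned {{TableResult}} (and its {{JobClient}}) can be used to monitor the running job.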
We unify the trigger behavior of Flink table programs and propose the following: for
{{TableEnvironment}} and {{StreamTableEnvironment}}, you must use
{{TableEnvironment.execute()}} to trigger execution of a table program; once you
have converted the table program to a {{DataStream}} program (through the
{{toAppendStream}} or {{toRetractStream}} method), you must use
{{StreamExecutionEnvironment.execute()}} to trigger the {{DataStream}} program.
The same rule applies to {{BatchTableEnvironment}}: you must use
{{TableEnvironment.execute()}} to trigger execution of a batch table program; once
you have converted the table program to a {{DataSet}} program (through the
{{toDataSet}} method), you must use {{ExecutionEnvironment.execute()}} to trigger
the {{DataSet}} program.
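A minimal sketch of this trigger rule for the streaming case, using the (to-be-deprecated) {{sqlUpdate}}/{{execute}} pair; the tables {{orders}} and {{sink_t}} are hypothetical and assumed to be registered:
{code:java}
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class TriggerRuleSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // Case 1: pure table program -> trigger via the TableEnvironment
        tEnv.sqlUpdate("INSERT INTO sink_t SELECT * FROM orders");
        tEnv.execute("table-job"); // NOT env.execute()

        // Case 2: converted to a DataStream program -> trigger via the
        // StreamExecutionEnvironment
        Table table = tEnv.sqlQuery("SELECT * FROM orders");
        DataStream<Row> stream = tEnv.toAppendStream(table, Row.class);
        stream.print();
        env.execute("datastream-job"); // NOT tEnv.execute()
    }
}
{code}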
was:
As described in the
[FLIP-84|https://cwiki.apache.org/confluence/display/FLINK/FLIP-84%3A+Improve+%26+Refactor+API+of+TableEnvironment]
document, we propose to deprecate the following methods in TableEnvironment:
{code:java}
void sqlUpdate(String sql)
void insertInto(String targetPath, Table table)
void execute(String jobName)
String explain(boolean extended)
Table fromTableSource(TableSource<?> source)
{code}
Meanwhile, we propose to introduce the following new methods in TableEnvironment:
{code:java}
// synchronously execute the given single statement immediately,
// and return the execution result.
ResultTable executeStatement(String statement);

public interface ResultTable {
    TableSchema getResultSchema();
    Iterable<Row> getResultRows();
}

// create a DmlBatch instance which can add DML statements or Tables
// to the batch and explain or execute them as a batch.
DmlBatch createDmlBatch();

interface DmlBatch {
    void addInsert(String insert);
    void addInsert(String targetPath, Table table);
    ResultTable execute() throws Exception;
    String explain(boolean extended);
}
{code}
We unify the trigger behavior of Flink table programs and propose the following: for
{{TableEnvironment}} and {{StreamTableEnvironment}}, you must use
{{TableEnvironment.execute()}} to trigger execution of a table program; once you
have converted the table program to a {{DataStream}} program (through the
{{toAppendStream}} or {{toRetractStream}} method), you must use
{{StreamExecutionEnvironment.execute()}} to trigger the {{DataStream}} program.
> FLIP-84: Improve & Refactor API of TableEnvironment & Table
> -----------------------------------------------------------
>
> Key: FLINK-16361
> URL: https://issues.apache.org/jira/browse/FLINK-16361
> Project: Flink
> Issue Type: Improvement
> Components: Table SQL / API
> Reporter: godfrey he
> Assignee: godfrey he
> Priority: Major
> Fix For: 1.11.0