[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-06 Thread GitBox
sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add 
documentation for how to use blink planner
URL: https://github.com/apache/flink/pull/9362#discussion_r311146724
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -1304,20 +1414,50 @@ val table: Table = tableEnv.fromDataStream(stream, 
'name as 'myName)
 Query Optimization
 --
 
-Apache Flink leverages Apache Calcite to optimize and translate queries. The 
optimization currently performed include projection and filter push-down, 
subquery decorrelation, and other kinds of query rewriting. Flink does not yet 
optimize the order of joins, but executes them in the same order as defined in 
the query (order of Tables in the `FROM` clause and/or order of join predicates 
in the `WHERE` clause).
+
+
+
+Apache Flink leverages Apache Calcite to optimize and translate queries. The 
optimization currently performed include projection and filter push-down, 
subquery decorrelation, and other kinds of query rewriting. Flink planner does 
not yet optimize the order of joins, but executes them in the same order as 
defined in the query (order of Tables in the `FROM` clause and/or order of join 
predicates in the `WHERE` clause).
+
+It is possible to tweak the set of optimization rules applied in different phases by providing a `CalciteConfig` object. This can be created via a builder by calling `CalciteConfig.createBuilder()` and is provided to the `TableEnvironment` by calling `tableEnv.getConfig.setPlannerConfig(calciteConfig)`. 
+
+
+
+
+
+The foundation of Apache Flink query optimization is Apache Calcite. In 
addition to apply Calcite in optimization, Blink planner also does a lot to 
enhance it.
 
 Review comment:
   @dawidwys @godfreyhe 
   ```
   Apache Flink leverages and extends Apache Calcite to perform sophisticated 
query optimization.
   This includes a series of rule and cost-based optimizations such as:
   * Subquery decorrelation based on Apache Calcite 
   * Project pruning
   * Partition pruning
   * Filter push-down 
   * Sub-plan deduplication to avoid duplicate computation 
   * Special subquery rewriting:
     * Converts IN and EXISTS into left semi-joins
     * Converts NOT IN and NOT EXISTS into left anti-joins
   * Optional join reordering
     * Enabled via `table.optimizer.join-reorder-enabled`
   
   {% info %} IN/EXISTS/NOT IN/NOT EXISTS are currently only supported in 
conjunctive conditions.
   
   The optimizer makes intelligent decisions, based not only on the plan but also on rich statistics available from the data sources and fine-grained costs for each operator such as IO, CPU, network, and memory. 
   
   Advanced users may supply custom optimizations via a `CalciteConfig` object passed to the table environment by calling `TableEnvironment#getConfig#setPlannerConfig`.
   ```
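The subquery rewrites listed in the suggested text can be sketched with plain collections. This is an illustrative model only, not Flink or Calcite code; the relation names `orders` and `valid_ids` are made up:

```python
# Toy model of the IN -> left semi-join and NOT IN -> left anti-join rewrites.
# Not Flink/Calcite code; "orders" and "valid_ids" are hypothetical relations.
orders = [(1, "a"), (2, "b"), (3, "c")]
valid_ids = [1, 3]

# Semantics of: SELECT * FROM orders WHERE id IN (SELECT id FROM valid)
in_result = [o for o in orders if o[0] in valid_ids]

# Left semi-join: keep each left row with at least one match on the right.
right_keys = set(valid_ids)
semi_join = [o for o in orders if o[0] in right_keys]
assert in_result == semi_join            # IN behaves like a left semi-join

# Left anti-join: keep each left row with no match on the right (NOT IN).
anti_join = [o for o in orders if o[0] not in right_keys]
assert anti_join == [(2, "b")]           # NOT IN behaves like a left anti-join
```

The real planner performs these rewrites on relational plans and also handles NULL semantics, which this toy model ignores.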


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-06 Thread GitBox
sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add 
documentation for how to use blink planner
URL: https://github.com/apache/flink/pull/9362#discussion_r311148855
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -768,11 +852,37 @@ A Table API or SQL query is translated when:
 
 Once translated, a Table API or SQL query is handled like a regular DataStream 
or DataSet program and is executed when `StreamExecutionEnvironment.execute()` 
or `ExecutionEnvironment.execute()` is called.
 
+
+
+
+Table API and SQL queries are translated into [DataStream]({{ site.baseurl 
}}/dev/datastream_api.html) programs whether their input is streaming or batch. 
A query is internally represented as a logical query plan and is translated in 
two phases: 
+
+1. Optimization of the logical plan, 
+2. Translation into a DataStream program.
+
+The behavior of translating a query differs between `TableEnvironment` and `StreamTableEnvironment`.
+
+For `TableEnvironment`, a Table API or SQL query is translated when `TableEnvironment.execute()` is called, because `TableEnvironment` optimizes multiple sinks into a single DAG.
+
+For `StreamTableEnvironment`, a Table API or SQL query is translated when:
+
+* a `Table` is emitted to a `TableSink`, i.e., when `Table.insertInto()` is 
called.
+* a SQL update query is specified, i.e., when `TableEnvironment.sqlUpdate()` 
is called.
+* a `Table` is converted into a `DataStream`.
+
+Once translated, a Table API or SQL query is handled like a regular DataStream 
program and is executed when `TableEnvironment.execute()` or 
`StreamExecutionEnvironment.execute()` is called.
+
+
+
+
 {% top %}
 
 Integration with DataStream and DataSet API
 ---
 
+Both Flink and Blink streaming jobs can integrate with the `DataStream` API. Only Flink batch jobs can integrate with the `DataSet` API; Blink batch jobs cannot be combined with either.
 
 Review comment:
   nit: legacy sounds better than old
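The translation difference described in the hunk above (translation deferred to `TableEnvironment.execute()` so multiple sinks can be optimized into one DAG, versus eager translation at each `insertInto()` in `StreamTableEnvironment`) can be modeled with a toy sketch. This is not Flink internals; the class and method names are invented for illustration:

```python
# Toy model of the two translation behaviors; not Flink internals.
class LazyEnv:
    """Models TableEnvironment: buffer queries, translate all at execute()."""
    def __init__(self):
        self.pending = []
        self.translated = []

    def insert_into(self, query):
        self.pending.append(query)      # nothing is translated yet

    def execute(self):
        # Seeing the whole DAG at once lets a planner share sub-plans here.
        self.translated = list(self.pending)
        self.pending = []

class EagerEnv:
    """Models StreamTableEnvironment: translate at each insert_into()."""
    def __init__(self):
        self.translated = []

    def insert_into(self, query):
        self.translated.append(query)   # translated immediately

lazy, eager = LazyEnv(), EagerEnv()
lazy.insert_into("q1")
eager.insert_into("q1")
assert lazy.translated == [] and eager.translated == ["q1"]
lazy.execute()
assert lazy.translated == ["q1"]
```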





[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-06 Thread GitBox
sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add 
documentation for how to use blink planner
URL: https://github.com/apache/flink/pull/9362#discussion_r311109415
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -1304,20 +1414,50 @@ val table: Table = tableEnv.fromDataStream(stream, 
'name as 'myName)
 Query Optimization
 --
 
-Apache Flink leverages Apache Calcite to optimize and translate queries. The 
optimization currently performed include projection and filter push-down, 
subquery decorrelation, and other kinds of query rewriting. Flink does not yet 
optimize the order of joins, but executes them in the same order as defined in 
the query (order of Tables in the `FROM` clause and/or order of join predicates 
in the `WHERE` clause).
+
+
+
+Apache Flink leverages Apache Calcite to optimize and translate queries. The 
optimization currently performed include projection and filter push-down, 
subquery decorrelation, and other kinds of query rewriting. Flink planner does 
not yet optimize the order of joins, but executes them in the same order as 
defined in the query (order of Tables in the `FROM` clause and/or order of join 
predicates in the `WHERE` clause).
+
+It is possible to tweak the set of optimization rules applied in different phases by providing a `CalciteConfig` object. This can be created via a builder by calling `CalciteConfig.createBuilder()` and is provided to the `TableEnvironment` by calling `tableEnv.getConfig.setPlannerConfig(calciteConfig)`. 
+
+
+
+
+
+The foundation of Apache Flink query optimization is Apache Calcite. In 
addition to apply Calcite in optimization, Blink planner also does a lot to 
enhance it.
 
 Review comment:
   > Apache Flink leverages Apache Calcite to optimize and translate queries in conjunction with a large number of Flink-specific optimizations and enhancements. 





[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add 
documentation for how to use blink planner
URL: https://github.com/apache/flink/pull/9362#discussion_r310678504
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -768,11 +856,36 @@ A Table API or SQL query is translated when:
 
 Once translated, a Table API or SQL query is handled like a regular DataStream 
or DataSet program and is executed when `StreamExecutionEnvironment.execute()` 
or `ExecutionEnvironment.execute()` is called.
 
+{% endhighlight %}
+
+
+
+{% highlight java %}
+Table API and SQL queries are translated into [DataStream]({{ site.baseurl 
}}/dev/datastream_api.html) program whether their input is a streaming or batch 
input. A query is internally represented as a logical query plan and is 
translated in two phases: 
+
+1. optimization of the logical plan, 
+2. translation into a DataStream program.
+
+A Table API or SQL query is translated when:
+
+* a `Table` is emitted to a `TableSink`, i.e., when `Table.insertInto()` is 
called.
+* a SQL update query is specified, i.e., when `TableEnvironment.sqlUpdate()` 
is called.
+* a `Table` is converted into a `DataStream` (only for Blink stream job).
+
+Once translated, a Table API or SQL query is handled like a regular DataStream 
program and is executed when `TableEnvironment.execute()` or 
`StreamExecutionEnvironment.execute()` is called.
+
+{% endhighlight %}
+
+
+
 {% top %}
 
 Integration with DataStream and DataSet API
 ---
 
+Both Flink stream job and Blink stream job could be integrated with 
`DataStream`, and only Flink batch job could be integrated with `DataSet`, 
Blink batch job could not be integrated with both.
 
 Review comment:
   ```suggestion
   Both Flink and Blink streaming jobs can integrate with the `DataStream` API. Only Flink batch jobs can integrate with the `DataSet` API; Blink batch jobs cannot be combined with either.
   ```




[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310670241
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -27,6 +27,20 @@ The Table API and SQL are integrated in a joint API. The 
central concept of this
 * This will be replaced by the TOC
 {:toc}
 
+Main differences of two planners
+
+
+the main differences on user interfaces level are
+
+1. The Blink planner does not support `BatchTableSource`, only supports 
`StreamTableSource`. Because the batch job is a special case of streaming job 
in Blink planner, and a batch job will not be translated into `DateSet` program 
but translated into `DataStream` program same as the stream job.
 
 Review comment:
   ```suggestion
   1. The Blink planner does not support `BatchTableSource`, only 
`StreamTableSource`. Blink treats batch jobs as a special case of streaming. As 
such, batch jobs will not be translated into `DataSet` programs but into 
`DataStream` programs, the same as the streaming jobs.
   ```
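The "batch is a special case of streaming" idea behind that suggestion can be sketched in a few lines: the same streaming operator can consume both a bounded (batch) input and an unbounded one. The `running_sum` operator below is an invented illustration, not a Flink API.

```python
# Sketch of "batch as a special case of streaming": one streaming
# operator handles both bounded and unbounded input; the bounded
# ("batch") case simply terminates. Conceptual only.
import itertools

def running_sum(stream):
    total = 0
    for x in stream:
        total += x
        yield total

# bounded ("batch") input: the stream just ends
print(list(running_sum([1, 2, 3])))  # [1, 3, 6]

# unbounded input: take the first few results
print(list(itertools.islice(running_sum(itertools.count(1)), 3)))  # [1, 3, 6]
```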


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310675449
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -755,6 +838,11 @@ result.insert_into("CsvSinkTable")
 Translate and Execute a Query
 -
 
+The behavior of `Translate and Execute a Query` is different for two planners:
+
+
 
 Review comment:
   This doesn't render properly; Jekyll does not support nested tags, so the 
links are output literally. You also don't need the highlighting here; this 
is just text. 
   
   It can just be: 
   ```
   
   
   Text Here
   
   
   Text here
   
   
   ```


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310677585
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -768,11 +856,36 @@ A Table API or SQL query is translated when:
 
 Once translated, a Table API or SQL query is handled like a regular DataStream 
or DataSet program and is executed when `StreamExecutionEnvironment.execute()` 
or `ExecutionEnvironment.execute()` is called.
 
+{% endhighlight %}
+
+
+
+{% highlight java %}
+Table API and SQL queries are translated into [DataStream]({{ site.baseurl 
}}/dev/datastream_api.html) program whether their input is a streaming or batch 
input. A query is internally represented as a logical query plan and is 
translated in two phases: 
+
+1. optimization of the logical plan, 
 
 Review comment:
   ```suggestion
   1. Optimization of the logical plan
   ```


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310677684
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -768,11 +856,36 @@ A Table API or SQL query is translated when:
 
 Once translated, a Table API or SQL query is handled like a regular DataStream 
or DataSet program and is executed when `StreamExecutionEnvironment.execute()` 
or `ExecutionEnvironment.execute()` is called.
 
+{% endhighlight %}
+
+
+
+{% highlight java %}
+Table API and SQL queries are translated into [DataStream]({{ site.baseurl 
}}/dev/datastream_api.html) program whether their input is a streaming or batch 
input. A query is internally represented as a logical query plan and is 
translated in two phases: 
+
+1. optimization of the logical plan, 
+2. translation into a DataStream program.
 
 Review comment:
   ```suggestion
   2. Translation into a DataStream program
   ```


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310676201
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -755,6 +838,11 @@ result.insert_into("CsvSinkTable")
 Translate and Execute a Query
 -
 
+The behavior of `Translate and Execute a Query` is different for two planners:
 
 Review comment:
   ```suggestion
   The behavior of translating and executing a query is different for the two 
planners.
   ```


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310669808
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -27,6 +27,20 @@ The Table API and SQL are integrated in a joint API. The 
central concept of this
 * This will be replaced by the TOC
 {:toc}
 
+Main differences of two planners
+
+
+the main differences on user interfaces level are
 
 Review comment:
   ```suggestion
   ```
   
   Unnecessary; this is just repeating the section header.


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310678779
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -768,11 +856,36 @@ A Table API or SQL query is translated when:
 
 Once translated, a Table API or SQL query is handled like a regular DataStream 
or DataSet program and is executed when `StreamExecutionEnvironment.execute()` 
or `ExecutionEnvironment.execute()` is called.
 
+{% endhighlight %}
+
+
+
+{% highlight java %}
+Table API and SQL queries are translated into [DataStream]({{ site.baseurl 
}}/dev/datastream_api.html) program whether their input is a streaming or batch 
input. A query is internally represented as a logical query plan and is 
translated in two phases: 
+
+1. optimization of the logical plan, 
+2. translation into a DataStream program.
+
+A Table API or SQL query is translated when:
+
+* a `Table` is emitted to a `TableSink`, i.e., when `Table.insertInto()` is 
called.
+* a SQL update query is specified, i.e., when `TableEnvironment.sqlUpdate()` 
is called.
+* a `Table` is converted into a `DataStream` (only for Blink stream job).
+
+Once translated, a Table API or SQL query is handled like a regular DataStream 
program and is executed when `TableEnvironment.execute()` or 
`StreamExecutionEnvironment.execute()` is called.
+
+{% endhighlight %}
+
+
+
 {% top %}
 
 Integration with DataStream and DataSet API
 ---
 
+Both Flink stream job and Blink stream job could be integrated with 
`DataStream`, and only Flink batch job could be integrated with `DataSet`, 
Blink batch job could not be integrated with both.
+**Note:** The `DataSet` discussed next is only for the Flink batch planner.
 
 Review comment:
   ```suggestion
   **Note:** The `DataSet` API discussed below is only relevant for the Flink 
batch planner.
   ```


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310669473
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -27,6 +27,20 @@ The Table API and SQL are integrated in a joint API. The 
central concept of this
 * This will be replaced by the TOC
 {:toc}
 
+Main differences of two planners
 
 Review comment:
   ```suggestion
   Main Differences Between the Two Planners
   ```


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310674210
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -682,8 +765,8 @@ The following examples shows how to emit a `Table`:
 
 
 {% highlight java %}
-// get a StreamTableEnvironment, works for BatchTableEnvironment equivalently
-StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
+# get a TableEnvironment
 
 Review comment:
   ```suggestion
   // get a TableEnvironment
   ```


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310679038
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -1304,20 +1417,52 @@ val table: Table = tableEnv.fromDataStream(stream, 
'name as 'myName)
 Query Optimization
 --
 
-Apache Flink leverages Apache Calcite to optimize and translate queries. The 
optimization currently performed include projection and filter push-down, 
subquery decorrelation, and other kinds of query rewriting. Flink does not yet 
optimize the order of joins, but executes them in the same order as defined in 
the query (order of Tables in the `FROM` clause and/or order of join predicates 
in the `WHERE` clause).
+
+
+{% highlight Flink planner %}
 
 Review comment:
   Same as above, the highlights need to be removed since this is just plain 
text. 


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310674028
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -527,8 +610,8 @@ The following example shows how to specify a query and 
return the result as a `T
 
 
 {% highlight java %}
-// get a StreamTableEnvironment, works for BatchTableEnvironment equivalently
-StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
+# get a TableEnvironment
 
 Review comment:
   ```suggestion
   // get a TableEnvironment
   ```


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310677474
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -768,11 +856,36 @@ A Table API or SQL query is translated when:
 
 Once translated, a Table API or SQL query is handled like a regular DataStream 
or DataSet program and is executed when `StreamExecutionEnvironment.execute()` 
or `ExecutionEnvironment.execute()` is called.
 
+{% endhighlight %}
+
+
+
+{% highlight java %}
+Table API and SQL queries are translated into [DataStream]({{ site.baseurl 
}}/dev/datastream_api.html) program whether their input is a streaming or batch 
input. A query is internally represented as a logical query plan and is 
translated in two phases: 
 
 Review comment:
   ```suggestion
   Table API and SQL queries are translated into [DataStream]({{ site.baseurl 
}}/dev/datastream_api.html) programs whether their input is streaming or batch. 
A query is internally represented as a logical query plan and is translated in 
two phases: 
   ```


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310676402
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -755,6 +838,11 @@ result.insert_into("CsvSinkTable")
 Translate and Execute a Query
 -
 
+The behavior of `Translate and Execute a Query` is different for two planners:
+
+
 
 Review comment:
   (screenshot: https://user-images.githubusercontent.com/1891970/62478152-020ae780-b770-11e9-91a1-47179143ddc8.png)


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310671482
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -142,82 +150,153 @@ A `Table` is always bound to a specific 
`TableEnvironment`. It is not possible t
 
 A `TableEnvironment` is created by calling the static 
`BatchTableEnvironment.create()` or `StreamTableEnvironment.create()` method 
with a `StreamExecutionEnvironment` or an `ExecutionEnvironment` and an 
optional `TableConfig`. The `TableConfig` can be used to configure the 
`TableEnvironment` or to customize the query optimization and translation 
process (see [Query Optimization](#query-optimization)).
 
-Make sure to choose the `BatchTableEnvironment`/`StreamTableEnvironment` that 
matches your programming language.
+Make sure to choose the specific planner 
`BatchTableEnvironment`/`StreamTableEnvironment` that matches your programming 
language.
+
+If both planner jars are located in `/lib` directory, we should explicitly set 
which planner is active in current program.
 
 Review comment:
   ```suggestion
   If both planner jars are in the `/lib` directory, you should explicitly set 
which planner is active in the current program.
   ```
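Why explicit selection matters when both planner jars are on the classpath can be modeled in miniature: with two discoverable planners, auto-selection is ambiguous and the program has to name one. The registry and function below are an invented illustration, not Flink's actual `EnvironmentSettings` mechanism.

```python
# Toy model of planner discovery: when more than one planner "jar" is
# available, creating an environment without an explicit choice fails.
# Illustration only; names are invented.

AVAILABLE_PLANNERS = {"flink": "legacy Flink planner", "blink": "Blink planner"}

def create_table_environment(planner=None):
    if planner is None:
        if len(AVAILABLE_PLANNERS) > 1:
            raise ValueError(
                "ambiguous: multiple planners found, set one explicitly")
        planner = next(iter(AVAILABLE_PLANNERS))
    return AVAILABLE_PLANNERS[planner]

print(create_table_environment("blink"))  # Blink planner
```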


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310671058
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -27,6 +27,20 @@ The Table API and SQL are integrated in a joint API. The 
central concept of this
 * This will be replaced by the TOC
 {:toc}
 
+Main differences of two planners
+
+
+the main differences on user interfaces level are
+
+1. The Blink planner does not support `BatchTableSource`, only supports 
`StreamTableSource`. Because the batch job is a special case of streaming job 
in Blink planner, and a batch job will not be translated into `DateSet` program 
but translated into `DataStream` program same as the stream job.
+2. The Blink planner does not support `ExternalCatalog` which is deprecated.
+3. The implementation of `FilterableTableSource` for the Flink planner and the 
Blink planner is incompatible. The Flink planner will push down 
`PlannerExpression`s into `FilterableTableSource`, while the Blink planner will 
push down `Expression`s.
+4. String based key-value config options (defined in `OptimizerConfigOptions` 
and `ExecutionConfigOptions`) are only used for the Blink planner.
+5. The implementation(`CalciteConfig`) of `PlannerConfig` in two planners is 
different.
+6. The Blink planner will optimize multiple-sinks into one DAG (supported only 
on `TableEnvironment`, not on `StreamTableEnvironment`), while the Flink 
planner will always optimize each sink into a new DAG, and all DAGs are 
independent of each other.
 
 Review comment:
   ```suggestion
   6. The Blink planner will optimize multiple-sinks into one DAG (supported 
only on `TableEnvironment`, not on `StreamTableEnvironment`). The Flink planner 
will always optimize each sink into a new DAG, where all DAGs are independent 
of each other.
   ```
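The multi-sink difference described in that item can be sketched concretely: with one shared source and two sinks, a single merged DAG reads the source once, while independent per-sink DAGs read it once per sink. This is a conceptual illustration with invented helper functions, not planner code.

```python
# Sketch of multi-sink optimization: independent DAGs scan the shared
# source once per sink; a merged DAG scans it once and fans out.
# Conceptual only.

def run_independent_dags(source, sinks):
    scans = 0
    results = []
    for sink_fn in sinks:
        scans += 1                      # each sink re-reads the source
        results.append(sink_fn(list(source)))
    return scans, results

def run_merged_dag(source, sinks):
    scans = 1                           # source read once, then fanned out
    rows = list(source)
    return scans, [sink_fn(rows) for sink_fn in sinks]

data = [1, 2, 3, 4]
sinks = [sum, max]
print(run_independent_dags(data, sinks))  # (2, [10, 4])
print(run_merged_dag(data, sinks))        # (1, [10, 4])
```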


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310674136
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -592,8 +675,8 @@ The following example shows how to specify an update query 
that inserts its resu
 
 
 {% highlight java %}
-// get a StreamTableEnvironment, works for BatchTableEnvironment equivalently
-StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
+# get a TableEnvironment
 
 Review comment:
   ```suggestion
   // get a TableEnvironment
   ```


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310679771
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -1304,20 +1417,52 @@ val table: Table = tableEnv.fromDataStream(stream, 
'name as 'myName)
 Query Optimization
 --
 
-Apache Flink leverages Apache Calcite to optimize and translate queries. The 
optimization currently performed include projection and filter push-down, 
subquery decorrelation, and other kinds of query rewriting. Flink does not yet 
optimize the order of joins, but executes them in the same order as defined in 
the query (order of Tables in the `FROM` clause and/or order of join predicates 
in the `WHERE` clause).
+
+
+{% highlight Flink planner %}
+
+Apache Flink leverages Apache Calcite to optimize and translate queries. The 
optimization currently performed include projection and filter push-down, 
subquery decorrelation, and other kinds of query rewriting. Flink planner does 
not yet optimize the order of joins, but executes them in the same order as 
defined in the query (order of Tables in the `FROM` clause and/or order of join 
predicates in the `WHERE` clause).
+
+It is possible to tweak the set of optimization rules which are applied in 
different phases by providing a `CalciteConfig` object. This can be created via 
a builder by calling `CalciteConfig.createBuilder())` and is provided to the 
TableEnvironment by calling 
`tableEnv.getConfig.setPlannerConfig(calciteConfig)`. 
+
+{% endhighlight %}
+
+
+
+{% highlight Blink planner %}
+
+The foundation of Apache Flink query optimization is Apache Calcite. In 
addition to apply Calcite in optimization, Blink planner also does a lot to 
enhance it.
+
+First of all, Blink planner does a series of rule-based optimization and 
cost-based optimization including:
+* special subquery rewriting, including two part: 1. converts IN and EXISTS 
into left semi-join 2.converts NOT IN and NOT EXISTS into left anti-join. Note: 
only IN/EXISTS/NOT IN/NOT EXISTS in conjunctive condition is supported.
+* normal subquery decorrelation based on Calcite
+* projection pruning
+* filter push down
+* partition pruning
+* join reorder if it is enabled (`table.optimizer.join-reorder-enabled` is 
true)
+* other kinds of query rewriting
+
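The IN / NOT IN rewrites in the list above can be shown with a minimal sketch: IN becomes a left semi-join (keep left rows that have a match on the right) and NOT IN becomes a left anti-join (keep left rows without a match). The functions and data are invented for illustration, and NULL handling is deliberately ignored.

```python
# Sketch of subquery rewriting: IN -> left semi-join,
# NOT IN -> left anti-join. NULL semantics are ignored here.

def left_semi_join(left, right_keys, key):
    keys = set(right_keys)
    return [row for row in left if row[key] in keys]

def left_anti_join(left, right_keys, key):
    keys = set(right_keys)
    return [row for row in left if row[key] not in keys]

orders = [{"id": 1, "user": "a"}, {"id": 2, "user": "b"},
          {"id": 3, "user": "c"}]
vip_users = ["a", "c"]

# SELECT * FROM orders WHERE user IN (SELECT user FROM vip)  -> semi-join
print(left_semi_join(orders, vip_users, "user"))   # ids 1 and 3
# SELECT * FROM orders WHERE user NOT IN (...)               -> anti-join
print(left_anti_join(orders, vip_users, "user"))   # id 2
```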
+Secondly, Blink planner introduces rich statistics of data source and 
propagate those statistics up to the whole plan based on all kinds of extended 
`MetadataHandler`s. Optimizer could choose better plan based on those metadata.
+
+Finally, Blink planner provides fine-grain cost of each operator, which takes 
io, cpu, network and memory into account. Cost-based optimization could choose 
better plan based on fine-grain cost definition .
+
+It is possible to customize optimization programs referencing to 
`FlinkBatchProgram`(default optimization programs for batch) or 
`FlinkStreamProgram`(default optimization programs for stream), and replace the 
default optimization programs by providing a `CalciteConfig` object. This can 
be created via a builder by calling `CalciteConfig.createBuilder())` and is 
provided to the TableEnvironment by calling 
`tableEnv.getConfig.setPlannerConfig(calciteConfig)`. 
+
+{% endhighlight %}
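The fine-grained cost idea in the quoted text (io, cpu, network and memory all feeding plan choice) can be sketched with a toy cost model. The candidate plans, cost components, and weights below are entirely made up for illustration; they are not Flink's cost formulas.

```python
# Sketch of cost-based plan choice with a fine-grained cost: each
# candidate plan carries io/cpu/network/memory components, and the
# optimizer keeps the cheapest combined cost. Weights are invented.

WEIGHTS = {"io": 1.0, "cpu": 0.5, "network": 2.0, "memory": 0.1}

def total_cost(cost):
    return sum(WEIGHTS[k] * v for k, v in cost.items())

candidates = {
    "broadcast-join": {"io": 10, "cpu": 40, "network": 80, "memory": 20},
    "hash-join":      {"io": 30, "cpu": 30, "network": 10, "memory": 50},
}

best = min(candidates, key=lambda name: total_cost(candidates[name]))
print(best)  # hash-join
```

A coarser model (say, row count only) could easily pick the other plan, which is the point of tracking the components separately.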
+
+
 
-It is possible to tweak the set of optimization rules which are applied in 
different phases by providing a `CalciteConfig` object. This can be created via 
a builder by calling `CalciteConfig.createBuilder())` and is provided to the 
TableEnvironment by calling 
`tableEnv.getConfig.setCalciteConfig(calciteConfig)`. 
 
 ### Explaining a Table
 
 The Table API provides a mechanism to explain the logical and optimized query 
plans to compute a `Table`. 
-This is done through the `TableEnvironment.explain(table)` method. It returns 
a String describing three plans: 
+This is done through the `TableEnvironment.explain(table)` method or 
`TableEnvironment.explain()` method. `explain(table)` returns the plan of given 
`Table`, `explain()` returns the result of multiple-sinks plan mainly used for 
Blink planner. It returns a String describing three plans: 
 
 Review comment:
   ```suggestion
   This is done through the `TableEnvironment.explain(table)` method or 
`TableEnvironment.explain()` method. `explain(table)` returns the plan of a 
given `Table`. `explain()` returns the result of a multiple-sinks plan and is 
mainly used for the Blink planner. It returns a String describing three plans: 
   ```


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310670732
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -27,6 +27,20 @@ The Table API and SQL are integrated in a joint API. The 
central concept of this
 * This will be replaced by the TOC
 {:toc}
 
+Main differences of two planners
+
+
+the main differences on user interfaces level are
+
+1. The Blink planner does not support `BatchTableSource`, only supports 
`StreamTableSource`. Because the batch job is a special case of streaming job 
in Blink planner, and a batch job will not be translated into `DateSet` program 
but translated into `DataStream` program same as the stream job.
+2. The Blink planner does not support `ExternalCatalog` which is deprecated.
+3. The implementation of `FilterableTableSource` for the Flink planner and the 
Blink planner is incompatible. The Flink planner will push down 
`PlannerExpression`s into `FilterableTableSource`, while the Blink planner will 
push down `Expression`s.
 
 Review comment:
   ```suggestion
   3. The implementations of `FilterableTableSource` for the Flink planner and 
the Blink planner are incompatible. The Flink planner will push down 
`PlannerExpression`s into `FilterableTableSource`, while the Blink planner will 
push down `Expression`s.
   ```


[GitHub] [flink] sjwiesman commented on a change in pull request #9362: [FLINK-13354] [docs] Add documentation for how to use blink planner

2019-08-05 Thread GitBox
URL: https://github.com/apache/flink/pull/9362#discussion_r310673851
 
 

 ##
 File path: docs/dev/table/common.md
 ##
 @@ -344,8 +425,8 @@ A `TableSink` is registered in a `TableEnvironment` as 
follows:
 
 
 {% highlight java %}
-// get a StreamTableEnvironment, works for BatchTableEnvironment equivalently
-StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
+# get a TableEnvironment
 
 Review comment:
   ```suggestion
   // get a TableEnvironment
   ```

