[GitHub] [flink] sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
URL: https://github.com/apache/flink/pull/9161#discussion_r305029535

## File path: docs/dev/table/types.md

## @@ -0,0 +1,1201 @@
+---
+title: "Data Types"
+nav-parent_id: tableapi
+nav-pos: 1
+---
+
+Due to historical reasons, the data types of Flink's Table & SQL API were closely coupled to Flink's

Review comment:
I tried to rewrite some of the introductory paragraphs to use a less passive voice.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
URL: https://github.com/apache/flink/pull/9161#discussion_r305030971

## File path: docs/dev/table/types.md

## @@ -0,0 +1,1201 @@
+---
+title: "Data Types"
+nav-parent_id: tableapi
+nav-pos: 1
+---
+
+Due to historical reasons, the data types of Flink's Table & SQL API were closely coupled to Flink's

Review comment:
I tried to remove the passive voice from this introduction. Also maybe drop references to DataSet?
[GitHub] [flink] sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
URL: https://github.com/apache/flink/pull/9161#discussion_r304912777

## File path: docs/dev/table/types.md

## @@ -0,0 +1,1201 @@
+---
+title: "Data Types"
+nav-parent_id: tableapi
+nav-pos: 1
+---
+
+Due to historical reasons, the data types of Flink's Table & SQL API were closely coupled to Flink's

Review comment:
Due to historical reasons, before Flink 1.9, Flink's Table & SQL API data types were tightly coupled to Flink's TypeInformation. TypeInformation is used in the DataStream API and is sufficient to describe all information needed to serialize and deserialize JVM-based objects in a distributed setting.
[GitHub] [flink] sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
URL: https://github.com/apache/flink/pull/9161#discussion_r304912777

## File path: docs/dev/table/types.md

## @@ -0,0 +1,1201 @@
+---
+title: "Data Types"
+nav-parent_id: tableapi
+nav-pos: 1
+---
+
+Due to historical reasons, the data types of Flink's Table & SQL API were closely coupled to Flink's

Review comment:
Due to historical reasons, before Flink 1.9, Flink's Table & SQL API data types were tightly coupled to Flink's TypeInformation. TypeInformation is used in the DataSet and DataStream APIs and is sufficient to describe all information needed to serialize and deserialize JVM-based objects in a distributed setting.
[GitHub] [flink] sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
URL: https://github.com/apache/flink/pull/9161#discussion_r305024945

## File path: docs/dev/table/types.md

## @@ -0,0 +1,1201 @@
+---
+title: "Data Types"
+nav-parent_id: tableapi
+nav-pos: 1
+---
+
+Due to historical reasons, the data types of Flink's Table & SQL API were closely coupled to Flink's
+`TypeInformation` before Flink 1.9. `TypeInformation` is used in DataSet and DataStream API and is
+sufficient to describe all information needed to serialize and deserialize JVM-based objects in a
+distributed setting.
+
+However, `TypeInformation` was not designed to properly represent logical types independent of an
+actual JVM class. In the past, it was difficult to properly map SQL standard types to this abstraction.
+Furthermore, some types were not SQL-compliant and were introduced without a bigger picture in mind.
+
+Starting with Flink 1.9, the Table & SQL API will receive a new type system that serves as a long-term
+solution for API stability and standard compliance.
+
+Reworking the type system is a major effort that touches almost all user-facing interfaces. Therefore, its introduction
+spans multiple releases and the community aims to finish this effort by Flink 1.10.
+
+Due to the simultaneous addition of a new planner for table programs (see [FLINK-11439](https://issues.apache.org/jira/browse/FLINK-11439)),
+not every combination of planner and data type is supported. Furthermore, planners might not support every
+data type with the desired precision or parameter.
+
+Attention: Please see the planner compatibility table and limitations
+section before using a data type.
+
+* This will be replaced by the TOC
+{:toc}
+
+Data Type
+---------
+
+A *data type* describes the logical type of a value in the table ecosystem. It can be used to declare input and/or
+output types of operations.
+
+Flink's data types are similar to the SQL standard's *data type* terminology but also contain information
+about the nullability of a value for efficient handling of scalar expressions.
+
+Examples of data types are:
+- `INT`
+- `INT NOT NULL`
+- `INTERVAL DAY TO SECOND(3)`
+- `ROW<myField ARRAY<BOOLEAN>, myOtherField TIMESTAMP(3)>`
+
+A list of all pre-defined data types can be found [below](#list-of-data-types).
+
+### Data Types in the Table API
+
+Users of the JVM-based API are dealing with instances of `org.apache.flink.table.types.DataType` within the Table API or when

Review comment:
```suggestion
Users of the JVM-based API work with instances of `org.apache.flink.table.types.DataType` within the Table API or when
```
[GitHub] [flink] sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
URL: https://github.com/apache/flink/pull/9161#discussion_r305025804

## File path: docs/dev/table/types.md

## @@ -0,0 +1,1201 @@
+---
+title: "Data Types"
+nav-parent_id: tableapi
+nav-pos: 1
+---
+
+Due to historical reasons, the data types of Flink's Table & SQL API were closely coupled to Flink's
+`TypeInformation` before Flink 1.9. `TypeInformation` is used in DataSet and DataStream API and is
+sufficient to describe all information needed to serialize and deserialize JVM-based objects in a
+distributed setting.
+
+However, `TypeInformation` was not designed to properly represent logical types independent of an
+actual JVM class. In the past, it was difficult to properly map SQL standard types to this abstraction.
+Furthermore, some types were not SQL-compliant and were introduced without a bigger picture in mind.
+
+Starting with Flink 1.9, the Table & SQL API will receive a new type system that serves as a long-term
+solution for API stability and standard compliance.
+
+Reworking the type system is a major effort that touches almost all user-facing interfaces. Therefore, its introduction
+spans multiple releases and the community aims to finish this effort by Flink 1.10.
+
+Due to the simultaneous addition of a new planner for table programs (see [FLINK-11439](https://issues.apache.org/jira/browse/FLINK-11439)),
+not every combination of planner and data type is supported. Furthermore, planners might not support every
+data type with the desired precision or parameter.
+
+Attention: Please see the planner compatibility table and limitations
+section before using a data type.
+
+* This will be replaced by the TOC
+{:toc}
+
+Data Type
+---------
+
+A *data type* describes the logical type of a value in the table ecosystem. It can be used to declare input and/or
+output types of operations.
+
+Flink's data types are similar to the SQL standard's *data type* terminology but also contain information
+about the nullability of a value for efficient handling of scalar expressions.
+
+Examples of data types are:
+- `INT`
+- `INT NOT NULL`
+- `INTERVAL DAY TO SECOND(3)`
+- `ROW<myField ARRAY<BOOLEAN>, myOtherField TIMESTAMP(3)>`
+
+A list of all pre-defined data types can be found [below](#list-of-data-types).
+
+### Data Types in the Table API
+
+Users of the JVM-based API are dealing with instances of `org.apache.flink.table.types.DataType` within the Table API or when
+defining connectors, catalogs, or user-defined functions.
+
+A `DataType` instance has two responsibilities:
+- **Declaration of a logical type** which does not imply a concrete physical representation for transmission
+or storage but defines the boundaries between JVM-based languages and the table ecosystem.
+- *Optional:* **Giving hints about the physical representation of data to the planner** which is useful at the edges to other APIs.
+
+For JVM-based languages, all pre-defined data types are available in `org.apache.flink.table.api.DataTypes`.
+
+It is recommended to add a star import to your table programs for having a fluent API:
+
+{% highlight java %}
+import static org.apache.flink.table.api.DataTypes.*;
+
+DataType t = INTERVAL(DAY(), SECOND(3));
+{% endhighlight %}
+
+{% highlight scala %}
+import org.apache.flink.table.api.DataTypes._
+
+val t: DataType = INTERVAL(DAY(), SECOND(3))
+{% endhighlight %}
+
+#### Physical Hints
+
+Physical hints are required at the edges of the table ecosystem. Hints indicate the data format that an implementation
+expects.
+
+For example, a data source could express that it produces values for logical `TIMESTAMP`s using a `java.sql.Timestamp` class
+instead of using `java.time.LocalDateTime` which would be the default. With this information, the runtime is able to convert
+the produced class into its internal data format.
+In return, a data sink can declare the data format it consumes from the runtime.
+
+Here are some examples of how to declare a bridging conversion class:
+
+{% highlight java %}
+// tell the runtime to not produce or consume java.time.LocalDateTime instances
+// but java.sql.Timestamp
+DataType t = DataTypes.TIMESTAMP(3).bridgedTo(java.sql.Timestamp.class);
+
+// tell the runtime to not produce or consume boxed integer arrays
+// but primitive int arrays
+DataType t = DataTypes.ARRAY(DataTypes.INT().notNull()).bridgedTo(int[].class);
+{% endhighlight %}
+
+{% highlight scala %}
+// tell the runtime to not produce or consume java.time.LocalDateTime instances
+// but java.sql.Timestamp
+val t: DataType = DataTypes.TIMESTAMP(3).bridgedTo(classOf[java.sql.Timestamp])
+
+// tell the runtime to not produce or consume boxed integer arrays
+// but primitive int arrays
+val t: DataType = DataTypes.ARRAY(DataTypes.INT().notNull()).bridgedTo(classOf[Array[Int]])
+{% endhighlight %}
[GitHub] [flink] sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
URL: https://github.com/apache/flink/pull/9161#discussion_r305027128

## File path: docs/dev/table/types.md
[GitHub] [flink] sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
URL: https://github.com/apache/flink/pull/9161#discussion_r305025907

## File path: docs/dev/table/types.md
[GitHub] [flink] sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
URL: https://github.com/apache/flink/pull/9161#discussion_r305027613

## File path: docs/dev/table/types.md
[GitHub] [flink] sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system URL: https://github.com/apache/flink/pull/9161#discussion_r305026501 ## File path: docs/dev/table/types.md ## @@ -0,0 +1,1201 @@ +--- +title: "Data Types" +nav-parent_id: tableapi +nav-pos: 1 +--- + + +Due to historical reasons, the data types of Flink's Table & SQL API were closely coupled to Flink's +`TypeInformation` before Flink 1.9. `TypeInformation` is used in DataSet and DataStream API and is +sufficient to describe all information needed to serialize and deserialize JVM-based objects in a +distributed setting. + +However, `TypeInformation` was not designed to properly represent logical types independent of an +actual JVM class. In the past, it was difficult to properly map SQL standard types to this abstraction. +Furthermore, some types were not SQL-compliant and were introduced without a bigger picture in mind. + +Starting with Flink 1.9, the Table & SQL API will receive a new type system that serves as a long-term +solution for API stablility and standard compliance. + +Reworking the type system is a major effort that touches almost all user-facing interfaces. Therefore, its introduction +spans multiple releases and the community aims to finish this effort by Flink 1.10. + +Due to the simultaneous addition of a new planner for table programs (see [FLINK-11439](https://issues.apache.org/jira/browse/FLINK-11439)), +not every combination of planner and data type is supported. Furthermore, planners might not support every +data type with the desired precision or parameter. + +Attention Please see the planner compatibility table and limitations +section before using a data type. + +* This will be replaced by the TOC +{:toc} + +Data Type +- + +A *data type* describes the data type of a value in the table ecosystem. It can be used to declare input and/or +output types of operations. 
+ +Flink's data types are similar to the SQL standard's *data type* terminology but also contain information +about the nullability of a value for efficient handling of scalar expressions. + +Examples of data types are: +- `INT` +- `INT NOT NULL` +- `INTERVAL DAY TO SECOND(3)` +- `ROW, myOtherField TIMESTAMP(3)>` + +A list of all pre-defined data types can be found in [below](#list-of-data-types). + +### Data Types in the Table API + +Users of the JVM-based API are dealing with instances of `org.apache.flink.table.types.DataType` within the Table API or when +defining connectors, catalogs, or user-defined functions. + +A `DataType` instance has two responsibilities: +- **Declaration of a logical type** which does not imply a concrete physical representation for transmission +or storage but defines the boundaries between JVM-based languages and the table ecosystem. +- *Optional:* **Giving hints about the physical representation of data to the planner** which is useful at the edges to other APIs . + +For JVM-based languages, all pre-defined data types are available in `org.apache.flink.table.api.DataTypes`. + +It is recommended to add a star import to your table programs for having a fluent API: + + + + +{% highlight java %} +import static org.apache.flink.table.api.DataTypes.*; + +DataType t = INTERVAL(DAY(), SECOND(3)); +{% endhighlight %} + + + +{% highlight scala %} +import org.apache.flink.table.api.DataTypes._ + +val t: DataType = INTERVAL(DAY(), SECOND(3)); +{% endhighlight %} + + + + + Physical Hints + +Physical hints are required at the edges of the table ecosystem. Hints indicate the data format that an implementation +expects. + +For example, a data source could express that it produces values for logical `TIMESTAMP`s using a `java.sql.Timestamp` class +instead of using `java.time.LocalDateTime` which would be the default. With this information, the runtime is able to convert +the produced class into its internal data format. 
In return, a data sink can declare the data format it consumes from the runtime. + +Here are some examples of how to declare a bridging conversion class: + + + + +{% highlight java %} +// tell the runtime to not produce or consume java.time.LocalDateTime instances +// but java.sql.Timestamp +DataType t = DataTypes.TIMESTAMP(3).bridgedTo(java.sql.Timestamp.class); + +// tell the runtime to not produce or consume boxed integer arrays +// but primitive int arrays +DataType t = DataTypes.ARRAY(DataTypes.INT().notNull()).bridgedTo(int[].class); +{% endhighlight %} + + + +{% highlight scala %} +// tell the runtime to not produce or consume java.time.LocalDateTime instances +// but java.sql.Timestamp +val t: DataType = DataTypes.TIMESTAMP(3).bridgedTo(classOf[java.sql.Timestamp]); + +// tell the runtime to not produce or consume boxed integer arrays +// but primitive int arrays +val t: DataType = DataTypes.ARRAY(DataTypes.INT().notNull()).bridgedTo(classOf[Array[Int]]); +{% endhighlight %} + + + +
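The bridging pattern quoted above can be boiled down to a small sketch. The class below is not Flink's implementation (`SketchType` and its fields are hypothetical); it only illustrates how a logical type carries a default conversion class that a `bridgedTo`-style hint can override, mirroring the name of the real `DataType#bridgedTo` method:

```java
// Hypothetical sketch, not Flink's DataType: a logical type paired with a
// default conversion class, where bridgedTo(...) overrides the physical hint.
public class TypeSketch {
    static final class SketchType {
        final String logicalType;       // e.g. "TIMESTAMP(3)"
        final Class<?> conversionClass; // physical representation hint

        SketchType(String logicalType, Class<?> conversionClass) {
            this.logicalType = logicalType;
            this.conversionClass = conversionClass;
        }

        // analogous to DataType#bridgedTo: same logical type, new physical hint
        SketchType bridgedTo(Class<?> clazz) {
            return new SketchType(logicalType, clazz);
        }
    }

    public static void main(String[] args) {
        SketchType t = new SketchType("TIMESTAMP(3)", java.time.LocalDateTime.class);
        SketchType hinted = t.bridgedTo(java.sql.Timestamp.class);
        System.out.println(t.conversionClass.getName());       // default class
        System.out.println(hinted.conversionClass.getName());  // hinted class
    }
}
```

The logical type stays `TIMESTAMP(3)` in both cases; only the class crossing the API boundary changes.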
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system URL: https://github.com/apache/flink/pull/9161#discussion_r305022907 ## File path: docs/dev/table/types.md ## +Reworking the type system is a major effort that touches almost all user-facing interfaces. Therefore, its introduction +spans multiple releases and the community aims to finish this effort by Flink 1.10. Review comment: ```suggestion spans multiple releases, and the community aims to finish this effort by Flink 1.10. ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system URL: https://github.com/apache/flink/pull/9161#discussion_r305024070 ## File path: docs/dev/table/types.md ## +A *data type* describes the data type of a value in the table ecosystem. It can be used to declare input and/or +output types of operations. Review comment: >A *data type* describes a data type . . . It sounds strange to me since this sentence is trying to describe what a data type is. Would something like "A *data type* describes the logical types of a value . . " make sense? I don't know enough about the new type system to know if that's correct.
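One way to make the quoted sentence concrete: in the new type system a data type is a logical type plus nullability, so `INT` and `INT NOT NULL` are two distinct data types. A minimal sketch follows (the `SketchType` class is hypothetical, not Flink's; only the `notNull()` name mirrors the real `DataType` API):

```java
// Hypothetical sketch, not Flink's classes: a data type here is a logical
// type root plus a nullability flag, rendered in SQL-standard notation.
public class NullabilitySketch {
    static final class SketchType {
        final String logicalRoot;
        final boolean nullable;

        SketchType(String logicalRoot, boolean nullable) {
            this.logicalRoot = logicalRoot;
            this.nullable = nullable;
        }

        // analogous to DataType#notNull in the Table API
        SketchType notNull() {
            return new SketchType(logicalRoot, false);
        }

        String asSerializableString() {
            return nullable ? logicalRoot : logicalRoot + " NOT NULL";
        }
    }

    public static void main(String[] args) {
        SketchType intType = new SketchType("INT", true);
        System.out.println(intType.asSerializableString());
        System.out.println(intType.notNull().asSerializableString());
    }
}
```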
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system URL: https://github.com/apache/flink/pull/9161#discussion_r305024625 ## File path: docs/dev/table/types.md ## +### Data Types in the Table API + +Users of the JVM-based API are dealing with instances of `org.apache.flink.table.types.DataType` within the Table API or when Review comment: Is JVM correct or are DataTypes in the python table api as well?
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system URL: https://github.com/apache/flink/pull/9161#discussion_r304912777 ## File path: docs/dev/table/types.md ## @@ -0,0 +1,1201 @@ +--- +title: "Data Types" +nav-parent_id: tableapi +nav-pos: 1 +--- + + +Due to historical reasons, the data types of Flink's Table & SQL API were closely coupled to Flink's Review comment: Due to historical reasons, before Flink 1.9, Flink's Table & SQL API data types are tightly coupled to Flink's TypeInformation. TypeInformation is used in DataSet and DataStream API's and is sufficient to describe all information needed to serialize and deserialize JVM-based objects in a distributed setting.
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system URL: https://github.com/apache/flink/pull/9161#discussion_r305025425 ## File path: docs/dev/table/types.md ## + Physical Hints + +Physical hints are required at the edges of the table ecosystem. Hints indicate the data format that an implementation +expects. Review comment: Can you expand on what the edges are?
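Regarding the question about edges: in this context they are the boundaries where external JVM objects enter or leave the table runtime, i.e. sources and sinks. A rough sketch of why the hint matters at a source edge (hypothetical names throughout, not Flink's connector API; the internal format here is simplified to epoch milliseconds):

```java
// Hypothetical sketch, not Flink's API: the "edge" is where a source hands
// objects to the runtime. The physical hint tells the runtime which external
// class to expect so it can convert it into its internal format.
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class EdgeSketch {
    static long toInternal(Object produced) {
        if (produced instanceof Timestamp) {
            // the source declared bridgedTo(java.sql.Timestamp.class)
            return ((Timestamp) produced).getTime();
        }
        if (produced instanceof LocalDateTime) {
            // default conversion class for a logical TIMESTAMP
            return ((LocalDateTime) produced).toInstant(ZoneOffset.UTC).toEpochMilli();
        }
        throw new IllegalArgumentException("no conversion for " + produced.getClass());
    }

    public static void main(String[] args) {
        // a source producing java.sql.Timestamp values at the edge
        System.out.println(toInternal(new Timestamp(1000L)));
        // a sink edge would do the reverse: internal format -> declared class
    }
}
```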
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system URL: https://github.com/apache/flink/pull/9161#discussion_r304913208 ## File path: docs/dev/table/types.md ## +However, `TypeInformation` was not designed to properly represent logical types independent of an Review comment: However, TypeInformation was not designed to represent logical types independent of an actual JVM class. In the past, it was difficult to map SQL standard types to this abstraction. Furthermore, some types were not SQL-compliant and introduced without a bigger picture in mind.
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system URL: https://github.com/apache/flink/pull/9161#discussion_r305026826 ## File path: docs/dev/table/types.md ##
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system URL: https://github.com/apache/flink/pull/9161#discussion_r305022448 ## File path: docs/dev/table/types.md ## +Starting with Flink 1.9, the Table & SQL API will receive a new type system that serves as a long-term +solution for API stablility and standard compliance. Review comment: ```suggestion solution for API stability and standard compliance. ```
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system URL: https://github.com/apache/flink/pull/9161#discussion_r304915601 ## File path: docs/dev/table/types.md ## +A list of all pre-defined data types can be found in [below](#list-of-data-types). Review comment: ```suggestion A list of all pre-defined data types can be found [below](#list-of-data-types). ```
sjwiesman commented on a change in pull request #9161: [FLINK-13262][docs] Add documentation for the new Table & SQL API type system URL: https://github.com/apache/flink/pull/9161#discussion_r305027290 ## File path: docs/dev/table/types.md ##