godfreyhe commented on a change in pull request #12335:
URL: https://github.com/apache/flink/pull/12335#discussion_r436337888
##########
File path: flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/internal/CatalogTableSchemaResolver.java
##########

@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.internal;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.table.api.TableColumn;
+import org.apache.flink.table.api.TableException;
+import org.apache.flink.table.api.TableSchema;
+import org.apache.flink.table.api.ValidationException;
+import org.apache.flink.table.delegation.Parser;
+import org.apache.flink.table.expressions.ResolvedExpression;
+import org.apache.flink.table.types.DataType;
+import org.apache.flink.table.types.logical.LogicalType;
+import org.apache.flink.table.types.logical.TimestampKind;
+import org.apache.flink.table.types.logical.TimestampType;
+import org.apache.flink.table.types.utils.TypeConversions;
+
+import java.util.Arrays;
+
+/**
+ * The {@link CatalogTableSchemaResolver} is used to derive the correct result type of a computed
+ * column, because the data type of a computed column from a catalog table is not trusted.
+ *
+ * <p>For example, for the `proctime()` function, its type in the given TableSchema is Timestamp(3),
+ * but its correct type is Timestamp(3) *PROCTIME*.
+ */
+@Internal
+public class CatalogTableSchemaResolver {
+	public final Parser parser;
+
+	public CatalogTableSchemaResolver(Parser parser) {
+		this.parser = parser;
+	}
+
+	/**
+	 * Resolve the computed column's type for the given schema.
+	 *
+	 * @param tableSchema Table schema to derive table field names and data types
+	 * @param isStreamingMode Flag to determine whether the schema of a stream or batch table is created
+	 * @return the resolved TableSchema
+	 */
+	public TableSchema resolve(TableSchema tableSchema, boolean isStreamingMode) {
+		final String rowtime;
+		if (!tableSchema.getWatermarkSpecs().isEmpty()) {
+			// TODO: [FLINK-14473] we only support top-level rowtime attribute right now
+			rowtime = tableSchema.getWatermarkSpecs().get(0).getRowtimeAttribute();
+			if (rowtime.contains(".")) {
+				throw new ValidationException(
+					String.format("Nested field '%s' as rowtime attribute is not supported right now.", rowtime));
+			}
+		} else {
+			rowtime = null;
+		}
+
+		String[] fieldNames = tableSchema.getFieldNames();
+		DataType[] fieldTypes = Arrays.copyOf(tableSchema.getFieldDataTypes(), tableSchema.getFieldCount());
+
+		for (int i = 0; i < tableSchema.getFieldCount(); ++i) {
+			TableColumn tableColumn = tableSchema.getTableColumns().get(i);
+			if (tableColumn.isGenerated() && isProctimeType(tableColumn.getExpr().get(), tableSchema)) {
+				if (fieldNames[i].equals(rowtime)) {
+					throw new TableException("proctime can't be defined on watermark spec.");
+				}
+				TimestampType originalType = (TimestampType) fieldTypes[i].getLogicalType();
+				LogicalType proctimeType = new TimestampType(
+					originalType.isNullable(),
+					TimestampKind.PROCTIME,
+					originalType.getPrecision());
+				fieldTypes[i] = TypeConversions.fromLogicalToDataType(proctimeType);
+			} else if (isStreamingMode && fieldNames[i].equals(rowtime)) {

Review comment:
I agree with you
that we should push `isStreamingMode` into the planner as much as possible; many classes carry an `isStreamingMode` flag now. I also tried to remove `isStreamingMode` from `CatalogTableSchemaResolver`, but I found we would have to handle the "erase the rowtime type for batch" logic in at least three places (`TableEnvironmentImpl#scanInternal` for the Table API, `CatalogSchemaTable#getRowType` for `CatalogTable`, and `DatabaseCalciteSchema#getTable` for `QueryOperationCatalogView`). That is because we must make sure the type of a `Table` from the catalog and the type of the `RelNode` expanded from that `Table` (e.g. adding a projection node for a computed column, adding a watermark assigner node for a watermark, or expanding different kinds of views) are the same. In the long term, I think we should also keep the rowtime type for batch (e.g. to support rowtime temporal joins); then we can remove `isStreamingMode` from `CatalogTableSchemaResolver`, and much of the related logic can be simplified. (It's too complex to handle the different kinds of tables and views for both batch and streaming.)
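To make the resolver's type fix-up concrete, here is a minimal, self-contained sketch of the substitution it performs on a proctime computed column. This is not Flink's real API: `SimpleTimestampType`, `Kind`, and `patchToProctime` are hypothetical stand-ins for `TimestampType`, `TimestampKind`, and the patching code inside `resolve()`. The point it illustrates is that nullability and precision are preserved from the untrusted catalog type, and only the kind is switched to PROCTIME.

```java
// Hypothetical, simplified stand-ins for Flink's TimestampType/TimestampKind,
// used only to illustrate the kind substitution in CatalogTableSchemaResolver.
public class ProctimePatchSketch {

    enum Kind { REGULAR, ROWTIME, PROCTIME }

    static final class SimpleTimestampType {
        final boolean nullable;
        final Kind kind;
        final int precision;

        SimpleTimestampType(boolean nullable, Kind kind, int precision) {
            this.nullable = nullable;
            this.kind = kind;
            this.precision = precision;
        }

        @Override
        public String toString() {
            String s = "TIMESTAMP(" + precision + ")";
            if (kind == Kind.PROCTIME) {
                s += " *PROCTIME*";
            } else if (kind == Kind.ROWTIME) {
                s += " *ROWTIME*";
            }
            return s;
        }
    }

    // Mirrors the patch inside resolve(): keep nullability and precision from
    // the catalog type, replace only the kind with PROCTIME.
    static SimpleTimestampType patchToProctime(SimpleTimestampType original) {
        return new SimpleTimestampType(original.nullable, Kind.PROCTIME, original.precision);
    }

    public static void main(String[] args) {
        // The untrusted type as stored in the catalog's TableSchema.
        SimpleTimestampType fromCatalog = new SimpleTimestampType(true, Kind.REGULAR, 3);
        System.out.println(fromCatalog);                   // TIMESTAMP(3)
        System.out.println(patchToProctime(fromCatalog));  // TIMESTAMP(3) *PROCTIME*
    }
}
```

This also shows why the comment above matters: the substitution is purely local to the column's type, so whichever layer owns the streaming/batch decision ends up owning this patch as well.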
