rdblue commented on code in PR #7500: URL: https://github.com/apache/iceberg/pull/7500#discussion_r1185594321
##########
format/view-spec.md:
##########
@@ -96,11 +96,15 @@ Summary is a string to string map of metadata about a view version. Common metad
 View definitions can be represented in multiple ways. Representations are documented ways to express a view definition.
-A view version can have more than one representation. All representations for a version must express the same underlying definition. Engines are free to choose the representation to use.
+A view version can have more than one representation. All representations for a version must express the same underlying definition. Engines are free to choose the representation(s) to use.
+
+View versions are immutable. Once a version is created, it cannot be changed. This means that representations for a version cannot be changed. If a view definition changes (or new representations are to be added), a new version must be created.
 Each representation is an object with at least one common field, `type`, that is one of the following:
 * `sql`: a SQL SELECT statement that defines the view
+In addition to `type`, each representation defines a `dialect`. A `dialect` is a string that identifies the query language dialect used in the representation. For example, `trino` or `spark`. A view version cannot have duplicate representations with the same `type` and `dialect`.

Review Comment:
> The dialect field is required. Did not notice something in the spec that says it was not true.

Dialect is required for the `SQLViewRepresentation`. The common representation fields are defined, but currently only include `type`.

> I was thinking of other languages such as Datalog, SPARQL, Cypher. Those could have dialects too.

In that case, a representation that contains those might have a dialect. But IR representations like Substrait would probably not have a dialect.

> I was actually under the impression that the spec allows multiple SQL representations with different dialects.

We should decide. This would be fine with the current encoding.
Since we don't have a reliable dialect in all cases, I don't think we would want to build rules based on it. We should just decide whether you can have multiple representations with the same type or not.
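
To make the two candidate rules concrete, here is a minimal Python sketch (not Iceberg code). The `Representation` class, both helper functions, and the `substrait` type name are hypothetical and exist only for illustration. It assumes `dialect` may be absent for some representation types, as discussed above.

```python
from collections import Counter
from typing import List, NamedTuple, Optional, Tuple


class Representation(NamedTuple):
    """Hypothetical stand-in for a view representation (illustration only)."""
    type: str
    dialect: Optional[str] = None  # assumed optional: an IR type may not have one


def duplicate_types(reps: List[Representation]) -> List[str]:
    """Rule A: flag any `type` that appears more than once in a version."""
    counts = Counter(r.type for r in reps)
    return [t for t, n in counts.items() if n > 1]


def duplicate_type_dialect_pairs(
    reps: List[Representation],
) -> List[Tuple[str, Optional[str]]]:
    """Rule B: flag any (`type`, `dialect`) pair that appears more than once."""
    counts = Counter((r.type, r.dialect) for r in reps)
    return [pair for pair, n in counts.items() if n > 1]


reps = [
    Representation("sql", "trino"),
    Representation("sql", "spark"),
    Representation("substrait"),  # hypothetical type with no dialect
]

print(duplicate_types(reps))               # rule A rejects the two sql entries
print(duplicate_type_dialect_pairs(reps))  # rule B accepts this version
```

Under rule B, two representations of a dialect-less type both map to `(type, None)` and are treated as duplicates. This shows how a rule keyed on `dialect` behaves when the dialect is missing.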
