[GitHub] [iceberg] rdblue commented on a diff in pull request #7500: Views: Update spec with expectations on versions, representations, and dialects

via GitHub Thu, 04 May 2023 16:17:42 -0700


rdblue commented on code in PR #7500:
URL: https://github.com/apache/iceberg/pull/7500#discussion_r1185594321



##########
format/view-spec.md:
##########
@@ -96,11 +96,15 @@ Summary is a string to string map of metadata about a view version. Common metad
 
 View definitions can be represented in multiple ways. Representations are documented ways to express a view definition.
 
-A view version can have more than one representation. All representations for a version must express the same underlying definition. Engines are free to choose the representation to use.
+A view version can have more than one representation. All representations for a version must express the same underlying definition. Engines are free to choose the representation(s) to use.
+
+View versions are immutable. Once a version is created, it cannot be changed. This means that representations for a version cannot be changed. If a view definition changes (or new representations are to be added), a new version must be created.
 
 Each representation is an object with at least one common field, `type`, that is one of the following:
  * `sql`: a SQL SELECT statement that defines the view
 
+In addition to `type`, each representation defines a `dialect`. A `dialect` is a string that identifies the query language dialect used in the representation. For example, `trino` or `spark`. A view version cannot have duplicate representations with the same `type` and `dialect`.

Review Comment:
   > The dialect field is required. Did not notice something in the spec that says it was not true.
   
   Dialect is required for the `SQLViewRepresentation`. The common representation fields are defined, but currently only include `type`.
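
As a rough illustration of that split between common and type-specific fields, here is a minimal Python sketch. The names `COMMON_REQUIRED`, `TYPE_REQUIRED`, and `missing_fields` are hypothetical and not part of the Iceberg API. The `sql` field holding the statement text and the `substrait` type string are assumptions made for the example.

```python
# Hypothetical sketch, not the Iceberg implementation.

# Fields every representation must carry, regardless of its type.
COMMON_REQUIRED = {"type"}

# Extra required fields per representation type.
# Assumption: a SQL representation stores its statement in a `sql` field.
TYPE_REQUIRED = {
    "sql": {"sql", "dialect"},
}


def missing_fields(representation):
    """Return the sorted names of required fields absent from a representation."""
    required = set(COMMON_REQUIRED)
    # Types without an entry (e.g. an IR such as Substrait) add no extra fields.
    required |= TYPE_REQUIRED.get(representation.get("type"), set())
    return sorted(required - set(representation))


complete_sql = {"type": "sql", "sql": "SELECT 1", "dialect": "spark"}
sql_without_dialect = {"type": "sql", "sql": "SELECT 1"}
ir_representation = {"type": "substrait"}  # assumed type name, no dialect
```

Under these assumptions, only the SQL representation is rejected for lacking a `dialect`; a representation of another type passes with `type` alone.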
   
   > I was thinking of other languages such as Datalog, SPARQL, Cypher. Those could have dialects too.
   
   In that case, a representation that contains those might have a dialect. But IR representations like Substrait would probably not have a dialect.
   
   > I was actually under the impression that the spec allows multiple SQL representations with different dialects.
   
   We should decide. This would be fine with the current encoding. Since we don't have a reliable dialect in all cases, I don't think we would want to build rules based on it. We should just decide whether you can have multiple representations with the same type or not.
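
To make the two candidate rules concrete, here is a minimal Python sketch that checks a version's representations for duplicates under a configurable key. The helper `duplicate_keys` and the sample data are hypothetical, and the `substrait` type string is an assumption; none of this is taken from the Iceberg codebase.

```python
from collections import Counter


def duplicate_keys(representations, key_fields):
    """Return the key tuples that occur more than once, in first-seen order."""
    counts = Counter(
        tuple(rep.get(field) for field in key_fields) for rep in representations
    )
    return [key for key, n in counts.items() if n > 1]


# Two SQL representations of the same view in different dialects.
sql_reps = [
    {"type": "sql", "dialect": "trino", "sql": "SELECT 1"},
    {"type": "sql", "dialect": "spark", "sql": "SELECT 1"},
]

# Rule proposed in the diff: unique per (type, dialect).
per_type_and_dialect = duplicate_keys(sql_reps, ("type", "dialect"))

# Alternative raised in this comment: unique per type only.
per_type = duplicate_keys(sql_reps, ("type",))

# Representations without a dialect all share the key (type, None).
ir_reps = [{"type": "substrait"}, {"type": "substrait"}]
ir_duplicates = duplicate_keys(ir_reps, ("type", "dialect"))
```

Keying on `(type, dialect)` accepts the two SQL representations, while keying on `type` alone flags them. When no dialect is present, the `(type, dialect)` rule reduces to a per-type rule anyway, which is one reason to settle the question at the type level.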



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

