manuzhang commented on code in PR #15630:
URL: https://github.com/apache/iceberg/pull/15630#discussion_r3279214807


##########
format/spec.md:
##########
@@ -168,6 +184,48 @@ All columns must be written to data files even if they 
introduce redundancy with
 
 Writers are not allowed to commit files with a partition spec that contains a 
field with an unknown transform.
 
+### Paths in Metadata
+
+Path strings stored in Iceberg metadata location fields are classified as one 
of two types:
+
+* **Absolute path** -- A path string that starts with a [URI 
scheme](https://datatracker.ietf.org/doc/html/rfc3986#section-3.1) (e.g., 
`s3:`, `gs:`, `hdfs:`, `file:`). Absolute paths are used as-is without 
modification.
+* **Relative path** -- A path string that does not start with a URI scheme. 
Relative paths must be resolved against the table's base location before use.
+
+Prior to v4, all path fields must contain fully-qualified paths. Starting with 
v4, path fields may contain either absolute or relative paths. [Relative 
resolution within a 
URI](https://datatracker.ietf.org/doc/html/rfc3986#section-5.2) (e.g. `.` and 
`..`) and other file system navigation conventions are not supported in 
relative paths.
+
+#### Path Resolution
+
+Path resolution is the process of producing an absolute path from a relative 
path by combining it with the table's base location:
+
+* If the path starts with a URI scheme, it is absolute and is used without 
modification.
+* If the path does not start with a URI scheme, the resolved path is the table 
location followed by the relative path joined by the URI separator character 
`/`.
+
+The relative portion is joined to the prefix (table location) without 
consideration of any additional separator characters. The recommended 
convention for table location is to not end in a path separator because the 
join process would add a second separator character. (See example below.)
+
+Paths in manifests produced prior to v4 are fully-qualified and must be 
produced with a URI scheme if the scheme was omitted to be consistent with V4 
paths.

Review Comment:
   Me too ;), might need a comma somewhere.
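
   For reference while rewording, the resolution rule in the hunk above can be sketched roughly as follows. This is only an illustrative sketch, not code from the PR or from Iceberg: the names `is_absolute` and `resolve_path` are hypothetical, and the scheme check assumes the RFC 3986 section 3.1 grammar (a letter followed by letters, digits, `+`, `-`, or `.`, then `:`).

```python
import re

# Assumption: "starts with a URI scheme" means the RFC 3986 section 3.1 grammar,
# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":".
_SCHEME_PREFIX = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:")


def is_absolute(path: str) -> bool:
    """Return True if the path string starts with a URI scheme (hypothetical helper)."""
    return _SCHEME_PREFIX.match(path) is not None


def resolve_path(table_location: str, path: str) -> str:
    """Resolve a metadata path against the table location (hypothetical helper).

    Absolute paths are returned unmodified. Relative paths are appended to the
    table location with a single "/" and no normalization, so "." and ".."
    segments or extra separators are not interpreted.
    """
    if is_absolute(path):
        return path
    return table_location + "/" + path


if __name__ == "__main__":
    print(resolve_path("s3://bucket/db/table", "data/file.parquet"))
    print(resolve_path("s3://bucket/db/table/", "data/file.parquet"))
    print(resolve_path("s3://bucket/db/table", "gs://other/data/file.parquet"))
```

   Under these assumptions, a table location that ends in `/` produces a doubled separator in the resolved path, which is the reason the hunk recommends locations that do not end in a path separator.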



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

