[GitHub] [iceberg] jackye1995 commented on a change in pull request #1825: Add timestamp to table definition in Nessie Catalog

GitBox Thu, 21 Jan 2021 22:35:27 -0800


jackye1995 commented on a change in pull request #1825:
URL: https://github.com/apache/iceberg/pull/1825#discussion_r561390896




##########
File path: nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java
##########
@@ -308,6 +314,36 @@ private UpdateableReference loadReference(String 
requestedRef) {
     }
   }
 
+  private RefreshableReference loadReferenceAtTime(String requestedRef, 
Instant timestamp) {
+    try {
+      Reference ref = requestedRef == null ? 
client.getTreeApi().getDefaultBranch()
+          : client.getTreeApi().getReferenceByName(requestedRef);
+      if (ref instanceof Hash) {
+        throw new IllegalArgumentException(String.format("Cannot specify a 
hash %s and timestamp %s together. " +
+            "The timestamp is redundant and has been ignored", requestedRef, 
timestamp));
+      }

Review comment:
       yes, thank you

##########
File path: nessie/src/main/java/org/apache/iceberg/nessie/TableReference.java
##########
@@ -64,31 +87,137 @@ public static TableReference parse(TableIdentifier path) {
    */
   public static TableReference parse(String path) {
     // I am assuming tables can't have @ or # symbols
-    if (path.split("@").length > 2) {
+    if (REF.splitToList(path).size() > 2) {
       throw new IllegalArgumentException(String.format("Can only reference one 
branch in %s", path));
     }
-    if (path.split("#").length > 2) {
+    if (TIMESTAMP.splitToList(path).size() > 2) {
       throw new IllegalArgumentException(String.format("Can only reference one 
timestamp in %s", path));
     }
 
     if (path.contains("@") && path.contains("#")) {
-      throw new IllegalArgumentException("Invalid table name:" +
-          " # is not allowed (reference by timestamp is not supported)");
+      List<String> tableRef = REF.splitToList(path);
+      if (tableRef.get(0).contains("#")) {
+        throw new IllegalArgumentException("Invalid table name:" +
+            " # is not allowed before @. Correct format is 
table@ref#timestamp");
+      }
+      TableIdentifier identifier = TableIdentifier.parse(tableRef.get(0));
+      List<String> timestampRef = TIMESTAMP.splitToList(tableRef.get(1));
+      return new TableReference(identifier, 
parseTimestamp(timestampRef.get(1)), timestampRef.get(0));
     }
 
     if (path.contains("@")) {
-      String[] tableRef = path.split("@");
-      TableIdentifier identifier = TableIdentifier.parse(tableRef[0]);
-      return new TableReference(identifier, null, tableRef[1]);
+      List<String> tableRef = REF.splitToList(path);
+      TableIdentifier identifier = TableIdentifier.parse(tableRef.get(0));
+      return new TableReference(identifier, null, tableRef.get(1));
     }
 
     if (path.contains("#")) {
-      throw new IllegalArgumentException("Invalid table name:" +
-          " # is not allowed (reference by timestamp is not supported)");
+      List<String> tableTimestamp = TIMESTAMP.splitToList(path);
+      TableIdentifier identifier = 
TableIdentifier.parse(tableTimestamp.get(0));
+      return new TableReference(identifier, 
parseTimestamp(tableTimestamp.get(1)), null);
     }
 
     TableIdentifier identifier = TableIdentifier.parse(path);
 
     return new TableReference(identifier, null, null);
   }
+
+  private enum FormatOptions {
+    DATE_TIME(DateTimeFormatter.ISO_DATE_TIME, ZonedDateTime::from),
+    LOCAL_DATE_TIME(DateTimeFormatter.ISO_LOCAL_DATE_TIME, t -> 
LocalDateTime.from(t).atZone(sparkTimezoneOrUTC())),

Review comment:
       I am fine with the current implementation with Spark specific logic and 
your refactored code, if that is a primary use case that you need to cover this 
is probably the best way to have it. But we should definitely take a note for 
this and think about how interactions and dependencies between different 
modules, especially with different engines, should be handled in the future.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [iceberg] jackye1995 commented on a change in pull request #1825: Add timestamp to table definition in Nessie Catalog

Reply via email to