[GitHub] [iceberg] rdblue commented on a change in pull request #3269: Spark: Support time travel through table names

GitBox Mon, 11 Oct 2021 13:15:08 -0700


rdblue commented on a change in pull request #3269:
URL: https://github.com/apache/iceberg/pull/3269#discussion_r726539751




##########
File path: spark3/src/main/java/org/apache/iceberg/spark/SparkCatalog.java
##########
@@ -80,6 +87,9 @@
  */
 public class SparkCatalog extends BaseCatalog {
   private static final Set<String> DEFAULT_NS_KEYS = 
ImmutableSet.of(TableCatalog.PROP_OWNER);
+  private static final Splitter COMMA = Splitter.on(",");
+  private static final Pattern AT_TIME = 
Pattern.compile("at(?:_(?:time(?:stamp)?)?)?_?(\\d+)");

Review comment:
       I may have gone a bit too far here. I think that we want to be able to 
use the full "at_timestamp_12938471" version because that is the least likely 
to conflict with existing table names. Similarly, I think it is valuable to 
have a short version, like `at_<timestamp>` and `snap_<id>` so you don't have 
to type `at_timestamp_` or the full `snapshot_id_` every time. Since we were 
already testing multiple prefixes, I added a few that I thought would be 
valuable to avoid confusion:
   * Make the final `_` optional because that's an easy mistake to make
   * Allow `time` and not just `timestamp` because people may not remember
   * Allow omitting `_id` and allow just `snap_` or `s_` as shortened forms
   
   The logic made sense at every step but there are quite a few variations. I'm 
up for defining the full version and one shortening if you think it's best to 
have just a couple.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [iceberg] rdblue commented on a change in pull request #3269: Spark: Support time travel through table names

Reply via email to