kbendick commented on pull request #2850:
URL: https://github.com/apache/iceberg/pull/2850#issuecomment-886463630
When testing and the like, I very often use a similar pattern (usually with a
timestamp as the table suffix).
However, I'm not sure the Iceberg code is the best place to be doing this.
What other tools are you using to create these tables that have UUID
suffixes? Usually, when I encounter this need, I'm doing it in one of two
places:
(1) Directly from shell scripts or small Spark / Trino jobs when testing on
S3 (and wanting to ensure a brand new table). The solution for me there is
simply to build the table name with a timestamp directly in the code. Here's a
sample from some code I have elsewhere:
```scala
import java.util.Date

// Use the current timestamp as a unique suffix for the table name
val currentTime = new Date().getTime
val tableName = "table_" + currentTime

spark.sql(
  s"CREATE TABLE IF NOT EXISTS my_catalog.default.${tableName} " +
    "(name string, age int) USING iceberg")
```
(2) From some sort of scheduling tool, such as Airflow or Azkaban. In this
case, it's very easy to create a UUID when passing in the "new table name" to
the Spark job.
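For (2), here's a minimal sketch of what I mean. The job class name `CreateTableJob` and the catalog name `my_catalog` are just placeholders for illustration; the idea is that the scheduler generates the UUID and passes the resulting table name to the job as an argument, with a local fallback if nothing is passed:
```scala
// Hypothetical sketch: a small Spark job that takes the table name as a program
// argument, so a scheduler (Airflow, Azkaban, ...) can generate the UUID suffix
// upstream and pass it in. Names here are made up for illustration.
import java.util.UUID

import org.apache.spark.sql.SparkSession

object CreateTableJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("create-uuid-table")
      .getOrCreate()

    // Prefer the name handed in by the scheduler; otherwise generate one locally.
    val tableName = args.headOption.getOrElse(
      "table_" + UUID.randomUUID().toString.replace("-", "_"))

    spark.sql(
      s"CREATE TABLE IF NOT EXISTS my_catalog.default.${tableName} " +
        "(name string, age int) USING iceberg")
  }
}
```
The scheduler would then just invoke it along the lines of `spark-submit --class CreateTableJob ... table_<uuid>`, keeping the naming logic entirely outside of Iceberg.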
Effectively, for me, I'm not sure this is something that makes sense to
place in Iceberg.
Can you elaborate further on why this isn't something you can pass as
an argument to your jobs? It feels very use-case specific, with possible
ways for you to handle it using existing tools, but maybe I'm not fully
understanding the scope of your problem. 🙂