kbendick commented on pull request #2850:
URL: https://github.com/apache/iceberg/pull/2850#issuecomment-886463630


   In testing etc., I very often use a similar pattern (usually with a timestamp as the table suffix).
   
   However, I'm not sure the Iceberg codebase is the best place to be doing this.
   
   What other tools are you using to create these tables with UUID suffixes? Usually, when I encounter this need, I'm doing it in one of two places:
   (1) Directly from shell scripts or small Spark / Trino jobs when testing on S3 (and wanting to ensure a brand new table). The solution for me there is simply to append a timestamp to the table name in the code. Here's a sample from some code I have elsewhere:
   ```scala
       // java.util.Date supplies the timestamp used as the table suffix
       import java.util.Date

       val currentTime = new Date().getTime
       val tableName = s"table_$currentTime"
       spark.sql(s"CREATE TABLE IF NOT EXISTS my_catalog.default.$tableName (name string, age int) USING iceberg")
   ```
   (2) From some sort of scheduling tool, such as Airflow or Azkaban. In this case, it's very easy to create a UUID when passing in the "new table name" to the Spark job, as in the sketch below.
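   
   As a minimal sketch of what I mean (the `CreateScratchTable` object, the argument handling, and the UUID-based fallback here are just illustrative assumptions, not anything in Iceberg), the scheduler generates the name and the Spark job simply reads it from its arguments:
   ```scala
   import java.util.UUID

   import org.apache.spark.sql.SparkSession

   object CreateScratchTable {
     def main(args: Array[String]): Unit = {
       // The scheduler (Airflow, Azkaban, etc.) passes the table name as the first
       // argument; fall back to a UUID-suffixed name for ad hoc runs. Hyphens are
       // replaced since they aren't valid in unquoted table identifiers.
       val tableName = args.headOption
         .getOrElse(s"table_${UUID.randomUUID().toString.replace("-", "_")}")

       val spark = SparkSession.builder().appName("create-scratch-table").getOrCreate()
       spark.sql(s"CREATE TABLE IF NOT EXISTS my_catalog.default.$tableName (name string, age int) USING iceberg")
       spark.stop()
     }
   }
   ```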
   
   Effectively, for me, I'm not sure this is something that makes sense to place in Iceberg itself.
   
   Can you elaborate further on why this isn't something that you can pass as an argument to your jobs? It feels very use-case specific, with possible ways for you to deal with it using existing tools, but maybe I'm not fully understanding the scope of your problem. 🙂 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


