anigos commented on code in PR #8194:
URL: https://github.com/apache/iceberg/pull/8194#discussion_r1283931646


##########
core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java:
##########
@@ -192,6 +192,14 @@ public Table create() {
 
       String baseLocation = location != null ? location : defaultWarehouseLocation(identifier);
       tableProperties.putAll(tableOverrideProperties());
+
+      if (Boolean.parseBoolean(tableProperties.get(TableProperties.UNIQUE_LOCATION))) {
+        boolean alreadyExists = ops.io().newInputFile(baseLocation).exists();
+        if (alreadyExists) {
+          throw new AlreadyExistsException("Table location already in use: %s", baseLocation);

Review Comment:
   I have thought this through, and mainly two rules come to mind. We could
   approach it this way:
   
   1. No database creation should be allowed under an existing database path.
   This would prevent a major problem: people creating databases under an
   existing db path.
   2. No table creation should be allowed under an existing table path. 
   
   **Case 1**
   
   We already have the following information: an existing table and its
   location.
   
   Say a table's location is `s3://somerandompath/my_database/my_table`.
    
   Instead of checking through FileIO, why not leverage our own metadata? There
   are various ways to create an Iceberg table: just via database.tableName,
   with an explicit location, etc. In practice the DB path is always a constant
   path. If someone tries to create a table at the same location with the same
   name, we can throw an exception saying that
   `s3://somerandompath/my_database/my_table` already exists just by looking at
   its database reference, which should be one level up. Only a path exactly one
   level under a database path should be a permissible table path. The
   uniqueness guarantee does not necessarily have to come from the storage file
   location; it can come from our metadata.
   
   ```
   CREATE TABLE prod.db.sample
   USING iceberg
   PARTITIONED BY (part)
   TBLPROPERTIES ('key'='value')
   AS SELECT ...
   ```
   
   OR 
   
   ```
   CREATE TABLE IF NOT EXISTS prod.db.sample (
     id integer,
     ......
   )
   USING ICEBERG
   LOCATION
   TBLPROPERTIES (
     'type' 'hive',.......
   )
   ```
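   
   To make the Case 1 idea concrete, here is a minimal sketch of a
   metadata-based check. It is plain Java and does not use any Iceberg API:
   `MetadataLocationCheck`, `isDirectChild`, and `register` are hypothetical
   names, and the in-memory map only stands in for whatever the catalog already
   knows about table locations.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;

   // Hypothetical sketch: none of these names are Iceberg APIs.
   public class MetadataLocationCheck {
     // Known table locations (normalized) mapped to the owning table name.
     private final Map<String, String> tablesByLocation = new HashMap<>();

     // Strip trailing slashes so "s3://b/db/t/" and "s3://b/db/t" compare equal.
     public static String normalize(String location) {
       String trimmed = location.trim();
       while (trimmed.endsWith("/")) {
         trimmed = trimmed.substring(0, trimmed.length() - 1);
       }
       return trimmed;
     }

     // True only when tableLocation is exactly one path segment below dbLocation.
     public static boolean isDirectChild(String dbLocation, String tableLocation) {
       String db = normalize(dbLocation) + "/";
       String table = normalize(tableLocation);
       if (!table.startsWith(db)) {
         return false;
       }
       String rest = table.substring(db.length());
       return !rest.isEmpty() && !rest.contains("/");
     }

     // Record a new table, rejecting nested paths and locations already in use.
     public void register(String dbLocation, String tableName, String tableLocation) {
       String key = normalize(tableLocation);
       if (!isDirectChild(dbLocation, key)) {
         throw new IllegalArgumentException(
             "Table location must be one level under the database path: " + key);
       }
       String existing = tablesByLocation.putIfAbsent(key, tableName);
       if (existing != null) {
         throw new IllegalStateException("Table location already in use: " + key);
       }
     }
   }
   ```
   
   The point of the sketch is that both rules (one level under the db path, and
   no reuse of a location) are answered from metadata alone, with no call to
   storage.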
   
   
   **Case 2**
   
   Rename table: when we rename a table we don't move files; it is a
   metadata-only operation. The base path remains the same while the table name
   is updated, so this case is not affected. For unique locations we can still
   look up the metadata and get all unique paths under the db reference.
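   
   A small sketch of why rename does not affect the check, under the same
   assumptions as above: `RenameKeepsLocation` and its methods are hypothetical,
   and the map is only a stand-in for catalog metadata. Rename re-keys the entry
   by name while the location stays the same, so a later create at that
   location is still rejected.
   
   ```java
   import java.util.HashMap;
   import java.util.Map;

   // Hypothetical sketch: a name -> location map standing in for catalog metadata.
   public class RenameKeepsLocation {
     private final Map<String, String> locationByName = new HashMap<>();

     // Create a table, rejecting a location that any existing table already uses.
     public void create(String name, String location) {
       if (locationByName.containsValue(location)) {
         throw new IllegalStateException("Table location already in use: " + location);
       }
       if (locationByName.putIfAbsent(name, location) != null) {
         throw new IllegalStateException("Table already exists: " + name);
       }
     }

     // Rename is metadata-only: the name changes, the location does not.
     public void rename(String from, String to) {
       if (locationByName.containsKey(to)) {
         throw new IllegalStateException("Table already exists: " + to);
       }
       String location = locationByName.remove(from);
       if (location == null) {
         throw new IllegalArgumentException("No such table: " + from);
       }
       locationByName.put(to, location);
     }

     // Returns the recorded location, or null if the table is unknown.
     public String locationOf(String name) {
       return locationByName.get(name);
     }
   }
   ```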
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

