szehon-ho commented on code in PR #6849:
URL: https://github.com/apache/iceberg/pull/6849#discussion_r1109224211


##########
docs/spark-procedures.md:
##########
@@ -462,16 +462,29 @@ will then treat these files as if they are part of the set of files  owned by Ic
 
 #### Usage
 
-| Argument Name | Required? | Type | Description |
-|---------------|-----------|------|-------------|
-| `table`       | ✔️  | string | Table which will have files added to|
-| `source_table`| ✔️  | string | Table where files should come from, paths are also possible in the form of \`file_format\`.\`path\` |
-| `partition_filter`  | ️   | map<string, string> | A map of partitions in the source table to import from |
+| Argument Name | Required? | Type | Description |
+|---------------|-----------|------|--------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `table`       | ✔️  | string | Table which will have files added to |
+| `source_table`| ✔️  | string | Table where files should come from, paths are also possible in the form of \`file_format\`.\`path\` |
+| `partition_filter`  | ️   | map<string, string> | A map of partitions in the source table to import from |
+| `check_duplicate_files`  | ️   | boolean | When true, will throw a exception if files added will result in duplicate (on by default, it's checking against files path in entries metadata table). |
 
 Warning : Schema is not validated, adding files with different schema to the Iceberg table will cause issues.
 
 Warning : Files added by this method can be physically deleted by Iceberg operations
 
+Warning : SQL delete followed by this add_files procedure immediately might not add files back due to deleted files are still tracked in manifest entry with status = 2: DELETED

Review Comment:
   Actually, if this happens only because of DELETED entries in the current
snapshot, is it caused by check_duplicate_files?  If so, maybe we should just
make a patch that filters out DELETED entries in that check.  I haven't looked
at the code yet.
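
To illustrate the idea, here is a minimal sketch of a duplicate check that skips DELETED entries. It is not the actual Iceberg implementation: `Entry` and `find_duplicates` are hypothetical names made up for this example, and the entries list stands in for rows of the `entries` metadata table. The only thing taken from Iceberg is the manifest entry status coding (0 = EXISTING, 1 = ADDED, 2 = DELETED).

```python
from dataclasses import dataclass

# Manifest entry status codes (0 = EXISTING, 1 = ADDED, 2 = DELETED).
EXISTING, ADDED, DELETED = 0, 1, 2


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for one row of the entries metadata table."""
    status: int
    file_path: str


def find_duplicates(entries, incoming_paths):
    """Return the incoming paths that are still live in the table.

    Entries with status DELETED are ignored, so a file removed by a
    SQL delete is not reported as a duplicate.
    """
    live = {e.file_path for e in entries if e.status != DELETED}
    return sorted(p for p in set(incoming_paths) if p in live)


# Example data (made up): a.parquet is live, b.parquet was deleted.
entries = [
    Entry(ADDED, "s3://bucket/t/a.parquet"),
    Entry(DELETED, "s3://bucket/t/b.parquet"),
]
dups = find_duplicates(
    entries,
    ["s3://bucket/t/a.parquet", "s3://bucket/t/b.parquet"],
)
```

With a filter like this, only `a.parquet` would be flagged, and re-adding `b.parquet` right after a delete would pass the check. Whether the real check can be changed this simply depends on how it queries the entries table.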



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
