j1wonpark commented on issue #4237:
URL: https://github.com/apache/amoro/issues/4237#issuecomment-4590966748

   Thanks for the detail. Before treating this as established, I'd like to pin 
down the actual trigger, because line 95 is only reached when a HEAD on a 
**slash-less** key returns 200.
   
   `directoryPath` comes from `TableFileUtil.getParent()` = `new 
Path(path).getParent().toString()`, which has **no trailing slash** (e.g. 
`.../op_time_hour=12`). For `S3FileIO`, `exists()` is `BaseS3File.exists()` → 
`headObject(key = uri().key())` on that exact key, returning `false` on 404. If 
it's 404 we return at `TableFileUtil:90` and never reach the throw.
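
As a rough sketch of that control flow (a toy model, not Amoro's actual code: `ParentKeySketch`, `parentOf`, `cleanParent`, and the `Outcome` names are hypothetical, and the example path is made up), the parent key carries no trailing slash and the throw site is only reachable when the existence check on that exact key succeeds:

```java
import java.util.function.Predicate;

// Hypothetical model of the flow described above; names are illustrative only.
final class ParentKeySketch {
    enum Outcome { RETURNED_EARLY, REACHED_THROW_SITE }

    // Approximates taking the parent of a path: drop the last segment.
    // The result never ends with a slash.
    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx < 0 ? "" : path.substring(0, idx);
    }

    // headReturns200 stands in for the exists() check on the exact key.
    static Outcome cleanParent(String filePath, Predicate<String> headReturns200) {
        String directoryPath = parentOf(filePath);
        if (!headReturns200.test(directoryPath)) {
            // 404 on the slash-less key: return before the throw is reached.
            return Outcome.RETURNED_EARLY;
        }
        return Outcome.REACHED_THROW_SITE;
    }

    public static void main(String[] args) {
        String file = "s3://bucket/tbl/op_time_hour=12/f.parquet"; // made-up path
        System.out.println(parentOf(file));
        System.out.println(cleanParent(file, key -> false));
    }
}
```

With a predicate that always answers 404 (`key -> false`), the sketch returns early; only a 200 on the slash-less key gets past the check.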
   
   That undercuts the three mechanisms on flat S3 (AWS/MinIO/Ceph):
   - Directory markers are slash-*suffixed* (`.../op_time_hour=12/`), so a HEAD 
on the slash-less key is a different key → 404. These stores also don't 
auto-create markers on PUT.
   - `HeadObject` is an exact-key op — a non-empty prefix or a LIST-style 
rewrite doesn't make it 200. Do you have a concrete gateway impl that does?
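
To make the two points above concrete, here is a minimal in-memory model of a flat key space (an assumption-laden toy, not an S3 client: `FlatStoreSketch` and its methods are hypothetical, and the keys are made up). HEAD is modelled as an exact-key lookup and LIST as a prefix scan, so a slash-suffixed marker and a non-empty prefix both leave the slash-less key at 404:

```java
import java.util.Set;
import java.util.TreeSet;

// Toy model of a flat object store: keys are opaque strings, no directories.
final class FlatStoreSketch {
    private final Set<String> keys = new TreeSet<>();

    void put(String key) {
        keys.add(key);
    }

    // HEAD: 200 only if this precise key was stored, otherwise 404.
    int head(String key) {
        return keys.contains(key) ? 200 : 404;
    }

    // LIST: counts every stored key that starts with the given prefix.
    long countWithPrefix(String prefix) {
        return keys.stream().filter(k -> k.startsWith(prefix)).count();
    }

    public static void main(String[] args) {
        FlatStoreSketch store = new FlatStoreSketch();
        store.put("tbl/op_time_hour=12/");          // slash-suffixed marker
        store.put("tbl/op_time_hour=12/f.parquet"); // data object under the prefix
        System.out.println(store.head("tbl/op_time_hour=12"));
        System.out.println(store.head("tbl/op_time_hour=12/"));
        System.out.println(store.countWithPrefix("tbl/op_time_hour=12"));
    }
}
```

A store that behaves differently (for example, one that answers 200 for the slash-less key) would not fit this model, which is why the bucket type matters.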
   
   I also ran the AWS "Create folder" recipe and couldn't reproduce: the 
console writes `.../<partition>/` (with slash) but the cleaner HEADs 
`.../<partition>` (no slash) → 404. So "100% on every object-store table" 
doesn't match what the code does on flat S3.
   
   A HEAD returning 200 on a slash-less directory key would imply a 
hierarchical-namespace bucket rather than plain flat S3 — could you confirm the 
bucket type/mode? To make it reproducible, please share, for one failing 
partition dir, the raw `aws s3api head-object --key <slash-less-key>` response 
(200 vs 404) and the exact `io-impl` + storage config.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
