mladjan-gadzic opened a new pull request, #5124:
URL: https://github.com/apache/ozone/pull/5124

   ## What changes were proposed in this pull request?
   Content length is not a reliable measure for determining whether a dir should 
be created. In this case Hadoop does not provide any content, but because the 
body is still chunk signed, its length is not 0 but 86. Because of this, a zero 
byte file is created instead of a dir.
   
   The proposed solution is to add a check for whether the body is empty. If it 
is, a dir is created; otherwise a file is created.
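
   The 86 bytes can be accounted for with a small, illustrative Python sketch. 
It is not the code from this patch (the gateway itself is Java). It assumes the 
standard aws-chunked framing, where an upload ends with a zero-size chunk of the 
form `0;chunk-signature=<64 hex chars>` followed by two CRLFs. The helper names, 
the all-zero placeholder signature, and the trailing-slash convention for dir 
keys (taken from the `a1/` keys in this description) are assumptions made for 
the example.

```python
def decode_aws_chunked(body: bytes) -> bytes:
    """Strip aws-chunked framing and return only the payload bytes."""
    payload = bytearray()
    pos = 0
    while pos < len(body):
        # Each chunk starts with "<hex size>;chunk-signature=<sig>\r\n".
        header_end = body.index(b"\r\n", pos)
        header = body[pos:header_end].decode("ascii")
        size = int(header.split(";", 1)[0], 16)
        data_start = header_end + 2
        payload += body[data_start:data_start + size]
        # Skip the chunk data and the CRLF that follows it.
        pos = data_start + size + 2
        if size == 0:
            break  # the zero-size chunk terminates the stream
    return bytes(payload)


def should_create_directory(key: str, body: bytes) -> bool:
    """Hypothetical check: decide on the decoded payload, not the raw length."""
    return key.endswith("/") and len(decode_aws_chunked(body)) == 0


# Placeholder signature (64 hex chars); a real one is computed by the client.
# Header (18 bytes) + signature (64 bytes) + two CRLFs (4 bytes) = 86 bytes.
EMPTY_SIGNED_BODY = b"0;chunk-signature=" + b"0" * 64 + b"\r\n\r\n"
```

   With this input the raw length is 86 while the decoded payload is empty, so a 
check on the raw length alone would treat the request as a file upload.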
   
   For Hadoop to be 100% compatible with the FSO layout, the following property 
must be set:
   ```
   <property>
     <name>fs.s3a.directory.marker.retention</name>
     <value>keep</value>
   </property>
   ```
   To demonstrate this, consider a simple case:
   
   1. create dir `a1`
   2. create dir `b1` under `a1`
   
   Dir `a1` is created without problems. Listing keys on the Ozone side shows 
the key `a1/` with a size of 0. Creating `b1` as `a1/b1/` also succeeds on the 
Ozone side, and listing keys then shows `a1/` and `a1/b1/`, both with a size 
of 0. The issue appears on the Hadoop side: the issued command hangs (never 
finishes) and logs warnings such as:
   ```
   2023-07-26 13:53:33,937 WARN impl.MultiObjectDeleteSupport: Bulk delete 
operation failed to delete all objects; failure count = 1
   2023-07-26 13:53:33,938 WARN impl.MultiObjectDeleteSupport: InternalError: 
a1/: Directory is not empty. Key:a1
   ```
   The reason is that Hadoop tries to delete the `a1/` key and leave only 
`a1/b1/`, which is fine for OBS, but FSO does not work like that. Because of 
this, the above-mentioned property is necessary for Hadoop to be compatible 
with FSO over S3A. After setting the property there are no warning logs; 
instead, something like this is logged:
   ```
   2023-07-26 22:30:43,950 INFO impl.DirectoryPolicyImpl: Directory markers 
will be kept
   ```
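
   The difference between the two layouts can be shown with a toy model. This 
is a simplified Python sketch, not Ozone or Hadoop code. The function names and 
the exception type are hypothetical, and the namespace is modeled as a plain 
set of key strings.

```python
class DirectoryNotEmpty(Exception):
    """Raised by the toy FSO model when a dir key still has children."""


def delete_key_obs(keys: set, key: str) -> None:
    """Flat (OBS-style) namespace: every key is independent of the others."""
    keys.discard(key)


def delete_key_fso(keys: set, key: str) -> None:
    """Hierarchical (FSO-style) namespace: a non-empty dir cannot be deleted."""
    has_children = any(k != key and k.startswith(key) for k in keys)
    if key.endswith("/") and has_children:
        raise DirectoryNotEmpty(f"{key}: Directory is not empty.")
    keys.discard(key)


# State after both dirs exist; the client then tries to remove the parent marker.
obs_keys = {"a1/", "a1/b1/"}
delete_key_obs(obs_keys, "a1/")  # succeeds, only "a1/b1/" remains

fso_keys = {"a1/", "a1/b1/"}
try:
    delete_key_fso(fso_keys, "a1/")
except DirectoryNotEmpty as err:
    print(err)  # the marker delete is rejected, both keys remain
```

   With `fs.s3a.directory.marker.retention` set to `keep`, the parent marker 
delete is not attempted, so the rejected call in the FSO case never happens.
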
   Full logs from Hadoop side:
   ```
   ubuntu@ip-172-31-30-88:~/hadoop-3.3.4$ bin/hdfs dfs -Dfs.s3a.access.key=1 
-Dfs.s3a.secret.key=1 -Dfs.s3a.endpoint=http://localhost:9878/ 
-Dfs.s3a.path.style.access=true -mkdir s3a://fso/c1/b1
   2023-07-26 22:29:55,358 INFO impl.MetricsConfig: Loaded properties from 
hadoop-metrics2.properties
   2023-07-26 22:29:55,507 INFO impl.MetricsSystemImpl: Scheduled Metric 
snapshot period at 10 second(s).
   2023-07-26 22:29:55,507 INFO impl.MetricsSystemImpl: s3a-file-system metrics 
system started
   2023-07-26 22:30:43,950 INFO impl.DirectoryPolicyImpl: Directory markers 
will be kept
   2023-07-26 22:30:49,531 INFO impl.MetricsSystemImpl: Stopping 
s3a-file-system metrics system...
   2023-07-26 22:30:49,531 INFO impl.MetricsSystemImpl: s3a-file-system metrics 
system stopped.
   2023-07-26 22:30:49,532 INFO impl.MetricsSystemImpl: s3a-file-system metrics 
system shutdown complete.
   ```
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-8437
   
   ## How was this patch tested?
   - manual test
   For steps to reproduce the issue, see the linked Apache Jira.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

