[
https://issues.apache.org/jira/browse/HDFS-16984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774316#comment-17774316
]
ASF GitHub Bot commented on HDFS-16984:
---------------------------------------
hanke580 opened a new pull request, #6175:
URL: https://github.com/apache/hadoop/pull/6175
### Description of PR
During the upgrade process, the access time of a directory is not serialized
when the FSImage is created, so the access timestamp is lost.
This PR adds an access time field to the INodeDirectory proto definition,
mirroring the one in INodeFile, and persists that field during the
snapshotting process.
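
To illustrate the failure mode, here is a minimal, self-contained sketch. It is a toy model and not the Hadoop code: `DirImageSketch`, `DirTimes`, `save`, and `load` are hypothetical names, and the byte layout is invented for illustration. It only shows how a field that is skipped at save time reads back as the default `0` after a reload.

```java
import java.nio.ByteBuffer;

/** Toy model only; none of these names exist in Hadoop. */
public class DirImageSketch {

    /** Hypothetical stand-in for a directory inode's timestamps (epoch millis). */
    public static final class DirTimes {
        public final long modificationTime;
        public final long accessTime;

        public DirTimes(long modificationTime, long accessTime) {
            this.modificationTime = modificationTime;
            this.accessTime = accessTime;
        }
    }

    /**
     * Serializes the timestamps. With includeAccessTime == false the access
     * time is simply not written, which models the behavior before the patch.
     */
    public static byte[] save(DirTimes dir, boolean includeAccessTime) {
        ByteBuffer buf = ByteBuffer.allocate(includeAccessTime ? 17 : 9);
        buf.putLong(dir.modificationTime);
        buf.put((byte) (includeAccessTime ? 1 : 0)); // presence flag
        if (includeAccessTime) {
            buf.putLong(dir.accessTime);
        }
        return buf.array();
    }

    /** Deserializes the timestamps; an absent access time falls back to 0. */
    public static DirTimes load(byte[] image) {
        ByteBuffer buf = ByteBuffer.wrap(image);
        long modificationTime = buf.getLong();
        long accessTime = (buf.get() == 1) ? buf.getLong() : 0L;
        return new DirTimes(modificationTime, accessTime);
    }

    public static void main(String[] args) {
        // Sample value: 2023-04-17 16:15 UTC in epoch milliseconds.
        DirTimes original = new DirTimes(1681748100000L, 1681748100000L);

        // Field skipped at save time: the access time is lost on reload.
        System.out.println(load(save(original, false)).accessTime); // 0

        // Field persisted: the access time survives the round trip.
        System.out.println(load(save(original, true)).accessTime);  // 1681748100000
    }
}
```

In this sketch the modification time survives in both cases; only the field that was never written comes back as zero.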
### How was this patch tested?
(1) Tested with a system snapshot followed by a restart.
(2) Tested an upgrade from an old-version system without the patch to a
system with the patch.
### For code changes:
- [ ] Does the title of this PR start with the corresponding JIRA issue id
(e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the
endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`,
`NOTICE-binary` files?
> Directory timestamp lost during the upgrade process: 2.10.2=>3.3.6
> ------------------------------------------------------------------
>
> Key: HDFS-16984
> URL: https://issues.apache.org/jira/browse/HDFS-16984
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 2.10.2, 3.3.6
> Reporter: Ke Han
> Priority: Major
> Attachments: GUBIkxOc.tar.gz
>
>
> h1. Symptoms
> The access timestamp of a directory is lost after upgrading an HDFS
> cluster from 2.10.2 to 3.3.6.
> h1. Reproduce
> Start a four-node HDFS cluster running version 2.10.2.
> Execute the following commands. (The client is started on the NN; we have
> minimized the command sequence needed to reproduce the issue.)
> {code:java}
> bin/hdfs dfs -mkdir /GUBIkxOc
> bin/hdfs dfs -put -f -p -d /tmp/upfuzz/hdfs/GUBIkxOc/bQfxf /GUBIkxOc/
> bin/hdfs dfs -mkdir /GUBIkxOc/sKbTRjvS{code}
> Perform a read in the old version:
> {code:java}
> bin/hdfs dfs -ls -t -r -u /GUBIkxOc/
> Found 2 items
> drwxr-xr-x - root supergroup 0 1970-01-01 00:00 /GUBIkxOc/sKbTRjvS
> drwxr-xr-x - 20001 998 0 2023-04-17 16:15 /GUBIkxOc/bQfxf{code}
> Then perform a full-stop upgrade of the entire cluster to 3.3.6, following
> the upgrade procedure on the website: (1) enter safe mode, (2) run rolling
> upgrade prepare, (3) exit safe mode. Once all nodes have started up in the
> new version, perform the same read:
> {code:java}
> Found 2 items
> drwxr-xr-x - 20001 998 0 1970-01-01 00:00 /GUBIkxOc/bQfxf
> drwxr-xr-x - root supergroup 0 1970-01-01 00:00 /GUBIkxOc/sKbTRjvS{code}
> The access timestamp of directory /GUBIkxOc/bQfxf is lost: it changes
> from 2023-04-17 16:15 to 1970-01-01 00:00.
> PS: The upgrade prepare step must happen after the commands above have
> been executed.
> I have also attached the required file: +/tmp/upfuzz/hdfs/GUBIkxOc/bQfxf+.
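
The `1970-01-01 00:00` shown after the upgrade is what an access time of zero looks like once formatted. The sketch below demonstrates this with plain `java.time`; it is not the code path HDFS uses for `ls`. The class name `EpochZeroSketch` and the choice of UTC are assumptions made for illustration, and the `yyyy-MM-dd HH:mm` pattern is chosen only to match the listing above.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Illustrative only; not part of Hadoop. */
public class EpochZeroSketch {

    // Pattern chosen to match the listing above; UTC is an assumption.
    private static final DateTimeFormatter LISTING_STYLE =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm").withZone(ZoneOffset.UTC);

    /** Formats an epoch-millisecond timestamp in the style of the listing. */
    public static String format(long epochMillis) {
        return LISTING_STYLE.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        // A zero (unset) timestamp renders as the Unix epoch.
        System.out.println(format(0L));             // 1970-01-01 00:00

        // A sample non-zero timestamp for comparison.
        System.out.println(format(1681748100000L)); // 2023-04-17 16:15
    }
}
```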