ArafatKhan2198 commented on PR #6950:
URL: https://github.com/apache/ozone/pull/6950#issuecomment-2236868663
### How to Replicate the ClassCastException ➖
To understand the unit test, it helps to see how the `ClassCastException`
occurs and which bug in Recon led to it. When an event from OM reaches Recon,
`OMDBUpdatesHandler` packages it into an event object, `OMDBUpdateEvent`, which
has two essential fields: `updatedValue` and `oldValue`.
**`oldValue`**: The previous state of a database entry before an update
(UPDATE) or delete (DELETE) operation. It represents what was stored in the
database before the current operation.
**`updatedValue`**: The new state of a database entry after an update
(UPDATE) or create (PUT) operation. It represents the current state being
written to the database.
The `updatedValue` is fetched from the OM side, while the `oldValue` is fetched
from a map inside `OMDBUpdatesHandler` called `omdbLatestUpdateEvents`. Before
this change, `omdbLatestUpdateEvents` was a simple
`Map<Object, OMDBUpdateEvent>` that stored the latest database update events
without distinguishing between tables. When different tables shared the same
key structure, their entries could overwrite each other, causing issues such as
the `ClassCastException` during event processing in Recon.
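The collision can be sketched in a few lines. This is only an illustration: `Event` and `FlatMapCollision` are hypothetical stand-ins, strings replace the real value types, and none of it is the actual `OMDBUpdatesHandler` code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for OMDBUpdateEvent; only the fields needed here.
final class Event {
  final String table;
  final Object updatedValue;

  Event(String table, Object updatedValue) {
    this.table = table;
    this.updatedValue = updatedValue;
  }
}

final class FlatMapCollision {
  // Same shape as the old map: keyed by the RocksDB key only, not by table.
  static final Map<Object, Event> latest = new HashMap<>();

  // Stores the newest event for a key and returns whatever was there before.
  static Event record(String table, String key, Object value) {
    return latest.put(key, new Event(table, value));
  }

  public static void main(String[] args) {
    String key = "/volumeId/bucketId/parentId/dir6";
    record("fileTable", key, "KeyInfo(dir6)");
    Event previous = record("directoryTable", key, "DirInfo(dir6)");
    // The directory event finds the file table's entry as its previous state.
    System.out.println(previous.table); // prints "fileTable"
  }
}
```

Because the lookup ignores the table name, an event for one table can pick up a value that belongs to another.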
For example, consider the `FileTable` and the `DirectoryTable`. Both tables
have the same key structure in RocksDB:
- `directoryTable: /volumeId/bucketId/parentId/dirName -> DirInfo`
- `fileTable: /volumeId/bucketId/parentId/fileName -> KeyInfo`
Here, using the same name for a file and a directory under the same parent
produces identical keys. If we execute the following commands in order, we will
encounter a `ClassCastException` because the file name and directory name are
the same:
```
ozone sh key put s3v/fso-bucket/dir6 NOTICE.txt
ozone sh key delete s3v/fso-bucket/dir6
ozone fs -mkdir -p ofs://om/s3v/fso-bucket/dir6
```
### Breakdown of the Error:
1. **First Command**: The first command is recorded as a PUT operation on the
`FileTable` in an `OMDBUpdateEvent` and stored in the `omdbLatestUpdateEvents`
map as:
- Key: `/volumeId/bucketId/parentId/dir6`
- Value: `OMDBUpdateEvent` with `updatedValue` as the `OmKeyInfo` for the
file named `dir6` and `oldValue` as `null`.
2. **Second Command**: The second command is recorded as a DELETE operation
on the `FileTable` in an `OMDBUpdateEvent`. The handler first checks
`omdbLatestUpdateEvents` for a previous entry with this key (`dir6`). It finds
the previous PUT operation, so the record in the map changes to a DELETE
operation with `updatedValue` as `OmKeyInfo` (the value to be deleted) and
`oldValue` as `null`.
3. **Third Command**: Creating a directory named `dir6` results in a new
`OMDBUpdateEvent` for the `DirectoryTable`. Its `updatedValue` is an
`OmDirectoryInfo` object. However, when the handler checks
`omdbLatestUpdateEvents`, it finds the entry left by the previous DELETE
operation and takes that event's `updatedValue`, an `OmKeyInfo`, as the old
value. This mismatch (`updatedValue` as `OmDirectoryInfo` and `oldValue` as
`OmKeyInfo`) leads to a `ClassCastException`.
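The three steps above can be replayed in a small, self-contained sketch. It follows the breakdown as described here rather than the real handler: `KeyInfo`, `DirInfo`, `UpdateEvent` and `handle` are hypothetical simplifications of `OmKeyInfo`, `OmDirectoryInfo`, `OMDBUpdateEvent` and the handler logic.

```java
import java.util.HashMap;
import java.util.Map;

final class ClassCastRepro {
  static final class KeyInfo { }  // stand-in for OmKeyInfo
  static final class DirInfo { }  // stand-in for OmDirectoryInfo

  // Simplified event, typed by the value class of the table it belongs to.
  static final class UpdateEvent<V> {
    final String action;
    final V updatedValue;
    final V oldValue;

    UpdateEvent(String action, V updatedValue, V oldValue) {
      this.action = action;
      this.updatedValue = updatedValue;
      this.oldValue = oldValue;
    }
  }

  // Same shape as the old map: keyed by the RocksDB key alone.
  static final Map<Object, UpdateEvent<?>> latest = new HashMap<>();

  @SuppressWarnings("unchecked")
  static <V> UpdateEvent<V> handle(String action, String key, V value) {
    // Unchecked cast: type erasure lets an event from another table through.
    UpdateEvent<V> previous = (UpdateEvent<V>) latest.get(key);
    V oldValue = null;
    if (previous != null && !"DELETE".equals(action)) {
      oldValue = previous.updatedValue;
    }
    UpdateEvent<V> event = new UpdateEvent<>(action, value, oldValue);
    latest.put(key, event);
    return event;
  }

  // Replays the three commands; returns true if the cast failure occurs.
  static boolean reproduces() {
    latest.clear();
    String key = "/volumeId/bucketId/parentId/dir6";
    KeyInfo file = new KeyInfo();
    handle("PUT", key, file);                                        // key put
    handle("DELETE", key, file);                                     // key delete
    UpdateEvent<DirInfo> mkdir = handle("PUT", key, new DirInfo());  // mkdir
    try {
      DirInfo old = mkdir.oldValue;  // compiler-inserted cast to DirInfo
      return old == null;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(reproduces());
  }
}
```

Note that the failure does not surface inside `handle`, where generics are erased, but later, when a consumer reads `oldValue` as the type it expects for its table.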
**To prevent such issues, we implemented a safeguard (HDDS-8310) that checks
for value mismatches and ignores such events. However, ignoring these events is
not ideal: Recon would never learn about the directory `dir6`, leading to data
inconsistency. To fix this, we need to implement a map that distinguishes
between different tables.**
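One possible shape for such a map is a two-level lookup keyed first by table name. This is a sketch under assumptions, not necessarily the structure this PR adopts; `TableAwareLatestEvents` and `recordAndGetPrevious` are hypothetical names, and plain `Object` values stand in for the real events.

```java
import java.util.HashMap;
import java.util.Map;

final class TableAwareLatestEvents {
  // Outer key: table name. Inner key: the RocksDB key within that table.
  private final Map<String, Map<Object, Object>> latestByTable = new HashMap<>();

  // Stores the newest value for (table, key) and returns the previous value
  // recorded for the same table, or null if that table has no entry yet.
  Object recordAndGetPrevious(String table, Object key, Object updatedValue) {
    return latestByTable
        .computeIfAbsent(table, t -> new HashMap<>())
        .put(key, updatedValue);
  }

  public static void main(String[] args) {
    TableAwareLatestEvents events = new TableAwareLatestEvents();
    String key = "/volumeId/bucketId/parentId/dir6";
    events.recordAndGetPrevious("fileTable", key, "KeyInfo(dir6)");
    // The directory event no longer sees the file table's entry.
    System.out.println(events.recordAndGetPrevious("directoryTable", key, "DirInfo(dir6)"));
  }
}
```

With the table name as part of the lookup, the `mkdir` event for `dir6` gets a `null` old value instead of the file's value, so it would no longer need to be dropped.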
cc: @sadanand48