[
https://issues.apache.org/jira/browse/HDDS-8173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703266#comment-17703266
]
Ethan Rose commented on HDDS-8173:
----------------------------------
I took a quick look and it looks like the iterator that checks for entries is
not seeked to the prefix before the check, so it thinks there are no entries
for the prefix. Just creating an {{RDBStoreIterator}} and calling {{hasNext}}
appears to leave the iterator positioned at the beginning of the table, since
{{seekToFirst}} has not been called. So invoking
{{RDBTable#deleteBatchWithPrefix}} never actually enters the while loop to
delete anything: the iterator is pointing at the first key in the table, which
does not begin with the prefix and therefore fails the {{hasNext}} check. This
seems easy enough to unit test; I'm not sure why existing tests didn't catch
this.
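
For reference, here is a minimal sketch of what prefix-scoped deletion needs to
look like, written against the plain RocksDB Java API (rocksdbjni) rather than
the actual {{RDBTable}}/{{RDBStoreIterator}} code; class and method names below
are illustrative only. The key point is the seek to the prefix before the loop:

{code:java}
import java.util.Arrays;

import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.RocksIterator;
import org.rocksdb.WriteBatch;
import org.rocksdb.WriteOptions;

public final class PrefixDeleteSketch {

  static {
    RocksDB.loadLibrary();
  }

  /**
   * Deletes every key that starts with {@code prefix}. Without the
   * {@code seek(prefix)} call the iterator stays positioned at the first key
   * of the table, the startsWith check fails immediately and nothing is
   * deleted -- which matches the behaviour described above.
   */
  static void deleteWithPrefix(RocksDB db, byte[] prefix) throws RocksDBException {
    try (RocksIterator it = db.newIterator();
         WriteBatch batch = new WriteBatch();
         WriteOptions writeOptions = new WriteOptions()) {
      it.seek(prefix);                               // position at first key >= prefix
      while (it.isValid() && startsWith(it.key(), prefix)) {
        batch.delete(it.key());                      // stage delete for this key
        it.next();
      }
      db.write(writeOptions, batch);                 // apply all deletes atomically
    }
  }

  private static boolean startsWith(byte[] key, byte[] prefix) {
    return key.length >= prefix.length
        && Arrays.equals(Arrays.copyOf(key, prefix.length), prefix);
  }
}
{code}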
> SchemaV3 RocksDB entries are not removed after container delete
> ---------------------------------------------------------------
>
> Key: HDDS-8173
> URL: https://issues.apache.org/jira/browse/HDDS-8173
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Christos Bisias
> Assignee: Hemant Kumar
> Priority: Critical
> Attachments: rocksDBContainerDelete.diff
>
>
> After deleting a container, all RocksDB entries for that container should be
> removed from RocksDB, but the metadata and block data are still intact in the
> DB.
>
> The problem appears to stem from the call to
> {code:java}
> BlockUtils.removeContainerFromDB(containerData, conf){code}
> which does not clear the entries in the datanode SchemaV3 RocksDB for the
> given container ID.
>
> We can reproduce this issue on a docker cluster as follows:
> * start a docker cluster with 5 datanodes
> * put a key under a bucket to create a container
> * close the container
> * decommission 2 of the datanodes that hold replicas of the container
> * recommission the datanodes
> * container should be over-replicated
> * ReplicationManager should issue a container delete for 2 datanodes
> * check one of the two datanodes
> * container should be deleted
> * check RocksDB block data entries for the container
>
> On {color:#00875a}master{color}, from the Ozone root:
>
> {code:java}
> ❯ cd hadoop-ozone/dist/target/ozone-1.4.0-SNAPSHOT/compose/ozone {code}
> edit {color:#00875a}docker-config{color} and add the two configs below,
> which are needed for decommission
>
>
> {code:java}
> OZONE-SITE.XML_ozone.scm.nodes.scmservice=scm
> OZONE-SITE.XML_ozone.scm.address.scmservice.scm=scm {code}
> start the Ozone cluster with 5 datanodes, connect to the SCM and create a key
>
> {code:java}
> ❯ docker-compose up --scale datanode=5 -d
> ❯ docker exec -it ozone_scm_1 bash
> bash-4.2$ ozone sh volume create /vol1
> bash-4.2$ ozone sh bucket create /vol1/bucket1
> bash-4.2$ ozone sh key put /vol1/bucket1/key1 /etc/hosts{code}
> close the container and check which datanodes it's on
> {code:java}
> bash-4.2$ ozone admin container close 1
> bash-4.2$ ozone admin container info 1
> ...
> {code}
> check the SCM roles to get the SCM IP and port
>
> {code:java}
> bash-4.2$ ozone admin scm roles
> 99960cfeda73:9894:LEADER:62393063-a1e0-4d5e-bcf5-938cf09a9511:172.25.0.4
> {code}
>
> check the datanode list to get the IP and hostname of the 2 datanodes the
> container is on
>
> {code:java}
> bash-4.2$ ozone admin datanode list
> ... {code}
> decommission both datanodes
>
>
> {code:java}
> bash-4.2$ ozone admin datanode decommission -id=scmservice
> --scm=172.25.0.4:9894 <datanodeIP>/<datanodeHostname> {code}
> wait until both datanodes are decommissioned; at that point, checking the
> container's info shows that it also has replicas placed on other datanodes
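>
> For example, re-running the earlier command (the exact replica listing is
> omitted here):
>
> {code:java}
> bash-4.2$ ozone admin container info 1 {code}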
>
>
> recommission both datanodes
>
> {code:java}
> bash-4.2$ ozone admin datanode recommission -id=scmservice
> --scm=172.25.0.4:9894 <datanodeIP>/<datanodeHostname> {code}
> After a few minutes, the SCM logs show
>
> {code:java}
> 2023-03-15 18:24:53,810 [ReplicationMonitor] INFO
> replication.LegacyReplicationManager: Container #1 is over replicated.
> Expected replica count is 3, but found 5.
> 2023-03-15 18:24:53,810 [ReplicationMonitor] INFO
> replication.LegacyReplicationManager: Sending delete container command for
> container #1 to datanode
> d6461c13-c2fa-4437-94f5-f75010a49069(ozone_datanode_2.ozone_default/172.25.0.11)
> 2023-03-15 18:24:53,811 [ReplicationMonitor] INFO
> replication.LegacyReplicationManager: Sending delete container command for
> container #1 to datanode
> 6b077eea-543b-47ca-abf2-45f26c106903(ozone_datanode_5.ozone_default/172.25.0.6)
> {code}
> connect to one of the datanodes on which the container is being deleted
>
> check that the container is deleted
> {code:java}
> bash-4.2$ ls
> /data/hdds/hdds/CID-ca9fef0f-9af2-4dbf-af02-388d624c2f10/current/containerDir0/
> bash-4.2$ {code}
> check RocksDB
> {code:java}
> bash-4.2$ ozone debug ldb --db
> /data/hdds/hdds/CID-ca9fef0f-9af2-4dbf-af02-388d624c2f10/DS-a8a72696-e4cf-42a6-a66c-04f0b614fde4/container.db
> scan --column-family=block_data {code}
> Block data for the deleted container are still there
> {code:java}
> "blockID": {
> "containerBlockID": {
> "containerID": 1,
> "localID": 111677748019200001 {code}
> {color:#00875a}metadata{color} and {color:#00875a}block_data{color} still
> have the container's entries, while {color:#00875a}deleted_blocks{color} and
> {color:#00875a}delete_txns{color} are empty.
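>
> The other column families can be inspected the same way; for example, for
> {color:#00875a}metadata{color} (same DB path as above, only the column family
> changes):
> {code:java}
> bash-4.2$ ozone debug ldb --db
> /data/hdds/hdds/CID-ca9fef0f-9af2-4dbf-af02-388d624c2f10/DS-a8a72696-e4cf-42a6-a66c-04f0b614fde4/container.db
> scan --column-family=metadata {code}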
>
> I've also attached a diff with a test, added under
> {color:#00875a}TestContainerPersistence{color}, that verifies the above issue.