[
https://issues.apache.org/jira/browse/HDDS-7454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17644063#comment-17644063
]
István Fajth commented on HDDS-7454:
------------------------------------
As we discussed, the problem is that a rogue client can currently write blocks
to DataNodes other than those in the Pipeline information that Ozone Manager
provides to the client. This is true in both secure and non-secure
environments. As Neil mentioned, this might compromise a container when SCM
checks the replicas, determines which containers are over-replicated, and
decides which excess replicas to delete. If a rogue client writes a container
to 3 nodes (even via the STANDALONE replication type) and properly syncs these
writes, the bcsid associated with that container might go above the one in the
good containers, and with that take precedence and potentially cause the old,
valid data to be removed.
As this can happen in a non-secure environment, I strongly believe we should
not touch the tokens: that does not solve the problem at all, as tokens are
present only in a secured environment.
I think the solution is within SCM. If a DN does not have the container yet
(it does not have a valid replica of the container), then an ICR (incremental
container report) is triggered at container creation. While that ICR is
processed, the container should be marked as an invalid replica, and SCM
should issue a delete container command to the DataNode that reported the
invalid container. (We should be able to determine that the container is
invalid during ICR processing, as SCM should know which container belongs to
which Pipeline, and if the DN is not part of the Pipeline, it should not
report creation of a container with that container ID.) If possible, Ozone
Manager should also refuse the write and the metadata update, based on
information provided by SCM (either by caching the in-flight write Pipelines
and comparing them with the Pipelines reported by the client at the end of the
write, or by directly checking the write location with SCM to validate the
write).
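
To make the ICR-side check above concrete, here is a minimal sketch in Java. It
is not Ozone code: the class IcrPipelineCheck, its methods, and the plain
String/long identifiers are all hypothetical stand-ins for whatever SCM keeps
internally. It also assumes that only reports of newly created containers go
through this path, so legitimate replication to nodes outside the original
Pipeline is not modeled here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these names are actual Ozone classes. */
final class IcrPipelineCheck {
    /** What SCM would do with a newly reported container replica. */
    enum Action { ACCEPT, DELETE_CONTAINER }

    // containerId -> pipelineId, recorded when the container was allocated
    private final Map<Long, String> containerToPipeline = new HashMap<>();
    // pipelineId -> IDs of the DataNodes that are members of that pipeline
    private final Map<String, Set<String>> pipelineMembers = new HashMap<>();

    void registerPipeline(String pipelineId, Set<String> datanodes) {
        pipelineMembers.put(pipelineId, Set.copyOf(datanodes));
    }

    void registerContainer(long containerId, String pipelineId) {
        containerToPipeline.put(containerId, pipelineId);
    }

    /** True only if the reporting DN is a member of the container's pipeline. */
    boolean isValidCreation(long containerId, String reportingDatanode) {
        String pipelineId = containerToPipeline.get(containerId);
        if (pipelineId == null) {
            return false; // SCM never allocated this container
        }
        Set<String> members = pipelineMembers.get(pipelineId);
        return members != null && members.contains(reportingDatanode);
    }

    /** Decision taken while processing a container-creation report. */
    Action onContainerCreated(long containerId, String reportingDatanode) {
        return isValidCreation(containerId, reportingDatanode)
            ? Action.ACCEPT
            : Action.DELETE_CONTAINER;
    }
}
```

With a pipeline of dn1, dn2 and dn3 registered for container 1, a creation
report from dn4 would map to DELETE_CONTAINER, while a report from dn1 would
be accepted.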
I believe we should not include this information in the tokens, as we gain
nothing from that once proper measures to deal with such rogue clients are
implemented. Here is why: if SCM instructs the DN within 2 heartbeats to
remove the rogue container, then rogue clients will have 2 HB of time (1 min
by default if no container creation happens between the 2 HBs, but it does
happen, so less than 1 min) to occupy cluster space with garbage data. In
order to do that, however, they need access permission in the first place, and
if they have access permissions, they can write garbage to valid locations
anyway. So the only thing we need to prevent is messing up the container space
and the OM metadata, and that is done with the proposed check in ICR
processing and with the check when the client commits the write to OM.
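
The commit-time check on the OM side could look roughly like the sketch below,
using the "cache the in-flight write Pipelines" variant. Again this is only an
illustration under assumptions: CommitLocationCheck, onAllocate and
validateCommit are made-up names, and blocks and DataNodes are represented by
plain strings rather than real Ozone types.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; none of these names are actual Ozone classes. */
final class CommitLocationCheck {
    // blockId -> DataNodes of the pipeline handed out when the block was allocated
    private final Map<String, Set<String>> inFlight = new ConcurrentHashMap<>();

    /** Remember the write location OM gave to the client for this block. */
    void onAllocate(String blockId, Set<String> allocatedDatanodes) {
        inFlight.put(blockId, Set.copyOf(allocatedDatanodes));
    }

    /**
     * Accept the commit only if the client reports exactly the DataNodes
     * that were allocated; otherwise the write and metadata update are refused.
     */
    boolean validateCommit(String blockId, Set<String> reportedDatanodes) {
        Set<String> allocated = inFlight.get(blockId);
        if (allocated == null || !allocated.equals(reportedDatanodes)) {
            return false; // unknown block or mismatching write location
        }
        inFlight.remove(blockId); // the write is no longer in flight
        return true;
    }
}
```

The cache entry is dropped only on a successful commit, so in this sketch a
client that reported a wrong location can still commit later with the right
one. Whether that is desirable is a design choice left open here.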
> OM to DN token verification should include Pipeline
> ---------------------------------------------------
>
> Key: HDDS-7454
> URL: https://issues.apache.org/jira/browse/HDDS-7454
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Sumit Agrawal
> Assignee: Sumit Agrawal
> Priority: Minor
> Labels: pull-request-available
>
> The client requests block information to be used to write data. In this
> process:
> - OM calls allocateBlock on SCM; SCM provides the block information,
> pipeline, and related DNs
> - OM also creates a token (when security is enabled) with the block
> information
> - The client passes this information to the DN
> - The DN verifies the token for the block information and starts writing
> the block
> Here, the pipeline information for which the request was created is not
> verified. For security, this also needs to be verified.
> The Pipeline-to-DN mapping is shared with the DNs via the Pipeline command
> from SCM to DNs, CreatePipelineCommand.
> Impact (if the client is not trustworthy):
> 1. The client can forward the request with the token to a different DN with
> different pipeline information.
> Since that DN does not have information about the SCM mapping of container
> to pipeline, it can start operating on the request.
> Including the pipeline in token verification ensures that:
> - block writes are done with the correct pipeline (DNs)