[ https://issues.apache.org/jira/browse/HDDS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912973#comment-16912973 ]
Anu Engineer commented on HDDS-1094:
------------------------------------
Perhaps I am mistaken. I tend to break down performance/correctness into three
sections.
1. OM
2. SCM
3. Datanodes
The first two I often think of as the metadata layer and the last as the data
layer.
So, if you have a flag in the client to skip the data path, that is very useful
for us to create load and exercise only the metadata path. A load generator
for the data path is also something I think is useful.
Please correct me if I am wrong: this is the case where we exercise both
paths but do not do the final I/O, something like a redirection to null. I am
not sure how useful it is to read from the network and then discard that data,
unless we are certain that the slowest elements in the system are the disks. I
often imagine a datanode full of disks, thus providing lots of data I/O
bandwidth.
So I wonder if this case is easier to test with support where the client
itself drops the data, rather than the datanode.
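
To make the client-side variant concrete, here is a minimal sketch, assuming a
plain Java InputStream stands in for a key read. DiscardingSink and drain are
hypothetical names for illustration only, not Ozone client APIs. The client
pulls every byte off the stream and drops it, so the network path is exercised
but nothing is stored or processed afterwards.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical sink: counts bytes and drops them, like a redirect to null. */
final class DiscardingSink extends OutputStream {
    private long bytesDiscarded;

    @Override
    public void write(int b) {
        bytesDiscarded++; // single byte dropped
    }

    @Override
    public void write(byte[] buf, int off, int len) {
        bytesDiscarded += len; // whole buffer dropped, only the count is kept
    }

    long bytesDiscarded() {
        return bytesDiscarded;
    }

    /** Reads the stream to its end, handing every buffer to the sink. */
    static long drain(InputStream in, DiscardingSink sink) {
        byte[] buf = new byte[64 * 1024];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.bytesDiscarded();
    }

    public static void main(String[] args) {
        // Assumption: a 1 MiB in-memory stream stands in for data read from a datanode.
        InputStream fakeKey = new ByteArrayInputStream(new byte[1 << 20]);
        long dropped = drain(fakeKey, new DiscardingSink());
        System.out.println("bytes read and discarded: " + dropped);
    }
}
```

The byte count is kept so a load generator could still report throughput even
though the payload itself is thrown away.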
> Performance test infrastructure : skip writing user data on Datanode
> --------------------------------------------------------------------
>
> Key: HDDS-1094
> URL: https://issues.apache.org/jira/browse/HDDS-1094
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Goal:
> Make Ozone chunk Read/Write operations CPU/network bound for specially
> constructed performance micro benchmarks.
> Remove disk bandwidth and latency constraints: running the Ozone data path
> against extremely low-latency, high-throughput storage will expose performance
> bottlenecks in the flow. But low-latency storage (NVMe flash drives, storage
> class memory, etc.) is expensive and availability is limited. Is there a
> workaround which achieves similar running conditions for the software without
> actually having the low-latency storage? At least for specially constructed
> datasets, for example zero-filled blocks (*not* zero-length blocks).
> Required characteristics of the solution:
> No changes in the Ozone client, OM and SCM. Changes limited to the Datanode,
> with a minimal footprint in datanode code.
> Possible High level Approach:
> The ChunkManager and ChunkUtils can enable writeChunk for zero-filled chunks
> to be dropped without actually writing to the local filesystem. Similarly,
> readChunk can construct a zero-filled buffer without reading from the local
> filesystem whenever it detects a zero-filled chunk. Specifics of how to
> detect and record a zero-filled chunk can be discussed on this jira, along
> with how to control this behaviour and make it available only for internal
> testing.
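
The approach quoted above could look roughly like the following sketch. It is
an illustration under stated assumptions, not the actual ChunkManager or
ChunkUtils code: ZeroChunkSketch, its in-memory maps (standing in for the local
filesystem and for whatever record of zero-filled chunks is chosen), and the
dropZeroChunks test-only switch are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the datanode chunk layer; not Ozone's ChunkManager. */
final class ZeroChunkSketch {
    // Assumed test-only switch; how to control this is left open in the issue.
    private final boolean dropZeroChunks;
    // chunk name -> length, for chunks dropped because every byte was zero
    private final Map<String, Integer> zeroChunks = new HashMap<>();
    // chunk name -> bytes; stands in for the local filesystem
    private final Map<String, byte[]> disk = new HashMap<>();

    ZeroChunkSketch(boolean dropZeroChunks) {
        this.dropZeroChunks = dropZeroChunks;
    }

    /** True if every remaining byte is zero; the caller's position is untouched. */
    static boolean isZeroFilled(ByteBuffer data) {
        ByteBuffer view = data.duplicate();
        while (view.hasRemaining()) {
            if (view.get() != 0) {
                return false;
            }
        }
        return true;
    }

    void writeChunk(String name, ByteBuffer data) {
        // Zero-length chunks are not treated as zero-filled.
        if (dropZeroChunks && data.hasRemaining() && isZeroFilled(data)) {
            zeroChunks.put(name, data.remaining()); // record the length only
            disk.remove(name);
            return;
        }
        byte[] copy = new byte[data.remaining()];
        data.duplicate().get(copy);
        disk.put(name, copy);
        zeroChunks.remove(name);
    }

    ByteBuffer readChunk(String name) {
        Integer len = zeroChunks.get(name);
        if (len != null) {
            return ByteBuffer.allocate(len); // allocate() returns a zeroed buffer
        }
        byte[] stored = disk.get(name);
        if (stored == null) {
            throw new IllegalArgumentException("unknown chunk: " + name);
        }
        return ByteBuffer.wrap(stored.clone());
    }

    boolean storedOnDisk(String name) {
        return disk.containsKey(name);
    }
}
```

Note that the write path still scans every byte to detect a zero-filled chunk,
so the CPU cost of detection stays in the measured path; only the disk I/O is
removed.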