[
https://issues.apache.org/jira/browse/HDDS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arpit Agarwal updated HDDS-1094:
--------------------------------
Description:
Goal:
It can be useful to exercise the IO and control paths in Ozone against
simulated large datasets without having huge disk capacity at hand. For
example, this will allow us to exercise paths such as container reports and
incremental container reports without needing huge cluster capacity. The
[SimulatedFSDataset|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java]
does something similar in HDFS. It has been an invaluable tool for simulating
large data stores.
was:
Goal:
Make Ozone chunk read/write operations CPU/network bound for specially
constructed performance micro-benchmarks.
Remove disk bandwidth and latency constraints: running the Ozone data path
against extremely low-latency, high-throughput storage will expose performance
bottlenecks in the flow. But low-latency storage (NVMe flash drives,
storage-class memory, etc.) is expensive and its availability is limited. Is
there a workaround that achieves similar running conditions for the software
without actually having the low-latency storage, at least for specially
constructed datasets such as zero-filled blocks (*not* zero-length blocks)?
Required characteristics of the solution:
No changes in the Ozone client, OM, or SCM. Changes are limited to the
Datanode, with a minimal footprint in the Datanode code.
Possible high-level approach:
The ChunkManager and ChunkUtils can enable writeChunk to drop zero-filled
chunks without actually writing them to the local filesystem. Similarly,
readChunk can construct a zero-filled buffer without reading from the local
filesystem whenever it detects a zero-filled chunk. The specifics of how to
detect and record a zero-filled chunk can be discussed on this jira, along with
how to control this behaviour and make it available only for internal testing.
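A minimal sketch of the zero-filled chunk idea described above, in Java. This
is an illustration only, not the Ozone ChunkManager or ChunkUtils API: the
class ZeroFilledChunkStore, its method signatures, and the in-memory map that
stands in for the local filesystem are all hypothetical. It assumes a chunk is
detected as zero-filled by scanning its bytes and is recorded by length alone.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; names and signatures are NOT the real Ozone API. */
final class ZeroFilledChunkStore {
  // Chunk name -> length, for chunks whose payload was all zeros and was dropped.
  private final Map<String, Integer> zeroChunks = new HashMap<>();
  // Stand-in for the local filesystem (assumption, for illustration only).
  private final Map<String, byte[]> disk = new HashMap<>();

  /** Returns true if every remaining byte is zero; the caller's position is untouched. */
  static boolean isZeroFilled(ByteBuffer data) {
    ByteBuffer view = data.duplicate();
    while (view.hasRemaining()) {
      if (view.get() != 0) {
        return false;
      }
    }
    return true;
  }

  /** Drops zero-filled payloads and records only their length; stores everything else. */
  void writeChunk(String chunkName, ByteBuffer data) {
    if (isZeroFilled(data)) {
      zeroChunks.put(chunkName, data.remaining());
      disk.remove(chunkName);
      return;
    }
    byte[] copy = new byte[data.remaining()];
    data.duplicate().get(copy);
    disk.put(chunkName, copy);
    zeroChunks.remove(chunkName);
  }

  /** Rebuilds a zero-filled buffer for recorded zero chunks without touching "disk". */
  ByteBuffer readChunk(String chunkName) {
    Integer length = zeroChunks.get(chunkName);
    if (length != null) {
      // ByteBuffer.allocate returns a buffer whose elements are initialized to zero.
      return ByteBuffer.allocate(length);
    }
    byte[] stored = disk.get(chunkName);
    if (stored == null) {
      throw new IllegalArgumentException("Unknown chunk: " + chunkName);
    }
    return ByteBuffer.wrap(stored.clone());
  }

  /** Total payload bytes actually stored, i.e. excluding dropped zero-filled chunks. */
  long bytesOnDisk() {
    long total = 0;
    for (byte[] chunk : disk.values()) {
      total += chunk.length;
    }
    return total;
  }
}
```

In this sketch a zero-filled chunk costs one map entry instead of its full
length on disk, so writes and reads of such chunks stay off the storage path.
How a real Datanode would persist that record, and how the behaviour would be
gated for internal testing only, is left open here as it is in the description.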
> Performance test infrastructure : skip writing user data on Datanode
> --------------------------------------------------------------------
>
> Key: HDDS-1094
> URL: https://issues.apache.org/jira/browse/HDDS-1094
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Components: Ozone Datanode
> Reporter: Supratim Deka
> Assignee: Supratim Deka
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 20m
> Remaining Estimate: 0h