[
https://issues.apache.org/jira/browse/HDDS-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952414#comment-16952414
]
Bharat Viswanadham edited comment on HDDS-2309 at 10/16/19 1:25 AM:
--------------------------------------------------------------------
Thank you [~rajesh.balamohan] for reporting this issue.
A few questions:
# Does the workload use only one client, or are there clients running in
parallel? (If there is only one client, then in non-HA OM we don't return the
response to the client until we flush to disk. So, if only one client is
sending requests to OM, a flush happens for every single request. In HA OM we
will not see this, because we return the response to the client after adding
to the cache; we don't wait for the buffer flush.)
{quote}This forces {{cleanupCache}} to be invoked which ends up choking in
single thread executor. Attaching the profiler information which gives more
details.
{quote}
I think for non-HA we can skip scheduling cleanup cache for a few flush
transaction iterations. One more thing we can do is mark the future complete
first and then call cleanup cache, so the client will not see the time taken
to submit the cache cleanup.
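A minimal sketch of the two ideas above, to make the ordering concrete. This is not the actual {{OzoneManagerDoubleBuffer}} code: the class {{FlushSketch}}, the {{cleanupInterval}} knob, and the counters are hypothetical names used only for illustration, and the RocksDB commit is left as a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch; not the real OzoneManagerDoubleBuffer. */
final class FlushSketch {
  // Assumed knob: run cache cleanup once per this many flush iterations.
  private final int cleanupInterval;
  // Epochs (transaction indexes) whose cache entries still await cleanup.
  private final List<Long> pendingEpochs = new ArrayList<>();
  private int flushesSinceCleanup = 0;

  // Counters exposed only so the behaviour can be observed.
  int cleanupCalls = 0;
  int lastCleanupSize = 0;

  FlushSketch(int cleanupInterval) {
    this.cleanupInterval = cleanupInterval;
  }

  /** Handles one flush iteration for the given epochs and client futures. */
  void flushBatch(List<Long> epochs, List<CompletableFuture<Void>> futures) {
    // 1. Commit the batch to the DB here (omitted in this sketch).

    // 2. Complete the futures first, so the client does not wait on cleanup.
    for (CompletableFuture<Void> future : futures) {
      future.complete(null);
    }

    // 3. Accumulate epochs and clean up only every cleanupInterval flushes.
    pendingEpochs.addAll(epochs);
    flushesSinceCleanup++;
    if (flushesSinceCleanup >= cleanupInterval) {
      cleanupCache(new ArrayList<>(pendingEpochs));
      pendingEpochs.clear();
      flushesSinceCleanup = 0;
    }
  }

  /** Stand-in for the real cache cleanup; it only records that it ran. */
  private void cleanupCache(List<Long> epochs) {
    cleanupCalls++;
    lastCleanupSize = epochs.size();
  }
}
```

With {{cleanupInterval}} set to 3, three single-transaction flushes would trigger one cleanup covering all three epochs, and each client future is already complete before any cleanup work starts. The trade-off is that cache entries live a little longer between cleanups.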
was (Author: bharatviswa):
Thank you [~rajesh.balamohan] for reporting this issue.
A few questions:
# Does the workload use only one client, or are there clients running in
parallel? (If there is only one client, then in non-HA OM we don't return the
response to the client until we flush to disk. So, if only one client is
sending requests to OM, a flush happens for every single request. In HA OM we
will not see this, because we return the response to the client after adding
to the cache; we don't wait for the buffer flush.)
{quote}This forces {{cleanupCache}} to be invoked which ends up choking in
single thread executor. Attaching the profiler information which gives more
details.
{quote}
I think for non-HA we can skip scheduling cleanup cache immediately, because
when run with a singleThreadExecutor it will call cleanup cache immediately.
One more thing we can do is mark the future complete first and then call
cleanup cache, so the client will not see the time taken to submit the cache
cleanup.
> Optimise OzoneManagerDoubleBuffer::flushTransactions to flush in batches
> ------------------------------------------------------------------------
>
> Key: HDDS-2309
> URL: https://issues.apache.org/jira/browse/HDDS-2309
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Manager
> Reporter: Rajesh Balamohan
> Assignee: Bharat Viswanadham
> Priority: Major
> Attachments: Screenshot 2019-10-15 at 4.19.13 PM.png
>
>
> When running a write-heavy benchmark,
> {{org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.flushTransactions}}
> was invoked for pretty much every write.
> This forces {{cleanupCache}} to be invoked which ends up choking in single
> thread executor. Attaching the profiler information which gives more details.
> Ideally, {{flushTransactions}} should batch up the work to reduce load on
> RocksDB.
>
> [https://github.com/apache/hadoop-ozone/blob/master/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java#L130]
>
> [https://github.com/apache/hadoop-ozone/blob/master/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java#L322]
>
>