[
https://issues.apache.org/jira/browse/HDDS-4687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264729#comment-17264729
]
Marton Elek commented on HDDS-4687:
-----------------------------------
Testing is almost free (thanks to your patch), and it is good to double-check
our expectations. I just applied your patch and restarted the export.
But after ten minutes I gave up:
{code}
2021-01-14 01:46:44,504 [main] INFO debug.ExportContainer: Preparation is done
{code}
Only 1/3 of the file was copied during this time:
{code}
date && ls -lah container-6.tar.gz
Thu Jan 14 01:55:32 PST 2021
-rw-r--r-- 1 root root 1.5G Jan 14 01:55 container-6.tar.gz
{code}
I had 56 cores, and only one was busy with the compression.
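To put a rough number on the rate, the two timestamps and the file size above can be turned into a throughput figure. This is only a back-of-the-envelope sketch: it assumes the log timestamp and the `date` output are in the same time zone, and that the "1.5G" printed by `ls -lah` means about 1.5 GiB. The class and method names are made up for illustration.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.Locale;

// Hypothetical helper, not part of Ozone: estimates export throughput
// from the numbers quoted in this comment.
class ExportThroughputSketch {

  // Seconds between the "Preparation is done" log line and the ls check.
  static long elapsedSeconds(LocalDateTime start, LocalDateTime end) {
    return Duration.between(start, end).getSeconds();
  }

  // Converts a size in GiB and a duration in seconds to MiB per second.
  static double mibPerSecond(double gib, long seconds) {
    return gib * 1024.0 / seconds;
  }

  public static void main(String[] args) {
    // Assumption: both timestamps are local time on the same machine.
    LocalDateTime start = LocalDateTime.of(2021, 1, 14, 1, 46, 44);
    LocalDateTime end = LocalDateTime.of(2021, 1, 14, 1, 55, 32);
    long seconds = elapsedSeconds(start, end);
    // Assumption: "1.5G" from ls -lah is roughly 1.5 GiB.
    double rate = mibPerSecond(1.5, seconds);
    System.out.println(String.format(Locale.ROOT,
        "elapsed=%d s, throughput=%.1f MiB/s", seconds, rate));
  }
}
```

Under those assumptions the export ran for 528 seconds and wrote about 2.9 MiB/s of archive output.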
> Disable compression for closed-container replication
> ----------------------------------------------------
>
> Key: HDDS-4687
> URL: https://issues.apache.org/jira/browse/HDDS-4687
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Marton Elek
> Assignee: Marton Elek
> Priority: Critical
> Attachments: HDDS-4687.patch
>
>
> During the measurement of closed container replication I found that the
> biggest bottleneck is the read side. A 5 GB container is replicated in ~3
> minutes, but ~2:30 of that was the downloading part.
> Closed containers are replicated via gRPC. The source side creates an
> OutputStream on-the-fly (OnDemandContainerReplicationSource.java) and streams
> all the container content as a "tar.gz" archive to the client.
> It turned out that the compression (the .gz part) is quite expensive:
> I created a CLI tool to export containers to tar files (same logic as the
> replication but without streaming via gRPC, just saving to a file).
> Creating the archive took the same ~2:30:
> {code}
> 2021-01-13 05:51:25,302 [main] INFO debug.ExportContainer: Preparation is done
> 2021-01-13 05:53:53,472 [main] INFO debug.ExportContainer: Container is
> exported to /tmp/container-3.tar.gz
> {code}
> But when I removed the compression in TarContainerPacker.java, the speed was
> significantly better (25 sec instead of 150 sec)
> {code}
> 2021-01-13 06:11:46,254 [main] INFO debug.ExportContainer: Preparation is done
> 2021-01-13 06:12:11,512 [main] INFO debug.ExportContainer: Container is
> exported to /tmp/container-3.tar
> {code}
> As a result I suggest turning off the compression for closed container
> replication.
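The idea of making the ".gz" part optional can be sketched with the JDK alone. This is a minimal illustration, not the actual TarContainerPacker code: the class, the `wrap` and `pack` helpers, and the `compress` flag are hypothetical, and the tar layer is left out so that only the gzip step is switched on or off. Note that `GZIPOutputStream` deflates on the thread that calls `write`, which would be consistent with only one core being busy.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch, not the real TarContainerPacker: shows only how the
// gzip layer could be made optional around an arbitrary output stream.
class OptionalGzipSketch {

  // Wraps the destination in gzip only when compression is requested.
  static OutputStream wrap(OutputStream destination, boolean compress)
      throws IOException {
    return compress ? new GZIPOutputStream(destination) : destination;
  }

  // Writes the payload through the chosen stream and returns the bytes produced.
  static byte[] pack(byte[] payload, boolean compress) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (OutputStream out = wrap(sink, compress)) {
      out.write(payload);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toByteArray();
  }

  public static void main(String[] args) {
    // Stand-in for container data: 8 MiB of pseudo-random bytes.
    byte[] payload = new byte[8 * 1024 * 1024];
    new Random(42).nextBytes(payload);
    for (boolean compress : new boolean[] {true, false}) {
      long start = System.nanoTime();
      byte[] result = pack(payload, compress);
      long millis = (System.nanoTime() - start) / 1_000_000;
      System.out.println("compress=" + compress
          + " bytes=" + result.length + " time=" + millis + " ms");
    }
  }
}
```

The timings printed by `main` depend on the machine and on how compressible the data is, so they only indicate the relative cost of the gzip step, not the numbers measured on a real container.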