Kirill Sizov created HDDS-9228:
----------------------------------
Summary: Poor S3G read performance
Key: HDDS-9228
URL: https://issues.apache.org/jira/browse/HDDS-9228
Project: Apache Ozone
Issue Type: Bug
Components: S3
Affects Versions: 1.4.0
Reporter: Kirill Sizov
h3. TL;DR:
*S3G writes all its responses byte-after-byte.*
h3. Details
This issue was discovered during a performance test run.
h4. Cluster configuration
3 master nodes, 5 datanodes.
Each machine has a 96-core CPU.
S3G instances are installed on the master nodes (3 gateways in total).
h4. Test preparation
Before the test we uploaded 300,000 files to Ozone, 20 MB each.
h4. Test configuration
We ran two tests:
1. pure writes, no concurrent reads
2. pure reads, no concurrent writes
h4. Load generator
3 load generator nodes, each runs 50 threads.
h4. Ozone configuration
The buckets were created with Erasure Coding RS-3-2-1024k.
h3. Results
We found that writes were 3 times faster than reads; moreover, reads caused ~70% CPU usage.
Thread dumps and JFR showed the following stack traces in the HTTP threads.
Stacktrace:
{noformat}
"qtp2079179914-1055393" Id=1055393 RUNNABLE
    at org.glassfish.jersey.servlet.internal.ResponseWriter$NonCloseableOutputStreamWrapper.write(ResponseWriter.java:291)
    at org.glassfish.jersey.message.internal.CommittingOutputStream.write(CommittingOutputStream.java:215)
    at java.io.FilterOutputStream.write(FilterOutputStream.java:77)
    at java.io.FilterOutputStream.write(FilterOutputStream.java:125)
    at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$UnCloseableOutputStream.write(WriterInterceptorExecutor.java:276)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1310)
    at org.apache.commons.io.IOUtils.copy(IOUtils.java:978)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1282)
    at org.apache.hadoop.ozone.s3.endpoint.ObjectEndpoint.lambda$get$0(ObjectEndpoint.java:382)
{noformat}
JFR:
{noformat}
Stack Trace                                                                                            Count   Percentage
void org.eclipse.jetty.server.HttpOutput.write(int)                                                    431146  39 %
void org.glassfish.jersey.servlet.internal.ResponseWriter$NonCloseableOutputStreamWrapper.write(int)   431145  39 %
void org.glassfish.jersey.message.internal.CommittingOutputStream.write(int)                           431145  39 %
void java.io.FilterOutputStream.write(int)                                                             431145  39 %
void java.io.FilterOutputStream.write(byte[], int, int)                                                431145  39 %
void org.glassfish.jersey.message.internal.WriterInterceptorExecutor$UnCloseableOutputStream.write(byte[], int, int)  431145  39 %
long org.apache.commons.io.IOUtils.copyLarge(InputStream, OutputStream, byte[])                        431145  39 %
{noformat}
We can clearly see the transition {{FilterOutputStream.write(byte[], int, int) -> FilterOutputStream.write(int)}}, meaning that every incoming array is written one byte at a time rather than as a whole.
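For context, {{java.io.FilterOutputStream}} does not forward bulk writes to the wrapped stream: its inherited {{write(byte[], int, int)}} loops over the array and calls {{write(int)}} once per byte. A minimal standalone sketch (class name is illustrative) that counts how many single-byte calls reach the underlying stream:

{code}
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class PerByteDemo {
  public static void main(String[] args) throws IOException {
    final int[] singleByteCalls = {0};
    // Underlying sink that counts per-byte writes it receives.
    OutputStream sink = new ByteArrayOutputStream() {
      @Override
      public synchronized void write(int b) {
        singleByteCalls[0]++;
        super.write(b);
      }
    };
    // FilterOutputStream with no write(...) overrides, as in TracingFilter.
    OutputStream filtered = new FilterOutputStream(sink) {
      @Override
      public void close() throws IOException {
        super.close();
      }
    };
    // One 8 KB bulk write on the wrapper...
    filtered.write(new byte[8192], 0, 8192);
    filtered.close();
    // ...arrives at the sink as 8192 separate write(int) calls.
    System.out.println(singleByteCalls[0]);
  }
}
{code}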
The place in the code that creates the {{FilterOutputStream}} is {{org.apache.hadoop.ozone.s3.TracingFilter}}:
{code}
OutputStream out = responseContext.getEntityStream();
if (out != null) {
  responseContext.setEntityStream(new FilterOutputStream(out) {
    @Override
    public void close() throws IOException {
      super.close();
      finishAndClose(scope, span);
    }
  });
}
{code}
Removing this filter or overriding the {{FilterOutputStream.write(byte[], int, int)}} method resolves the performance issue, and we see 5x better throughput.
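One possible shape of the fix (a sketch only; names like {{wrap}} and the {{Runnable}} standing in for {{finishAndClose(scope, span)}} are illustrative, not the actual patch) is to keep the {{close()}} hook but forward bulk writes straight to the wrapped stream:

{code}
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BulkCloseFilter {
  // Wrap a stream so that close() runs a callback, but bulk writes are
  // forwarded in one call instead of FilterOutputStream's per-byte loop.
  static OutputStream wrap(OutputStream out, Runnable onClose) {
    return new FilterOutputStream(out) {
      @Override
      public void write(byte[] b, int off, int len) throws IOException {
        this.out.write(b, off, len); // one bulk call per buffer
      }

      @Override
      public void close() throws IOException {
        super.close();
        onClose.run(); // stands in for finishAndClose(scope, span)
      }
    };
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    OutputStream wrapped = wrap(sink, () -> System.out.println("closed"));
    wrapped.write(new byte[4], 0, 4); // reaches the sink as a single bulk write
    wrapped.close();
    System.out.println(sink.size());
  }
}
{code}

Overriding {{write(byte[])}} is not needed, since {{FilterOutputStream.write(byte[])}} already delegates to {{write(byte[], int, int)}}.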
--
This message was sent by Atlassian Jira
(v8.20.10#820010)