Kirill Sizov created HDDS-9228:
----------------------------------
Summary: Poor S3G read performance
Key: HDDS-9228
URL: https://issues.apache.org/jira/browse/HDDS-9228
Project: Apache Ozone
Issue Type: Bug
Components: S3
Affects Versions: 1.4.0
Reporter: Kirill Sizov
h3. TL;DR:
*S3G writes all its responses byte-after-byte.*
h3. Details
This issue was discovered during a performance test run.
h4. Cluster configuration
3 master nodes, 5 datanodes.
Each machine has a 96-core CPU.
S3G instances are installed on the master nodes (3 gateways in total).
h4. Test preparation
Before the test we uploaded 300,000 files to Ozone, 20 MB each.
h4. Test configuration
We ran two tests:
1. pure writes, no concurrent reads
2. pure reads, no concurrent writes
h4. Load generator
3 load generator nodes, each runs 50 threads.
h4. Ozone configuration
The buckets were created with Erasure Coding RS-3-2-1024k.
h3. Results
We found that writes were 3 times faster than reads; moreover, reads caused ~70% CPU usage.
Thread dumps and JFR showed the following stack traces in the HTTP threads.
Stacktrace:
{noformat}
"qtp2079179914-1055393" Id=1055393 RUNNABLE
    at org.glassfish.jersey.servlet.internal.ResponseWriter$NonCloseableOutputStreamWrapper.write(ResponseWriter.java:291)
    at org.glassfish.jersey.message.internal.CommittingOutputStream.write(CommittingOutputStream.java:215)
    at java.io.FilterOutputStream.write(FilterOutputStream.java:77)
    at java.io.FilterOutputStream.write(FilterOutputStream.java:125)
    at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$UnCloseableOutputStream.write(WriterInterceptorExecutor.java:276)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1310)
    at org.apache.commons.io.IOUtils.copy(IOUtils.java:978)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1282)
    at org.apache.hadoop.ozone.s3.endpoint.ObjectEndpoint.lambda$get$0(ObjectEndpoint.java:382)
{noformat}
JFR:
{noformat}
Stack Trace                                                                                            Count   Percentage
void org.eclipse.jetty.server.HttpOutput.write(int)                                                    431146  39 %
void org.glassfish.jersey.servlet.internal.ResponseWriter$NonCloseableOutputStreamWrapper.write(int)   431145  39 %
void org.glassfish.jersey.message.internal.CommittingOutputStream.write(int)                           431145  39 %
void java.io.FilterOutputStream.write(int)                                                             431145  39 %
void java.io.FilterOutputStream.write(byte[], int, int)                                                431145  39 %
void org.glassfish.jersey.message.internal.WriterInterceptorExecutor$UnCloseableOutputStream.write(byte[], int, int)  431145  39 %
long org.apache.commons.io.IOUtils.copyLarge(InputStream, OutputStream, byte[])                        431145  39 %
{noformat}
We can clearly see the transition {{FilterOutputStream.write(byte[], int, int) -> FilterOutputStream.write(int)}}, meaning that every incoming array is written one byte at a time rather than as a whole.
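For context, {{java.io.FilterOutputStream}} does not forward bulk writes to the wrapped stream: its inherited {{write(byte[], int, int)}} loops over the array and calls {{write(int)}} once per byte. A minimal standalone sketch (class name is illustrative) that counts how many single-byte calls reach the underlying stream:

{code}
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class PerByteDemo {
  public static void main(String[] args) throws IOException {
    final int[] singleByteCalls = {0};
    // Underlying sink that counts per-byte writes it receives.
    OutputStream sink = new ByteArrayOutputStream() {
      @Override
      public synchronized void write(int b) {
        singleByteCalls[0]++;
        super.write(b);
      }
    };
    // FilterOutputStream with no write(...) overrides, as in TracingFilter.
    OutputStream filtered = new FilterOutputStream(sink) {
      @Override
      public void close() throws IOException {
        super.close();
      }
    };
    // One 8 KB bulk write on the wrapper...
    filtered.write(new byte[8192], 0, 8192);
    filtered.close();
    // ...arrives at the sink as 8192 separate write(int) calls.
    System.out.println(singleByteCalls[0]);
  }
}
{code}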
The place in the code that creates the {{FilterOutputStream}} is {{org.apache.hadoop.ozone.s3.TracingFilter}}:
{code}
OutputStream out = responseContext.getEntityStream();
if (out != null) {
  responseContext.setEntityStream(new FilterOutputStream(out) {
    @Override
    public void close() throws IOException {
      super.close();
      finishAndClose(scope, span);
    }
  });
}
{code}
Removing this filter or overriding the {{FilterOutputStream.write(byte[], int, int)}} method resolves the performance issue, and we see 5x better throughput.
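One possible shape of the fix (a sketch only; names like {{wrap}} and the {{Runnable}} standing in for {{finishAndClose(scope, span)}} are illustrative, not the actual patch) is to keep the {{close()}} hook but forward bulk writes straight to the wrapped stream:

{code}
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class BulkCloseFilter {
  // Wrap a stream so that close() runs a callback, but bulk writes are
  // forwarded in one call instead of FilterOutputStream's per-byte loop.
  static OutputStream wrap(OutputStream out, Runnable onClose) {
    return new FilterOutputStream(out) {
      @Override
      public void write(byte[] b, int off, int len) throws IOException {
        this.out.write(b, off, len); // one bulk call per buffer
      }

      @Override
      public void close() throws IOException {
        super.close();
        onClose.run(); // stands in for finishAndClose(scope, span)
      }
    };
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    OutputStream wrapped = wrap(sink, () -> System.out.println("closed"));
    wrapped.write(new byte[4], 0, 4); // reaches the sink as a single bulk write
    wrapped.close();
    System.out.println(sink.size());
  }
}
{code}

Overriding {{write(byte[])}} is not needed, since {{FilterOutputStream.write(byte[])}} already delegates to {{write(byte[], int, int)}}.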
--
This message was sent by Atlassian Jira
(v8.20.10#820010)