Alexander Trushev created FLINK-24459:
-----------------------------------------
Summary: Performance improvement of file sink on Nexmark
Key: FLINK-24459
URL: https://issues.apache.org/jira/browse/FLINK-24459
Project: Flink
Issue Type: Improvement
Components: Connectors / FileSystem
Reporter: Alexander Trushev
Attachments: after.jfr.zip, after_cpu.png, after_mem.png,
before.jfr.zip, before_cpu.png, before_mem.png
h3. Context
{{PartitionPathUtils.escapePathName}} is a fairly simple method: it takes a
{{String}}, allocates a {{StringBuilder}}, appends each char either as-is or
escaped, and returns the resulting {{String}}.
The filesystem sink calls this method several times for each element to
determine the bucket id. This makes it a hot spot on workloads that write
intensively to a filesystem, such as
[nexmark q10|https://github.com/nexmark/nexmark/blob/master/nexmark-flink/src/main/resources/queries/q10.sql].
On my local machine, escaping of chars accounts for 9.53% of CPU samples and
17.8% of memory allocations of the whole TaskManager process.
h3. Proposal
Improvements to {{PartitionPathUtils.escapePathName}}:
# Use the more efficient {{Integer.toHexString}} instead of {{String.format}}
# Do not allocate a new string when the original string contains no escapable char
# Size the {{StringBuilder}} based on the original string length instead of using the default capacity
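
A minimal sketch of how the three points above could fit together. This is not the actual patch: the class name {{EscapePathSketch}}, the escapable character set in {{needsEscaping}}, the {{+ 16}} capacity headroom, and the use of {{IllegalArgumentException}} are assumptions made for illustration only. The output is meant to keep the zero-padded, upper-case two-digit hex form that a {{String.format}} pattern such as {{%02X}} would produce.

```java
import java.util.Locale;

/** Illustrative sketch only; not the Flink implementation. */
final class EscapePathSketch {

    private static final char ESCAPE_CHAR = '%';

    /** Hypothetical rule for this sketch: control chars plus a few path-special chars. */
    static boolean needsEscaping(char c) {
        return c < 0x20 || c == 0x7F || "\"#%'*/:=?\\{[]^".indexOf(c) >= 0;
    }

    static String escapePathName(String path) {
        if (path == null || path.isEmpty()) {
            throw new IllegalArgumentException("Path should not be null or empty: " + path);
        }

        // Point 2: scan first; if nothing needs escaping, return the input
        // without allocating a StringBuilder or a new String.
        int first = -1;
        for (int i = 0; i < path.length(); i++) {
            if (needsEscaping(path.charAt(i))) {
                first = i;
                break;
            }
        }
        if (first < 0) {
            return path;
        }

        // Point 3: size the builder from the input length (plus assumed headroom
        // for escapes) rather than relying on the default capacity.
        StringBuilder sb = new StringBuilder(path.length() + 16);
        sb.append(path, 0, first); // copy the clean prefix in one call

        for (int i = first; i < path.length(); i++) {
            char c = path.charAt(i);
            if (needsEscaping(c)) {
                sb.append(ESCAPE_CHAR);
                // Point 1: Integer.toHexString instead of String.format.
                // toHexString does not pad, so add the leading zero by hand.
                if (c < 0x10) {
                    sb.append('0');
                }
                sb.append(Integer.toHexString(c).toUpperCase(Locale.ROOT));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Under these assumptions, {{escapePathName("a=b")}} yields {{a%3Db}}, while an input with no escapable char comes back as the same {{String}} instance, so the common case allocates nothing.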
h3. Benefit
Experiment on a local machine:
1 TaskManager with 6 slots, job parallelism 6, Nexmark default configuration
plus the object reuse option.
Before: flink-1.14.0
After: flink-1.14.0 + patch with the improvements
|| Nexmark q10 || Before || After ||
| CPU samples of escapePathName() (% of all) | 9.53 | 1.64 |
| Memory allocations by escapePathName() (% of all) | 17.8 | 2.98 |
| Throughput/Cores (K/s) | 107.64 | 119.42 |
Diff: CPU *-7.89* percentage points, Memory *-14.82* percentage points, Throughput *+10.9*%
Profiling reports are attached.