[
https://issues.apache.org/jira/browse/BEAM-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17431384#comment-17431384
]
Robert Burke edited comment on BEAM-13082 at 10/20/21, 5:27 PM:
----------------------------------------------------------------
Wrote a benchmark. Results before the change:
{{
goos: linux
goarch: amd64
pkg: github.com/apache/beam/sdks/v2/go/pkg/beam/core/runtime/harness
BenchmarkDataWriter/4B-8       124502478          9.551 ns/op         23 B/op   0 allocs/op
BenchmarkDataWriter/16B-8       66365191         19.76 ns/op          91 B/op   0 allocs/op
BenchmarkDataWriter/1KB-8        1240803        919.3 ns/op         5891 B/op   0 allocs/op
BenchmarkDataWriter/4KB-8         361453       4160 ns/op          23556 B/op   0 allocs/op
BenchmarkDataWriter/100KB-8        14484      77779 ns/op         492353 B/op   0 allocs/op
BenchmarkDataWriter/1MB-8           4654     315765 ns/op        2274483 B/op   2 allocs/op
BenchmarkDataWriter/10MB-8           896    1513997 ns/op       10486064 B/op   6 allocs/op
BenchmarkDataWriter/100MB-8           82   35511605 ns/op      104857911 B/op   6 allocs/op
BenchmarkDataWriter/256MB-8           19  202720085 ns/op      268435785 B/op   6 allocs/op
PASS
}}
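
For readers who want to reproduce the shape of this measurement outside the Beam tree, here is a minimal sketch. It is not the BenchmarkDataWriter code from the harness package: fillAndFlush and allocsPerOp are hypothetical names, and the loop only imitates a write followed by a flush so that the two reset strategies can be compared.

```go
package main

import (
	"fmt"
	"testing"
)

// fillAndFlush is a hypothetical stand-in for one write-then-flush cycle.
// It appends payload to buf and then resets buf the way a flush would:
// either by truncating (keeps capacity) or by dropping the slice.
func fillAndFlush(buf, payload []byte, reuse bool) []byte {
	buf = append(buf, payload...)
	// A real writer would hand buf to the transport here.
	if reuse {
		return buf[:0]
	}
	return nil
}

// allocsPerOp runs the cycle under testing.Benchmark and reports the
// measured allocations per iteration for the chosen reset strategy.
func allocsPerOp(size int, reuse bool) int64 {
	payload := make([]byte, size)
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		var buf []byte
		for i := 0; i < b.N; i++ {
			buf = fillAndFlush(buf, payload, reuse)
		}
	})
	return res.AllocsPerOp()
}

func main() {
	fmt.Println("allocs/op with buf = nil:    ", allocsPerOp(1024, false))
	fmt.Println("allocs/op with buf = buf[:0]:", allocsPerOp(1024, true))
}
```

The sketch uses testing.Benchmark so it can run as an ordinary program; inside the SDK the same loop would more naturally live in a _test.go file with b.Run sub-benchmarks per payload size.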
> [Go SDK] Reduce churn in dataWriter by retaining byte slice.
> ------------------------------------------------------------
>
> Key: BEAM-13082
> URL: https://issues.apache.org/jira/browse/BEAM-13082
> Project: Beam
> Issue Type: Improvement
> Components: sdk-go
> Reporter: Robert Burke
> Assignee: Robert Burke
> Priority: P2
>
> It's been noted that we can reduce allocations and GC overhead produced by
> the dataWriter if we change `w.buf = nil` to `w.buf = w.buf[:0]`. However,
> we should still nil out the buffer after the final flush in Close(), to
> avoid retaining larger byte buffers after bundle termination.
> A dataWriter is created per bundle, and is only used by, and only safe to
> use from, that bundle's processing thread. Further, gRPC's Send call doesn't
> retain ownership of the proto message data after Send returns, allowing this
> reuse.
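
A minimal sketch of the proposed change, under stated assumptions: sketchWriter and its send field are hypothetical stand-ins for the SDK's dataWriter and the gRPC Send call, and send is assumed not to keep a reference to the slice after it returns, as the description says.

```go
package main

import "fmt"

// sketchWriter is a hypothetical, single-goroutine stand-in for dataWriter.
type sketchWriter struct {
	buf  []byte
	send func([]byte) error // assumed not to retain the slice after returning
}

// Write buffers p; the sketch leaves flush thresholds out.
func (w *sketchWriter) Write(p []byte) (int, error) {
	w.buf = append(w.buf, p...)
	return len(p), nil
}

// Flush sends any buffered bytes, then truncates the buffer so its
// capacity is kept for the next write (w.buf = w.buf[:0]).
func (w *sketchWriter) Flush() error {
	if len(w.buf) == 0 {
		return nil
	}
	if err := w.send(w.buf); err != nil {
		return err
	}
	w.buf = w.buf[:0]
	return nil
}

// Close does the final flush and then drops the buffer entirely, so a
// large backing array is not retained after the bundle ends.
func (w *sketchWriter) Close() error {
	err := w.Flush()
	w.buf = nil
	return err
}

func main() {
	sent := 0
	w := &sketchWriter{send: func(b []byte) error { sent += len(b); return nil }}
	w.Write(make([]byte, 1024))
	w.Flush()
	fmt.Println(len(w.buf), cap(w.buf) >= 1024) // 0 true
	w.Close()
	fmt.Println(w.buf == nil, sent) // true 1024
}
```

Truncating is only safe here because one goroutine owns the writer and the transport is assumed to be done with the bytes once send returns; if either assumption fails, the next Write would overwrite data still in use.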
> A later optimization could use a sync.Pool to maintain a "freelist" of
> buffers to further reduce per-bundle allocations, but this would likely only
> be noticeable in streaming contexts. Such a free list should only keep
> buffers under some threshold (say, slices under 64MB in cap) to avoid
> retaining overly large buffers that aren't in active use. This idea, though,
> is out of scope for a first pass.
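
The free list idea from the last paragraph could look roughly like the sketch below. None of these names exist in the SDK: bufPool, getBuf, putBuf and maxRetainedCap are hypothetical, the 4KB starting capacity is an arbitrary choice, and the 64MB limit is just the example figure from the description.

```go
package main

import (
	"fmt"
	"sync"
)

// maxRetainedCap is the example threshold from the description; buffers at
// or above this capacity are left for the garbage collector.
const maxRetainedCap = 64 << 20

// bufPool hands out pointers to byte slices so that Put does not allocate
// a new interface value for the slice header on every call.
var bufPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 0, 4096) // starting capacity is an assumption
		return &b
	},
}

// getBuf returns an empty buffer, possibly one that was used before.
func getBuf() *[]byte {
	return bufPool.Get().(*[]byte)
}

// putBuf returns a buffer to the pool unless it has grown too large.
// It reports whether the buffer was kept.
func putBuf(b *[]byte) bool {
	if cap(*b) >= maxRetainedCap {
		return false
	}
	*b = (*b)[:0]
	bufPool.Put(b)
	return true
}

func main() {
	b := getBuf()
	*b = append(*b, "bundle data"...)
	fmt.Println(putBuf(b)) // true

	big := make([]byte, 0, maxRetainedCap)
	fmt.Println(putBuf(&big)) // false
}
```

sync.Pool may discard pooled items at any garbage collection, so this bounds retained memory but does not guarantee that a later getBuf sees a previously returned buffer.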
--
This message was sent by Atlassian Jira
(v8.3.4#803005)