steveniemitz opened a new pull request #17134:
URL: https://github.com/apache/beam/pull/17134
Many coders have significant overhead due to their use of `DataInputStream`.
`DataInputStream` allocates several internal buffers when it is
instantiated, which adds unnecessary overhead for very simple operations like
decoding a big-endian long.
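
For context, the pattern being replaced looks roughly like the sketch below. This is an illustration only, not code taken from Beam: the class name `DataInputStreamBaseline` and its `readLong` helper are hypothetical. It shows a fresh `DataInputStream` wrapper being created just to decode one big-endian long.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical baseline, not the actual Beam coder code.
public class DataInputStreamBaseline {

  // Wraps the stream in a new DataInputStream for every decode, so the
  // wrapper and its internal buffers are allocated on each call.
  static long readLong(InputStream in) throws IOException {
    return new DataInputStream(in).readLong();
  }

  public static void main(String[] args) throws IOException {
    // 0x0000000000000102 encoded big-endian.
    byte[] encoded = {0, 0, 0, 0, 0, 0, 1, 2};
    System.out.println(readLong(new ByteArrayInputStream(encoded))); // prints 258
  }
}
```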
This changes most coders that use `DataInputStream` internally to use a more
optimized big-endian decoder. I benchmarked three different options here;
the solution I arrived at was the best mix of performance and allocations.
```
Benchmark                Mode  Cnt          Score         Error  Units
readLongViaLocalBuffer  thrpt   10  204364633.343 ± 7412002.528  ops/s
readLongViaTLBuffer     thrpt   10  108663164.381 ±  229471.991  ops/s
readLongViaReadCalls    thrpt   10  160694853.195 ± 5272248.704  ops/s
```
- `readLongViaLocalBuffer` allocates an 8-byte buffer per call and fills it
using a single `read()` call.
- `readLongViaTLBuffer` does the same, but uses a thread-local buffer rather
than allocating a new one on each call.
- `readLongViaReadCalls` simply calls `read()` 8 times, storing the results in
temporary variables.
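
The three benchmarked variants could be sketched roughly as follows. This is a reconstruction from the descriptions above, not the benchmark source: the class name `BigEndianLongReaders` and the helpers `readFully` and `assemble` are hypothetical. Two simplifications are assumed. The buffer variants loop until 8 bytes have arrived, since a single `read(byte[])` may return fewer bytes on some streams, and the per-byte variant accumulates into one `long` instead of eight separate temporaries.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical reconstruction of the three options, for illustration only.
public class BigEndianLongReaders {

  private static final ThreadLocal<byte[]> TL_BUFFER =
      ThreadLocal.withInitial(() -> new byte[8]);

  // Option 1: allocate a fresh 8-byte buffer on every call.
  static long readLongViaLocalBuffer(InputStream in) throws IOException {
    byte[] buf = new byte[8];
    readFully(in, buf);
    return assemble(buf);
  }

  // Option 2: reuse a per-thread buffer instead of allocating.
  static long readLongViaTLBuffer(InputStream in) throws IOException {
    byte[] buf = TL_BUFFER.get();
    readFully(in, buf);
    return assemble(buf);
  }

  // Option 3: read one byte at a time, no buffer at all.
  static long readLongViaReadCalls(InputStream in) throws IOException {
    long result = 0;
    for (int i = 0; i < 8; i++) {
      int b = in.read();
      if (b < 0) {
        throw new EOFException();
      }
      result = (result << 8) | b;
    }
    return result;
  }

  // Fills buf completely, or throws EOFException if the stream ends early.
  private static void readFully(InputStream in, byte[] buf) throws IOException {
    int off = 0;
    while (off < buf.length) {
      int n = in.read(buf, off, buf.length - off);
      if (n < 0) {
        throw new EOFException();
      }
      off += n;
    }
  }

  // Combines 8 bytes into a long, most significant byte first.
  private static long assemble(byte[] buf) {
    long result = 0;
    for (byte b : buf) {
      result = (result << 8) | (b & 0xff);
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    byte[] encoded = {1, 2, 3, 4, 5, 6, 7, 8};
    System.out.println(Long.toHexString(readLongViaLocalBuffer(new ByteArrayInputStream(encoded))));
    System.out.println(Long.toHexString(readLongViaTLBuffer(new ByteArrayInputStream(encoded))));
    System.out.println(Long.toHexString(readLongViaReadCalls(new ByteArrayInputStream(encoded))));
  }
}
```

All three variants return the same value for the same input; they differ only in how the 8 bytes are fetched and whether a buffer is allocated.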
R: @lukecwik maybe? Not really sure who's the best to look at this.
------------------------
Thank you for your contribution! Follow this checklist to help us
incorporate your contribution quickly and easily:
- [x] [**Choose
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA
issue, if applicable. This will automatically link the pull request to the
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [x] If this contribution is large, please file an Apache [Individual
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
See the [Contributor Guide](https://beam.apache.org/contribute) for more
tips on [how to make the review process
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
To check the build health, please visit
[https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more
information about GitHub Actions CI.