CurtHagenlocher opened a new issue, #261:
URL: https://github.com/apache/arrow-dotnet/issues/261
### Describe the enhancement requested
In an internal project where I'm looking at the overhead of translating JSON
responses into Arrow, I'm seeing the following performance characteristics:
.NET 4.7.2

| Method | Mean | Error | StdDev | Gen0 | Gen1 | Gen2 | Allocated |
|---------------|---------:|--------:|--------:|----------:|---------:|---------:|----------:|
| SystemTextJson | 663.9 us | 3.32 us | 3.10 us | 58.5938 | 19.5313 | - | 366.89 KB |
| Arrow | 666.3 us | 5.77 us | 5.11 us | 7756.8359 | 965.8203 | 117.1875 | 97.06 KB |

.NET 8.0

| Method | Mean | Error | StdDev | Gen0 | Gen1 | Gen2 | Allocated |
|---------------|---------:|--------:|--------:|---------:|---------:|---------:|----------:|
| SystemTextJson | 285.6 us | 1.00 us | 0.94 us | 23.4375 | 11.7188 | - | 363.92 KB |
| Arrow | 334.2 us | 6.46 us | 6.34 us | 199.7070 | 199.7070 | 199.7070 | 52.17 KB |

.NET 10.0

| Method | Mean | Error | StdDev | Gen0 | Gen1 | Gen2 | Allocated |
|---------------|---------:|--------:|--------:|---------:|---------:|---------:|----------:|
| SystemTextJson | 267.2 us | 1.42 us | 1.26 us | 23.4375 | 14.1602 | - | 362.37 KB |
| Arrow | 331.9 us | 6.23 us | 5.83 us | 198.7305 | 198.7305 | 198.7305 | 48.44 KB |

When I effectively remove the calls to `GC.AddMemoryPressure` and
`GC.RemoveMemoryPressure` but make no other changes, I instead get the
following:
.NET 4.7.2

| Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
|---------------|---------:|--------:|--------:|--------:|--------:|----------:|
| SystemTextJson | 661.6 us | 3.45 us | 3.23 us | 58.5938 | 19.5313 | 366.91 KB |
| Arrow | 453.6 us | 8.35 us | 7.81 us | 7.8125 | 3.9063 | 53.68 KB |

.NET 8.0

| Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
|---------------|---------:|--------:|--------:|--------:|--------:|----------:|
| SystemTextJson | 288.0 us | 2.57 us | 2.41 us | 23.4375 | 11.7188 | 363.92 KB |
| Arrow | 277.9 us | 4.47 us | 3.73 us | 2.9297 | 2.4414 | 52.11 KB |

.NET 10.0

| Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
|---------------|---------:|--------:|--------:|--------:|--------:|----------:|
| SystemTextJson | 270.4 us | 1.77 us | 1.66 us | 23.4375 | 14.1602 | 362.37 KB |
| Arrow | 274.6 us | 3.46 us | 3.24 us | 2.9297 | 2.4414 | 48.3 KB |

The amount allocated by the "SystemTextJson" test is a reasonable proxy for
the amount of native memory allocated by the "Arrow" test. Especially under
.NET 4.7.2, it appears that these calls are triggering frequent garbage
collections, which (at least in a microbenchmark) have a large effect on
performance.
One way to deal with this is to batch the calls. We might, for example,
round up to 64 KB and only add or remove memory pressure when the running
total crosses a 64 KB boundary. The first allocation of 128 bytes would add
64 KB of pressure; subsequent allocations of 128 bytes, 4096 bytes, 8192 bytes
and 32768 bytes would only increment an internal counter (45312 bytes in
total). An allocation of 40000 bytes would then bring the total to 85312
bytes, past the 64 KB boundary, and trigger another `GC.AddMemoryPressure`
call for another 64 KB.
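
The batching described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `BatchedMemoryPressure` is a hypothetical helper, not an existing Apache.Arrow type, and the 64 KB granularity and the use of a simple lock are choices made for the sketch rather than anything measured.

```csharp
using System;

// Hypothetical helper (not part of Apache.Arrow): reports native memory
// pressure to the GC in 64 KB steps instead of once per allocation.
public sealed class BatchedMemoryPressure
{
    private const long Granularity = 64 * 1024;

    private readonly object _lock = new object();
    private long _allocated; // native bytes currently outstanding
    private long _reported;  // bytes currently reported to the GC (multiple of Granularity)

    public long ReportedBytes
    {
        get { lock (_lock) { return _reported; } }
    }

    // Call after a native allocation of the given size.
    public void Add(long bytes)
    {
        if (bytes <= 0) return;
        lock (_lock)
        {
            _allocated += bytes;
            Adjust();
        }
    }

    // Call after freeing a native allocation of the given size.
    public void Remove(long bytes)
    {
        if (bytes <= 0) return;
        lock (_lock)
        {
            _allocated = Math.Max(0, _allocated - bytes);
            Adjust();
        }
    }

    // Round the outstanding total up to the next 64 KB boundary and tell
    // the GC only about the difference from what was reported before.
    private void Adjust()
    {
        long target = (_allocated + Granularity - 1) / Granularity * Granularity;
        long delta = target - _reported;
        if (delta > 0)
        {
            GC.AddMemoryPressure(delta);
        }
        else if (delta < 0)
        {
            GC.RemoveMemoryPressure(-delta);
        }
        _reported = target;
    }
}
```

With the allocation sequence from the example, `ReportedBytes` is 65536 after the first `Add(128)`, stays at 65536 through the 128, 4096, 8192 and 32768 byte allocations, and becomes 131072 after `Add(40000)`. One caveat of this simple version is that a workload that repeatedly allocates and frees right at a 64 KB boundary would still call into the GC each time; some hysteresis on removal might be needed to avoid that.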
It's possible, of course, that the benchmark is pathological due to an
overall heap size which is quite small at that point and that in a "real"
application we would not see this kind of noise. I'll try to figure out a way
to measure that.