CurtHagenlocher opened a new issue, #261:
URL: https://github.com/apache/arrow-dotnet/issues/261

   ### Describe the enhancement requested
   
   In an internal project where I'm looking at the overhead of translating JSON 
responses into Arrow, I'm seeing the following performance characteristics:
   
   .NET 4.7.2
   
   | Method         | Mean     | Error   | StdDev  | Gen0      | Gen1     | Gen2     | Allocated |
   |--------------- |---------:|--------:|--------:|----------:|---------:|---------:|----------:|
   | SystemTextJson | 663.9 us | 3.32 us | 3.10 us |   58.5938 |  19.5313 |        - | 366.89 KB |
   | Arrow          | 666.3 us | 5.77 us | 5.11 us | 7756.8359 | 965.8203 | 117.1875 |  97.06 KB |
   
   .NET 8.0
   
   | Method         | Mean     | Error   | StdDev  | Gen0     | Gen1     | Gen2     | Allocated |
   |--------------- |---------:|--------:|--------:|---------:|---------:|---------:|----------:|
   | SystemTextJson | 285.6 us | 1.00 us | 0.94 us |  23.4375 |  11.7188 |        - | 363.92 KB |
   | Arrow          | 334.2 us | 6.46 us | 6.34 us | 199.7070 | 199.7070 | 199.7070 |  52.17 KB |
   
   .NET 10.0
   
   | Method         | Mean     | Error   | StdDev  | Gen0     | Gen1     | Gen2     | Allocated |
   |--------------- |---------:|--------:|--------:|---------:|---------:|---------:|----------:|
   | SystemTextJson | 267.2 us | 1.42 us | 1.26 us |  23.4375 |  14.1602 |        - | 362.37 KB |
   | Arrow          | 331.9 us | 6.23 us | 5.83 us | 198.7305 | 198.7305 | 198.7305 |  48.44 KB |
   
   When I effectively remove the calls to `GC.AddMemoryPressure` and 
`GC.RemoveMemoryPressure` but make no other changes, I instead get the 
following:
   
   .NET 4.7.2
   
   | Method         | Mean     | Error   | StdDev  | Gen0    | Gen1    | Allocated |
   |--------------- |---------:|--------:|--------:|--------:|--------:|----------:|
   | SystemTextJson | 661.6 us | 3.45 us | 3.23 us | 58.5938 | 19.5313 | 366.91 KB |
   | Arrow          | 453.6 us | 8.35 us | 7.81 us |  7.8125 |  3.9063 |  53.68 KB |
   
   .NET 8.0
   
   | Method         | Mean     | Error   | StdDev  | Gen0    | Gen1    | Allocated |
   |--------------- |---------:|--------:|--------:|--------:|--------:|----------:|
   | SystemTextJson | 288.0 us | 2.57 us | 2.41 us | 23.4375 | 11.7188 | 363.92 KB |
   | Arrow          | 277.9 us | 4.47 us | 3.73 us |  2.9297 |  2.4414 |  52.11 KB |
   
   .NET 10.0
   
   | Method         | Mean     | Error   | StdDev  | Gen0    | Gen1    | Allocated |
   |--------------- |---------:|--------:|--------:|--------:|--------:|----------:|
   | SystemTextJson | 270.4 us | 1.77 us | 1.66 us | 23.4375 | 14.1602 | 362.37 KB |
   | Arrow          | 274.6 us | 3.46 us | 3.24 us |  2.9297 |  2.4414 |  48.30 KB |
   
   The amount allocated by the "SystemTextJson" test is a reasonable proxy for 
the amount of native memory allocated by the "Arrow" test. Especially under 
.NET 4.7.2, it appears that these calls trigger frequent garbage collections, 
which (at least in a microbenchmark) have a large effect on performance.
   
   One way to deal with this is to batch the calls. We might consider rounding 
up to 64 KB and only adding or removing memory pressure when the running total 
crosses a 64 KB boundary. For instance, the first allocation of 128 bytes would 
add 64 KB of pressure, then subsequent allocations of 128 bytes, 4096 bytes, 
8192 bytes and 32768 bytes would only increment an internal counter (to 45312 
bytes in total). At that point, an allocation of 40000 bytes would bring the 
total to 85312 bytes, past the 64 KB (65536 byte) boundary, and trigger another 
`GC.AddMemoryPressure` for another 64 KB.
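
The batching described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `BatchedMemoryPressure`, its members, and the injectable delegates are hypothetical names and not part of Apache.Arrow; the 64 KB granularity comes from the proposal above; and a simple lock is used for thread safety rather than anything tuned.

```csharp
using System;

// Hypothetical helper (not an existing Apache.Arrow type): reports native
// memory to the GC in 64 KB steps instead of once per allocation.
public sealed class BatchedMemoryPressure
{
    private const long Granularity = 64 * 1024;

    private readonly object _gate = new object();
    private readonly Action<long> _add;
    private readonly Action<long> _remove;
    private long _allocated; // exact number of native bytes outstanding
    private long _reported;  // bytes reported to the GC (multiple of Granularity)

    // Default: forward to the real GC APIs.
    public BatchedMemoryPressure()
        : this(new Action<long>(GC.AddMemoryPressure),
               new Action<long>(GC.RemoveMemoryPressure))
    {
    }

    // The delegates are injectable only so the arithmetic can be observed.
    public BatchedMemoryPressure(Action<long> add, Action<long> remove)
    {
        _add = add;
        _remove = remove;
    }

    public void Add(long bytes)
    {
        long delta;
        lock (_gate)
        {
            _allocated += bytes;
            long target = RoundUp(_allocated);
            delta = target - _reported;
            if (delta <= 0) return; // still inside the block already reported
            _reported = target;
        }
        _add(delta); // delta is a positive multiple of 64 KB
    }

    public void Remove(long bytes)
    {
        long delta;
        lock (_gate)
        {
            _allocated -= bytes;
            long target = RoundUp(_allocated);
            delta = _reported - target;
            if (delta <= 0) return; // no 64 KB boundary crossed going down
            _reported = target;
        }
        _remove(delta);
    }

    private static long RoundUp(long n)
    {
        return (n + Granularity - 1) / Granularity * Granularity;
    }
}
```

With the allocation sequence from the example (128, 128, 4096, 8192, 32768, then 40000 bytes), this sketch would call the add delegate twice, each time with 65536. One design caveat: a workload that repeatedly allocates and frees right at a 64 KB boundary would still alternate add and remove calls, so some hysteresis might be worth considering.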
   
   It's possible, of course, that the benchmark is pathological because the 
overall heap size is quite small at that point, and that in a "real" 
application we would not see this kind of overhead. I'll try to figure out a 
way to measure that.

