HashidaTKS commented on a change in pull request #10527:
URL: https://github.com/apache/arrow/pull/10527#discussion_r668329358



##########
File path: csharp/src/Apache.Arrow/Ipc/ArrowStreamWriter.cs
##########
@@ -248,6 +263,13 @@ private protected void WriteRecordBatchInternal(RecordBatch recordBatch)
                 HasWrittenSchema = true;
             }
 
+            if (!HasWrittenDictionaryBatch)
+            {
+                DictionaryCollector.Collect(recordBatch, ref _dictionaryMemo);

Review comment:
       The benchmark results are below.
   I don't think this change has much impact on performance.
   How does that look?
   
   before this change:
   
   |                                    Method |   Count |       Mean |     Error |     StdDev |     Median | Gen 0 | Gen 1 | Gen 2 |   Allocated |
   |------------------------------------------ |-------- |-----------:|----------:|-----------:|-----------:|------:|------:|------:|------------:|
   |               ArrowReaderWithMemoryStream |   10000 |   2.869 ms | 0.0561 ms |  0.0982 ms |   2.841 ms |     - |     - |     - |    10.09 KB |
   | ArrowReaderWithMemoryStream_ManagedMemory |   10000 |   2.371 ms | 0.0474 ms |  0.1116 ms |   2.335 ms |     - |     - |     - |  1444.86 KB |
   |                     ArrowReaderWithMemory |   10000 |   2.122 ms | 0.1173 ms |  0.3365 ms |   2.160 ms |     - |     - |     - |     7.67 KB |
   |               ArrowReaderWithMemoryStream | 1000000 | 220.505 ms | 4.3805 ms | 12.5684 ms | 219.305 ms |     - |     - |     - |     8.13 KB |
   | ArrowReaderWithMemoryStream_ManagedMemory | 1000000 | 211.168 ms | 4.1733 ms | 11.1395 ms | 206.210 ms |     - |     - |     - | 142425.4 KB |
   |                     ArrowReaderWithMemory | 1000000 | 166.486 ms | 3.9192 ms | 11.5558 ms | 165.832 ms |     - |     - |     - |     7.59 KB |
   
   |     Method | BatchLength | ColumnSetCount |       Mean |     Error |    StdDev |     Median | Gen 0 | Gen 1 | Gen 2 | Allocated |
   |----------- |------------ |--------------- |-----------:|----------:|----------:|-----------:|------:|------:|------:|----------:|
   | WriteBatch |       10000 |             10 |   2.844 ms | 0.1597 ms | 0.4633 ms |   2.732 ms |     - |     - |     - |  78.06 KB |
   | WriteBatch |       10000 |             14 |   3.913 ms | 0.2181 ms | 0.6362 ms |   3.692 ms |     - |     - |     - | 112.16 KB |
   | WriteBatch |     1000000 |             10 | 239.486 ms | 4.7245 ms | 4.4193 ms | 239.822 ms |     - |     - |     - |   79.1 KB |
   | WriteBatch |     1000000 |             14 | 331.087 ms | 6.2670 ms | 7.4604 ms | 331.364 ms |     - |     - |     - | 111.48 KB |
   
   
   
   after this change:
   
   |                                    Method |   Count |       Mean |     Error |     StdDev |     Median |     Gen 0 | Gen 1 | Gen 2 |    Allocated |
   |------------------------------------------ |-------- |-----------:|----------:|-----------:|-----------:|----------:|------:|------:|-------------:|
   |               ArrowReaderWithMemoryStream |   10000 |   2.820 ms | 0.0560 ms |  0.1319 ms |   2.788 ms | 1000.0000 |     - |     - |     10.27 KB |
   | ArrowReaderWithMemoryStream_ManagedMemory |   10000 |   2.376 ms | 0.0475 ms |  0.1252 ms |   2.329 ms |         - |     - |     - |   1445.03 KB |
   |                     ArrowReaderWithMemory |   10000 |   2.233 ms | 0.0305 ms |  0.0255 ms |   2.235 ms |         - |     - |     - |      7.81 KB |
   |               ArrowReaderWithMemoryStream | 1000000 | 206.608 ms | 4.1289 ms | 10.5095 ms | 201.813 ms |         - |     - |     - |       8.3 KB |
   | ArrowReaderWithMemoryStream_ManagedMemory | 1000000 | 226.408 ms | 4.4877 ms | 11.7435 ms | 222.866 ms |         - |     - |     - | 142425.57 KB |
   |                     ArrowReaderWithMemory | 1000000 | 149.248 ms | 2.3519 ms |  5.1625 ms | 147.150 ms |         - |     - |     - |      7.73 KB |
   
   
   |     Method | BatchLength | ColumnSetCount |       Mean |     Error |     StdDev |     Median | Gen 0 | Gen 1 | Gen 2 | Allocated |
   |----------- |------------ |--------------- |-----------:|----------:|-----------:|-----------:|------:|------:|------:|----------:|
   | WriteBatch |       10000 |             10 |   2.794 ms | 0.1591 ms |  0.4566 ms |   2.639 ms |     - |     - |     - |  78.97 KB |
   | WriteBatch |       10000 |             14 |   3.991 ms | 0.2347 ms |  0.6772 ms |   3.776 ms |     - |     - |     - | 113.35 KB |
   | WriteBatch |     1000000 |             10 | 245.601 ms | 8.2668 ms | 23.8516 ms | 240.295 ms |     - |     - |     - |  77.97 KB |
   | WriteBatch |     1000000 |             14 | 317.090 ms | 6.2694 ms | 13.4956 ms | 317.577 ms |     - |     - |     - | 112.35 KB |
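
One rough way to sanity-check the "not much impact" reading is to compute the relative change in Mean between the two runs. The sketch below is only an illustration and is not part of the PR: the numbers are copied from the tables above, and the helper name `percent_change` and the row labels are made up for this example.

```python
# Mean times in milliseconds copied from the benchmark tables above,
# stored as (before, after). Keys are (benchmark label, row count).
means_ms = {
    ("ArrowReaderWithMemoryStream", 10000): (2.869, 2.820),
    ("ArrowReaderWithMemoryStream_ManagedMemory", 10000): (2.371, 2.376),
    ("ArrowReaderWithMemory", 10000): (2.122, 2.233),
    ("ArrowReaderWithMemoryStream", 1000000): (220.505, 206.608),
    ("ArrowReaderWithMemoryStream_ManagedMemory", 1000000): (211.168, 226.408),
    ("ArrowReaderWithMemory", 1000000): (166.486, 149.248),
    ("WriteBatch (ColumnSetCount=10)", 10000): (2.844, 2.794),
    ("WriteBatch (ColumnSetCount=14)", 10000): (3.913, 3.991),
    ("WriteBatch (ColumnSetCount=10)", 1000000): (239.486, 245.601),
    ("WriteBatch (ColumnSetCount=14)", 1000000): (331.087, 317.090),
}


def percent_change(before, after):
    """Relative change of `after` versus `before`, in percent."""
    return (after - before) / before * 100.0


# Negative values mean the run after the change was faster.
changes = {key: percent_change(b, a) for key, (b, a) in means_ms.items()}

for (name, count), pct in changes.items():
    print(f"{name:45s} {count:>8d} {pct:+6.1f}%")
```

The differences come out with mixed signs, and the WriteBatch rows, which exercise the writer path touched here, all stay within 5% of the earlier Mean. A single run on each side cannot separate such differences from noise, so this only supports the reading loosely.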



