[GitHub] [arrow] wesm edited a comment on pull request #7506: ARROW-9197: [C++] Overhaul integer/floating point casting: vectorize truncation checks, reduce binary size

2020-06-22 Thread GitBox


wesm edited a comment on pull request #7506:
URL: https://github.com/apache/arrow/pull/7506#issuecomment-647633470


   Here's the benchmark comparison with clang-8
   
   ```
   $ archery benchmark diff --cc=clang-8 --cxx=clang++-8 2db48b4 653817301 --benchmark-filter=Cast
       benchmark                                    baseline           contender  change %  regression
   41  CastUInt32ToInt32Safe/262144/1     769.933m items/sec    4.474b items/sec   481.076  False
   62  CastUInt32ToInt32Safe/262144/0     409.277m items/sec    2.189b items/sec   434.792  False
   15  CastUInt32ToInt32Safe/32768/0      399.127m items/sec    2.089b items/sec   423.357  False
   7   CastUInt32ToInt32Safe/32768/1      742.226m items/sec    3.721b items/sec   401.307  False
   18  CastInt64ToInt32Safe/262144/0      341.403m items/sec    1.569b items/sec   359.706  False
   55  CastUInt32ToInt32Safe/262144/1000  351.335m items/sec    1.612b items/sec   358.917  False
   16  CastUInt32ToInt32Safe/32768/1000   334.147m items/sec    1.484b items/sec   344.139  False
   47  CastInt64ToInt32Safe/32768/0       328.491m items/sec    1.414b items/sec   330.412  False
   30  CastInt64ToInt32Safe/262144/1      742.497m items/sec    3.131b items/sec   321.638  False
   51  CastInt64ToInt32Safe/262144/1000   304.244m items/sec    1.226b items/sec   302.928  False
   42  CastInt64ToInt32Safe/32768/1000    288.976m items/sec    1.101b items/sec   280.942  False
   32  CastInt64ToInt32Safe/32768/1       706.339m items/sec    2.552b items/sec   261.273  False
   45  CastDoubleToInt32Safe/262144/1     924.369m items/sec    2.997b items/sec   224.214  False
   54  CastInt64ToDoubleSafe/262144/0     419.319m items/sec    1.324b items/sec   215.783  False
   11  CastInt64ToDoubleSafe/32768/0      408.425m items/sec    1.216b items/sec   197.672  False
   49  CastDoubleToInt32Safe/262144/2     207.799m items/sec  614.658m items/sec   195.795  False
   9   CastDoubleToInt32Safe/32768/2      202.480m items/sec  584.558m items/sec   188.699  False
   23  CastInt64ToDoubleSafe/262144/1000  375.572m items/sec    1.078b items/sec   186.948  False
   50  CastDoubleToInt32Safe/32768/1      869.447m items/sec    2.445b items/sec   181.248  False
   21  CastInt64ToDoubleSafe/262144/1     790.625m items/sec    2.222b items/sec   181.101  False
   59  CastDoubleToInt32Safe/262144/1000  360.792m items/sec    1.013b items/sec   180.714  False
   44  CastInt64ToDoubleSafe/32768/1000   360.492m items/sec  988.897m items/sec   174.319  False
   48  CastDoubleToInt32Safe/32768/1000   349.576m items/sec  932.771m items/sec   166.829  False
   58  CastDoubleToInt32Safe/262144/0     407.159m items/sec    1.067b items/sec   162.086  False
   63  CastInt64ToDoubleSafe/32768/1      746.561m items/sec    1.893b items/sec   153.520  False
   8   CastDoubleToInt32Safe/32768/0      395.857m items/sec  990.704m items/sec   150.268  False
   67  CastDoubleToInt32Safe/262144/10    275.237m items/sec  612.002m items/sec   122.354  False
   10  CastDoubleToInt32Safe/32768/10     266.596m items/sec  583.346m items/sec   118.813  False
   69  CastInt64ToInt32Safe/32768/10      232.545m items/sec  449.883m items/sec    93.461  False
   64  CastUInt32ToInt32Safe/32768/10     256.845m items/sec  482.636m items/sec    87.909  False
   61  CastInt64ToInt32Safe/262144/10     243.012m items/sec  435.232m items/sec    79.099  False
   0   CastUInt32ToInt32Safe/262144/10    264.244m items/sec  466.981m items/sec    76.723  False
   53  CastInt64ToDoubleSafe/32768/10     278.548m items/sec  441.752m items/sec    58.591  False
   1   CastInt64ToDoubleSafe/262144/10    283.181m items/sec  431.990m items/sec    52.549  False
   14  CastInt64ToInt32Safe/32768/2       170.844m items/sec  224.195m items/sec    31.228  False
   37  CastUInt32ToInt32Safe/32768/2      182.246m items/sec  238.051m items/sec    30.621  False
   27  CastUInt32ToInt32Safe/262144/2     187.277m items/sec  231.385m items/sec    23.553  False
   28  CastInt64ToInt32Safe/262144/2      175.893m items/sec  216.887m items/sec    23.306  False
   26  CastInt64ToDoubleSafe/32768/2      189.465m items/sec  228.996m items/sec    20.864  False
   3   CastInt64ToDoubleSafe/262144/2     192.523m items/sec  219.324m items/sec    13.921  False
   36  CastInt64ToInt32Unsafe/32768/0       2.993b items/sec    3.227b items/sec     7.800  False
   35  CastInt64ToInt32Unsafe/32768/2       2.937b items/sec    3.154b items/sec     7.367  False
   68  CastInt64ToInt32Unsafe/32768/1       2.966b items/sec    3.176b items/sec     7.088  False
   43  CastInt64ToInt32Unsafe/32768/10      2.940b item
   ```

[GitHub] [arrow] wesm edited a comment on pull request #7506: ARROW-9197: [C++] Overhaul integer/floating point casting: vectorize truncation checks, reduce binary size

2020-06-21 Thread GitBox


wesm edited a comment on pull request #7506:
URL: https://github.com/apache/arrow/pull/7506#issuecomment-647233286


   @emkornfield I agree. Realistically we're going to have to look at them 
both. FWIW, in this particular case it seems that the Clang performance is the 
most representative of how things are behaving across platforms. Stuff that 
autovectorizes well in gcc may not do much at all on MSVC. There's also the 
question of how `__builtin_expect` impacts optimizations. But in general I 
don't think we should be using the "ursabot benchmark" results (which use gcc) 
to draw conclusions about which perf optimizations are working.
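
To make the autovectorization and `__builtin_expect` points concrete, here is a minimal C++ sketch of two ways to write an int64-to-int32 truncation check. It is an illustration only, not the code in this PR: the names `SKETCH_UNLIKELY`, `FitsInt32Branchy`, `FitsInt32Branchless`, and `CastInt64ToInt32Checked` are hypothetical, and whether either loop actually vectorizes depends on the compiler and flags, which is why each compiler has to be benchmarked separately.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical macro for this sketch. __builtin_expect exists in GCC and
// Clang only, so other compilers (e.g. MSVC) get a plain condition.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNLIKELY(x) (__builtin_expect(!!(x), 0))
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

// Per-element branch with early exit. The data-dependent exit can make the
// loop harder for a compiler to autovectorize.
inline bool FitsInt32Branchy(const std::int64_t* values, std::size_t n) {
  constexpr std::int64_t kMin = std::numeric_limits<std::int32_t>::min();
  constexpr std::int64_t kMax = std::numeric_limits<std::int32_t>::max();
  for (std::size_t i = 0; i < n; ++i) {
    if (SKETCH_UNLIKELY(values[i] < kMin || values[i] > kMax)) {
      return false;
    }
  }
  return true;
}

// Branch-free variant: OR every out-of-range flag into one accumulator and
// test it once after the loop. The loop body has no control flow.
inline bool FitsInt32Branchless(const std::int64_t* values, std::size_t n) {
  constexpr std::int64_t kMin = std::numeric_limits<std::int32_t>::min();
  constexpr std::int64_t kMax = std::numeric_limits<std::int32_t>::max();
  int bad = 0;
  for (std::size_t i = 0; i < n; ++i) {
    bad |= (values[i] < kMin) | (values[i] > kMax);
  }
  return bad == 0;
}

// Check the whole block first, then run an unchecked narrowing loop.
// Returns false (leaving `out` untouched) if any value would be truncated.
inline bool CastInt64ToInt32Checked(const std::int64_t* in, std::size_t n,
                                    std::int32_t* out) {
  if (SKETCH_UNLIKELY(!FitsInt32Branchless(in, n))) {
    return false;
  }
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = static_cast<std::int32_t>(in[i]);
  }
  return true;
}
```

Both checks return the same answer for any input; they differ only in loop shape. Comparing the generated code for the two under gcc, clang, and MSVC is one way to see how much a given result depends on the compiler rather than on the optimization itself.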



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org