scsmithr commented on issue #7001:
URL: 
https://github.com/apache/arrow-datafusion/issues/7001#issuecomment-1638315502

   ๐Ÿ‘ for this.
   
   I poked around with removing the gate in the distribution channel 
implementation. My challenges with the gate implementation were high lock 
contention, as well as high memory usage when inputs are highly skewed (e.g. 
one channel receives the bulk of the batches, but the gate still lets batches 
in because the other channels are empty).
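   
   As a rough illustration of the skew problem, here is a toy model. It is 
not the actual DataFusion code: `GatedChannels`, `skewed_backlog_gated` and 
`skewed_backlog_bounded` are hypothetical names, and the rule "the gate is 
open while any channel is empty" is a simplification of the behaviour 
described above. It compares that rule with a plain per-channel buffer limit 
when every batch lands on channel 0 and nothing is consumed.

```rust
use std::collections::VecDeque;

/// Toy model of a gated design (hypothetical, not DataFusion's types):
/// a send is accepted while ANY channel is empty.
struct GatedChannels {
    queues: Vec<VecDeque<u64>>,
}

impl GatedChannels {
    fn new(n: usize) -> Self {
        Self { queues: (0..n).map(|_| VecDeque::new()).collect() }
    }

    /// The gate stays open as long as at least one channel has no batches.
    fn gate_open(&self) -> bool {
        self.queues.iter().any(|q| q.is_empty())
    }

    /// Returns false when the gate is closed (a real sender would wait).
    fn try_send(&mut self, channel: usize, batch: u64) -> bool {
        if !self.gate_open() {
            return false;
        }
        self.queues[channel].push_back(batch);
        true
    }

    /// Total number of batches currently buffered across all channels.
    fn buffered(&self) -> usize {
        self.queues.iter().map(|q| q.len()).sum()
    }
}

/// Send every batch to channel 0 with no consumer running, gated rule.
fn skewed_backlog_gated(n_channels: usize, n_batches: u64) -> usize {
    let mut chans = GatedChannels::new(n_channels);
    for b in 0..n_batches {
        chans.try_send(0, b);
    }
    chans.buffered()
}

/// Same skewed workload, but each channel has its own fixed capacity
/// (assumed alternative: a per-channel buffer size).
fn skewed_backlog_bounded(n_channels: usize, n_batches: u64, capacity: usize) -> usize {
    let mut queues: Vec<VecDeque<u64>> =
        (0..n_channels).map(|_| VecDeque::new()).collect();
    for b in 0..n_batches {
        if queues[0].len() < capacity {
            queues[0].push_back(b);
        }
    }
    queues.iter().map(|q| q.len()).sum()
}

fn main() {
    println!("gated:   {}", skewed_backlog_gated(4, 1000));
    println!("bounded: {}", skewed_backlog_bounded(4, 1000, 8));
}
```

   In this model the gated variant buffers all 1000 batches, because the 
other three channels never fill, while the bounded variant stops at the 
per-channel capacity.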
   
   Some experimentation here: 
https://github.com/GlareDB/arrow-datafusion/blob/repart-perf/datafusion/core/src/physical_plan/repartition/distributor_channels.rs.
 Note that this hangs on some of the repartition tests, so there's likely a bug 
in the futures logic.
   
   With a high enough channel buffer size, we start to see performance 
increases similar to those in the flume PR:
   
   ```
   โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“
   โ”ƒ Query        โ”ƒ     main โ”ƒ repart-perf โ”ƒ        Change โ”ƒ
   โ”กโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ฉ
   โ”‚ QQuery 1     โ”‚ 309.43ms โ”‚    309.24ms โ”‚     no change โ”‚
   โ”‚ QQuery 2     โ”‚  47.65ms โ”‚     48.91ms โ”‚     no change โ”‚
   โ”‚ QQuery 3     โ”‚ 129.55ms โ”‚    109.39ms โ”‚ +1.18x faster โ”‚
   โ”‚ QQuery 4     โ”‚  81.07ms โ”‚     51.09ms โ”‚ +1.59x faster โ”‚
   โ”‚ QQuery 5     โ”‚ 158.61ms โ”‚    123.25ms โ”‚ +1.29x faster โ”‚
   โ”‚ QQuery 6     โ”‚  79.99ms โ”‚     80.40ms โ”‚     no change โ”‚
   โ”‚ QQuery 7     โ”‚ 218.47ms โ”‚    202.31ms โ”‚ +1.08x faster โ”‚
   โ”‚ QQuery 8     โ”‚ 184.88ms โ”‚    170.81ms โ”‚ +1.08x faster โ”‚
   โ”‚ QQuery 9     โ”‚ 266.72ms โ”‚    212.60ms โ”‚ +1.25x faster โ”‚
   โ”‚ QQuery 10    โ”‚ 207.46ms โ”‚    149.29ms โ”‚ +1.39x faster โ”‚
   โ”‚ QQuery 11    โ”‚  44.44ms โ”‚     44.40ms โ”‚     no change โ”‚
   โ”‚ QQuery 12    โ”‚ 150.27ms โ”‚    119.97ms โ”‚ +1.25x faster โ”‚
   โ”‚ QQuery 13    โ”‚ 227.68ms โ”‚    229.70ms โ”‚     no change โ”‚
   โ”‚ QQuery 14    โ”‚ 113.26ms โ”‚    111.97ms โ”‚     no change โ”‚
   โ”‚ QQuery 15    โ”‚  81.23ms โ”‚     81.10ms โ”‚     no change โ”‚
   โ”‚ QQuery 16    โ”‚  51.57ms โ”‚     51.50ms โ”‚     no change โ”‚
   โ”‚ QQuery 17    โ”‚ 338.65ms โ”‚    305.00ms โ”‚ +1.11x faster โ”‚
   โ”‚ QQuery 18    โ”‚ 379.12ms โ”‚    370.23ms โ”‚     no change โ”‚
   โ”‚ QQuery 19    โ”‚ 233.23ms โ”‚    234.30ms โ”‚     no change โ”‚
   โ”‚ QQuery 20    โ”‚ 132.11ms โ”‚    129.32ms โ”‚     no change โ”‚
   โ”‚ QQuery 21    โ”‚ 324.30ms โ”‚    256.37ms โ”‚ +1.26x faster โ”‚
   โ”‚ QQuery 22    โ”‚  50.06ms โ”‚     43.18ms โ”‚ +1.16x faster โ”‚
   โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
   ```
   
   (On an M2 MacBook Air)
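   
   For reading the Change column: the figures are consistent with the ratio 
of the `main` time to the `repart-perf` time, rounded to two decimals. The 
sketch below reproduces a few rows under that reading. The 5% cutoff for 
"no change" and the wording of the "slower" label are assumptions that fit 
the rows shown, not something taken from the benchmark tooling, and 
`change_label` is a hypothetical helper.

```rust
/// Label a pair of timings the way the Change column appears to.
/// ASSUMPTION: differences within 5% are reported as "no change".
fn change_label(main_ms: f64, branch_ms: f64) -> String {
    const THRESHOLD: f64 = 0.05; // assumed cutoff, not from the source
    let ratio = main_ms / branch_ms;
    if ratio > 1.0 + THRESHOLD {
        format!("+{:.2}x faster", ratio)
    } else if ratio < 1.0 - THRESHOLD {
        // Assumed wording for regressions; none appear in the table.
        format!("{:.2}x slower", 1.0 / ratio)
    } else {
        "no change".to_string()
    }
}

fn main() {
    // Timings copied from the rows for queries 4, 3 and 2.
    println!("Q4: {}", change_label(81.07, 51.09));
    println!("Q3: {}", change_label(129.55, 109.39));
    println!("Q2: {}", change_label(47.65, 48.91));
}
```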
   
   Not trying to push for any one approach, and this was mostly just an 
experiment.

