pbacsko commented on PR #1043:
URL: https://github.com/apache/yunikorn-core/pull/1043#issuecomment-3472465678

   OK, I couldn't resist and tried this locally with 
`BenchmarkSchedulingThroughPut()`.
   
   There is a significant **decrease** in throughput, which baffled me a bit, 
but then I realized what was going on. This test is synthetic and not very 
realistic: the first node in the B-tree is always a "hit". That makes this 
piece of code slower, because it always has to construct at least one batch 
(e.g. 24 nodes) and then walk through all of its nodes, even when that is not 
needed. The first node of the batch is always the one we need, so we verify 
predicates for the remaining 23 nodes unnecessarily, essentially wasting CPU 
cycles.
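
   To make the cost concrete, here is a minimal Go sketch that only counts 
predicate evaluations. It is not code from the PR: the `node` type and both 
functions are hypothetical, and the batched variant runs in a plain loop 
because only the total amount of work matters here, not the parallelism.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a scheduler node. fits says whether
// the predicates would pass for the allocation being placed.
type node struct {
	fits bool
}

// firstFitSequential checks nodes one by one and stops at the first fit.
// It returns the index of the chosen node (-1 if none) and the number of
// predicate evaluations performed.
func firstFitSequential(nodes []node) (int, int) {
	evals := 0
	for i, n := range nodes {
		evals++
		if n.fits {
			return i, evals
		}
	}
	return -1, evals
}

// firstFitBatched evaluates every node of a batch before it picks the first
// fit inside that batch, which models the work a batch-based approach does.
func firstFitBatched(nodes []node, batchSize int) (int, int) {
	if batchSize < 1 {
		batchSize = 1
	}
	evals := 0
	for start := 0; start < len(nodes); start += batchSize {
		end := start + batchSize
		if end > len(nodes) {
			end = len(nodes)
		}
		chosen := -1
		for i := start; i < end; i++ {
			evals++ // the whole batch is evaluated, hit or not
			if nodes[i].fits && chosen == -1 {
				chosen = i
			}
		}
		if chosen != -1 {
			return chosen, evals
		}
	}
	return -1, evals
}

func main() {
	// Synthetic case from the benchmark: every node fits, so the first is a hit.
	nodes := make([]node, 100)
	for i := range nodes {
		nodes[i] = node{fits: true}
	}
	_, seq := firstFitSequential(nodes)
	_, bat := firstFitBatched(nodes, 24)
	fmt.Printf("sequential: %d evaluation(s), batched: %d\n", seq, bat)
}
```

   With every node fitting and a batch size of 24, this prints 
`sequential: 1 evaluation(s), batched: 24`.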
   
   I'm thinking about how to make this more efficient for this case. Even if 
the first suitable node is not the very first one but, e.g., the 5th, we're 
still evaluating too many nodes. So we need some sort of threshold to make 
this improvement justifiable. I think the "first node fit" is a case whose 
throughput is hard (or even impossible) to match with a parallel approach, 
simply due to the extra overhead. But I still suggest exploring different ways 
to make this more performant.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
