pbacsko commented on PR #1043: URL: https://github.com/apache/yunikorn-core/pull/1043#issuecomment-3472465678
OK, I couldn't resist and tried this locally with `BenchmarkSchedulingThroughPut()`. There is a significant **decrease** in throughput, which baffled me a bit at first, but then I realized what was going on.

This test is synthetic, so not very realistic: the first node in the B-tree is always a "hit". That means this piece of code slows things down, because it always has to construct at least one batch (e.g. 24 nodes) and then walk through all of them even when that isn't needed. The first node of the batch is always the one we need, so we run predicate checks for the remaining 23 nodes unnecessarily, essentially wasting CPU cycles.

I'm thinking about how to make this more efficient for this case. Even if the fitting node is not the first one but, e.g., the 5th, we're still evaluating too many nodes. So we need some sort of threshold to make this improvement justifiable.

I think the "first node fit" case is one whose throughput is hard (or even impossible) to match with a parallel approach, simply due to the extra overhead. But I still suggest exploring different ways to make this more performant.
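To make the overhead argument concrete, here is a minimal Go sketch that counts predicate evaluations for a sequential first-fit walk, a batched parallel walk, and one possible threshold-based hybrid. Everything in it is an assumption made for illustration: `Node`, `fits`, `firstFitSequential`, `firstFitBatched`, and `firstFitHybrid` are hypothetical names and do not correspond to the code in this PR or to YuniKorn's actual node iteration. The hybrid is only one way to read the "threshold" idea, not a proposal the comment spells out.

```go
package main

import "sync"

// Node is a hypothetical stand-in for a schedulable node (not the YuniKorn type).
type Node struct {
	ID   int
	Free int
}

// fits is a hypothetical predicate: the node has enough free capacity for the ask.
func fits(n Node, ask int) bool { return n.Free >= ask }

// firstFitSequential walks the nodes in order and stops at the first fit.
// It returns the index of the fitting node (-1 if none) and the number of
// predicate evaluations performed.
func firstFitSequential(nodes []Node, ask int) (int, int) {
	for i, n := range nodes {
		if fits(n, ask) {
			return i, i + 1
		}
	}
	return -1, len(nodes)
}

// firstFitBatched evaluates the predicate for a whole batch in parallel and
// then picks the lowest-index fit. Every node of a batch is always evaluated,
// even when the first node of that batch already fits.
func firstFitBatched(nodes []Node, ask, batchSize int) (int, int) {
	if batchSize < 1 {
		batchSize = 1
	}
	evals := 0
	for start := 0; start < len(nodes); start += batchSize {
		end := start + batchSize
		if end > len(nodes) {
			end = len(nodes)
		}
		results := make([]bool, end-start)
		var wg sync.WaitGroup
		for i := start; i < end; i++ {
			wg.Add(1)
			go func(i int) {
				defer wg.Done()
				results[i-start] = fits(nodes[i], ask) // each goroutine writes its own slot
			}(i)
		}
		wg.Wait()
		evals += end - start
		for i, ok := range results {
			if ok {
				return start + i, evals
			}
		}
	}
	return -1, evals
}

// firstFitHybrid probes the first `threshold` nodes sequentially and only
// falls back to batching if none of them fits (an assumed heuristic).
func firstFitHybrid(nodes []Node, ask, threshold, batchSize int) (int, int) {
	if threshold < 0 {
		threshold = 0
	}
	if threshold > len(nodes) {
		threshold = len(nodes)
	}
	idx, evals := firstFitSequential(nodes[:threshold], ask)
	if idx >= 0 {
		return idx, evals
	}
	idx, more := firstFitBatched(nodes[threshold:], ask, batchSize)
	if idx >= 0 {
		idx += threshold
	}
	return idx, evals + more
}
```

With 48 nodes that all fit and a batch size of 24, the sequential walk performs 1 evaluation while the batched walk performs 24, which matches the 23 wasted checks described above. The sketch counts evaluations only; it says nothing about wall-clock time, where goroutine scheduling adds further overhead in the first-node-fit case.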
