holdenk commented on PR #39825: URL: https://github.com/apache/spark/pull/39825#issuecomment-1419876529
Ok @dongjoon-hyun, I hear the -1 concern around excessive scale-up, but blocking scale-up until snapshots are delivered (what we are doing today) seems like kind of a hack. What if we use the "allocation batch size" to gate it, as @attilapiros suggested? The other option would be to put this behind a feature flag that is off by default (e.g. `enableAllocationWithPendingPods`, set to `false` by default).

The current logic in the master branch seems to be based on 0343854f54b48b206ca434accec99355011560c2 / https://github.com/apache/spark/pull/25236 from @vanzin, which introduced `hasPendingPods` (master now tracks the number of pending pods instead), and it seems in line with the stated goal of that change ("More responsive dynamic allocation with K8S"). I don't see anything mentioned in it about reducing the number of pending resources in EKS, but maybe there was a discussion off-PR/off-list that I don't see.

Do any of those work for your concern?

Just as an aside, I'm a little surprised by a veto ( https://spark.apache.org/committers.html / https://www.apache.org/foundation/voting.html ) this early in the conversation around a proposed change. Is there some context I'm missing? Did y'all get a stuck cluster with the previous behavior?
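
To make the two options concrete, here is a rough sketch in Python. It is a simplified model and not Spark's actual allocator code: the function name, its parameters, and the exact gating rule are assumptions made for illustration, and `allow_with_pending` only stands in for the proposed `enableAllocationWithPendingPods` flag. The idea shown is that with the flag off, any pending pod blocks further requests, while with the flag on, the number of outstanding pending pods is capped at one allocation batch.

```python
def executors_to_request(target, running, pending, batch_size,
                         allow_with_pending=False):
    """Hypothetical gate for one allocation round (illustrative only).

    target:  desired total number of executors
    running: executors already running
    pending: executor pods requested but not yet running
    batch_size: stand-in for the "allocation batch size"
    allow_with_pending: stand-in for the proposed default-off flag
    """
    # Executors still needed once running and pending pods are counted.
    missing = max(target - running - pending, 0)
    if missing == 0:
        return 0
    # Flag off: do not request more while any pod is still pending.
    if pending > 0 and not allow_with_pending:
        return 0
    # Flag on (or nothing pending): keep at most one batch outstanding.
    headroom = max(batch_size - pending, 0)
    return min(missing, headroom)


if __name__ == "__main__":
    # Nothing pending: request up to one batch.
    print(executors_to_request(10, 2, 0, 5))
    # Pods pending, flag off: hold back.
    print(executors_to_request(10, 2, 3, 5))
    # Pods pending, flag on: top up to the batch size.
    print(executors_to_request(10, 2, 3, 5, allow_with_pending=True))
```

Under this model the batch size bounds how many pods can sit in the pending state at once, which is one way to address the excessive scale-up concern without waiting on snapshot delivery.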
