dongjoon-hyun edited a comment on pull request #30876: URL: https://github.com/apache/spark/pull/30876#issuecomment-751577804
@mridulm, your comments are true and I see nothing wrong with them. I agree with you on every point, and we can disable this again for the problematic cases before the Apache Spark 3.2.0 vote. To do that, I believe we can agree that we need to identify the problematic corner cases in this thread. At the very least, we need to provide better documentation to the community about this option if some PMC members already have implicit knowledge of the reasons why this option should be prohibited in a YARN environment. That knowledge is invaluable for the community to share.

Besides, I'm continuing this discussion because your initial concerns are crucial to the community. AFAIK, nobody else has explicitly shared these concerns before.

1. [Especially in context of dynamic resource allocation, it can become very chatty when executor's start getting dropped.](https://github.com/apache/spark/pull/30876#discussion_r547383304)
2. [In the past, I found this to be noisy for the cases where replication was enabled.](https://github.com/apache/spark/pull/30876#discussion_r548191318)

Could you elaborate on your concerns more specifically?

1. What is the negative side effect of `very chatty` and `noisy`?
2. How severe was it?

Again, I'm aiming to protect the default value of the configuration. I'm trying to understand why this should be prohibited in some resource managers or in a normal Spark operating environment, and to make Apache Spark better for those cases. That's why I tried to go deeper on that part by proposing potential points and asking you similar questions in this thread. We will have many choices in Apache Spark 3.2.0 if more of the implicit knowledge is shared.

1. [For the following, Apache Spark usually drop only empty executors. If you are saying a storage timeout configuration, I believe that what we need is to improve storage timeout configuration behavior after this enabling. I guess storage timeout had better not cause any chatty situation, of course.](https://github.com/apache/spark/pull/30876#discussion_r547421217)
2. [I'm trying to understand the risk you mentioned for YARN environment. Could you give me more hints about your concerns on this at the YARN dynamic allocation situation? We can fix it the behavior and move forward if that's valid.](https://github.com/apache/spark/pull/30876#pullrequestreview-557444101)
3. [This is my focused use case and I love to hear your concerns. You have all my ears.](https://github.com/apache/spark/pull/30876#issuecomment-750471287)

So far, I haven't received explicit answers from you. Please let me know if I missed something.
