Hi Jay

As far as I am aware, Spark 2.4.4 has no feature to enable executor
decommissioning with a graceful shutdown, nor a way to specify a timeout
after which executors are forcefully killed. These capabilities arrived in
the Spark 3.x line; as noted further down this thread, decommissioning on
Kubernetes and standalone became GA in Spark 3.1.
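
On 2.4.4 there is no configuration knob for the pod deletion grace period
you mention; as far as I know, the only way to get behaviour like
pods().withGracePeriod(30).delete() on that version would be to patch the
Kubernetes scheduler backend in the Spark source and rebuild.

If you can move to Spark 3.1 or later, the decommissioning settings look
roughly like the sketch below. Please treat it as a sketch only:
spark.executor.decommission.forceKillTimeout is the config Sean linked to
in the Spark source further down this thread, while
spark.decommission.enabled and the spark.storage.decommission.* properties
are from memory and should be verified against the configuration page for
the version you deploy.

from pyspark.sql import SparkSession

# Sketch for Spark 3.1+; check the property names against
# https://spark.apache.org/docs/latest/configuration.html for your version.
spark = (
    SparkSession.builder
    .appName("executor-decommissioning-sketch")
    # main switch that turns executor decommissioning on
    .config("spark.decommission.enabled", "true")
    # migrate cached RDD / shuffle blocks off the executor before it exits
    .config("spark.storage.decommission.enabled", "true")
    .config("spark.storage.decommission.rddBlocks.enabled", "true")
    .config("spark.storage.decommission.shuffleBlocks.enabled", "true")
    # hard limit after which a decommissioning executor is forcefully killed
    .config("spark.executor.decommission.forceKillTimeout", "100s")
    .getOrCreate()
)

The same properties can be passed with --conf on spark-submit instead of
in the builder.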


HTH


Mich Talebzadeh,

Architect | Data Engineer | Data Science | Financial Crime
PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College
London <https://en.wikipedia.org/wiki/Imperial_College_London>
London, United Kingdom


   View my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but cannot be guaranteed. As with any advice, remember that "one
test result is worth one thousand expert opinions" (Wernher von Braun
<https://en.wikipedia.org/wiki/Wernher_von_Braun>).


On Thu, 10 Oct 2024 at 14:27, Jay Han <tunyu...@gmail.com> wrote:

> Thank you for your prompt reply. By the way, I am utilizing Spark 2.4.4.
> My question is as follows:
> When the Spark driver tries to remove executors from Kubernetes, it
> invokes pods().delete() without specifying a grace period, regardless of
> whether it's due to job failure or success. If this is an intentional
> design by the Spark developers, is it possible for me to alter this default
> behavior, like pods().gracePeriod(30).delete()?
>
>
>
> Mich Talebzadeh <mich.talebza...@gmail.com> 于2024年10月10日周四 16:22写道:
>
>> to be clear are you referring to these
>>
>> spark.executor.decommission.enabled=true
>> spark.executor.decommission.gracefulShutdown=true
>>
>> thanks
>>
>> Mich Talebzadeh,
>>
>> Architect | Data Engineer | Data Science | Financial Crime
>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial
>> College London <https://en.wikipedia.org/wiki/Imperial_College_London>
>> London, United Kingdom
>>
>>
>>    view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>
>>
>>
>> *Disclaimer:* The information provided is correct to the best of my
>> knowledge but of course cannot be guaranteed . It is essential to note
>> that, as with any advice, quote "one test result is worth one-thousand
>> expert opinions (Werner
>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von Braun
>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>
>>
>> On Thu, 10 Oct 2024 at 09:03, Ángel <angel.alvarez.pas...@gmail.com>
>> wrote:
>>
>>> Do you know by any chance if that config also applies to Databricks?
>>>
>>> El jue, 10 oct 2024 a las 10:02, Ángel (<angel.alvarez.pas...@gmail.com>)
>>> escribió:
>>>
>>>> Thanks a lot for the clarification. Interesting... I've never needed
>>>> it, even though I've been using Spark for over 8 years.
>>>>
>>>> El jue, 10 oct 2024 a las 9:21, Liu Cao (<twcnnj...@gmail.com>)
>>>> escribió:
>>>>
>>>>> I’m unclear on what the exact issue the OP ran into.
>>>>>
>>>>> But if we are talking about decommission, just one side note:
>>>>>
>>>>> The decommission feature
>>>>> <https://docs.google.com/document/d/1xVO1b6KAwdUhjEJBolVPl9C6sLj7oOveErwDSYdT-pE/edit?tab=t.0#heading=h.70pylrl2vdg8>
>>>>> has been in spark for a while, and decommission on K8S and standalone was
>>>>> GA-ed in 3.1 actually. See
>>>>> https://spark.apache.org/releases/spark-release-3-1-1.html
>>>>>
>>>>> The fact that you didn’t see it in the 3.3 site is simply a lack of
>>>>> documentation. The missing documentation was added in 3.4, thanks to
>>>>> https://github.com/apache/spark/pull/38131/files
>>>>>
>>>>> On Wed, Oct 9, 2024 at 10:13 PM Ángel <angel.alvarez.pas...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> The Synapse config "spark.yarn.executor.decommission.enabled"  is the
>>>>>> closest thing to the proposed config 
>>>>>> "spark.executor.decommission.enabled"
>>>>>> we've seen so far, I was only remarking that.
>>>>>>
>>>>>> On the other hand, it seems like the "decommission" config came out in
>>>>>> Spark 3.4.0:
>>>>>>
>>>>>> https://archive.apache.org/dist/spark/docs/3.3.4/configuration.html
>>>>>> https://archive.apache.org/dist/spark/docs/3.4.0/configuration.html
>>>>>>
>>>>>>
>>>>>>
>>>>>> El jue, 10 oct 2024 a las 4:44, Jungtaek Lim (<
>>>>>> kabhwan.opensou...@gmail.com>) escribió:
>>>>>>
>>>>>>> Ángel,
>>>>>>>
>>>>>>> https://spark.apache.org/docs/latest/configuration.html
>>>>>>> search through `decommission` in this page.
>>>>>>>
>>>>>>> The config you may have found from Synapse is
>>>>>>> spark."yarn".executor.decommission.enabled. And the question was even 
>>>>>>> "k8s"
>>>>>>> and none of the information for the vendor was mentioned from the 
>>>>>>> question.
>>>>>>> I don't even think these configs are on the internet - simply saying,
>>>>>>> google them (be sure to wrap with double quotes).
>>>>>>>
>>>>>>> Interestingly, I wouldn't even expect the config for graceful
>>>>>>> shutdown for decommission. The functionality Spark provides for
>>>>>>> "decommission" is basically a "graceful shutdown" of the executor. It
>>>>>>> sounds to me as redundant.
>>>>>>>
>>>>>>> On Thu, Oct 10, 2024 at 11:11 AM Ángel <
>>>>>>> angel.alvarez.pas...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Looks like it actually exists ... but only for the Spark Synapse
>>>>>>>> implementation ...
>>>>>>>>
>>>>>>>>
>>>>>>>> https://learn.microsoft.com/en-us/answers/questions/1496283/purpose-of-spark-yarn-executor-decommission-enable
>>>>>>>>
>>>>>>>>
>>>>>>>> Jay Han was asking for some config on k8s, so .... we shouldn't
>>>>>>>> bring this config to the table, should we?
>>>>>>>>
>>>>>>>> El jue, 10 oct 2024 a las 2:55, Sean Owen (<sro...@gmail.com>)
>>>>>>>> escribió:
>>>>>>>>
>>>>>>>>> Mich: you can set any key-value pair you want in Spark config. It
>>>>>>>>> doesn't mean it is a real flag that code reads.
>>>>>>>>>
>>>>>>>>> spark.conf.set("ham", "sandwich")
>>>>>>>>> print(spark.conf.get("ham"))
>>>>>>>>>
>>>>>>>>> prints "sandwich"
>>>>>>>>>
>>>>>>>>> forceKillTimeout is a real config:
>>>>>>>>>
>>>>>>>>> https://github.com/apache/spark/blob/fed9a8da3d4187794161e0be325aa96be8487783/core/src/main/scala/org/apache/spark/internal/config/package.scala#L2394
>>>>>>>>>
>>>>>>>>> The others I cannot find, as in:
>>>>>>>>>
>>>>>>>>> https://github.com/search?q=repo%3Aapache%2Fspark%20spark.executor.decommission.gracefulShutdown&type=code
>>>>>>>>>
>>>>>>>>> If you're continuing to suggest these are real configs, where are
>>>>>>>>> you finding those two in any docs or source?
>>>>>>>>> Or what config were you thinking of if it's a typo?
>>>>>>>>>
>>>>>>>>> On Wed, Oct 9, 2024 at 5:14 PM Mich Talebzadeh <
>>>>>>>>> mich.talebza...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Let us take this for a ride using these so called non-existent
>>>>>>>>>> configuration settings
>>>>>>>>>>
>>>>>>>>>> spark.executor.decommission.enabled=true
>>>>>>>>>> spark.executor.decommission.gracefulShutdown=true
>>>>>>>>>>
>>>>>>>>>> Tested on Spark 3.4
>>>>>>>>>>
>>>>>>>>>> from pyspark.sql import SparkSession
>>>>>>>>>> # Initialize a Spark session
>>>>>>>>>> spark = SparkSession.builder \
>>>>>>>>>>     .appName("Verifying Spark Configurations") \
>>>>>>>>>>     .config("spark.executor.decommission.enabled", "true") \
>>>>>>>>>>     .config("spark.executor.decommission.forceKillTimeout",
>>>>>>>>>> "100s") \
>>>>>>>>>>     .getOrCreate()
>>>>>>>>>>
>>>>>>>>>> # Access Spark context
>>>>>>>>>> sc = spark.sparkContext
>>>>>>>>>> # Set the log level to ERROR to reduce verbosity
>>>>>>>>>> sc.setLogLevel("ERROR")
>>>>>>>>>> print(f"\n\nSpark version: ", sc.version)
>>>>>>>>>>
>>>>>>>>>> # Verify the configuration for executor decommissioning
>>>>>>>>>> decommission_enabled = sc.getConf().get("spark.executor.decommission.enabled", "false")
>>>>>>>>>> force_kill_timeout = sc.getConf().get("spark.executor.decommission.forceKillTimeout", "default_value")
>>>>>>>>>>
>>>>>>>>>> # Print the values
>>>>>>>>>> print(f"spark.executor.decommission.enabled: {decommission_enabled}")
>>>>>>>>>> print(f"spark.executor.decommission.forceKillTimeout: {force_kill_timeout}")
>>>>>>>>>>
>>>>>>>>>> The output
>>>>>>>>>>
>>>>>>>>>> Spark version:  3.4.0
>>>>>>>>>> spark.executor.decommission.enabled: true
>>>>>>>>>> spark.executor.decommission.forceKillTimeout: 100s
>>>>>>>>>>
>>>>>>>>>> By creating a simple Spark application and verifying the
>>>>>>>>>> configuration values, I trust it is shown that these two parameters 
>>>>>>>>>> are
>>>>>>>>>> valid and are applied by Spark
>>>>>>>>>>
>>>>>>>>>> HTH
>>>>>>>>>>
>>>>>>>>>> Mich Talebzadeh,
>>>>>>>>>>
>>>>>>>>>> Architect | Data Engineer | Data Science | Financial Crime
>>>>>>>>>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial
>>>>>>>>>> College London
>>>>>>>>>> <https://en.wikipedia.org/wiki/Imperial_College_London>
>>>>>>>>>> London, United Kingdom
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>    view my Linkedin profile
>>>>>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *Disclaimer:* The information provided is correct to the best of
>>>>>>>>>> my knowledge but of course cannot be guaranteed . It is essential to 
>>>>>>>>>> note
>>>>>>>>>> that, as with any advice, quote "one test result is worth 
>>>>>>>>>> one-thousand
>>>>>>>>>> expert opinions (Werner
>>>>>>>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von Braun
>>>>>>>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, 9 Oct 2024 at 16:51, Mich Talebzadeh <
>>>>>>>>>> mich.talebza...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Do you have a better recommendation?
>>>>>>>>>>>
>>>>>>>>>>> Or trying to waste time as usual.
>>>>>>>>>>>
>>>>>>>>>>> It is far easier to throw than catch.
>>>>>>>>>>>
>>>>>>>>>>> Do your homework and stop throwing spanners at work.
>>>>>>>>>>>
>>>>>>>>>>> Mich Talebzadeh,
>>>>>>>>>>>
>>>>>>>>>>> Architect | Data Engineer | Data Science | Financial Crime
>>>>>>>>>>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial
>>>>>>>>>>> College London
>>>>>>>>>>> <https://en.wikipedia.org/wiki/Imperial_College_London>
>>>>>>>>>>> London, United Kingdom
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    view my Linkedin profile
>>>>>>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *Disclaimer:* The information provided is correct to the best
>>>>>>>>>>> of my knowledge but of course cannot be guaranteed . It is 
>>>>>>>>>>> essential to
>>>>>>>>>>> note that, as with any advice, quote "one test result is worth
>>>>>>>>>>> one-thousand expert opinions (Werner
>>>>>>>>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von Braun
>>>>>>>>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 9 Oct 2024 at 16:43, Nicholas Chammas <
>>>>>>>>>>> nicholas.cham...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Mich,
>>>>>>>>>>>>
>>>>>>>>>>>> Can you please share with the list where *exactly* you are
>>>>>>>>>>>> citing these configs from?
>>>>>>>>>>>>
>>>>>>>>>>>> As far as I can tell, these two configs don’t exist and have
>>>>>>>>>>>> never existed in the Spark codebase:
>>>>>>>>>>>>
>>>>>>>>>>>> spark.executor.decommission.enabled=true
>>>>>>>>>>>> spark.executor.decommission.gracefulShutdown=true
>>>>>>>>>>>>
>>>>>>>>>>>> Where exactly are you getting this information from (and then
>>>>>>>>>>>> posting it to the list as advice)? Please be clear and provide 
>>>>>>>>>>>> specific
>>>>>>>>>>>> references.
>>>>>>>>>>>>
>>>>>>>>>>>> Nick
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Oct 9, 2024, at 1:20 PM, Mich Talebzadeh <
>>>>>>>>>>>> mich.talebza...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Before responding, what configuration parameters are you using
>>>>>>>>>>>> to make this work?
>>>>>>>>>>>>
>>>>>>>>>>>> spark.executor.decommission.enabled=true
>>>>>>>>>>>> spark.executor.decommission.gracefulShutdown=true
>>>>>>>>>>>> spark.executor.decommission.forceKillTimeout=100s
>>>>>>>>>>>>
>>>>>>>>>>>> HTH
>>>>>>>>>>>>
>>>>>>>>>>>> Mich Talebzadeh,
>>>>>>>>>>>>
>>>>>>>>>>>> Architect | Data Engineer | Data Science | Financial Crime
>>>>>>>>>>>> PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial
>>>>>>>>>>>> College London
>>>>>>>>>>>> <https://en.wikipedia.org/wiki/Imperial_College_London>
>>>>>>>>>>>> London, United Kingdom
>>>>>>>>>>>>
>>>>>>>>>>>>    view my Linkedin profile
>>>>>>>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  https://en.everybodywiki.com/Mich_Talebzadeh
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> *Disclaimer:* The information provided is correct to the best
>>>>>>>>>>>> of my knowledge but of course cannot be guaranteed . It is 
>>>>>>>>>>>> essential to
>>>>>>>>>>>> note that, as with any advice, quote "one test result is worth
>>>>>>>>>>>> one-thousand expert opinions (Werner
>>>>>>>>>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>Von Braun
>>>>>>>>>>>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, 9 Oct 2024 at 11:05, Jay Han <tunyu...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi spark community,
>>>>>>>>>>>>>      I have such a question: Why driver doesn't shutdown
>>>>>>>>>>>>> executors gracefully on k8s. For instance,
>>>>>>>>>>>>> kubernetesClient.pods().withGracePeriod(100).delete().
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Jay
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Liu Cao
>>>>>
>>>>>
>>>>>
>
> --
> Best,
> Jay
>
