Hi eabour,

Thank you for the insights.
Based on the information you provided, along with PR [SPARK-42371][CONNECT],
which adds the ./sbin/start-connect-server.sh script, I'll experiment with
launching the Spark Connect Server in Cluster Mode on Kubernetes.

[SPARK-42371][CONNECT] Add scripts to start and stop Spark Connect server
https://github.com/apache/spark/pull/39928

I'll keep you updated on the progress in this thread.

> ALL
If anyone has successfully launched the Spark Connect Server in Cluster Mode
on an on-premises Kubernetes cluster, I'd greatly appreciate it if you could
share your experience or any relevant information. Any related insights are
also very welcome!

Best regards,
Yasukazu

On Thu, Oct 19, 2023 at 16:11 eab...@163.com <eab...@163.com> wrote:

> Hi,
> I have found three important classes:
>
> 1. *org.apache.spark.sql.connect.service.SparkConnectServer*: the
> ./sbin/start-connect-server.sh script uses SparkConnectServer as its main
> class. In its main function, it creates a local session with
> SparkSession.builder.getOrCreate() and starts SparkConnectService.
> 2. *org.apache.spark.sql.connect.SparkConnectPlugin*: To enable Spark
> Connect, simply make sure that the appropriate JAR is available in the
> CLASSPATH and the driver plugin is configured to load this class.
> 3. *org.apache.spark.sql.connect.SimpleSparkConnectService*: A simple
> main class to start the Spark Connect server as a service for client tests.
>
> So, I believe that by configuring spark.plugins and starting the Spark
> cluster on Kubernetes, clients can use sc://ip:port to connect to the
> remote server.
> Let me give it a try.
>
> ------------------------------
> eabour
>
> *From:* eab...@163.com
> *Date:* 2023-10-19 14:28
> *To:* Nagatomi Yasukazu <yassan0...@gmail.com>; user @spark <user@spark.apache.org>
> *Subject:* Re: Re: Running Spark Connect Server in Cluster Mode on Kubernetes
> Hi all,
>
> Has the Spark Connect server running on k8s functionality been implemented?
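Concretely, the spark.plugins approach sketched above might look something
like the following spark-defaults.conf fragment. This is an untested sketch:
the plugin class name comes from this thread, but the master URL, image, and
namespace values are placeholders you would substitute for your own cluster,
and whether this actually yields cluster-mode execution is exactly what
remains to be verified.

```properties
# Hypothetical sketch only -- all <...> values are placeholders.
# Load the Spark Connect driver plugin (class name taken from this thread).
spark.plugins                      org.apache.spark.sql.connect.SparkConnectPlugin
# Run the driver session against a Kubernetes cluster manager.
spark.master                       k8s://https://<k8s-apiserver-host>:6443
spark.submit.deployMode            client
spark.kubernetes.container.image   <your-spark-image>
spark.kubernetes.namespace         <your-namespace>
# Port the Connect gRPC endpoint listens on (15002 is the documented default).
spark.connect.grpc.binding.port    15002
```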
> ------------------------------
>
> *From:* Nagatomi Yasukazu <yassan0...@gmail.com>
> *Date:* 2023-09-05 17:51
> *To:* user <user@spark.apache.org>
> *Subject:* Re: Running Spark Connect Server in Cluster Mode on Kubernetes
> Dear Spark Community,
>
> I've been exploring the capabilities of the Spark Connect Server and
> encountered an issue when trying to launch it in cluster deploy mode with
> Kubernetes as the master.
>
> While initiating the `start-connect-server.sh` script with the `--conf`
> parameter for `spark.master` and `spark.submit.deployMode`, I was met with
> an error message:
>
> ```
> Exception in thread "main" org.apache.spark.SparkException: Cluster deploy
> mode is not applicable to Spark Connect server.
> ```
>
> This error message can be traced back to Spark's source code here:
> https://github.com/apache/spark/blob/6c885a7cf57df328b03308cff2eed814bda156e4/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L307
>
> Given my observations, I'm curious about the Spark Connect Server roadmap:
> Is there a plan or current conversation to enable Kubernetes as a master
> in Spark Connect Server's cluster deploy mode?
>
> I have tried to gather information from existing JIRA tickets, but have
> not been able to get a definitive answer:
>
> https://issues.apache.org/jira/browse/SPARK-42730
> https://issues.apache.org/jira/browse/SPARK-39375
> https://issues.apache.org/jira/browse/SPARK-44117
>
> Any thoughts, updates, or references to similar conversations or
> initiatives would be greatly appreciated.
>
> Thank you for your time and expertise!
>
> Best regards,
> Yasukazu
>
> On Tue, Sep 5, 2023 at 12:09 Nagatomi Yasukazu <yassan0...@gmail.com> wrote:
>
>> Hello Mich,
>> Thank you for your questions. Here are my responses:
>>
>> > 1. What investigation have you done to show that it is running in local
>> mode?
>>
>> I have verified through the History Server's Environment tab that:
>> - "spark.master" is set to local[*]
>> - "spark.app.id" begins with local-xxx
>> - "spark.submit.deployMode" is set to local
>>
>> > 2. Who has configured this Kubernetes cluster? Is it supplied by a
>> cloud vendor?
>>
>> Our Kubernetes cluster was set up in an on-prem environment using RKE2
>> (https://docs.rke2.io/).
>>
>> > 3. Confirm that you have configured Spark Connect Server correctly for
>> cluster mode. Make sure you specify the cluster manager (e.g., Kubernetes)
>> and other relevant Spark configurations in your Spark job submission.
>>
>> Based on the Spark Connect documentation I've read, there don't seem to
>> be any cluster-mode settings specific to the Spark Connect Server.
>>
>> Configuration - Spark 3.4.1 Documentation
>> https://spark.apache.org/docs/3.4.1/configuration.html#spark-connect
>>
>> Quickstart: Spark Connect — PySpark 3.4.1 documentation
>> https://spark.apache.org/docs/latest/api/python/getting_started/quickstart_connect.html
>>
>> Spark Connect Overview - Spark 3.4.1 Documentation
>> https://spark.apache.org/docs/latest/spark-connect-overview.html
>>
>> The documentation only suggests running ./sbin/start-connect-server.sh
>> --packages org.apache.spark:spark-connect_2.12:3.4.0, leaving me at a loss.
>>
>> > 4. Can you provide a full spark-submit command?
>>
>> Given the nature of Spark Connect, I don't use the spark-submit command.
>> Instead, as per the documentation, I can execute workloads using only a
>> Python script. For the Spark Connect Server, I have a Kubernetes manifest
>> executing "/opt/spark/sbin/start-connect-server.sh --packages
>> org.apache.spark:spark-connect_2.12:3.4.0".
>>
>> > 5. Make sure that the Python client script connecting to Spark Connect
>> Server specifies the cluster mode explicitly, like using --master or
>> --deploy-mode flags when creating a SparkSession.
>>
>> The Spark Connect Server operates as the driver, so it isn't possible to
>> specify the --master or --deploy-mode flags in the Python client script.
>> If I try, I encounter a RuntimeError like this:
>>
>> RuntimeError: Spark master cannot be configured with Spark Connect
>> server; however, found URL for Spark Connect [sc://.../]
>>
>> > 6. Ensure that you have allocated the necessary resources (CPU, memory,
>> etc.) to Spark Connect Server when running it on Kubernetes.
>>
>> Resources are ample, so that shouldn't be the problem.
>>
>> > 7. Review the environment variables and configurations you have set,
>> including the SPARK_NO_DAEMONIZE=1 variable. Ensure that these variables
>> are not conflicting with
>>
>> I'm unsure whether SPARK_NO_DAEMONIZE=1 conflicts with cluster mode
>> settings. But without it, the process goes to the background when
>> executing start-connect-server.sh, causing the Pod to terminate
>> prematurely.
>>
>> > 8. Are you using the correct Spark client version that is fully
>> compatible with your Spark on the server?
>>
>> Yes, I have verified that without using Spark Connect (e.g., using Spark
>> Operator), Spark applications run as expected.
>>
>> > 9. Check the Kubernetes error logs.
>>
>> The Kubernetes logs don't show any errors, and jobs are running in local
>> mode.
>>
>> > 10. Insufficient resources can lead to the application running in
>> local mode.
>>
>> I wasn't aware that insufficient resources could lead to local-mode
>> execution. Thank you for pointing it out.
>>
>> Best regards,
>> Yasukazu
>>
>> On Tue, Sep 5, 2023 at 1:28 Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>
>>> Personally, I have not used this feature. However, some points:
>>>
>>> 1. What investigation have you done to show that it is running in
>>> local mode?
>>> 2. Who has configured this Kubernetes cluster? Is it supplied by a
>>> cloud vendor?
>>> 3. Confirm that you have configured Spark Connect Server correctly
>>> for cluster mode.
>>> Make sure you specify the cluster manager (e.g.,
>>> Kubernetes) and other relevant Spark configurations in your Spark job
>>> submission.
>>> 4. Can you provide a full spark-submit command?
>>> 5. Make sure that the Python client script connecting to Spark
>>> Connect Server specifies the cluster mode explicitly, like using
>>> --master or --deploy-mode flags when creating a SparkSession.
>>> 6. Ensure that you have allocated the necessary resources (CPU,
>>> memory, etc.) to Spark Connect Server when running it on Kubernetes.
>>> 7. Review the environment variables and configurations you have set,
>>> including the SPARK_NO_DAEMONIZE=1 variable. Ensure that these
>>> variables are not conflicting with cluster mode settings.
>>> 8. Are you using the correct Spark client version that is fully
>>> compatible with your Spark on the server?
>>> 9. Check the Kubernetes error logs.
>>> 10. Insufficient resources can lead to the application running in
>>> local mode.
>>>
>>> HTH
>>>
>>> Mich Talebzadeh,
>>> Distinguished Technologist, Solutions Architect & Engineer
>>> London
>>> United Kingdom
>>>
>>> View my LinkedIn profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>> https://en.everybodywiki.com/Mich_Talebzadeh
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>>
>>> On Mon, 4 Sept 2023 at 04:57, Nagatomi Yasukazu <yassan0...@gmail.com>
>>> wrote:
>>>
>>>> Hi Cley,
>>>>
>>>> Thank you for taking the time to respond to my query. Your insights on
>>>> Spark cluster deployment are much appreciated.
>>>>
>>>> However, I'd like to clarify that my specific challenge is related to
>>>> running the Spark Connect Server on Kubernetes in Cluster Mode. While I
>>>> understand the general deployment strategies for Spark on Kubernetes,
>>>> I am seeking guidance particularly on the Spark Connect Server aspect.
>>>>
>>>> cf. Spark Connect Overview - Spark 3.4.1 Documentation
>>>> https://spark.apache.org/docs/latest/spark-connect-overview.html
>>>>
>>>> To reiterate, when I connect from an external Python client and execute
>>>> scripts, the server operates in Local Mode instead of the expected
>>>> Kubernetes Cluster Mode (with master as k8s://... and deploy-mode set
>>>> to cluster).
>>>>
>>>> If I've misunderstood your initial response and it was indeed related
>>>> to Spark Connect, I sincerely apologize for the oversight. In that case,
>>>> could you please expand a bit on the Spark Connect-specific aspects?
>>>>
>>>> Do you, or anyone else in the community, have experience with this
>>>> specific setup, or have you encountered a similar issue with Spark
>>>> Connect Server on Kubernetes? Any targeted advice or guidance would be
>>>> invaluable.
>>>>
>>>> Thank you again for your time and help.
>>>>
>>>> Best regards,
>>>> Yasukazu
>>>>
>>>> On Mon, Sep 4, 2023 at 0:23 Cleyson Barros <euroc...@gmail.com> wrote:
>>>>
>>>>> Hi Nagatomi,
>>>>> Use the Apache images, then run your master node, then start your many
>>>>> workers. You can add a command line in the Dockerfiles to call the
>>>>> master using the Docker container names in your service composition.
>>>>> If you wish to run 2 masters, active and standby, follow the
>>>>> instructions in the Apache docs to do this configuration; the recipe
>>>>> is the same except for how you start the masters and how you expect
>>>>> your cluster to behave.
>>>>> I hope it helps.
>>>>> Have a nice day :)
>>>>> Cley
>>>>>
>>>>> Nagatomi Yasukazu <yassan0...@gmail.com> wrote on Saturday,
>>>>> Sep 2, 2023 at 15:37:
>>>>>
>>>>>> Hello Apache Spark community,
>>>>>>
>>>>>> I'm currently trying to run Spark Connect Server on Kubernetes in
>>>>>> Cluster Mode and facing some challenges. Any guidance or hints would
>>>>>> be greatly appreciated.
>>>>>>
>>>>>> ## Environment:
>>>>>> Apache Spark version: 3.4.1
>>>>>> Kubernetes version: 1.23
>>>>>> Command executed:
>>>>>> /opt/spark/sbin/start-connect-server.sh \
>>>>>>   --packages org.apache.spark:spark-connect_2.13:3.4.1,org.apache.iceberg:iceberg-spark-runtime-3.4_2.13:1.3.1...
>>>>>> Note that I'm running it with the environment variable
>>>>>> SPARK_NO_DAEMONIZE=1.
>>>>>>
>>>>>> ## Issue:
>>>>>> When I connect from an external Python client and run scripts, it
>>>>>> operates in Local Mode instead of the expected Cluster Mode.
>>>>>>
>>>>>> ## Expected Behavior:
>>>>>> When connecting from a Python client to the Spark Connect Server, I
>>>>>> expect it to run in Cluster Mode.
>>>>>>
>>>>>> If anyone has any insights or advice, or has faced a similar issue,
>>>>>> I'd be grateful for your feedback.
>>>>>> Thank you in advance.
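As a footnote for anyone following this thread: PySpark distinguishes Spark
Connect connection strings (the sc:// scheme) from cluster-manager master
URLs such as k8s://... or local[*], which is why setting a master on the
client raises the RuntimeError quoted earlier. A minimal sketch of that
distinction; the helper function below is hypothetical, written for
illustration only, and is not part of the PySpark API:

```python
from urllib.parse import urlparse

def is_connect_url(url: str) -> bool:
    """Return True for Spark Connect connection strings (sc://host[:port]),
    False for cluster-manager master URLs such as k8s://... or local[*].

    Hypothetical helper for illustration only; not part of PySpark."""
    return urlparse(url).scheme == "sc"

print(is_connect_url("sc://spark-connect.default.svc:15002"))  # True
print(is_connect_url("k8s://https://10.0.0.1:6443"))           # False
print(is_connect_url("local[*]"))                              # False
```

On the client side, an sc:// URL like the one above is what you pass to
SparkSession.builder.remote(...), never to --master, which stays the job of
the server process.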