Can you share your Dockerfile (not all of it, just the gist), the instructions for how you build it, and what you actually run to get that message?
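If it helps while you put that together, here is roughly the gist of a minimal standalone image as I would sketch it. This is only a sketch: the base image, Spark 3.1.2/Hadoop 3.2 version, download URL and /opt/spark layout are my assumptions, not your setup, so swap in whatever you actually use.

    FROM openjdk:8-jre-slim

    # Assumption: Spark 3.1.2 built for Hadoop 3.2; change to your version
    ARG SPARK_VERSION=3.1.2
    ARG HADOOP_VERSION=3.2

    # Download and unpack Spark under /opt, with a stable /opt/spark symlink
    RUN apt-get update && apt-get install -y --no-install-recommends curl procps && \
        curl -fsSL "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" \
          | tar -xz -C /opt && \
        ln -s "/opt/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}" /opt/spark && \
        rm -rf /var/lib/apt/lists/*

    ENV SPARK_HOME=/opt/spark
    ENV PATH="${SPARK_HOME}/sbin:${SPARK_HOME}/bin:${PATH}"
    # Keep the master/worker process in the foreground so the container stays up
    ENV SPARK_NO_DAEMONIZE=true

    # Default to running a master; worker containers override this command
    CMD ["/opt/spark/sbin/start-master.sh"]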
I have just pushed my local repo to GitHub, where I created an example of Spark on Docker some time ago. Please take a look and compare it with what you are doing:
https://github.com/khalidmammadov/spark_docker

(A concrete sketch of the worker registration step discussed in the thread below is bottom-posted at the end of this message.)

On Sat, Jul 24, 2021 at 4:07 PM Dinakar Chennubotla <chennu.bigd...@gmail.com> wrote:

> Hi Khalid Mammadov,
>
> I tried the guide that describes a distributed-mode Spark installation, but when I run the command it says "deployment mode = cluster is not allowed in standalone cluster".
>
> The source URL I used is:
> https://towardsdatascience.com/diy-apache-spark-docker-bb4f11c10d24?gi=fa52ac767c0b
>
> Kindly refer to this section in the URL I mentioned:
> "Docker & Spark — Multiple Machines"
>
> I removed the third-party things and dockerized it my way.
>
> Thanks,
> Dinakar
>
> On Sat, 24 Jul, 2021, 20:28 Khalid Mammadov, <khalidmammad...@gmail.com> wrote:
>
>> Standalone mode already implies you are running in cluster (distributed) mode, i.e. it is one of the 4 available cluster manager options. The difference is that Standalone uses its own resource manager rather than, for example, YARN.
>>
>> If you are running Docker on a single machine then you are limited to that, but if you run Docker on a cluster and deploy your Spark containers on it, then you get your distribution and cluster mode.
>>
>> Also, if you are referring to scalability, then you need to register worker nodes when you need to scale. You do it by registering a VM/container as a worker node, as per the docs, using:
>>
>> ./sbin/start-worker.sh <master-spark-URL>
>>
>> You can create a new Docker container from your base image and run the above command on bootstrap; that would register a worker node and scale your cluster whenever you want.
>>
>> And if you kill them, you scale down (I think this is how Databricks autoscaling works..). I am not sure about k8s TBH; perhaps it handles this more gracefully.
>>
>> On Sat, Jul 24, 2021 at 3:38 PM Dinakar Chennubotla <chennu.bigd...@gmail.com> wrote:
>>
>>> Hi Khalid Mammadov,
>>>
>>> Thank you for your response.
>>> Yes, I did: I built a standalone Apache Spark cluster on Docker containers.
>>>
>>> But I am looking for a distributed Spark cluster, where the Spark workers are scalable and the Spark deployment mode is "cluster".
>>>
>>> The source URL I used to build the standalone Apache Spark cluster:
>>> https://www.kdnuggets.com/2020/07/apache-spark-cluster-docker.html
>>>
>>> If you have documentation on distributed Spark, which is what I am looking for, could you please send it to me?
>>>
>>> Thanks,
>>> Dinakar
>>>
>>> On Sat, 24 Jul, 2021, 19:32 Khalid Mammadov, <khalidmammad...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Have you checked out the docs?
>>>> https://spark.apache.org/docs/latest/spark-standalone.html
>>>>
>>>> Thanks,
>>>> Khalid
>>>>
>>>> On Sat, Jul 24, 2021 at 1:45 PM Dinakar Chennubotla <chennu.bigd...@gmail.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I am Dinakar, a Hadoop admin. Could someone help me here?
>>>>>
>>>>> 1. I have a DEV-POC task to do:
>>>>> 2. I need to install a distributed Apache Spark cluster, in cluster deploy mode, on Docker containers,
>>>>> 3. with scalable Spark worker containers.
>>>>> 4. We have a 9-node cluster with some other services and tools.
>>>>>
>>>>> Thanks,
>>>>> Dinakar
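Bottom-posting the promised sketch of the scale-out/scale-in flow from the quoted thread. Again, only a minimal illustration: the spark-net network, the spark-master name/hostname and the my-spark tag (an image like the one sketched near the top of this message) are assumptions, not Dinakar's setup.

    # One shared network so workers can reach the master by name
    docker network create spark-net

    # Master: web UI on 8080, cluster port 7077 (Spark standalone defaults).
    # --hostname matches the URL the workers will use, so the master binds to it.
    docker run -d --name spark-master --hostname spark-master \
      --network spark-net -p 8080:8080 -p 7077:7077 my-spark

    # Scale out: each new container registers itself as a worker on start-up
    docker run -d --network spark-net \
      my-spark /opt/spark/sbin/start-worker.sh spark://spark-master:7077

    # Scale in: stopping a worker container removes it from the cluster
    docker stop <worker-container-id>

The master's web UI at http://localhost:8080 should show each worker as it registers, which is a quick way to confirm the scaling actually happened.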