To: dev
Subject: Re: Beam's job crashes on cluster
> I applied some modifications to the code to run Beam tasks on a k8s cluster using spark-submit.
Interesting, how does that work?
On Fri, Dec 13, 2019 at 12:49 PM Matthew K. <softm...@gmx.com> wrote:
I'
the job-server:runShadow command.
Without specifying the master URL, the default is to start an embedded Spark master within the same JVM as the job server, rather than using your standalone master.
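To make that concrete, here is a minimal sketch of both sides. The `-PsparkMasterUrl` Gradle property, the `spark://spark-master:7077` address, and the job server port 8099 are my assumptions about a typical standalone-master setup, not something confirmed in this thread:

```python
# Command that starts the job server against a standalone Spark master.
# Without -PsparkMasterUrl, the job server spins up an embedded local
# master inside its own JVM, which is the behavior described above.
job_server_cmd = [
    "./gradlew", ":runners:spark:job-server:runShadow",
    "-PsparkMasterUrl=spark://spark-master:7077",  # hypothetical host/port
]

# Pipeline flags that point the Python SDK at that job server.
pipeline_args = [
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",  # assumed default job server port
]
```

The pipeline itself is then constructed as usual, passing `pipeline_args` to `PipelineOptions`.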
On Fri, Dec 13, 2019 at 12:15 PM Matthew K. <softm...@gmx.com> wrote:
Job server
was lost.
Some additional info might help us diagnose this:
- What arguments did you use to start the job server?
- How is your Spark cluster deployed?
On Fri, Dec 13, 2019 at 8:00 AM Matthew K. <softm...@gmx.com> wrote:
Actually the reason for that error i
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Sent: Friday, December 13, 2019 at 6:58 AM
From: "Matthew K."
To: dev@beam.apache.org
Cc: dev
Subject: Re: Beam's job crashes on cluster
Sent: Thursday, December 12, 2019 at 5:30 PM
From: "Kyle Weaver"
To: dev
Subject: Re: Beam's job crashes on cluster
Can you share the pipeline options you are using? Particularly environment_type and environment_config.
On Thu, Dec 12, 2019 at 2:58 PM Matthew K. <softm...@gmx.com> wrote:
Running Beam on a Spark cluster, it crashes and I get the following error (workers are on separate nodes; it works fine when workers are on the same node as the runner):
> Task :runners:spark:job-server:runShadow FAILED
Exception in thread wait_until_finish_read:
Traceback (most recent call last):
Hi,
To run a Beam job on a Spark cluster with some number of nodes:
1. Is it recommended to set pipeline parameters such as --num_workers, --max_num_workers, --autoscaling_algorithm, --worker_machine_type, etc., or will Beam (Spark) figure that out?
2. If it is recommended to set those
you mean here.
On Wed, Nov 6, 2019 at 3:32 PM Matthew K. <softm...@gmx.com> wrote:
Thanks, still I need to pass parameters to the boot executable, such as worker id, control endpoint, logging endpoint, etc.
Where can I extract these parameters from? (In apache_beam Python code)
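As I understand the PROCESS environment, you don't extract these values yourself: the runner invokes the configured command and appends them as flags. A sketch of a wrapper that receives and inspects them (the flag names below reflect my understanding of the SDK harness boot contract and should be treated as assumptions; the endpoint values are illustrative):

```python
import argparse

# Flags the runner is expected to pass to the boot executable when
# environment_type=PROCESS. A wrapper script can parse them and forward
# them to the real boot binary or to the SDK harness.
parser = argparse.ArgumentParser()
parser.add_argument("--id")                   # worker id
parser.add_argument("--provision_endpoint")   # gRPC provisioning service
parser.add_argument("--control_endpoint")     # Fn API control service
parser.add_argument("--logging_endpoint")     # Fn API logging service
parser.add_argument("--artifact_endpoint")    # artifact staging service
parser.add_argument("--semi_persist_dir")     # scratch directory on the worker

# Example of what the runner-supplied argv might look like (made-up ports):
args = parser.parse_args([
    "--id=1",
    "--provision_endpoint=localhost:50000",
    "--control_endpoint=localhost:50001",
    "--logging_endpoint=localhost:50002",
])
```

Unsupplied flags simply parse as `None`, so the wrapper degrades gracefully if the runner omits one.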
dependencies on all of your worker machines.
The best example I know of is here: https://github.com/apache/beam/blob/cbf8a900819c52940a0edd90f59bf6aec55c817a/sdks/python/test-suites/portable/py2/build.gradle#L146-L165
On Wed, Nov 6, 2019 at 2:24 PM Matthew K. <softm...@gmx.com> wrote:
Hi all,
I am trying to run a *Python* Beam pipeline on a Spark cluster. Since workers are running on separate nodes, I am using "PROCESS" for "environment_type" in the pipeline options, but I couldn't find any documentation on what "command" I should pass to "environment_config" to run on the worker.
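For what it's worth, here is a sketch of what such options might look like. The PROCESS environment takes a JSON `environment_config` whose "command" points at a boot executable present on each worker node; the `/opt/apache/beam/boot` path and port below are hypothetical, not a documented default:

```python
import json

# environment_config for environment_type=PROCESS: a JSON blob whose
# "command" is the path, on every worker node, to the SDK harness boot
# executable (typically built from sdks/python/container).
environment_config = json.dumps({
    "command": "/opt/apache/beam/boot",  # assumed install location
})

pipeline_args = [
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",     # assumed job server address
    "--environment_type=PROCESS",
    "--environment_config=" + environment_config,
]
```

These flags would then be handed to `PipelineOptions` when constructing the pipeline.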