[
https://issues.apache.org/jira/browse/BEAM-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sathish Jayaraman updated BEAM-2719:
------------------------------------
Description:
Hi,
A Beam job submitted for execution via spark-submit does not get past the
"Evaluating ParMultiDo" step. Running the pipeline directly with
--runner=SparkRunner as a parameter works fine. But when I bundle the shaded
jar and submit it using spark-submit, no executors get assigned. I tried
submitting with both a standalone Spark master URL and YARN, but the job never
executes past that step. Below are the command I used to submit and the job
log from YARN.
I tried running it on both a local single-node cluster and an Azure HDInsight
cluster, with the same result. So I suspect there is nothing wrong in the
Spark configuration and this could be a bug.
{code}
$ ~/spark/bin/spark-submit --class org.apache.beam.examples.WordCount --master
yarn --executor-memory 2G --num-executors 2
target/word-count-beam-0.1-shaded.jar --runner=SparkRunner --inputFile=pom.xml
--output=counts
{code}
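For reference, the report also mentions trying a standalone Spark master URL instead of YARN. A submission of that form might look like the following (a sketch only; the master host and port are assumptions based on the host seen in the log below, and the jar path is taken from the command above):

```shell
# Same WordCount submission, but against a standalone Spark master
# rather than YARN. spark://<host>:7077 is the default standalone
# master URL; adjust host/port to match your cluster.
~/spark/bin/spark-submit \
  --class org.apache.beam.examples.WordCount \
  --master spark://192.168.0.7:7077 \
  --executor-memory 2G \
  target/word-count-beam-0.1-shaded.jar \
  --runner=SparkRunner --inputFile=pom.xml --output=counts
```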
{code}
17/08/03 13:00:33 INFO client.RMProxy: Connecting to ResourceManager at
/0.0.0.0:8030
17/08/03 13:00:33 INFO yarn.YarnRMClient: Registering the ApplicationMaster
17/08/03 13:00:34 INFO yarn.YarnAllocator: Will request 2 executor
container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of
overhead)
17/08/03 13:00:34 INFO yarn.YarnAllocator: Submitted 2 unlocalized container
requests.
17/08/03 13:00:34 INFO yarn.ApplicationMaster: Started progress reporter thread
with (heartbeat : 3000, initial allocation : 200) intervals
17/08/03 13:00:35 INFO impl.AMRMClientImpl: Received new token for :
192.168.0.7:50173
17/08/03 13:00:35 INFO yarn.YarnAllocator: Launching container
container_1501744514957_0003_01_000002 on host 192.168.0.7
17/08/03 13:00:35 INFO yarn.YarnAllocator: Received 1 containers from YARN,
launching executors on 1 of them.
17/08/03 13:00:35 INFO impl.ContainerManagementProtocolProxy:
yarn.client.max-cached-nodemanagers-proxies : 0
17/08/03 13:00:35 INFO impl.ContainerManagementProtocolProxy: Opening proxy :
192.168.0.7:50173
17/08/03 13:00:37 INFO yarn.YarnAllocator: Launching container
container_1501744514957_0003_01_000003 on host 192.168.0.7
17/08/03 13:00:37 INFO yarn.YarnAllocator: Received 1 containers from YARN,
launching executors on 1 of them.
17/08/03 13:00:37 INFO impl.ContainerManagementProtocolProxy:
yarn.client.max-cached-nodemanagers-proxies : 0
17/08/03 13:00:37 INFO impl.ContainerManagementProtocolProxy: Opening proxy :
192.168.0.7:50173
{code}
> Beam job hangs at Evaluating ParMultiDo when submitted via spark-runner
> ------------------------------------------------------------------------
>
> Key: BEAM-2719
> URL: https://issues.apache.org/jira/browse/BEAM-2719
> Project: Beam
> Issue Type: Bug
> Components: runner-spark
> Affects Versions: 2.0.0
> Environment: OSX / i5 / 10GB
> Reporter: Sathish Jayaraman
> Assignee: Amit Sela
>
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)