[ https://issues.apache.org/jira/browse/BEAM-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sathish Jayaraman updated BEAM-2719:
------------------------------------
    Description: 
Hi,

A Beam job submitted for execution via spark-submit does not get past the 
"Evaluating ParMultiDo" step. The pipeline runs fine when executed directly 
with --runner=SparkRunner as a parameter, but if I bundle the shaded jar and 
submit it using spark-submit, no executors get assigned. I tried submitting 
with both a standalone Spark master URL and YARN, but in neither case did the 
job get past that step. Below are the command I used to submit and the job 
log from YARN. 

I tried executing on both a local single-node cluster and an Azure HDInsight 
cluster with the same result, so I suspect there is nothing wrong with the 
Spark configuration and this could be a bug. 

{code}
$ ~/spark/bin/spark-submit --class org.apache.beam.examples.WordCount --master 
yarn --executor-memory 2G --num-executors 2 
target/word-count-beam-0.1-shaded.jar --runner=SparkRunner --inputFile=pom.xml 
--output=counts
{code}

{code}
17/08/03 13:00:33 INFO client.RMProxy: Connecting to ResourceManager at 
/0.0.0.0:8030
17/08/03 13:00:33 INFO yarn.YarnRMClient: Registering the ApplicationMaster
17/08/03 13:00:34 INFO yarn.YarnAllocator: Will request 2 executor 
container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of 
overhead)
17/08/03 13:00:34 INFO yarn.YarnAllocator: Submitted 2 unlocalized container 
requests.
17/08/03 13:00:34 INFO yarn.ApplicationMaster: Started progress reporter thread 
with (heartbeat : 3000, initial allocation : 200) intervals
17/08/03 13:00:35 INFO impl.AMRMClientImpl: Received new token for : 
192.168.0.7:50173
17/08/03 13:00:35 INFO yarn.YarnAllocator: Launching container 
container_1501744514957_0003_01_000002 on host 192.168.0.7
17/08/03 13:00:35 INFO yarn.YarnAllocator: Received 1 containers from YARN, 
launching executors on 1 of them.
17/08/03 13:00:35 INFO impl.ContainerManagementProtocolProxy: 
yarn.client.max-cached-nodemanagers-proxies : 0
17/08/03 13:00:35 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
192.168.0.7:50173
17/08/03 13:00:37 INFO yarn.YarnAllocator: Launching container 
container_1501744514957_0003_01_000003 on host 192.168.0.7
17/08/03 13:00:37 INFO yarn.YarnAllocator: Received 1 containers from YARN, 
launching executors on 1 of them.
17/08/03 13:00:37 INFO impl.ContainerManagementProtocolProxy: 
yarn.client.max-cached-nodemanagers-proxies : 0
17/08/03 13:00:37 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
192.168.0.7:50173
{code}
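As a sanity check on the log above (not part of the original report): the 2432 MB per-container figure is consistent with the requested 2 GB executor memory plus Spark's default off-heap overhead, which in Spark 2.x defaults to the larger of 384 MB and 10% of the executor memory. A minimal sketch of that arithmetic:

```python
# Per-executor memory YARN is asked for: --executor-memory plus the default
# memory overhead (max of 384 MB and 10% of executor memory in Spark 2.x).
executor_memory_mb = 2 * 1024                      # --executor-memory 2G
overhead_mb = max(384, executor_memory_mb // 10)   # default overhead rule
container_mb = executor_memory_mb + overhead_mb

print(container_mb)  # 2432, matching "2432 MB memory (including 384 MB of overhead)"
```

So the resource request itself looks correct; the containers are granted and executors launched, yet the job still hangs, which points away from a simple memory misconfiguration.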


> Beam job hangs at Evaluating ParMultiDo when submitted via spark-runner 
> ------------------------------------------------------------------------
>
>                 Key: BEAM-2719
>                 URL: https://issues.apache.org/jira/browse/BEAM-2719
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>    Affects Versions: 2.0.0
>         Environment: OSX / i5 / 10GB
>            Reporter: Sathish Jayaraman
>            Assignee: Amit Sela
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
