[jira] [Commented] (BEAM-2267) Final files for WordCount not appearing with Apex on YARN
    [ https://issues.apache.org/jira/browse/BEAM-2267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16359268#comment-16359268 ]

Thomas Weise commented on BEAM-2267:

Found working during 2.2.0 release verification, possibly due to BEAM-2575. Please re-open if still an issue.

> Final files for WordCount not appearing with Apex on YARN
> ---------------------------------------------------------
>
>                 Key: BEAM-2267
>                 URL: https://issues.apache.org/jira/browse/BEAM-2267
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-apex
>            Reporter: Kenneth Knowles
>            Priority: Major
>             Fix For: Not applicable
>
>
> When I run WordCount with the Apex runner on a YARN cluster - specifically
> Dataproc, reading/writing GCS - the word counts are all written to temporary
> files but they are never moved to their final destination.
> Hadoop version 2.7.3
> Beam RC 2.0.0
> Steps to repro:
> 1. Instantiate archetype (see below)
> 2. Build uber jar {{mvn --settings ../beamrc-settings.xml clean package -P
> apex-runner}}
> 3. SCP to master (or wherever you'd like to launch from)
> 4. {{java -cp word-count-beam-0.1.jar beamrc.WordCount --runner=ApexRunner
> --embeddedExecution=false
> --inputfile=gs://apache-beam-samples/shakespeare/winterstale-personae
> --output=SOMEWHERE}}
> Appendix: steps to instantiate RC archetype:
> Build an RC-specific {{beamrc-settings.xml}}
> {code}
>
>
>
> beam-2.0.0
>
>
>
> archetype
> RC_REPO
>
>
>
>
>
>
> beam-2.0.0
>
>
> {code}
> And then instantiate like so
> {code}
> mvn archetype:generate \
> --settings beam-rc-settings.xml \
> -D archetypeCatalog=internal \
> -D archetypeGroupId=org.apache.beam \
> -D archetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
> -D archetypeVersion=2.0.0 \
> -D groupId=beamrc \
> -D artifactId=word-count-beam \
> -D version="0.1" \
> -D package=beamrc \
> -D interactiveMode=false
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (BEAM-2267) Final files for WordCount not appearing with Apex on YARN
    [ https://issues.apache.org/jira/browse/BEAM-2267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008198#comment-16008198 ]

Thomas Weise commented on BEAM-2267:

I have seen this as an intermittent issue in the past, even with embedded mode. The move to the final location may not happen, and it looks like a timing dependency with the topology shutdown.

> Final files for WordCount not appearing with Apex on YARN
> ---------------------------------------------------------
>
>                 Key: BEAM-2267
>                 URL: https://issues.apache.org/jira/browse/BEAM-2267
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-apex
>            Reporter: Kenneth Knowles
>            Assignee: Thomas Weise
>
> When I run WordCount with the Apex runner on a YARN cluster - specifically
> Dataproc, reading/writing GCS - the word counts are all written to temporary
> files but they are never moved to their final destination.
> Hadoop version 2.7.3
> Beam RC 2.0.0
> Steps to repro:
> 1. Instantiate archetype (see below)
> 2. Build uber jar {{mvn --settings ../beamrc-settings.xml clean package -P
> apex-runner}}
> 3. SCP to master (or wherever you'd like to launch from)
> 4. {{java -cp word-count-beam-0.1.jar beamrc.WordCount --runner=ApexRunner
> --embeddedExecution=false
> --inputfile=gs://apache-beam-samples/shakespeare/winterstale-personae
> --output=SOMEWHERE}}
> Appendix: steps to instantiate RC archetype:
> Build an RC-specific {{beamrc-settings.xml}}
> {code}
>
>
>
> beam-2.0.0
>
>
>
> archetype
> RC_REPO
>
>
>
>
>
>
> beam-2.0.0
>
>
> {code}
> And then instantiate like so
> {code}
> mvn archetype:generate \
> --settings beam-rc-settings.xml \
> -D archetypeCatalog=internal \
> -D archetypeGroupId=org.apache.beam \
> -D archetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
> -D archetypeVersion=2.0.0 \
> -D groupId=beamrc \
> -D artifactId=word-count-beam \
> -D version="0.1" \
> -D package=beamrc \
> -D interactiveMode=false
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
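
For readers unfamiliar with the symptom, here is a rough, local-filesystem sketch of the general "write to temporary files, then move to the final destination" pattern that the report and the comment above describe. It is not Beam's or the Apex runner's actual code: the function names, the `.temp` directory, the `counts-NNNNN-of-NNNNN` naming, and the `shutdown_before_finalize` flag are all hypothetical, chosen only to show how stopping before the move step leaves the results in temporary files.

```python
import os
import tempfile


def write_shards(out_dir, shards):
    """Write each shard to a temporary file and return the temp paths."""
    temp_dir = os.path.join(out_dir, ".temp")  # hypothetical temp location
    os.makedirs(temp_dir, exist_ok=True)
    temp_paths = []
    for i, lines in enumerate(shards):
        path = os.path.join(temp_dir, f"shard-{i}.tmp")
        with open(path, "w") as f:
            f.write("\n".join(lines) + "\n")
        temp_paths.append(path)
    return temp_paths


def finalize(out_dir, temp_paths, prefix="counts"):
    """Move every temporary file to its final, sharded name."""
    total = len(temp_paths)
    final_paths = []
    for i, tmp in enumerate(temp_paths):
        final = os.path.join(out_dir, f"{prefix}-{i:05d}-of-{total:05d}")
        os.replace(tmp, final)
        final_paths.append(final)
    return final_paths


def run(out_dir, shards, shutdown_before_finalize):
    """Simulate a job; optionally stop before the finalize step runs."""
    temp_paths = write_shards(out_dir, shards)
    if shutdown_before_finalize:
        # Stand-in for a shutdown that wins the race against finalization.
        return []
    return finalize(out_dir, temp_paths)


def final_files(out_dir):
    """List regular files directly under out_dir (the final outputs)."""
    return sorted(
        name for name in os.listdir(out_dir)
        if os.path.isfile(os.path.join(out_dir, name))
    )


if __name__ == "__main__":
    out = tempfile.mkdtemp()
    run(out, [["a: 1"], ["b: 2"]], shutdown_before_finalize=True)
    print("final files:", final_files(out))
    print("temp files:", sorted(os.listdir(os.path.join(out, ".temp"))))
```

Running the sketch with `shutdown_before_finalize=True` leaves the data only under the temporary directory, which matches the reported symptom; whether the real cause was this ordering is something the thread leaves open.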