[
https://issues.apache.org/jira/browse/BEAM-2267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008198#comment-16008198
]
Thomas Weise commented on BEAM-2267:
------------------------------------
I have seen this as an intermittent issue in the past, even in embedded mode.
The move to the final location may not happen, and it looks like a timing
dependency with the topology shutdown.
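A minimal sketch of the kind of timing dependency described above, assuming a
sink that first writes a temporary file and then moves it to its final name.
This is illustrative only and is not the Apex runner or Beam sink code: the
class name, the ".temp-" prefix, the shard name, and the shutdownBeforeMove
flag are all hypothetical, and the real cause of the issue has not been
confirmed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Illustrative only: NOT the Apex runner or Beam sink implementation.
class FinalizeSketch {

    // Step 1: write the results to a temporary file (hypothetical ".temp-" naming).
    static Path writeTemp(Path dir, String finalName, String content) throws IOException {
        Path temp = dir.resolve(".temp-" + finalName);
        Files.write(temp, content.getBytes(StandardCharsets.UTF_8));
        return temp;
    }

    // Step 2: move the temporary file to its final name in the same directory.
    static void moveToFinal(Path temp, String finalName) throws IOException {
        Files.move(temp, temp.resolveSibling(finalName));
    }

    // Runs both steps unless the (simulated) shutdown happens first,
    // in which case step 2 is skipped. Returns the sorted file names left behind.
    static List<String> run(boolean shutdownBeforeMove) throws IOException {
        Path dir = Files.createTempDirectory("finalize-sketch");
        String finalName = "counts-00000-of-00001";
        Path temp = writeTemp(dir, finalName, "hello: 1\n");
        if (!shutdownBeforeMove) {
            moveToFinal(temp, finalName);
        }
        List<String> names = new ArrayList<>();
        try (Stream<Path> files = Files.list(dir)) {
            files.forEach(p -> names.add(p.getFileName().toString()));
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("normal shutdown: " + run(false));
        System.out.println("early shutdown:  " + run(true));
    }
}
```

In the early-shutdown case only the temporary file is left behind, which
matches the reported symptom: the data is written, but never appears under
its final name.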
> Final files for WordCount not appearing with Apex on YARN
> ---------------------------------------------------------
>
> Key: BEAM-2267
> URL: https://issues.apache.org/jira/browse/BEAM-2267
> Project: Beam
> Issue Type: Bug
> Components: runner-apex
> Reporter: Kenneth Knowles
> Assignee: Thomas Weise
>
> When I run WordCount with the Apex runner on a YARN cluster - specifically
> Dataproc, reading/writing GCS - the word counts are all written to temporary
> files but they are never moved to their final destination.
> Hadoop version 2.7.3
> Beam RC 2.0.0
> Steps to repro:
> 1. Instantiate archetype (see below)
> 2. Build uber jar {{mvn --settings ../beamrc-settings.xml clean package -P
> apex-runner}}
> 3. SCP to master (or wherever you'd like to launch from)
> 4. {{java -cp word-count-beam-0.1.jar beamrc.WordCount --runner=ApexRunner
> --embeddedExecution=false
> --inputFile=gs://apache-beam-samples/shakespeare/winterstale-personae
> --output=SOMEWHERE}}
> Appendix: steps to instantiate RC archetype:
> Build an RC-specific {{beamrc-settings.xml}}
> {code}
> <settings>
> <profiles>
> <profile>
> <id>beam-2.0.0</id>
> <repositories>
> <repository>
> <!-- This id _must_ be "archetype" -->
> <id>archetype</id>
> <url>RC_REPO</url>
> </repository>
> </repositories>
> </profile>
> </profiles>
>
> <activeProfiles>
> <activeProfile>beam-2.0.0</activeProfile>
> </activeProfiles>
> </settings>
> {code}
> And then instantiate like so
> {code}
> mvn archetype:generate \
> --settings beamrc-settings.xml \
> -D archetypeCatalog=internal \
> -D archetypeGroupId=org.apache.beam \
> -D archetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
> -D archetypeVersion=2.0.0 \
> -D groupId=beamrc \
> -D artifactId=word-count-beam \
> -D version="0.1" \
> -D package=beamrc \
> -D interactiveMode=false
> {code}