[ 
https://issues.apache.org/jira/browse/BEAM-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008904#comment-16008904
 ] 

Daniel Halperin commented on BEAM-2277:
---------------------------------------

Using the latest version on the release-2.0.0 branch 
(6931219d8725ad90f6b18b1bf8219d5adb10ff8a) after I successfully ran this using 
an archetype-generated copy of the examples:

{code}
gcloud dataproc jobs submit spark --cluster jasonkuster-test1-0 --properties 
spark.default.parallelism=200 --class org.apache.beam.examples.WordCount --jars 
./target/java-0.1.jar -- --runner=SparkRunner 
--inputFile=hdfs:///home/dhalperi-hdfs/words.txt 
--output=hdfs:///home/dhalperi-hdfs/output-
{code}

Output:

{code}
dhalperi@jasonkuster-test1-0-m:~$ hadoop fs -cat /home/dhalperi-hdfs/words.txt
17/05/12 23:27:47 INFO gcs.GoogleHadoopFileSystemBase: GHFS version: 
1.5.5-hadoop2
a
b
c
d
e
f
dhalperi@jasonkuster-test1-0-m:~$ hadoop fs -cat /home/dhalperi-hdfs/
/home/dhalperi-hdfs/output--00000-of-00006             
/home/dhalperi-hdfs/output--00004-of-00006
/home/dhalperi-hdfs/output--00001-of-00006             
/home/dhalperi-hdfs/output--00005-of-00006
/home/dhalperi-hdfs/output--00002-of-00006             
/home/dhalperi-hdfs/.temp-beam-2017-05-132_23-26-39-0
/home/dhalperi-hdfs/output--00003-of-00006             
/home/dhalperi-hdfs/words.txt
dhalperi@jasonkuster-test1-0-m:~$ hadoop fs -cat /home/dhalperi-hdfs/output*
17/05/12 23:28:05 INFO gcs.GoogleHadoopFileSystemBase: GHFS version: 
1.5.5-hadoop2
a: 1
e: 1
d: 1
b: 1
f: 1
c: 1
{code}

(Note: the double {{--}} in the output file names is my fault.)
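For context on the original failure: the exception below comes from a scheme precondition in {{FileSystems.validateSrcDestLists}}. A minimal standalone sketch of that kind of check (hypothetical names like {{schemeOf}} and {{validateSameScheme}}, not Beam's actual implementation) shows how temp files written without an explicit scheme resolve as {{file}} and then mismatch an {{hdfs://}} destination:

```java
import java.net.URI;
import java.util.List;

// Sketch only: illustrates the scheme precondition, not Beam's real code.
public class SchemeCheck {
    // Paths without an explicit scheme fall back to "file", which is how
    // local temp files end up mismatched against hdfs:// output paths.
    static String schemeOf(String path) {
        String s = URI.create(path).getScheme();
        return s == null ? "file" : s;
    }

    // All sources and destinations must share one scheme, else reject.
    static void validateSameScheme(List<String> srcs, List<String> dests) {
        String expected = schemeOf(srcs.get(0));
        for (String p : dests) {
            if (!schemeOf(p).equals(expected)) {
                throw new IllegalArgumentException(
                    "Expect srcResourceIds and destResourceIds have the same "
                        + "scheme, but received " + expected + ", "
                        + schemeOf(p) + ".");
            }
        }
    }

    public static void main(String[] args) {
        try {
            // Temp shard with no scheme vs. hdfs output -> the copy step throws.
            validateSameScheme(
                List.of("/tmp/.temp-beam/shard-00000"),
                List.of("hdfs:///user/myuser/wc/wc-00000-of-00001"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running it prints a message of the same shape as the reported exception ({{received file, hdfs}}), which is consistent with the fix being to resolve temp files against the output filesystem rather than the local default.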

> IllegalArgumentException when using Hadoop file system for WordCount example.
> -----------------------------------------------------------------------------
>
>                 Key: BEAM-2277
>                 URL: https://issues.apache.org/jira/browse/BEAM-2277
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>            Reporter: Aviem Zur
>            Assignee: Aviem Zur
>            Priority: Blocker
>             Fix For: 2.0.0
>
>
> IllegalArgumentException when using Hadoop file system for WordCount example.
> Occurred when running WordCount example using Spark runner on a YARN cluster.
> Command-line arguments:
> {code:none}
> --runner=SparkRunner --inputFile=hdfs:///user/myuser/kinglear.txt 
> --output=hdfs:///user/myuser/wc/wc
> {code}
> Stack trace:
> {code:none}
> java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds 
> have the same scheme, but received file, hdfs.
>       at 
> org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
>       at 
> org.apache.beam.sdk.io.FileSystems.validateSrcDestLists(FileSystems.java:394)
>       at org.apache.beam.sdk.io.FileSystems.copy(FileSystems.java:236)
>       at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.copyToOutputFiles(FileBasedSink.java:626)
>       at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.finalize(FileBasedSink.java:516)
>       at 
> org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:592)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)