Thanks Thomas!

My desktop runs Linux.  I was using Gradle to run wordcount, and that was
how I got the job hanging. Since it works for both of you, I guess it is
more likely that something is wrong with my setup.


Using Thomas's Python command line exactly as is, I can see that the job
run succeeds. However, I have two questions:

1)  Did you check whether the output file "/tmp/py-wordcount-direct" exists?
I expected a text output, but I don't see this file afterwards.  (I am
still building confidence in telling what a successful run looks like.
Maybe I will try DataflowRunner and cross-check the outputs.)

2)  Why does it need a "--streaming" arg?  Isn't this a static batch input,
given that it is fed a text file?  In fact, I get a failure message if I
remove '--streaming'; I am not sure whether that is due to my setup again.
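
For question 1, here is a minimal sketch of how the output check could be
scripted. It assumes the text sink follows Beam's default shard naming
(<prefix>-SSSSS-of-NNNNN) instead of writing one file at the exact
"--output" path, and that the workers write to the same filesystem the
script runs on; neither is confirmed for this setup. The helper names
find_output_shards and count_lines are hypothetical.

```python
import glob


def find_output_shards(prefix):
    """Return the sorted paths of all files whose names start with prefix.

    Assumption: the sink appends a shard suffix such as -00000-of-00001
    to the --output value, so the exact path itself may never exist.
    """
    # glob.escape keeps special characters in the prefix from being
    # interpreted as wildcards.
    return sorted(glob.glob(glob.escape(prefix) + "*"))


def count_lines(paths):
    """Count the total number of lines across the given files."""
    total = 0
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            total += sum(1 for _ in handle)
    return total


if __name__ == "__main__":
    shards = find_output_shards("/tmp/py-wordcount-direct")
    if shards:
        print("found shards:", shards)
        print("total output lines:", count_lines(shards))
    else:
        print("no files match the output prefix on this filesystem")
```

If nothing matches, the output may have been written somewhere else (for
example inside a container's filesystem), which would have to be checked
separately.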


On Wed, Nov 14, 2018 at 7:51 AM Thomas Weise <t...@apache.org> wrote:

> Works for me on macOS as well.
>
> In case you don't launch the pipeline through Gradle, this would be the
> command:
>
> python -m apache_beam.examples.wordcount \
>   --input=/etc/profile \
>   --output=/tmp/py-wordcount-direct \
>   --runner=PortableRunner \
>   --job_endpoint=localhost:8099 \
>   --parallelism=1 \
>   --OPTIONALflink_master=localhost:8081 \
>   --streaming
>
> We talked about adding the wordcount to pre-commit..
>
> Regarding using ULR vs. Flink runner: There seems to be confusion between
> PortableRunner using the user supplied endpoint vs. trying to launch a job
> server. I commented in the doc.
>
> Thomas
>
>
>
> On Wed, Nov 14, 2018 at 3:30 AM Maximilian Michels <m...@apache.org> wrote:
>
>> Hi Ruoyun,
>>
>> I just ran the wordcount locally using the instructions on the page.
>> I've tried the local file system and GCS. Both times it ran successfully
>> and produced valid output.
>>
>> I'm assuming there is some problem with your setup. Which platform are
>> you using? I'm on MacOS.
>>
>> Could you expand on the planned merge? From my understanding we will
>> always need PortableRunner in Python to be able to submit against the
>> Beam JobServer.
>>
>> Thanks,
>> Max
>>
>> On 14.11.18 00:39, Ruoyun Huang wrote:
>> > A quick follow-up on using current PortableRunner.
>> >
>> > I followed the exact three steps as Ankur and Maximilian shared in
>> > https://beam.apache.org/roadmap/portability/#python-on-flink  ;   The
>> > wordcount example keeps hanging after 10 minutes.  I also tried
>> > specifying explicit input/output args, either using gcs folder or local
>> > file system, but none of them works.
>> >
>> > Spent some time looking into it but no conclusion yet.  At this point
>> > though, I guess it does not matter much any more, given we already have
>> > the plan of merging PortableRunner into using java reference runner
>> > (i.e. :beam-runners-reference-job-server).
>> >
>> > Still appreciated if someone can try out the python-on-flink
>> > <https://beam.apache.org/roadmap/portability/#python-on-flink> instructions
>> > in case it is just due to my local machine setup.  Thanks!
>> >
>> >
>> >
>> > On Thu, Nov 8, 2018 at 5:04 PM Ruoyun Huang <ruo...@google.com> wrote:
>> >
>> >     Thanks Maximilian!
>> >
>> >     I am working on migrating existing PortableRunner to using java ULR
>> >     (Link to Notes
>> >     <https://docs.google.com/document/d/1S86saZqiDaE_M5wxO0zOQ_rwC6QHv7sp1BmGTm0dLNE/edit#>).
>> >     If this issue is non-trivial to solve, I would vote for removing
>> >     this default behavior as part of the consolidation.
>> >
>> >     On Thu, Nov 8, 2018 at 2:58 AM Maximilian Michels <m...@apache.org> wrote:
>> >
>> >         In the long run, we should get rid of the Docker-inside-Docker
>> >         approach, which was only intended for testing anyways. It would be
>> >         cleaner to start the SDK harness container alongside with JobServer
>> >         container.
>> >
>> >         Short term, I think it should be easy to either fix the
>> >         permissions of the mounted "docker" executable or use a Docker
>> >         image for the JobServer which comes with Docker pre-installed.
>> >
>> >         JIRA: https://issues.apache.org/jira/browse/BEAM-6020
>> >
>> >         Thanks for reporting this Ruoyun!
>> >
>> >         -Max
>> >
>> >         On 08.11.18 00:10, Ruoyun Huang wrote:
>> >          > Thanks Ankur and Maximilian.
>> >          >
>> >          > Just for reference in case other people encountering the same error
>> >          > message, the "permission denied" error in my original email is exactly
>> >          > due to dockerinsidedocker issue that Ankur mentioned.  Thanks Ankur!
>> >          > Didn't make the link when you said it, had to discover that in a hard
>> >          > way (I thought it is due to my docker installation messed up).
>> >          >
>> >          > On Tue, Nov 6, 2018 at 1:53 AM Maximilian Michels <m...@apache.org> wrote:
>> >          >
>> >          >     Hi,
>> >          >
>> >          >     Please follow
>> >          > https://beam.apache.org/roadmap/portability/#python-on-flink
>> >          >
>> >          >     Cheers,
>> >          >     Max
>> >          >
>> >          >     On 06.11.18 01:14, Ankur Goenka wrote:
>> >          >      > Hi,
>> >          >      >
>> >          >      > The Portable Runner requires a job server uri to work with. The
>> >          >      > current default job server docker image is broken because of
>> >          >      > docker inside docker issue.
>> >          >      >
>> >          >      > Please refer to
>> >          >      > https://beam.apache.org/roadmap/portability/#python-on-flink for
>> >          >      > how to run a wordcount using Portable Flink Runner.
>> >          >      >
>> >          >      > Thanks,
>> >          >      > Ankur
>> >          >      >
>> >          >      > On Mon, Nov 5, 2018 at 3:41 PM Ruoyun Huang <ruo...@google.com> wrote:
>> >          >      >
>> >          >      >     Hi, Folks,
>> >          >      >
>> >          >      >           I want to try out Python PortableRunner, by using
>> >          >      >     following command:
>> >          >      >
>> >          >      >     *sdk/python: python -m apache_beam.examples.wordcount
>> >          >      >       --output=/tmp/test_output   --runner PortableRunner*
>> >          >      >
>> >          >      >           It complains with following error message:
>> >          >      >
>> >          >      >     Caused by: java.lang.Exception: The user defined 'open()' method
>> >          >      >     caused an exception: java.io.IOException: Cannot run program
>> >          >      >     "docker": error=13, Permission denied
>> >          >      >     at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
>> >          >      >     at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
>> >          >      >     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
>> >          >      >     ... 1 more
>> >          >      >     Caused by: org.apache.beam.repackaged.beam_runners_java_fn_execution.com.google.common.util.concurrent.UncheckedExecutionException:
>> >          >      >     java.io.IOException: Cannot run program "docker": error=13,
>> >          >      >     Permission denied
>> >          >      >     at org.apache.beam.repackaged.beam_runners_java_fn_execution.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4994)
>> >          >      >     ... 7 more
>> >          >      >
>> >          >      >
>> >          >      >
>> >          >      >     My py2 environment is properly configured, because DirectRunner
>> >          >      >     works.  Also I tested my docker installation by 'docker run
>> >          >      >     hello-world ', no issue.
>> >          >      >
>> >          >      >
>> >          >      >     Thanks.
>> >          >      >     --
>> >          >      >     ================
>> >          >      >     Ruoyun  Huang
>> >          >      >
>> >          >
>> >          >
>> >          >
>> >          > --
>> >          > ================
>> >          > Ruoyun  Huang
>> >          >
>> >
>> >
>> >
>> >     --
>> >     ================
>> >     Ruoyun  Huang
>> >
>> >
>> >
>> > --
>> > ================
>> > Ruoyun  Huang
>> >
>>
>

-- 
================
Ruoyun  Huang
