1) The default behavior, where PortableRunner starts a Flink server. It is
confusing to new users.
It does that only if no JobServer endpoint is specified. AFAIK there are
problems with the bootstrapping; it can definitely be improved.
2) All the related docs and inline comments. Similarly, it could be very
confusing connecting PortableRunner to Flink server.
+1 We definitely need to improve docs and usability.
3) [Probably no longer an issue]. I couldn't get the Flink server example working, and I couldn't get the example working on Java-ULR either.
AFAIK the Java ULR hasn't received love for a long time.
-Max
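To make point 1 concrete, here is a minimal sketch of how the wordcount invocation from this thread could be built with an explicit job endpoint, so that PortableRunner submits to an already-running JobServer instead of bootstrapping one. The helper name `wordcount_command` and the address `localhost:8099` are assumptions for illustration only; `--job_endpoint` is the Beam pipeline option for pointing at a JobServer.

```python
import shlex


def wordcount_command(output, job_endpoint=None):
    """Build the argv for the wordcount example used in this thread.

    `job_endpoint` is a hypothetical "host:port" of an already-running
    JobServer. When it is None the flag is omitted, which is the case in
    which PortableRunner tries to start a server on its own.
    """
    argv = [
        "python", "-m", "apache_beam.examples.wordcount",
        "--output=" + output,
        "--runner=PortableRunner",
    ]
    if job_endpoint is not None:
        argv.append("--job_endpoint=" + job_endpoint)
    return argv


if __name__ == "__main__":
    # localhost:8099 is an assumed address; use whatever your JobServer reports.
    command = wordcount_command("/tmp/test_output", job_endpoint="localhost:8099")
    print(" ".join(shlex.quote(arg) for arg in command))
```

Keeping the endpoint an explicit argument makes it obvious from the command line alone whether the implicit bootstrapping path is being exercised.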
On 14.11.18 20:57, Ruoyun Huang wrote:
To answer Maximilian's question:
I am using Linux (Debian distribution).
I probably overstated things when I used the phrase 'planned merge'. What
I really meant entails less change than it sounds. More specifically:
1) The default behavior, where PortableRunner starts a flink server. It
is confusing to new users.
2) All the related docs and inline comments. Similarly, it could be
very confusing connecting PortableRunner to Flink server.
3) [Probably no longer an issue]. I couldn't get the Flink server
example working, and I couldn't get the example working on Java-ULR
either. Both will require debugging to resolve. Thus I figured we could
focus on one single thing: the Java-ULR part, without
worrying about the Flink server. Again, it looks like this may not be a valid
concern, given the Flink part is most likely due to my setup.
On Wed, Nov 14, 2018 at 3:30 AM Maximilian Michels <m...@apache.org> wrote:
Hi Ruoyun,
I just ran the wordcount locally using the instructions on the page.
I've tried the local file system and GCS. Both times it ran successfully
and produced valid output.
I'm assuming there is some problem with your setup. Which platform are
you using? I'm on MacOS.
Could you expand on the planned merge? From my understanding we will
always need PortableRunner in Python to be able to submit against the
Beam JobServer.
Thanks,
Max
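When a submission hangs rather than fails, one thing worth ruling out is whether anything is listening on the job endpoint at all. Below is a minimal sketch using only the Python standard library; the function name `endpoint_reachable` and the address `localhost:8099` are assumptions for illustration. A successful TCP connect only shows that something accepts connections there, not that it is a healthy JobServer.

```python
import socket


def endpoint_reachable(endpoint, timeout=2.0):
    """Return True if a TCP connection to "host:port" can be opened."""
    host, _, port = endpoint.rpartition(":")
    if not host:
        # No "host:port" separator found, so this is not a usable endpoint.
        return False
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts and DNS failures;
        # ValueError covers a non-numeric port.
        return False


if __name__ == "__main__":
    # Assumed address; replace with the endpoint your JobServer reports.
    print("job endpoint reachable:", endpoint_reachable("localhost:8099"))
```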
On 14.11.18 00:39, Ruoyun Huang wrote:
> A quick follow-up on using current PortableRunner.
>
> I followed the exact three steps that Ankur and Maximilian shared in
> https://beam.apache.org/roadmap/portability/#python-on-flink. The
> wordcount example is still hanging after 10 minutes. I also tried
> specifying explicit input/output args, using either a GCS folder or the
> local file system, but neither works.
>
> I spent some time looking into it but have no conclusion yet. At this point
> though, I guess it does not matter much any more, given we already have
> the plan of merging PortableRunner into using the Java reference runner
> (i.e. :beam-runners-reference-job-server).
>
> I would still appreciate it if someone could try out the python-on-flink
> instructions (https://beam.apache.org/roadmap/portability/#python-on-flink)
> in case it is just due to my local machine setup. Thanks!
>
>
>
> On Thu, Nov 8, 2018 at 5:04 PM Ruoyun Huang <ruo...@google.com> wrote:
>
> Thanks Maximilian!
>
> I am working on migrating the existing PortableRunner to use the Java ULR
> (Link to Notes:
> https://docs.google.com/document/d/1S86saZqiDaE_M5wxO0zOQ_rwC6QHv7sp1BmGTm0dLNE/edit#).
> If this issue is non-trivial to solve, I would vote for removing
> this default behavior as part of the consolidation.
>
> On Thu, Nov 8, 2018 at 2:58 AM Maximilian Michels <m...@apache.org> wrote:
>
> In the long run, we should get rid of the Docker-inside-Docker approach,
> which was only intended for testing anyway. It would be cleaner to
> start the SDK harness container alongside the JobServer container.
>
> Short term, I think it should be easy to either fix the permissions of
> the mounted "docker" executable or use a Docker image for the JobServer
> which comes with Docker pre-installed.
>
> JIRA: https://issues.apache.org/jira/browse/BEAM-6020
>
> Thanks for reporting this Ruoyun!
>
> -Max
>
> On 08.11.18 00:10, Ruoyun Huang wrote:
> > Thanks Ankur and Maximilian.
> >
> > Just for reference, in case other people encounter the same error
> > message: the "permission denied" error in my original email is exactly
> > due to the docker-inside-docker issue that Ankur mentioned. Thanks Ankur!
> > I didn't make the link when you said it and had to discover that the hard
> > way (I thought it was due to my docker installation being messed up).
> >
> > On Tue, Nov 6, 2018 at 1:53 AM Maximilian Michels <m...@apache.org> wrote:
> >
> > Hi,
> >
> > Please follow
> > https://beam.apache.org/roadmap/portability/#python-on-flink
> >
> > Cheers,
> > Max
> >
> > On 06.11.18 01:14, Ankur Goenka wrote:
> > > Hi,
> > >
> > > The Portable Runner requires a job server URI to work with. The current
> > > default job server docker image is broken because of the
> > > docker-inside-docker issue.
> > >
> > > Please refer to
> > > https://beam.apache.org/roadmap/portability/#python-on-flink for how to
> > > run a wordcount using the Portable Flink Runner.
> > >
> > > Thanks,
> > > Ankur
> > >
> > > On Mon, Nov 5, 2018 at 3:41 PM Ruoyun Huang <ruo...@google.com> wrote:
> > >
> > > Hi, Folks,
> > >
> > > I want to try out the Python PortableRunner, using the following
> > > command:
> > >
> > > *sdk/python: python -m apache_beam.examples.wordcount
> > > --output=/tmp/test_output --runner PortableRunner*
> > >
> > > It complains with the following error message:
> > >
> > > Caused by: java.lang.Exception: The user defined 'open()' method
> > > caused an exception: java.io.IOException: Cannot run program
> > > "docker": error=13, Permission denied
> > > at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> > > at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> > > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> > > ... 1 more
> > > Caused by:
> > > org.apache.beam.repackaged.beam_runners_java_fn_execution.com.google.common.util.concurrent.UncheckedExecutionException:
> > > java.io.IOException: Cannot run program "docker": error=13,
> > > Permission denied
> > > at org.apache.beam.repackaged.beam_runners_java_fn_execution.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4994)
> > > ... 7 more
> > >
> > >
> > >
> > > My py2 environment is properly configured, because DirectRunner
> > > works. I also tested my docker installation with 'docker run
> > > hello-world'; no issue.
> > >
> > >
> > > Thanks.
> > > --
> > > ================
> > > Ruoyun Huang
> > >
--
================
Ruoyun Huang
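
The `error=13, Permission denied` message quoted in the original report means the process that tried to launch `docker` found the file but was not allowed to execute it. A minimal sketch for telling that case apart from a missing binary follows; the function name `classify_executable` is hypothetical, and the check is only meaningful when run in the same environment as the process that launches `docker` (for example, inside the JobServer container), not merely on the host.

```python
import os


def classify_executable(name, search_path=None):
    """Return 'ok', 'not-executable', or 'missing' for `name` on a PATH.

    The PATH is scanned by hand because shutil.which() skips files that
    are present but lack execute permission, which is the case of interest.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    found_non_executable = False
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            if os.access(candidate, os.X_OK):
                return "ok"
            found_non_executable = True
    return "not-executable" if found_non_executable else "missing"


if __name__ == "__main__":
    print("docker:", classify_executable("docker"))
```

A result of `not-executable` would correspond to the mounted-executable permission problem discussed in the thread, while `missing` would point to an image without Docker installed.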