Unfortunately, the Flink job server still doesn't work consistently on my machine. Funny thing is, it did work ONCE (:beam-sdks-python:portableWordCount BUILD SUCCESSFUL, finished in 18s). When I tried again, things were back to hanging, with the server printing messages like:
""" [flink-akka.actor.default-dispatcher-25] DEBUG org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Received slot report from instance 1ad9060bcc87cf5fd19c9a233c15a18f. [flink-akka.actor.default-dispatcher-25] DEBUG org.apache.flink.runtime.jobmaster.JobMaster - Trigger heartbeat request. [flink-akka.actor.default-dispatcher-23] DEBUG org.apache.flink.runtime.taskexecutor.TaskExecutor - Received heartbeat request from 006b3653dc7a24471c115d70c4c55fa6. [flink-akka.actor.default-dispatcher-25] DEBUG org.apache.flink.runtime.jobmaster.JobMaster - Received heartbeat from e188c32c-cfa5-4b85-bda9-16ce4742c490. ... repeat above forever after 5 minutes. """ I am trying to figure out what I did right for that one time succeeded run. For the step 3 Thomas mentioned, all I did for cleanup is "gradle clean", if there are actually more to do, please kindly let me know. On Mon, Nov 19, 2018 at 6:00 AM Maximilian Michels <m...@apache.org> wrote: > Thanks for investing, Thomas! > > Ruoyun, does that solve the WordCount problem you were experiencing? > > -Max > > On 19.11.18 04:53, Thomas Weise wrote: > > With latest master the problem seems fixed. Unfortunately that was first > > masked by build and docker issues. But I changed multiple things at once > > after getting nowhere (the container build "succeeded" when in fact it > > did not): > > > > * Update to latest docker > > * Increase docker disk space after seeing a spurious, non-reproducible > > message in one of the build attempts > > * Full clean and manually remove Go build residuals from the workspace > > > > After that I could see Go and container builds execute differently > > (longer build time) and the result certainly looks better.. 
> >
> > HTH,
> > Thomas
> >
> > On Sun, Nov 18, 2018 at 2:11 PM Ruoyun Huang <ruo...@google.com
> > <mailto:ruo...@google.com>> wrote:
> >
> >     I was after the same issue (I was using the reference runner job
> >     server, but same error message), and had some clues but no conclusion
> >     yet.
> >
> >     By retaining the container instance, the error message says "bad MD5"
> >     (see the other thread [1] I asked about on dev last week). My
> >     hypothesis, based on the symptoms, is that the underlying container
> >     expects an MD5 to validate staged files, but the job request from the
> >     Python SDK does not send a file hash. Hope someone can confirm whether
> >     that is the case (I am still trying to understand why Dataflow does
> >     not have this issue), and if so, the best way to fix it.
> >
> >     [1]
> >     https://lists.apache.org/thread.html/b26560087ff88f142e26d66c8a5a9283558c8e55b5edd705b5e53c9c@%3Cdev.beam.apache.org%3E
> >
> >     On Fri, Nov 16, 2018 at 7:06 PM Thomas Weise <t...@apache.org
> >     <mailto:t...@apache.org>> wrote:
> >
> >         Since the last few days, the steps under
> >         https://beam.apache.org/roadmap/portability/#python-on-flink
> >         are broken.
> >
> >         The gradle task hangs because the job server isn't able to
> >         launch the docker container.
> >
> >         ./gradlew :beam-sdks-python:portableWordCount -PjobEndpoint=localhost:8099
> >
> >         [CHAIN MapPartition (MapPartition at
> >         36write/Write/WriteImpl/DoOnce/Impulse.None/beam:env:docker:v1:0) ->
> >         FlatMap (FlatMap at
> >         36write/Write/WriteImpl/DoOnce/Impulse.None/beam:env:docker:v1:0/out.0)
> >         (8/8)] INFO
> >         org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory
> >         - Still waiting for startup of environment
> >         tweise-docker-apache.bintray.io/beam/python:latest
> >         <http://tweise-docker-apache.bintray.io/beam/python:latest> for
> >         worker id 1
> >
> >         Unfortunately this isn't covered by tests yet. Is anyone aware
> >         what change may have caused this or looking into resolving it?
> >
> >         Thanks,
> >         Thomas
> >
> >
> > --
> > ================
> > Ruoyun Huang

--
================
Ruoyun Huang