I think beam is a fantastic software . Its a great pleasure contributing. Thanks, Yu
On Fri, Sep 27, 2019 at 9:09 AM Robert Bradshaw <[email protected]> wrote: > https://issues.apache.org/jira/browse/BEAM-8312 should help with this. In > summary, I would say that portable beam-on-Flink is basically feature > complete, but we're still smoothing out some of the ease-of-use issues. And > feedback like this really helps, so thanks! > > On Tue, Sep 24, 2019 at 8:10 PM Yu Watanabe <[email protected]> wrote: > >> I needed little adjustment with the pipeline option to work it out. >> >> Pipeline option required 'job_endpoint' when using manual start up of >> job server using jar file. >> ========================================================= >> options = PipelineOptions([ >> "--runner=PortableRunner", >> "--environment_config= >> asia.gcr.io/creationline001/beam/python3:latest", >> "--environment_type=DOCKER", >> "--experiments=beam_fn_api", >> "--job_endpoint=localhost:8099" >> ]) >> ========================================================= >> >> Otherwise, runner spins up job server container. >> ========================================================== >> ERROR:root:Starting job service with ['docker', 'run', '-v', >> '/usr/bin/docker:/bin/docker', '-v', >> '/var/run/docker.sock:/var/run/docker.sock', '--network=host', ' >> admin-docker-apache.bintray.io/beam/flink-job-server:latest', >> '--job-host', 'localhost', '--job-port', '42915', '--artifact-port', >> '56561', '--expansion-port', '53857'] >> ERROR:root:Error bringing up job service >> ========================================================== >> >> Without "job_endpoint" option default job server is used. >> ========================================================== >> >> https://github.com/apache/beam/blob/7931ec055e2da7214c82e368ef7d7fd679faaef1/sdks/python/apache_beam/runners/portability/portable_runner.py#L176 >> server = self.default_job_server(options) >> >> => >> >> https://github.com/apache/beam/blob/d963aeb91a63f165b5ff1ebf6add8275aec204f1/sdks/python/apache_beam/runners/portability/job_server.py#L172 >> "-docker-apache.bintray.io/beam/flink-job-server:latest" >> ========================================================== >> >> Thanks, >> Yu >> >> On Tue, Sep 24, 2019 at 2:06 PM Ankur Goenka <[email protected]> wrote: >> >>> Flink job server does not have artifacts-dir option yet. >>> We have a PR to add it https://github.com/apache/beam/pull/9648 >>> >>> However, for now you can do a few manual steps to achieve this. >>> >>> Start Job server. >>> >>> 1. Download >>> https://repo.maven.apache.org/maven2/org/apache/beam/beam-runners-flink-1.8-job-server/2.15.0/beam-runners-flink-1.8-job-server-2.15.0.jar >>> >>> 2. Start the job server >>> java -jar >>> /home/admin/.apache_beam/cache/beam-runners-flink-1.8-job-server-2.15.0.jar >>> --flink-master-url ip-172-31-12-113.ap-northeast-1.compute.internal:8081 >>> --artifacts-dir /home/ec2-user --job-port 8099 >>> >>> 3. Submit your pipeline >>> options = PipelineOptions([ >>> "--runner=PortableRunner", >>> "--environment_config= >>> asia.gcr.io/creationline001/beam/python3:latest", >>> "--environment_type=DOCKER", >>> "--experiments=beam_fn_api" >>> ]) >>> >>> On Mon, Sep 23, 2019 at 8:20 PM Yu Watanabe <[email protected]> >>> wrote: >>> >>>> Kyle. >>>> >>>> Thank you for the assistance. >>>> >>>> Looks like "--artifacts-dir" is not parsed. Below is the DEBUG log of >>>> the pipline. >>>> ==================================================================== >>>> WARNING:root:Discarding unparseable args: ['--flink_version=1.8', >>>> '--flink_master_url=ip-172-31-12-113.ap-northeast-1.compute.internal:8081', >>>> '--artifacts-dir=/home/ec2-user'] >>>> ... >>>> WARNING:root:Downloading job server jar from >>>> https://repo.maven.apache.org/maven2/org/apache/beam/beam-runners-flink-1.8-job-server/2.15.0/beam-runners-flink-1.8-job-server-2.15.0.jar >>>> DEBUG:root:Starting job service with ['java', '-jar', >>>> '/home/admin/.apache_beam/cache/beam-runners-flink-1.8-job-server-2.15.0.jar', >>>> '--flink-master-url', >>>> 'ip-172-31-12-113.ap-northeast-1.compute.internal:8081', '--artifacts-dir', >>>> '/tmp/artifactsicq2kj8c', '--job-port', 41757, '--artifact-port', 0, >>>> '--expansion-port', 0] >>>> DEBUG:root:Waiting for jobs grpc channel to be ready at localhost:41757. >>>> ... >>>> DEBUG:root:Runner option 'job_endpoint' was already added >>>> DEBUG:root:Runner option 'sdk_worker_parallelism' was already added >>>> WARNING:root:Discarding unparseable args: >>>> ['--artifacts-dir=/home/admin'] >>>> ... >>>> ==================================================================== >>>> >>>> My pipeline option . >>>> ==================================================================== >>>> options = PipelineOptions([ >>>> "--runner=FlinkRunner", >>>> "--flink_version=1.8", >>>> >>>> "--flink_master_url=ip-172-31-12-113.ap-northeast-1.compute.internal:8081" >>>> , >>>> "--environment_config= >>>> asia.gcr.io/creationline001/beam/python3:latest", >>>> "--environment_type=DOCKER", >>>> "--experiments=beam_fn_api", >>>> "--artifacts-dir=/home/admin" >>>> ]) >>>> ==================================================================== >>>> >>>> Tracing from the log , looks like artifacts dir respects the default >>>> tempdir of OS. >>>> Thus, to adjust it I will need environment variable instead. I used >>>> 'TMPDIR' in my case. >>>> ==================================================================== >>>> >>>> https://github.com/apache/beam/blob/7931ec055e2da7214c82e368ef7d7fd679faaef1/sdks/python/apache_beam/runners/portability/job_server.py#L203 >>>> artifacts_dir = self.local_temp_dir(prefix='artifacts') >>>> >>>> => >>>> >>>> https://github.com/apache/beam/blob/7931ec055e2da7214c82e368ef7d7fd679faaef1/sdks/python/apache_beam/runners/portability/job_server.py#L157 >>>> return tempfile.mkdtemp(dir=self._local_temp_root, **kwargs) >>>> >>>> => >>>> >>>> https://github.com/apache/beam/blob/7931ec055e2da7214c82e368ef7d7fd679faaef1/sdks/python/apache_beam/runners/portability/job_server.py#L102 >>>> self._local_temp_root = None >>>> ==================================================================== >>>> >>>> Then it worked. >>>> ==================================================================== >>>> (python-bm-2150) admin@ip-172-31-9-89:~$ env | grep TMPDIR >>>> TMPDIR=/home/admin >>>> >>>> => >>>> DEBUG:root:Starting job service with ['java', '-jar', >>>> '/home/admin/.apache_beam/cache/beam-runners-flink-1.8-job-server-2.15.0.jar', >>>> '--flink-master-url', >>>> 'ip-172-31-12-113.ap-northeast-1.compute.internal:8081', '--artifacts-dir', >>>> '/home/admin/artifacts18unmki3', '--job-port', 46767, '--artifact-port', 0, >>>> '--expansion-port', 0] >>>> ==================================================================== >>>> >>>> I will use nfs server for now to share the artifact directory with >>>> worker nodes. >>>> >>>> Thanks. >>>> Yu Watanabe >>>> >>>> On Tue, Sep 24, 2019 at 9:50 AM Kyle Weaver <[email protected]> >>>> wrote: >>>> >>>>> The relevant configuration flag for the job server is >>>>> `--artifacts-dir`. >>>>> >>>>> @Robert Bradshaw <[email protected]> I added this info to the log >>>>> message: https://github.com/apache/beam/pull/9646 >>>>> >>>>> Kyle Weaver | Software Engineer | github.com/ibzib | >>>>> [email protected] >>>>> >>>>> >>>>> On Mon, Sep 23, 2019 at 11:36 AM Robert Bradshaw <[email protected]> >>>>> wrote: >>>>> >>>>>> You need to set your artifact directory to point to a >>>>>> distributed filesystem that's also accessible to the workers (when >>>>>> starting >>>>>> up the job server). >>>>>> >>>>>> On Mon, Sep 23, 2019 at 5:43 AM Yu Watanabe <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> Hello. >>>>>>> >>>>>>> I am working on flink runner (2.15.0) and would like to ask question >>>>>>> about how to solve my error. >>>>>>> >>>>>>> Currently , I have a remote cluster deployed as below . (please see >>>>>>> slide1) >>>>>>> All master and worker nodes are installed on different server from >>>>>>> apache beam. >>>>>>> >>>>>>> >>>>>>> https://drive.google.com/file/d/1vBULp6kiEfQNGVV3Nl2mMKAZZKYsb11h/view?usp=sharing >>>>>>> >>>>>>> When I run beam pipeline, harness container tries to start up, >>>>>>> however, fails immediately with below error on docker side. >>>>>>> >>>>>>> ===================================================================================== >>>>>>> Sep 23 21:04:05 ip-172-31-0-143 51106514ffc0[7920]: 2019/09/23 >>>>>>> 12:04:05 Initializing python harness: /opt/apache/beam/boot --id=1 >>>>>>> --logging_endpoint=localhost:34227 --artifact_endpoint=localhost:45303 >>>>>>> --provision_endpoint=localhost:44585 --control_endpoint=localhost:43869 >>>>>>> Sep 23 21:04:05 ip-172-31-0-143 dockerd: >>>>>>> time="2019-09-23T21:04:05.380942292+09:00" level=debug msg=event >>>>>>> module=libcontainerd namespace=moby topic=/tasks/start >>>>>>> Sep 23 21:04:05 ip-172-31-0-143 51106514ffc0[7920]: 2019/09/23 >>>>>>> 12:04:05 Failed to retrieve staged files: failed to get manifest >>>>>>> Sep 23 21:04:05 ip-172-31-0-143 51106514ffc0[7920]: #011caused by: >>>>>>> Sep 23 21:04:05 ip-172-31-0-143 51106514ffc0[7920]: rpc error: code >>>>>>> = Unknown desc = >>>>>>> >>>>>>> ===================================================================================== >>>>>>> >>>>>>> At the same time, task manager logs below error. >>>>>>> >>>>>>> ===================================================================================== >>>>>>> 2019-09-23 21:04:05,525 INFO >>>>>>> >>>>>>> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService >>>>>>> - GetManifest for >>>>>>> /tmp/artifactsfkyik3us/job_de145881-9ea7-4e44-8e6d-31a6ea298010/MANIFEST >>>>>>> 2019-09-23 21:04:05,526 INFO >>>>>>> >>>>>>> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService >>>>>>> - Loading manifest for retrieval token >>>>>>> /tmp/artifactsfkyik3us/job_de145881-9ea7-4e44-8e6d-31a6ea298010/MANIFEST >>>>>>> 2019-09-23 21:04:05,531 INFO >>>>>>> >>>>>>> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService >>>>>>> - GetManifest for >>>>>>> /tmp/artifactsfkyik3us/job_de145881-9ea7-4e44-8e6d-31a6ea298010/MANIFEST >>>>>>> failed >>>>>>> java.util.concurrent.ExecutionException: >>>>>>> java.io.FileNotFoundException: >>>>>>> /tmp/artifactsfkyik3us/job_de145881-9ea7-4e44-8e6d-31a6ea298010/MANIFEST >>>>>>> (No such file or directory) >>>>>>> at >>>>>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:531) >>>>>>> ... >>>>>>> >>>>>>> ===================================================================================== >>>>>>> >>>>>>> I see this artifact directory on the server where beam pipeline is >>>>>>> executed but not on worker node. >>>>>>> >>>>>>> ===================================================================================== >>>>>>> # Beam server >>>>>>> (python-bm-2150) admin@ip-172-31-9-89:~$ sudo ls -ld >>>>>>> /tmp/artifactsfkyik3us >>>>>>> drwx------ 3 admin admin 4096 Sep 23 12:03 /tmp/artifactsfkyik3us >>>>>>> >>>>>>> # Flink worker node >>>>>>> [ec2-user@ip-172-31-0-143 flink]$ sudo ls -ld /tmp/artifactsfkyik3us >>>>>>> ls: cannot access /tmp/artifactsfkyik3us: No such file or directory >>>>>>> >>>>>>> >>>>>>> ===================================================================================== >>>>>>> >>>>>>> From the error, it seems that container is not starting up correctly >>>>>>> due to manifest file is missing. >>>>>>> What would be a good approach to reference artifact directory from >>>>>>> worker node? >>>>>>> I appreciate if I could get some advice . >>>>>>> >>>>>>> Best Regards, >>>>>>> Yu Watanabe >>>>>>> >>>>>>> -- >>>>>>> Yu Watanabe >>>>>>> Weekend Freelancer who loves to challenge building data platform >>>>>>> [email protected] >>>>>>> [image: LinkedIn icon] <https://www.linkedin.com/in/yuwatanabe1> >>>>>>> [image: >>>>>>> Twitter icon] <https://twitter.com/yuwtennis> >>>>>>> >>>>>> >>>> >>>> -- >>>> Yu Watanabe >>>> Weekend Freelancer who loves to challenge building data platform >>>> [email protected] >>>> [image: LinkedIn icon] <https://www.linkedin.com/in/yuwatanabe1> [image: >>>> Twitter icon] <https://twitter.com/yuwtennis> >>>> >>> >> >> -- >> Yu Watanabe >> Weekend Freelancer who loves to challenge building data platform >> [email protected] >> [image: LinkedIn icon] <https://www.linkedin.com/in/yuwatanabe1> [image: >> Twitter icon] <https://twitter.com/yuwtennis> >> > -- Yu Watanabe Weekend Freelancer who loves to challenge building data platform [email protected] [image: LinkedIn icon] <https://www.linkedin.com/in/yuwatanabe1> [image: Twitter icon] <https://twitter.com/yuwtennis>
