Ankur.

Thank you for the comment.

Manual startup with the job server looks more solid.
Setting TMPDIR and then using the Flink runner felt a bit hacky.

Thanks,
Yu

On Tue, Sep 24, 2019 at 2:06 PM Ankur Goenka <[email protected]> wrote:

> The FlinkRunner does not expose an artifacts-dir option yet.
> We have a PR to add it: https://github.com/apache/beam/pull/9648
>
> However, for now you can achieve this with a few manual steps:
>
> 1. Download
> https://repo.maven.apache.org/maven2/org/apache/beam/beam-runners-flink-1.8-job-server/2.15.0/beam-runners-flink-1.8-job-server-2.15.0.jar
>
> 2. Start the job server
> java -jar
> /home/admin/.apache_beam/cache/beam-runners-flink-1.8-job-server-2.15.0.jar
> --flink-master-url ip-172-31-12-113.ap-northeast-1.compute.internal:8081
> --artifacts-dir /home/ec2-user --job-port 8099
>
> 3. Submit your pipeline
> options = PipelineOptions([
>     "--runner=PortableRunner",
>     "--environment_config=asia.gcr.io/creationline001/beam/python3:latest",
>     "--environment_type=DOCKER",
>     "--experiments=beam_fn_api"
> ])
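[Editor's note: a hedged sketch for the archives. When submitting to a manually started job server, the PortableRunner also needs to be pointed at it. The --job_endpoint line below is an addition matching the --job-port 8099 from step 2; it is not in the original snippet above.]

```python
from apache_beam.options.pipeline_options import PipelineOptions

# Sketch only: --job_endpoint tells the PortableRunner where the
# manually started job server (step 2, --job-port 8099) is listening.
options = PipelineOptions([
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",
    "--environment_config=asia.gcr.io/creationline001/beam/python3:latest",
    "--environment_type=DOCKER",
    "--experiments=beam_fn_api",
])
```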
>
> On Mon, Sep 23, 2019 at 8:20 PM Yu Watanabe <[email protected]> wrote:
>
>> Kyle.
>>
>> Thank you for the assistance.
>>
>> Looks like "--artifacts-dir" is not parsed. Below is the DEBUG log of the
>> pipeline.
>> ====================================================================
>> WARNING:root:Discarding unparseable args: ['--flink_version=1.8',
>> '--flink_master_url=ip-172-31-12-113.ap-northeast-1.compute.internal:8081',
>> '--artifacts-dir=/home/ec2-user']
>> ...
>> WARNING:root:Downloading job server jar from
>> https://repo.maven.apache.org/maven2/org/apache/beam/beam-runners-flink-1.8-job-server/2.15.0/beam-runners-flink-1.8-job-server-2.15.0.jar
>> DEBUG:root:Starting job service with ['java', '-jar',
>> '/home/admin/.apache_beam/cache/beam-runners-flink-1.8-job-server-2.15.0.jar',
>> '--flink-master-url',
>> 'ip-172-31-12-113.ap-northeast-1.compute.internal:8081', '--artifacts-dir',
>> '/tmp/artifactsicq2kj8c', '--job-port', 41757, '--artifact-port', 0,
>> '--expansion-port', 0]
>> DEBUG:root:Waiting for jobs grpc channel to be ready at localhost:41757.
>> ...
>> DEBUG:root:Runner option 'job_endpoint' was already added
>> DEBUG:root:Runner option 'sdk_worker_parallelism' was already added
>> WARNING:root:Discarding unparseable args: ['--artifacts-dir=/home/admin']
>> ...
>> ====================================================================
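[Editor's note: the "Discarding unparseable args" warning comes from the SDK keeping only flags that a registered options class declares and treating everything else as unknown. A rough stdlib sketch of that mechanism, with argparse standing in for Beam's actual option parsing and the flag names taken from the log above:]

```python
import argparse

# Stand-in for Beam's PipelineOptions parsing: only flags that some
# registered options class declares are recognized; the rest end up
# in the "unparseable args" bucket and are discarded with a warning.
parser = argparse.ArgumentParser()
parser.add_argument("--flink_master_url")  # a recognized SDK option

known, unknown = parser.parse_known_args([
    "--flink_master_url=ip-172-31-12-113.ap-northeast-1.compute.internal:8081",
    "--artifacts-dir=/home/admin",  # a job-server flag, unknown to the SDK
])

print(unknown)  # ['--artifacts-dir=/home/admin']
```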
>>
>> My pipeline options:
>> ====================================================================
>> options = PipelineOptions([
>>     "--runner=FlinkRunner",
>>     "--flink_version=1.8",
>>     "--flink_master_url=ip-172-31-12-113.ap-northeast-1.compute.internal:8081",
>>     "--environment_config=asia.gcr.io/creationline001/beam/python3:latest",
>>     "--environment_type=DOCKER",
>>     "--experiments=beam_fn_api",
>>     "--artifacts-dir=/home/admin"
>> ])
>> ====================================================================
>>
>> Tracing through the log, it looks like the artifacts dir falls back to
>> the default temp directory of the OS.
>> Thus, to adjust it I needed an environment variable instead; I used
>> 'TMPDIR' in my case.
>> ====================================================================
>>
>> https://github.com/apache/beam/blob/7931ec055e2da7214c82e368ef7d7fd679faaef1/sdks/python/apache_beam/runners/portability/job_server.py#L203
>>     artifacts_dir = self.local_temp_dir(prefix='artifacts')
>>
>> =>
>>
>> https://github.com/apache/beam/blob/7931ec055e2da7214c82e368ef7d7fd679faaef1/sdks/python/apache_beam/runners/portability/job_server.py#L157
>>     return tempfile.mkdtemp(dir=self._local_temp_root, **kwargs)
>>
>> =>
>>
>> https://github.com/apache/beam/blob/7931ec055e2da7214c82e368ef7d7fd679faaef1/sdks/python/apache_beam/runners/portability/job_server.py#L102
>>     self._local_temp_root = None
>> ====================================================================
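[Editor's note: the fallback chain traced above bottoms out in Python's tempfile module, whose default directory on POSIX honors TMPDIR. A minimal sketch of that behavior, using a scratch directory as a stand-in for /home/admin:]

```python
import os
import tempfile

# tempfile.mkdtemp(dir=None) falls back to the platform default temp
# directory, which on POSIX is taken from TMPDIR when it is set.
scratch = tempfile.mkdtemp()  # stand-in for /home/admin
os.environ["TMPDIR"] = scratch
tempfile.tempdir = None  # drop the cached default so TMPDIR is re-read

artifacts_dir = tempfile.mkdtemp(prefix="artifacts")
print(artifacts_dir.startswith(scratch))  # True
```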
>>
>> Then it worked.
>> ====================================================================
>> (python-bm-2150) admin@ip-172-31-9-89:~$ env | grep TMPDIR
>> TMPDIR=/home/admin
>>
>> =>
>> DEBUG:root:Starting job service with ['java', '-jar',
>> '/home/admin/.apache_beam/cache/beam-runners-flink-1.8-job-server-2.15.0.jar',
>> '--flink-master-url',
>> 'ip-172-31-12-113.ap-northeast-1.compute.internal:8081', '--artifacts-dir',
>> '/home/admin/artifacts18unmki3', '--job-port', 46767, '--artifact-port', 0,
>> '--expansion-port', 0]
>> ====================================================================
>>
>> I will use an NFS server for now to share the artifact directory with the
>> worker nodes.
>>
>> Thanks.
>> Yu Watanabe
>>
>> On Tue, Sep 24, 2019 at 9:50 AM Kyle Weaver <[email protected]> wrote:
>>
>>> The relevant configuration flag for the job server is `--artifacts-dir`.
>>>
>>> @Robert Bradshaw <[email protected]> I added this info to the log
>>> message: https://github.com/apache/beam/pull/9646
>>>
>>> Kyle Weaver | Software Engineer | github.com/ibzib | [email protected]
>>>
>>>
>>> On Mon, Sep 23, 2019 at 11:36 AM Robert Bradshaw <[email protected]>
>>> wrote:
>>>
>>>> You need to set your artifact directory to point to a
>>>> distributed filesystem that's also accessible to the workers (when starting
>>>> up the job server).
>>>>
>>>> On Mon, Sep 23, 2019 at 5:43 AM Yu Watanabe <[email protected]>
>>>> wrote:
>>>>
>>>>> Hello.
>>>>>
>>>>> I am working with the Flink runner (2.15.0) and would like to ask a
>>>>> question about how to solve an error.
>>>>>
>>>>> Currently, I have a remote cluster deployed as below (please see
>>>>> slide 1). The Flink master and worker nodes are all installed on
>>>>> servers separate from the one running Apache Beam.
>>>>>
>>>>>
>>>>> https://drive.google.com/file/d/1vBULp6kiEfQNGVV3Nl2mMKAZZKYsb11h/view?usp=sharing
>>>>>
>>>>> When I run the Beam pipeline, the harness container tries to start up
>>>>> but fails immediately with the error below on the Docker side.
>>>>>
>>>>> =====================================================================================
>>>>> Sep 23 21:04:05 ip-172-31-0-143 51106514ffc0[7920]: 2019/09/23
>>>>> 12:04:05 Initializing python harness: /opt/apache/beam/boot --id=1
>>>>> --logging_endpoint=localhost:34227 --artifact_endpoint=localhost:45303
>>>>> --provision_endpoint=localhost:44585 --control_endpoint=localhost:43869
>>>>> Sep 23 21:04:05 ip-172-31-0-143 dockerd:
>>>>> time="2019-09-23T21:04:05.380942292+09:00" level=debug msg=event
>>>>> module=libcontainerd namespace=moby topic=/tasks/start
>>>>> Sep 23 21:04:05 ip-172-31-0-143 51106514ffc0[7920]: 2019/09/23
>>>>> 12:04:05 Failed to retrieve staged files: failed to get manifest
>>>>> Sep 23 21:04:05 ip-172-31-0-143 51106514ffc0[7920]: #011caused by:
>>>>> Sep 23 21:04:05 ip-172-31-0-143 51106514ffc0[7920]: rpc error: code =
>>>>> Unknown desc =
>>>>>
>>>>> =====================================================================================
>>>>>
>>>>> At the same time, the task manager logs the error below.
>>>>>
>>>>> =====================================================================================
>>>>> 2019-09-23 21:04:05,525 INFO
>>>>>  
>>>>> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService
>>>>>  - GetManifest for
>>>>> /tmp/artifactsfkyik3us/job_de145881-9ea7-4e44-8e6d-31a6ea298010/MANIFEST
>>>>> 2019-09-23 21:04:05,526 INFO
>>>>>  
>>>>> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService
>>>>>  - Loading manifest for retrieval token
>>>>> /tmp/artifactsfkyik3us/job_de145881-9ea7-4e44-8e6d-31a6ea298010/MANIFEST
>>>>> 2019-09-23 21:04:05,531 INFO
>>>>>  
>>>>> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService
>>>>>  - GetManifest for
>>>>> /tmp/artifactsfkyik3us/job_de145881-9ea7-4e44-8e6d-31a6ea298010/MANIFEST
>>>>> failed
>>>>> java.util.concurrent.ExecutionException:
>>>>> java.io.FileNotFoundException:
>>>>> /tmp/artifactsfkyik3us/job_de145881-9ea7-4e44-8e6d-31a6ea298010/MANIFEST
>>>>> (No such file or directory)
>>>>>         at
>>>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:531)
>>>>> ...
>>>>>
>>>>> =====================================================================================
>>>>>
>>>>> I can see this artifact directory on the server where the Beam
>>>>> pipeline is executed, but not on the worker node.
>>>>>
>>>>> =====================================================================================
>>>>> # Beam server
>>>>> (python-bm-2150) admin@ip-172-31-9-89:~$ sudo ls -ld
>>>>> /tmp/artifactsfkyik3us
>>>>> drwx------ 3 admin admin 4096 Sep 23 12:03 /tmp/artifactsfkyik3us
>>>>>
>>>>> # Flink worker node
>>>>> [ec2-user@ip-172-31-0-143 flink]$ sudo ls -ld /tmp/artifactsfkyik3us
>>>>> ls: cannot access /tmp/artifactsfkyik3us: No such file or directory
>>>>>
>>>>>  
>>>>> =====================================================================================
>>>>>
>>>>> From the error, it seems the container is not starting up correctly
>>>>> because the manifest file is missing.
>>>>> What would be a good approach for referencing the artifact directory
>>>>> from the worker nodes?
>>>>> I would appreciate any advice.
>>>>>
>>>>> Best Regards,
>>>>> Yu Watanabe
>>>>>
>>>>> --
>>>>> Yu Watanabe
>>>>> Weekend Freelancer who loves to challenge building data platform
>>>>> [email protected]
>>>>> [image: LinkedIn icon] <https://www.linkedin.com/in/yuwatanabe1>  [image:
>>>>> Twitter icon] <https://twitter.com/yuwtennis>
>>>>>
>>>>
>>
>> --
>> Yu Watanabe
>> Weekend Freelancer who loves to challenge building data platform
>> [email protected]
>> [image: LinkedIn icon] <https://www.linkedin.com/in/yuwatanabe1>  [image:
>> Twitter icon] <https://twitter.com/yuwtennis>
>>
>

-- 
Yu Watanabe
Weekend Freelancer who loves to challenge building data platform
[email protected]
[image: LinkedIn icon] <https://www.linkedin.com/in/yuwatanabe1>  [image:
Twitter icon] <https://twitter.com/yuwtennis>
