Kyle,

Thank you for the assistance.

Looks like "--artifacts-dir" is not parsed. Below is the DEBUG log of the
pipeline; a short sketch of why the flag is dropped follows my pipeline
options further down.
====================================================================
WARNING:root:Discarding unparseable args: ['--flink_version=1.8',
'--flink_master_url=ip-172-31-12-113.ap-northeast-1.compute.internal:8081',
'--artifacts-dir=/home/ec2-user']
...
WARNING:root:Downloading job server jar from
https://repo.maven.apache.org/maven2/org/apache/beam/beam-runners-flink-1.8-job-server/2.15.0/beam-runners-flink-1.8-job-server-2.15.0.jar
DEBUG:root:Starting job service with ['java', '-jar',
'/home/admin/.apache_beam/cache/beam-runners-flink-1.8-job-server-2.15.0.jar',
'--flink-master-url',
'ip-172-31-12-113.ap-northeast-1.compute.internal:8081', '--artifacts-dir',
'/tmp/artifactsicq2kj8c', '--job-port', 41757, '--artifact-port', 0,
'--expansion-port', 0]
DEBUG:root:Waiting for jobs grpc channel to be ready at localhost:41757.
...
DEBUG:root:Runner option 'job_endpoint' was already added
DEBUG:root:Runner option 'sdk_worker_parallelism' was already added
WARNING:root:Discarding unparseable args: ['--artifacts-dir=/home/admin']
...
====================================================================

My pipeline options:
====================================================================
        options = PipelineOptions([
                      "--runner=FlinkRunner",
                      "--flink_version=1.8",

"--flink_master_url=ip-172-31-12-113.ap-northeast-1.compute.internal:8081"
,
                      "--environment_config=
asia.gcr.io/creationline001/beam/python3:latest",
                      "--environment_type=DOCKER",
                      "--experiments=beam_fn_api",
                      "--artifacts-dir=/home/admin"
                  ])
====================================================================
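
For reference, here is a minimal sketch of why the flag is dropped (plain
Python against the 2.15.0 SDK, for illustration only): "--artifacts-dir"
(with hyphens) is not a registered Python PipelineOptions flag, so it ends up
in the "Discarding unparseable args" warning instead of reaching the job
server.
====================================================================
# Sketch only: unknown flags are reported when the options are resolved.
from apache_beam.options.pipeline_options import PipelineOptions

opts = PipelineOptions([
    "--runner=FlinkRunner",
    "--artifacts-dir=/home/admin",   # unknown to the Python option parser
])
all_opts = opts.get_all_options(drop_default=True)   # logs the warning above
print("artifacts_dir" in all_opts)                   # False
====================================================================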

Tracing through the code from the log, it looks like the artifacts dir falls
back to the OS default temp directory.
Thus, to adjust it I need to use an environment variable instead; I used
'TMPDIR' in my case (see the sketch after the trace below).
====================================================================
https://github.com/apache/beam/blob/7931ec055e2da7214c82e368ef7d7fd679faaef1/sdks/python/apache_beam/runners/portability/job_server.py#L203
    artifacts_dir = self.local_temp_dir(prefix='artifacts')

=>
https://github.com/apache/beam/blob/7931ec055e2da7214c82e368ef7d7fd679faaef1/sdks/python/apache_beam/runners/portability/job_server.py#L157
    return tempfile.mkdtemp(dir=self._local_temp_root, **kwargs)

=>
https://github.com/apache/beam/blob/7931ec055e2da7214c82e368ef7d7fd679faaef1/sdks/python/apache_beam/runners/portability/job_server.py#L102
    self._local_temp_root = None
====================================================================
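
Here is a small sketch (plain Python, no Beam needed) of the fallback the
trace above points at: with dir=None, tempfile.mkdtemp() resolves the base
directory via tempfile.gettempdir(), which consults TMPDIR (then TEMP/TMP,
then /tmp). That is why exporting TMPDIR moves the artifacts* staging
directory.
====================================================================
import os
import tempfile

# Equivalent to `export TMPDIR=/home/admin` before running the pipeline.
os.environ["TMPDIR"] = "/home/admin"
tempfile.tempdir = None   # clear the cached value so TMPDIR is re-read

print(tempfile.mkdtemp(prefix="artifacts"))   # e.g. /home/admin/artifactsXXXXXXXX
====================================================================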

Then it worked.
====================================================================
(python-bm-2150) admin@ip-172-31-9-89:~$ env | grep TMPDIR
TMPDIR=/home/admin

=>
DEBUG:root:Starting job service with ['java', '-jar',
'/home/admin/.apache_beam/cache/beam-runners-flink-1.8-job-server-2.15.0.jar',
'--flink-master-url',
'ip-172-31-12-113.ap-northeast-1.compute.internal:8081', '--artifacts-dir',
'/home/admin/artifacts18unmki3', '--job-port', 46767, '--artifact-port', 0,
'--expansion-port', 0]
====================================================================

I will use an NFS server for now to share the artifact directory with the
worker nodes.
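
In case it helps anyone else, below is how I plan to wire it up. This is only
a sketch under my assumptions: /mnt/beam-artifacts is a placeholder for the
NFS mount point, TMPDIR has to be set before the SDK creates the staging
directory, and the same path has to be mounted on every Flink task manager so
that BeamFileSystemArtifactRetrievalService can find the MANIFEST.
====================================================================
import os
import tempfile

os.environ["TMPDIR"] = "/mnt/beam-artifacts"   # hypothetical NFS mount point
tempfile.tempdir = None                        # make sure the new TMPDIR is picked up

from apache_beam import Pipeline
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=FlinkRunner",
    "--flink_version=1.8",
    "--flink_master_url=ip-172-31-12-113.ap-northeast-1.compute.internal:8081",
    "--environment_type=DOCKER",
    "--environment_config=asia.gcr.io/creationline001/beam/python3:latest",
    "--experiments=beam_fn_api",
])

with Pipeline(options=options) as p:
    ...  # pipeline definition goes here
====================================================================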

Thanks.
Yu Watanabe

On Tue, Sep 24, 2019 at 9:50 AM Kyle Weaver <[email protected]> wrote:

> The relevant configuration flag for the job server is `--artifacts-dir`.
>
> @Robert Bradshaw <[email protected]> I added this info to the log
> message: https://github.com/apache/beam/pull/9646
>
> Kyle Weaver | Software Engineer | github.com/ibzib | [email protected]
>
>
> On Mon, Sep 23, 2019 at 11:36 AM Robert Bradshaw <[email protected]>
> wrote:
>
>> You need to set your artifact directory to point to a
>> distributed filesystem that's also accessible to the workers (when starting
>> up the job server).
>>
>> On Mon, Sep 23, 2019 at 5:43 AM Yu Watanabe <[email protected]>
>> wrote:
>>
>>> Hello.
>>>
>>> I am working on the Flink runner (2.15.0) and would like to ask a question
>>> about how to solve my error.
>>>
>>> Currently, I have a remote cluster deployed as below (please see slide 1).
>>> All master and worker nodes are installed on a different server from the
>>> one running Apache Beam.
>>>
>>>
>>> https://drive.google.com/file/d/1vBULp6kiEfQNGVV3Nl2mMKAZZKYsb11h/view?usp=sharing
>>>
>>> When I run the Beam pipeline, the harness container tries to start up but
>>> fails immediately with the error below on the Docker side.
>>>
>>> =====================================================================================
>>> Sep 23 21:04:05 ip-172-31-0-143 51106514ffc0[7920]: 2019/09/23 12:04:05
>>> Initializing python harness: /opt/apache/beam/boot --id=1
>>> --logging_endpoint=localhost:34227 --artifact_endpoint=localhost:45303
>>> --provision_endpoint=localhost:44585 --control_endpoint=localhost:43869
>>> Sep 23 21:04:05 ip-172-31-0-143 dockerd:
>>> time="2019-09-23T21:04:05.380942292+09:00" level=debug msg=event
>>> module=libcontainerd namespace=moby topic=/tasks/start
>>> Sep 23 21:04:05 ip-172-31-0-143 51106514ffc0[7920]: 2019/09/23 12:04:05
>>> Failed to retrieve staged files: failed to get manifest
>>> Sep 23 21:04:05 ip-172-31-0-143 51106514ffc0[7920]: #011caused by:
>>> Sep 23 21:04:05 ip-172-31-0-143 51106514ffc0[7920]: rpc error: code =
>>> Unknown desc =
>>>
>>> =====================================================================================
>>>
>>> At the same time, the task manager logs the error below.
>>>
>>> =====================================================================================
>>> 2019-09-23 21:04:05,525 INFO
>>>  
>>> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService
>>>  - GetManifest for
>>> /tmp/artifactsfkyik3us/job_de145881-9ea7-4e44-8e6d-31a6ea298010/MANIFEST
>>> 2019-09-23 21:04:05,526 INFO
>>>  
>>> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService
>>>  - Loading manifest for retrieval token
>>> /tmp/artifactsfkyik3us/job_de145881-9ea7-4e44-8e6d-31a6ea298010/MANIFEST
>>> 2019-09-23 21:04:05,531 INFO
>>>  
>>> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService
>>>  - GetManifest for
>>> /tmp/artifactsfkyik3us/job_de145881-9ea7-4e44-8e6d-31a6ea298010/MANIFEST
>>> failed
>>> java.util.concurrent.ExecutionException: java.io.FileNotFoundException:
>>> /tmp/artifactsfkyik3us/job_de145881-9ea7-4e44-8e6d-31a6ea298010/MANIFEST
>>> (No such file or directory)
>>>         at
>>> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:531)
>>> ...
>>>
>>> =====================================================================================
>>>
>>> I see this artifact directory on the server where the Beam pipeline is
>>> executed, but not on the worker node.
>>>
>>> =====================================================================================
>>> # Beam server
>>> (python-bm-2150) admin@ip-172-31-9-89:~$ sudo ls -ld
>>> /tmp/artifactsfkyik3us
>>> drwx------ 3 admin admin 4096 Sep 23 12:03 /tmp/artifactsfkyik3us
>>>
>>> # Flink worker node
>>> [ec2-user@ip-172-31-0-143 flink]$ sudo ls -ld /tmp/artifactsfkyik3us
>>> ls: cannot access /tmp/artifactsfkyik3us: No such file or directory
>>>
>>>  
>>> =====================================================================================
>>>
>>> From the error, it seems that the container is not starting up correctly
>>> because the manifest file is missing.
>>> What would be a good approach to make the artifact directory accessible
>>> from the worker nodes?
>>> I would appreciate any advice.
>>>
>>> Best Regards,
>>> Yu Watanabe
>>>
>>> --
>>> Yu Watanabe
>>> Weekend Freelancer who loves to challenge building data platform
>>> [email protected]
>>> LinkedIn: <https://www.linkedin.com/in/yuwatanabe1>  Twitter:
>>> <https://twitter.com/yuwtennis>
>>>
>>

-- 
Yu Watanabe
Weekend Freelancer who loves to challenge building data platform
[email protected]
LinkedIn: <https://www.linkedin.com/in/yuwatanabe1>  Twitter:
<https://twitter.com/yuwtennis>
