Hello Kyle.

Thank you for the reply, and happy new year to you too!

Yes, I got it working by running all containers on the host network, so they share the host's network namespace.

My docker-compose file is below.
https://github.com/yuwtennis/beam-deployment/blob/master/flink-session-cluster/docker/docker-compose.yml
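
For anyone hitting the same "failed to dial server at localhost:..." error, a quick way to see whether an endpoint is reachable from inside a given container is a plain TCP connection test. This is a minimal sketch using only the Python standard library; the helper name `can_connect` is hypothetical, and the port in the usage line is the provision endpoint from the log below, which changes on every run.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumption: 45759 is the provision endpoint port taken from the log;
    # substitute the port printed for your own run.
    print(can_connect("localhost", 45759))
```

Run from inside the `python-sdk` container, a `False` result for a port that the taskmanager is listening on suggests the two containers do not share a network namespace.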

I confirmed it by writing data to a file on GCS.

====================================================================

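    import apache_beam as beam
    from apache_beam.io import WriteToText

    # `options` and `project_id` are assumed to be defined earlier in the script.
    with beam.Pipeline(options=options) as p:

        # Create a small in-memory PCollection of two lines.
        lines = p | beam.Create(['Hello World.', 'Apache beam'])

        # Write to GCS; Beam appends a shard suffix such as -00000-of-00001.
        lines | WriteToText('gs://{}/sample.txt'.format(project_id))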

====================================================================

https://github.com/yuwtennis/beam-deployment/blob/master/flink-session-cluster/docker/samples/src/sample.py

RESULT
====================================================================
$ gsutil cat gs://${PROJECT_ID}/sample.txt-00000-of-00001
Hello World.
Apache beam
====================================================================
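
For reference, a pipeline reaches the worker pool through the portable runner's external environment options. The sketch below only builds the flag list, so it runs without Beam installed. The helper name `external_env_args` is hypothetical, the job endpoint `localhost:8099` is an assumption about where the Flink job server listens, and `localhost:50000` assumes host networking together with the worker pool port shown in the log. The actual values in sample.py may differ.

```python
def external_env_args(job_endpoint, worker_pool):
    """Build Beam pipeline flags for a portable runner with an external worker pool."""
    return [
        "--runner=PortableRunner",
        "--job_endpoint={}".format(job_endpoint),
        # EXTERNAL tells the runner to use an already running worker pool.
        "--environment_type=EXTERNAL",
        "--environment_config={}".format(worker_pool),
    ]


# Assumption: with host networking every service is reachable on localhost.
args = external_env_args("localhost:8099", "localhost:50000")
print(args)
```

The resulting list can be passed to `PipelineOptions(args)` from `apache_beam.options.pipeline_options` to obtain the `options` object used in the pipeline.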

Thanks,
Yu Watanabe


On Fri, Jan 3, 2020 at 5:58 AM Kyle Weaver <[email protected]> wrote:

> This is the root cause:
>
> > python-sdk_1   | 2019/12/31 02:59:45 Failed to obtain provisioning
> > information: failed to dial server at localhost:45759
>
> The Flink task manager and Beam SDK harness use connections over
> `localhost` to communicate.
>
> You will have to put `taskmanager` and `python-sdk` on the same host.
> Maybe you can try using `--network=host` so they will share the network
> namespace. https://docs.docker.com/network/host/
>
> Happy new year!
>
> Kyle
>
> On Mon, Dec 30, 2019 at 7:21 PM Yu Watanabe <[email protected]> wrote:
>
>> Hello,
>>
>> I would like to ask about an error I am facing with the worker pool of
>> the SDK container.
>> I get the error below when I run the pipeline.
>>
>> ----------------------------------------------------------------------------------------
>> python-sdk_1   | 2019/12/31 02:57:26 Starting worker pool 1: python -m
>> apache_beam.runners.worker.worker_pool_main --service_port=50000
>> --container_executable=/opt/apache/beam/boot
>> python-sdk_1   | INFO:root:Started worker pool servicer at port:
>> localhost:50000 with executable: /opt/apache/beam/boot
>> python-sdk_1   | WARNING:root:Starting worker with command
>> ['/opt/apache/beam/boot', '--id=1-1',
>> '--logging_endpoint=localhost:35615',
>> '--artifact_endpoint=localhost:42723',
>> '--provision_endpoint=localhost:45759',
>> '--control_endpoint=localhost:43185']
>> python-sdk_1   | 2019/12/31 02:57:45 Initializing python harness:
>> /opt/apache/beam/boot --id=1-1 --logging_endpoint=localhost:35615
>> --artifact_endpoint=localhost:42723
>> --provision_endpoint=localhost:45759
>> --control_endpoint=localhost:43185
>> python-sdk_1   | 2019/12/31 02:59:45 Failed to obtain provisioning
>> information: failed to dial server at localhost:45759
>> python-sdk_1   | caused by:
>> python-sdk_1   | context deadline exceeded
>>
>> ----------------------------------------------------------------------------------------
>>
>> In the Flink taskmanager's log, it keeps waiting for a response from the
>> SDK container.
>>
>> ----------------------------------------------------------------------------------------
>> taskmanager_1  | 2019-12-31 02:57:45,445 INFO
>> org.apache.flink.configuration.GlobalConfiguration            -
>> Loading configuration property: query.server.port, 6125
>> taskmanager_1  | 2019-12-31 02:58:26,678 INFO
>> org.apache.beam.runners.fnexecution.environment.ExternalEnvironmentFactory
>>  - Still waiting for startup of environment from python-sdk:50000 for
>> worker id 1-1
>>
>> ----------------------------------------------------------------------------------------
>>
>> Looking at the Flink jobmanager's log, the error is logged after the map
>> transform starts.
>> So it looks like the request from the taskmanager reaches the SDK container
>> but is not processed correctly.
>> It sounds like I am missing some setting for the SDK container.
>>
>> ----------------------------------------------------------------------------------------
>> jobmanager_1   | 2019-12-31 02:57:44,987 INFO
>> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
>> DataSource (Impulse) (1/1) (4e6f68fd31bafa066b740943bc3ea736) switched
>> from RUNNING to FINISHED.
>> jobmanager_1   | 2019-12-31 02:57:44,989 INFO
>> org.apache.flink.runtime.executiongraph.ExecutionGraph        -
>> MapPartition (MapPartition at [2]Create/{FlatMap(<lambda at
>> core.py:2468>), Map(decode)}) (1/1) (067ed452ebd15c1175ecde0ae40e8ac7)
>> switched from DEPLOYING to RUNNING.
>>
>> ----------------------------------------------------------------------------------------
>>
>> The command line for building the SDK container is:
>>
>> ----------------------------------------------------------------------------------------
>> ./gradlew :sdks:python:container:py37:dockerPush
>> -Pdocker-repository-root=${GCR_HOSTNAME}/${PROJECT_ID}
>> -Pdocker-tag=release-2.16.0
>>
>> ----------------------------------------------------------------------------------------
>>
>> My docker compose
>>
>> https://github.com/yuwtennis/beam-deployment/blob/master/flink-session-cluster/docker/docker-compose.yml
>>
>> My pipeline code
>>
>> https://github.com/yuwtennis/beam-deployment/blob/master/flink-session-cluster/docker/samples/src/sample.py
>>
>> Are there any settings I need to use when starting up the SDK container?
>>
>> Best Regards,
>> Yu Watanabe
>>
>> --
>> Yu Watanabe
>> [email protected]
>>
>

-- 
Yu Watanabe
[email protected]
