[
https://issues.apache.org/jira/browse/BEAM-8372?focusedWorklogId=334913&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-334913
]
ASF GitHub Bot logged work on BEAM-8372:
----------------------------------------
Author: ASF GitHub Bot
Created on: 28/Oct/19 11:49
Start Date: 28/Oct/19 11:49
Worklog Time Spent: 10m
Work Description: mxm commented on issue #9844: [BEAM-8372] Support both
flink_master and flink_master_url parameter
URL: https://github.com/apache/beam/pull/9844#issuecomment-546911267
Let's do the following:
1. Unify the two parameters `flink_master` and `flink_master_url` as fixed
here for the job server and previously by #9803.
2. Make the master address a URL. Add `http://` to the `flink_master` in
Python if no scheme is specified. Similarly, remove any `http://` in Java,
since the Java rest client does not expect a scheme.
3. Deprecate the `[auto]` and `[local]` values. It should be sufficient to
replace them with either an address string or an empty string. The empty string
would mean either local execution or, in the context of the Flink CLI tool,
loading the master address from the config.
(2) and (3) should be follow-ups. I'll post a summary to the list.
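A minimal sketch of what step (2) could look like, written in Python for both directions. The function names `add_http_scheme` and `strip_http_scheme` are hypothetical and not taken from the Beam code base; passing `[auto]`, `[local]`, and the empty string through unchanged is an assumption made here so the special values keep working until (3) is done.

```python
import re

# Assumption: these special values are not addresses and must not get a scheme.
SPECIAL_VALUES = ("", "[auto]", "[local]")

# Matches a leading URL scheme such as "http://" or "https://".
_SCHEME = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def add_http_scheme(flink_master: str) -> str:
    """Python side: prepend http:// when no scheme is specified."""
    if flink_master in SPECIAL_VALUES or _SCHEME.match(flink_master):
        return flink_master
    return "http://" + flink_master


def strip_http_scheme(flink_master: str) -> str:
    """Mirror of the Java side: drop a leading http://, if present."""
    prefix = "http://"
    if flink_master.startswith(prefix):
        return flink_master[len(prefix):]
    return flink_master
```

The scheme check looks for `://` explicitly rather than relying on `urllib.parse.urlparse`, since `urlparse("localhost:8081")` reports `localhost` as the scheme.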
> If flink_master is the option, and has been for a long time, then we
should just use that.
+1
> As for adapting #9775, REST does require specification of a protocol.
Currently if we just pass --runner=FlinkRunner --flink_master=localhost:8081 we
get
Oh I see. I wasn't aware that we need to pass `http://` due to requests. In
that regard the `_url` parameter makes sense. Still better to settle for a
single parameter.
> The use case here is a Flink REST client. In that context there is no
Flink config, the REST API is fully described here:
https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html
This is not true. The Flink REST client loads the config in order to support
use cases like SSL encryption (i.e. loading the trust store, initializing the
SSL connection, then sending HTTP requests over it). However, the code in
Python's `FlinkRunner` attempts to directly communicate with the Flink master
via the `requests` library.
> I don't see why flink_master has to serve as endpoint URL necessarily. Why
can there not be a separate property just for the REST client?
Why should there be multiple options for the same thing? `flink_master` is
already the REST endpoint address. No need to have another `flink_master_url`
option. We can add URL support, and it is probably fair to add `http://` in case
no scheme has been supplied.
> I'm not sure what you mean by "a job run by Flink CLI." What does that mean
for a Python job? (I was trying to read the docs and code for [auto] but
wasn't able to figure it out.)
I mean running a jar directly with the Flink command-line tool, i.e.
`bin/flink`. This is the de facto standard way to run Flink jobs. It does not
apply here, but we want to make sure that generated jars do not break when
submitted through the CLI. The `[auto]` mode basically says, either (1) figure
out the cluster address from the context (`bin/flink` reads it from a config
file), or (2) execute locally if there is no such context or the jar is run
directly.
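The `[auto]` behavior described above can be sketched roughly as follows. This is only an illustration of the two cases, not the actual runner code: `resolve_flink_master` and its `context_address` argument are hypothetical, with `context_address` standing in for whatever address `bin/flink` would read from its config file. Treating the empty string like `[auto]` reflects the proposal in (3), not current behavior.

```python
from typing import Optional

AUTO = "[auto]"
LOCAL = "[local]"


def resolve_flink_master(
    flink_master: str, context_address: Optional[str] = None
) -> str:
    """Return the address to submit to, or "[local]" for local execution.

    context_address is a hypothetical stand-in for the cluster address
    that the surrounding context (e.g. bin/flink) would provide.
    """
    # Assumption: the empty string behaves like [auto], as proposed in (3).
    if flink_master in (AUTO, ""):
        # (1) use the address from the context if there is one,
        # (2) otherwise fall back to local execution.
        return context_address if context_address else LOCAL
    # An explicit address (or [local]) is used as given.
    return flink_master
```

For example, `resolve_flink_master("[auto]")` falls back to `[local]`, while an explicit address is returned unchanged even when a context address is available.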
Issue Time Tracking
-------------------
Worklog Id: (was: 334913)
Time Spent: 8h 20m (was: 8h 10m)
> Allow submission of Flink UberJar directly to flink cluster.
> ------------------------------------------------------------
>
> Key: BEAM-8372
> URL: https://issues.apache.org/jira/browse/BEAM-8372
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Robert Bradshaw
> Assignee: Robert Bradshaw
> Priority: Major
> Time Spent: 8h 20m
> Remaining Estimate: 0h
>