[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-15 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135591#comment-17135591
 ] 

Till Rohrmann commented on FLINK-17579:
---

You are now assigned [~karmagyz].

> Set the resource id of taskexecutor according to environment variable if 
> exist in standalone mode
> -
>
> Key: FLINK-17579
> URL: https://issues.apache.org/jira/browse/FLINK-17579
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Yangze Guo
>Assignee: Yangze Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.12.0
>
>
> Allow user to specify the resource id of TaskExecutor through the environment 
> variable in standalone mode. The name of that variable could be 
> {{FLINK_STANDALONE_TASK_EXECUTOR_ID}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-15 Thread Yangze Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135561#comment-17135561
 ] 

Yangze Guo commented on FLINK-17579:


Agreed.

[~trohrmann] Could you assign this to me? I'd like to prepare a PR according to 
the latest consensus.



[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-15 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135527#comment-17135527
 ] 

Till Rohrmann commented on FLINK-17579:
---

This sounds good to me [~karmagyz]. Maybe we don't need the full length of a 
{{UUID}} to make the name unique. Otherwise the names might become quite 
lengthy.
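
As a rough illustration of the shorter-suffix idea above (not taken from the thread or from Flink), the sketch below keeps only the first few hex characters of a random UUID. The class name ShortResourceIdSuffix, its methods, and the length of 6 are assumptions made for the example.

```java
import java.util.UUID;

/** Hypothetical helper for illustration only; not part of Flink. */
final class ShortResourceIdSuffix {

    /** Returns the first {@code length} hex characters of a random UUID, dashes removed. */
    static String randomSuffix(int length) {
        // A UUID string has 32 hex characters once the four dashes are stripped.
        String hex = UUID.randomUUID().toString().replace("-", "");
        if (length < 1 || length > hex.length()) {
            throw new IllegalArgumentException(
                    "length must be between 1 and " + hex.length());
        }
        return hex.substring(0, length);
    }

    /** Joins a caller-supplied prefix and a short random suffix with a dash. */
    static String resourceId(String prefix, int suffixLength) {
        return prefix + "-" + randomSuffix(suffixLength);
    }

    public static void main(String[] args) {
        // The prefix "taskmanager-1" is an assumed example value.
        System.out.println(resourceId("taskmanager-1", 6));
    }
}
```

A 6-character hex suffix gives 16^6 (about 16.7 million) possible values, so a shorter name comes at the cost of a higher collision probability than a full UUID.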



[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-15 Thread Yangze Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135451#comment-17135451
 ] 

Yangze Guo commented on FLINK-17579:


Yes, I think it could work. So, it seems that it is still necessary to allow 
the user to set an arbitrary prefix.

To summarize, the proposed changes are:
- Add a config option "taskmanager.resource-id.prefix".
- In standalone mode, if "taskmanager.resource-id.prefix" is defined, the 
{{ResourceID}} of the {{taskexecutor}} should be {{prefix-uuid}}.

WDYT? [~trohrmann][~azagrebin][~fly_in_gis]
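
The two proposed changes could be sketched roughly as follows. This is a minimal, self-contained illustration and not Flink code: a plain Map stands in for the Flink configuration, the class and method names are hypothetical, and falling back to a bare UUID when no prefix is set is an assumption, since the summary above does not say what happens in that case.

```java
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch of the proposal; a plain Map stands in for Flink's configuration. */
final class StandaloneResourceIdSketch {

    /** Option key as named in the proposal above. */
    static final String PREFIX_KEY = "taskmanager.resource-id.prefix";

    /** Returns "prefix-uuid" if a prefix is configured, otherwise just a random UUID. */
    static String generateResourceId(Map<String, String> config) {
        String uuid = UUID.randomUUID().toString();
        String prefix = config.get(PREFIX_KEY);
        if (prefix == null || prefix.trim().isEmpty()) {
            // Assumption: without a prefix, keep a purely random id.
            return uuid;
        }
        return prefix.trim() + "-" + uuid;
    }

    public static void main(String[] args) {
        // "rack7-tm3" is an assumed example prefix.
        System.out.println(generateResourceId(Map.of(PREFIX_KEY, "rack7-tm3")));
        System.out.println(generateResourceId(Map.of()));
    }
}
```

Because the UUID part is regenerated on every start, two TaskManagers that share a prefix still end up with distinct ids.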



[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-13 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17134718#comment-17134718
 ] 

Till Rohrmann commented on FLINK-17579:
---

In the standalone case, users would need to use a third-party tool to restart 
the TMs. This tool would then have to set the right resource ids, which should 
be ok as long as we support setting some part of the resource id explicitly. I 
believe this should be the case.



[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-11 Thread Yangze Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17133867#comment-17133867
 ] 

Yangze Guo commented on FLINK-17579:


I ask because I do not know whether we plan to support local recovery in 
standalone mode as well. If we do, it seems we cannot restart a 
{{TaskManager}} with a fixed {{ResourceID}} (the hostname could be fixed, but 
the UUID would be different each time). Do you have any suggestions or ideas 
for achieving this?

BTW, it seems we do not currently support restarting a TM with a fixed 
{{ResourceID}} in standalone mode. I think this proposal will not introduce any 
regression.



[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-11 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17133328#comment-17133328
 ] 

Till Rohrmann commented on FLINK-17579:
---

I think the support for persistent volumes is not super relevant for the 
standalone mode [~karmagyz]. But it could still work if each 
{{TaskManager}} process is started with a fixed {{ResourceID}} (fixed 
across restarts).



[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-10 Thread Yangze Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132864#comment-17132864
 ] 

Yangze Guo commented on FLINK-17579:


Thanks for the feedback, [~trohrmann], [~fly_in_gis] and [~azagrebin].

Here are some of my thoughts:
- Since I do not see any drawback or regression in using {{hostname-uuid}} as 
the {{ResourceID}}, I think it probably makes sense to just change the default 
behavior. We may not need a config option for it.
- Regarding Till's concern, I think the change would not obstruct that idea 
because, in the Kubernetes scenario, the {{hostname}} is unique and constant. 
Correct me if I am mistaken, [~fly_in_gis].
- However, it may not work in the standalone scenario since the {{hostname}} is 
not guaranteed to be unique. Do we also plan to support local recovery in 
standalone mode? [~trohrmann]



[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-10 Thread Till Rohrmann (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130635#comment-17130635
 ] 

Till Rohrmann commented on FLINK-17579:
---

One thing to consider is that we wanted to add support for persistent volumes 
and local recovery at some point, which could vastly improve the recovery speed 
of Flink jobs. One idea for solving the problem was to give TM processes a 
unique and constant {{ResourceID}}. If a TM is always started with the same 
persistent volume, then we could achieve local recovery by simply redeploying 
tasks to the TM with the same {{ResourceID}} as before. This could also work if 
we only match on a unique and constant prefix, of course. The important bit 
would be that we keep this idea in mind and try not to obstruct it if possible.



[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-08 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128805#comment-17128805
 ] 

Yang Wang commented on FLINK-17579:
---

[~azagrebin] [~karmagyz] Thanks for the fruitful discussion. First, I want to 
add more background on this ticket.

> Why do users want to specify the TaskManager instance name?

More and more users are deploying Flink in container environments, especially 
K8s. When they have started a standalone session/job cluster, they need an 
easier way to find the corresponding pod for a specific TaskManager, so that 
they can tunnel in and debug the process (e.g. with jmap).

> Use env or config option

I have no preference. Either of them makes sense to me. If we could provide a 
unified approach to set Flink options via environment variables, that would be 
great.

> How to generate the TaskManager name?

I think a config option for whether to use {{hostname-uuid}} as the TaskManager 
name is enough. I agree that the uuid is necessary to avoid duplication. It 
could be a short string; maybe 6 characters are enough. I am not sure whether 
{{hostname-uuid}} is too long a string. In our production environment, the 
fully qualified hostname is usually no more than 45 characters (e.g. 
xyz011177171118.na610.aliyun.com). In K8s, the pod name cannot be more than 
63 characters. So I think it may be similar to a Yarn container id (e.g. 
container_e04_1591199811063_0665_01_02).

All in all, providing a meaningful name for each TaskManager will make the logs 
more human-readable and help with debugging.



[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-08 Thread Andrey Zagrebin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128327#comment-17128327
 ] 

Andrey Zagrebin commented on FLINK-17579:
-

Ok, a couple more questions.
Do you think the option should configure an arbitrary prefix for the id, or 
can it just say whether to use the hostname?
What about the Master id?
Also, the `-` can be quite a long string for logs. I 
suppose we use it in many places. Maybe it would be better to output it once in 
relevant places, like the first connection, and then just the ids?
cc [~trohrmann]



[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-08 Thread Yangze Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128044#comment-17128044
 ] 

Yangze Guo commented on FLINK-17579:


[~azagrebin] Thanks for the advice.

+1 to use config option instead.

+1 to append a random id at the end of it.



[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-06-08 Thread Andrey Zagrebin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17127970#comment-17127970
 ] 

Andrey Zagrebin commented on FLINK-17579:
-

[~karmagyz] Thanks for looking into this!

Why do you think it should be an environment variable and not a Flink option?
The Flink option can also be changed as a dynamic property argument (-D) for 
taskmanager.sh if you want to share flink-conf.yaml among TMs in standalone 
mode. Moreover, there are plans in the community to introduce a unified 
approach to set Flink options via environment variables. I would suggest 
avoiding multiplying the ways of configuring Flink, for better maintainability.

I also think it is dangerous to allow users to set fixed ids. We assume that 
all ids are unique everywhere in Flink. If some TMs accidentally get the same 
id, it can lead to unpredictable failures. Also, it might be the case that if 
the same TM rejoins the cluster, we assume that it will have another id to 
avoid collisions with its previous run in the system. Therefore, I would 
consider keeping ids always random and unique. The id could consist of a fixed 
part and still some random prefix: `-`.




[jira] [Commented] (FLINK-17579) Set the resource id of taskexecutor according to environment variable if exist in standalone mode

2020-05-08 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102561#comment-17102561
 ] 

Yang Wang commented on FLINK-17579:
---

It will be a very useful feature when deploying a Flink standalone cluster in a 
K8s cluster. Each taskmanager will have a dedicated hostname, which could be 
used to register with the jobmanager.

For users, it will be easier to identify the taskmanager, which helps with 
profiling and debugging.

> Set the resource id of taskexecutor according to environment variable if 
> exist in standalone mode
> -
>
> Key: FLINK-17579
> URL: https://issues.apache.org/jira/browse/FLINK-17579
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Yangze Guo
>Priority: Major
>
> Allow user to specify the resource id of TaskExecutor through the environment 
> variable in standalone mode. The name of that variable could be 
> {{FLINK_TASKEXECUTOR_RESOURCE_ID}}


