[jira] [Comment Edited] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-18 Thread Victor Wong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018790#comment-17018790
 ] 

Victor Wong edited comment on FLINK-15447 at 1/19/20 4:24 AM:
--

[~rongr] For me, #2 is closest to the main concern. However, we want to avoid 
sharing with others not for fine-grained control over disk resources, but 
because a shared location is very prone to filling up the disk.

There are some good points in HADOOP-2735:

_Can we add -Djava.io.tmpdir="./tmp" somewhere ?_
 _so that,_
 _1) Tasks can utilize all disks when using tmp_
 _2) Any undeleted tmp files will be deleted by the tasktracker when task(job?) 
is done._
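
The HADOOP-2735 idea quoted above can be sketched in plain Java. This is only an illustration, not Flink or Hadoop code: the class WorkingDirTmp and the method createScratchFile are hypothetical names, and the working directory is passed in explicitly instead of being resolved by YARN.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration: keep scratch files in "<workingDir>/tmp"
// instead of the node-wide /tmp.
final class WorkingDirTmp {

    // Creates "<workingDir>/tmp" if it does not exist yet and returns a
    // newly created temp file inside it.
    static Path createScratchFile(Path workingDir) {
        try {
            Path tmpDir = Files.createDirectories(workingDir.resolve("tmp"));
            return Files.createTempFile(tmpDir, "scratch-", ".tmp");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a container working directory (assumption for the demo).
        Path workingDir = Files.createTempDirectory("container-");
        Path scratch = createScratchFile(workingDir);
        System.out.println("scratch file: " + scratch);
    }
}
```

Because the scratch file lives below the working directory, removing that directory also removes any leftover temp files, which is the cleanup behaviour the quoted comment is after.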


was (Author: victor-wong):
[~rongr] for me #2 is most close to the main concern, but we do not want to 
share with others not for fine-grain control on disk resource, but for a shared 
location is very prone to be disk full.

There are some good points in HADOOP-2735:

_Can we add -Djava.io.tmpdir="./tmp" somewhere ?_
 _so that,_
 _1) Tasks can utilize all disks when using tmp_
 _2) Any undeleted tmp files will be deleted by the tasktracker when task(job?) 
is done._

> Change "java.io.tmpdir"  of JM/TM on Yarn to "{{PWD}}/tmp" 
> ---
>
> Key: FLINK-15447
> URL: https://issues.apache.org/jira/browse/FLINK-15447
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / YARN
> Affects Versions: 1.9.1
> Reporter: Victor Wong
> Priority: Major
>
> Currently, when running Flink on Yarn, the "java.io.tmpdir" property is left 
> at the JVM default value, which is "/tmp". 
>  
> Sometimes we ran into exceptions caused by a full "/tmp" directory, which 
> is not cleaned automatically after applications finish.
> I think we can set "java.io.tmpdir" to the "{{PWD}}/tmp" directory, or 
> something similar. "{{PWD}}" will be replaced by Yarn with the actual working 
> directory of the JM/TM, which is cleaned automatically.
>  
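
A minimal sketch of the proposal in the issue description, assuming the placeholder from the issue title ({{PWD}}) is passed through literally and expanded later by Yarn. The class ContainerTmpDir and the method jvmOption are hypothetical names used for illustration only; they are not part of Flink or Hadoop.

```java
// Hypothetical illustration: build the JVM option that points
// java.io.tmpdir at a "tmp" folder under the container working directory.
final class ContainerTmpDir {

    // workingDir may be a real path or an unexpanded placeholder such as
    // "{{PWD}}"; this helper only concatenates strings.
    static String jvmOption(String workingDir) {
        return "-Djava.io.tmpdir=" + workingDir + "/tmp";
    }

    public static void main(String[] args) {
        System.out.println(jvmOption("{{PWD}}"));
    }
}
```

One caveat for any real change along these lines: the JVM does not create the directory named by java.io.tmpdir, so the tmp folder would have to exist before the first temp file is requested.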



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-18 Thread Victor Wong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018790#comment-17018790
 ] 

Victor Wong edited comment on FLINK-15447 at 1/19/20 4:23 AM:
--

[~rongr] For me, #2 is closest to the main concern. However, we do not want to 
share with others, not for fine-grained control over disk resources, but 
because a shared location is very prone to filling up the disk.

There are some good points in HADOOP-2735:

_Can we add -Djava.io.tmpdir="./tmp" somewhere ?_
 _so that,_
 _1) Tasks can utilize all disks when using tmp_
 _2) Any undeleted tmp files will be deleted by the tasktracker when task(job?) 
is done._


was (Author: victor-wong):
[~rongr] for me #2 is most close to the main concern, but we do not want to 
share with others not for fine-grain control on disk resource, but for a shared 
location is very prone to be disk full.

There are some good points in 
[HADOOP-2735|https://issues.apache.org/jira/browse/HADOOP-2735]:

_Can we add -Djava.io.tmpdir="./tmp" somewhere ?
so that,
1) Tasks can utilize all disks when using tmp
2) Any undeleted tmp files will be deleted by the tasktracker when task(job?) 
is done._



[jira] [Comment Edited] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-18 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018645#comment-17018645
 ] 

Rong Rong edited comment on FLINK-15447 at 1/18/20 5:02 PM:


Ahh, I see your intention. Let me clarify what I understood; please correct me 
if I am wrong:
* Both Flink-YARN (JM/TM) and some other third-party code use {{java.io.tmpdir}}, 
which is a JVM system property key.
* This means that, potentially, whatever directory is configured, all Java 
processes will share this directory as their tmp folder.

So could you clarify which is the main concern: 
1. We don't want to pollute {{/tmp}}, which will potentially also be used by 
non-JVM processes.
2. We want the Flink JM/TM to NOT share with other JVM processes or other YARN 
containers. 


If the above analysis is correct: 
For #1, we actually create dedicated partitions for {{/tmp}} on our YARN 
nodes, which resolves the issue. Not sure if this can be a solution in your 
case. 
For #2, yes, I think the question is not easy to answer, especially if we want 
fine-grained control over disk resources. 
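
The observation in the bullets above, that any code running in a JVM resolves its temp location through java.io.tmpdir, can be checked with a small probe. This is a sketch with hypothetical names (TmpDirProbe, jvmTmpDir, createDefaultTempFile); it only uses standard JDK calls and makes no assumption about Flink internals.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical probe: shows where temp files land when no directory is given.
final class TmpDirProbe {

    // The directory that all code in this JVM sees as java.io.tmpdir.
    static Path jvmTmpDir() {
        return Paths.get(System.getProperty("java.io.tmpdir"))
                .toAbsolutePath()
                .normalize();
    }

    // Creates a temp file without naming a directory, as a third-party
    // library typically would; the JDK then falls back to java.io.tmpdir.
    static Path createDefaultTempFile() {
        try {
            Path file = Files.createTempFile("probe-", ".tmp");
            file.toFile().deleteOnExit();
            return file.toAbsolutePath().normalize();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.io.tmpdir = " + jvmTmpDir());
        System.out.println("temp file      = " + createDefaultTempFile());
    }
}
```

Running the probe with a different -Djava.io.tmpdir value moves the temp file accordingly, which is why changing the property affects Flink and its dependencies alike.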


was (Author: rongr):
Ahh. I see your intention. Let me clarify what I understood. please correct me 
if I were wrong:
* Both Flink-YARN (JM/TM) and some other 3rd party uses {{java.io.tmpdir}} 
which is a JVM system env key.
* This means, potentially, whatever directory configured, all Java process will 
be sharing this directory as tmp folder.

So could you clarify which is the main concern: 
1. The default key is set to {{/tmp}} - which potentially will be used also by 
NON-JVM process.
2. In addition, we also want Flink JM/TM to NOT share with others JVM process 
or other YARN containers. 


If the above analysis is correct, 
For #1, we actually creates dedicate partitions to put {{/tmp}} in our YARN 
node, which resolves the issue. Not sure if this can be a solution on your 
case. 
For #2, yes I think the question is not easy to answer especially we want 
fine-grain control on disk resource. 



[jira] [Comment Edited] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-18 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018645#comment-17018645
 ] 

Rong Rong edited comment on FLINK-15447 at 1/18/20 5:01 PM:


Ahh, I see your intention. Let me clarify what I understood; please correct me 
if I am wrong:
* Both Flink-YARN (JM/TM) and some other third-party code use {{java.io.tmpdir}}, 
which is a JVM system property key.
* This means that, potentially, whatever directory is configured, all Java 
processes will share this directory as their tmp folder.

So could you clarify which is the main concern: 
1. The default value is {{/tmp}}, which will potentially also be used by 
non-JVM processes.
2. In addition, we also want the Flink JM/TM to NOT share with other JVM 
processes or other YARN containers. 


If the above analysis is correct: 
For #1, we actually create dedicated partitions for {{/tmp}} on our YARN 
nodes, which resolves the issue. Not sure if this can be a solution in your 
case. 
For #2, yes, I think the question is not easy to answer, especially if we want 
fine-grained control over disk resources. 


was (Author: rongr):
Ahh. I see your intention. Let me clarify what I understood. please correct me 
if I were wrong:
* Both Flink-YARN (JM/TM) and some other 3rd party uses {{java.io.tmpdir}} 
which is a JVM system env key.
* This means, potentially, whatever directory configured via system env, all 
Java process will be sharing this location.

So could you clarify which is the main concern: 
1. The default key is set to {{/tmp}} - which potentially will be used also by 
NON-JVM process.
2. In addition, we also want Flink JM/TM to NOT share with others JVM process 
or other YARN containers. 


If the above analysis is correct, 
For #1, we actually creates dedicate partitions to put {{/tmp}} in our YARN 
node, which resolves the issue. Not sure if this can be a solution on your 
case. 
For #2, yes I think the question is not easy to answer especially we want 
fine-grain control on disk resource. 



[jira] [Comment Edited] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-18 Thread Rong Rong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018645#comment-17018645
 ] 

Rong Rong edited comment on FLINK-15447 at 1/18/20 5:01 PM:


Ahh, I see your intention. Let me clarify what I understood; please correct me 
if I am wrong:
* Both Flink-YARN (JM/TM) and some other third-party code use {{java.io.tmpdir}}, 
which is a JVM system property key.
* This means that, potentially, whatever directory is configured via the system 
property, all Java processes will share this location.

So could you clarify which is the main concern: 
1. The default value is {{/tmp}}, which will potentially also be used by 
non-JVM processes.
2. In addition, we also want the Flink JM/TM to NOT share with other JVM 
processes or other YARN containers. 


If the above analysis is correct: 
For #1, we actually create dedicated partitions for {{/tmp}} on our YARN 
nodes, which resolves the issue. Not sure if this can be a solution in your 
case. 
For #2, yes, I think the question is not easy to answer, especially if we want 
fine-grained control over disk resources. 


was (Author: rongr):
Ahh. I see your intention. Let me clarify what I understood. please correct me 
if I were wrong:
* Both Flink-YARN (JM/TM) and some other 3rd party uses {{java.io.tmpdir}} 
which is a JVM system env key.
* This means potentially means, whatever directory configured via system env, 
all Java process will be sharing this location.

So could you clarify which is the main concern: 
1. The default key is set to {{/tmp}} - which potentially will be used also by 
NON-JVM process.
2. In addition, we also want Flink JM/TM to NOT share with others JVM process 
or other YARN containers. 


If the above analysis is correct, 
For #1, we actually creates dedicate partitions to put {{/tmp}} in our YARN 
node, which resolves the issue. Not sure if this can be a solution on your 
case. 
For #2, yes I think the question is not easy to answer especially we want 
fine-grain control on disk resource. 



[jira] [Comment Edited] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-17 Thread Victor Wong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018462#comment-17018462
 ] 

Victor Wong edited comment on FLINK-15447 at 1/18/20 1:41 AM:
--

[~rongr] Thanks for your reply!

_Could you elaborate what does it mean by Flink-YARN default the value to /tmp? 
I am guessing you mean JVM default the value to /tmp ?_
---
Yes, I mean that Flink-YARN is using the JVM default value, which is "/tmp".


_So far the only place I can see in flink yarn code utilizing this key is .._
---
Third-party dependencies might use this key as well, as [~lzljs3620320] 
mentioned. 
My intention is similar to what [~xymaqingxiang] mentioned: setting the 
value to a tmp directory under the working directory of the Yarn container.


was (Author: victor-wong):
[~rongr] Thanks for your reply!

_Could you elaborate what does it mean by Flink-YARN default the value to /tmp? 
I am guessing you mean JVM default the value to /tmp ?_
---
Yes,  I mean that Flink-YARN is using the default value of JVM, which is "/tmp".


_So far the only place I can see in flink yarn code utilizing this key is: 
https://github.com/apache/flink/blob/release-1.10/flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java#L899_
---
The third-party dependencies might utilize this key as well, as [~lzljs3620320] 
mentioned. 
My intention is similar to how [~xymaqingxiang] mentioned that setting the 
value to a tmp directory under the working directory of Yarn container.



[jira] [Comment Edited] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-17 Thread Victor Wong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018462#comment-17018462
 ] 

Victor Wong edited comment on FLINK-15447 at 1/18/20 1:40 AM:
--

[~rongr] Thanks for your reply!

_Could you elaborate what does it mean by Flink-YARN default the value to /tmp? 
I am guessing you mean JVM default the value to /tmp ?_
---
Yes, I mean that Flink-YARN is using the JVM default value, which is "/tmp".

_So far the only place I can see in flink yarn code utilizing this key is: 
https://github.com/apache/flink/blob/release-1.10/flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java#L899_
---
Third-party dependencies might use this key as well, as [~lzljs3620320] 
mentioned. 
My intention is similar to what [~xymaqingxiang] mentioned: setting the 
value to a tmp directory under the working directory of the Yarn container.


was (Author: victor-wong):
[~rongr] Thanks for your reply!

_Could you elaborate what does it mean by Flink-YARN default the value to /tmp? 
I am guessing you mean JVM default the value to /tmp ?
_
---
Yes,  I mean that Flink-YARN is using the default value of JVM, which is "/tmp".

_So far the only place I can see in flink yarn code utilizing this key is: 
https://github.com/apache/flink/blob/release-1.10/flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java#L899
_
---
The third-party dependencies might utilize this key as well, as [~lzljs3620320] 
mentioned. 
My intention is similar to how [~xymaqingxiang] mentioned that setting the 
value to a tmp directory under the working directory of Yarn container.



[jira] [Comment Edited] (FLINK-15447) Change "java.io.tmpdir" of JM/TM on Yarn to "{{PWD}}/tmp"

2020-01-17 Thread Victor Wong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018462#comment-17018462
 ] 

Victor Wong edited comment on FLINK-15447 at 1/18/20 1:40 AM:
--

[~rongr] Thanks for your reply!

_Could you elaborate what does it mean by Flink-YARN default the value to /tmp? 
I am guessing you mean JVM default the value to /tmp ?_
---
Yes, I mean that Flink-YARN is using the JVM default value, which is "/tmp".


_So far the only place I can see in flink yarn code utilizing this key is: 
https://github.com/apache/flink/blob/release-1.10/flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java#L899_
---
Third-party dependencies might use this key as well, as [~lzljs3620320] 
mentioned. 
My intention is similar to what [~xymaqingxiang] mentioned: setting the 
value to a tmp directory under the working directory of the Yarn container.


was (Author: victor-wong):
[~rongr] Thanks for your reply!

_Could you elaborate what does it mean by Flink-YARN default the value to /tmp? 
I am guessing you mean JVM default the value to /tmp ?_
---
Yes,  I mean that Flink-YARN is using the default value of JVM, which is "/tmp".

_So far the only place I can see in flink yarn code utilizing this key is: 
https://github.com/apache/flink/blob/release-1.10/flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java#L899_
---
The third-party dependencies might utilize this key as well, as [~lzljs3620320] 
mentioned. 
My intention is similar to how [~xymaqingxiang] mentioned that setting the 
value to a tmp directory under the working directory of Yarn container.
