[GitHub] flink issue #3525: [FLINK-6020]add a random integer suffix to blob key to av...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3525 @tillrohrmann I've tried with your commit and the issue is resolved, thanks. Closing this PR. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #3525: [FLINK-6020]add a random integer suffix to blob ke...
Github user WangTaoTheTonic closed the pull request at: https://github.com/apache/flink/pull/3525
[GitHub] flink issue #3525: [FLINK-6020]add a random integer suffix to blob key to av...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3525 Thanks for your fix, I'll check in a day or two.
[GitHub] flink issue #3525: [FLINK-6020]add a random integer suffix to blob key to av...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3525 That looks good to me. Looking forward to the fix from @tillrohrmann. Thank you very much :)
[GitHub] flink issue #3525: [FLINK-6020]add a random integer suffix to blob key to av...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3525 For the HA case, the blob server uploads jars to HDFS for recovery, and there are concurrent operations there too. I'm not sure whether the solution you proposed can cover that.
[GitHub] flink pull request #3727: [FLINK-6312]update curator version to 2.12.0 to av...
Github user WangTaoTheTonic closed the pull request at: https://github.com/apache/flink/pull/3727
[GitHub] flink issue #3709: [FLINK-6295]Update suspended ExecutionGraph to lower late...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 @StephanEwen Sure. The current fix is like a "pull", while what you suggest is a "push" approach. Both can fix the issue; they only differ in how the EGs are updated.
[GitHub] flink issue #3727: [FLINK-6312]update curator version to 2.12.0 to avoid pot...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3727 @StephanEwen All tests passed!
[GitHub] flink issue #3745: [FLINK-6341]Don't let JM fall into infinite loop
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3745 I see. A connection ID has been added; please check if it's OK :)
[GitHub] flink issue #3745: [FLINK-6341]Don't let JM fall into infinite loop
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3745 @tillrohrmann What problem would it bring if we accessed `currentResourceManager` from another thread? It is a variable in the JobManager and can be shared across multiple threads, right? The newly added code just reads it, so I don't think there's any concurrency problem.
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 I've tested and the function is OK. Please check if it's good to go, thanks!
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 All right. I'll change it as you suggest and verify the result. Thanks for the comments and advice :)
[GitHub] flink pull request #3745: [FLINK-6341]Don't let JM fall into infinite loop
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3745 [FLINK-6341]Don't let JM fall into infinite loop When a TaskManager registers with the JobManager, the JM sends a "NotifyResourceStarted" message to kick off the resource manager, then triggers a reconnection to the resource manager by sending a "TriggerRegistrationAtJobManager". When the JobManager's reference to the resource manager is not None and the reconnection targets the same resource manager, the JobManager enters an infinite message-sending loop, sending itself a "ReconnectResourceManager" every 2 seconds. We have already observed this phenomenon. For more details, check how the JobManager handles `ReconnectResourceManager`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-6341 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3745.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3745 commit 8eb4dd42a71d9830c91b3db824e3133ce3d35c08 Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-04-20T12:28:10Z Don't let JM fall into infinite loop
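The loop-breaking idea later adopted in this PR (a connection ID, per the review thread) can be sketched roughly as follows. This is an illustrative sketch, not the actual Flink code; `ReconnectGuard` and its methods are hypothetical names:

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical sketch of the guard: the JobManager only acts on a
// ReconnectResourceManager message whose connection ID matches the current
// attempt, so a stale self-sent message cannot restart the loop.
class ReconnectGuard {
    private UUID currentConnectionId;        // renewed on every (re)connection attempt
    private Object currentResourceManager;   // null when no RM is connected

    /** Starts a new connection attempt and returns its fresh ID. */
    UUID startReconnect(Object resourceManager) {
        currentResourceManager = resourceManager;
        currentConnectionId = UUID.randomUUID();
        return currentConnectionId;
    }

    /** Returns true if a ReconnectResourceManager message should be processed. */
    boolean shouldReconnect(Object resourceManager, UUID connectionId) {
        // Drop messages referring to an old attempt or a different RM.
        return Objects.equals(resourceManager, currentResourceManager)
                && Objects.equals(connectionId, currentConnectionId);
    }
}
```

With this check, a `ReconnectResourceManager` message carrying an outdated ID is simply ignored instead of re-triggering itself every 2 seconds.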
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 My main concern is that the status shown in the web frontend doesn't match the actual state in the backend.
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 I got it, but I still have one question: what about the other state transitions, e.g. when a job is cancelling or failing?
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 @zentol How do we know whether a requested job is suspended or not, as the status of jobs in the backend is always changing?
[GitHub] flink issue #3727: [FLINK-6312]update curator version to 2.12.0 to avoid pot...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3727 Sure. It seems like it will take quite some time, but I'll try my best :)
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 That means every time the EGHolder receives a request, it will check whether the job status in the request is suspended, right? That would make the cache in the EGHolder meaningless.
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 OK, I think I've got your point. With the WeakHashMap now, we add entries when the map doesn't contain the requested EG id, and invalid entries are removed when GC happens. By adding the `small 2-line branch` you suggest, we add entries the same way as before, but check whether an entry is valid when it's accessed by a handler, and update/remove it if it's invalid. Is that right?
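The "check on access" branch described above can be sketched like this. This is a hypothetical illustration of the pattern under discussion, not the actual `ExecutionGraphHolder` code; all names are made up:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Sketch: entries are added as before, but each read validates the cached
// entry and refreshes it when it has become stale (e.g. the job state changed).
class CheckOnAccessCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Predicate<V> isValid;   // e.g. "cached graph still matches the job state"
    private final Function<K, V> loader;  // e.g. "ask the JobManager for a fresh graph"

    CheckOnAccessCache(Predicate<V> isValid, Function<K, V> loader) {
        this.isValid = isValid;
        this.loader = loader;
    }

    V get(K key) {
        V value = cache.get(key);
        if (value == null || !isValid.test(value)) {
            value = loader.apply(key);    // miss or stale: reload and replace
            cache.put(key, value);
        }
        return value;
    }
}
```

The cache stays cheap on the hot path (a map lookup plus a validity check) and only pays the reload cost when an entry has actually gone stale.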
[GitHub] flink issue #3414: [FLINK-5904][YARN]make jobmanager.heap.mb and taskmanager...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3414 Thanks. I've resolved the conflicts. Enjoy :)
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 I mean, who's in charge of updating the EGHolder? The EGHolder itself or the JobManager? The EGHolder doesn't sense job status changes unless it queries the JobManager periodically. If the JobManager took the responsibility, it would be a listener design pattern, I guess? Wouldn't that be too complicated, given that the EGHolder is just a lightweight cache?
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 In my opinion, the EGHolder is simply a cache and should not be assigned too complicated a task. If we add the check logic, how long should the interval be? Will other events affect the status of tasks? I believe there are more concerns if we add it. This fix only changes internal data structures and is decoupled from both the JobManager and the web frontend. I am not sure why we are reducing the usage of Guava, but it doesn't sound like a very good idea :(
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 I'm not an Akka expert. As we observed, the status of cancelled tasks is updated back to running only when GC happens in the JM. Way to reproduce:
1. Launch a Flink job in HA mode.
2. Restart ZooKeeper (to make the tasks fail).
3. After the tasks have recovered, check whether their status is running or cancelled (if a GC happens, the task status shown in the web frontend will match the actual states; otherwise the status update is delayed, which may cause inconsistency with the backend).
We observed this phenomenon in YARN mode, and it is fixed by this patch.
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 @zentol No, you're wrong. If you take a look at `ExecutionGraphHolder`, you'll find the graphs in it are generated from messages answered by the JobManager, which means there's no reference from the JobManager, only from handlers in the netty web backend. Once there's no reference from those handlers, they will be garbage collected regardless of whether the actual job is running or recovering.
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 @zentol The execution graphs cached in `ExecutionGraphHolder` (which is backed by a WeakHashMap) are evicted only when GC happens.
[GitHub] flink issue #3709: [FLINK-6295]use LoadingCache instead of WeakHashMap to lo...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3709 @wenlong88 LoadingCache can also cache and evict data like WeakHashMap; as this implementation shows, it evicts data every 30 seconds and fetches data when it doesn't contain the required key. @zentol You're right. The data structure used doesn't matter, while what is shown in the web frontend and how it is updated does. I don't think users can accept task status updates that are only triggered by JobManager GC (which could take a very long time).
[GitHub] flink pull request #3727: [FLINK-6312]update curator version to 2.12.0 to av...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3727 [FLINK-6312]update curator version to 2.12.0 to avoid potential block As there's a major bug ([CURATOR-344](https://issues.apache.org/jira/browse/CURATOR-344)) in the Curator release used by Flink, we need to update to release 2.12.0 to avoid a potential block in Flink. (Flink uses Curator recipes in the checkpoint coordinator, and we already ran into this problem during ZooKeeper failover while trying to fix [FLINK-6174](https://issues.apache.org/jira/browse/FLINK-6174).) You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-6312 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3727.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3727 commit 014b042006c0d6b1939e00c68e7d15ce8262400a Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-04-17T06:55:27Z update curator version to 2.12.0 to avoid potential block
[GitHub] flink pull request #3709: [FLINK-6295]use LoadingCache instead of WeakHashMa...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3709 [FLINK-6295]use LoadingCache instead of WeakHashMap to lower latency Now in ExecutionGraphHolder, which is used in many handlers, we use a WeakHashMap to cache ExecutionGraph(s), which is only sensitive to garbage collection. The latency is too high when the JVM does GC rarely, which makes the status of jobs or their tasks mismatch the real ones. (We once observed that the web frontend still showed tasks as cancelled/failed for **30+ minutes** after the actual task states had returned to normal, until a GC happened.) LoadingCache is a commonly used cache implementation from the Guava library; we can use its time-based eviction to lower the latency of status updates. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-6295 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3709.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3709 commit d76ced06242623d150f9ad09205e2b92f910c1a1 Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-04-11T11:48:52Z use LoadingCache instead of WeakHashMap to lower latency
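The time-based eviction this PR relies on can be sketched with plain JDK types (the actual change uses Guava's LoadingCache; `TimedCache` and its 30-second expiry below are illustrative only):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of time-based eviction. Unlike a WeakHashMap, staleness here is
// bounded by the expiry period rather than by when the JVM happens to run GC.
class TimedCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAtMillis;
        Entry(V value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<K, Entry<V>> cache = new HashMap<>();
    private final long expiryMillis;            // e.g. 30_000 for a 30s bound
    private final Function<K, V> loader;        // e.g. query the JobManager for the graph

    TimedCache(long expiryMillis, Function<K, V> loader) {
        this.expiryMillis = expiryMillis;
        this.loader = loader;
    }

    V get(K key, long nowMillis) {
        Entry<V> e = cache.get(key);
        if (e == null || nowMillis - e.loadedAtMillis >= expiryMillis) {
            e = new Entry<>(loader.apply(key), nowMillis);  // expired: reload
            cache.put(key, e);
        }
        return e.value;
    }
}
```

With Guava this would collapse to something like `CacheBuilder.newBuilder().expireAfterWrite(30, TimeUnit.SECONDS).build(...)`, but the point is the same: the web frontend can lag behind the backend by at most the expiry period.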
[GitHub] flink issue #3704: [FLINK-5756] Replace RocksDB dependency with FRocksDB
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3704 Could you tell us what modifications were made in "FRocksDB" and post the URL of the source code repository?
[GitHub] flink issue #3525: [FLINK-6020]add a random integer suffix to blob key to av...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3525 Hi Stephan, could you help review? @StephanEwen
[GitHub] flink issue #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTaskSlots a...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3408 @tillrohrmann After changing the code, the test results (both session and single-job mode) are as follows:

| Configuration | #vcores of container (TM) | #slots of TM |
| --- | --- | --- |
| -s/-ys 5, yarn.containers.vcores: 4, taskmanager.numberOfTaskSlots: 3 | 4 | 5 |
| yarn.containers.vcores: 4, taskmanager.numberOfTaskSlots: 3 | 4 | 3 |
| yarn.containers.vcores: 4 | 4 | 1 |
| -s/-ys 5, taskmanager.numberOfTaskSlots: 3 | 5 | 5 |
| taskmanager.numberOfTaskSlots: 3 | 3 | 3 |
| Nothing specified, all defaults | 1 | 1 |

Please check :)
[GitHub] flink issue #3599: [FLINK-6174][HA]introduce a SMARTER leader latch to make ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3599 @wenlong88 Feel free to review, thanks :)
[GitHub] flink issue #3599: [FLINK-6174][HA]introduce a SMARTER leader latch to make ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3599 @StephanEwen I've made the changes, which introduce a new, smarter leader latch (the reason I wrote a new class is that the `handleStateChange` method is private in `LeaderLatch` and cannot be overridden). It waits for a connection-timeout duration when the connection to ZooKeeper is broken, instead of revoking leadership immediately.
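The "wait before revoking" behaviour described in the comment above can be sketched as follows. This is a hypothetical illustration of the idea, not the actual latch class; all names are made up:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Sketch: on connection loss, schedule the revocation after a timeout instead
// of revoking immediately, and cancel it if the connection comes back in time.
class DelayedRevocationLatch {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final long timeoutMillis;
    private volatile boolean leader = true;
    private ScheduledFuture<?> pendingRevocation;

    DelayedRevocationLatch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    synchronized void onConnectionSuspended() {
        if (pendingRevocation == null) {
            // Don't drop leadership yet; give ZooKeeper a chance to recover.
            pendingRevocation = scheduler.schedule(
                    () -> { leader = false; }, timeoutMillis, TimeUnit.MILLISECONDS);
        }
    }

    synchronized void onConnectionReconnected() {
        if (pendingRevocation != null) {
            pendingRevocation.cancel(false);  // recovered in time: keep leadership
            pendingRevocation = null;
        }
    }

    boolean hasLeadership() { return leader; }

    void shutdown() { scheduler.shutdown(); }
}
```

A transient ZooKeeper hiccup shorter than the timeout then never disturbs the JobManager, while a real outage still revokes leadership after the grace period.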
[GitHub] flink issue #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTaskSlots a...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3408 All right. That makes sense. Let me rephrase it; please check whether we are on the same page: 1. the number of slots of a TaskManager is decided by `-s/-ys` and `taskmanager.numberOfTaskSlots`, with the former having higher priority; and 2. the number of vcores of a YARN container is decided by `yarn.containers.vcores`, which falls back to the value of `-s/-ys` or `taskmanager.numberOfTaskSlots` if the user doesn't set `yarn.containers.vcores` explicitly. Is that right?
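The precedence rules summarized in this comment can be sketched as two small resolution functions. `YarnResourceResolver` is a hypothetical helper for illustration; the actual Flink code is organized differently:

```java
import java.util.OptionalInt;

// Sketch of the precedence: -s/-ys beats taskmanager.numberOfTaskSlots for
// slots; yarn.containers.vcores beats both for vcores, falling back to the
// resolved slot count when unset. Defaults are 1 for both.
class YarnResourceResolver {
    /** Slots: -s/-ys wins over taskmanager.numberOfTaskSlots; default is 1. */
    static int resolveSlots(OptionalInt cliSlots, OptionalInt configSlots) {
        if (cliSlots.isPresent()) {
            return cliSlots.getAsInt();
        }
        return configSlots.orElse(1);
    }

    /** Vcores: yarn.containers.vcores wins; otherwise use the slot count. */
    static int resolveVcores(OptionalInt configuredVcores, int resolvedSlots) {
        return configuredVcores.orElse(resolvedSlots);
    }
}
```

Applying these two functions row by row reproduces the result table posted earlier in this thread.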
[GitHub] flink issue #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTaskSlots a...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3408 @tillrohrmann I still can't entirely get your point. Don't these three configs (`-s/-ys`, `yarn.containers.vcores` and `taskmanager.numberOfTaskSlots`) mean the same thing? Do they differ in usage apart from priority?
[GitHub] flink issue #3617: [FLINK-6192]reuse zookeeper client created by CuratorFram...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3617 Can you please review https://github.com/apache/flink/pull/3408, by the way? @tillrohrmann
[GitHub] flink issue #3617: [FLINK-6192]reuse zookeeper client created by CuratorFram...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3617 Glad we have the same idea :) @tillrohrmann I'll mark the JIRA as duplicated now and close this PR as soon as you open the new one.
[GitHub] flink pull request #3617: [FLINK-6192]reuse zookeeper client created by Cura...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3617 [FLINK-6192]reuse zookeeper client created by CuratorFramework Now in YARN mode, there are three places using a ZooKeeper client (web monitor, jobmanager and resourcemanager) in the ApplicationMaster/JobManager, while there are two in the TaskManager. Each creates a new ZooKeeper client when it needs one. I believe there are other places doing the same thing, but within one JVM, one CuratorFramework is enough for connections to one ZooKeeper ensemble, so we need a singleton to reuse it. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-6192 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3617.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3617 commit 3d0c164a0d41fcd5c88fe7b959062827eb1a5909 Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-03-27T07:43:01Z reuse zookeeper client created by CuratorFramework
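The sharing idea in this PR (one client per connection string per JVM, created lazily and reused by all callers) can be sketched like this. In the real change the client would be a `CuratorFramework`; here it is kept generic so the sketch stays self-contained, and all names are hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch: a per-JVM registry that hands out one shared client per ZooKeeper
// ensemble instead of letting each component build its own.
final class SharedClientRegistry<C> {
    private final Map<String, C> clients = new ConcurrentHashMap<>();
    private final Function<String, C> factory;  // builds a real client for a connect string

    SharedClientRegistry(Function<String, C> factory) {
        this.factory = factory;
    }

    /** Returns the shared client for this ensemble, creating it on first use. */
    C getOrCreate(String connectString) {
        return clients.computeIfAbsent(connectString, factory);
    }
}
```

`computeIfAbsent` on a `ConcurrentHashMap` makes the lazy creation thread-safe, which matters since the web monitor, JobManager, and ResourceManager would all request the client from different threads.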
[GitHub] flink pull request #3614: [FLINK-6189][YARN]Do not use yarn client config to...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3614 [FLINK-6189][YARN]Do not use yarn client config to do sanity check Now in the client, if the number of slots is greater than the value of "yarn.nodemanager.resource.cpu-vcores" in the YARN client config, the submission is rejected. This makes no sense, as the actual vcores of a node manager are decided on the cluster side, not on the client side. If we don't set the config or don't set the right value for it (indeed, this config is not mandatory), it should not affect Flink submission. You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-6189 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3614.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3614 commit eef3bba405557c6f7c55aee6983bb1bd9ade7501 Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-03-25T10:00:35Z Do not use yarn client config to do sanity check
[GitHub] flink issue #3599: [FLINK-6174][HA]introduce a new election service to make ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3599 I don't think it's a good idea, as it cannot solve the "split brain" issue either. The key problem is that `LeaderLatch` in Curator is too sensitive to the state of the connection to ZooKeeper (it revokes leadership when the connection to ZooKeeper is temporarily broken), and probably the best way is offering a "duller" LeaderLatch, which could also be used in a standalone cluster. I did the same work in our own private Spark release; let me see if it can be reused.
[GitHub] flink issue #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTaskSlots a...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3408 @tillrohrmann @StephanEwen The code has been changed and I've verified the functionality; could you please review this and merge it if it's good to go?
[GitHub] flink issue #3599: [FLINK-6174][HA]introduce a new election service to make ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3599 Thanks for your comments @wenlong88. I also thought about adding retry logic for ZooKeeper failover, but that would require modifying `LeaderLatch` in Curator, which is a third-party library; otherwise we could only add our own private LeaderLatch by copying most of Curator's implementation. Even with this AlwaysLeaderService added, JM failover still works, as the RM will start a new instance. About FLIP-6, I'll check that solution and see if anything there can help with this :)
[GitHub] flink pull request #3599: [FLINK-6174][HA]introduce a new election service t...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3599 [FLINK-6174][HA]introduce a new election service to make JobManager always available

Currently in YARN mode, if we use ZooKeeper as the high-availability choice, an election service is created that elects a leader through ZooKeeper. When the ZooKeeper leader crashes, or the connection between the JobManager and the ZooKeeper instance is broken, the JobManager's leadership is revoked and a Disconnect message is sent to the TaskManagers, which cancels all running tasks and leaves them waiting for the connection between JM and ZK to be rebuilt.

In YARN mode we have exactly one JobManager (the AM) at a time, and it should always be the leader instead of being elected through ZooKeeper. We can introduce a new leader election service in YARN mode to achieve that.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-6174 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3599.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3599

commit a758ec6320b026fb5d767cdc190c29c8043838da Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-03-23T03:15:40Z introduce a new election service to make JobManager always available
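The idea of an "always leader" election service can be sketched as follows. All names here are illustrative, not the PR's actual classes: since YARN mode has exactly one JobManager (the AM), leadership is granted immediately on start and never revoked by ZooKeeper connection hiccups.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of an "always leader" election service for YARN mode
// (illustrative names, not the PR's actual code).
public class AlwaysLeaderServiceSketch {

    /** Minimal stand-in for a leader contender callback. */
    interface Contender {
        void grantLeadership();
    }

    private final AtomicBoolean started = new AtomicBoolean(false);

    /** Starts the service and grants leadership without any election. */
    public void start(Contender contender) {
        if (started.compareAndSet(false, true)) {
            // No ZooKeeper involved: the single AM is always the leader.
            contender.grantLeadership();
        }
    }

    public boolean hasLeadership() {
        return started.get();
    }
}
```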
[GitHub] flink issue #3525: [FLINK-6020]add a random integer suffix to blob key to av...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3525 ping @StephanEwen
[GitHub] flink issue #3525: [FLINK-6020]add a random integer suffix to blob key to av...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3525

Right... I had the same thought as you at the beginning and tried to make the move atomic, but it has several side effects, like:

1. If we handle it that way, two jobs can share the same jar file in the blob server, which becomes a problem when one of them is canceled and deletes its jars (right now it seems it doesn't do the delete, but it should).
2. For job recovery (or another kind of recovery; I'm not sure, I just observed the phenomenon), the blob server uploads jars to HDFS using the same name as the local file. Even if the two jobs share the same jar in the blob store, they will upload it twice at the same time, which causes file lease contention in HDFS.
[GitHub] flink issue #3525: [FLINK-6020]add a random integer suffix to blob key to av...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3525

The second rename will not fail, but it corrupts the file written by the first one, which makes the first job fail if a task is loading that jar. By the way, the jar file is also uploaded to HDFS for recovery, and that upload fails too when two or more clients write a file with the same name.

It is easy to reproduce: first launch a session with enough slots, then run a script containing many identical job submissions, say 20 lines of "flink run ../examples/streaming/WindowJoin.jar &". Make sure there's a "&" at the end of each line so they run in parallel.
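The PR's approach can be sketched as below. The class and method names are illustrative, not the actual BlobKey code: the content's SHA-1 stays as the key prefix, but a random integer suffix makes keys for identical jars distinct, so parallel uploads never race on one filename.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Random;

// Illustrative sketch of a blob key with a random integer suffix
// (hypothetical names, not Flink's actual BlobKey implementation).
public class SuffixedBlobKey {

    public static String keyFor(byte[] content, Random rnd) {
        try {
            // Content-derived prefix, as before: hex-encoded SHA-1 digest.
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : sha1.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            // Random suffix: identical content now yields distinct keys; the
            // trade-off is losing content-addressed deduplication across jobs.
            return hex + "-" + rnd.nextInt(Integer.MAX_VALUE);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 unavailable", e);
        }
    }
}
```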
[GitHub] flink issue #3486: [FLINK-5981][SECURITY]make ssl version and cipher suites ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3486 Fixed
[GitHub] flink pull request #2425: FLINK-3930 Added shared secret based authorization...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/flink/pull/2425#discussion_r106335560

--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/netty/CookieHandler.java ---
```
@@ -0,0 +1,130 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.netty;
+
+import io.netty.buffer.ByteBuf;
+import io.netty.buffer.Unpooled;
+import io.netty.channel.Channel;
+import io.netty.channel.ChannelHandlerContext;
+import io.netty.channel.ChannelInboundHandlerAdapter;
+import io.netty.handler.codec.MessageToMessageDecoder;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.nio.charset.Charset;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+public class CookieHandler {
+
+	public static class ClientCookieHandler extends ChannelInboundHandlerAdapter {
+
+		private final Logger LOG = LoggerFactory.getLogger(ClientCookieHandler.class);
+
+		private final String secureCookie;
+
+		final Charset DEFAULT_CHARSET = Charset.forName("utf-8");
+
+		public ClientCookieHandler(String secureCookie) {
+			this.secureCookie = secureCookie;
+		}
+
+		@Override
+		public void channelActive(ChannelHandlerContext ctx) throws Exception {
+			super.channelActive(ctx);
+			LOG.debug("In channelActive method of ClientCookieHandler");
+
+			if(this.secureCookie != null && this.secureCookie.length() != 0) {
+				LOG.debug("In channelActive method of ClientCookieHandler -> sending secure cookie");
+				final ByteBuf buffer = Unpooled.buffer(4 + this.secureCookie.getBytes(DEFAULT_CHARSET).length);
+				buffer.writeInt(secureCookie.getBytes(DEFAULT_CHARSET).length);
+				buffer.writeBytes(secureCookie.getBytes(DEFAULT_CHARSET));
+				ctx.writeAndFlush(buffer);
+			}
+		}
+	}
+
+	public static class ServerCookieDecoder extends MessageToMessageDecoder {
+
+		private final String secureCookie;
+
+		private final List channelList = new ArrayList<>();
+
+		private final Charset DEFAULT_CHARSET = Charset.forName("utf-8");
+
+		private final Logger LOG = LoggerFactory.getLogger(ServerCookieDecoder.class);
+
+		public ServerCookieDecoder(String secureCookie) {
+			this.secureCookie = secureCookie;
+		}
+
+		/**
+		 * Decode from one message to an other. This method will be called for each written message that can be handled
+		 * by this encoder.
+		 *
+		 * @param ctx the {@link ChannelHandlerContext} which this {@link MessageToMessageDecoder} belongs to
+		 * @param msg the message to decode to an other one
+		 * @param out the {@link List} to which decoded messages should be added
+		 * @throws Exception is thrown if an error accour
+		 */
+		@Override
+		protected void decode(ChannelHandlerContext ctx, ByteBuf msg, List out) throws Exception {
+
+			LOG.debug("ChannelHandlerContext name: {}, channel: {}", ctx.name(), ctx.channel());
+
+			if(secureCookie == null || secureCookie.length() == 0) {
+				LOG.debug("Not validating secure cookie since the server configuration is not enabled to use cookie");
+				return;
+			}
+
+			LOG.debug("Going to decode the secure co
```
[GitHub] flink pull request #2425: FLINK-3930 Added shared secret based authorization...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/flink/pull/2425#discussion_r106335331

--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/netty/CookieHandler.java ---
```
+	public static class ServerCookieDecoder extends MessageToMessageDecoder {
+
+		private final String secureCookie;
+
+		private final List channelList = new ArrayList<>();
```
--- End diff --

Is it better to use `Set` instead of a `List` here? As it is mainly used for lookup.
[GitHub] flink issue #3486: [FLINK-5981][SECURITY]make ssl version and cipher suites ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3486 Since it hasn't been merged yet, I've turned them around. Sorry for the carelessness :(
[GitHub] flink issue #3486: [FLINK-5981][SECURITY]make ssl version and cipher suites ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3486

@vijikarthi I've checked the [JDK doc](http://docs.oracle.com/javase/8/docs/technotes/guides/security/StandardNames.html#ciphersuites) and haven't found any notes about combinations of SSL version and cipher suites. About cipher suites, it says `The following list contains the standard JSSE cipher suite names. Over time, various groups have added additional cipher suites to the SSL/TLS namespace.`, so I think we'd better not add an additional description about that and let users follow the JRE/JDK rules.

@StephanEwen The two comments you mentioned are fixed :)
[GitHub] flink issue #3486: [FLINK-5981][SECURITY]make ssl version and cipher suites ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3486 Thanks for the review @vijikarthi. I will check whether there are mismatches between protocols and cipher suites and document them if any.
[GitHub] flink issue #3486: [FLINK-5981][SECURITY]make ssl version and cipher suites ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3486 @StephanEwen I've added some test cases for the new functionality and an ITCase to prove that akka cannot accept more than one protocol setting. Let me know if there's a better way to implement it :)
[GitHub] flink pull request #3525: [FLINK-6020]add a random integer suffix to blob ke...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3525 [FLINK-6020]add a random integer suffix to blob key to avoid naming conflicting

In yarn-cluster mode, if we submit the same job multiple times in parallel, the tasks run into class-loading problems and HDFS lease contention. The blob server stores user jars under names derived from their sha1sum: it first writes a temp file and then moves it into place to finalize. For recovery it also puts the jars on HDFS under the same file names. So when multiple clients submit the same job with the same jar at the same time, the local jar files in the blob server and the files on HDFS are handled by multiple threads (BlobServerConnection) that interfere with each other.

I've found a way to solve this by adding a random integer suffix to the blob key, as changed here.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-6020 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3525.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3525

commit 3d9f41afad9c831431b3c7bd0eb2a8006b92718e Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-03-13T11:52:36Z add a random integer suffix to blob key to avoid naming conflicting
[GitHub] flink issue #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTaskSlots a...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3408 ping @tillrohrmann
[GitHub] flink issue #3414: [FLINK-5904][YARN]make jobmanager.heap.mb and taskmanager...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3414 @tillrohrmann I've changed per comments. Mind reviewing again? Thanks :)
[GitHub] flink issue #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTaskSlots a...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3408 I moved the initialization of this config to the constructor of the cluster descriptor and restored the deleted configuration setting. Please check whether the usage of `YARN_VCORES` looks good.
[GitHub] flink pull request #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTask...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/flink/pull/3408#discussion_r104867661

--- Diff: flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
```
@@ -537,7 +543,6 @@ public YarnClusterClient createCluster(
 		Preconditions.checkNotNull(userJarFiles, "User jar files should not be null.");
 
 		AbstractYarnClusterDescriptor yarnClusterDescriptor = createDescriptor(applicationName, cmdLine);
-		yarnClusterDescriptor.setFlinkConfiguration(config);
```
--- End diff --

all right
[GitHub] flink pull request #3414: [FLINK-5904][YARN]make jobmanager.heap.mb and task...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/flink/pull/3414#discussion_r104862993

--- Diff: flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
```
@@ -306,12 +308,16 @@ public AbstractYarnClusterDescriptor createDescriptor(String defaultApplicationN
 		if (cmd.hasOption(JM_MEMORY.getOpt())) {
 			int jmMemory = Integer.valueOf(cmd.getOptionValue(JM_MEMORY.getOpt()));
 			yarnClusterDescriptor.setJobManagerMemory(jmMemory);
+		} else if (config.containsKey(ConfigConstants.JOB_MANAGER_HEAP_MEMORY_KEY)) {
+			yarnClusterDescriptor.setJobManagerMemory(config.getInteger(ConfigConstants.JOB_MANAGER_HEAP_MEMORY_KEY, ConfigConstants.DEFAULT_JOB_MANAGER_HEAP_MEMORY));
```
--- End diff --

That's a good idea. We don't need to set it explicitly here if we initialize it in the constructor :)
[GitHub] flink pull request #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTask...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/flink/pull/3408#discussion_r104862308

--- Diff: flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
```
@@ -317,6 +319,10 @@ public AbstractYarnClusterDescriptor createDescriptor(String defaultApplicationN
 		if (cmd.hasOption(SLOTS.getOpt())) {
 			int slots = Integer.valueOf(cmd.getOptionValue(SLOTS.getOpt()));
 			yarnClusterDescriptor.setTaskManagerSlots(slots);
+		} else if (config.containsKey(ConfigConstants.YARN_VCORES)) {
```
--- End diff --

The documentation says YARN_VCORES > slotsOfTaskManager > default value (1) when they are set in the file. Parameters on the command line take precedence over those in the config file; in some way that is common sense :)
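The precedence described here (command line first, then the config-file value, then the default of 1) can be sketched as a tiny resolution function. The names are illustrative, not Flink's code:

```java
// Illustrative sketch of the slot-count precedence discussed in this review
// (hypothetical names, not Flink's actual FlinkYarnSessionCli code).
public class SlotPrecedence {

    public static int resolveSlots(Integer cmdLineSlots, Integer configFileSlots) {
        if (cmdLineSlots != null) {
            return cmdLineSlots;      // command-line -s flag always wins
        }
        if (configFileSlots != null) {
            return configFileSlots;   // then the value from the config file
        }
        return 1;                     // documented default
    }
}
```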
[GitHub] flink pull request #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTask...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/flink/pull/3408#discussion_r104855322

--- Diff: flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
```
+		} else if (config.containsKey(ConfigConstants.YARN_VCORES)) {
```
--- End diff --

And the documentation still has a description of YARN_VCORES:

```
yarn.containers.vcores	The number of virtual cores (vcores) per YARN container. By default, the number of vcores is set to the number of slots per TaskManager, if set, or to 1, otherwise.
```
[GitHub] flink pull request #3414: [FLINK-5904][YARN]make jobmanager.heap.mb and task...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/flink/pull/3414#discussion_r104854352

--- Diff: flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java ---
```
@@ -110,7 +110,12 @@
 	public static final String EXECUTION_RETRY_DELAY_KEY = "execution-retries.delay";
 
 	// Runtime
-
+
+	/**
+	 * JVM heap size (in megabytes) for the JobManager
+	 */
+	public static final String JOB_MANAGER_HEAP_MEMORY_KEY = "jobmanager.heap.mb";
```
--- End diff --

Nice. I've changed it :)
[GitHub] flink pull request #3414: [FLINK-5904][YARN]make jobmanager.heap.mb and task...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/flink/pull/3414#discussion_r104853211

--- Diff: flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
```
+		} else if (config.containsKey(ConfigConstants.JOB_MANAGER_HEAP_MEMORY_KEY)) {
+			yarnClusterDescriptor.setJobManagerMemory(config.getInteger(ConfigConstants.JOB_MANAGER_HEAP_MEMORY_KEY, ConfigConstants.DEFAULT_JOB_MANAGER_HEAP_MEMORY));
```
--- End diff --

Do you mean we should only declare the `jobManagerMemoryMb` field but not initialize it with a value?
[GitHub] flink issue #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTaskSlots a...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3408 I'm sorry about the commit message :( Next time I'll format it, as it's better not to squash.
[GitHub] flink pull request #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTask...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/flink/pull/3408#discussion_r104837998

--- Diff: flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
```
-		yarnClusterDescriptor.setFlinkConfiguration(config);
```
--- End diff --

@tillrohrmann Under what conditions will this configuration contain values added programmatically? I've checked the code and only found this config being initialized from the config file.
[GitHub] flink pull request #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTask...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/flink/pull/3408#discussion_r104836285

--- Diff: flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
```
+		} else if (config.containsKey(ConfigConstants.YARN_VCORES)) {
```
--- End diff --

@tillrohrmann You mean YARN_VCORES is deprecated now? After checking the code I found two places where we still use it: [here](https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/AbstractYarnClusterDescriptor.java#L317) and [here](https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkResourceManager.java#L339); the latter in particular is used for the container request.
[GitHub] flink pull request #3486: [FLINK-5981][SECURITY]make ssl version and cipher ...
Github user WangTaoTheTonic commented on a diff in the pull request: https://github.com/apache/flink/pull/3486#discussion_r104828499

--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/net/SSLUtils.java ---
```
@@ -55,6 +58,42 @@ public static boolean getSSLEnabled(Configuration sslConfig) {
 	}
 
 	/**
+	 * Sets SSL version and cipher suites for SSLServerSocket
+	 * @param socket
+	 *          Socket to be handled
+	 * @param config
+	 *          The application configuration
+	 */
+	public static void setSSLVerAndCipherSuites(ServerSocket socket, Configuration config) {
+		if (socket instanceof SSLServerSocket) {
+			((SSLServerSocket) socket).setEnabledProtocols(config.getString(
+				ConfigConstants.SECURITY_SSL_PROTOCOL,
+				ConfigConstants.DEFAULT_SECURITY_SSL_PROTOCOL).split(","));
```
--- End diff --

By "do explicit handling", do you mean we should check that the elements in the split results are all legal (must be one of `SSLv3, TLSv1, TLSv1.1`)?

BTW/FYI: This config also applies to akka, and it seems akka supports only a single-value setting, not multiple ones (it throws an exception when I set the value to `TLSv1.1,TLSv1.2`).
[GitHub] flink issue #3415: [FLINK-5916][YARN]make env.java.opts.jobmanager and env.j...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3415 @tillrohrmann Thanks for the review. I've added the logic to check that the config is non-empty, plus test cases :)
[GitHub] flink pull request #3486: [FLINK-5981][SECURITY]make ssl version and cipher ...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3486 [FLINK-5981][SECURITY]make ssl version and cipher suites work as configured

I configured SSL and started a Flink job, but found the configured properties did not apply properly:
```
akka port: only cipher suites apply correctly; the ssl version does not
blob server/netty server: neither the ssl version nor the cipher suites match what I configured
```
I've found the reasons:
http://stackoverflow.com/questions/11504173/sslcontext-initialization (for blob server and netty server)
https://groups.google.com/forum/#!topic/akka-user/JH6bGnWE8kY (for the akka ssl version; it's fixed in akka 2.4: https://github.com/akka/akka/pull/21078)

Configs:
```
security.ssl.protocol: TLSv1.1
security.ssl.algorithms: TLS_RSA_WITH_AES_128_CBC_SHA
```
**The scan results before:** ![before_blob_server](https://cloud.githubusercontent.com/assets/5276001/23655830/d37eb680-0371-11e7-952c-4a6514b1c42b.JPG)
**The scan results after the fix:** ![after_blob_server](https://cloud.githubusercontent.com/assets/5276001/23655841/dfc09da0-0371-11e7-8486-bc807e877dff.JPG)

You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-5981 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3486.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3486 commit c75c2e3f38e0a856ead1316223ad3d81061e4252 Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-03-07T12:05:21Z make ssl version and cipher suites work as configured
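For the blob server and netty server side, the core of the fix is to apply the configured protocol list to the `SSLServerSocket` after creation, roughly like this sketch (class and method names are illustrative, not Flink's actual code; cipher suites would be set analogously via `setEnabledCipherSuites`):

```java
import javax.net.ssl.SSLServerSocket;
import javax.net.ssl.SSLServerSocketFactory;
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Arrays;

public class SslSocketConfig {

    /** Applies configured protocol versions to a server socket; a no-op for
     *  plain (non-SSL) sockets. Illustrative sketch, not Flink's real code. */
    public static void setSslProtocols(ServerSocket socket, String configured) {
        if (socket instanceof SSLServerSocket) {
            ((SSLServerSocket) socket).setEnabledProtocols(configured.split(","));
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 0 lets the OS pick a free port; no keystore is needed just to
        // create the socket object and adjust its enabled protocols.
        ServerSocket socket =
            SSLServerSocketFactory.getDefault().createServerSocket(0);
        setSslProtocols(socket, "TLSv1.2");
        System.out.println(Arrays.toString(
            ((SSLServerSocket) socket).getEnabledProtocols()));
        socket.close();
    }
}
```

Without this step the socket keeps the JVM's default enabled protocols, which is exactly why the configured values did not show up in the scan results above.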
[GitHub] flink issue #3274: [FLINK-5723][UI]Use Used instead of Initial to make taskm...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3274 sure. Fixed in
- 1.2.1 via e3e3c2a7f9c8dd8576e0e27b2efddb7ff42c7c0d
- 1.3.0 via 03e6c249156fbbfeef39397a70c70bb905469d09
[GitHub] flink pull request #3274: [FLINK-5723][UI]Use Used instead of Initial to mak...
Github user WangTaoTheTonic closed the pull request at: https://github.com/apache/flink/pull/3274
[GitHub] flink pull request #3415: [FLINK-5916][YARN]make env.java.opts.jobmanager an...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3415 [FLINK-5916][YARN]make env.java.opts.jobmanager and env.java.opts.taskmanager working in YARN mode

Currently only env.java.opts works in YARN mode, and it applies to both the JM and the TM. It would be useful to also make env.java.opts.jobmanager and env.java.opts.taskmanager work in YARN mode, to support fine-grained parameter setting.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-5916 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3415.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3415 commit 63a4e82858109644c8c9f940fd891b9aea11090a Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-02-25T07:35:19Z make env.java.opts.jobmanager and env.java.opts.taskmanager working in YARN mode
[GitHub] flink pull request #3414: [FLINK-5904][YARN]make jobmanager.heap.mb and task...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3414 [FLINK-5904][YARN]make jobmanager.heap.mb and taskmanager.heap.mb work in YARN mode

I'm making these two configuration items behave the same as "-yjm"/"-ytm" in yarn-session mode and "-jm"/"-tm" in single-job mode. This might cause some misunderstanding, and we don't have a proper config item here. What better approach would you suggest?

You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-5904 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3414.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3414 commit b21d27ea99ffa613a8eb9c00a4f977b619b12418 Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-02-25T04:19:43Z make jobmanager.heap.mb and taskmanager.heap.mb work in YARN mode
[GitHub] flink pull request #3408: [FLINK-5903][YARN]respect taskmanager.numberOfTask...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3408 [FLINK-5903][YARN]respect taskmanager.numberOfTaskSlots and yarn.containers.vcores in YARN mode

Make sure taskmanager.numberOfTaskSlots and yarn.containers.vcores work in YARN mode. The priority is: -s/-ys > yarn.containers.vcores > taskmanager.numberOfTaskSlots.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-5903 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3408.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3408 commit 3515f9faf4b40bff7310a55b7094b52999525cbb Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-02-24T08:11:47Z xxx
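The priority chain described above could be sketched like this (a hypothetical helper for illustration, not Flink's actual resolution code):

```java
public class VcoresResolver {

    /** Resolves vcores per TaskManager with the priority:
     *  -s/-ys CLI option > yarn.containers.vcores > taskmanager.numberOfTaskSlots.
     *  Null means the option/config was not set. Illustrative only. */
    public static int resolveVcores(
            Integer cliSlots, Integer configuredVcores, int numberOfTaskSlots) {
        if (cliSlots != null) {
            return cliSlots;          // explicit CLI option wins
        }
        if (configuredVcores != null) {
            return configuredVcores;  // then the YARN-specific config key
        }
        return numberOfTaskSlots;     // fall back to the slot count
    }

    public static void main(String[] args) {
        System.out.println(resolveVcores(4, 8, 1));       // CLI option set
        System.out.println(resolveVcores(null, 8, 1));    // only config set
        System.out.println(resolveVcores(null, null, 1)); // fallback
    }
}
```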
[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3335 @greghogan I think it's more of an improvement than a new feature. Anyway, I'll post to the mailing list for discussion. Thanks, everyone :)
[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3335 As sub-dirs are created by different jobs/users under the root directory, we keep their permissions minimal (or configurable) at creation in order to keep the data safe. When a user needs to access another user's checkpoint files, we (the admin or the file owner) can grant that access. This is more flexible and fine-grained than setting ACLs on the root directory, because each user can decide who can touch their checkpoint files ;)
[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3335 @greghogan I'm aware of that, but my concern is that when lots of users store their checkpoint files under the same root directory, it becomes a burden for the admin to set different ACLs for different needs: user1 can read user2's and user3's files, user2 can only read user1's files, user3 wants to read user4's files, and so on. Setting a single ACL (like flink_admin) that allows one group to access everything is not fine-grained, since there are cases where a user (like user1) should be allowed to access only some, not all, of the sub-directories (like those created by user2 and user3).
[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3335 We are very close in scenario :) My point is that multiple users would use the same root directory to store their checkpoint files (creating a separate directory for each job is complex), which makes it very hard for the admin to set proper permissions on it. Adding a configuration item is a very good idea. Would it be better if this configuration were applied to the sub-directories each job creates? That would isolate access between different users' checkpoint files and could also be customized for migration. @StephanEwen
[GitHub] flink issue #3274: [FLINK-5723][UI]Use Used instead of Initial to make taskm...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3274 @StephanEwen What do you think about this? Could you help merge this if you're ok with it? Thanks :)
[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3335 I've read everyone's comments and will list the preconditions and solutions for this directory permission setting.

## Preconditions

1. Every Flink job (session or single) can specify a directory for storing checkpoints, called `state.backend.fs.checkpointdir`.
2. Different jobs can set the same or different directories, which means their checkpoint files can be stored in one shared directory or in separate ones, with a **sub-dir** created per job-id.
3. Jobs can be run by different users, and users require that one user cannot read checkpoint files written by another, which would otherwise cause an information leak.
4. In some conditions (which are relatively rare, I think), as @StephanEwen said, users need to access other users' checkpoint files for cloning/migrating jobs.
5. A checkpoint file path looks like: `hdfs://namenode:port/flink-checkpoints//chk-17/6ba7b810-9dad-11d1-80b4-00c04fd430c8`

## Solutions

### Solution #1 (would not require changes)

1. Admins control the permissions of the root directory via HDFS ACLs (set it like: user1 can read, user2 can only read, …).
2. This has two disadvantages: a) it is a huge burden for admins to set different permissions for a large number of users/groups; and b) sub-dirs inherit permissions from the root directory, which means they are basically the same, making fine-grained control hard.

### Solution #2 (this proposal)

1. We don't care what the permissions of the root dir are. It can be created at setup time or while a job is running, as long as it is available for use.
2. We control every sub-dir created by jobs (which are submitted by different users, in most cases) and set it to a lower value (like "700") to prevent it from being read by others.
3. If someone wants to migrate or clone jobs across users (again, this scenario is rare in my view), they should ask the admins (normally HDFS admins) to add ACLs (or similar) for this purpose.
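Solution #2 (restricting each per-job sub-directory to its owner) can be sketched on a local POSIX filesystem like this; the class and method names are hypothetical, and the actual change would go through Hadoop's FileSystem/FsPermission API rather than java.nio:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class CheckpointDirPermissions {

    /** Creates a per-job checkpoint sub-directory with 700-style permissions,
     *  analogous to the HDFS change proposed in this PR (local-FS sketch). */
    public static Path createJobDir(Path root, String jobId) throws IOException {
        Files.createDirectories(root);  // root permissions are left alone
        Set<PosixFilePermission> ownerOnly =
            PosixFilePermissions.fromString("rwx------");
        return Files.createDirectory(root.resolve(jobId),
            PosixFilePermissions.asFileAttribute(ownerOnly));
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("flink-checkpoints");
        Path jobDir = createJobDir(root, "6ba7b810-9dad-11d1-80b4-00c04fd430c8");
        System.out.println(PosixFilePermissions.toString(
            Files.getPosixFilePermissions(jobDir)));
    }
}
```

Only the per-job sub-directory is tightened, so the shared root stays usable by everyone while each job's files are readable only by their owner.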
[GitHub] flink pull request #3337: [FLINK-5825][UI]make the cite path relative to sho...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3337 [FLINK-5825][UI]make the cite path relative to show it correctly

In yarn mode, the web frontend url is accessed through yarn in a format like "http://spark-91-206:8088/proxy/application_1487122678902_0015/", and the running job page's url is "http://spark-91-206:8088/proxy/application_1487122678902_0015/#/jobs/9440a129ea5899c16e7c1a7e8f2897b3". One small .png file, "horizontal.png", cannot be loaded in that mode because "index.styl" cites it with an absolute path. We should make the path relative.

- [ ] Tests & Build

Later I will paste the difference between before and after.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-5825 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3337.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3337 commit c2e2711b2f815a4aa1b9c8be2478c963c41da245 Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-02-17T06:48:14Z make the cite path relative to show it correctly
[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3335 Hi @greghogan , I'm not sure I understand the relationship between HDFS ACLs and this change I proposed. Could you explain more specifically? Thanks.
[GitHub] flink issue #3335: [FLINK-5818][Security]change checkpoint dir permission to...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3335 Hi Stephan, you may have a slight misunderstanding about this change. It only controls the directories named with the job id (generated using UUID), not the configured root checkpoint directory. I agree with you that the root directory should be created or have its permissions changed at setup, but setup would not be aware of these job-id directories, which are created at runtime. About the Hadoop dependency, I admit I am using a convenient (let's say hacky) way to do the transition, as doing it standalone needs a bit more code. I will change it if it's a problem :)
[GitHub] flink pull request #3335: [FLINK-5818][Security]change checkpoint dir permis...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3335 [FLINK-5818][Security]change checkpoint dir permission to 700

Currently the checkpoint directory is created without a specified permission, so it is easy for another user to delete or read files under it, which can cause restore failures or information leaks. It's better to lower it to 700.

- [x] Tests & Build - Functionality added by the pull request is covered by tests

![chp-filesystem-session](https://cloud.githubusercontent.com/assets/5276001/23019741/d753e8e0-f47e-11e6-9f2e-2cd35de35ef1.JPG)

You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink FLINK-5818 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3335.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3335 commit 02eef87dc2bfaa6737efad023916898719d34fe2 Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-02-16T11:24:43Z change checkpoint dir permission to 700
[GitHub] flink issue #3274: [FLINK-5723][UI]Use Used instead of Initial to make taskm...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3274 @zentol Is it ok to go?
[GitHub] flink issue #3274: [FLINK-5723][UI]Use Used instead of Initial to make taskm...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3274 Thanks. That makes sense :)
[GitHub] flink issue #3274: [FLINK-5723][UI]Use Used instead of Initial to make taskm...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3274 Thanks @greghogan. Should I make another PR against branch `release-1.2`?
[GitHub] flink issue #3274: [FLINK-5723][UI]Use Used instead of Initial to make taskm...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3274 Travis timed out :) How do I kick it off again?
[GitHub] flink pull request #3283: [FLINK-5729][EXAMPLES]add hostname option to be mo...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3283 [FLINK-5729][EXAMPLES]add hostname option to be more convenient

The "hostname" option helps users get data from the right host and port; without it the example easily fails with connection refused, especially in yarn-session mode.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink hostname Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3283.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3283 commit 1a8117f8a70a6bd8e88b3a2cb245b4f9db4f9193 Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-02-07T07:52:26Z add hostname option to be more convenient
[GitHub] flink issue #3274: [FLINK-5723][UI]Use Used instead of Initial to make taskm...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3274 @zentol Thanks for the review. I've changed the .jade file, but it looks like CI didn't start properly :(
[GitHub] flink pull request #3274: [FLINK-5723][UI]Use Used instead of Initial to mak...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3274 [FLINK-5723][UI]Use Used instead of Initial to make taskmanager tag more readable

Currently in the JobManager web frontend, the used memory of task managers is presented as "Initial" in the table header, which, judging from the code, actually means "memory used". I'd like to change it to be more readable, even though it's a trivial change.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink changeheader Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3274.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3274 commit 475e668aa1015c10cd69ffa88b6b0c4e3aeeb75f Author: unknown <wangtaotheto...@163.com> Date: 2017-02-06T15:18:38Z Use Used instead of Initial to make taskmanager tag more readable
[GitHub] flink issue #3071: [FLINK-5417][DOCUMENTATION]correct the wrong config file ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3071 Surely not :)
[GitHub] flink issue #3071: [FLINK-5417][DOCUMENTATION]correct the wrong config file ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3071 I guess it is probably that Illustrator added something.
[GitHub] flink issue #3071: [FLINK-5417][DOCUMENTATION]correct the wrong config file ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3071 Hi @zentol , I've updated the svg using Inkscape. Is the whitespace you refer to the part at the top (the red highlighted part)? I think it's normal, as the original has it too. It won't affect how it renders in the documents. ![default](https://cloud.githubusercontent.com/assets/5276001/21953009/8eef0426-da67-11e6-850a-935268ad019e.JPG)
[GitHub] flink issue #3071: [FLINK-5417][DOCUMENTATION]correct the wrong config file ...
Github user WangTaoTheTonic commented on the issue: https://github.com/apache/flink/pull/3071 I used Illustrator to edit the svg file, which adds some header info that caused CI to fail. Is there a preferred svg editor?
[GitHub] flink pull request #3071: [FLINK-5417][DOCUMENTATION]correct the wrong confi...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/flink/pull/3071 [FLINK-5417][DOCUMENTATION]correct the wrong config file name

As the config file name is conf/flink-conf.yaml, the reference to "conf/flink-config.yaml" in the documentation is wrong and easily confuses users. We should correct it.

You can merge this pull request into a Git repository by running: $ git pull https://github.com/WangTaoTheTonic/flink outdate Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3071.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3071 commit 8c927bda55fead6b4cc90f49151c19907ac3700f Author: WangTaoTheTonic <wangtao...@huawei.com> Date: 2017-01-06T04:12:31Z fix the wrong config file name