Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3414
@tillrohrmann I've changed per comments. Mind reviewing again? Thanks :)
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as wel
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3408
ping @tillrohrmann
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3525
[FLINK-6020]add a random integer suffix to blob key to avoid naming
conflicting
In yarn-cluster mode, if we submit the same job multiple times in parallel,
the task will encounter class
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3486
@StephanEwen I've added some test cases for the new function and an
ITCase to prove akka cannot accept more than one protocol setting. Let me know
if there's better way to impl
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3486
Thanks for the review @vijikarthi. I will check whether there is any mismatch between
protocols and cipher suites and document it if any.
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3486
@vijikarthi I've checked the [JDK
doc](http://docs.oracle.com/javase/8/docs/technotes/guides/security/StandardNames.html#ciphersuites)
and have not found any notes about combination o
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3486
Since not merged, I've turned them around. Sorry for the carelessness :(
Github user WangTaoTheTonic commented on a diff in the pull request:
https://github.com/apache/flink/pull/2425#discussion_r106335331
--- Diff:
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/netty/CookieHandler.java
---
@@ -0,0 +1,130 @@
+/**
+ * Licensed
Github user WangTaoTheTonic commented on a diff in the pull request:
https://github.com/apache/flink/pull/2425#discussion_r106335560
--- Diff:
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/netty/CookieHandler.java
---
@@ -0,0 +1,130 @@
+/**
+ * Licensed
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3486
Fixed
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3525
The second rename will not fail, but it will corrupt the file written by the
first, which will make the first job fail if the task is loading
this jar.
by the way, the jar
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3525
Right... I had the same thought as you at the beginning and I tried to make
the move atomic, but it has several side effects, like:
1. if we use this way to handle this, which means tw
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3525
ping @StephanEwen
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3599
[FLINK-6174][HA]introduce a new election service to make JobManager always
available
Now in yarn mode, if we use zookeeper as the high availability choice, it will
create an election service to
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3599
Thanks for your comments @wenlong88 .
I also thought about adding retry logic for zk failover, but that
would require modifying `LeaderLatch` in curator, which is a 3rd party
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3408
@tillrohrmann @StephanEwen
The code was changed and I've verified the functions, could you please
review this and merge it if it's good to go?
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3599
I don't think it's a good idea, as it cannot solve the "split brain" issue
either.
The key problem is that `LeaderLatch` in curator is too sensitive to
connectio
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3614
[FLINK-6189][YARN]Do not use yarn client config to do sanity check
Now in client, if #slots is greater than the number of
"yarn.nodemanager.resource.cpu-vcores" in yarn client c
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3617
[FLINK-6192]reuse zookeeper client created by CuratorFramework
Now in yarn mode, there're three places using zookeeper client(web monitor,
jobmanager and resourcemanage
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3617
Glad we have same idea :) @tillrohrmann
I'll mark the JIRA duplicated now and close this PR as soon as you open the
new one.
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3617
Can you please review https://github.com/apache/flink/pull/3408, by the
way? @tillrohrmann
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3408
@tillrohrmann I still cannot get your point entirely. Don't these three
configs (`-s/-ys`, `yarn.containers.vcores` and `taskmanager.numberOfTaskSlots`)
mean the same thing? Do they have diffe
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3408
All right. That makes sense. Let me rephrase that and please check if we
are on the same page:
1. slots of taskmanager is decided by `-s/-ys` and
`taskmanager.numberOfTaskSlots`, the
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3599
@StephanEwen
I've done the changes, which introduce a new, smarter leader latch (the
reason I wrote a new class is that the `handleStateChange` method is private in
`LeaderLatch
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3599
@wenlong88 Feel free to review, thanks :)
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3408
@tillrohrmann
After changing the code, the test results (both session and single mode) look like this:
| Configurations | #vcores of container(TM) |
#slots of TM
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3525
hi stephan, could you help review? @StephanEwen
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3704
Could you tell us what modifications were made in "FRocksDB" and post the url
of the source code repository?
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3709
[FLINK-6295]use LoadingCache instead of WeakHashMap to lower latency
Now in ExecutionGraphHolder, which is used in many handlers, we use a
WeakHashMap to cache ExecutionGraph(s), which is
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3727
[FLINK-6312]update curator version to 2.12.0 to avoid potential block
As there's a Major
bug([CURATOR-344](https://issues.apache.org/jira/browse/CURATOR-344)) in
curator release us
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
@wenlong88 LoadingCache can also cache and evict data as WeakHashMap, as
this implementation shows it will evict data every 30 seconds and fetch data if
it doesn't contain the require
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
@zentol The execution graphs cached in `ExecutionGraphHolder`(which is
backed by a WeakHashMap) will be evicted only when gc happens.
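The difference discussed in this thread can be sketched as follows. A `WeakHashMap` only drops an entry once the garbage collector has cleared its key, so a stale `ExecutionGraph` can stay visible for an unbounded time; a loading cache with time-based expiry reloads an entry once it is older than a fixed age. The sketch below is a minimal, stdlib-only stand-in and not the code from the pull request: the class `ExpiringLoadingCache`, its constructor parameters, and the injectable clock are hypothetical names chosen for illustration. Guava's `CacheBuilder.newBuilder().expireAfterWrite(30, TimeUnit.SECONDS).build(cacheLoader)` offers comparable behaviour out of the box.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for a loading cache with time-based expiry (not Flink code). */
class ExpiringLoadingCache<K, V> {

    /** A cached value together with the time at which it was loaded. */
    private static final class Entry<V> {
        final V value;
        final long loadedAtMillis;

        Entry(V value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final Function<K, V> loader;   // fetches a fresh value, e.g. by asking the JobManager
    private final long ttlMillis;          // maximum age of an entry before it is reloaded
    private final LongSupplier clock;      // injectable so that expiry can be tested deterministically

    ExpiringLoadingCache(Function<K, V> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, reloading it if it is missing or at least ttlMillis old. */
    synchronized V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
            entry = new Entry<>(loader.apply(key), now);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`; here expiry depends only on elapsed time, never on when a garbage collection happens to run.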
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
@zentol No you're wrong.
If you take a look at `ExecutionGraphHolder`, you'll find the graphs in it
are generated from message answered by JobManager, which means
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
I'm not an akka expert. As we observed, the status of cancelled tasks will
be updated to running only when gc happens in JM.
Way to reproduce:
1. launch a flink job with ha mode
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
In my opinion EGHolder is simply a cache which should not be assigned too
complicated a task.
If we add the check logic, how long should it be? Will other events
affect status of
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
I mean who's in charge of updating EGHolder? EGHolder itself or JobManager?
EGHolder doesn't sense job status changes until it queries the JobManager
periodically.
If
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3414
Thanks. I've resolved conflicts. enjoy :)
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
Ok i think i've got your point.
Now using WeakHashMap, we add entries when the map doesn't contain the
requested EG id, remove invalid entries when GC happens.
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
That means every time EGHolder receives a request, it will check if the job
status in the request is suspended or not, right? This will make the cache in EGHolder
meaningless.
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3727
Sure. Seems like it will take quite a while, but I'll try my best :)
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
@zentol How do we know if a requested job is suspended or not, as the status
of jobs in the backend is always changing?
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
I got it, but still have one question: what about the other state
transitions, like when the job is cancelling, failing, or something else?
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
My main concern is that the status showing in web doesn't match the actual
state backend.
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3745
[FLINK-6341]Don't let JM fall into infinite loop
When a TaskManager registers with the JobManager, JM will send a
"NotifyResourceStarted" message to kick off Resource Manager,
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
All right. I'll change it as you suggest and verify the result. Thanks for
the comments and advice :)
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
I've tested it and the function works. Please check if it's good to go, thanks!
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3745
@tillrohrmann What problem will it cause if we access
`currentResourceManager` from another thread? It is a variable in JobManager
and can be shared across multiple threads, right? The new
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3745
I see. Connection ID is added, please check if it's ok :)
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3727
@StephanEwen All tests passed!
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3709
@StephanEwen Sure. The current fix is like a "pull", while what you suggest
is a "push" way. Both can fix it; they just differ in how the EGs are
updated.
Github user WangTaoTheTonic closed the pull request at:
https://github.com/apache/flink/pull/3727
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3525
For the HA case, the blob server will upload jars to HDFS for recovery, and
there are concurrent operations here too. I'm not sure if the solution you
proposed can cover that.
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3525
That looks good to me. Looking forward to the fix from @tillrohrmann. Thank you
very much :)
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3274
@zentol Is it ok to go?
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3335
[FLINK-5818][Security]change checkpoint dir permission to 700
Now checkpoint directory is made w/o specified permission, so it is easy
for another user to delete or read files under it
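One way to create a directory that only its owner can read, write, or list is to pass the permission set at creation time, so the directory is never created with wider permissions first. The sketch below uses `java.nio.file` on a local POSIX file system purely as an illustration; the pull request concerns Flink's checkpoint directories, which may live on HDFS or another file system with a different API. The class and method names (`PrivateDirs`, `createOwnerOnlyDirectory`, `permissionString`) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

/** Hypothetical helper, not Flink code: creates directories with mode 700. */
class PrivateDirs {

    /** Creates parent/name with rwx for the owner and no access for group or others. */
    static Path createOwnerOnlyDirectory(Path parent, String name) {
        Set<PosixFilePermission> ownerOnly = PosixFilePermissions.fromString("rwx------");
        try {
            // The attribute is applied atomically as part of directory creation.
            return Files.createDirectory(
                    parent.resolve(name),
                    PosixFilePermissions.asFileAttribute(ownerOnly));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the directory's permissions in "rwxr-x---" style notation. */
    static String permissionString(Path dir) {
        try {
            return PosixFilePermissions.toString(Files.getPosixFilePermissions(dir));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On a file system without POSIX permission support, `Files.createDirectory` rejects the attribute with an `UnsupportedOperationException`, so a real implementation would need a fallback for that case.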
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
Hi Stephan,
You may have a little misunderstanding about this change. It only controls
directories with job id (generated using UUID), but not the configured root
checkpoint
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
Hi @greghogan , I'm not sure I understand the relationship between HDFS
ACLs and this change I proposed. Could you explain more specifically? Thanks.
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3337
[FLINK-5825][UI]make the cite path relative to show it correctly
In yarn mode, the web frontend url is accessed from yarn in format like
"http://spark-91-206:8088/
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
I've read everyone's comments and listed the preconditions and solutions for this directory
permission setting.
## Preconditions
1. Every flink job(session or single) can specify a dire
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3274
@StephanEwen What do you think about this? Could you help merge this if
you're ok with it? Thanks :)
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
We are very close in scenario :)
My point is that multiple users would use same root directory to store
their checkpoint files(creating single directory for each job is complex
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
@greghogan I'm aware of that, but my concern is when lots of users store
their checkpoint files under same root directory, it would be a burden for
admin to set different ACLs for diff
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
As sub dirs are created by different jobs/users under root directory, we
keep it minimum(or configurable) at creation in order to keep the data safe.
When a user has needs of
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3335
@greghogan I think it's more like an improvement rather than a new feature.
Anyway I'll post to the mailing list for discussion.
Thanks all guys :)
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3408
[FLINK-5903][YARN]respect taskmanager.numberOfTaskSlots and
yarn.containers.vcores in YARN mode
Make sure taskmanager.numberOfTaskSlots and yarn.containers.vcores works in
YARN mode. The
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3414
[FLINK-5904][YARN]make jobmanager.heap.mb and taskmanager.heap.mb work in
YARN mode
I'm making these two configuration items the same as "-yjm"/"-ytm" in yarn
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3415
[FLINK-5916][YARN]make env.java.opts.jobmanager and
env.java.opts.taskmanager working i…
Now only env.java.opts works in YARN mode, and it applies both to JM and
TM.
It
Github user WangTaoTheTonic closed the pull request at:
https://github.com/apache/flink/pull/3274
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3274
sure.
Fixed in
- 1.2.1 via e3e3c2a7f9c8dd8576e0e27b2efddb7ff42c7c0d
- 1.3.0 via 03e6c249156fbbfeef39397a70c70bb905469d09
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3486
[FLINK-5981][SECURITY]make ssl version and cipher suites work as configured
I configured ssl and started a flink job, but found that the configured properties
are not applied properly:
```
akka
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3415
@tillrohrmann Thanks for the review. I've added the logic checking if the
config is non-empty and test cases :)
Github user WangTaoTheTonic commented on a diff in the pull request:
https://github.com/apache/flink/pull/3486#discussion_r104828499
--- Diff:
flink-runtime/src/main/java/org/apache/flink/runtime/net/SSLUtils.java ---
@@ -55,6 +58,42 @@ public static boolean getSSLEnabled
Github user WangTaoTheTonic commented on a diff in the pull request:
https://github.com/apache/flink/pull/3408#discussion_r104836285
--- Diff:
flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
@@ -317,6 +319,10 @@ public AbstractYarnClusterDescriptor
Github user WangTaoTheTonic commented on a diff in the pull request:
https://github.com/apache/flink/pull/3408#discussion_r104837998
--- Diff:
flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
@@ -537,7 +543,6 @@ public YarnClusterClient createCluster
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3408
I'm sorry about the commit message :(
next time I'll format it, as it's better not to squash.
Github user WangTaoTheTonic commented on a diff in the pull request:
https://github.com/apache/flink/pull/3414#discussion_r104853211
--- Diff:
flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
@@ -306,12 +308,16 @@ public AbstractYarnClusterDescriptor
Github user WangTaoTheTonic commented on a diff in the pull request:
https://github.com/apache/flink/pull/3414#discussion_r104854352
--- Diff:
flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java ---
@@ -110,7 +110,12 @@
public static final String
Github user WangTaoTheTonic commented on a diff in the pull request:
https://github.com/apache/flink/pull/3408#discussion_r104855322
--- Diff:
flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
@@ -317,6 +319,10 @@ public AbstractYarnClusterDescriptor
Github user WangTaoTheTonic commented on a diff in the pull request:
https://github.com/apache/flink/pull/3408#discussion_r104862308
--- Diff:
flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
@@ -317,6 +319,10 @@ public AbstractYarnClusterDescriptor
Github user WangTaoTheTonic commented on a diff in the pull request:
https://github.com/apache/flink/pull/3414#discussion_r104862993
--- Diff:
flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
@@ -306,12 +308,16 @@ public AbstractYarnClusterDescriptor
Github user WangTaoTheTonic commented on a diff in the pull request:
https://github.com/apache/flink/pull/3408#discussion_r104867661
--- Diff:
flink-yarn/src/main/java/org/apache/flink/yarn/cli/FlinkYarnSessionCli.java ---
@@ -537,7 +543,6 @@ public YarnClusterClient createCluster
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3408
I moved the initialization of this config to the constructor of the cluster
descriptor and restored the deleted configuration setting.
Please check if we are good with the usage of `YARN_VCORES
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3071
[FLINK-5417][DOCUMENTATION]correct the wrong config file name
As the config file name is conf/flink-conf.yaml, the usage
"conf/flink-config.yaml" in document is wrong and easy
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3071
I use Illustrator to edit the svg file, which adds some header info
that causes CI to fail. Is there any preferred svg editor?
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3071
Hi @zentol , I've updated the svg using Inkscape.
Is the whitespace you refer to at the very top (the red highlighted part)? I
think it's normal as the original one has them to
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3071
I guess Illustrator probably added something.
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3071
Surely not :)
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3274
[FLINK-5723][UI]Use Used instead of Initial to make taskmanager tag more
readable
Now in the JobManager web frontend, the used memory of task managers is
presented as "Initial" in ta
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3274
@zentol Thanks for review. I've changed .jade file, but it looks like CI is
not up properly :(
GitHub user WangTaoTheTonic opened a pull request:
https://github.com/apache/flink/pull/3283
[FLINK-5729][EXAMPLES]add hostname option to be more convenient
"hostname" option will help users to get data from the right port,
otherwise the example would fail easily due to
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3274
Travis timed out :) How do I kick it off again?
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3274
Thanks @greghogan. Should I make another PR against branch `release-1.2`?
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3274
Thanks. That makes sense :)
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3525
Thanks for your fix, i'll check in a day or two.
Github user WangTaoTheTonic closed the pull request at:
https://github.com/apache/flink/pull/3525
Github user WangTaoTheTonic commented on the issue:
https://github.com/apache/flink/pull/3525
@tillrohrmann I've tried with your commit and the issue is resolved,
thanks. Closing this PR.