Hi Bhaskar,
> Why the reset counter is not zero after streaming job restart is successful?
The short answer is that the fixed delay restart strategy is not
implemented like that (see [1] if you are using Flink 1.10 or above).
There are also other systems that behave similarly, e.g., Apache
ctory in flink the artifact is
flink-core-1.7.2.jar. I need to pack flink-core-1.9.0.jar from the application
and use child-first class loading to use the newer version of flink-core. If I
have it as provided scope, it will work in IntelliJ but not outside of
it.
>
> Best,
> Nick
>
>
to me so quickly.
>
> Best,
> Nick
>
> On Tue, May 12, 2020 at 3:37 AM Gary Yao wrote:
>>
>> Hi Nick,
>>
>> Are you able to upgrade to Flink 1.9? Beginning with Flink 1.9 you can use
>> KafkaSerializationSchema to produce a ProducerRecord [1][2].
>>
Hi Averell,
If you are seeing the log message from [1] and Scheduled#report() is
not called, the thread in the "Flink-MetricRegistry" thread pool might
be blocked. You can use the jstack utility to see on which task the
thread pool is blocked.
Best,
Gary
[1]
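To illustrate, taking and filtering a thread dump might look like this (the PID is a placeholder you would look up first, e.g. with jps):

```sh
# List running JVMs and identify the TaskManager process.
jps -l
# Dump its threads and inspect the "Flink-MetricRegistry" thread pool.
jstack <taskmanager-pid> | grep -A 20 "Flink-MetricRegistry"
```

If those threads show as BLOCKED or stuck in a reporter call, the stack trace tells you which reporter or task is holding them up.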
Hi Nick,
Are you able to upgrade to Flink 1.9? Beginning with Flink 1.9 you can use
KafkaSerializationSchema to produce a ProducerRecord [1][2].
Best,
Gary
[1] https://issues.apache.org/jira/browse/FLINK-11693
[2]
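As a rough sketch of such a schema (the class name, topic handling, and element type here are made up for illustration; the interface shape matches Flink 1.9's KafkaSerializationSchema):

```java
import java.nio.charset.StandardCharsets;

import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

// Illustrative only: routes each String element to a fixed topic.
public class StringRecordSchema implements KafkaSerializationSchema<String> {

    private final String topic;

    public StringRecordSchema(String topic) {
        this.topic = topic;
    }

    @Override
    public ProducerRecord<byte[], byte[]> serialize(String element, Long timestamp) {
        // Key is null here; a real schema could also derive key,
        // partition, or headers from the element.
        return new ProducerRecord<>(topic, null, timestamp,
                null, element.getBytes(StandardCharsets.UTF_8));
    }
}
```

Because you build the ProducerRecord yourself, you control the target topic, key, and timestamp per element, which the older KeyedSerializationSchema did not fully allow.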
Hi Suraj,
This question has been asked before:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-kafka-consumers-don-t-honor-group-id-td21054.html
Best,
Gary
On Wed, Apr 22, 2020 at 6:08 PM Suraj Puvvada wrote:
>
> Hello,
>
> I have two JVMs that run
> Bytes Sent but Records Sent is always 0
Sounds like a bug. However, I am unable to reproduce this using the
AsyncIOExample [1]. Can you provide a minimal working example?
> Is there an Async Sink? Or do I just rewrite my Sink as an AsyncFunction
followed by a dummy sink?
You will have to
Can you try to set config option taskmanager.numberOfTaskSlots to 2? By
default the TMs only offer one slot [1] independent from the number of CPU
cores.
Best,
Gary
[1]
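For example, in flink-conf.yaml:

```
taskmanager.numberOfTaskSlots: 2
```

Each TM then offers two slots regardless of how many CPU cores the machine has.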
>
> Additionally even though I add all necessary dependencies defined in [1] I
> cannot see ProcessFunctionTestHarnesses class.
>
That class was added in Flink 1.10 [1].
[1]
Hi David,
> Would it be safe to automatically clear the temporary storage every time
when a TaskManager is started?
> (Note: the temporary volumes in use are dedicated to the TaskManager and
not shared :-)
Yes, it is safe in your case.
Best,
Gary
On Tue, Mar 10, 2020 at 6:39 PM David Maddison
Hi Morgan,
> I am interested in knowing more about the failure detection mechanism
used by Flink, unfortunately information is a little thin on the ground and
I was hoping someone could shed a little light on the topic.
It is probably best to look into the implementation (see my answers below).
ly(CaseStatements.scala:21)
>>
>> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>>
>> at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>>
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>>
>
Hi,
Can you post the complete stacktrace?
Best,
Gary
On Tue, Mar 3, 2020 at 1:08 PM kant kodali wrote:
> Hi All,
>
> I am just trying to read edges which has the following format in Kafka
>
> 1,2
> 1,3
> 1,5
>
> using the Table API and then converting to DataStream of Edge Objects and
>
Hi Flavio,
I am looping in Becket (cc'ed) who might be able to answer your question.
Best,
Gary
On Tue, Mar 3, 2020 at 12:19 PM Flavio Pompermaier
wrote:
> Hi to all,
> since Alink has been open sourced, is there any good reason to keep both
> Flink ML and Alink?
> From what I understood
Hi Puneet,
Can you describe how you validated that the state is not restored properly?
Specifically, how did you introduce faults to the cluster?
Best,
Gary
On Tue, Mar 3, 2020 at 11:08 AM Puneet Kinra <
puneet.ki...@customercentria.com> wrote:
> Sorry for the missed information
>
> On
Hi,
There is a release note for Flink 1.7 that could be relevant for you [1]:
Granularity of latency metrics
The default granularity for latency metrics has been modified. To
restore the previous behavior users have to explicitly set the granularity
to subtask.
Best,
Gary
[1]
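Restoring the pre-1.7 behavior would then be a one-line change in flink-conf.yaml (a sketch based on the release note above):

```
metrics.latency.granularity: subtask
```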
Hi David,
Before, with both -n and -s, it was not the case.
>
What do you mean by before? At least in 1.8 "-s" could be used to specify
the number of slots per TM.
how can I be sure that my Sink that uses this lib is in one JVM ?
>
Is it enough that no other parallel instance of your sink
Hi,
You can use
yarn application -status
to find the host and port that the server is listening on (AM host & RPC
Port). If you need to access that information programmatically, take a look
at
the YarnClient [1].
Best,
Gary
[1]
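For example (the application id is a placeholder):

```sh
yarn application -status <application-id>
```

The AM host and RPC port appear in the command's output.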
Hi Eric,
What you say should be possible because your job will be executed in a
MiniCluster [1] which has HA support. I have not tried this out myself,
and I am not aware that people are doing this in production. However,
there are integration tests that use MiniCluster + ZooKeeper [2].
Best,
Hi community,
Because we have approximately one month of development time left until the
targeted Flink 1.10 feature freeze, we thought now would be a good time to
give another progress update. Below we have included a list of the ongoing
efforts that have made progress since our last release
Hi Steve,
> I also tried attaching a shared NFS folder between the two machines and
> tried to set their web.tmpdir property to the shared folder, however it
> appears that each job manager creates a separate job inside that
> directory.
You can create a fixed upload directory via the config
Program arguments should be set to "--input /home/alaa/nycTaxiRides.gz"
(without the quotes).
On Wed, Sep 11, 2019 at 10:39 AM alaa wrote:
> Hallo
>
> I put arguments but the same error appear .. what should i do ?
>
>
> <
>
Hi,
You are not supposed to change that part of the exercise code. You have to
pass the path to the input file as a program argument (e.g., --input
/path/to/file). See [1] and [2] on how to configure program arguments in
IntelliJ.
Best,
Gary
[1]
Hi Felipe,
I am glad that you were able to fix the problem yourself.
> But I suppose that Mesos will allocate Slots and Task Managers
dynamically.
> Is that right?
Yes, that is the case since Flink 1.5 [1].
> Then I put the number of "mesos.resourcemanager.tasks.cpus" to be equal or
> less the
Congratulations Andrey, well deserved!
Best,
Gary
On Thu, Aug 15, 2019 at 7:50 AM Bowen Li wrote:
> Congratulations Andrey!
>
> On Wed, Aug 14, 2019 at 10:18 PM Rong Rong wrote:
>
>> Congratulations Andrey!
>>
>> On Wed, Aug 14, 2019 at 10:14 PM chaojianok wrote:
>>
>> > Congratulations
Hi Burgess Chen,
If you are using MemoryStateBackend or FsStateBackend, you can observe race
conditions on the state objects. However, the RocksDBStateBackend should be
safe from these issues [1].
Best,
Gary
[1]
Since there were no objections so far, I will proceed with removing the
code [1].
[1] https://issues.apache.org/jira/browse/FLINK-12312
On Wed, Apr 24, 2019 at 1:38 PM Gary Yao wrote:
> The idea is to also remove the rescaling code in the JobMaster. This will
> make
> it easier
Hi Steve,
(1)
The CLI action you are looking for is called "modify" [1]. However, we want
to temporarily disable this feature beginning from Flink 1.9 due to some
caveats with it [2]. If you have objections, it would be appreciated if you
could comment on the respective thread on
or now. Actually some users are not aware of that
> it’s
> > > still experimental, and ask quite a lot about the problem it causes.
> > >
> > > Best,
> > > Paul Lam
> > >
> > > On Apr 24, 2019, at 14:49, Stephan Ewen wrote:
> > >
> > >
Hi all,
As the subject states, I am proposing to temporarily remove support for
changing the parallelism of a job via the following syntax [1]:
./bin/flink modify [job-id] -p [new-parallelism]
This is an experimental feature that we introduced with the first rollout of
FLIP-6 (Flink 1.5).
Hi,
Can you describe how to reproduce this?
Best,
Gary
On Mon, Apr 15, 2019 at 9:26 PM Hao Sun wrote:
> Hi, I can not find the root cause of this, I think hadoop version is mixed
> up between libs somehow.
>
> --- ERROR ---
> java.text.ParseException: inconsistent module descriptor file found
Hi Averell,
I think I have answered your question previously [1]. The bottom line is that
the error is logged on INFO level in the ExecutionGraph [2]. However, your
effective log level (of the root logger) is WARN. The log levels are ordered
as follows [3]:
TRACE < DEBUG < INFO < WARN <
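To make those INFO-level ExecutionGraph messages visible, the root logger level would have to be lowered, e.g. in log4j.properties (a sketch; the appender name "file" is illustrative):

```
log4j.rootLogger=INFO, file
```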
I forgot to add line numbers to the first link in my previous email:
https://github.com/apache/flink/blob/c6878aca6c5aeee46581b4d6744b31049db9de95/flink-dist/src/main/flink-bin/bin/jobmanager.sh#L21-L25
On Fri, Mar 15, 2019 at 8:08 AM Gary Yao wrote:
> Hi Harshith,
>
> In the jobm
1.us-east-1.abc.com:28945/user/resourcemanager
> <http://fl...@flink0-1.flink1.us-east-1.abc.com:28945/user/resourcemanager>
> under registration id 170ee6a00f80ee02ead0e88710093d77.*
>
>
>
>
>
> Thanks,
>
> Harshith
>
>
>
> *From: *Harshith Kumar Bolar
down remote daemon.
> 2019-03-14 11:47:35,952 INFO
> akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote
> daemon shut down; proceeding with flushing remote transports.
> 2019-03-14 11:47:35,959 INFO
> akka.remote.RemoteActorRefProvider$RemotingTerminator
Hi Harshith,
Can you share JM and TM logs?
Best,
Gary
On Thu, Mar 14, 2019 at 3:42 PM Kumar Bolar, Harshith
wrote:
> Hi all,
>
>
>
> I'm trying to upgrade our Flink cluster from 1.4.2 to 1.7.2
>
>
>
> When I bring up the cluster, the task managers refuse to connect to the
> job managers with
could test it for you.
>
> Without 1.8 and this exit code we are essentially held up.
>
> On Tue, Mar 12, 2019 at 10:56 AM Gary Yao wrote:
>
>> Nobody can tell with 100% certainty. We want to give the RC some exposure
>> first, and there is also a release process th
Hi,
If no other TaskManager (TM) is running, you can delete everything. If
multiple TMs share the same host, as far as I know, you will have to parse TM
logs to know what directories you can delete [1]. As for local recovery, tasks
that were running on a crashed TM are lost. From the
AM Vishal Santoshi <
> vishal.santo...@gmail.com> wrote:
>
>> :) That makes so much more sense. Is k8s native flink a part of this
>> release ?
>>
>> On Tue, Mar 12, 2019 at 10:27 AM Gary Yao wrote:
>>
>>> Hi Vishal,
>>>
>>>
Hi Vishal,
This issue was fixed recently [1], and the patch will be released with 1.8.
If the Flink job gets cancelled, the JVM should exit with code 0. There is a
release candidate [2], which you can test.
Best,
Gary
[1] https://issues.apache.org/jira/browse/FLINK-10743
[2]
as possible.
> Best!
> Sen
>
> On Mar 5, 2019, at 3:15 PM, Gary Yao wrote:
>
> Hi Sen,
>
> In that email I meant that you should disable the ZooKeeper configuration in
> the CLI because the CLI had troubles resolving the leader from ZooKeeper.
> What you should have done
docs-release-1.7/ops/deployment/yarn_setup.html#recovery-behavior-of-flink-on-yarn
On Tue, Mar 5, 2019 at 7:41 AM 孙森 wrote:
> Hi Gary:
> I used FsStateBackend .
>
>
> The jm log is here:
>
>
> After restart , the log is :
>
>
>
>
> Best!
> Sen
>
>
/state_backends.html#available-state-backends
On Mon, Mar 4, 2019 at 10:01 AM 孙森 wrote:
> Hi Gary:
>
>
> Yes, I enable the checkpoints in my program .
>
> On Mar 4, 2019, at 3:03 AM, Gary Yao wrote:
>
> Hi Sen,
>
> Did you set a restart strategy [1]? If you enabled checkpo
wrote:
> Thanks Gary,
>
> I will try to look into why the child-first strategy seems to have failed
> for this dependency.
>
> Best,
> Austin
>
> On Wed, Feb 27, 2019 at 12:25 PM Gary Yao wrote:
>
>> Hi,
>>
>> Actually Flink's inverted class loading featu
e rest api : http://activeRm/proxy/appId/jars
>
>
> The all client log is in the mail attachment.
>
>
>
>
> On Feb 27, 2019, at 9:30 PM, Gary Yao wrote:
>
> Hi,
>
> How did you determine "jmhost" and "port"? Actually you do not need to
> specify
>
Hi,
Actually Flink's inverted class loading feature was designed to mitigate
problems with different versions of libraries that are not compatible with
each other [1]. You may want to debug why it does not work for you.
You can also try to use the Hadoop free Flink distribution, and export the
Hi Sen Sun,
The question is already resolved. You can find the entire email thread here:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/flink-list-and-flink-run-commands-timeout-td22826.html
Best,
Gary
On Wed, Feb 27, 2019 at 7:55 AM sen wrote:
> Hi Aneesha:
>
> I
Hi,
How did you determine "jmhost" and "port"? Actually you do not need to
specify these manually. If the client is using the same configuration as your
cluster,
the client will look up the leading JM from ZooKeeper.
If you have already tried omitting the "-m" parameter, you can check in the
Hi,
Beginning with Flink 1.7, you cannot use the legacy mode anymore [1][2]. I am
currently working on removing references to the legacy mode in the
documentation [3]. Is there any reason you cannot use the "new mode"?
Best,
Gary
[1] https://flink.apache.org/news/2018/11/30/release-1.7.0.html
Hi Henry,
If I understand you correctly, you want YARN to allocate 4 vcores per TM
container. You can achieve this by enabling the FairScheduler in YARN
[1][2].
Best,
Gary
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#yarn-containers-vcores
[2]
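With the FairScheduler enabled, the per-container vcore count can then be set in flink-conf.yaml, e.g.:

```
yarn.containers.vcores: 4
```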
machine(8 core), I have 4 nodes,
> that will end up 28 taskmanagers and 1 job manager. I was wondering if this
> can bring additional burden on jobmanager? Is it recommended?
>
> Thanks,
>
> Jins George
> On 2/14/19 8:49 AM, Gary Yao wrote:
>
> Hi Jins George,
>
>
Hi Averell,
The TM containers fetch the Flink binaries and config files from HDFS (or
another DFS if configured) [1]. I think you should be able to change the log
level by patching the logback configuration in HDFS, and kill all Flink
containers on all hosts. If you are running an HA setup, your
Hi Jins George,
This has been asked before [1]. The bottom line is that you currently cannot
pre-allocate TMs and distribute your tasks evenly. You might be able to
achieve a better distribution across hosts by configuring fewer slots in
your TMs.
Best,
Gary
[1]
Hi,
Are you logging from your own operator implementations, and you expect these
log messages to end up in a file prefixed with XYZ-? If that is the case,
modifying log4j-cli.properties will not be expedient as I wrote earlier.
You should modify the log4j.properties on all hosts that are running
Hi Averell,
Logback has this feature [1], but it is not enabled out of the box. You will
have to enable the JMX agent by setting the com.sun.management.jmxremote system
property [2][3]. I have not tried this out, though.
Best,
Gary
[1] https://logback.qos.ch/manual/jmxConfig.html
[2]
Hi,
Can you define what you mean by "job logs"? For code that is run on the
cluster, i.e., JM or TM, you should add your config to log4j.properties. The
log4j-cli.properties file is only used by the Flink CLI process.
Best,
Gary
On Mon, Feb 11, 2019 at 7:39 AM simpleusr wrote:
> Hi Chesnay,
>
Hi Averell,
That log file does not look complete. I do not see any INFO level log
messages such as [1].
Best,
Gary
[1]
https://github.com/apache/flink/blob/46326ab9181acec53d1e9e7ec8f4a26c672fec31/flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManager.java#L544
On Fri, Feb 1, 2019
Hi Tim,
There is an end-to-end test in the Flink repository that starts a job cluster
in Kubernetes (minikube) [1]. If that does not help you, can you answer the
questions below?
What docker images are you using? Can you share the kubernetes resource
definitions? Can you share the complete logs
Hi Averell,
> Is there any way to avoid this? (As if I run this as an AWS EMR job, the
> job would be considered failed, while it is actually restored automatically
> by YARN after 10 minutes.)
You are writing that it takes YARN 10 minutes to restart the application
master (AM). However, in my
Hi Averell,
> Then I have another question: when JM cannot start/connect to the JM on
> .88, why didn't it try on .82 where resources are still available?
When you are deploying on YARN, the TM container placement is decided by the
YARN scheduler and not by Flink. Without seeing the complete
Hi Averell,
What Flink version are you using? Can you attach the full logs from JM and
TMs? Since Flink 1.5, the -n parameter (number of taskmanagers) should be
omitted unless you are in legacy mode [1].
> As per that screenshot, it looks like there are 2 tasks manager still
> running (one on
Hi Henry,
Can you share your pom.xml and the full stacktrace with us? It is expected
behavior that org.elasticsearch.client.RestClientBuilder is not shaded. That
class comes from the elasticsearch Java client, and we only shade its
transitive dependencies. Could it be that you have a dependency
u like to have log entries with DEBUG enabled or
> INFO would be enough?
>
> Thanks,
> Piotr
>
> On Fri, Jan 18, 2019 at 3:14 PM Gary Yao wrote:
>
>> Hi Piotr,
>>
>> What was the version you were using before 1.7.1?
>> How do you deploy your cluster, e.g., YARN
Hi Piotr,
What was the version you were using before 1.7.1?
How do you deploy your cluster, e.g., YARN, standalone?
Can you attach full TM and JM logs?
Best,
Gary
On Fri, Jan 18, 2019 at 3:05 PM Piotr Szczepanek
wrote:
> Hello,
> we have scenario with running Data Processing jobs that
Hi,
The API still returns the location of a completed savepoint. See the example
in the Javadoc [1].
Best,
Gary
[1]
Hi Wei,
Did you build Flink with maven 3.2.5 as recommended in the documentation [1]?
Also, did you use the -Pvendor-repos flag to add the cloudera repository when
building?
Best,
Gary
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.7/flinkDev/building.html#build-flink
[2]
Hi all,
I think increasing the default value of the config option web.timeout [1] is
what you are looking for.
Best,
Gary
[1]
https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/RestHandlerConfiguration.java#L76
[2]
Hi,
You can use the YARN client to list all applications on your YARN cluster:
yarn application -list
If this does not show any running applications, the Flink cluster must have
somehow terminated. If you have YARN's log aggregation enabled, you should be
able to view the Flink logs by
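For example (hedged; the application id is a placeholder, and YARN's log aggregation must be enabled):

```sh
yarn logs -applicationId <application-id>
```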
Hi Abhi Thakur,
We need more information to help you. What docker images are you using? Can
you share the kubernetes resource definitions? Can you share the complete
logs of the JM and TMs? Did you follow the steps outlined in the Flink
documentation [1]?
Best,
Gary
[1]
commands.
>
> And also thank you for the pointer, I’ll keep tracking this problem.
>
> Best,
> Paul Lam
>
>
> On Nov 10, 2018, at 02:10, Gary Yao wrote:
>
> Hi Paul,
>
> Can you share the complete logs, or at least the logs after invoking the
> cancel command?
Hi Henry,
What you see in the API documentation is a schema definition and not a
sample
request. The request body should be:
{
  "target-directory": "hdfs:///flinkDsl",
  "cancel-job": false
}
Let me know if that helps.
Best,
Gary
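For illustration, triggering the savepoint with curl might look like this (host, port, and job id are placeholders):

```sh
curl -X POST "http://<jobmanager-host>:8081/jobs/<job-id>/savepoints" \
  -H "Content-Type: application/json" \
  -d '{"target-directory": "hdfs:///flinkDsl", "cancel-job": false}'
```

The response contains a trigger id that can be polled to find out when the savepoint completed.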
On Mon, Nov 12, 2018 at 7:15 AM vino yang
Hi,
We only propagate the exception message but not the complete stacktrace [1].
Can you create a ticket for that?
Best,
Gary
[1]
Hi Paul,
Can you share the complete logs, or at least the logs after invoking the
cancel command?
If you want to debug it yourself, check if
MiniDispatcher#jobReachedGloballyTerminalState [1] is invoked, and see how
the
jobTerminationFuture is used.
Best,
Gary
[1]
Hi,
You are using event time but are you assigning watermarks [1]? I do not see
it
in the code.
Best,
Gary
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kinesis.html#event-time-for-consumed-records
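A minimal sketch of assigning watermarks in the DataStream API of that era (the event type and its timestamp accessor are hypothetical):

```java
// Illustrative: allow events to arrive up to 5 seconds out of order.
DataStream<MyEvent> withWatermarks = stream.assignTimestampsAndWatermarks(
    new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(5)) {
        @Override
        public long extractTimestamp(MyEvent event) {
            return event.getTimestampMillis(); // hypothetical accessor
        }
    });
```

Without such an assigner, event-time windows never fire because no watermarks advance the event-time clock.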
On Fri, Nov 9, 2018 at 6:58 PM Vijay Balakrishnan
wrote:
> Hi,
>
Hi,
If the job is actually running and consuming from Kinesis, the log you posted
is unrelated to your problem. To understand why the process function is not
invoked, we would need to see more of your code, or you would need to provide
an executable example. The log only shows that all offered
Hi,
Could it be that you are submitting the job in attached mode, i.e., without
the -d parameter? In the "job cluster attached mode", we actually start a Flink
session cluster (and stop it again from the CLI) [1]. Therefore, in attached
mode, the config option "yarn.per-job-cluster.include-user-jar"
Hi Borys,
I remember that another user reported a similar issue recently [1]; attached
to the ticket you can find his log file. If I recall correctly, we concluded
that YARN returned the containers very quickly. At the time, Flink's debug
level logs were inconclusive because we did not log the
Hi Pawel,
As far as I know, the application attempt is incremented if the application
master fails and a new one is brought up. Therefore, what you are seeing
should not happen. I have just deployed on AWS EMR 5.17.0 (Hadoop 2.8.4) and
killed the container running the application master – the
Hi Averell,
It is up to the YARN scheduler on which hosts the containers are started.
What Flink version are you using? I assume you are using 1.4 or earlier
because you are specifying a fixed number of TMs. If you launch Flink with
-yn 2, you should only be seeing 2 TMs in total (not 4). Are
Hi Borys,
To debug how many containers Flink is requesting, you can look out for the
log statement below [1]:
Requesting new TaskExecutor container with resources [...]
If you need help debugging, can you attach the full JM logs (preferably on
DEBUG level)? Would it be possible for you to
Hi Averell,
There is no general answer to your question. If you are running more TMs, you
get better isolation between different Flink jobs because one TM is backed by
one JVM [1]. However, every TM brings additional overhead (heartbeating,
running more threads, etc.) [1]. It also depends on the
Hi Henry,
The URL below looks like the one from the YARN proxy (note that "proxy"
appears in the URL):
http://storage4.test.lan:8089/proxy/application_1532081306755_0124/jobs/269608367e0f548f30d98aa4efa2211e/savepoints
You can use
yarn application -status
to find the host and port of
Hi Averell,
Flink compares the number of user-selected vcores to the vcores configured in
the yarn-site.xml of the submitting node, i.e., in your case the master node.
If there are not enough configured vcores, the client throws an exception.
This behavior is not ideal, and I found an old JIRA
Hi Averell,
According to the AWS documentation [1], the master node only runs the YARN
ResourceManager and the HDFS NameNode. Containers can only be launched on
nodes that are running the YARN NodeManager [2]. Therefore, if you want TMs or
JMs to be launched on your EMR master node, you have to
Hi Jose,
You can find an example here:
https://github.com/apache/flink/blob/1a94c2094b8045a717a92e232f9891b23120e0f2/flink-formats/flink-parquet/src/test/java/org/apache/flink/formats/parquet/avro/ParquetStreamingFileSinkITCase.java#L58
Best,
Gary
On Tue, Sep 11, 2018 at 11:59 AM, jose farfan
Hi Tony,
You are right that these metrics are missing. There is already a ticket for
that [1]. At the moment you can obtain this information from the REST API
(/overview) [2].
Since FLIP-6, the JM is no longer responsible for these metrics but for
backwards compatibility we can leave them in
Hi,
Do you also have pmml-model-moxy as a dependency in your job? Using mvn
dependency:tree, I do not see that pmml-evaluator has a compile time
dependency on jaxb-api. The jaxb-api dependency actually comes from
pmml-model-moxy. The exclusion should therefore be defined on pmml-model-moxy.
You
mode.
>
> 1. akka.transport.heartbeat.interval
> 2. akka.transport.heartbeat.pause
>
> It seems they are different from HeartbeatServices and possibly still
> valid.
>
> Best,
> tison.
>
>
> On Mon, Sep 10, 2018 at 1:50 PM, Gary Yao wrote:
>
>> I should add th
-akka
On Mon, Sep 10, 2018 at 7:38 AM, Gary Yao wrote:
> Hi Tony,
>
> You are right that with FLIP-6 Akka is abstracted away. If you want custom
> heartbeat settings, you can configure the options below [1]:
>
> heartbeat.interval
> heartbeat.timeout
>
> The con
Hi Tony,
You are right that with FLIP-6 Akka is abstracted away. If you want custom
heartbeat settings, you can configure the options below [1]:
heartbeat.interval
heartbeat.timeout
The config option taskmanager.exit-on-fatal-akka-error is also not relevant
anymore. The closest I can think
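In flink-conf.yaml these heartbeat options would look like the following (the values shown are the documented defaults, in milliseconds):

```
heartbeat.interval: 10000
heartbeat.timeout: 50000
```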
Hi all,
The question is being handled on the dev mailing list [1].
Best,
Gary
[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Cancel-flink-job-occur-exception-td24056.html
On Tue, Sep 4, 2018 at 2:21 PM, rileyli(李瑞亮) wrote:
> Hi all,
> I submit a flink job through
Hi Austin,
The config options rest.port, jobmanager.web.port, etc. are intentionally
ignored on YARN. The port should be chosen randomly to avoid conflicts with
other containers [1]. I do not see a way how you can set a fixed port at the
moment but there is a related ticket for that [2]. The
Hi Jason,
From the stacktrace it seems that you are using the 1.4.0 client to list jobs
on a 1.5.x cluster. This will not work. You have to use the 1.5.x client.
Best,
Gary
On Wed, Sep 5, 2018 at 5:35 PM, Jason Kania wrote:
> Hello,
>
> Thanks for the response. I had already tried setting
Hi Jelmer,
I saw that you have already found the JIRA issue tracking this problem [1],
but I will still answer on the mailing list for transparency.
The timeout for "cancel with savepoint" should be RpcUtils.INF_TIMEOUT.
Unfortunately Flink is currently not respecting this timeout. A pull request
In 1.5.0, I’ve not enable local
> recovery.
>
>
>
> So whether I need manual disable local recovery via flink.conf?
>
>
>
> Regards
>
>
>
> James
>
>
>
> *From: *"James (Jian Wu) [FDS Data Platform]"
> *Date: *Monday, September 3, 20
Hi James,
What version of Flink are you running? In 1.5.0, tasks can spread out due to
changes that were introduced to support "local recovery". There is a
mitigation in 1.5.1 that prevents tasks from spreading out, but local recovery must be
disabled [2].
Best,
Gary
[1]
t; and reply again if the problem comes up again. Thanks again for your quick
>> response!
>>
>> On Fri, Aug 31, 2018 at 1:38 PM Gary Yao wrote:
>>
>>> Hi Greg,
>>>
>>> Can you describe the steps to reproduce the problem, or can you attach
>>
Hi Greg,
Can you describe the steps to reproduce the problem, or can you attach the
full jobmanager logs? Because JobExecutionResultHandler appears in your
log, I
assume that you are starting a job cluster on YARN. Without seeing the
complete logs, I cannot be sure what exactly happens. For now,
Hi Yubraj Singh,
Can you try submitting with HADOOP_CLASSPATH=`hadoop classpath` set? [1]
For example:
HADOOP_CLASSPATH=`hadoop classpath` bin/flink run [...]
Best,
Gary
[1]