I am also using exactly-once checkpointing mode. I have a Kafka source and
sink, both of which support transactions, which should allow for exactly-once
processing. Is this the reason why there is only one checkpoint retained?
Thanks,
Vishwas
On Wed, Aug 21, 2019 at 5:26 PM Vishwas Siravara
wrote:
>
Hi
On 2019/8/21 22:46, Robert Metzger wrote:
I would recommend that you do some research yourself (there is plenty of
material online) and then try out the most promising systems yourself.
That's right. Thank you.
Regards.
Besides, would you like to participate in our survey thread[1] on the
user list about "How do you use high-availability services in Flink?"
It would help Flink improve its high-availability support.
Best,
tison.
[1]
Hi Aleksandar,
based on your log:
taskmanager_1 | 2019-08-22 00:05:03,713 INFO
org.apache.flink.runtime.taskexecutor.TaskExecutor- Connecting
to ResourceManager
akka.tcp://flink@jobmanager:6123/user/jobmanager()
.
taskmanager_1 | 2019-08-22
I think it depends on the root cause of your job failure. Maybe the
following JVM options could help you get a heap dump.
1. -XX:+HeapDumpOnOutOfMemoryError
2. -XX:+HeapDumpBeforeFullGC
3. -XX:+HeapDumpAfterFullGC
4. -XX:HeapDumpPath=/tmp/heap.dump.1
Use *env.java.opts* to set java opts for
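For example, a minimal sketch of the flink-conf.yaml entry (the dump path is just a placeholder):

  env.java.opts: "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heap.dump.1"

There are also env.java.opts.jobmanager and env.java.opts.taskmanager if you only want the
options applied to one kind of process.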
Hi Aleksandar,
The resource manager address is retrieved from the HA services.
Would you check whether your customized HA services implementation is returning
the right LeaderRetrievalService, and whether the LeaderRetrievalService is really
retrieving the right leader's address?
Or is it possible that the
Hi Vishwas,
You can configure "state.checkpoints.num-retained" to specify the max
checkpoints to retain.
By default it is 1.
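For reference, a minimal sketch of the job-side settings discussed in this thread (Flink's Java
API, roughly 1.8/1.9; the 60 s interval is a placeholder). The retained-checkpoint count itself is
the cluster-side flink-conf.yaml entry, e.g. state.checkpoints.num-retained: 3.

  import org.apache.flink.streaming.api.CheckpointingMode;
  import org.apache.flink.streaming.api.environment.CheckpointConfig;
  import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

  StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
  // Checkpoint every 60 seconds with exactly-once semantics (placeholder interval).
  env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);
  // Keep externalized checkpoints around when the job is cancelled.
  env.getCheckpointConfig().enableExternalizedCheckpoints(
          CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);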
Thanks,
Zhu Zhu
On Thu, Aug 22, 2019 at 6:48 AM, Vishwas Siravara wrote:
> I am also using exactly once checkpointing mode, I have a kafka source and
> sink so both support
Hello,
Currently I have 2 streams, and I enrich stream 1 with the second stream.
To further enrich stream 1 we are planning to add 2 more streams, for a total
of 4 streams.
1. Stream 1 read from Kafka
2. Stream 2 read from Kafka
3. Stream 3 will be read from Kafka - new
4. Stream 4 will
Hi peeps,
I am externalizing checkpoints in S3 for my Flink job and I retain them on
cancellation. However, when I look into my S3 bucket where the checkpoints
are stored, there is only 1 checkpoint at any point in time. Is this the
default behavior of Flink, where older checkpoints are deleted when
Hi all,
I’m experimenting with using my own implementation of HA services, which would
persist JobManager information on a Kubernetes volume instead of in ZooKeeper.
I’ve set the high-availability option in flink-conf.yaml to the FQN of my
factory class, and started the
Hi:
Is there any configuration to get a heap dump when a job fails in EMR?
Thanks
Hi all,
I modified the logback.xml provided by the Flink distribution, so now the
logback.xml file looks like this:
<file>${log.file}</file>
<append>false</append>
<pattern>%d{yyyy-MM-dd HH:mm:ss.SSS} [%thread]
What Watermarks do is to advance the event time clock. You can
consider a Watermark(t) as an assertion about the completeness of the
stream -- it marks a point in the stream and says that at that point,
the stream is (probably) now complete up to time t.
The autoWatermarkInterval determines how
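As a rough illustration (a sketch against the 1.8/1.9-era Java API; MyEvent, its timestamp field,
the 200 ms interval and the 5 s out-of-orderness bound are all assumptions, not taken from this
thread):

  import org.apache.flink.streaming.api.TimeCharacteristic;
  import org.apache.flink.streaming.api.datastream.DataStream;
  import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
  import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
  import org.apache.flink.streaming.api.windowing.time.Time;

  StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
  env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
  // How often Flink asks periodic watermark generators for a new watermark.
  env.getConfig().setAutoWatermarkInterval(200);

  DataStream<MyEvent> withTimestamps = events.assignTimestampsAndWatermarks(
          new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(5)) {
              @Override
              public long extractTimestamp(MyEvent event) {
                  return event.timestamp;  // event-time field, assumed
              }
          });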
I'm not sure I fully understand the scenario you envision. Are you
saying you want to have some sort of window that batches (and
deduplicates) up until a downstream map has finished processing the
previous deduplicated batch, and then the window should emit the new
batch?
If that's what you want,
Hello,
I added this as a command line argument:
-Denv.java.opts.taskmanager="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=6005"
and also tried adding the below to flink-conf.yaml:
env.java.opts.taskmanager:
Hello,
By default, the watermark of a connected stream is set to the minimum of the
watermarks of the two streams when they are connected.
I wrote code to connect two streams, but one of the streams does not have any
messages because of a condition.
In this situation, the watermark is never increased and processing is
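One direction that might help (just a sketch; the condition name is hypothetical): a source that
knows it currently has nothing to emit can declare itself idle, so it is ignored when the connected
operator takes the minimum of the two watermarks:

  // Inside SourceFunction.run(SourceContext<T> ctx):
  if (nothingToEmitForNow) {          // hypothetical condition
      ctx.markAsTemporarilyIdle();    // exclude this source from the watermark min()
  }
  // The source counts as active again as soon as it emits a record
  // (collect/collectWithTimestamp) or a watermark (emitWatermark).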
Hi Raj,
Have you restarted the cluster? You need to restart the cluster to apply
changes in flink-conf.yaml.
You can also set suspend=y in the debug argument so that the task managers will
pause and wait for IntelliJ to connect before proceeding.
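For example, the flink-conf.yaml entry quoted above would then become (same placeholder port 6005):

  env.java.opts.taskmanager: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=6005"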
On Thu, Aug 22, 2019 at 11:06 AM, Raj, Smriti wrote:
>
Hello,
We have Spark, Flink, Storm, and Kafka all installed.
For real-time streaming computation, which of these is the best?
Like other big players, the logs in our stack are huge.
Thanks.
Hi guys,
We want to have an accurate idea of how users actually use
high-availability services in Flink, especially how you customize
high-availability services via HighAvailabilityServicesFactory.
Basically there are the standalone impl., the ZooKeeper impl., the embedded impl.
used in MiniCluster, YARN
In addition, FLINK-13750[1] is also likely to introduce breaking changes
to the high-availability services, so you are highly encouraged to share your
cases if you might be affected by the change :-)
Best,
tison.
[1] https://issues.apache.org/jira/browse/FLINK-13750
On Wed, Aug 21, 2019 at 3:32 PM, Zili Chen wrote:
> Hi
The following code:
val MAILBOX_SET_TYPE_INFO = object: TypeHint>() {}.typeInfo
val env = StreamExecutionEnvironment.getExecutionEnvironment()
println(MAILBOX_SET_TYPE_INFO.createSerializer(env.config))
prints:
org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer@2c39e53
While there
Hi,
I am a little confused about watermarks in Flink.
My application is using EventTime. My sources are calling
ctx.collectWithTimestamp and ctx.emitWatermark. Then I have a
CoProcessFunction which merges the two streams. I have state on this
function and I want to clean this state every time
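In case it helps, a rough sketch (class name, state name, and the one-minute cleanup delay are all
assumptions, not taken from this job) of clearing keyed state from a CoProcessFunction with an
event-time timer:

  import org.apache.flink.api.common.state.ValueState;
  import org.apache.flink.api.common.state.ValueStateDescriptor;
  import org.apache.flink.configuration.Configuration;
  import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
  import org.apache.flink.util.Collector;

  public class MergeFn extends CoProcessFunction<String, String, String> {
      private transient ValueState<String> lastValue;

      @Override
      public void open(Configuration parameters) {
          lastValue = getRuntimeContext().getState(
                  new ValueStateDescriptor<>("lastValue", String.class));
      }

      @Override
      public void processElement1(String value, Context ctx, Collector<String> out) throws Exception {
          lastValue.update(value);
          // Ask for a callback once event time passes this point (placeholder: one minute
          // after the current watermark).
          ctx.timerService().registerEventTimeTimer(ctx.timerService().currentWatermark() + 60_000);
      }

      @Override
      public void processElement2(String value, Context ctx, Collector<String> out) {
          // handling of the second stream elided in this sketch
      }

      @Override
      public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
          // The timer fires when the watermark passes 'timestamp'; clear the state here.
          lastValue.clear();
      }
  }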
Thanks for the helpful reply.
One more question: does this ZooKeeper or HA requirement apply to a savepoint?
Can I bounce a single-JobManager cluster and rerun my Flink job from its
previous state with a savepoint directory? e.g.
./bin/flink run myJob.jar -s savepointDirectory
Regards,
Min
Hi,
It has taken me quite a bit of time to figure this out.
This is the solution I have now (works on my machine).
Please tell me where I can improve this.
It turns out that the schema you provide for registerDataStream only needs the
'top level' fields of the Avro data structure.
With only the top
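For instance, a sketch with the Table API (Flink 1.9-era; the table name, the stream variable, and
the field names below are made up, only the top-level Avro record fields are listed):

  import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
  import org.apache.flink.table.api.java.StreamTableEnvironment;

  StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
  StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

  // 'mailboxes' is a DataStream of an Avro-generated type (hypothetical).
  // Only the record's top-level fields are named; nested Avro fields remain
  // reachable through those top-level columns.
  tableEnv.registerDataStream("Mailboxes", mailboxes, "id, owner, messages");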
Ok, no problem.
On Wed, Aug 21, 2019 at 12:22 AM Pei HE wrote:
> Thanks Kali for the information. However, it doesn't work for me, because
> I need features in Flink 1.7.x or later and use managed Amazon MSK.
> --
> Pei
>
>
>
> On Tue, Aug 20, 2019 at 7:17 PM sri hari kali charan Tummala <
>
Flink
---
Oytun Tez
*M O T A W O R D*
The World's Fastest Human Translation Platform.
oy...@motaword.com — www.motaword.com
On Wed, Aug 21, 2019 at 2:42 AM Eliza wrote:
> Hello,
>
> We have all of spark, flink, storm, kafka installed.
> For realtime streaming calculation, which one is the
Hey Eliza,
This decision depends on many factors, such as the experience of your team,
your use case, your deployment model, your workload, expected growth, etc.
Posting the same question to the user mailing lists of all these systems
won't magically answer the question for you, because there is no
Hi Min,
For your question, the answer is no.
In the standalone case Flink uses an in-memory checkpoint store, which
is able to restore the savepoint configured on the command line and
recover state from it.
Besides, stopping with a savepoint and resuming the job from that savepoint
is the standard path to migrate
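As a concrete (and hedged) illustration with the 1.9-era CLI, using placeholder paths and IDs:

  ./bin/flink stop -p s3://my-bucket/savepoints <jobId>
  ./bin/flink run -s s3://my-bucket/savepoints/savepoint-xxxx myJob.jar

The first command stops the job with a savepoint; the second resumes from it. Note that -s goes
before the jar so it is parsed as a Flink option rather than a program argument.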
Dear list-members,
I have a question regarding window firing and element accumulation for a
sliding window on a DataStream (Flink 1.8.1-2.12).
My DataStream is derived from a custom SourceFunction, which emits
string sequences of WINDOW size, in a deterministic sequence.
The aim is to create
I should add that the behaviour persists, even when I force parallelism to
1.
On Wed, Aug 21, 2019 at 5:19 PM Eric Isling wrote:
> Dear list-members,
>
> I have a question regarding window firing and element accumulation for a
> sliding window on a DataStream (Flink 1.8.1-2.12).
>
> My
The start time is 20:00:25, the tasks are all RUNNING, and even the first checkpoint
completed at 20:00:42, only 17 seconds in total, so where does the 10-minute problem come from?
From: 々守护々 <346531...@qq.com>
Sent: Thursday, August 22, 2019 11:18
To: user-zh
Subject: Re: Flink waits 10 minutes at startup
Hi, this is my JobManager startup log. Please take a look, thank you!
2019-08-21 20:00:25,428 INFO
After submitting to YARN, the application waits about 10 minutes before starting.
------------------ Original Message ------------------
From: "Yun Tang";
Sent: Thursday, August 22, 2019 11:23
To: "user-zh";
Subject: Re: Re: Flink waits 10 minutes at startup
tangjunli...@huitongjy.com
Sent: 2019-08-22 11:32
To: user-zh
Subject: Re: Re: Flink waits 10 minutes at startup
After submitting to YARN, the application waits about 10 minutes before starting.
------------------ Original Message ------------------
Hello, Flink community experts!
Since we need to work together with the Java backend developers on a project, we now need to integrate Flink with Spring Boot. The rough flow is: fetch configuration (database connection pool, Kafka connection, etc.) from the config center via Dubbo, then consume the Kafka data stream,
and write to the database after Flink processing. Persistence is done through MyBatis, and the various beans are initialized via Spring Boot injection.
It currently runs fine locally, but it cannot be submitted to the cluster via flink run; we get all kinds of errors.
I'd like to ask whether anyone in the community has done this kind of work and could give me some guidance.
By "it just stops there", do you mean the terminal output of flink run stops moving? Check in that
terminal output when YARN accepted your application; I suspect the YARN cluster is busy, causing no response for 10 minutes.
Best,
tison.
On Thu, Aug 22, 2019 at 11:35 AM, Zili Chen wrote:
> user-zh does not support inline images. Please use third-party storage and paste a link, or, as I recall, you can attach a file to the email.
>
> Best,
> tison.
>
>
> On Thu, Aug 22, 2019 at 11:33 AM, 々守护々 <346531...@qq.com> wrote:
>
>>
------------------ Original Message ------------------
From: "tangjunli...@huitongjy.com";
Sent: Thursday, August 22, 2019 11:34
To: "user-zh";
Subject: Re: Re: Flink waits 10 minutes at startup
Flink
I am using Flink SQL (Flink 1.8.1) on Hadoop. In yarn-site.xml:
yarn.nodemanager.resource.memory-mb = 16384
yarn.scheduler.minimum-allocation-mb = 1024
If you don't post the error, nobody can help you solve it.
tangjunli...@huitongjy.com
From: jiang51...@163.com
Sent: 2019-08-22 11:30
To: user-zh
Subject: Integrating Spring Boot with Flink
Hello, Flink community experts!
Since we need to work together with the Java backend developers on a project, we now need to integrate Flink with Spring Boot. The rough flow is: fetch configuration (database connection pool, Kafka connection, etc.) from the config center via Dubbo, then consume the Kafka data stream,
Hi
A long startup time for a Flink on YARN job can have many causes, for example waiting because
resources are insufficient, or containers exiting right after being requested. The default slot
request timeout is 5 minutes; it feels like your job may have hit a slot request timeout and then
requested slots again. It would be best to provide the JobManager log for further analysis.
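If it does turn out to be a slot request timeout, the setting to look at (just a pointer, not a
confirmed fix) is slot.request.timeout in flink-conf.yaml, which defaults to 300000 ms (5 minutes),
e.g. slot.request.timeout: 600000 to raise it to 10 minutes (placeholder value).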
Best,
Yun Tang
From: 々守护々 <346531...@qq.com>
Sent: Thursday, August 22, 2019 11:04
To: user-zh
Subject:
Basically you are stuck at the step of uploading the user jar; going from job submission to successful deployment is almost instantaneous.
2019-08-22 11:38:02,185 INFO
org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting
application master application_1566383236573_0004
2019-08-22 11:38:02,226 INFO
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted
Recently the FLINK community's weekly digest has started being published again[1]. Given that the FLINK Chinese community
has grown to the point of having a dedicated user-zh mailing list, I plan to try compiling a Chinese edition of the FLINK community weekly digest.
The first issue is published here[0]; based on the discussion with the community, future issues will also be posted to the mailing list, with the format adjusted for reading as email.
The English WEEKLY mainly focuses on development and community news events, but I have found that users in the Chinese community are not very interested in events that mostly take place overseas,
and instead have a much larger demand for answers to common FLINK questions. Therefore, in the first part I will first select, from the FLINK community user list and user-zh
Good evening everyone. We are using Flink to consume from Kafka and write to MySQL. The topic has 9 partitions and the Flink job's parallelism is 3. Yet the job keeps reporting this error: 2019-08-21
19:33:07,058 WARN
org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase - Async
Kafka commit failed.
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be
completed since the group
Which version are you using? As far as I can tell, the community version now supports the Kafka assign interface; if you go through the subscribe interface you go through Kafka group
management, where the problem below is much more likely to occur, for example when a consumer does not call poll for longer than a certain time.
---
Best,
Matt Wang
Original Message
Sender: yeyiyeyi9...@sina.com
Recipient: user-zh <user...@flink.apache.org>
Date: Wednesday, Aug 21, 2019 20:26