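How much memory a slot needs depends on the specific job, and it can vary considerably from job to job.
A slot's resource requirement is the sum of the resource requirements of all its operators. If you know which operators your job uses, you can derive it from the operators' resource requirements.
For the default operator resource requirements, see [1]. It lists five "table.exec.resource.*" configuration options, and you can also adjust these options to change how much memory the operators use.
If you are not familiar with the operators your job uses, the simpler approach is to just submit the job and search the logs for "Request slot with
profile", which shows the slot's resource requirement.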
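As a rough illustration of the log search suggested above, here is a minimal Python sketch. It only filters lines by the "Request slot with profile" substring quoted above; the function name is hypothetical, and it makes no assumption about the rest of the log line format.

```python
from typing import Iterable, List

# Substring quoted in the reply above; everything else on the line is left untouched.
MARKER = "Request slot with profile"


def find_slot_requests(lines: Iterable[str]) -> List[str]:
    """Return the log lines that mention a slot request profile."""
    return [line.rstrip("\n") for line in lines if MARKER in line]
```

To use it, open the relevant log file and pass the file object in, for example `find_slot_requests(open(path))`, where `path` is wherever your deployment writes its logs.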
Thank you~
Xintong Song
] and FLIP-56 [2] for
more details. Another related effort is pluggable slot manager [3], which
allows having pluggable resource scheduling strategies such as launching task
managers with customized resources according to the tasks' requirements.
Thank you~
Xintong Song
[1]
https://cwiki.apache.org
Hi faaron,
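In Flink 1.9 the -yn parameter should have no effect, and it has been removed in later versions.
With your parameters, the memory of each TM stays at 30G while the number of slots per TM (-ys) goes from 5 to 10, which means the average memory per slot is cut in half.
In Flink 1.9 the SQL batch operators have fixed requirements on Flink managed memory, so this change has quite likely left a single slot's
managed memory unable to satisfy the operators' resource requirements.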
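The arithmetic behind this can be sketched as follows. This is a simplification that divides the whole 30G evenly among the slots; in practice only part of a TM's memory is managed memory, but that part is split across slots in the same proportion. The function name is hypothetical.

```python
def memory_per_slot_gb(tm_memory_gb: float, slots_per_tm: int) -> float:
    """Average memory available to one slot when a TM's memory is shared evenly."""
    if slots_per_tm <= 0:
        raise ValueError("slots_per_tm must be positive")
    return tm_memory_gb / slots_per_tm


before = memory_per_slot_gb(30, 5)   # -ys 5
after = memory_per_slot_gb(30, 10)   # -ys 10
print(before, after)  # 6.0 3.0
```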
Thank you~
Xintong Song
On Wed, Dec 25, 2019 at 11:09 AM
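This is probably not the root cause. "slot was removed" is usually caused by the TM going down, so you need to find the corresponding TM log and check why it went down.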
Thank you~
Xintong Song
On Tue, Dec 24, 2019 at 10:06 PM hiliuxg <736742...@qq.com> wrote:
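> Occasionally an allocated slot is suddenly removed, which causes the job to restart, and I cannot tell what causes it. Neither CPU nor full GC shows anything abnormal. The exception is as follows: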
>
> org.apache.flink.util.FlinkException: Th
", change the line "FROM
openjdk:8-jre-alpine" to point to a domestic or local image source.
Thank you~
Xintong Song
On Mon, Dec 23, 2019 at 2:46 PM LakeShen wrote:
> Hi community , when I run the flink task on k8s , the first thing is that
> to build the flink task jar
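, which applies to Flink 1.9 and earlier versions. The upcoming, not yet released Flink 1.10
makes fairly large changes to resource configuration. If you are interested, you can check the related documentation once it is released.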
Thank you~
Xintong Song
On Wed, Dec 18, 2019 at 4:49 PM USERNAME wrote:
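> @tonysong...@gmail.com Thanks for the reply
> I looked at what the parameters mean:
> taskmanager.memory.off-heap:
> If set to true, the memory the TaskManager allocates for sorting, hash tables, and caching intermediate results is located at the JVM heap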
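This is not an OOM. The container was killed by YARN for exceeding its memory limit.
JVM memory cannot be overused, otherwise an OOM would be reported. So it is more likely that an increase in RocksDB's memory usage caused the overuse.
Suggestions:
1. Add the following configuration
taskmanager.memory.off-heap: true
taskmanager.memory.preallocate: false
2. If you already use the configuration above, or the problem persists after changing it, try increasing the option below, whose default is 0.25 when unset
containerized.heap-cutoff-ratio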
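To see why raising containerized.heap-cutoff-ratio leaves more room for non-JVM memory such as RocksDB, here is a rough Python sketch. It assumes that pre-1.10 Flink reserves max(minimum cutoff, ratio * container memory) outside the JVM budget. The 0.25 default comes from the reply above; the 600 MB minimum is my assumption about the default of containerized.heap-cutoff-min, and the function names are hypothetical.

```python
def heap_cutoff_mb(container_mb: int, cutoff_ratio: float = 0.25,
                   cutoff_min_mb: int = 600) -> int:
    """Memory held back from the JVM budget (assumed formula, see lead-in)."""
    return max(cutoff_min_mb, int(container_mb * cutoff_ratio))


def jvm_budget_mb(container_mb: int, cutoff_ratio: float = 0.25,
                  cutoff_min_mb: int = 600) -> int:
    """Container memory left for the JVM after the cutoff is subtracted."""
    return container_mb - heap_cutoff_mb(container_mb, cutoff_ratio, cutoff_min_mb)


# A hypothetical 4096 MB container with the default ratio and with a larger one.
print(heap_cutoff_mb(4096), jvm_budget_mb(4096))            # 1024 3072
print(heap_cutoff_mb(4096, 0.4), jvm_budget_mb(4096, 0.4))  # 1638 2458
```

A larger ratio shrinks the JVM budget, so the memory that RocksDB allocates outside the JVM has more headroom before the container exceeds its limit.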
Thank you~
Xintong Song
, but will have to wait for GC to be truly released.
Thank you~
Xintong Song
On Tue, Dec 17, 2019 at 12:30 PM Ethan Li wrote:
> Thank you very much Xintong! It’s much clearer to me now.
>
> I am still on standalone cluster setup. Before I was using 350GB on-heap
> memory on a 378G
Thank you Kostas.
Big +1 for keeping all the documentation related issues at one place.
I've added the documentation task for resource management.
Thank you~
Xintong Song
On Mon, Dec 16, 2019 at 5:29 PM Kostas Kloudas wrote:
> Hi all,
>
> With the feature-freeze for the rel
Sorry, I just realized that I sent my feedback to Jingsong's email
address, instead of the dev / user mailing list.
Please find my comments below.
Thank you~
Xintong Song
On Wed, Nov 27, 2019 at 4:32 PM Xintong Song wrote:
> As a participant of the discussion yesterday, I'm
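Judging from the error, the TM went down. Finding the specific cause requires analyzing the TM log. It may be the same problem as in the reply above, or it may be something else.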
Thank you~
Xintong Song
On Mon, Oct 21, 2019 at 11:36 AM hery...@163.com wrote:
> Reference:
> http://mail-archives.apache.org/mod_mbox/flink-user-zh/201905.mbox/%3c2019052911134683852...@wsmtec.com%3E
>
>
>
> hery.
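Hello,
The blink perjob mode requests resources on demand according to the job's resource requirements, and it cannot cap the total resources of the whole job.
The parameters you listed only control the resource limit of a single TM. Lowering the limit of a single TM does not change the job's total resource requirement; it only means that more TMs are requested.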
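A small worked example of this point, using made-up numbers: a hypothetical job that needs 40 slots keeps needing 40 slots no matter how small each TM is, so shrinking the TMs only raises the TM count. The function name is hypothetical.

```python
import math


def task_managers_needed(total_slots: int, slots_per_tm: int) -> int:
    """Number of TMs required to host all slots of a job."""
    if slots_per_tm <= 0:
        raise ValueError("slots_per_tm must be positive")
    return math.ceil(total_slots / slots_per_tm)


total_slots = 40  # fixed by the job, not by the per-TM limits
for slots_per_tm in (8, 2):
    tms = task_managers_needed(total_slots, slots_per_tm)
    print(slots_per_tm, tms, tms * slots_per_tm)
# prints:
# 8 5 40
# 2 20 40
```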
Thank you~
Xintong Song
On Sat, Oct 19, 2019 at 3:56 PM 蒋涛涛 wrote:
> Hi all,
>
> When I submit a job with blink (perjob mode), how can I cap the job's resource usage? One job uses a particularly large number of YARN vcores
and optimizing
performance.
Thank you~
Xintong Song
On Thu, Oct 10, 2019 at 7:55 PM Timothy Victor wrote:
> After a batch job finishes in a flink standalone cluster, I notice that
> the memory isn't freed up. I understand Flink uses its own memory
> manager and just allocate
='.
Thank you~
Xintong Song
On Tue, Oct 8, 2019 at 1:59 PM LakeShen wrote:
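> A Flink job cannot isolate CPU by itself. Thinking about it, for memory you can compute in advance how much a job will use from the parameters the user supplies, and the same goes for VCores.
> We are also preparing to limit the resources that users can request.
>
> 龙逸尘 wrote on Mon, Oct 7, 2019 at 4:50 PM:
>
> > Dear community,
> > I have built a real-time computing platform. For historical reasons, the Flink version currently in use is community edition 1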
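First, go into the directory of the module that contains the test, which in your example is flink-connectors\flink-connector-kafka-base\
Then run mvn -Dtest=KafkaProducerTestBase#testExactlyOnceCustomOperator test
After -Dtest= you can give <class name>#<method name> to run a single test case, or just <class name> to run all test cases in that class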
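The two forms of the -Dtest selector described above can be sketched as a small Python helper that builds the argument list. The helper name is hypothetical; the mvn arguments themselves are the ones from the command above.

```python
from typing import List, Optional


def maven_test_args(class_name: str, method_name: Optional[str] = None) -> List[str]:
    """Build the mvn arguments for one test method, or for a whole test class."""
    selector = class_name if method_name is None else f"{class_name}#{method_name}"
    return ["mvn", f"-Dtest={selector}", "test"]


print(" ".join(maven_test_args("KafkaProducerTestBase", "testExactlyOnceCustomOperator")))
# mvn -Dtest=KafkaProducerTestBase#testExactlyOnceCustomOperator test
print(" ".join(maven_test_args("KafkaProducerTestBase")))
# mvn -Dtest=KafkaProducerTestBase test
```

The resulting list can be passed to `subprocess.run(..., cwd=module_dir)`, with `module_dir` set to the module directory mentioned above.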
Thank you~
Xintong Song
On Sun, Sep 29, 2019 at 4:32 PM gaofei
This mailing list does not show image attachments. Text can be pasted directly; images need to be hosted elsewhere and linked.
Thank you~
Xintong Song
On Tue, Aug 27, 2019 at 5:17 PM 张坤 wrote:
>
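> Thank you for your reply. Checkpoints use RocksDB. Looking at the GC logs, I see the following: heap memory usage is normal, the thread count is around 500, and threads get reclaimed, but the memory held by those threads does not seem to be reclaimed.
>
> On 2019/8/27 at 5:02 PM, "Xintong Song" wrote:
>
> Are you using the heap state backe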
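Are you using the heap state backend? Check whether the checkpoint
size keeps growing. If it does, the cause is most likely growing state. As the job runs and processes more and more data, the number of keys in state grows, and so does its size. The fix is either to switch to RocksDB or to give the TM more memory to leave headroom for state growth.
Also, if the checkpoint size keeps growing with no sign of leveling off, there may be a problem with how state is used.
If you find that state is not the problem, you may need to dump the TM's memory and check whether there is a memory leak somewhere.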
Thank you~
Xintong Song
On Mon
Congratulations!
Thanks Gordon and Kurt for being the release managers, and thanks all the
contributors.
Thank you~
Xintong Song
On Thu, Aug 22, 2019 at 2:39 PM Yun Gao wrote:
> Congratulations !
>
> Many thanks to Gordon and Kurt for managing the release and very
Congratulations Andrey~!
Thank you~
Xintong Song
On Wed, Aug 14, 2019 at 3:31 PM Oytun Tez wrote:
> Congratulations Andrey!
>
> I am glad the Flink committer team is growing at such a pace!
>
> ---
> Oytun Tez
>
> *M O T A W O R D*
> The World's Fastest Human
omatically, as long as there
are continuous activities of creating / destroying objects in heap, e.g.,
due to heartbeats. Please refer to java garbage collection documents [1]
for more details.
Thank you~
Xintong Song
[1]
https://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/
Hi,
It would be good if you could provide the job manager and task manager log
files, so that others can analyze the problem.
Thank you~
Xintong Song
On Mon, Aug 12, 2019 at 10:12 AM pengcheng...@bonc.com.cn <
pengcheng...@bonc.com.cn> wrote:
> Hi all,
> some slots are not be av
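Your problem description is rather vague. It would be best to provide some detailed information and logs so that others can help you.
For example, which version of Flink you are using, which mode you are running in (perjob / session), which environment you are running in (standalone / yarn /
mesos / k8s), and how you determined that the slots were not released.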
Thank you~
Xintong Song
On Mon, Aug 12, 2019 at 3:57 AM pengcheng...@bonc.com.cn <
pengcheng...@bonc.com.cn> wrote:
> Hello, and thanks. The image display does indeed have a problem, not
Hi,
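The images in your email do not display. Image attachments are somewhat problematic on the Flink mailing lists; for a screenshot, it is best to upload it elsewhere and paste the link.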
Thank you~
Xintong Song
On Fri, Aug 9, 2019 at 10:06 AM pengcheng...@bonc.com.cn <
pengcheng...@bonc.com.cn> wrote:
> Hi experts,
>
> Is anyone familiar with this situation? After the job finishes, some slots are not released.
>
>
> As shown in the figure:
>
>
>
>
> ---
Congratulations~!
Thank you~
Xintong Song
On Wed, Aug 7, 2019 at 4:00 PM vino yang wrote:
> Congratulations!
>
> highfei2...@126.com wrote on Wed, Aug 7, 2019 at 7:09 PM:
>
> > Congrats Hequn!
> >
> > Best,
> > Jeff Yang
> >
> >
> > Origi
/projects/flink/flink-docs-release-1.8/concepts/runtime.html#task-slots-and-resources
Thank you~
Xintong Song
On Sun, Aug 4, 2019 at 9:40 PM Chad Dombrova wrote:
> Hi all,
> First time poster, so go easy on me :)
>
> What is Flink's story for accommodating task workloads with vastly
Thanks for bringing this up, Boxiu.
The problem makes sense to me. My concern is whether we should limit the
priorities to 1-9 or not.
I think it would be good to open a jira issue and have the discussion
there.
Thank you~
Xintong Song
On Wed, Jul 31, 2019 at 12:22 PM tian boxiu wrote
container OOM, then it
should be non-java memory used by RocksDB.
[1]
https://docs.oracle.com/javase/8/docs/api/java/lang/management/MemoryMXBean.html
Thank you~
Xintong Song
On Tue, Jul 23, 2019 at 8:42 PM wvl wrote:
> Hi,
>
> We're running a relatively simple Flink applicatio
all the new generated files under the log dir here?
Thank you~
Xintong Song
On Fri, Jul 12, 2019 at 2:25 PM Karthik Guru wrote:
> Hello Xintong
>
> Thanks for your reply.
>
> 1)I have attached screenshots of the logs directory (Screenshot 1,2)
>
> 2) In conf/slaves, all
by the cluster.
- You can find the logs in 'log/flink-*-standalonesession-*.log'.
- If you cannot find the problem from the logs, you can also post them in
this ML for help.
Thank you~
Xintong Song
On Fri, Jul 12, 2019 at 1:57 AM Karthik Guru wrote:
> Hey Flink team,
>
> Novic
Hi Singh,
Could your problem be solved by simply recording the previous value and
subtracting it from the new value?
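In case it helps, the idea can be sketched in a few lines of Python. This is a generic illustration with a hypothetical class name, not a Flink API: it keeps the last reading of a cumulative count and returns the difference on each update.

```python
class DeltaTracker:
    """Turn a cumulative count into per-interval increments."""

    def __init__(self) -> None:
        self._previous = 0  # last cumulative value seen

    def update(self, current: int) -> int:
        """Record the new cumulative value and return the increase since the last call."""
        delta = current - self._previous
        self._previous = current
        return delta


tracker = DeltaTracker()
print(tracker.update(10), tracker.update(25), tracker.update(25))  # 10 15 0
```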
Thank you~
Xintong Song
On Wed, Jul 10, 2019 at 3:33 PM M Singh wrote:
> Hi:
>
> I am working on a Flink application and need to collect the number of
> events of
Thanks for the kind offer, Qi.
I think this work should not take much time, so I can take care of it. It's
just that the community is currently under feature freeze for release 1.9, so we
need to wait until the release code branch is cut.
Thank you~
Xintong Song
On Wed, Jul 10, 2019 at 1:55
`.
Thank you~
Xintong Song
On Tue, Jul 9, 2019 at 4:41 AM Vishwas Siravara wrote:
> Hi guys,
> I am not able to start a stand alone session with one task manager and one
> job manager on the same node by adding debug option in flink-conf.yaml as
> env.java.opts:
> -agentlib
Hello,
This should not be possible.
Thank you~
Xintong Song
On Thu, Jul 4, 2019 at 4:29 PM 戴嘉诚 wrote:
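> Hi everyone:
>
> I see that in Flink you can register a distributed cache file with StreamExecutionEnvironment.registerCachedFile(), which is then broadcast to every TM for the operators to use. My question is: can it detect that the file has been updated and broadcast it again? The IPs may change every day, so the IP database needs to be updated daily.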
>
>
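Hello,
The community has already identified the problem you ran into, and it will be fixed in upcoming versions. The current plan is to fix it in 1.7.3, 1.8.2, and 1.9.0. For details, see: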
https://issues.apache.org/jira/browse/FLINK-12122
Thank you~
Xintong Song
On Tue, Jul 2, 2019 at 11:27 AM Ever <439674...@qq.com> wrote:
> The Flink cluster in our test environment (Flink 1.8) has 3 taskmanagers, each with 10 slots.
>
> Then I have a jo
you~
Xintong Song
On Mon, Jul 1, 2019 at 8:59 PM 徐涛 wrote:
> Hi All,
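> The official documentation says that TM memory can be divided evenly among multiple slots.
>
> But I could not find the code in the Flink source that divides TM memory evenly among slots. I also do not understand how dividing memory evenly between different slots inside the same JVM is implemented.
> Many thanks!
>
>
> Thanks
> 徐涛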
hen running jobs IDE.
Thank you~
Xintong Song
On Fri, Jun 28, 2019 at 8:45 PM M Singh wrote:
> Hi Xintong:
>
> I passed the -Dparallelism.default=2 in the run configuration VM
> arguments for IntelliJ.
>
> So what I am looking for is a way to overwrite the config parameters which
Hi Singh,
You can use the environment variable "FLINK_CONF_DIR" to specify the path to
the directory of config files. You can also override config options with
command line arguments prefixed with -D (for yarn-session.sh) or -yD (for the 'flink
run' command).
Thank you~
Xintong Song
On Wed, Ju
"NONE".
2. Is "ClassPathJobGraphRetriever#retrieveJobGraph" actually invoked, and
are there any exceptions thrown from it? This is to verify whether the
correct code path for job cluster is invoked.
Thank you~
Xintong Song
On Tue, Jun 4, 2019 at 10:48 AM Boris Lublinsky <
boris.lublin...@ligh
So here are my questions:
1. What environment do you run Flink in? Is it locally, on Yarn or Mesos?
2. How do you trigger "restart a Job Master"?
Thank you~
Xintong Song
On Tue, Jun 4, 2019 at 10:35 AM Boris Lublinsky <
boris.lublin...@lightbend.com> wrote:
> Thanks,
>
, can you explain what detailed operation steps you
perform when you say "trying to restart a Job Master"?
Thank you~
Xintong Song
On Mon, Jun 3, 2019 at 10:05 PM Boris Lublinsky <
boris.lublin...@lightbend.com> wrote:
> I am trying to experiment with Flink Job s
be a simple choice because it avoids tuning these two
relevant factors at the same time.
Thank you~
Xintong Song
On Sat, May 25, 2019 at 4:54 AM black chase wrote:
> Hi Song,
> You said "In that way, the total slots (or number of TaskManagers if you
> config on slot for each Tas
As far as I know, Flink does not have any requirements on how the
TaskManagers are distributed across physical machines. So I think it really
depends on the scheduling policy of the Mesos cluster. I'm not an expert on
Mesos, so correct me if I am wrong.
Thank you~
Xintong Song
On Fri, May 24
performance bottleneck, usually it
can be solved by increasing the JobManager's resources with a '-jm'
argument.
Thank you~
Xintong Song
On Fri, May 24, 2019 at 2:33 AM Antonio Verardi wrote:
> Hello Flink users,
>
> How many task managers one can expect a Flink cluster to be able to
>
1.8/concepts/runtime.html#task-slots-and-resources>
.
Thank you~
Xintong Song
On Thu, May 23, 2019 at 11:49 PM black chase
wrote:
>
> Hi,
>
> I am redesigning the scheduler of the JobManager to place tasks of a job
> across TaskManagers according to a scheduling policy.
>
that do not
belong to any particular job.
For the latter, there are two ways that I can think of to specify log file
path for a Flink cluster.
- Set env.log.dir in flink-conf.yaml
- Modify log4j.appender.file.file in log4j.properties
Hope that helps.
Thank you~
Xintong Song
On Tue
about resource management of
Flink. AFAIK, the native memory is on the table, but I'm not sure whether
the metaspace memory is considered. I think we should create a jira issue
on this.
Thank you~
Xintong Song
On Mon, May 20, 2019 at 4:47 PM XiangWei Huang
wrote:
> Hi all,
> Currentl
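Taking the first part of the hostname is done to stay consistent with how HDFS does it. You can refer to the original issue, where the author specifically explained why.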
https://issues.apache.org/jira/browse/FLINK-1170?focusedCommentId=14175285=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14175285
Thank you~
Xintong Song
On Wed, May 15, 2019 at 9:11 PM Yun Tang wrote
Hi naisili,
I did not see any attachment, screenshot, or text description of the error in your email. Please check again.
Thank you~
Xintong Song
On Fri, Apr 26, 2019 at 10:46 AM naisili Yuan
wrote:
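> This is still the cluster stability problem. I found this error and would like to ask whether it is caused by my cluster HA configuration, and whether things would be more stable without depending on ZooKeeper.
> Hoping for a reply, thanks!
>
> naisili Yuan wrote on Mon, Apr 22, 2019 at 2:23 PM:
>
>> Sorry, I forgot to attach the image.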
>>
Hi naisili,
This is the user-zh mailing list, so if you speak Chinese you can ask
questions in Chinese. If you prefer using English, you can send emails to
u...@flink.apache.org. Hope that helps you.
BTW, I think you forgot to attach the screenshot.
Thank you~
Xintong Song
On Mon, Apr 22