Using MapState to store a BloomFilter for UV counting; checkpoint question

2020-08-26 Thread x
For the UV computation, a BloomFilter is kept in MapState; at checkpoint time the bloomMapState is snapshotted.
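The thread body above survives only in fragments, but the underlying idea, deduplicating users with a Bloom filter kept in state, is easy to sketch outside Flink. The class below is a hypothetical stand-in, not the poster's code; in a real job the bit array would be the value held in MapState and covered by the checkpoint:

```python
import hashlib

class BloomFilterUV:
    # Approximate distinct-user counter backed by a plain bit array.
    # In a Flink job this bit array would live in MapState.
    def __init__(self, size_bits=8192, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)
        self.uv = 0  # approximate unique-visitor count

    def _positions(self, key):
        # double hashing: derive k probe positions from one md5 digest
        d = hashlib.md5(key.encode()).digest()
        h1 = int.from_bytes(d[0:4], "big")
        h2 = int.from_bytes(d[4:8], "big")
        return [(h1 + i * h2) % self.size for i in range(self.num_hashes)]

    def add(self, key):
        # returns True if the key was (probably) new
        pos = self._positions(key)
        seen = all(self.bits[p // 8] >> (p % 8) & 1 for p in pos)
        for p in pos:
            self.bits[p // 8] |= 1 << (p % 8)
        if not seen:
            self.uv += 1
        return not seen

bf = BloomFilterUV()
for user in ["u1", "u2", "u1", "u3", "u2"]:
    bf.add(user)
print(bf.uv)  # 3: duplicates are filtered, at the cost of rare false positives
```

The trade-off the thread is about: the filter itself is compact, but it still has to be snapshotted with the checkpoint, which is presumably why it was put in MapState rather than a plain field.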

Re: Flink dimension-table delayed join

2020-08-26 Thread Leonard Xu
Hi all, delayed dimension-table join looks like a common case; I've seen quite a few people on the mailing list ask for it, so it seems worth considering support for delayed dimension-table joins. Could everyone share their main business scenarios? Best Leonard > On Aug 27, 2020, at 10:39, china_tao wrote: > > Generally the dimension data exists before the stream data. If you hit the situation you describe, there are two approaches: use a left > join, leaving the dimension fields of the stream record null and backfilling them later via ETL; or, on a miss, route the message to another Kafka topic for exception handling or backfill, which is also what you described as joining again every 5 or 10 minutes. >

Re: Flink dimension-table delayed join

2020-08-26 Thread china_tao
Generally the dimension data exists before the stream data. If you hit the situation you describe, there are two approaches: use a left join, leaving the dimension fields of the stream record null and backfilling them later via ETL; or, on a miss, route the message to another Kafka topic for exception handling or backfill, which is also what you described as joining again every 5 or 10 minutes. Personally I'd recommend storing null first and backfilling via ETL later. -- Sent from: http://apache-flink.147419.n8.nabble.com/

Reply: On not consuming Kafka messages when the sink fails

2020-08-26 Thread 范超
> Suppose messages 1-5 are being published by the sink operator: 1-3 are already published, but 4 and 5 are not, > and at this moment the source operator completes a checkpoint whose offset should be past > 5, say 6. > If publishing message 4 then fails, the job restarts from the last successful > checkpoint and the source operator resumes reading from offset 6, so messages 4 and 5 are lost. Is this understanding correct?

Re: Window aggregation after a flink interval join

2020-08-26 Thread Benchao Li
Hi Danny, You are right, we have already considered the watermark lateness in this case. However, our interval join operator has a bug that can still produce records later than the watermark. I've created an issue[1]; we can discuss it there. [1]

Re: Flink dimension-table delayed join

2020-08-26 Thread Benchao Li
Hi, we have run into a similar scenario; our approach was to modify the dimension-table join operator to support delayed joins. Another idea: send the records that failed to join to another topic, then consume them back and join again. 郑斌斌 wrote on Thu, Aug 27, 2020 at 9:23 AM: > Hi everyone: > > When joining a stream table with a dimension table, if a stream record has no match in the dimension table, can the join be delayed? For example, retry the match every 10 minutes and drop the record after 6 consecutive misses. > How can this be implemented with Flink SQL or a UDF? We currently do it with timers, which feels cumbersome. > > Thanks

Re: On not consuming Kafka messages when the sink fails

2020-08-26 Thread Benchao Li
Hi Eleanore, shizk233's explanation is already quite complete. As for your follow-up question, I don't think that understanding is correct. With checkpointing enabled, even though the Kafka producer does not use two-phase commit, it still guarantees that all buffered data is flushed when a checkpoint succeeds; if the flush fails, the checkpoint fails. So I understand the semantics here to be at-least-once: data may be duplicated, but it will not be lost. Eleanore Jin wrote on Thu, Aug 27, 2020 at 9:53 AM: > Hi shizk233, > > Thanks a lot for your answer!

Re: Case-insensitivity support for Map-typed fields in Flink SQL

2020-08-26 Thread zilong xiao
OK, understood, thanks! Leonard Xu wrote on Wed, Aug 26, 2020 at 9:26 PM: > Hi zilong, > > I opened an issue[1] for case sensitivity earlier and there is a preliminary PR, but the community wants a complete solution covering column names, table names, and catalog names together, so it is not supported yet. > > Best, > Leonard > [1] https://issues.apache.org/jira/browse/FLINK-16175?filter=12347488 < > https://issues.apache.org/jira/browse/FLINK-16175?filter=12347488> >

Re: On not consuming Kafka messages when the sink fails

2020-08-26 Thread Eleanore Jin
Hi shizk233, thanks a lot for your answer! Consider this scenario: my DAG just reads from a Kafka source topic and writes to a Kafka sink topic, with no other stateful operator in between. If the sink operator is not two-phase commit, just a Kafka producer send, then with checkpointing enabled the state is only the source operator's Kafka offsets. Suppose messages 1-5 are being published by the sink operator: 1-3 are already published, but 4 and 5 are not, and at this moment the source

Re: Problem with the pyflink kafka connector

2020-08-26 Thread Xingbo Huang
Hi, your DDL is fine; the problem is most likely that you have not added the Kafka jar. You can download the universal Kafka jar from https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/kafka.html. For how to add jar files for use in pyflink, see https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/python/faq.html#adding-jar-files

Flink dimension-table delayed join

2020-08-26 Thread 郑斌斌
Hi everyone: A question, please. When joining a stream table with a dimension table, if a stream record has no match in the dimension table, can the join be delayed? For example, retry the match every 10 minutes and drop the record after 6 consecutive misses. How can this be implemented with Flink SQL or a UDF? We currently do it with timers, which feels cumbersome. Thanks
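The scheme in the question, retry the match every 10 minutes and drop after 6 consecutive misses, can be sketched as plain Python. This is a toy model of the timer approach, not Flink SQL; the `dim_lookup(key, t)` interface is invented for illustration, and in a real job each attempt would be a registered timer firing:

```python
MATCH_INTERVAL_MIN = 10  # retry a miss every 10 minutes
MAX_ATTEMPTS = 6         # drop the record after 6 consecutive misses

def delayed_join(record, dim_lookup):
    # Each loop iteration models one timer firing; dim_lookup(key, t) returns
    # the dimension row as it exists at logical minute t, or None on a miss.
    for attempt in range(MAX_ATTEMPTS):
        t = attempt * MATCH_INTERVAL_MIN  # fires at minute 0, 10, ..., 50
        dim_row = dim_lookup(record["key"], t)
        if dim_row is not None:
            return {**record, **dim_row}  # joined: emit downstream
    return None  # 6 misses in a row: drop the record

# the dimension row for "k1" only appears at minute 30
late_dim = lambda key, t: {"name": "foo"} if key == "k1" and t >= 30 else None
print(delayed_join({"key": "k1", "v": 1}, late_dim))  # joins on the 4th attempt
print(delayed_join({"key": "k2", "v": 2}, late_dim))  # None: dropped
```

In the Flink version the loop disappears: the unmatched record goes into keyed state, a timer re-attempts the lookup, and an attempt counter decides when to give up.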

Re: Incremental file monitoring

2020-08-26 Thread yang zhang
This scenario can be implemented by updating the read offset through checkpoints. See this code for reference: https://github.com/liuhouer/np-flink/blob/master/src/main/java/cn/northpark/flink/project2/NP_ExactlyOnceParallelismFileSource.java Sent from my iPhone > On Aug 27, 2020 at 08:03, lj879933274 wrote: > > Hi all: > > Scenario: use Flink to monitor the files under a directory and process data whenever it is appended to a file. > > Approach:

Re: Default Flink Metrics Graphite

2020-08-26 Thread Vijayendra Yadav
Hi Chesnay and Dawid, I see multiple entries as following in Log: 2020-08-26 23:46:19,105 WARN org.apache.flink.runtime.metrics.MetricRegistryImpl - Error while registering metric: numRecordsIn. java.lang.IllegalArgumentException: A metric named

Incremental file monitoring

2020-08-26 Thread lj879933274
Hi all: Scenario: use Flink to monitor the files under a directory; whenever data is appended to a file, process it. Approach: I currently use ContinuousFileMonitoringFunction as the source in PROCESS_CONTINUOUSLY mode. Problem: with this approach, every append triggers a full re-read of the file. Is there a way to read only the appended content after an append?
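The incremental behaviour being asked for amounts to remembering the last read position and seeking past it on every poll; keeping that position in checkpointed state (as the custom source linked in the reply above does) is what makes it survive restarts. A minimal sketch in plain Python, independent of Flink:

```python
import os
import tempfile

def read_appended(path, last_offset):
    # Read only the bytes appended since last_offset and return
    # (new_data, new_offset). Persisting new_offset across polls (or in
    # checkpointed state) is what makes the read incremental.
    size = os.path.getsize(path)
    if size <= last_offset:
        return b"", last_offset  # nothing new
    with open(path, "rb") as f:
        f.seek(last_offset)
        return f.read(), size

# demo with a temp file
fd, path = tempfile.mkstemp()
os.close(fd)
offset = 0
with open(path, "ab") as f:
    f.write(b"line1\n")
data, offset = read_appended(path, offset)
print(data)  # b'line1\n'
with open(path, "ab") as f:
    f.write(b"line2\n")
data, offset = read_appended(path, offset)
print(data)  # b'line2\n': only the appended part is read
os.remove(path)
```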

Re: Failures due to inevitable high backpressure

2020-08-26 Thread David Anderson
One other thought: some users experiencing this have found it preferable to increase the checkpoint timeout to the point where it is effectively infinite. Checkpoints that can't timeout are likely to eventually complete, which is better than landing in the vicious cycle you described. David On

Re: Failures due to inevitable high backpressure

2020-08-26 Thread David Anderson
You should begin by trying to identify the cause of the backpressure, because the appropriate fix depends on the details. Possible causes that I have seen include: - the job is inadequately provisioned - blocking i/o is being done in a user function - a huge number of timers are firing

Re: OOM error for heap state backend.

2020-08-26 Thread Vishwas Siravara
Thanks Andrey, My question is related to The FsStateBackend is encouraged for: - Jobs with large state, long windows, large key/value states. - All high-availability setups. How large is large state without any overhead added by the framework? Best, Vishwas On Wed, Aug 26, 2020 at 12:10

Re: OOM error for heap state backend.

2020-08-26 Thread Andrey Zagrebin
Hi Vishwas, is this quantifiable with respect to JVM heap size on a single node > without the node being used for other tasks ? I don't quite understand this question. I believe the recommendation in docs has the same reason: use larger state objects so that the Java object overhead pays off.

Re: Setting job/task manager memory management in kubernetes

2020-08-26 Thread Alexey Trenikhun
Hello, What version of Flink do you use? If you use 1.10+ please check [1] (different properties names) [1] - https://ci.apache.org/projects/flink/flink-docs-stable/ops/memory/mem_setup.html Thanks, Alexey From: Sakshi Bansal Sent: Monday, August 24, 2020 3:30

Re: Example flink run with security options? Running on k8s in my case

2020-08-26 Thread Andrey Zagrebin
Hi Adam, maybe also check your SSL setup in a local cluster to exclude possibly related k8s things. Best, Andrey On Wed, Aug 26, 2020 at 3:59 PM Adam Roberts wrote: > Hey Nico - thanks for the prompt response, good catch - I've just tried > with the two security options (enabling rest and

Re: On not consuming Kafka messages when the sink fails

2020-08-26 Thread shizk233
Hi Eleanore, here is my understanding, for reference. 1. chk and at-least-once: the idea of the checkpoint mechanism is to take a globally consistent snapshot; on failure recovery, the consumption position rolls back to the progress of the last checkpoint n and the data is replayed, so no data is missed and every record is consumed at least once. 2. sink 2PC: under the checkpoint mechanism, data already emitted by the sink generally cannot be rolled back on replay, so there will be duplicates. With an upsert sink the result is still consistent; otherwise you need the 2PC pre-commit to write the data produced before chk n+1 succeeds to temporary storage, and only write it to the physical store once chk n+1 completes. If during chk
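The replay behaviour in point 1 can be shown with a toy model (plain Python, not Flink code): a non-transactional sink, source offsets checkpointed every 3 records, and one crash between checkpoints.

```python
def run_job(messages, fail_before=None):
    # Simulate: publish records one by one; a checkpoint every 3 records
    # saves the source offset. One crash right before publishing
    # `fail_before` restarts the job from the last checkpointed offset.
    published, ckpt_offset, crashed = [], 0, False
    i = 0
    while i < len(messages):
        if not crashed and messages[i] == fail_before:
            crashed = True
            i = ckpt_offset  # restore: replay from the last checkpoint
            continue
        published.append(messages[i])  # the sink's non-rollbackable output
        i += 1
        if i % 3 == 0:
            ckpt_offset = i  # checkpoint completes, offset advances
    return published

print(run_job([1, 2, 3, 4, 5, 6], fail_before=5))
# [1, 2, 3, 4, 4, 5, 6]: record 4 is duplicated, but nothing is lost
```

This is exactly the at-least-once outcome described above: duplicates from replay, no losses; the 2PC pre-commit in point 2 is what removes the duplicate visibility.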

Re: OOM error for heap state backend.

2020-08-26 Thread Andrey Zagrebin
Hi Vishwas, I believe the screenshots are from a heap size of 1GB? There are indeed many internal Flink state objects. They are overhead which is required for Flink to organise and track the state on-heap. Depending on the actual size of your state objects, the overhead may be relatively large

Failures due to inevitable high backpressure

2020-08-26 Thread Hubert Chen
Hello, My Flink application has entered into a bad state and I was wondering if I could get some advice on how to resolve the issue. The sequence of events that led to a bad state: 1. A failure occurs (e.g., TM timeout) within the cluster 2. The application successfully recovers from the last

Re: On not consuming Kafka messages when the sink fails

2020-08-26 Thread Eleanore Jin
Hi Benchao, could you explain why, without two-phase commit in the sink, the semantics are at-least-once? For example, if both source and sink are Kafka and the sink is not two-phase commit, then the checkpointed state is only the source offsets; in that case it looks no different from using Kafka's auto offset commit. Could you elaborate? Thanks! Eleanore On Tue, Aug 25, 2020 at 9:59 PM Benchao Li wrote: >

Re: Idle stream does not advance watermark in connected stream

2020-08-26 Thread Aljoscha Krettek
Yes, I'm afraid this analysis is correct. The StreamOperator, AbstractStreamOperator to be specific, computes the combined watermarks from both inputs here:
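The combination being referred to is in essence a minimum over the operator's inputs, which is exactly why one idle input pins the result. A one-line model (illustrative, not the AbstractStreamOperator code):

```python
def combined_watermark(input_watermarks):
    # A multi-input operator may only advance to the slowest input's watermark.
    return min(input_watermarks)

active, idle = 10_000, float("-inf")  # the idle input never advances
print(combined_watermark([active, idle]))  # -inf: no progress downstream
```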

Re: Default Flink Metrics Graphite

2020-08-26 Thread Chesnay Schepler
metrics.reporter.grph.class: org.apache.flink.metrics.graphite.GraphiteReporter https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html#graphite-orgapacheflinkmetricsgraphitegraphitereporter On 26/08/2020 16:40, Vijayendra Yadav wrote: Hi Dawid, I have 1.10.0

Re: Default Flink Metrics Graphite

2020-08-26 Thread Dawid Wysakowicz
I'd recommend then following this instruction from the older docs[1]. The differences are that you should set: |metrics.reporter.grph.class: org.apache.flink.metrics.graphite.GraphiteReporter| and put the reporter jar into the /lib folder: In order to use this reporter you must copy

Re: Problem building Flink 1.10 from source

2020-08-26 Thread Lynn Chen
Repositories: aliyun-mapr-public (mapr-releases): https://maven.aliyun.com/repository/mapr-public and confluent-packages-maven (confluent): https://packages.confluent.io/maven

Re: Default Flink Metrics Graphite

2020-08-26 Thread Vijayendra Yadav
Hi Dawid, I have 1.10.0 version of flink. What is alternative for this version ? Regards, Vijay > > On Aug 25, 2020, at 11:44 PM, Dawid Wysakowicz wrote: > >  > Hi Vijay, > > I think the problem might be that you are using a wrong version of the > reporter. > > You say you used

Re: Checkpoint failure in a streaming job

2020-08-26 Thread Robert.Zhang
Hi,
DataStream broad = env.readFrom(...).broad;
DataStream firstSource = env.readFrom(...);
DataStream secondSource = env.readFrom(...);
DataStream union = firstSource.union(secondSource);
IterativeStream iterativeStream = union.keyby(...).process(...).iterate();
DataStream

Resource leak in DataSourceNode?

2020-08-26 Thread Mark Davis
Hi, I am trying to investigate a problem with non-released resources in my application. I have a stateful application which submits Flink DataSet jobs using code very similar to the code in CliFrontend. I noticed that I am getting a lot of non-closed connections to my data store (HBase in my

RE: Example flink run with security options? Running on k8s in my case

2020-08-26 Thread Adam Roberts
Hey Nico - thanks for the prompt response, good catch - I've just tried with the two security options (enabling rest and internal SSL communications) and still hit the same problem   I've also tried turning off security (both in my Job definition and in my Flink cluster JobManager/TaskManager

Re: Case-insensitivity support for Map-typed fields in Flink SQL

2020-08-26 Thread Leonard Xu
Hi zilong, I opened an issue[1] for case sensitivity earlier and there is a preliminary PR, but the community wants a complete solution covering column names, table names, and catalog names together, so it is not supported yet. Best, Leonard [1] https://issues.apache.org/jira/browse/FLINK-16175?filter=12347488 > On Aug 26, 2020 at 20:47, zilong xiao wrote: > > Is there a related issue to follow?

Re: Case-insensitivity support for Map-typed fields in Flink SQL

2020-08-26 Thread zilong xiao
Is there a related issue to follow? Danny Chan wrote on Wed, Aug 26, 2020 at 8:42 PM: > Hi, Flink SQL is currently case-sensitive, and there is no plan to enable case insensitivity yet. > > Best, > Danny Chan > On Aug 21, 2020 at 11:04 AM +0800, zilong xiao wrote: > > As in the subject: in our business we have Map fields whose keys mean the same thing but differ in case. For example a Map field my_map > > may contain keys aB, ab, and Ab. Can a lookup like my_map['ab'] be case-insensitive in SQL

Re: Declaring a primary key in DDL reports a type mismatch

2020-08-26 Thread Danny Chan
Yes, adding a primary key constraint forces the type to NOT NULL; this comes from the nature of primary keys. Best, Danny Chan On Aug 20, 2020 at 5:19 PM +0800, xiao cai wrote: > Hi: > Flink version 1.11.0, connector kafka > Declaring a field as primary key in the DDL reports a type mismatch; removing the primary key constraint makes it run normally. > Setting shop_id to varchar not null does not help either. > > >

Re: Case-insensitivity support for Map-typed fields in Flink SQL

2020-08-26 Thread Danny Chan
Hi, Flink SQL is currently case-sensitive, and there is no plan to enable case insensitivity yet. Best, Danny Chan On Aug 21, 2020 at 11:04 AM +0800, zilong xiao wrote: > As in the subject: in our business we have Map fields whose keys mean the same thing but differ in case. For example a Map field my_map > may contain keys aB, ab, and Ab. Can a lookup like my_map['ab'] be case-insensitive in SQL and fetch the values of all related keys

Re: [ANNOUNCE] Apache Flink 1.10.2 released

2020-08-26 Thread liupengcheng
Thanks ZhuZhu for managing this release and everyone who contributed to this. Best, Pengcheng On 2020/8/26 at 7:06 PM, "Congxian Qiu" wrote: Thanks ZhuZhu for managing this release and everyone else who contributed to this release! Best, Congxian Xingbo Huang wrote on Wed, Aug 26, 2020

Re: How to visit outer service in batch for sql

2020-08-26 Thread Danny Chan
Hi, have you tried defining a UDAF in your group-window SQL? Inside the UDAF you can call your custom service. I'm afraid you are right that SQL only supports time windows. Best, Danny Chan On Aug 26, 2020 at 8:02 PM +0800, 刘建刚 wrote: >       For API, we can visit outer service in batch through countWindow,

Re: Window aggregation after a flink interval join

2020-08-26 Thread Danny Chan
For SQL, we always hold back the watermark when we emit the elements, for time interval: return Math.max(leftRelativeSize, rightRelativeSize) + allowedLateness; For your case, the watermark would hold back for 1 hour, so the left join records would not delay when it is used by subsequent
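The quoted expression can be read as a small function: for a join window reaching one hour back on one side and no allowed lateness, the watermark is held back by exactly one hour. A paraphrase of the expression above, not the actual operator code:

```python
def watermark_hold_back(left_relative_size, right_relative_size, allowed_lateness=0):
    # How long the interval-join operator holds back the emitted watermark
    # (all values in milliseconds).
    return max(left_relative_size, right_relative_size) + allowed_lateness

one_hour_ms = 60 * 60 * 1000
print(watermark_hold_back(one_hour_ms, 0))  # 3600000: held back by one hour
```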

How to visit outer service in batch for sql

2020-08-26 Thread 刘建刚
For API, we can visit outer service in batch through countWindow, such as the following. We can visit outer service every 1000 records. If we visit outer service every record, it will be very slow for our job. source.keyBy(new KeySelector()) .countWindow(1000)
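The batching effect of countWindow(1000) can be modelled without Flink: buffer keys and call the external service once per full buffer. The `service` interface here is invented for illustration:

```python
def batched_lookup(records, service, batch_size=1000):
    # Call the external service once per full buffer instead of once per
    # record, the same effect countWindow(1000) gives in the DataStream API.
    # `service` takes a list of keys and returns a dict of results.
    out, buf = [], []

    def flush():
        if buf:
            results = service(list(buf))
            out.extend(results[k] for k in buf)
            buf.clear()

    for r in records:
        buf.append(r)
        if len(buf) >= batch_size:
            flush()
    flush()  # don't forget the trailing partial batch
    return out

calls = []

def fake_service(keys):
    calls.append(len(keys))  # record the size of each batched call
    return {k: k * 10 for k in keys}

result = batched_lookup(list(range(7)), fake_service, batch_size=3)
print(result)  # [0, 10, 20, 30, 40, 50, 60]
print(calls)   # [3, 3, 1]: three service calls for seven records
```

In SQL, as the reply suggests, the analogous place to put this buffering would be inside a UDAF, since SQL group windows are time-based.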

Re: Error: Could not resolve ResourceManager address akka.tcp://flink@hostname:16098/user/resourcemanager

2020-08-26 Thread song wang
The ZK node can be created; each time a job_manager_lock is created under a new job_id. The yarn-session was created 3 months ago and the logs are huge, several GB, so uploading them is difficult. Thanks a lot for your replies, and sorry for bothering you repeatedly. I'll keep investigating and post any progress here. Xintong Song wrote on Wed, Aug 26, 2020 at 7:11 PM: > Does the ZK log contain anything about TaskExecutor node creation failures? > Also, when was this yarn-session created, and how long has it been running? > > If it were a ZK problem, I'd expect all yarn-sessions to be affected, but only this one has the problem > >

Re: [ANNOUNCE] Apache Flink 1.10.2 released

2020-08-26 Thread Leonard Xu
Thanks ZhuZhu for being the release manager and everyone who contributed to this release. Best, Leonard

Re: flink checkpoint causing severe backpressure

2020-08-26 Thread Congxian Qiu
Hi, for backpressure that appears after enabling checkpoints, if you want to optimize on the current basis you first need to find the cause of the backpressure. You can start troubleshooting from the last backpressured operator to see what is causing it; Arthas[1] may be of some help during the investigation. I'm also rather curious why backpressure makes your job fail. What is the failure cause? [1] https://github.com/alibaba/arthas Best, Congxian Yun Tang wrote on Wed, Aug 26, 2020 at 11:25 AM: > Hi > > Since it has already been changed to at least

Re: Checkpoint failure in a streaming job

2020-08-26 Thread Congxian Qiu
Hi, in principle there is no dependency between the data and the barriers, but from your description, barriers cannot be received when there is no data. Perhaps you can share your code (with the business logic simplified) so that people here can help take a look. Best, Congxian Robert.Zhang <173603...@qq.com> wrote on Wed, Aug 26, 2020 at 11:43 AM: > Hi Congxian, > > With more logging enabled, it turns out this is caused by the iteration source never receiving its barriers. >

Re: Error: Could not resolve ResourceManager address akka.tcp://flink@hostname:16098/user/resourcemanager

2020-08-26 Thread Xintong Song
Does the ZK log contain anything about TaskExecutor node creation failures? Also, when was this yarn-session created and how long has it been running? If it were a ZK problem, I'd expect all yarn-sessions to be affected, but only this one has the problem > That is not necessarily so: a ZK problem does not mean the whole service is unavailable. Some state related to the current application may be broken, so that only this job's subsequent operations are affected. From your description I can only guess at possible causes. Could you provide the complete JM log, so I can see whether anything turns up? Thank you~ Xintong Song On Wed, Aug 26, 2020 at 5:16 PM

Re: [ANNOUNCE] Apache Flink 1.10.2 released

2020-08-26 Thread Congxian Qiu
Thanks ZhuZhu for managing this release and everyone else who contributed to this release! Best, Congxian Xingbo Huang 于2020年8月26日周三 下午1:53写道: > Thanks Zhu for the great work and everyone who contributed to this release! > > Best, > Xingbo > > Guowei Ma 于2020年8月26日周三 下午12:43写道: > >> Hi, >>

Re: When will 1.11.2 be released?

2020-08-26 Thread abc15606
OK, thanks. I'm mainly interested in the hive streaming part. Sent from my iPhone > On Aug 26, 2020 at 18:34, Yun Tang wrote: > > See the release history at https://flink.apache.org/downloads.html#all-stable-releases > ; it is usually about 3 months, so around the end of October. > > Is there any particular bug fix you are waiting for in 1.11.2? > > Best, > Yun Tang > > > > > From: abc15...@163.com > Sent:

RE: flink1.11 kafka sql connector

2020-08-26 Thread venn
By default it should be Kafka's auto commit; with checkpointing enabled, offsets are committed on checkpoint. -Original Message- From: user-zh-return-6960-wxchunjhyy=163@flink.apache.org On Behalf Of Dream-底限 Sent: Wednesday, August 26, 2020 10:42 AM To: user-zh@flink.apache.org Subject: flink1.11 kafka sql connector hi

RE: On not consuming Kafka messages when the sink fails

2020-08-26 Thread venn
You can refer to this: https://www.cnblogs.com/bethunebtj/p/9168274.html#5-%E4%B8%BA%E6%89%A7%E8%A1%8C%E4%BF%9D%E9%A9%BE%E6%8A%A4%E8%88%AAfault-tolerant%E4%B8%8E%E4%BF%9D%E8%AF%81exactly-once%E8%AF%AD%E4%B9%89 -Original Message- From: user-zh-return-6980-wxchunjhyy=163@flink.apache.org On Behalf

RE: Can Flink redistribute data at runtime?

2020-08-26 Thread venn
If you implement your own KeySelector, it can observe backpressure on the downstream nodes and adjust its selection strategy dynamically. -Original Message- From: user-zh-return-6979-wxchunjhyy=163@flink.apache.org On Behalf Of Sun_yijia Sent: Wednesday, August 26, 2020 2:17 PM To: user-zh Subject: Can Flink redistribute data at runtime? I'm working on backpressure-related code and would like some advice.

Re: flink on yarn configuration question

2020-08-26 Thread 赵一旦
I'm putting this question on hold for a while; this part is rather complicated. It may also involve a custom scheduler and a custom Hadoop authentication mechanism. I'm not clear on the details yet and need to keep asking the colleagues responsible for our infrastructure. Yang Wang wrote on Tue, Aug 25, 2020 at 11:21 AM: > > Are you sure the queue upd_security exists? Also, is your YARN cluster's scheduler the CapacityScheduler or the FairScheduler? > With Fair you need to specify the full queue name, not just the leaf node. > > > Best, > Yang > > 赵一旦 wrote on Mon, Aug 24, 2020 at 10:55 AM:

Re: Monitor the usage of keyed state

2020-08-26 Thread Yun Tang
Hi Mu I want to share something more about the memory usage of RocksDB. If you enable managed memory for rocksDB (which is enabled by default) [1], you should refer to the block cache usage as we cast index & filter into cache and counting write buffer usage in cache. We could refer to the

Re: Example flink run with security options? Running on k8s in my case

2020-08-26 Thread Nico Kruber
Hi Adam, the flink binary will pick up any configuration from the flink-conf.yaml of its directory. If that is the same as in the cluster, you wouldn't have to pass most of your parameters manually. However, if you prefer not having a flink-conf.yaml in place, you could remove the

Re: When will 1.11.2 be released?

2020-08-26 Thread Yun Tang
See the release history at https://flink.apache.org/downloads.html#all-stable-releases ; releases are usually about 3 months apart, so roughly the end of October. Is there any particular bug fix you are waiting for in 1.11.2? Best, Yun Tang From: abc15...@163.com Sent: Wednesday, August 26, 2020 15:41 To: user-zh@flink.apache.org Subject: When will 1.11.2 be released? Roughly when will 1.11.2 be released?

Re: Can Flink redistribute data at runtime?

2020-08-26 Thread Congxian Qiu
Hi, as far as I know, the data partitioning rules cannot be changed after a job has started, so there is no way to meet this requirement. Best, Congxian Sun_yijia wrote on Wed, Aug 26, 2020 at 2:17 PM: > I'm working on backpressure-related code and would like some advice. > > There is a branch node with two downstream nodes A and B. Suppose A becomes backpressured while B is idle. > I'd like B to take over some of A's computation so that B can relieve part of A's load. > > Is there any way to make Flink, while running, send the data that would next go to node A to node B instead?

Problem with the pyflink kafka connector

2020-08-26 Thread Wanmail1997
Hi all, I ran into a problem defining a Kafka source directly with DDL. The Kafka topic contains JSON data collected by logstash. The DDL is:
CREATE TABLE vpn_source (
  c_real_ip VARCHAR,
  d_real_ip VARCHAR,
  c_real_port INT,
  d_real_port INT,
  logtype INT,
  `user` VARCHAR,
  host_ip VARCHAR
) WITH (
  'connector.type' = 'kafka',
  'connector.version' = 'universal',

Re: Error: Could not resolve ResourceManager address akka.tcp://flink@hostname:16098/user/resourcemanager

2020-08-26 Thread song wang
If it were a ZK problem, I'd expect all yarn-sessions to be affected, but only this one has the problem. Xintong Song wrote on Wed, Aug 26, 2020 at 16:50: > Based on the information we have so far, my preliminary judgment is that this is a ZK problem. As for what exactly is wrong with ZK, I suggest consulting a ZK expert to see why the node create > does not succeed. I'm not very familiar with this area either. > > > > Thank you~ > > > > Xintong Song > > > > > > > > On Wed, Aug 26, 2020 at 4:42 PM song wang > wrote: > > > > >

Re: Error: Could not resolve ResourceManager address akka.tcp://flink@hostname:16098/user/resourcemanager

2020-08-26 Thread song wang
Hi, I found the taskmanager logs. At the same time point as the jobmanager, 2020-08-22 05:39:24, a heartbeat timeout with the resourcemanager also occurred, followed by the error about failing to resolve the resourcemanager address, and finally the taskmanager exited after exceeding the maximum registration duration. The logs are as follows: 2020-08-22 05:39:24,479 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - The heartbeat of ResourceManager with id

Re: [Yan Yunpeng] Flink cdc error connecting to MySQL 5.7.25

2020-08-26 Thread Yan,Yunpeng(DXM,PB)
Hi, Flink version 1.11.1, using the cdc package directly to subscribe to the binlog:
public static void main(String[] args) throws Exception {
    SourceFunction sourceFunction = MySQLSource.builder()
        .hostname("")
        .port(***)
        .databaseList(" ") // monitor all tables under

Re: Error: Could not resolve ResourceManager address akka.tcp://flink@hostname:16098/user/resourcemanager

2020-08-26 Thread Xintong Song
Based on the information we have so far, my preliminary judgment is that this is a ZK problem. As for what exactly is wrong with ZK, I suggest consulting a ZK expert to see why the node create does not succeed. I'm not very familiar with this area either. Thank you~ Xintong Song On Wed, Aug 26, 2020 at 4:42 PM song wang wrote: > Hi, before the error there is this jobmanager log entry: > 2020-08-22 05:35:32,944 INFO org.apache.flink.yarn.YarnResourceManager > -

Re: Error: Could not resolve ResourceManager address akka.tcp://flink@hostname:16098/user/resourcemanager

2020-08-26 Thread song wang
Hi, before the error there is this jobmanager log entry: 2020-08-22 05:35:32,944 INFO org.apache.flink.yarn.YarnResourceManager - Disconnect job manager a523ce29077177cd3722ab2a8c9c40a9 @akka.tcp://flink@hostname:16098/user/jobmanager_32 for job 615cc1aaec726a4c42758e47772a81fa from the resource manager. zk

Re: [Yan Yunpeng] Flink cdc error connecting to MySQL 5.7.25

2020-08-26 Thread china_tao
Which Flink version, and how do you connect? If it is Flink SQL, use https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/jdbc.html and set the driver. If your MySQL account and password are definitely correct, you can remove the MySQL dependency from the pom, put the MySQL connector jar into Flink's lib directory, and submit again. -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: [Yan Yunpeng] Flink cdc error connecting to MySQL 5.7.25

2020-08-26 Thread Yan,Yunpeng(DXM,PB)
Could anyone help take a look? Yan Yunpeng DXM Payment Business Department Address: Du Xiaoman Financial HQ, Xibeiwang East Road, Haidian District, Beijing Postcode: 100085 Mobile: 13693668213 Email: yanyunp...@duxiaoman.com From: "Yan,Yunpeng(DXM,PB)" Reply-To: "user-zh@flink.apache.org" Date: Wednesday, Aug 26, 2020 11:24 To: "user-zh@flink.apache.org" Cc: "Li,Qian(DXM,PB)" Subject:

Re: Idle stream does not advance watermark in connected stream

2020-08-26 Thread Dawid Wysakowicz
Hi Kien, I am afraid this is a valid bug. I am not 100% sure but the way I understand the code the idleness mechanism applies to input channels, which means e.g. when multiple parallell instances shuffle its results to downstream operators. In case of a two input operator, combining the

When will 1.11.2 be released?

2020-08-26 Thread abc15606
Roughly when will 1.11.2 be released?

Reply: flink1.11 sql question

2020-08-26 Thread 酷酷的浑蛋
OK, thanks. On Aug 25, 2020 at 18:40, Benchao Li wrote: Hi, this feature is already supported in 1.12[1]; if you need it urgently, you can cherry-pick it back and try. The usage is to declare the field as varchar; the json format will then handle it for you automatically. [1] https://issues.apache.org/jira/browse/FLINK-18002 酷酷的浑蛋 wrote on Tue, Aug 25, 2020 at 6:32 PM: I haven't even reached the UDF step; with create table directly, the incoming data simply has no value for the field. CREATE TABLE test ( a

Re: Loading FlinkKafkaProducer fails with LinkError

2020-08-26 Thread Arvid Heise
Hi, @Chesnay Schepler The issue is that the uber-jar is first loaded with Flink's app classloader (because it's in lib) and then when the application starts, it gets loaded again in the ChildFirstCL and since it's child-first, the class is loaded anyways. What I don't quite understand is why

Re: Why consecutive calls of orderBy are forbidden?

2020-08-26 Thread Dawid Wysakowicz
Hi, I think you are hitting a bug here. What you are trying to do should be possible. Would you like to open a bug for it? However, the bug applies to the legacy batch planner (you are using the BatchTableEnvironment), which is no longer maintained and there were discussions already to drop

Re: Default Flink Metrics Graphite

2020-08-26 Thread Dawid Wysakowicz
Hi Vijay, I think the problem might be that you are using a wrong version of the reporter. You say you used flink-metrics-graphite-1.10.0.jar from 1.10 as a plugin, but it was migrated to plugins in 1.11 only[1]. I'd recommend trying it out with the same 1.11 version of Flink and Graphite

Reply: On not consuming Kafka messages when the sink fails

2020-08-26 Thread 范超
Hello Benchao, are there any examples or documentation of Flink two-phase commit that I could refer to? -Original Message- From: Benchao Li [mailto:libenc...@apache.org] Sent: Wednesday, August 26, 2020 To: user-zh Subject: Re: On not consuming Kafka messages when the sink fails In this situation you need to enable checkpointing to guarantee no data loss. If the sink has no two-phase commit, the semantics are at-least-once. 范超 wrote on Wed, Aug 26, 2020 at 11:38 AM: > Hi all, I have a question >
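For the pattern being asked about, here is a toy sketch of the two-phase-commit idea (pre-commit on the checkpoint barrier, commit on checkpoint completion), loosely modelled on Flink's TwoPhaseCommitSinkFunction but with all names and details invented for illustration:

```python
class TwoPhaseCommitSink:
    # Toy 2PC sink: data written since the last checkpoint is staged on
    # pre-commit and only made visible in the external store on commit.
    def __init__(self):
        self.pending = []  # records since the last checkpoint barrier
        self.staged = {}   # pre-committed transactions, keyed by checkpoint id
        self.store = []    # what the external system actually exposes

    def write(self, record):
        self.pending.append(record)

    def pre_commit(self, ckpt_id):
        # checkpoint barrier: flush the open transaction to "temporary storage"
        self.staged[ckpt_id] = self.pending
        self.pending = []

    def commit(self, ckpt_id):
        # checkpoint complete: move the staged data into the visible store
        self.store.extend(self.staged.pop(ckpt_id))

    def abort(self, ckpt_id):
        # failure before completion: the staged data is simply discarded
        self.staged.pop(ckpt_id, None)

sink = TwoPhaseCommitSink()
sink.write("a"); sink.write("b")
sink.pre_commit(1)
sink.write("c")          # belongs to the next, still-open transaction
sink.commit(1)
print(sink.store)        # ['a', 'b']: only checkpointed data is visible
```

The real implementation additionally has to survive a crash between pre-commit and commit, i.e. recover and re-commit pending transactions on restore, which is the hard part this sketch leaves out.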

Can Flink redistribute data at runtime?

2020-08-26 Thread Sun_yijia
There is a branch node with two downstream nodes A and B; suppose A becomes backpressured while B is idle. I'd like B to help A with part of its computation to relieve A's load. Is there any way to make Flink, while running, send the data destined for node A to node B instead?

Re: Default Flink Metrics Graphite

2020-08-26 Thread Vijayendra Yadav
Hi Nikola, To rule out any other cluster issues, I have tried it on my local machine now. Steps as follows, but I don't see any metrics yet. 1) Set up local Graphite:
docker run -d \
  --name graphite \
  --restart=always \
  -p 80:80 \
  -p 2003-2004:2003-2004 \
  -p 2023-2024:2023-2024 \
  -p 8125:8125/udp \
  -p