Re: Exactly-once with Async I/O

2019-09-03 Thread star
So doesn't async I/O amount to at least once? The docs say: "The asynchronous I/O operator offers full exactly-once fault tolerance guarantees. It stores the records for in-flight asynchronous requests in checkpoints and restores/re-triggers the

Re: Question about Streaming File Sink

2019-09-03 Thread 周美娜
My approach is to reconfigure the HADOOP_CONF_DIR environment variable: put core-site.xml and hdfs-site.xml on the Flink cluster and point the HADOOP_CONF_DIR environment variable at that directory. > On Sep 4, 2019, at 11:16, 戴嘉诚 wrote: > > Hi all: > While looking at streamingFileSink >
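For context, a minimal sketch (not from the thread) of a StreamingFileSink writing to HDFS, assuming a DataStream<String> and a placeholder path; resolving the namenode relies on the core-site.xml and hdfs-site.xml found via HADOOP_CONF_DIR as described above:

    import org.apache.flink.api.common.serialization.SimpleStringEncoder;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

    public class StreamingFileSinkSketch {
        public static void addSink(DataStream<String> stream) {
            // Hypothetical HDFS path; the namenode address comes from the
            // Hadoop config files located through HADOOP_CONF_DIR.
            StreamingFileSink<String> sink = StreamingFileSink
                .forRowFormat(new Path("hdfs://namenode:9000/flink/output"),
                              new SimpleStringEncoder<String>("UTF-8"))
                .build();
            stream.addSink(sink);
        }
    }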

Re: question

2019-09-03 Thread 圣眼之翼
I have found a way: select row(msg.f1) from table. ------ Original ------ From: <2463...@qq.com>; Date: 2019-09-04 (Wed) 10:57 To: <2463...@qq.com>; "JingsongLee"; "user"; Subject: Re: question I want to output the

Re: Kafka consumer behavior with same group Id, 2 instances of apps in separate cluster?

2019-09-03 Thread Becket Qin
Thanks for the explanation Ashish. Glad you made it work with a custom source. I guess your application is probably stateless. If so, another option might be a geo-distributed Flink deployment, meaning there would be TMs in different datacenters forming a single Flink cluster. This will also

Question about Streaming File Sink

2019-09-03 Thread 戴嘉诚
Hi all: While looking at streamingFileSink

Re: Kafka consumer behavior with same group Id, 2 instances of apps in separate cluster?

2019-09-03 Thread Ashish Pokharel
Thanks Becket, Sorry for the delayed response. That's what I thought as well. I built a hacky custom source today directly using the Kafka client, which was able to join a consumer group etc.; it works as I expected, but I'm not sure about production readiness for something like that :) The need arises
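For readers wondering what such a hand-rolled source can look like, here is a rough sketch under assumed details (topic, brokers, String values): it uses the Kafka client's subscribe() so the broker-side group coordinator shares partitions across every instance using the same group id, which FlinkKafkaConsumer's own partition assignment does not do:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class GroupManagedKafkaSource extends RichParallelSourceFunction<String> {
        private volatile boolean running = true;

        @Override
        public void run(SourceContext<String> ctx) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker:9092");        // assumption
            props.put("group.id", "shared-cross-cluster-group");  // same id in both clusters
            props.put("key.deserializer", StringDeserializer.class.getName());
            props.put("value.deserializer", StringDeserializer.class.getName());
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // subscribe() (not assign()) lets the group coordinator balance
                // partitions across all instances sharing the group id.
                consumer.subscribe(Collections.singletonList("events"));
                while (running) {
                    for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                        synchronized (ctx.getCheckpointLock()) {
                            ctx.collect(record.value());
                        }
                    }
                }
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }

As Ashish notes, this is not a production-ready replacement for FlinkKafkaConsumer: offsets are committed by the Kafka client itself rather than tied to Flink checkpoints.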

Re: question

2019-09-03 Thread 圣眼之翼
I want to output the query results to Kafka; the JSON format is as follows: { "id": "123", "serial": "6b0c2d26", "msg": { "f1": "5677" } } How do I define the format and schema of the Kafka sink? Thanks! ------ Original ------ From: <2463...@qq.com>;

Re: How to dynamically add a taskmanager in Flink cluster mode

2019-09-03 Thread pengcheng...@bonc.com.cn
On slave3, run bin/taskmanager.sh start from the Flink distribution. pengcheng...@bonc.com.cn From: 如影随形 Date: 2019-09-04 10:27 To: user-zh Subject: How to dynamically add a taskmanager in Flink cluster mode > Hi: When deploying the Flink cluster there is 1 jobmanager and 2 taskmanagers (slave1, slave2)

Re: How to dynamically add a taskmanager in Flink cluster mode

2019-09-03 Thread Wesley Peng
Hi on 2019/9/4 10:27, 如影随形 wrote: When deploying the Flink cluster there is 1 jobmanager and 2 taskmanagers (slave1, slave2); now I want to add slave3 as a taskmanager. Can it be added dynamically, like Spark, without stopping the cluster? AFAIK the answer is NO for now. However, the community says this has been under consideration since the FLIP-6 Flink Deployment and

How to dynamically add a taskmanager in Flink cluster mode

2019-09-03 Thread 如影随形
Hi: When deploying the Flink cluster there is 1 jobmanager and 2 taskmanagers (slave1, slave2). Now I want to add slave3 as a taskmanager. Can it be added dynamically, like Spark, without stopping the cluster? --

Re: How is the 200ms period controlled for Flink's periodic watermark generation

2019-09-03 Thread Yuan,Youjun
See the source: PeriodicWatermarkEmitter -----Original Message----- From: Dino Zhang Sent: Tuesday, September 3, 2019 3:14 PM To: user-zh@flink.apache.org Subject: Re: How is the 200ms period controlled for Flink's periodic watermark generation hi venn,

Re: Exactly-once with Async I/O

2019-09-03 Thread Dino Zhang
hi star, exactly-once here refers to Flink's internal state. To guarantee end-to-end exactly-once you can use two-phase commit by implementing TwoPhaseCommitSinkFunction, or make the writes idempotent. On Wed, Sep 4, 2019 at 8:20 AM star <3149768...@qq.com> wrote: > My reading of the docs is that in-flight async requests are saved in checkpoints and re-triggered on failover. My question is: since the requests are re-triggered and nothing is rolled back, the earlier requests have already affected the external system, so isn't that at > least-once? > For example, ck1: sent a b
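A minimal sketch of the idempotent option Dino mentions (the two-phase-commit route would instead extend TwoPhaseCommitSinkFunction); the JDBC URL, table and key column are assumptions, and the point is only that a deterministic key turns replayed writes into harmless upserts:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

    public class IdempotentUpsertSink extends RichSinkFunction<Tuple2<String, String>> {
        private transient Connection conn;
        private transient PreparedStatement upsert;

        @Override
        public void open(Configuration parameters) throws Exception {
            // Hypothetical connection details; any store with upsert semantics works.
            conn = DriverManager.getConnection("jdbc:postgresql://db:5432/app", "user", "pass");
            upsert = conn.prepareStatement(
                "INSERT INTO results(id, payload) VALUES (?, ?) " +
                "ON CONFLICT (id) DO UPDATE SET payload = EXCLUDED.payload");
        }

        @Override
        public void invoke(Tuple2<String, String> value, Context context) throws Exception {
            // The key (f0) is derived from the record itself, so a request that is
            // re-triggered after failover repeats the same upsert instead of
            // producing a duplicate row in the external system.
            upsert.setString(1, value.f0);
            upsert.setString(2, value.f1);
            upsert.executeUpdate();
        }

        @Override
        public void close() throws Exception {
            if (upsert != null) upsert.close();
            if (conn != null) conn.close();
        }
    }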

Exactly-once with Async I/O

2019-09-03 Thread star
My reading of the docs is that in-flight async requests are saved in checkpoints and re-triggered on failover; since nothing is rolled back, the earlier requests have already affected the external system, so isn't that at least-once? For example, ck1 sent a, b, c and ck2 sent d, e, f; when restoring from ck1's checkpoint, a, b, c will be sent again.

Re: [SURVEY] How do you use high-availability services in Flink?

2019-09-03 Thread Aleksandar Mastilovic
Hi Zili, Sorry for replying late, we had a holiday here in the US. We are using the high-availability.storageDir but only for the Blob store, however job graphs, checkpoints and checkpoint IDs are stored in MapDB. > On Aug 28, 2019, at 7:48 PM, Zili Chen wrote: > > Thanks for your email

Re: Flink SQL time issue

2019-09-03 Thread Jimmy Wong
Hi: You can convert the time to a Long. Regarding UTC, since you want to generate a Table: if you are using SQL you can do the conversion with a UDF. | Jimmy | wangzmk...@163.com | On Sep 3, 2019 21:25, JingsongLee wrote: Hi: 1. Yes, currently only UTC is supported; if you have computation requirements, consider adjusting the business window time. 2. Long is supported; does the error occur because your input is an int? What is the exact error message? Best, Jingsong Lee
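A small sketch of the UDF route Jimmy suggests; the function name, target zone and millisecond input are assumptions:

    import java.sql.Timestamp;
    import java.util.TimeZone;
    import org.apache.flink.table.functions.ScalarFunction;

    // Hypothetical UDF: shifts a UTC epoch-millis long by a fixed zone's offset
    // so the resulting Timestamp reads as local time (e.g. +08:00).
    public class UtcToLocalTimestamp extends ScalarFunction {
        public Timestamp eval(long epochMillis) {
            int offset = TimeZone.getTimeZone("Asia/Shanghai").getOffset(epochMillis);
            return new Timestamp(epochMillis + offset);
        }
    }

Registered with tableEnv.registerFunction("utc_to_local", new UtcToLocalTimestamp()), it can then be called in SQL as utc_to_local(ts).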

Re: [ANNOUNCE] Kinesis connector becomes part of Flink releases

2019-09-03 Thread vino yang
Good news! Thanks for your efforts, Bowen! Best, Vino On Mon, Sep 2, 2019 at 6:04 AM, Yu Li wrote: > Great to know, thanks for the efforts Bowen! > > And I believe it's worth a release note in the original JIRA, wdyt? Thanks. > > Best Regards, > Yu > > > On Sat, 31 Aug 2019 at 11:01, Bowen Li wrote: > >> Hi

Re: Flink SQL time issue

2019-09-03 Thread JingsongLee
Hi: 1. Yes, currently only UTC is supported; if you have computation requirements, consider adjusting the business window time. 2. Long is supported; does the error occur because your input is an int? What is the exact error message? Best, Jingsong Lee -- From: hb <343122...@163.com> Send Time: 2019-09-03 (Tuesday) 10:44 To: user-zh Subject: Flink SQL time issue Using the Kafka ConnectorDescriptor,

Re: question

2019-09-03 Thread 圣眼之翼
Thank you, let me try. ------ Original ------ From: "JingsongLee"; Date: 2019-09-03 (Tue) 7:53 PM To: <2463...@qq.com>; "user"; Subject: Re: question should be schema.field("msg", Types.ROW(...))? And you should select msg.f1 from table.

Re: question

2019-09-03 Thread JingsongLee
should be schema.field("msg", Types.ROW(...))? And you should select msg.f1 from table. Best Jingsong Lee Sent from Alibaba Mail for iPhone --Original Mail -- From: 圣眼之翼 <2463...@qq.com> Date: 2019-09-03 09:22:41 Recipient: user Subject: question How do you do: My problem is flink
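Putting Jingsong Lee's hint together, a sketch against the Flink 1.9-era descriptor API (topic, broker address and connector details are assumptions) that maps the nested JSON object to a named ROW so it can be addressed as msg.f1:

    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.table.api.java.StreamTableEnvironment;
    import org.apache.flink.table.descriptors.Json;
    import org.apache.flink.table.descriptors.Kafka;
    import org.apache.flink.table.descriptors.Schema;

    public class NestedJsonKafkaSketch {
        public static void register(StreamTableEnvironment tableEnv) {
            tableEnv.connect(new Kafka()
                    .version("universal")                      // assumed Kafka version
                    .topic("output-topic")                     // assumed topic
                    .property("bootstrap.servers", "broker:9092"))
                .withFormat(new Json().deriveSchema())
                .withSchema(new Schema()
                    .field("id", Types.STRING)
                    .field("serial", Types.STRING)
                    // nested JSON object mapped to a named ROW, queried as msg.f1
                    .field("msg", Types.ROW_NAMED(new String[]{"f1"}, Types.STRING)))
                .inAppendMode()
                .registerTableSink("kafka_out");
        }
    }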

error in my job

2019-09-03 Thread yuvraj singh
Hi all, I am facing a problem in my Flink job; I am getting the following exception: 2019-09-03 12:02:04,278 ERROR org.apache.flink.runtime.io.network.netty.PartitionRequestQueue - Encountered error while consuming partitions java.io.IOException: Connection reset by peer at

Re: [SURVEY] Is the default restart delay of 0s causing problems?

2019-09-03 Thread Till Rohrmann
The FLIP-62 discuss thread can be found here [1]. [1] https://lists.apache.org/thread.html/9602b342602a0181fcb618581f3b12e692ed2fad98c59fd6c1caeabd@%3Cdev.flink.apache.org%3E Cheers, Till On Tue, Sep 3, 2019 at 11:13 AM Till Rohrmann wrote: > Thanks everyone for the input again. I'll then

Re: [SURVEY] Is the default restart delay of 0s causing problems?

2019-09-03 Thread Till Rohrmann
Thanks everyone for the input again. I'll then conclude this survey thread and start a discuss thread to set the default restart delay to 1s. @Arvid, I agree that better documentation on how to tune Flink with sane settings for certain scenarios would be super helpful. However, as you've said it is
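For reference, the delay under discussion is the same one a job can already set explicitly on its restart strategy; a minimal sketch with arbitrary attempt count and delay:

    import java.util.concurrent.TimeUnit;
    import org.apache.flink.api.common.restartstrategy.RestartStrategies;
    import org.apache.flink.api.common.time.Time;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class RestartDelaySketch {
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Fixed-delay restarts: wait 1s between attempts instead of restarting immediately.
            env.setRestartStrategy(
                RestartStrategies.fixedDelayRestart(3, Time.of(1, TimeUnit.SECONDS)));
        }
    }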

Re: Re: Problem using StateBackend in Flink

2019-09-03 Thread Yun Tang
Hi 1. What is the checkpoint timeout setting (the default 10 min?)? If the timeout is too short, checkpoint failures are likely. 2. Are all subtasks showing n/a? The source task's checkpoint does not involve RocksDB and is effectively the same as with the default MemoryStateBackend, so the source task should not fail to complete its checkpoint either (unless it can never acquire the lock inside StreamTask because it is always processing elements). 3. What is the job's backpressure like; is there severe backpressure when using RocksDB (back
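A minimal sketch of where the checkpoint timeout from point 1 is configured, together with the RocksDB backend used in this thread (interval and pause values are placeholders):

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class CheckpointTuningSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setStateBackend(new RocksDBStateBackend("hdfs://host51:9000/flink/checkpoints", true));
            env.enableCheckpointing(60_000);                              // checkpoint every 60s
            env.getCheckpointConfig().setCheckpointTimeout(10 * 60_000);  // the 10min default timeout
            // Some breathing room between checkpoints helps when RocksDB causes backpressure.
            env.getCheckpointConfig().setMinPauseBetweenCheckpoints(30_000);
        }
    }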

Re: Re: Problem using StateBackend in Flink

2019-09-03 Thread Wesley Peng
on 2019/9/3 15:38, 守护 wrote: org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Received late message for now expired checkpoint attempt 3 from 24674987621178ed1a363901acc5b128 of job fd5010cbf20501339f1136600f0709c3. What does this problem mean?

Re: Problem using StateBackend in Flink

2019-09-03 Thread 守护
1. I am using the Flink API with a window and env.setStateBackend(new RocksDBStateBackend("hdfs://host51:9000/flink/checkpoints",true)); with RocksDB the checkpoint ... Completed checkpoint 142 for job 25a50baff7d16ee22aecb7b1 (806418 bytes in 426 ms).

Re: Problem using StateBackend in Flink

2019-09-03 Thread Wesley Peng
Hi on 2019/9/3 12:14, 々守护々 wrote: Symptom: I configure the StateBackend in code, written as env.setStateBackend(new RocksDBStateBackend("hdfs://host51:9000/flink/checkpoints",true)), but I found that in the SQL window below the job never gets past the checkpoint and always reports n/a, so I commented that line out of the code and configured in the flink-conf.xml file state.checkpoints.dir:

Re: How is the 200ms period controlled for Flink's periodic watermark generation

2019-09-03 Thread Dino Zhang
hi venn, With event time the watermark interval defaults to 200ms; it can be set via ExecutionConfig's setAutoWatermarkInterval method, see StreamExecutionEnvironment: /** * Sets the time characteristic for all streams create from this environment, e.g., processing * time, event time, or ingestion time. * * If you set the
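A minimal sketch of the knob being described: setting event time installs the 200 ms default, and setAutoWatermarkInterval overrides it explicitly.

    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class WatermarkIntervalSketch {
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // EventTime sets a default auto-watermark interval of 200 ms ...
            env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
            // ... which periodic watermark assigners use as their firing period.
            env.getConfig().setAutoWatermarkInterval(200);
        }
    }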

How is the 200ms period controlled for Flink's periodic watermark generation

2019-09-03 Thread venn
Hi all, Today, while reading the Flink source for assigning timestamps and watermarks, I found that periodic watermark creation really is periodic: from the times printed to the console it runs roughly every 200 ms. Where are those 200 ms controlled? I can't find the source location in the debug call stack. The periodic watermark assignment is as follows: .assignAscendingTimestamps(element => sdf.parse(element.createTime).getTime) .assignTimestampsAndWatermarks(new

Re: [SURVEY] Is the default restart delay of 0s causing problems?

2019-09-03 Thread Arvid Heise
Hi all, just wanted to share my experience with configuration with you. For non-expert users, configuring Flink can be very daunting. The list of common properties already helps a lot [1], but it's not clear how they depend on each other, and settings common for specific use cases are