Subject: async I/O: at least once?

The asynchronous I/O operator offers full exactly-once fault tolerance guarantees.
It stores the records for in-flight asynchronous requests in checkpoints and
restores/re-triggers the requests when recovering from a failure.
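As a minimal sketch of what such an async operator looks like (assumes a Flink streaming dependency on the classpath; lookup() here is a hypothetical stand-in for a real non-blocking client call):

```java
import java.util.Collections;
import java.util.concurrent.CompletableFuture;

import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class AsyncLookup extends RichAsyncFunction<String, String> {
    @Override
    public void asyncInvoke(String key, ResultFuture<String> resultFuture) {
        // Fire the request without blocking; complete the future with the result.
        CompletableFuture.supplyAsync(() -> lookup(key))
            .thenAccept(v -> resultFuture.complete(Collections.singleton(v)));
    }

    // Hypothetical placeholder for an external request.
    private String lookup(String key) { return "value-for-" + key; }
}
// Wiring (note: in-flight requests live in the checkpoint and are re-triggered
// on recovery, so the external system can observe the same request twice):
// AsyncDataStream.unorderedWait(input, new AsyncLookup(), 1, TimeUnit.SECONDS, 100);
```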
My approach was to reconfigure the HADOOP_CONF_DIR environment variable: place core-site.xml and hdfs-site.xml inside the Flink cluster, and
point the HADOOP_CONF_DIR environment variable at that directory.
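As a sketch, with an example path (the directory layout is an assumption, not the poster's actual setup):

```shell
# Place the Hadoop client configs where the Flink processes can read them,
# then export HADOOP_CONF_DIR before starting Flink.
mkdir -p /opt/flink/hadoop-conf
cp core-site.xml hdfs-site.xml /opt/flink/hadoop-conf/
export HADOOP_CONF_DIR=/opt/flink/hadoop-conf
```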
> On Sep 4, 2019, at 11:16 AM, 戴嘉诚 wrote:
>
> Hi all:
> I was looking at StreamingFileSink
>
I have found a way:
select row(msg.f1) from table.
------ Original Message ------
From: <2463...@qq.com>
Date: 2019-09-04 (Wed) 10:57
To: <2463...@qq.com>; "JingsongLee"; "user"
Subject: Re: question
I want to output the
Thanks for the explanation Ashish. Glad you made it work with custom source.
I guess your application is probably stateless. If so, another option might
be having a geo-distributed Flink deployment. That means there will be TM
in different datacenter to form a single Flink cluster. This will also
Hi all:
I was looking at StreamingFileSink
Thanks Becket,
Sorry for the delayed response. That's what I thought as well. Today I built a hacky
custom source directly on the Kafka client; it was able to join the consumer
group etc. and works as I expected, but I'm not sure about production readiness
for something like that :)
The need arises
I want to output the query results to Kafka; the JSON format is as follows:
{
"id": "123",
"serial": "6b0c2d26",
"msg": {
"f1": "5677"
}
}
How do I define the format and schema of the Kafka sink?
Thanks!
------ Original Message ------
From: <2463...@qq.com>
In the Flink cluster, run bin/taskmanager.sh start on slave3.
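For a standalone cluster this looks roughly like the following (hostname from the thread; the install path is an assumption):

```shell
# On the new node (slave3), start a TaskManager that registers with the
# running JobManager; no cluster restart is needed in standalone mode.
ssh slave3
cd /opt/flink            # assumed install path
bin/taskmanager.sh start
```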
pengcheng...@bonc.com.cn
Date: 2019-09-04 10:27
To: user-zh
Subject: Flink: dynamically adding a taskmanager
Hi:
When deploying the Flink cluster there was 1 jobmanager and 2 taskmanagers (slave1, slave2).
Hi
on 2019/9/4 10:27, 如影随形 wrote:
When deploying the Flink cluster, there was 1 jobmanager and 2 taskmanagers (slave1, slave2).
Now I want to add slave3 as a taskmanager. Can it be added dynamically without stopping the cluster, the way Spark allows?
AFAIK the answer is NO for now. However, the community says this has
been under consideration since FLIP-6 (Flink Deployment and
Hi:
When deploying the Flink cluster, there was 1 jobmanager and 2 taskmanagers (slave1, slave2).
Now I want to add slave3 as a
taskmanager. Can it be added dynamically, like Spark?
--
Source reference: PeriodicWatermarkEmitter
-----Original Message-----
From: Dino Zhang
Sent: Tuesday, September 3, 2019 3:14 PM
To: user-zh@flink.apache.org
Subject: Re: Flink periodic watermark creation: how is the 200 ms interval controlled?
hi venn,
hi star,
exactly-once refers to Flink's internal guarantee. For end-to-end
exactly-once you can use a two-phase commit by implementing TwoPhaseCommitSinkFunction, or make the writes idempotent.
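A minimal skeleton of the TwoPhaseCommitSinkFunction approach (assumes a Flink dependency; MyTxn and the commented-out client calls are hypothetical stand-ins for a real transactional external system):

```java
import org.apache.flink.api.common.ExecutionConfig;
import org.apache.flink.api.common.typeutils.base.VoidSerializer;
import org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer;
import org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction;

// Hypothetical handle for one open transaction against the external system.
class MyTxn {
    final String id = java.util.UUID.randomUUID().toString();
}

public class TwoPhaseSink extends TwoPhaseCommitSinkFunction<String, MyTxn, Void> {
    public TwoPhaseSink() {
        super(new KryoSerializer<>(MyTxn.class, new ExecutionConfig()),
              VoidSerializer.INSTANCE);
    }
    @Override protected MyTxn beginTransaction() { return new MyTxn(); }
    @Override protected void invoke(MyTxn txn, String value, Context ctx) {
        // Write value into the open transaction, e.g. producer.send(txn, value).
    }
    @Override protected void preCommit(MyTxn txn) {
        // Flush: after this point the transaction must be committable.
    }
    @Override protected void commit(MyTxn txn) {
        // Called once the checkpoint completes; must be idempotent (may retry).
    }
    @Override protected void abort(MyTxn txn) {
        // Roll back the transaction on failure.
    }
}
```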
On Wed, Sep 4, 2019 at 8:20 AM star <3149768...@qq.com> wrote:
> My reading of the docs is that the in-flight async requests are stored in the checkpoint and re-triggered on failover. My question: since the requests are re-triggered rather than rolled back, the earlier requests have already affected the external system. Isn't that at
> least-once?
> e.g. ck1: sent a, b
So on failover it is at
least-once:
ck1 sent a, b,
c; ck2 sent d, e, f; after restoring ck1's checkpoint, a,
b, c, ... are sent again.
Hi Zili,
Sorry for replying late, we had a holiday here in the US.
We are using the high-availability.storageDir but only for the Blob store,
however job graphs, checkpoints and checkpoint IDs are stored in MapDB.
> On Aug 28, 2019, at 7:48 PM, Zili Chen wrote:
>
> Thanks for your email
Hi:
You can convert the time to a Long. Regarding UTC: you said you want to generate a Table; in that case, if you are using SQL, you can do the conversion with a UDF.
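A sketch of such a UDF (the names and the fixed offset are made up for illustration; ScalarFunction is Flink's Table API base class):

```java
import java.sql.Timestamp;
import org.apache.flink.table.functions.ScalarFunction;

// Hypothetical UDF: converts an epoch-millis long (UTC) into a Timestamp
// shifted by a fixed offset, e.g. +8 hours for local window boundaries.
public class ToLocalTime extends ScalarFunction {
    private static final long OFFSET_MS = 8 * 60 * 60 * 1000L;

    public Timestamp eval(long epochMillis) {
        return new Timestamp(epochMillis + OFFSET_MS);
    }
}
// Register and use in SQL (1.9-era API):
// tableEnv.registerFunction("to_local_time", new ToLocalTime());
// SELECT to_local_time(ts_col) FROM my_table
```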
Jimmy
wangzmk...@163.com
On Sep 3, 2019 at 21:25, JingsongLee wrote:
Hi:
1. Yes, currently it can only be UTC. If you have computation requirements, you can consider adjusting the business window time.
2. Long is supported. Is your input an int? That would cause the error. What is the exact error message?
Best,
Jingsong Lee
Good news! Thanks for your efforts, Bowen!
Best,
Vino
On Mon, Sep 2, 2019 at 6:04 AM, Yu Li wrote:
> Great to know, thanks for the efforts Bowen!
>
> And I believe it worth a release note in the original JIRA, wdyt? Thanks.
>
> Best Regards,
> Yu
>
>
> On Sat, 31 Aug 2019 at 11:01, Bowen Li wrote:
>
>> Hi
Hi:
1. Yes, currently it can only be UTC. If you have computation requirements, you can consider adjusting the business window time.
2. Long is supported. Is your input an int? That would cause the error. What is the exact error message?
Best,
Jingsong Lee
--
From:hb <343122...@163.com>
Send Time: 2019-09-03 (Tue) 10:44
To: user-zh
Subject: Flink SQL time issue
Using the Kafka ConnectorDescriptor,
Thank you!
Let me try.
------ Original Message ------
From: "JingsongLee";
Date: 2019-09-03 (Tue) 7:53 PM
To: <2463...@qq.com>; "user";
Subject: Re: question
It should be schema.field("msg", Types.ROW(...)).
And you should select msg.f1 from table.
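A sketch of what that can look like with the 1.9-era connector descriptor API (the field names follow the JSON sample in the question; topic name, Kafka version, and the commented-out wiring are assumptions):

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.table.descriptors.Schema;

// Sketch: declare the nested "msg" object from the question as a ROW field
// so the sink's JSON format can be derived from the table schema.
public class KafkaJsonSinkSketch {
    static Schema schema() {
        return new Schema()
            .field("id", Types.STRING)
            .field("serial", Types.STRING)
            .field("msg", Types.ROW_NAMED(new String[]{"f1"}, Types.STRING));
    }
    // tableEnv.connect(new Kafka().version("universal").topic("output-topic"))
    //     .withFormat(new Json().deriveSchema())
    //     .withSchema(schema())
    //     .inAppendMode()
    //     .registerTableSink("kafka_sink");
}
```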
Best
Jingsong Lee
Sent from Alibaba Mail for iPhone
--Original Mail --
From:圣眼之翼 <2463...@qq.com>
Date:2019-09-03 09:22:41
Recipient:user
Subject:question
Hello:
My problem is with Flink
Hi all ,
I am facing a problem in my Flink job; I am getting the following exception:
2019-09-03 12:02:04,278 ERROR
org.apache.flink.runtime.io.network.netty.PartitionRequestQueue -
Encountered error while consuming partitions
java.io.IOException: Connection reset by peer
at
The FLIP-62 discuss thread can be found here [1].
[1]
https://lists.apache.org/thread.html/9602b342602a0181fcb618581f3b12e692ed2fad98c59fd6c1caeabd@%3Cdev.flink.apache.org%3E
Cheers,
Till
On Tue, Sep 3, 2019 at 11:13 AM Till Rohrmann wrote:
> Thanks everyone for the input again. I'll then
Thanks everyone for the input again. I'll then conclude this survey thread
and start a discuss thread to set the default restart delay to 1s.
@Arvid, I agree that better documentation on how to tune Flink with sane
settings for certain scenarios is super helpful. However, as you've said, it
is
Hi
1. What is the checkpoint timeout (the default 10 min?)? If the timeout is too short, checkpoint failures come easily.
2. Are all the subtasks n/a? A source
task's checkpoint does not involve RocksDB, so it behaves the same as the default MemoryStateBackend; the source
task should still complete its checkpoint (unless it can never acquire the lock inside StreamTask because it is always processing elements).
3. What is the job's backpressure situation? Is there severe backpressure when using RocksDB (back
on 2019/9/3 15:38, 守护 wrote:
org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Received late
message for now expired checkpoint attempt 3 from
24674987621178ed1a363901acc5b128 of job fd5010cbf20501339f1136600f0709c3.
May I ask what this problem is?
1. Using the Flink API with a window and env.setStateBackend(new
RocksDBStateBackend("hdfs://host51:9000/flink/checkpoints", true)), the RocksDB checkpoint completes: Completed
checkpoint 142 for job 25a50baff7d16ee22aecb7b1 (806418 bytes in 426
ms).
Hi
on 2019/9/3 12:14, 々守护々 wrote:
Symptom: I configured the StateBackend in code, written like this: env.setStateBackend(new
RocksDBStateBackend("hdfs://host51:9000/flink/checkpoints", true)). But I found that the SQL window below always gets stuck at the checkpoint, always reporting n/a.
So I commented out that line and configured in the flink-conf.yaml file:
state.checkpoints.dir:
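For reference, the flink-conf.yaml route looks roughly like this (the HDFS path is the one from the thread; the other keys are assumptions about the intended setup):

```yaml
# Sketch: RocksDB state backend configured in flink-conf.yaml instead of code.
state.backend: rocksdb
state.backend.incremental: true
state.checkpoints.dir: hdfs://host51:9000/flink/checkpoints
```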
hi venn,
The watermark interval for event time defaults to 200 ms and can be set via the setAutoWatermarkInterval method of ExecutionConfig. See StreamExecutionEnvironment:
/**
* Sets the time characteristic for all streams create from this
environment, e.g., processing
* time, event time, or ingestion time.
*
* If you set the
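A sketch of setting the interval explicitly (assumes a Flink dependency; the 500 ms value is illustrative):

```java
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class WatermarkIntervalSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
        // Selecting event time sets the auto-watermark interval to 200 ms ...
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
        // ... which can be overridden, e.g. emit watermarks every 500 ms.
        env.getConfig().setAutoWatermarkInterval(500);
    }
}
```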
Hi all, today while reading the Flink source for assigning timestamps and
watermarks, I saw that periodic watermark creation is indeed periodic; from the console output it runs roughly every
200 ms. Where is this 200 ms controlled? I cannot find it in the debug call stack (source location)?
The periodic watermark creation code is as follows:
.assignAscendingTimestamps(element =>
sdf.parse(element.createTime).getTime)
.assignTimestampsAndWatermarks(new
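The truncated assigner above typically continues with a periodic assigner; a Java sketch of the bounded-out-of-orderness variant looks like this (MyEvent and the 5-second bound are assumptions for illustration):

```java
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

// Hypothetical event POJO with an epoch-millis creation time.
class MyEvent {
    long createTime;
}

public class MyExtractor extends BoundedOutOfOrdernessTimestampExtractor<MyEvent> {
    public MyExtractor() {
        super(Time.seconds(5)); // assumed out-of-orderness bound
    }

    @Override
    public long extractTimestamp(MyEvent element) {
        // Called per record; the watermark itself is emitted on the periodic
        // 200 ms timer discussed above.
        return element.createTime;
    }
}
```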
Hi all,
just wanted to share my experience with configurations with you. For
non-expert users configurations of Flink can be very daunting. The list of
common properties is already helping a lot [1], but it's not clear how they
depend on each other and settings common for specific use cases are