?????? flink savepoints ?? checkpoints??????

2019-03-26 Thread ????
~ -- -- ??: "shengjk1"; : 2019??3??27??(??) 12:10 ??: "user-zh@flink.apache.org"; : "user-zh"; : ?? flink savepoints ?? checkpoints?? ??flink-connector-kafkakafka

?????? flink savepoints ?? checkpoints??????

2019-03-26 Thread shengjk1
??flink-connector-kafkakafka 0.8zkkafka??(??kafka??API) Flinkoffsetcheckpoint?? ??checkpointkafka??APIoffset??

?????? ?????? flink savepoints ?? checkpoints??????

2019-03-26 Thread ????
??checkpoint??savepoint 1??flink kafka??kafka??offsetflink??zookeeper

?????? ?????? flink savepoints ?? checkpoints??????

2019-03-26 Thread ????
?? -- -- ??: "baiyg25...@hundsun.com"; : 2019??3??27??(??) 11:38 ??: "user-zh"; : : ?? flink savepoints ?? checkpoints??

????: ?????? flink savepoints ?? checkpoints??????

2019-03-26 Thread baiyg25...@hundsun.com
savepoint ??savepoints baiyg25...@hundsun.com ?? 2019-03-27 11:03 user-zh ?? ?? flink

?????? flink savepoints ?? checkpoints??????

2019-03-26 Thread ????
??savepointsavepoints -- -- ??: "baiyg25...@hundsun.com"; : 2019??3??27??(??) 11:01 ??: "user-zh"; : : flink savepoints ?? checkpoints??

Re: Re: 实现 UpsertStreamTableSink, BatchTableSink 接口代码

2019-03-26 Thread 邓成刚【qq】
sql: select EVENTTIME,ID,EVENT_ID,MSISDN,TS from (select a.*,ROW_NUMBER() over(partition by EVENT_ID,MSISDN order by TS desc) AS rw           from table1 a ) where rw = 1 tableEnv.toRetractStream(结果表, Row.class).print(); 输出结果,分析结果发现,第二条的  1553652720961584  比第一条的时间 1553652720927835 更大,同时输出一条

????: flink savepoints ?? checkpoints??????

2019-03-26 Thread baiyg25...@hundsun.com
checkpoints: checkpoint?? savepoints??

flink savepoints ?? checkpoints??????

2019-03-26 Thread ????
??flink savepoints ?? checkpointscheckpoints??savepointssavepoints??savepoints?? PS??checkpoints,

Re: Help debugging Kafka connection leaks after job failure/cancelation

2019-03-26 Thread Steven Wu
it might be related to this issue https://issues.apache.org/jira/browse/FLINK-10774 On Tue, Mar 26, 2019 at 4:35 PM Fritz Budiyanto wrote: > Hi All, > > We're using Flink-1.4.2 and noticed many dangling connections to Kafka > after job deletion/recreation. The trigger here is Job

Re: Discrepancy between the part length file's length and the part file length during recover

2019-03-26 Thread Paul Lam
Hi, > Would then the assumption that this possibility ( part reported length > > part file size ( reported by FileStatus on NN) ) is only attributable to > this edge case be correct ? Yes, I think so. > Or do you see a case where in though the above is true, the part file would > need

Help debugging Kafka connection leaks after job failure/cancelation

2019-03-26 Thread Fritz Budiyanto
Hi All, We're using Flink-1.4.2 and noticed many dangling connections to Kafka after job deletion/recreation. The trigger here is Job cancelation/failure due to network down event followed by Job recreation. Our flink job has checkpointing disabled, and upon job failure (due to network

Re: Schema Evolution on Dynamic Schema

2019-03-26 Thread shkob1
Sorry to flood this thread, but keeping my experiments: so far i've been using retract to a Row and then mapping to a dynamic pojo that is created (using ByteBuddy) according to the select fields in the SQL. Considering the error I'm trying now to remove thr usage in Row and use the dynamic type

Re: Schema Evolution on Dynamic Schema

2019-03-26 Thread shkob1
Debugging locally it seems like the state descriptor of "GroupAggregateState" is creating an additional field (TypleSerializer of SumAccumulator) serializer within the RowSerializer. Im guessing this is what causing incompatibility? Is there any work around i can do? -- Sent from:

Re: Schema Evolution on Dynamic Schema

2019-03-26 Thread shkob1
Hi Fabian, It seems like it didn't work. Let me specify what i have done: i have a SQL that looks something like: Select a, sum(b), map[ 'sum_c', sum(c), 'sum_d', sum(d)] as my_map FROM... GROUP BY a As you said im preventing keys in the state forever by doing idle state retention time (+ im

1.7.2 requires several attempts to start in AWS EMR's Yarn

2019-03-26 Thread Bruno Aranda
Hi, I did write recently about our problems with 1.7.2 for which we still haven't found a solution and the cluster is very unstable. I am trying to point now to a different problem that maybe it is related somehow and we don't understand. When we restart a Flink Session in Yarn, we see it takes

Functionality of blob server

2019-03-26 Thread Manjusha Vuyyuru
Hello, Can someone please explain me the functionality of blob server in flink ? Thanks, Manju

Re: Install 1.7.2 on EC2 - No task slots - 2019

2019-03-26 Thread Jeff Crane
There are 2 out files and a log file. I will describe the steps to take below, but I have not found anything to indicate there's a problem when it cannot allocate any resources (but otherwise runs). --start default free tier aws instanceedit security group to allow 8081 incomingsudo

Re: What is Flinks primary API language?

2019-03-26 Thread Ilya Karpov
Thanks Yun Tang, we will keep that in mind! > 26 марта 2019 г., в 11:27, Yun Tang написал(а): > > Hi Llya > > I believe Java is the main implementation language of Flink internals, > flink-core is the kernel module and implemented in Java. > > What's more: > FILP6: Replace Scala implemented

Re: Discrepancy between the part length file's length and the part file length during recover

2019-03-26 Thread Vishal Santoshi
Thank you for your email. Would then the assumption that this possibility ( part reported length > part file size ( reported by FileStatus on NN) ) is only attributable to this edge case be correct ? Or do you see a case where in though the above is true, the part file would need truncation as

Re: Discrepancy between the part length file's length and the part file length during recover

2019-03-26 Thread Paul Lam
Hi Vishal, I’ve come across the same problem. The problem is that by default the file length is not updated when the output stream is not closed properly. I modified the writer to update file lengths on each flush, but it comes with some overhead, so this approach should be used when strong

Re: [DISCUSS] Remove forceAvro() and forceKryo() from the ExecutionConfig

2019-03-26 Thread Yun Tang
Hi Konstantin I think there is no direct relationship between registering chill-protobuf/chill-thrift for Protobuf/Thrift type with Kryo and enforcing POJO to use Kryo. For both Protobuf and Thrift types, they will be extracted as GenericTypeInfo within TypeExtractor which would use

Re: 回复: 实现 UpsertStreamTableSink, BatchTableSink 接口代码

2019-03-26 Thread 邓成刚【qq】
不好意思,我理解错了,更正一下: APPEND 流是没有这个字段的,只有更新流才有,true 表示 APPEND ,false 表示 update,这个值应该是流发出数据时自己带的,不是人为赋值的。。。   发件人: 邓成刚【qq】 发送时间: 2019-03-26 18:27 收件人: user-zh 主题: 回复: 实现 UpsertStreamTableSink, BatchTableSink 接口代码 这里面决定 update 或 delete 的 Boolean型值 怎么赋?   这里的  Boolean 值 是流类型决定,如果流是APPEND,则为true,

Re: How to run a job with job cluster mode on top of mesos?

2019-03-26 Thread Till Rohrmann
Hi Jacky, you're right that we are currently lacking documentation for the `mesos-appmaster-job.sh` script. I've added a JIRA issue to cover this [1]. In order to use this script you first need to store a serialized version of the `JobGraph` you want to run somewhere where the Mesos appmaster

回复: 实现 UpsertStreamTableSink, BatchTableSink 接口代码

2019-03-26 Thread 邓成刚【qq】
这里面决定 update 或 delete 的 Boolean型值 怎么赋? 这里的  Boolean 值 是流类型决定,如果流是APPEND,则为true, update,为false,你直接打印流会有这个字段 不知道我的理解是否正确,期待大佬解答。。。 邓成刚【qq】   发件人: baiyg25...@hundsun.com 发送时间: 2019-03-26 18:02 收件人: user-zh 主题: 实现 UpsertStreamTableSink, BatchTableSink 接口代码 大家好!         伙伴们,附件有实现 blink 中

blink消费kafka出现诡异的情况,困扰很久了,哪位大佬知道怎么回事

2019-03-26 Thread 邓成刚【qq】
HI,各位大佬:       发现一个很诡异的问题:使用SQL API时,在窗口上group by,JOB 5分钟后会timeout,但如果改成select * 就能正常消费kafka。。。 说明:本地模式和提交JOB均存在此异常 相关信息: blink 1.5.1 kafka 1.1.1 flink-connector-kafka-0.11_2.11-1.5.1-sql-jar.jar 消费正常的code: String sql = "select * from table1" Table sip_distinct_event_id = tableEnv.sqlQuery(

Re: [DISCUSS] Remove forceAvro() and forceKryo() from the ExecutionConfig

2019-03-26 Thread Stephan Ewen
Good point, Konstantin, that makes sense. On Tue, Mar 26, 2019 at 10:37 AM Konstantin Knauf wrote: > Hi Stephan, > > I am in favor of renaming forceKryo() instead of removing it, because users > might plugin their Protobuf/Thrift serializers via Kryo as advertised in > our documentation [1].

实现 UpsertStreamTableSink, BatchTableSink 接口代码

2019-03-26 Thread baiyg25...@hundsun.com
大家好! 伙伴们,附件有实现 blink 中 flink-table 模块 UpsertStreamTableSink, BatchTableSink 接口代码 ,自实现类放在 flink-jdbc 模块 org.apache.flink.api.java.io.jdbc 包下。大神们帮忙看看呗! 江湖救急啊! 目前实现后,在代码中调用,报异常: Exception in thread "main" org.apache.flink.table.api.TableException: Arity [4] of result

RocksDBStatebackend does not write checkpoints to backup path

2019-03-26 Thread Paul Lam
Hi, I have a job (with Flink 1.6.4) which uses rocksdb incremental checkpointing, but the checkpointing always fails with `IllegalStateException`, because hen performing `RocksDBIncrementalSnapshotOperation`, rocksdb finds that `localBackupDirectory`, which should be created earlier by rocksdb

Re: [DISCUSS] Remove forceAvro() and forceKryo() from the ExecutionConfig

2019-03-26 Thread Konstantin Knauf
Hi Stephan, I am in favor of renaming forceKryo() instead of removing it, because users might plugin their Protobuf/Thrift serializers via Kryo as advertised in our documentation [1]. For this, Kryo needs to be used for POJO types as well, if I am not mistaken. Cheers, Konstantin [1]

Re: [DISCUSS] Introduction of a Table API Java Expression DSL

2019-03-26 Thread jincheng sun
Thanks for bringing up this DISCUSS Timo! Java Expression DSL is pretty useful for java user. When we have the Java Expression DSL, Java API will become very rich and easy to use! +1 from my side. Best, Jincheng Dawid Wysakowicz 于2019年3月26日周二 下午5:08写道: > Hi, > > I really like the idea of

Re: RocksDB中指定nameNode 的高可用

2019-03-26 Thread Yun Tang
Hi Flink高可用相关配置的存储目录,当存储路径配置成HDFS时,相关namenode高可用性由HDFS支持,对上层完全透明。 祝好 唐云 From: 戴嘉诚 Sent: Tuesday, March 26, 2019 16:57 To: user-zh@flink.apache.org Subject: RocksDB中指定nameNode 的高可用   嘿,我想询问一下,flink中的RocksDB位置

Execution sequence for slot sharing

2019-03-26 Thread yinhua.dai
Hi Community, Can anyone help me understand the execution sequence in batch mode? 1. Can I set slot isolation in batch mode? I can only find the slotSharingGroup API in streaming mode. 2. When multiple data source parallel instances are allocated to the same slot, how does flink run those data

Re: [DISCUSS] Introduction of a Table API Java Expression DSL

2019-03-26 Thread Dawid Wysakowicz
Hi, I really like the idea of introducing Java Expression DSL. I think this will solve many problems e.g. right now it's quite tricky how string literals work in scala (sometimes it might go through the ExpressionParser and it will end up as an UnresolvedFieldReference), another important problem

Re: Is there window trigger in Table API ?

2019-03-26 Thread jincheng sun
Hi luyj, Currently, TableAPI does not have the trigger, due to the behavior of the windows(unbounded, tumble, slide, session) is very clear.The behavior of each window is as follows: - Unbounded Window - Each set of keys is a grouping, and each event triggers a calculation. - Tumble

Re: [DISCUSS] Remove forceAvro() and forceKryo() from the ExecutionConfig

2019-03-26 Thread Stephan Ewen
Compatibility is really important for checkpointed state. For that, you can always directly specify GenericTypeInfo or AvroTypeInfo if you want to continue to treat a type via Kryo or Avro. Alternatively, once https://issues.apache.org/jira/browse/FLINK-11917 is implemented, this should happen

RocksDB中指定nameNode 的高可用

2019-03-26 Thread 戴嘉诚
  嘿,我想询问一下,flink中的RocksDB位置  我指定了hdfs路径,但是,这里是强指定nameNode的地址,但是我的hdfs是有个两个nameNode地址的,这里能否有个功能,当active nameNode挂掉了,类似hdfs的HA那样,能无缝切换nameNode地址吗?不然,当nameNode挂掉了, 我指定的flink也会一并挂掉

如何实现 UpsertStreamTableSink , BatchTableSink 接口

2019-03-26 Thread baiyg25...@hundsun.com
大家好! 有没有伙伴对 blink 中 flink-table 模块下的 UpsertStreamTableSink , BatchTableSink 这两个接口比较熟悉?或者对TableSink这块处理原理比较熟悉?我想实现这两个接口,以实现JDBC更新功能,自己看源码只能看懂表面,希望熟悉的伙伴能给些指导。。。 baiyg25...@hundsun.com

Re: What is Flinks primary API language?

2019-03-26 Thread Yun Tang
Hi Llya I believe Java is the main implementation language of Flink internals, flink-core is the kernel module and implemented in Java. What's more: FILP6: Replace Scala implemented JobManager.scala and TaskManager.scala to new JobMaster.java and TaskExecutor.java FILP32: Make flink-table

Is there window trigger in Table API ?

2019-03-26 Thread lu yj
Hello, I am using Table API to do some aggregation based on time window. In DataStream API, there is trigger to control when the aggregation function should be invoked. Is there similar thing in Table API? Because I am using large time window, like a day. I want the intermediate result every

re:回复:fw:Blink SQL报错

2019-03-26 Thread bigdatayunzhongyan
收到,感谢姬平老师的专业解答。 谢谢各位! 发件人: 胥平勇(姬平) 发送时间: 2019-03-26 15:01:15 收件人:  bigdatayunzhongyan 抄送:  user-zh; Bowen Li 主题: 回复:fw:Blink SQL报错 Hi bigdatayunzhongyan: 1. SQL语法不支持: 这个可以参照代码里面TpcDsBatchExecPlanTest的单测,我们使用的sql query也都放在了工程里。看看是不是有些query的语法有些区别。 2. 执行方式: 我们自己benchmark的时候采用的是依赖tableEnv 

Re: [DISCUSS] Remove forceAvro() and forceKryo() from the ExecutionConfig

2019-03-26 Thread Yun Tang
Hi Stephan I prefer to remove 'enableForceKryo' since Kryo serializer does not work out-of-the-box well for schema evolution stories due to its mutable properties, and our built-in POJO serializer has already supported schema evolution. On the other hand, what's the backward compatibility plan

What is Flinks primary API language?

2019-03-26 Thread Ilya Karpov
Hello, our dev-team is choosing a language for developing Flink jobs. Most likely that we will use flink-streaming api (at least in the very beginning). Because of Spark jobs developing experience we had before the choice for now is scala-api. However recently I’ve found a

Re: Reserving Kafka offset in Flink after modifying app

2019-03-26 Thread Yun Tang
Hi Son I think it might be because of not assigning operator ids to your Filter and Map functions, you could refer to [1] to assign ids to your application. Moreover, if you have ever removed some operators, please consider to add --allowNonRestoredState [2] option. [1]

Re: Re: flink ha模式进程hang!!!

2019-03-26 Thread Han Xiao
非常谢谢您的解答,这个问题是zk中有失败任务的jobGraph,导致每次启动群集就会去检索,删除zk中残余后重启即可解决。 Thank you for your reply! 发件人: baiyg25...@hundsun.com 发送时间: 2019-03-26 09:40 收件人: user-zh 主题: Re: Re: flink ha模式进程hang!!! 是不是跟这个访问控制有关? high-availability.zookeeper.client.acl: open baiyg25...@hundsun.com 发件人: Han Xiao 发送时间: