flink-yarn ha

2021-02-01 Thread 郭斌
在使用zookeeper 实现ha时,重启3个flink-yarn中一个实例时出现如下问题,请问是什么原因: org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /flink-ha/cluster_yarn/leader at

Flink sql using Hive for metastore throws Exception

2021-02-01 Thread Eleanore Jin
Hi experts, I am trying to experiment how to use Hive to store metadata along using Flink SQL. I am running Hive inside a docker container locally, and running Flink SQL program through IDE. Flink version 1.12.0 the sample code looks like: StreamExecutionEnvironment bsEnv =

Re: Flink 提交作业时的缓存可以删除吗

2021-02-01 Thread tison
org/apache/flink/yarn/YarnResourceManagerDriver.java:236 org/apache/flink/yarn/YarnClusterDescriptor.java:495 应该是会在作业退出或者强杀的时候清理的,你可以看一下对应版本有无这个逻辑 可以加一下日志看看实际是否触发,删除的是什么目录 Best, tison. Robin Zhang 于2021年2月2日周二 下午2:37写道: > Flink 1.12下会将flink的依赖以及作业的jar包缓存在hdfs上,如下图: > > < >

TaggedOperatorSubtaskState is missing when creating a new savepoint using state processor api

2021-02-01 Thread ????????
state processor apikafkastate??savepointmax parallelismsavepoint?? java.lang.Exception: Exception while creating StreamOperatorStateContext. at

Flink 提交作业时的缓存可以删除吗

2021-02-01 Thread Robin Zhang
Flink 1.12下会将flink的依赖以及作业的jar包缓存在hdfs上,如下图: 由于flink很早就开始使用了,这种目录越来越多,就算任务不在运行也不会自动清除。经过简单测试,直接删除后,不影响任务的运行以及简单的状态恢复。目前不知道会不会存在其他依赖,希望有清楚的能解释下这个的原理、作用以及能否删除。

flink-parcel????????

2021-02-01 Thread ????????
flink-parcelflink-yarn??kill??flink-yarnkill??kill??yarn application??finished application_1612236486024_0009 hdfsFlink session cluster Apache Flink root.users.hdfs 0 Tue Feb 2 11:48:10 +0800 2021

flink-parcel????????

2021-02-01 Thread ????????
flink-parcelflink-yarn??kill??flink-yarnkill??kill??yarn application??finished application_1612236486024_0009 hdfsFlink session cluster Apache Flink root.users.hdfs 0 Tue Feb 2 11:48:10 +0800 2021

Re: Custom JdbcSink?

2021-02-01 Thread Cecile Kim
Nevermind! I realized I was just misreading the code. I can override the class. JdbcExec is just a generic, duh  Thank you, Cecile From: Cecile Kim Date: Monday, February 1, 2021 at 2:01 PM To: user@flink.apache.org Subject: Custom JdbcSink? Hi, I would like to override the JdbcSink, so

flink sql

2021-02-01 Thread ???????L
hi, ?? ??1.12flink sql ??datastream?,

flink-parcel使用求助

2021-02-01 Thread 郭斌
使用flink-parcel部署flink-yarn成功,为了模拟服务器宕机,kill掉其中一个flink-yarn实例,为什么每次kill之后都有另外一个非手动kill的yarn application标记为finished application_1612236486024_0009hdfsFlink session clusterApache Flinkroot.users.hdfs0Tue Feb 2 11:48:10 +0800 2021Tue Feb 2 11:48:11 +0800 2021Tue Feb 2 13:40:39 +0800 2021KILLED

Re: LEAD/LAG functions

2021-02-01 Thread Jark Wu
Yes. RANK/ROW_NUMBER is not allowed with ROW/RANGE over window, i.e. the "ROWS BETWEEN 1 PRECEDING AND CURRENT ROW" clause. Best, Jark On Mon, 1 Feb 2021 at 22:06, Timo Walther wrote: > Hi Patrick, > > I could imagine that LEAD/LAG are translated into RANK/ROW_NUMBER > operations that are not

Re: Flink CheckPoint/Savepoint Behavior Question

2021-02-01 Thread Raghavendar T S
Flink is aware of all the tasks running in the cluster. If any of the tasks fails, the failed task is restored using the checkpoint (only If the task uses Flink Operator State). This scenario will not use savepoints. Savepoints are same as checkpoints and the difference is that the savepoints are

Re: python udf求助: Process died with exit code 0

2021-02-01 Thread Xingbo Huang
Hi, IllegalStateException这个不是root cause,最好把完整的日志贴出来才能更好查出问题。而且最好把能准确复现的代码尽量精简化的贴出来。 Best, Xingbo Appleyuchi 于2021年1月26日周二 下午5:51写道: > 我进行了如下操作: > https://yuchi.blog.csdn.net/article/details/112837327 > > > 然后报错: > java.lang.IllegalStateException: Process died with exit code 0 > > > 请问应该如何解决? >

Re: flinksql引入flink-parquet_2.11任务提交失败

2021-02-01 Thread yujianbo
大佬后面你是怎么解决的,我也是突然遇到这个问题 -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: PyFlink Expected IPC message of type schema but got record batch

2021-02-01 Thread Xingbo Huang
Hi, Sorry for the late reply. Thanks for reporting this issue which has been recorded in FLINK-21208[1]. I will fix it as soon as possible. [1] https://issues.apache.org/jira/browse/FLINK-21208 Best, Xingbo 苗红宾 于2021年1月31日周日 下午3:28写道: > Hi: > > Hope you are good! I have a question for

How to implement a FTP connector Flink Table/sql support?

2021-02-01 Thread 1095193...@qq.com
Hi I have investigate the relevant document and code about Flink connector. Flink support local filesystem and several pluggable file system which not include FTP. Could you give me some suggestions how to make Flink read data from FTP. One way I have learned is implementing FTP conncector

Flink 1.11 session cluster相关问题

2021-02-01 Thread zilong xiao
请问社区大佬,1.11版本的session cluster模式不支持在启动时指定启动taskmanager个数了吗?好像只能动态申请资源了?在1.4版本可以用-n,现在该参数已移除,为什么要这么做呢?我理解在启动一个session cluster的同时申请好TM个数也是一种常见场景吧? 求社区大佬指点

Re: Integration with Apache AirFlow

2021-02-01 Thread 姜鑫
Hi Flavio,Could you explain what your direct question is? In my opinion, it is possible to define two airflow operators to submit dependent flink job, as long as the first one can reach the end.Regards,Xin2021年2月1日 下午6:43,Flavio Pompermaier 写道:Any advice here?On Wed, Jan 27,

getCurrentWatermark

2021-02-01 Thread ????
@Override public final Watermark getCurrentWatermark() { // this guarantees that the watermark never goes backwards. long potentialWM = currentMaxTimestamp - maxOutOfOrderness; if (potentialWM = lastEmittedWatermark) { //currentMaxTimestamp

[Flink SQL] Run SQL programatically

2021-02-01 Thread cristi.cioriia
Hey guys, We have been trying out Fink SQL to implement use-cases like "compute the last X minutes of data" and we'd like to return the computed data programatically from Scala/Java, is there a way to write a program to run SQL queries over Flink in a reactive manner? We used for now for

Flink CheckPoint/Savepoint Behavior Question

2021-02-01 Thread Jason Liu
We currently have some logic to load data from S3 into memory in our Flink/Kinesis Analytics app. This happens before the RichFunction.open() function. We have two questions here and I can't find too much information in the apache.org website: 1. (More of a clarification) When Flink does

Custom JdbcSink?

2021-02-01 Thread Cecile Kim
Hi, I would like to override the JdbcSink, so that, given one record, it adds N insert SQL statements to the batch, where N is equal to a length computed by the given record. To do this, I need to override JdbcBatchingOutputFormat.writeRecord(), so that I can adjust how batchCount is

Running Beam Pipelines on a Flink Application Mode Cluster

2021-02-01 Thread Jan Bensien
Hello, I am currently trying to run my Apache Beam applications using Flink as my backend. Currently i use a session cluster running on Kubernetes. Is it possible to run Beam pipelines using the application mode? I would like to change to application mode, as I currently benchmark my

[Flink SQL] Insert query fails for partitioned table

2021-02-01 Thread Cristian Cioriia
Hey guys, I’m trying to create a Kafka backed partitioned table [1] and insert some data into it [2] using the sql-client, but I get the error [3] when doing it. Can you guys help with this? Also, I wanted to add the partition to the table as in [4] as per the documentation, but then the

Re: Proctime consistency

2021-02-01 Thread Rex Fenley
We need to aggregate in precisely row order. Is there a safe way to do this? Maybe with some sort of row time sequence number? As written in another email, we're currently doing the following set of operations val compactedUserDocsStream = userDocsStream

Re: Flink and Amazon EMR

2021-02-01 Thread Piotr Nowojski
Hi Marco, > Is this assumption correct? Yes. More or else each operator is first creating a copy of its state locally and uploading to S3 this whole file at once. Please first take a look which part of checkpointing is taking so long. Re backpressure. Keep in mind that Checkpoint Barriers need

Re: question on checkpointing

2021-02-01 Thread Marco Villalobos
Actually, perhaps I misworded it. This particular checkpoint seems to occur in an operator that is flat mapping (it is actually a keyed processing function) a single blob data-structure into several hundred thousands elements (sometimes a million) that immediately flow into a sink. I am

Re: Flink and Amazon EMR

2021-02-01 Thread Marco Villalobos
Thank you. Checkpoints timeout often, even though the timeout limit is 20 minutes. The volume of records in our processing window that require checkpointing is large (between 20 and 2 million). I made the assumption that Flink would batch a blob of bytes to S3, and not create an S3 call per

Re: Connect to schema registry via SSL

2021-02-01 Thread Laurent Exsteens
Hi Dawid, Thank you for your answer. I suspected the same and was about to try to follow the calls to verify it. You save me some time. I guess it might be early to tell, but if you would know in which flink and Ververica platform version this would become available, and when, I'm off course

Re: importing types doesn't fix “could not find implicit value for evidence parameter of type …TypeInformation”

2021-02-01 Thread Piotr Nowojski
Hey, Sorry for my hasty response. I didn't notice you have the import inside the code block. Have you maybe tried one of the responses suggested in the Stackoverflow by other users? Best, Piotrek pon., 1 lut 2021 o 15:49 Piotr Nowojski napisał(a): > Hey Devin, > > Have you maybe tried

Re: Flink and Amazon EMR

2021-02-01 Thread Piotr Nowojski
Hi, Yes, it's working. You would need to analyse what's working slower than expected. Checkpointing times? (Async duration? Sync duration? Start delay/back pressure?) Throughput? Recovery/startup? Are you being rate limited by Amazon? Piotrek czw., 28 sty 2021 o 03:46 Marco Villalobos

Re: importing types doesn't fix “could not find implicit value for evidence parameter of type …TypeInformation”

2021-02-01 Thread Piotr Nowojski
Hey Devin, Have you maybe tried looking for an answer via Google? Via just copying pasting your error message into Google I'm getting hundreds of results pointing towards: import org.apache.flink.api.scala._ Best, Piotrek czw., 28 sty 2021 o 04:13 Devin Bost napisał(a): > I posted this

Re: Connect to schema registry via SSL

2021-02-01 Thread Dawid Wysakowicz
Hi, I am afraid passing of these options is not supported in SQL yet. I created FLINK-21229 to add support for it. In a regular job you can construct a schema registry client manually:     RegistryAvroDeserializationSchema deserializationSchema = new RegistryAvroDeserializationSchema<>(

Re: Proctime consistency

2021-02-01 Thread Timo Walther
Hi Rex, processing-time gives you no alignment of operators across nodes. Each operation works with its local machine clock that might be interrupted by the OS, Java garbage collector, etc. It is always a best effort timing. Regards, Timo On 27.01.21 18:16, Rex Fenley wrote: Hello, I'm

Re: Flink sql problem

2021-02-01 Thread Timo Walther
Hi Jiazhi, I think an OVER window might solve your use case. It gives you a rolling aggregation over period of time. Maybe you need to define a custom aggregate function to emit the final record as you need it. Let me know if you have further questions. Regards, Timo On 27.01.21 15:02,

Re: Rocksdb - org.apache.flink.util.SerializedThrowable : bad entry in block

2021-02-01 Thread Timo Walther
Hi Omkar, sorry for the late reply. This sounds like a serious issue. It looks like some of the RocksDB data is corrupt. Are you sure this is not a problem of you storage layer? Otherwise I would investigate whether the serializers work correctly. Maybe Beam did put a corrupt data into

Re: Problem restirng state

2021-02-01 Thread Timo Walther
Hi Shridhar, the exception indicates that something is wrong with the object serialization. Kryo is unable to serialize the given object. It might help to 1) register a custom Kryo serializer in the ExecutionConfig or 2 ) pass dedicated type information using the types from

Re: Is Flink able to parse strings into dynamic JSON?

2021-02-01 Thread Chesnay Schepler
Flink needs to know upfront what kind of types it deals with to setup the serialization stack between operators. As such, generally speaking, you will have to use some generic container for transmitting data (e.g., a String or a Jackson ObjectNode) and either work on them directly or map them

Re: Is Flink able to parse strings into dynamic JSON?

2021-02-01 Thread Timo Walther
Hi Devin, Flink supports arbitrary data types. You can simply read the JSON object as a big string first and process the individual event types in a UDF using e.g. the Jackson library. Are you using SQL or DataStream API? An alternative is to set the "fail-on-missing-field" flag to false.

Re: LEAD/LAG functions

2021-02-01 Thread Timo Walther
Hi Patrick, I could imagine that LEAD/LAG are translated into RANK/ROW_NUMBER operations that are not supported in this context. But I will loop in @Jark who might know more about the limitaitons here. Regards, Timo On 29.01.21 17:37, Patrick Angeles wrote: Another (hopefully newbie)

Re: Catalog(Kafka Connectors 的ddl)持久化到hive metastore,groupid一样的问题

2021-02-01 Thread 孙啸龙
非常谢谢 > 在 2021年1月30日,下午9:18,JasonLee <17610775...@163.com> 写道: > > hi > > 社区以及提供了动态修改表属性的功能,具体使用可以参考 https://mp.weixin.qq.com/s/nWKVGmAtENlQ80mdETZzDw > > > > - > Best Wishes > JasonLee > -- > Sent from: http://apache-flink.147419.n8.nabble.com/

Re: [Stateful Functions] Problems with Protobuf Versions

2021-02-01 Thread Igal Shilman
Adding user@flink (was accidentally omitted previously) On Fri, Jan 29, 2021 at 5:18 PM Igal Shilman wrote: > Hi Jan, > > Glad to hear that 3.71 and 3.3.0 works for you. You can still include > protobuf in your project, but the version needs to be compatible with > what you will find at

Re: Integration with Apache AirFlow

2021-02-01 Thread Flavio Pompermaier
Any advice here? On Wed, Jan 27, 2021 at 9:49 PM Flavio Pompermaier wrote: > Hello everybody, > is there any suggested way/pointer to schedule Flink jobs using Apache > AirFlow? > What I'd like to achieve is the submission (using the REST API of AirFlow) > of 2 jobs, where the second one can be

Re: question on checkpointing

2021-02-01 Thread Chesnay Schepler
1) An operator that just blocks for a long time (for example, because it does a synchronous call to some external service) can indeed cause a checkpoint timeout. 2) What kind of effects are you worried about? On 1/28/2021 8:05 PM, Marco Villalobos wrote: Is it possible that checkpointing

Re: Cannot access state from a empty taskmanager - using kubernetes

2021-02-01 Thread Chesnay Schepler
Yes, it does sound quite a lot like FLINK-10225. I assume it is only happening for some task executors and not all of them? Unfortunately I don't think this issue will be fixed anytime soon. On 1/28/2021 12:59 PM, Daniel Peled wrote: Hi, We have followed the instructions in the following

Re: Question about setNestedFileEnumeration()

2021-02-01 Thread Piotr Nowojski
Hi Billy, Could you maybe share some minimal code reproducing the problem? For example I would suggest to start with reading from local files with some trivial application. Best Piotrek pt., 22 sty 2021 o 00:21 Billy Bain napisał(a): > I have a Streaming process where new directories are

Re: Question

2021-02-01 Thread Chesnay Schepler
Could you expand a bit on what you mean? Are you referring to /savepoints/? On 1/28/2021 3:24 PM, Abu Bakar Siddiqur Rahman Rocky wrote: Hi, Is there any library to use and remember the apache flink snapshot? Thank you -- Regards, Abu Bakar Siddiqur Rahman

Re: [Stateful Functions] Problems with Protobuf Versions

2021-02-01 Thread Tzu-Li (Gordon) Tai
Hi, This hints an incompatible Protobuf generated class by the protoc compiler, and the runtime dependency used by the code. Could you try to make sure the `protoc` compiler version matches the Protobuf version in your code? Cheers, Gordon On Fri, Jan 29, 2021 at 6:07 AM Jan Brusch wrote: >

Re: Comment in source code of CoGroupedStreams

2021-02-01 Thread Piotr Nowojski
Hi Sudharsan, Sorry for maybe a bit late response, but as far as I can tell, this comment refers to this piece of code: public void apply(KEY key, W window, Iterable> values, Collector out) throws Exception { List oneValues = new ArrayList<>();

Re: stopping with save points

2021-02-01 Thread Chesnay Schepler
It would be good if you could take a look at the Job-/TaskManager logs to see whether the operation is making progress or whether an exception has occurred. Does the job stop eventually? It could be that draining the jobs just takes longer than the client timeout allows by default (60

Re: Flink 1.11.2 test cases fail with Scala 2.12.12

2021-02-01 Thread Chesnay Schepler
Scala 2.12.8 broke binary compatibility with 2.12.7 which Flink currently is compiled against. As a result you must either stay at 2.12.7, or recompile Flink yourself against 2.12.12 as shown here

??????????: ????: ??????????

2021-02-01 Thread ???????L
,??1.12?? EnvironmentSettings settings = EnvironmentSettings.newInstance().inStreamingMode().useBlinkPlanner().build(); StreamExecutionEnvironment executionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment(); StreamTableEnvironment streamTableEnv =

????: ????: ??????????

2021-02-01 Thread amenhub
Flink ???L ?? 2021-02-01 17:56 user-zh ?? ??: ?? , ?? Configuration configuration = streamTableEnv.getConfig().getConfiguration(); configuration.setString("table.exec.source.idle-timeout","1000 ms");

??????????: ??????????

2021-02-01 Thread ???????L
, ?? Configuration configuration = streamTableEnv.getConfig().getConfiguration(); configuration.setString("table.exec.source.idle-timeout","1000 ms"); ---- ??:

????: ??????????

2021-02-01 Thread amenhub
hi, idle source??[1] [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/event_timestamps_watermarks.html#dealing-with-idle-sources best, amenhub ???L ?? 2021-02-01 17:20 user-zh ?? ?? flink1.12,

Re: flink做checkpoint失败 Checkpoint Coordinator is suspending.

2021-02-01 Thread chen310
补充下,jobmanager日志异常: 2021-02-01 08:54:43,639 ERROR org.apache.flink.runtime.rest.handler.job.JobDetailsHandler [] - Exception occurred in REST handler: Job 65892aaedb8064e5743f04b54b5380df not found 2021-02-01 08:54:44,642 ERROR org.apache.flink.runtime.rest.handler.job.JobDetailsHandler [] -

Flink 1.12 SQL 语法,是否完全兼容 Flink 1.10 的 SQL 语法

2021-02-01 Thread LakeShen
Hello 社区, 最近开始考虑整理 Flink 1.10 升级到 1.12 的整体收益,想问下, Flink 1.12 SQL 语法是否完全兼容 1.10 版本的 SQL 语法,我的理解应该是兼容的. Best, LakeShen

??????????

2021-02-01 Thread ???????L
flink1.12, kafka??3??, flink??3??. ??, , ??,, ?

Re: Unknown call expression: avg(amount) when use distinct() in Flink Thanks~!

2021-02-01 Thread Timo Walther
Hi, sorry I forgot to further investigate this issue. It seems the last refactoring of the code base caused this documented feature to break. I opened an issue for it: https://issues.apache.org/jira/browse/FLINK-21225 For now, I would suggest to use SQL for the same behavior. I hope

Re: Flink SQL and checkpoints and savepoints

2021-02-01 Thread Timo Walther
I agree with Max. Within the same Flink release you can perform savepoints and sometimes also change parts of the query. But the latter depends on a case-by-case basis and needs to be tested. Regards, Timo On 30.01.21 11:43, Maximilian Michels wrote: It is true that there are no strict

TwoPhaseCommitSinkFunction 中 initializeState 代码的困惑

2021-02-01 Thread Zeahoo
大家好, 在阅读源码中看到以下代码片段,initializeState 方法内有如下代码: if (context.isRestored()) { LOG.info("{} - restoring state", name()); for (State operatorState : state.get()) { userContext = operatorState.getContext();

Re: Newbie question: Machine Learning Library of Apache Flink

2021-02-01 Thread Timo Walther
Hi, it is true that there is no dedicated machine learning library for Flink. Flink is a general data processing framework. It allows to embedded any available algorithm library within user-defined functions. Flink's focus is on stream processing. There are not many dedicated stream

Re: [ANNOUNCE] Apache Flink 1.10.3 released

2021-02-01 Thread Matthias Pohl
Yes, thanks for taking over the release! Best, Matthias On Mon, Feb 1, 2021 at 5:04 AM Zhu Zhu wrote: > Thanks Xintong for being the release manager and everyone who helped with > the release! > > Cheers, > Zhu > > Dian Fu 于2021年1月29日周五 下午5:56写道: > >> Thanks Xintong for driving this release!

flink做checkpoint失败 Checkpoint Coordinator is suspending.

2021-02-01 Thread chen310
flink做checkpoint一直失败,请教下是啥原因 -- Sent from:

关于kafka中csv格式数据字段分隔符请教

2021-02-01 Thread yinghua...@163.com
今天在文档中发现1.12版本支持不可见字符的配置的,如下: csv.field分隔符 任选,串字段分隔符(默认','),必须为单字符。你可以使用反斜杠字符指定一些特殊字符,例如'\t'代表制表符。你也可以通过unicode的编码在纯SQL文本中指定一些特殊字符,例如'csv.field-delimiter' = U&'\0001'代表0x01字符。 支持'csv.field-delimiter' = U&'\0001' 这种写法,但是1.11版本的说明里没看到这种说明,是不是1.11不支持= U&'\0001'这种写法? yinghua...@163.com