Many strange measurements in the InfluxDB metrics database

2021-03-15 Thread Tim yu
Hi all, I run many Flink jobs that contain SQL, and they report their metrics to InfluxDB. I find many strange measurements in the InfluxDB metrics database, e.g. "from table1". Does SQL produce those measurements? What are the meanings of those measurements? -- tim yu

Is it possible to mount node local disk for task managers in a k8s application cluster?

2021-03-15 Thread Chen-Che Huang
Hi, We use the per-job deployment mode to deploy our Flink services on Kubernetes. We're considering moving from the per-job mode to the application mode in view of the advantages of the application mode. However, it seems that `bin/flink run-application --target kubernetes-application` does

Re: Application cluster - Best Practice

2021-03-15 Thread Yang Wang
I have created a ticket FLINK-21807 [1] to track this requirement. [1] https://issues.apache.org/jira/browse/FLINK-21807 Best, Yang Tamir Sagi wrote on Tuesday, March 16, 2021 at 1:11 AM: > Hey Yang, > > The operator gave me a good lead by "revealing" that Application Deployer > does exist and there is a way to

Reply: flink yarn-perjob submitted job fails to start

2021-03-15 Thread lian
Two possibilities: Case 1: the jar was packaged incompletely; try repackaging. Case 2: missing dependencies. On March 15, 2021 at 21:59, 刘朋强 wrote: Problem: I submit a job to yarn-cluster with the following command: flink run -m yarn-cluster -yjm 1024m -ytm 2048m -c org.apache.flink.streaming.examples.wordcount.WordCount /home/lpq/flink-examples-streaming_2.11.jar On the Flink UI the taskmanager count is always 0; the job fails to start with no error message, and I don't know how to troubleshoot

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-15 Thread Alexey Trenikhun
No, I believe the original exception was from 1.12.1 to 1.12.1. Thanks, Alexey From: Yun Tang Sent: Monday, March 15, 2021 8:07:07 PM To: Alexey Trenikhun ; Tzu-Li (Gordon) Tai ; user@flink.apache.org Subject: Re: EOFException on attempt to scale up job with

Re: Flink + Hive + Compaction + Parquet?

2021-03-15 Thread Kurt Young
Hi Theo, Regarding your first 2 questions, the answer is yes: Flink supports streaming writes to Hive, and Flink also supports automatically compacting small files during streaming writes [1]. (Hive and Filesystem share the same compaction mechanism; we forgot to add a dedicated document for

Re: Some questions about using pyflink

2021-03-15 Thread xiaoyue
Hello, we are also currently doing research and experiments on distributed computation with pyflink plus pandas, so for your questions I can only share some experience. pyflink should eventually be consolidated into DataStream, so DataSet will probably no longer be supported; I don't recommend converting between Table and DataFrame via table.to_pandas() in the middle of a computation, as it hurts performance; the most efficient approach we've found so far is to define pandas-type udf/udaf functions, but compared with the same approach through the Java API, pyflink is still much slower;

Re: Evenly Spreading Out Source Tasks

2021-03-15 Thread Xintong Song
If all the tasks have the same parallelism 36, your job should only allocate 36 slots. The evenly-spread-out-slots option should help in your case. Is it possible for you to share the complete jobmanager logs? Thank you~ Xintong Song On Tue, Mar 16, 2021 at 12:46 AM Aeden Jameson wrote: >
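The option Xintong refers to is `cluster.evenly-spread-out-slots`. A minimal `flink-conf.yaml` sketch of the non-default setting discussed in this thread:

```yaml
# Prefer TaskManagers with the most free slots when allocating,
# spreading tasks evenly instead of filling one TM before the next.
cluster.evenly-spread-out-slots: true
```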

Re: Time Temporal Join

2021-03-15 Thread Satyam Shekhar
Hello folks, I would love to hear your feedback on this. Regards, Satyam On Wed, Mar 10, 2021 at 6:53 PM Satyam Shekhar wrote: > Hello folks, > > I am looking to enrich rows from an unbounded streaming table by > joining it with a bounded static table while preserving rowtime for the >

Re: Prefix Seek RocksDB

2021-03-15 Thread Yun Tang
Hi Rex, You could configure prefix seek via RocksDB's column family options [1]. Be careful to use the correct prefix extractor. [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/state_backends.html#passing-options-factory-to-rocksdb Best

Re: Question about session_aggregate.merging-window-set.rocksdb_estimate-num-keys

2021-03-15 Thread Yun Tang
Hi, Could you describe what you observed in detail? Which states are you comparing with the session window state "merging-window-set", the "newKeysInState" or the "existingKeysInState"? BTW, since we use list state as the main state for the window operator and we use RocksDB's merge operation for window state

Unit Testing for Custom Metrics in Flink

2021-03-15 Thread Rion Williams
Hi all, Recently, I was working on adding some custom metrics to a Flink job that required the use of dynamic labels (i.e. capturing various counters that were "slicable" by things like tenant / source, etc.). I ended up handling it in a very naive fashion that would just keep a dictionary of
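The "dictionary of counters" approach Rion describes can be sketched outside Flink as a plain registry keyed by metric name plus label pairs; all class and method names below are illustrative, not part of Flink's metric API:

```python
from collections import defaultdict

class LabeledCounterRegistry:
    """Naive registry: one counter per unique (name, labels) combination."""

    def __init__(self):
        self._counters = defaultdict(int)

    def inc(self, name, amount=1, **labels):
        # Key by metric name plus sorted label pairs so that
        # {"tenant": "a", "source": "x"} and {"source": "x", "tenant": "a"}
        # resolve to the same counter.
        key = (name, tuple(sorted(labels.items())))
        self._counters[key] += amount

    def value(self, name, **labels):
        return self._counters[(name, tuple(sorted(labels.items())))]

registry = LabeledCounterRegistry()
registry.inc("events", tenant="acme", source="kafka")
registry.inc("events", tenant="acme", source="kafka")
registry.inc("events", tenant="other", source="kafka")
print(registry.value("events", tenant="acme", source="kafka"))  # 2
```

Sorting the label pairs makes the counter key order-insensitive, which is the main pitfall of the naive dictionary approach.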

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-15 Thread Yun Tang
Hi, Can you scale the job at the same version from 1.12.1 to 1.12.1? Best Yun Tang From: Alexey Trenikhun Sent: Tuesday, March 16, 2021 4:46 To: Tzu-Li (Gordon) Tai ; user@flink.apache.org Subject: Re: EOFException on attempt to scale up job with RocksDB

Re: Checkpoint fail due to timeout

2021-03-15 Thread Alexey Trenikhun
Hi Roman, I took thread dump: "Source: digital-itx-eastus2 -> Filter (6/6)#0" Id=200 BLOCKED on java.lang.Object@5366a0e2 owned by "Legacy Source Thread - Source: digital-itx-eastus2 -> Filter (6/6)#0" Id=202 at

Re: Why does Flink FileSystem sink splits into multiple files

2021-03-15 Thread Yik San Chan
Thank you, it works. Best, Yik San Chan On Mon, Mar 15, 2021 at 5:30 PM David Anderson wrote: > The first time you ran it without having specified the parallelism, and so > you got the default parallelism -- which is greater than 1 (probably 4 or > 8, depending on how many cores your computer

Re: Got “pyflink.util.exceptions.TableException: findAndCreateTableSource failed.” when running PyFlink example

2021-03-15 Thread Yik San Chan
Thanks for your help, it works. Best, Yik San Chan On Tue, Mar 16, 2021 at 10:03 AM Xingbo Huang wrote: > Hi, > > The problem is that the legacy DataSet you are using does not support the > FileSystem connector you declared. You can use blink Planner to achieve > your needs. > > >>> >

Connection reset by peer

2021-03-15 Thread yidan zhao
The job automatically restarts after an exception; the log is below. Could anyone help analyze the problem? 2021-03-16 00:00:06 org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: readAddress(..) failed: Connection reset by peer (connection to ' 10.35.100.171/10.35.100.171:2016') at org.apache.flink.runtime.io.network.netty.

Re: Got “pyflink.util.exceptions.TableException: findAndCreateTableSource failed.” when running PyFlink example

2021-03-15 Thread Xingbo Huang
Hi, The problem is that the legacy DataSet you are using does not support the FileSystem connector you declared. You can use blink Planner to achieve your needs. >>> t_env = BatchTableEnvironment.create( environment_settings=EnvironmentSettings.new_instance()

Re: Flink 1.12: submitting a job in yarn-application mode fails

2021-03-15 Thread todd
I set the following options in the Flink YAML file: HADOOP_CONF_DIR: execution.target: yarn-application yarn.provided.lib.dirs: hdfs://... pipeline.jars: hdfs://... So I'm not sure how you configured yarn-application. -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: Python StreamExecutionEnvironment from_collection Kafka example

2021-03-15 Thread Xingbo Huang
Hi, From the error message, I think the problem is that there is no Python interpreter on your TaskManager machine. You need to install a Python 3.5+ interpreter on the TM machine, and that Python environment needs to have pyflink installed (pip install apache-flink). For details, you can refer to the document [1].

Re: Pyflink dataset does not support map reduce functions

2021-03-15 Thread Dian Fu
Hi, a few questions: 1) What exactly do you mean by map reduce functions? Can you give an example? 2) Does the DataSet API refer to the Java DataSet API? Also, the Java DataSet API will be gradually deprecated and unified onto the DataStream API, so PyFlink will not support the DataSet API; it only supports the Python Table API and the Python DataStream API > On March 13, 2021 at 10:54 AM, nova.he wrote: > > Hello, > >

Prefix Seek RocksDB

2021-03-15 Thread Rex Fenley
Hello! I'm wondering if Flink RocksDB state backend is pre-configured to have Prefix Seeks enabled, such as for Joins and Aggs on the TableAPI [1]? If not, what's the easiest way to configure this? I'd imagine this would be beneficial. Thanks! [1]

Are there any examples of flink + hudi or iceberg + aliyun oss?

2021-03-15 Thread casel.chen
Are there any examples of flink + hudi or iceberg + aliyun oss? Thanks!

Re: Attach remote debugger to task executor

2021-03-15 Thread Jaffe, Julian
You can use `env.java.opts.taskmanager` to specify java options for the task managers specifically. Be aware you may want to set `suspend=n` or be sure to attach your debugger promptly, otherwise the task manager may time out attempting to connect to the job manager (since it’s waiting for you
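Julian's suggestion maps to a `flink-conf.yaml` entry like the one below; the port 5005 and `suspend=n` are illustrative choices, not required values:

```yaml
# Open a JDWP debug port on each TaskManager JVM. suspend=n lets the
# TM start without waiting for a debugger, so it can still register
# with the JobManager in time.
env.java.opts.taskmanager: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"
```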

Attach remote debugger to task executor

2021-03-15 Thread Reggie Quimosing
I'm running flink locally via ./start-cluster.sh, and submitting my job via ./flink run . I can attach a debugger to the job client process using either: export

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-15 Thread Alexey Trenikhun
Savepoint was taken with 1.12.1, I've tried to scale up using same version and 1.12.2 From: Tzu-Li (Gordon) Tai Sent: Monday, March 15, 2021 12:06 AM To: user@flink.apache.org Subject: Re: EOFException on attempt to scale up job with RocksDB state backend Hi,

parquet protobuf output and aws athena support

2021-03-15 Thread Jin Yi
Using ParquetProtoWriters, does anyone have this working with AWS Athena ingestion via AWS Glue crawlers? The Parquet files being generated by our Flink job look

Re: Python StreamExecutionEnvironment from_collection Kafka example

2021-03-15 Thread Robert Cullen
Okay, I added the jars and fixed that exception. However I have a new exception that is harder to decipher: 2021-03-15 14:46:20 org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy at

Re: Working with DataStreams of Java objects in Pyflink?

2021-03-15 Thread Kevin Lam
Hi Shuiqiang Chen, Thanks for the quick response. Oh I see, that's too bad POJO is not currently supported. I'd like to check if I understand your suggestion about RowType. You're suggesting something like: 1/ Define subclasses of RowType in Java/Scala to hold our java objects we want to

Re: Application cluster - Best Practice

2021-03-15 Thread Tamir Sagi
Hey Yang, The operator gave me a good lead by "revealing" that Application Deployer does exist and there is a way to do what I was looking for programmatically. Just a quick note, the operator uses the same Flink methods internally to deploy the application cluster but also provide a way to

Re: Evenly Spreading Out Source Tasks

2021-03-15 Thread Aeden Jameson
Hi Xintong, Thanks for replying. Yes, you understood my scenario. Every task has the same parallelism since we're using FlinkSql unless there is a way to change the parallelism of the source task that I have missed. Your explanation of the setting makes sense and is what I ended up

Re: Working with DataStreams of Java objects in Pyflink?

2021-03-15 Thread Shuiqiang Chen
Hi Kevin, Currently, POJO type is not supported in Python DataStream API because it is hard to deal with the conversion between Python Objects and Java Objects. Maybe you can use a RowType to represent the POJO class such as Types.ROW_NAME([id, created_at, updated_at], [Types.LONG(),

Re: Python StreamExecutionEnvironment from_collection Kafka example

2021-03-15 Thread Robert Metzger
Hey, are you sure the class is in the lib/ folder of all machines / instances, and you've restarted Flink after adding the files to lib/ ? On Mon, Mar 15, 2021 at 3:42 PM Robert Cullen wrote: > Shuiqiang, > > I added the flink-connector-kafka_2-12 jar to the /opt/flink/lib directory > > When

Re: Python StreamExecutionEnvironment from_collection Kafka example

2021-03-15 Thread Robert Cullen
Shuiqiang, I added the flink-connector-kafka_2-12 jar to the /opt/flink/lib directory When submitting this job to my flink cluster I’m getting this stack trace at runtime: org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy at

Working with DataStreams of Java objects in Pyflink?

2021-03-15 Thread Kevin Lam
Hi all, Looking to use Pyflink to work with some scala-defined objects being emitted from a custom source. When trying to manipulate the objects in a pyflink defined MapFunction

flink yarn-perjob submitted job fails to start

2021-03-15 Thread 刘朋强
Problem: I submit a job to yarn-cluster with the following command: flink run -m yarn-cluster -yjm 1024m -ytm 2048m -c org.apache.flink.streaming.examples.wordcount.WordCount /home/lpq/flink-examples-streaming_2.11.jar On the Flink UI the taskmanager count is always 0; the job fails to start with no error message, and I don't know how to troubleshoot. yarn UI flink ui yarn container log down cluster because

Re: Flink shuffle vs rebalance

2021-03-15 Thread Kezhu Wang
ShufflePartitioner: public int selectChannel(SerializationDelegate> record) { return random.nextInt(numberOfChannels); } RebalancePartitioner public int selectChannel(SerializationDelegate> record) { nextChannelToSendTo = (nextChannelToSendTo + 1) %
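The two snippets above differ only in how the next channel is chosen: shuffle picks a channel at random, while rebalance cycles through channels round-robin (the real RebalancePartitioner starts from a random initial channel). A small Flink-free Python simulation of the two strategies:

```python
import random

def shuffle_channels(num_records, num_channels, seed=0):
    """ShufflePartitioner semantics: uniformly random channel per record."""
    rng = random.Random(seed)
    return [rng.randrange(num_channels) for _ in range(num_records)]

def rebalance_channels(num_records, num_channels):
    """RebalancePartitioner semantics: strict round-robin over channels
    (simplified: the real one begins at a random initial channel)."""
    return [i % num_channels for i in range(num_records)]

rr = rebalance_channels(8, 4)
print(rr)  # [0, 1, 2, 3, 0, 1, 2, 3] -- perfectly even distribution

sh = shuffle_channels(8, 4)
print(all(0 <= c < 4 for c in sh))  # True -- random, but always in range
```

Both spread records across all downstream channels; only rebalance guarantees an exactly even split per round.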

Re: Netty LocalTransportException: Sending the partition request to 'null' failed

2021-03-15 Thread Robert Metzger
Hey Matthias, are you sure you can connect to 127.0.1.1, since everything between 127.0.0.1 and 127.255.255.255 is bound to the loopback device?: https://serverfault.com/a/363098 On Mon, Mar 15, 2021 at 11:13 AM Matthias Seiler < matthias.sei...@campus.tu-berlin.de> wrote: > Hi Arvid, > > I

Flink shuffle vs rebalance

2021-03-15 Thread 赢峰
What is the difference between the shuffle partitioner and the rebalance partitioner in Flink?

Re: Flink parameter question

2021-03-15 Thread Yik San Chan
Hi lxk, Can you please translate the question to English, and provide more info so that people can help? Thanks. On Mon, Mar 15, 2021 at 2:20 PM lxk7...@163.com wrote: > > 大佬们,我现在flink的版本是flink 1.10,但是我通过-ynm 指定yarn上的任务名称不起作用,一直显示的是Flink per-job > cluster > > -- >

Re: Questions with State Processor Api

2021-03-15 Thread Roman Khachatryan
Hi Yuri, I think you can achieve this by using "normal" Flink operators and sinks. One thing that immediately comes to mind is timers [1]. It should be simpler to implement and set up than with the State Processor API (though it seems doable via this API too). [1]

Re: Flink 1.12: submitting a job in yarn-application mode fails

2021-03-15 Thread Congxian Qiu
Hi, from your log, the reason the job failed to start is: Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://xx/flink120/, expected: file:/// It looks like the address you configured does not match the required schema; you need to fix this first. Best, Congxian todd wrote on Monday, March 15, 2021 at 2:22 PM: > Submitting a Flink job via script, with the command: > /bin/flink run-application -t yarn-application >

Re: Flink Read S3 Intellij IDEA Error

2021-03-15 Thread Robert Metzger
Since this error is happening in your IDE, I would recommend using the IntelliJ debugger to follow the filesystem initialization process and see where it fails to pick up the credentials. On Fri, Mar 12, 2021 at 11:11 PM sri hari kali charan Tummala < kali.tumm...@gmail.com> wrote: > Same error.

Flink Temporal Join Two union Hive Table Error

2021-03-15 Thread macia kk
Hi, please help take a look at this problem: create the view promotionTable: SELECT *, 'CN' as country, id as pid FROM promotion_cn_rule_tab UNION SELECT *, 'JP' as country, id as pid FROM promotion_jp_rule_tab Flink SQL query: SELECT t1.country, t1.promotionId, t1.orderId, CASE WHEN t2.pid IS NULL THEN 'Rebate' ELSE 'Rebate'

Re: Netty LocalTransportException: Sending the partition request to 'null' failed

2021-03-15 Thread Matthias Seiler
Hi Arvid, I listened to ports with netcat and connected via telnet and each node can connect to the other and itself. The `/etc/hosts` file looks like this ``` 127.0.0.1   localhost 127.0.1.1   node-2.example.com   node-2    node-1 ``` Is the second line the reason it fails? I also replaced all
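As Robert's reply elsewhere in this thread points out, the `127.0.1.1` line maps the node's hostname to a loopback address that other machines cannot reach. A hedged sketch of a hosts file that avoids this (the `192.168.x.x` addresses are placeholders for the nodes' real, routable IPs):

```
127.0.0.1     localhost
192.168.0.1   node-1.example.com   node-1
192.168.0.2   node-2.example.com   node-2
```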

Re: Why does Flink FileSystem sink splits into multiple files

2021-03-15 Thread David Anderson
The first time you ran it without having specified the parallelism, and so you got the default parallelism -- which is greater than 1 (probably 4 or 8, depending on how many cores your computer has). Flink is designed to be scalable, and to achieve that, parallel instances of an operator, such as
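David's explanation can be illustrated with a Flink-free sketch: records are distributed across `parallelism` writers, and each writer produces its own part file (the routing and file naming here are illustrative, not Flink's actual partitioning or `part-*` scheme):

```python
def write_with_parallelism(records, parallelism):
    """Each parallel sink subtask writes only the records routed to it."""
    files = {f"part-{i}": [] for i in range(parallelism)}
    for n, record in enumerate(records):
        files[f"part-{n % parallelism}"].append(record)  # round-robin routing
    return files

out = write_with_parallelism(["a", "b", "c", "d", "e"], parallelism=4)
print(len(out))  # 4 -- one output file per parallel instance

out1 = write_with_parallelism(["a", "b", "c", "d", "e"], parallelism=1)
print(len(out1))  # 1 -- a single file, as with parallelism 1
```

Setting the job (or sink) parallelism to 1 is what collapses the output into a single file, at the cost of losing parallel writes.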

Re: Application cluster - Best Practice

2021-03-15 Thread Yang Wang
Hi Tamir, Thanks for sharing the information. I think you are right. If you dive into the code implementation of flink-native-k8s-operator[1], they are just same. @Till Rohrmann Do you think it is reasonable to make "ApplicationDeployer" interface as public? Then I believe it will be great

Re: Checkpoint fail due to timeout

2021-03-15 Thread Roman Khachatryan
Hello Alexey, Thanks for the details. It looks like backpressure is indeed the cause of the issue. You can check that by looking at the (succeeded) checkpoint start delay in the tasks following the suspected source (digital-itx-eastus2?). To be sure, you can take a thread dump (or profile) those

Re: Why does Upsert Kafka require the format to be INSERT-ONLY

2021-03-15 Thread Shengkai Fang
Hi. For table scan: - +I and +U are both treated as insert messages; ChangelogNormalize will then convert them to the correct type; - when scanning, the value part of a tombstone message is empty, so we directly set the type to delete, and ChangelogNormalize fills in the value part; - -U messages are never stored in upsert-kafka. For details, see the slides here [1]. Best, Shengkai [1]

Re: Flink + Hive + Compaction + Parquet?

2021-03-15 Thread Flavio Pompermaier
What about using Apache Hudi or Apache Iceberg? On Thu, Mar 4, 2021 at 10:15 AM Dawid Wysakowicz wrote: > Hi, > > I know Jingsong worked on Flink/Hive filesystem integration in the > Table/SQL API. Maybe he can shed some light on your questions. > > Best, > > Dawid > On 02/03/2021 21:03, Theo

Handle late message with flink SQL

2021-03-15 Thread Yi Tang
We can get a side stream from the DataStream API via SideOutput, but it's hard to do the same thing with Flink SQL. I have an idea about how to get the late records while using Flink SQL: assuming we have a source table for the late records, then we can query late records on it. Obviously, it's not a

Re: flink 1.12 streaming write to Hive fails: Failed to create Hive RecordWriter

2021-03-15 Thread yinghua...@163.com
Caused by: java.lang.OutOfMemoryError: Java heap space yinghua...@163.com From: william Sent: 2021-03-15 16:32 To: user-zh Subject: flink 1.12 streaming write to Hive fails: Failed to create Hive RecordWriter flink 1.12 hadoop 2.7.5 hive 2.3.6 Error: 2021-03-15 16:29:43

Why does Flink FileSystem sink splits into multiple files

2021-03-15 Thread Yik San Chan
The question is cross-posted on StackOverflow https://stackoverflow.com/questions/66634813/why-does-flink-filesystem-sink-splits-into-multiple-files . I want to use Flink to read from an input file, do some aggregation, and write the result to an output file. The job is in batch mode. See

flink 1.12 streaming write to Hive fails: Failed to create Hive RecordWriter

2021-03-15 Thread william
flink 1.12 hadoop 2.7.5 hive 2.3.6 Error: 2021-03-15 16:29:43 org.apache.flink.connectors.hive.FlinkHiveException: org.apache.flink.table.catalog.exceptions.CatalogException: Failed to create Hive RecordWriter at

Re: Kubernetes HA - attempting to restore from wrong (non-existing) savepoint

2021-03-15 Thread Yang Wang
Feel free to share the terminated JobManager logs if you can reproduce this issue again. Maybe "kubectl logs {pod_name} --previous" could help. Best, Yang Alexey Trenikhun wrote on Monday, March 15, 2021 at 2:28 PM: > With 1.12.1 it happened quite often, with 1.12.2 not that much, I think I > saw it once or

Re: Question about pyflink LATERAL TABLE

2021-03-15 Thread 陈康
I've put together a simple reproducible example; please take a look. Thanks! -- Sent from: http://apache-flink.147419.n8.nabble.com/

RE: uniqueness of name when constructing a StateDescriptor

2021-03-15 Thread Colletta, Edward
Thank you. -Original Message- From: Tzu-Li (Gordon) Tai Sent: Monday, March 15, 2021 3:05 AM To: user@flink.apache.org Subject: Re: uniqueness of name when constructing a StateDescriptor

Re: Question about statement output results

2021-03-15 Thread Dian Fu
Oh, then you misunderstood. There are actually two sub-cases here: - sink1 and sink2 end up, after operator chaining, on the same physical node as the source: at execution time, in one JVM, each record is first sent to one sink and then to the other before the next record is processed; - sink1 and sink2 are on different physical nodes from the source: this is fully asynchronous; each record is sent separately over the network to sink1 and sink2, and since the network has buffers, the order in which sink1 and sink2 receive the data is completely nondeterministic.

Re: Relation between Two Phase Commit and Kafka's transaction aware producer

2021-03-15 Thread Tzu-Li (Gordon) Tai
+ user@f.a.o (adding the conversation back to the user mailing list) On Fri, Mar 12, 2021 at 6:06 AM Kevin Kwon wrote: > Thanks Tzu-Li > > Interesting algorithm. Is consumer offset also committed to Kafka at the > last COMMIT stage after the checkpoint has completed? > Flink does commit the

Re: EOFException on attempt to scale up job with RocksDB state backend

2021-03-15 Thread Tzu-Li (Gordon) Tai
Hi, Could you provide info on the Flink version used? Cheers, Gordon -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: uniqueness of name when constructing a StateDescriptor

2021-03-15 Thread Tzu-Li (Gordon) Tai
Hi, The scope is per individual operator, i.e. a single KeyedProcessFunction instance cannot have multiple registered state with the same name. Cheers, Gordon -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: [Statefun] Interaction Protocol for Statefun

2021-03-15 Thread Tzu-Li (Gordon) Tai
Hi, Interesting idea! Just some initial thoughts and questions, maybe others can chime in as well. In general I think the idea of supporting more high-level protocols on top of the existing StateFun messaging primitives is good. For example, what probably could be categorized under this effort

Re: Extracting state keys for a very large RocksDB savepoint

2021-03-15 Thread Tzu-Li (Gordon) Tai
Hi Andrey, Perhaps the functionality you described is worth adding to the State Processor API. Your observation on how the library currently works is correct; basically it tries to restore the state backends as is. In your current implementation, do you see it worthwhile to try to add this?

Reply: Why does Upsert Kafka require the format to be INSERT-ONLY

2021-03-15 Thread 刘首维
Hi Shengkai, thanks for the reply. Let me see if I understand: in ChangelogNormalize, 1. the RowKind is not honored, 2. null expresses a tombstone, 3. there is the overhead of keeping the full data. If my understanding is correct, then whether it is +U, +I, or -U/D, everything is treated as an insert, which causes the downstream to emit a record? We now have several custom formats; to adapt them to Upsert Kafka, does a format need to emit a record with value == null for D and (-U) events?

Re: Kubernetes HA - attempting to restore from wrong (non-existing) savepoint

2021-03-15 Thread Alexey Trenikhun
With 1.12.1 it happened quite often; with 1.12.2 not that much, I think I saw it once or twice in ~20 cancels. When it happened, the job actually restarted on cancel. I did not grab the log at that time, but chances are good that I will be able to reproduce it. Thanks, Alexey

Flink 1.12: submitting a job in yarn-application mode fails

2021-03-15 Thread todd
Submitting a Flink job via script, with the command: /bin/flink run-application -t yarn-application -Dyarn.provided.lib.dirs="hdfs://xx/flink120/" hdfs://xx/flink-example.jar --sqlFilePath /xxx/kafka2print.sql The Flink lib and the user jar have already been uploaded to the HDFS path, but the following error is thrown: --- The program

Re: Why does Upsert Kafka require the format to be INSERT-ONLY

2021-03-15 Thread Shengkai Fang
Hi. The original design was based on Kafka's compacted topics, and compacted topics have their own way of expressing a changelog, e.g. a value of null represents a tombstone message. From that angle, we interpret the data purely from Kafka's point of view rather than parsing it from the format's point of view. This of course introduces some issues, e.g. when reading data with upsert-kafka, a state must be maintained to remember all the keys that have been read. Best, Shengkai 刘首维 wrote on Monday, March 15, 2021 at 11:48 AM: > Hi all, > > >
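The key state Shengkai describes can be sketched as a simple keyed view over a compacted stream; this is a Flink-free illustration of the upsert/tombstone semantics, not upsert-kafka's implementation:

```python
def materialize(compacted_stream):
    """Interpret an upsert stream: the latest value wins per key,
    and a None value is a tombstone that deletes the key."""
    state = {}
    for key, value in compacted_stream:
        if value is None:
            state.pop(key, None)   # tombstone: delete the key
        else:
            state[key] = value     # insert or update: latest wins
    return state

stream = [("k1", "v1"), ("k2", "v2"), ("k1", "v1b"), ("k2", None)]
print(materialize(stream))  # {'k1': 'v1b'}
```

Note that `state` must remember every live key, which is exactly the overhead the thread discusses.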

Flink parameter question

2021-03-15 Thread lxk7...@163.com
Hi all, my current Flink version is 1.10, but specifying the job name on YARN with -ynm does not take effect; it always shows "Flink per-job cluster" lxk7...@163.com