Where can I find the MySQL retract stream table sink Java source code?

2020-03-24 Thread wangl...@geekplus.com.cn
Create one table from Kafka and another from MySQL using Flink SQL, then write a SQL statement that reads from Kafka and writes to MySQL: INSERT INTO mysqlTable SELECT status, COUNT(order_no) AS num FROM (SELECT order_no, LAST_VALUE(status) AS status FROM kafkaTable GROUP BY
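The statement above is truncated; a plausible completion follows (the inner GROUP BY order_no and the outer GROUP BY status are assumptions, not the poster's confirmed SQL; a StreamTableEnvironment tEnv with both tables registered is also assumed):

    // Hedged sketch of the full statement, built from the fragments above.
    tEnv.sqlUpdate(
        "INSERT INTO mysqlTable " +
        "SELECT status, COUNT(order_no) AS num " +
        "FROM (" +
        "  SELECT order_no, LAST_VALUE(status) AS status " +
        "  FROM kafkaTable " +
        "  GROUP BY order_no" +
        ") AS t " +
        "GROUP BY status");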

Re: Re: Where can I find the MySQL retract stream table sink Java source code?

2020-03-24 Thread wangl...@geekplus.com.cn
Seems it is here: https://github.com/apache/flink/tree/master/flink-connectors/flink-jdbc/src/main/java/org/apache/flink/api/java/io/jdbc There is no JDBCRetractTableSink, only append and upsert. I am confused why the MySQL record can be deleted. Thanks, Lei wangl...@geekplus.com.cn

Re: Re: Where can I find the MySQL retract stream table sink Java source code?

2020-03-24 Thread wangl...@geekplus.com.cn
Thanks Jingsong. So JDBCTableSink now supports append and upsert modes, and retract mode is not available yet. Is that right? Thanks, Lei wangl...@geekplus.com.cn Sender: Jingsong Li Send Time: 2020-03-25 11:39 Receiver: wangl...@geekplus.com.cn cc: user Subject: Re: Where can I find MySQL

Re: Re: Where can I find the MySQL retract stream table sink Java source code?

2020-03-24 Thread Jingsong Li
That is not what I meant. There are two kinds of queries: append queries and update queries. For an update query there are two ways to handle the updates: one is retract, the other is upsert. So a sink can choose either mode to handle an update query; choosing one is enough. You can read more in [1]. [1]
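To illustrate the distinction (a sketch; table and column names are invented, and a StreamTableEnvironment tEnv is assumed):

    // Append query: the result only ever receives inserts, so an append
    // sink is enough.
    Table appendOnly = tEnv.sqlQuery(
        "SELECT order_no, status FROM orders");

    // Update query: already-emitted rows can change later. A sink consumes
    // the changes either as retractions (delete old row + insert new row)
    // or as upserts (upsert/delete by key); it only needs to pick one mode.
    Table updating = tEnv.sqlQuery(
        "SELECT status, COUNT(*) AS cnt FROM orders GROUP BY status");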

Re: Re: Where can I find the MySQL retract stream table sink Java source code?

2020-03-24 Thread wangl...@geekplus.com.cn
Thanks Jingsong. When executing this SQL, the MySQL table record can be deleted, so I guessed it is a retract stream. I want to know exactly which Java code is generated for it and have a look at it. Thanks, Lei wangl...@geekplus.com.cn Sender: Jingsong Li Send Time: 2020-03-25 11:14 Receiver:

Re: Where can I find the MySQL retract stream table sink Java source code?

2020-03-24 Thread Jingsong Li
Hi, maybe you have some misunderstanding about the upsert sink. You can take a look at [1]; it can deal with "delete" records. [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/dynamic_tables.html#table-to-stream-conversion Best, Jingsong Lee On Wed, Mar 25, 2020 at 11:37

Re: Deploying native Kubernetes via yaml files?

2020-03-24 Thread Yang Wang
Hi Niels, I have created a ticket [1] to track yaml-file submission for the native K8s integration. Feel free to share your thoughts on this approach. [1]. https://issues.apache.org/jira/browse/FLINK-16760 Best, Yang Niels Basjes wrote on Tue, Mar 24, 2020 at 11:10 PM: > Thanks. > I'll have a

Re: Where can I find the MySQL retract stream table sink Java source code?

2020-03-24 Thread Jingsong Li
Hi, this can be an upsert stream [1]. Best, Jingsong Lee On Wed, Mar 25, 2020 at 11:12 AM wangl...@geekplus.com.cn < wangl...@geekplus.com.cn> wrote: > > Create one table from Kafka, another table from MySQL using Flink SQL. > Write a SQL statement to read from Kafka and write to MySQL. > > INSERT INTO

Re: Where can I find the MySQL retract stream table sink Java source code?

2020-03-24 Thread Jingsong Li
Hi, this can be an upsert stream [1], and JDBC has an upsert sink now [2]. [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/dynamic_tables.html#table-to-stream-conversion [2] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connect.html#jdbc-connector
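For reference, a hedged sketch of a 1.10-style JDBC sink DDL (all connection values are placeholders; in 1.10 the upsert key is derived from the query, e.g. from the GROUP BY discussed above):

    tEnv.sqlUpdate(
        "CREATE TABLE mysqlTable (\n" +
        "  status INT,\n" +
        "  num BIGINT\n" +
        ") WITH (\n" +
        "  'connector.type' = 'jdbc',\n" +
        "  'connector.url' = 'jdbc:mysql://localhost:3306/mydb',\n" +  // placeholder
        "  'connector.table' = 'status_count',\n" +                    // placeholder
        "  'connector.username' = 'user',\n" +
        "  'connector.password' = 'password',\n" +
        "  'connector.write.flush.max-rows' = '100'\n" +
        ")");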

Importance of calling mapState.clear() when no entries in mapState

2020-03-24 Thread Ilya Karpov
Hi, given: - flink 1.6.1 - stateful function with MapState mapState = //init logic; Is there any reason I should call mapState.clear() if I know beforehand that there are no entries in mapState (e.g. mapState.iterator().hasNext() returns false)? Thanks in advance!

Re: Issues with Watermark generation after join

2020-03-24 Thread Timo Walther
Hi Dominik, the big conceptual difference between DataStream and Table API is that record timestamps are part of the schema in Table API whereas they are attached internally to each record in DataStream API. When you call `y.rowtime` during a stream to table conversion, the runtime will
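A minimal sketch of such a conversion (field names are invented; assumes the usual flink-streaming and flink-table imports):

    // Stream-to-table conversion: "y.rowtime" exposes each record's internal
    // timestamp as a regular column "y" of the table schema.
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
    StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

    DataStream<Tuple2<String, Long>> stream = env
        .fromElements(Tuple2.of("a", 1000L), Tuple2.of("b", 2000L))
        .assignTimestampsAndWatermarks(
            new AscendingTimestampExtractor<Tuple2<String, Long>>() {
                @Override
                public long extractAscendingTimestamp(Tuple2<String, Long> e) {
                    return e.f1;  // second tuple field is the event timestamp
                }
            });

    // Two physical fields plus an appended event-time attribute "y".
    Table table = tEnv.fromDataStream(stream, "word, ts, y.rowtime");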

Re: Dynamic Flink SQL

2020-03-24 Thread Arvid Heise
Hi Krzysztof, from my past experience as a data engineer, I can safely say that users often underestimate the optimization potential and techniques of the systems they use. I implemented a similar thing in the past, where I parsed up to 500 rules reading from up to 10 data sources. The basic idea was

Re: Need help on timestamp type conversion for Table API on Pravega Connector

2020-03-24 Thread Timo Walther
This issue is tracked under: https://issues.apache.org/jira/browse/FLINK-16693 Could you provide a small reproducible example in the issue? I think that would help us resolve it quickly in the next minor release. Thanks, Timo On 20.03.20 03:28, b.z...@dell.com wrote: Hi,

Re: Object has non serializable fields

2020-03-24 Thread Kostas Kloudas
Hi Eyal, This is a known issue which is fixed now (see [1]) and will be part of the next releases. Cheers, Kostas [1] https://issues.apache.org/jira/browse/FLINK-16371 On Tue, Mar 24, 2020 at 11:10 AM Eyal Pe'er wrote: > > Hi all, > > I am trying to write a sink function that retrieves string

Re: Kerberos authentication for SQL CLI

2020-03-24 Thread Dawid Wysakowicz
Hi Gyula, as far as I can tell, the SQL CLI does not support Kerberos natively. The SQL CLI submits all queries to a running Flink cluster, so if you kerberize the cluster, the queries will use that configuration. On a different note, out of curiosity: what would you expect the SQL CLI to use

Re: Issues with Watermark generation after join

2020-03-24 Thread Timo Walther
Hi, 1) yes, with "partition" I meant "parallel instance". If the watermarking is correct in the DataStream API, the Table API and SQL will take care that it remains correct. E.g. you can only perform a TUMBLE window if the timestamp column has not lost its time attribute property. A regular
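For example (a sketch with invented table/column names, assuming a StreamTableEnvironment tEnv with an `events` table whose `ts` is a time attribute):

    // Accepted while "ts" has kept its time attribute property:
    Table hourly = tEnv.sqlQuery(
        "SELECT TUMBLE_START(ts, INTERVAL '1' HOUR) AS hour_start, COUNT(*) AS cnt " +
        "FROM events " +
        "GROUP BY TUMBLE(ts, INTERVAL '1' HOUR)");
    // After an operation that drops the property (e.g. a regular join),
    // the same query would be rejected by the planner.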

Re: Issues with Watermark generation after join

2020-03-24 Thread Timo Walther
Or better: "But for sources, you need to emit a watermark from all sources in order to have progress in event-time." On 24.03.20 13:09, Timo Walther wrote: Hi, 1) yes with "partition" I meant "parallel instance". If the watermarking is correct in the DataStream API. The Table API and SQL

Kerberos authentication for SQL CLI

2020-03-24 Thread Gyula Fóra
Hi! Does the SQL CLI support Kerberos Authentication? I am struggling to find any use of the SecurityContext in the SQL CLI logic but maybe I am looking in the wrong place. Thank you! Gyula

Object has non serializable fields

2020-03-24 Thread Eyal Pe'er
Hi all, I am trying to write a sink function that takes strings and creates compressed files in time buckets. The code is pretty straightforward and based on CompressWriterFactoryTest

RE: Need help on timestamp type conversion for Table API on Pravega Connector

2020-03-24 Thread B.Zhou
Hi, Thanks for the information. I replied in a comment on the issue: https://issues.apache.org/jira/browse/FLINK-16693?focusedCommentId=17065486&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17065486 Best Regards, Brian -Original Message- From: Timo

Re: How to calculate one alarm strategy for each device or one alarm strategy for each type of IOT device

2020-03-24 Thread Dawid Wysakowicz
Hi, could you elaborate a bit more on what you want to achieve? What have you tried so far? Could you share some code with us? What problems are you facing? From the vague description you provided, you should be able to design it with e.g. KeyedProcessFunction [1]. Best, Dawid [1]

Re: Issues with Watermark generation after join

2020-03-24 Thread Dominik Wosiński
Hey Timo, thanks a lot for this answer! I was mostly using the DataStream API, so it's good to know the difference. I have follow-up questions then, and would be glad for clarification: 1) So, for the SQL join operator, is the *partition* the parallel instance of the operator, or is it the table

Re: Importance of calling mapState.clear() when no entries in mapState

2020-03-24 Thread Dawid Wysakowicz
I think there should be no reason to do that. Best, Dawid On 24/03/2020 09:29, Ilya Karpov wrote: > Hi, > > given: > - flink 1.6.1 > - stateful function with MapState mapState = //init logic; > > Is there any reason I should call mapState.clear() if I know beforehand that > there are no

Re: Kerberos authentication for SQL CLI

2020-03-24 Thread Gyula Fóra
Thanks Dawid, I think you are right that most things should work just fine like this. Maybe some catalogs will need this at some point, but not right now. I was just wondering why this is different from how the CliFrontend works, which also installs the security context on the client side.

Re: Issue with single job yarn flink cluster HA

2020-03-24 Thread Andrey Zagrebin
Hi Dinesh, if the current leader crashes (e.g. due to network failures), then getting these messages does not look like a problem during the leader re-election. They look to me just like warnings that caused the failover. Do you observe any problem with your application? Does the failover not work, e.g.

Deploying native Kubernetes via yaml files?

2020-03-24 Thread Niels Basjes
Hi, as clearly documented at https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html, the current way of deploying Flink natively on Kubernetes is by running the ./bin/kubernetes-session.sh script, which runs some Java code that does "magic" to deploy it on the

Re: Deploying native Kubernetes via yaml files?

2020-03-24 Thread Ufuk Celebi
Hey Niels, you can check out the README with example configuration files here: https://github.com/apache/flink/tree/master/flink-container/kubernetes Is that what you were looking for? Best, Ufuk On Tue, Mar 24, 2020 at 2:42 PM Niels Basjes wrote: > Hi, > > As clearly documented here >

Re: Deploying native Kubernetes via yaml files?

2020-03-24 Thread Ufuk Celebi
PS: See also https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/kubernetes.html#flink-job-cluster-on-kubernetes On Tue, Mar 24, 2020 at 2:49 PM Ufuk Celebi wrote: > Hey Niels, > > you can check out the README with example configuration files here: >

Re: Lack of KeyedBroadcastStateBootstrapFunction

2020-03-24 Thread Dawid Wysakowicz
Hi, I am not very familiar with the State Processor API, but from a brief look at it, I think you are right. I think the State Processor API does not support mixing different kinds of states in a single operator for now. At least not in a nice way. Probably you could implement the

Re: Kerberos authentication for SQL CLI

2020-03-24 Thread Dawid Wysakowicz
I think the reason why it's different from the CliFrontend is that the SQL client is much younger and, as far as I know, never reached "production" readiness (as per the docs [1], it's still marked as Beta; plus see the first "Attention" ;) ). I think it certainly makes sense to have a proper

Re: Deploying native Kubernetes via yaml files?

2020-03-24 Thread Yang Wang
Hi Niels, currently a native-integration Flink cluster cannot be created via a yaml file. The reason we introduced the native integration is for users who are not familiar with K8s and kubectl; we want to make it easier for Flink users to deploy a Flink cluster on K8s. However, i

Re: java.lang.AbstractMethodError when implementing KafkaSerializationSchema

2020-03-24 Thread Steve Whelan
Hi Arvid, interestingly, my job runs successfully in a Docker container (image flink:1.9.0-scala_2.11) but is failing with the java.lang.AbstractMethodError on AWS EMR (non-Docker). I am compiling with Java version OpenJDK 1.8.0_242, which is the same version my EMR cluster is running. Though

Re: How to calculate one alarm strategy for each device or one alarm strategy for each type of IOT device

2020-03-24 Thread yang xu
Hi Dawid, I use Flink to compute IoT device alarms. In my scenario each device has an independent alarm strategy; for example, I detect that the temperature in 10 consecutive events from one device is higher than 10 degrees. I use: sourceStream.keyBy("deviceNo")
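A minimal sketch of one way to continue from that keyBy with a KeyedProcessFunction (the Event and Alarm types and all field names are invented, since the poster's code beyond the keyBy is not shown; standard Flink imports assumed):

    // Sketch: count consecutive over-threshold readings per device and alarm
    // at 10. A real job might instead hold per-device strategies in broadcast
    // state rather than hard-coding the threshold.
    sourceStream                                   // DataStream<Event>, assumed
        .keyBy("deviceNo")
        .process(new KeyedProcessFunction<Tuple, Event, Alarm>() {
            private transient ValueState<Integer> consecutiveHigh;

            @Override
            public void open(Configuration parameters) {
                consecutiveHigh = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("consecutive-high", Integer.class));
            }

            @Override
            public void processElement(Event e, Context ctx, Collector<Alarm> out)
                    throws Exception {
                int count = consecutiveHigh.value() == null ? 0 : consecutiveHigh.value();
                if (e.temperature > 10) {          // invented threshold field
                    count++;
                    if (count >= 10) {             // 10 consecutive high readings
                        out.collect(new Alarm(e.deviceNo));
                        count = 0;                 // reset after alarming
                    }
                } else {
                    count = 0;                     // streak broken
                }
                consecutiveHigh.update(count);
            }
        });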

Re: Capturing exceptions for sql-gateway INSERT INTO

2020-03-24 Thread godfrey he
hi, the SQL gateway currently supports fetching the job status (as long as the JobManager can still find the job). To get a job's exceptions, you can use the REST API provided by the JobManager [1]. [1] https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/rest_api.html#jobs-jobid-exceptions Best, Godfrey 111 wrote on Wed, Mar 25, 2020 at 10:38 AM: > Hi, > It does fail at execution time… if so, does Flink itself provide a mechanism for fetching the exception? >

Re: Capturing exceptions for sql-gateway INSERT INTO

2020-03-24 Thread 111
Hi, I can indeed get the jobId through the sql-gateway now, and I can fetch the job status. But since we use a yarn-session, the TaskManager is reclaimed after a failure and its logs are lost. Even connecting to the JobManager, I cannot get the logs of a TaskManager that is gone, right? Best, xinghalo On Mar 25, 2020 at 10:47, godfrey he wrote: hi, the SQL gateway currently supports fetching the job status (as long as the JobManager can still find the job). To get a job's exceptions, you can use the REST API provided by the JobManager [1] [1]

Re: Question about running PyFlink on Windows (my first contact with Flink)

2020-03-24 Thread jincheng sun
Glad to receive your email. I am not sure which video you watched, so it is hard to determine the cause of your problem. You can refer to the official documentation [1]; my personal blog also has some three-minute videos and written getting-started tutorials [2] that you can check first. If you still have questions, keep in touch by email or leave a comment on my blog! [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/tutorials/python_table_api.html [2] https://enjoyment.cool/ Best, Jincheng xu1990xaut wrote on Tue, Mar 24, 2020

Re: ddl es error

2020-03-24 Thread jinhai wang
Nice! You could open an improvement issue for it. Best Regards jinhai...@gmail.com > On Mar 25, 2020, at 1:40 PM, zhisheng wrote: > > hi, Leonard Xu > > The official ES DDL does not support a username and password for the ES cluster yet. I have extended it with this feature at my company. Does the community want this feature? How can I contribute it? > > Screenshot: http://zhisheng-blog.oss-cn-hangzhou.aliyuncs.com/2020-03-25-053948.png > > Best Wishes! > >

Re: Re: Failed to add a watermark when creating a source table in Flink SQL

2020-03-24 Thread flink小猪
Thanks for your reply. These are the JARs in my lib directory:

    flink-dist_2.11-1.10.0.jar
    flink-sql-connector-kafka_2.11-1.10.0.jar
    flink-table-blink_2.11-1.10.0.jar
    slf4j-log4j12-1.7.15.jar
    flink-json-1.10.0.jar
    flink-table_2.11-1.10.0.jar
    log4j-1.2.17.jar

Below is the exception thrown when submitting the job

Re: Capturing exceptions for sql-gateway INSERT INTO

2020-03-24 Thread godfrey he
Hi, you can try to fetch the logs of past jobs from YARN. Best, Godfrey 111 wrote on Wed, Mar 25, 2020 at 10:53 AM: > Hi, > I can indeed get the jobId through the sql-gateway now, and I can fetch the job status. But since we use a yarn-session, the TaskManager is reclaimed after a failure and its logs are lost. > Even connecting to the JobManager, I cannot get the logs of a TaskManager that is gone, right? > > Best, > xinghalo > On Mar 25, 2020 at 10:47, godfrey he wrote: > hi, sql

Re: Capturing exceptions for sql-gateway INSERT INTO

2020-03-24 Thread 111
Hi godfrey, 关于exceptions 这个rest api 的建议,试验了下,目前可以满足需求。非常感谢 ! Best, Xinghalo 在2020年03月25日 11:37,111 写道: Hi, 好的,我研究下哈。现在taskmanager的原理还不太熟,有问题再沟通 Best, Xinghalo 在2020年03月25日 11:01,godfrey he 写道: Hi,你可以尝试在yarn上去拿历史作业的日志 Best, Godfrey 111 于2020年3月25日周三 上午10:53写道: Hi,

Re: ddl es error

2020-03-24 Thread zhisheng
hi, Leonard Xu. The official ES DDL does not support a username and password for the ES cluster yet. I have extended it with this feature at my company. Does the community want this feature? How can I contribute it? Screenshot: http://zhisheng-blog.oss-cn-hangzhou.aliyuncs.com/2020-03-25-053948.png Best Wishes! zhisheng Leonard Xu wrote on Tue, Mar 24, 2020 at 5:53 PM: > Hi, 出发 > It looks like you are missing the dependency on the es connector [1], so only the built-in filesystem

Re: Capturing exceptions for sql-gateway INSERT INTO

2020-03-24 Thread godfrey he
hi, the SQL gateway currently returns the complete server-side exception stack to the user, for example: Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.rest.util.RestClientException: [Internal server error., 111 wrote on Wed, Mar 25, 2020 at 8:47 AM: > Hi, > I have recently been using the sql-gateway. When I use > StatementExecuteResponseBody body = getInstance().sendRequest( >

Re: Capturing exceptions for sql-gateway INSERT INTO

2020-03-24 Thread godfrey he
Currently the SQL gateway is only responsible for submitting jobs; it does not track job status. If a job fails to submit, the SQL gateway returns the corresponding error; if the job fails at execution time, the SQL gateway does not return an error. Check in the Flink web UI whether the job was submitted successfully. Best, Godfrey 111 wrote on Wed, Mar 25, 2020 at 10:29 AM: > > Hi, > I tried it; INSERT INTO does not work… and the sql-gateway backend logs show no errors at all. > I suspect it is the JDBC sink connector I am using, which internally works as a stream. Does a stream not propagate exceptions to the sql-gateway? > > On Mar 25, 2020

Re: [PyFlink] Error when writing data to Kafka in Csv() format, and the UDF fails to start when using a Python UDF

2020-03-24 Thread jincheng sun
Hi Zhefu, thanks for sharing the details of your solution; it is a great contribution to the community! 1. On the subscription issue: I would like to confirm whether you followed [1]. Taking the Chinese user list (user-zh@flink.apache.org) as an example, you need to send an email to user-zh-subscr...@flink.apache.org, i.e. append "subscribe" to the original list address. You will then receive a confirmation mail, "confirm subscribe to user-zh@flink.apache.org", which you need to reply to. 2. On the JAR conflict: flink-python

Re: Capturing exceptions for sql-gateway INSERT INTO

2020-03-24 Thread 111
Hi, I tried it; INSERT INTO does not work… and the sql-gateway backend logs show no errors at all. I suspect it is the JDBC sink connector I am using, which internally works as a stream. Does a stream not propagate exceptions to the sql-gateway? On Mar 25, 2020 at 10:26, godfrey he wrote: hi, the SQL gateway currently returns the complete server-side exception stack to the user, for example: Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.rest.util.RestClientException:

Re: Capturing exceptions for sql-gateway INSERT INTO

2020-03-24 Thread 111
Hi, it does fail at execution time… If it fails at execution time, does Flink itself provide a mechanism for fetching the exception? xinghalo (xingh...@163.com) On Mar 25, 2020 at 10:32, godfrey he wrote: Currently the SQL gateway is only responsible for submitting jobs; it does not track job status. If a job fails to submit, the SQL gateway returns the corresponding error; if the job fails at execution time, the SQL gateway does not return an error. Check in the Flink web UI whether the job was submitted successfully. Best, Godfrey 111 wrote on Wed, Mar 25, 2020 at 10:29 AM:

Re: Capturing exceptions for sql-gateway INSERT INTO

2020-03-24 Thread 111
Hi, OK, I will look into it. I am not yet familiar with how the TaskManager works; I will follow up if there are more questions. Best, Xinghalo On Mar 25, 2020 at 11:01, godfrey he wrote: Hi, you can try to fetch the logs of past jobs from YARN. Best, Godfrey 111 wrote on Wed, Mar 25, 2020 at 10:53 AM: Hi, I can indeed get the jobId through the sql-gateway now, and I can fetch the job status. But since we use a yarn-session, the TaskManager is reclaimed after a failure and its logs are lost.

Re: Thoughts on time zone handling for SQL DATE_FORMAT

2020-03-24 Thread DONG, Weike
Hi Zhenghua, thanks for your reply. Since Flink already provides a time zone setting, this area could perhaps be strengthened a bit further; users can easily forget or omit CONVERT_TZ, so there is still plenty of room for improvement here. Best, Weike On Tue, Mar 24, 2020 at 4:20 PM Zhenghua Gao wrote: > The return type of CURRENT_TIMESTAMP is TIMESTAMP (WITHOUT TIME ZONE), > whose semantics follow java.time.LocalDateTime. > Its string representation does not change with the time zone (as you saw, it matches UTC+0). > >

Re: Thoughts on time zone handling for SQL DATE_FORMAT

2020-03-24 Thread Zhenghua Gao
The return type of CURRENT_TIMESTAMP is TIMESTAMP (WITHOUT TIME ZONE); its semantics follow java.time.LocalDateTime, and its string representation does not change with the time zone (as you saw, it matches UTC+0). Your requirement can be met with CONVERT_TZ(timestamp_string, time_zone_from_string, time_zone_to_string), e.g. CONVERT_TZ('2020-03-24 12:00:00', 'UTC', 'Asia/Shanghai'). Best Regards, Zhenghua Gao On Mon, Mar 23, 2020 at 10:12 PM DONG, Weike wrote: >

Re: The retry mechanism of JDBCUpsertOutputFormat#flush does not take effect in Flink 1.10

2020-03-24 Thread Leonard Xu
Hi, shangwen. This should be a bug in AppendOnlyWriter [1], already fixed in 1.10.1/1.11-SNAPSHOT (master), so just use 1.10.1 or the master branch. 1.10.1 has not been released yet, but as far as I know the community is preparing the release; if you need the fix urgently, you can refer to the code on the 1.10.1 branch. Best, Leonard [1] https://issues.apache.org/jira/browse/FLINK-16281

Re: How to handle timestamps in Flink SQL

2020-03-24 Thread Leonard Xu
Hi 吴志勇, your SQL table definition should be fine; the problem is elsewhere. Flink's JSON format currently follows the RFC 3339 standard [1]; the timestamp format it recognizes is 'yyyy-MM-dd'T'HH:mm:ss.SSS'Z'', and parsing a long as a timestamp is not supported yet. You can convert the timestamp to that format when writing to Kafka:

    DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
    Date date = new Date(System.currentTimeMillis());
    String ts = dateFormat.format(date);  // e.g. 2020-03-24T12:00:00.000Z

How to improve job CPU utilization

2020-03-24 Thread yanggang_it_job
hi: Background: our cluster is running low on spare cores, so I went through some of the big jobs. Using the PromQL query max(flink_taskmanager_Status_JVM_CPU_Load{job_name={job_name}}) to get a job's CPU usage, I found that CPU utilization is generally low: a container with 10 slots mostly stays below 6%. Then in my tests I lowered the number of slots per container and found that CPU utilization did not increase linearly; likewise, increasing the slot count did not decrease it linearly. Am I testing this incorrectly, or are there any ideas on this?

ddl es error

2020-03-24 Thread 出发
The attached image shows the properties I am using.

Re: ddl es error

2020-03-24 Thread Leonard Xu
Hi, 出发. It looks like you are missing the dependency on the es connector [1], so only the built-in filesystem connector can be found, and the built-in filesystem connector currently only supports the csv format, hence this error. Add the missing dependency to your project; if you use the SQL CLI, the dependency JAR also needs to be put into Flink's lib directory:

    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-sql-connector-elasticsearch6_2.11</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-json</artifactId>

Bug with the JDBC sink parameter connector.write.max-retries on Oracle

2020-03-24 Thread 111
Hi, when using the JDBC sink on top of the Oracle driver, a bug shows up. Symptom: when max-retries is set to 1, the job fails as expected; when max-retries is greater than 1, the job always ends normally even though errors are thrown internally. The flush() method in JDBCUpsertOutputFormat.java implements a retry mechanism:

    public synchronized void flush() throws Exception {
        checkFlushException();
        for (int i = 1; i <= maxRetryTimes; i++) {
            try {

Re: Bug with the JDBC sink parameter connector.write.max-retries on Oracle

2020-03-24 Thread Leonard Xu
Hi, xinghalo. This is a known bug in the JDBC sink's AppendOnlyWriter that has already been fixed in 1.10.1 [1]; the community is preparing the 1.10.1 release, so I suggest upgrading once 1.10.1 is out. Best, Leonard [1] https://issues.apache.org/jira/browse/FLINK-16281 > On Mar 24, 2020 at 18:32, 111 wrote: > > Hi, > When using the JDBC

ddl es error

2020-03-24 Thread 出发
My DDL is as follows:

    CREATE TABLE buy_cnt_per_hour (
        hour_of_day BIGINT,
        buy_cnt BIGINT
    ) WITH (
        'connector.type' = 'elasticsearch',
        'connector.version' = '6',
        'connector.hosts' = 'http://localhost:9200',
        'connector.index' = 'buy_cnt_per_hour',
        'connector.document-type' = 'user_behavior',

Re: Flink on YARN Kerberos authentication fails

2020-03-24 Thread 巫旭阳
When I used the Hadoop client before, I set a system property; when it is not set, the earlier error appears: System.setProperty("java.security.krb5.conf", "C:\\Users\\86177\\Desktop\\tmp\\5\\krb5.conf"); But Flink on YARN does not offer a way to set this parameter. On 2020-03-24 20:52:44, "aven.wu" wrote: Submitting a Flink job to a Kerberos-secured cluster throws the following exception: java.lang.Exception: unable to establish

Re: Flink on YARN Kerberos authentication fails

2020-03-24 Thread nie...@163.com
For Flink on YARN, in the simplest case you just kinit in a terminal and can then submit jobs; Flink itself needs no configuration. "Can't get Kerberos realm" is usually a problem with the configuration of the corresponding realm in krb5.conf. As for flink/hado...@example.com: I do not know whether hadoop0 is a host, but this looks like a service principal; a user principal should be enough here. > On Mar 24, 2020 at 9:03 PM, 巫旭阳 wrote: > When I used the Hadoop

Flink on YARN 使用Kerboros认证失败

2020-03-24 Thread aven . wu
Submitting a Flink job to a Kerberos-secured cluster throws the following exception:

    java.lang.Exception: unable to establish the security context
        at org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:73)
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1124)
    Caused by: java.lang.IllegalArgumentException:

Does the Flink JDBC Driver support creating streaming tables?

2020-03-24 Thread 赵峰
hi, creating a Kafka table with the Flink JDBC Driver fails. Is my table-creation code incorrect? The code is as follows:

    Connection connection = DriverManager.getConnection("jdbc:flink://localhost:8083?planner=blink");
    Statement statement = connection.createStatement();
    statement.executeUpdate(
        "CREATE TABLE table_kafka (\n" +
        "    user_id BIGINT,\n" +
        "

Re: Does the Flink JDBC Driver support creating streaming tables?

2020-03-24 Thread wangl...@geekplus.com.cn
See this documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/connect.html#kafka-connector The following syntax should not be supported:

    'format.type' = 'csv',
    'format.field-delimiter' = '|'

Below is code I can run successfully; the data in Kafka needs to look like {"order_no":"abcdefg","status":90}:

    tEnv.sqlUpdate("CREATE TABLE
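A hedged completion of the truncated snippet (the schema matches the JSON sample above; topic name and server address are invented):

    // Sketch: Kafka source table for records like {"order_no":"abcdefg","status":90}.
    tEnv.sqlUpdate(
        "CREATE TABLE kafkaTable (\n" +
        "  order_no STRING,\n" +
        "  status INT\n" +
        ") WITH (\n" +
        "  'connector.type' = 'kafka',\n" +
        "  'connector.version' = 'universal',\n" +
        "  'connector.topic' = 'order_topic',\n" +
        "  'connector.properties.bootstrap.servers' = 'localhost:9092',\n" +
        "  'connector.startup-mode' = 'latest-offset',\n" +
        "  'format.type' = 'json'\n" +
        ")");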

Question about automatic source parallelism inference in Flink SQL 1.10

2020-03-24 Thread Chief
hi all: I was using Flink SQL to query Hive data; there are 150 Hive data files, and the parallelism in the sql client configuration file is set to 10. The source parallelism was automatically inferred as 150, but in the web UI

Failed to add a watermark when creating a source table in Flink SQL

2020-03-24 Thread flink小猪
When I try to create a Kafka source table via the SQL client (borrowing from the article "Demo: Building Streaming Applications with Flink SQL" by Jark Wu (云邪)):

    CREATE TABLE user_behavior (
        user_id BIGINT,
        item_id BIGINT,
        category_id BIGINT,
        behavior STRING,
        ts TIMESTAMP(3),
        proctime as PROCTIME(),
        WATERMARK FOR ts as ts - INTERVAL '5' SECOND
    ) WITH (

Re: Thoughts on time zone handling for SQL DATE_FORMAT

2020-03-24 Thread Jark Wu
Thanks for reporting this, Weike. First, I think Flink returning TIMESTAMP WITHOUT TIME ZONE here is problematic, because the SQL standard (SQL:2011 Part 2 Section 6.32) defines the return type as WITH TIME ZONE. The Calcite documentation [1] also says the return type is TIMESTAMP WITH TIME ZONE (although that seems inconsistent with the implementation), and other databases behave much the same: mysql [2], oracle [3]. Best, Jark [1]:

Capturing exceptions for sql-gateway INSERT INTO

2020-03-24 Thread 111
Hi, I have recently been using the sql-gateway. When I submit an INSERT statement with

    StatementExecuteResponseBody body = getInstance().sendRequest(
        host, port, StatementExecuteHeaders.getInstance(),
        new SessionMessageParameters(sessionId),
        new StatementExecuteRequestBody(stmt, timeout)).get();

and the job fails, the corresponding exception information is not returned. Is this feature not supported in the current version?

Re: Failed to add a watermark when creating a source table in Flink SQL

2020-03-24 Thread Jark Wu
Emm... this is strange. In theory this can be a problem when running inside the IDE (caused by a Calcite bug), but the SQL CLI should not have this problem. Do your cluster/job have any other dependencies, e.g. a direct dependency on Calcite? Best, Jark On Tue, 24 Mar 2020 at 23:37, flink小猪 <18579099...@163.com> wrote: > When I try to create a Kafka source table via the SQL client (borrowing from the article "Demo: Building Streaming Applications with Flink SQL" by Jark Wu (云邪)): > CREATE TABLE user_behavior ( >

Re: Question about automatic source parallelism inference in Flink SQL 1.10

2020-03-24 Thread Jun Zhang
hi, Chief: Currently, when Flink reads from Hive with automatic inference enabled, the system infers the source parallelism from the number of files read: if it does not exceed the maximum parallelism (1000 by default), the source parallelism equals the number of files. You can set the source's maximum parallelism via table.exec.hive.infer-source-parallelism.max. Kurt Young wrote on Wed, Mar 25, 2020 at 8:53 AM: > How large is your data? One possible cause is that by the time the other source subtasks were scheduled, the data had already been fully read by the subtasks scheduled first. > > Best, > Kurt > > > On Tue, Mar 24,
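For example, the cap can be tightened programmatically (a sketch; assumes a TableEnvironment tableEnv, and the value 20 is arbitrary):

    // Lower the inference cap from the default 1000 to 20.
    tableEnv.getConfig().getConfiguration().setInteger(
        "table.exec.hive.infer-source-parallelism.max", 20);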

Re: How to improve job CPU utilization

2020-03-24 Thread Xintong Song
What is your Flink version, and is the environment YARN? Lowering the slot count does not raise CPU utilization: by default a YARN container requests as many vcores as it has slots, so lowering the slot count lowers each container's CPU resources and its computing load in the same proportion. To raise CPU utilization, consider giving a container fewer vcores than slots: 'yarn.containers.vcores' sets the container's vcore count instead of defaulting to the slot count. Thank you~ Xintong Song On Tue, Mar

Re: Question about automatic source parallelism inference in Flink SQL 1.10

2020-03-24 Thread Kurt Young
How large is your data? One possible cause is that by the time the other source subtasks were scheduled, the data had already been fully read by the subtasks scheduled first. Best, Kurt On Tue, Mar 24, 2020 at 10:39 PM Chief wrote: > hi all: > I was using Flink SQL to query Hive data earlier; there are 150 Hive data files, and the parallelism in the sql client configuration file is set to 10. The source parallelism was automatically inferred as 150, but the web UI shows that only the first ten subtasks read any data; the others read none. Did I configure something wrong?