Re: Some questions about the CSV Format in Flink 1.11

2020-08-07 Thread Shengkai Fang
hi, for the first question, the documentation [1] already explains this in some detail; please read the section on partition files carefully. For the second question, the current CSV format indeed does not support this option; consider opening a JIRA ticket as an improvement. [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connectors/filesystem.html WeiXubin <18925434...@163.com> wrote on Sat, Aug 8, 2020 at 11:40: > Hi, in Flink 1.11, using the filesystem
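To make the partition-files behavior concrete, here is a minimal sketch of a Flink 1.11 filesystem sink; the table name and path are made up for illustration. The 'path' option always names a directory under which Flink writes part files, which is why question 1 sees a folder instead of a single named file.

```sql
-- Sketch only: table name and path are placeholders.
-- 'path' is always treated as a directory; Flink writes part files
-- (e.g. part-<uuid>-<n>) beneath it, never a single output file.
CREATE TABLE csv_sink (
  id STRING,
  score INT
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/output',
  'format' = 'csv'
);
```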

flink 1.8: the lettuce Redis client fails

2020-08-07 Thread wujunxi
Using lettuce for Redis IO in Flink fails with: Caused by: io.lettuce.core.RedisException: Cannot retrieve initial cluster partitions from initial URIs [RedisURI [host='XXX', port=XXX]] at

Some questions about the CSV Format in Flink 1.11

2020-08-07 Thread WeiXubin
Hi, in Flink 1.11, when using the filesystem connector to read a CSV file and write it out to another CSV file, I ran into the following problems: Problem 1: the sink path specifies a concrete output file name, but the output is produced as a directory. Problem 2: the Flink 1.11 documentation has no ignore-first-line option for CSV to skip the header line. Test data: 11101322000220200517145507667060666706;9 11101412000220200515163257249700624970;9 11101412010220200514163709315410631541;9

State Processor API to bootstrap keyed state for a streaming application.

2020-08-07 Thread Marco Villalobos
I have read the documentation and various blogs that state that it is possible to load data into a data-set and use that data to bootstrap a stream application. The documentation literally says this, "...you can read a batch of data from any store, preprocess it, and write the result to a
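The read-preprocess-write flow the docs describe can be sketched roughly as below, using the DataSet-based State Processor API of Flink 1.10/1.11. Account, AccountBootstrapper, the paths, and the operator uid are all hypothetical names for illustration.

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.BootstrapTransformation;
import org.apache.flink.state.api.OperatorTransformation;
import org.apache.flink.state.api.Savepoint;

public class BootstrapJob {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Read a batch of data from any store (a CSV file here) and map it to POJOs.
        DataSet<Account> accounts = env.readCsvFile("hdfs:///data/accounts.csv")
                .pojoType(Account.class, "id", "balance");

        // Write each record into keyed state; AccountBootstrapper would extend
        // KeyedStateBootstrapFunction<String, Account>.
        BootstrapTransformation<Account> transformation = OperatorTransformation
                .bootstrapWith(accounts)
                .keyBy(acc -> acc.id)
                .transform(new AccountBootstrapper());

        // Produce a savepoint the streaming job can start from; the uid must
        // match the stateful operator's uid in the streaming application.
        Savepoint.create(new MemoryStateBackend(), 128)
                .withOperator("accounts-uid", transformation)
                .write("hdfs:///savepoints/bootstrap");

        env.execute("bootstrap");
    }
}
```

The streaming job is then started with `--fromSavepoint hdfs:///savepoints/bootstrap` so the keyed state is pre-populated.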

Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-08-07 Thread Felipe Lolas
Hi all! I'm new here. I have been using the Flink connector for HBase 1.2, but recently opted to upgrade to HBase 2.1 (basically because it was bundled in CDH 6). It would be nice to add support for HBase 2.x! I found that supporting HBase 1.4.3 and 2.1 needs minimal changes, and keeping that in mind

Re: GroupBy with count on a joined table only lets me write using toRetractStream

2020-08-07 Thread Faye Pressly
Sorry, I just noticed I made a typo in the last table (clickAdvertId != null instead of clickCount != null) Table allImpressionTable = impressionsTable .leftOuterJoin(clicksTable, "clickImpId = impImpressionId && clickMinute = impMinute") .groupBy("impAdvertId, impVariationName,

GroupBy with count on a joined table only lets me write using toRetractStream

2020-08-07 Thread Faye Pressly
Hello, I have a stream of events (coming from a Kinesis Stream) of this form: impressionId | advertId | variationName | eventType | eventTime The end goal is to output back onto a Kinesis Stream the count of events of type 'impression' and the count of events of type 'click'; however, I need to

Native K8S Jobmanager restarts and job never recovers

2020-08-07 Thread Bohinski, Kevin
Hi all, In our 1.11.1 native K8s session, after we submit a job it runs successfully for a few hours and then fails when the jobmanager pod restarts. Relevant logs after the restart are attached below. Any suggestions? Best kevin 2020-08-06 21:50:24,425 INFO

Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-08-07 Thread Jark Wu
I'm +1 to adding HBase 2.x. However, I have some concerns about moving HBase 1.x to Bahir: 1) As discussed above, there are still lots of people using HBase 1.x. 2) Bahir doesn't have the infrastructure to run the existing HBase E2E tests. 3) We also put a lot of effort into providing an uber connector

Re: Submit Flink 1.11 job from java

2020-08-07 Thread David Anderson
Flavio, Have you looked at application mode [1] [2] [3], added in 1.11? It offers at least some of what you are looking for -- the application jar and its dependencies can be pre-uploaded to HDFS, and the main() method runs on the job manager, so none of the classes have to be loaded in the
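For reference, a deployment in application mode might look like the sketch below; the deployment target, lib directory, and jar path are placeholders. Application mode (new in 1.11) runs main() on the JobManager, so the client never has to load the job classes.

```shell
# Sketch only: target, lib dir, and jar name are placeholders.
./bin/flink run-application -t yarn-application \
    -Dyarn.provided.lib.dirs="hdfs:///flink/libs" \
    hdfs:///jobs/my-job.jar
```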

Re: Job fails when running with flink-sql-connector-elasticsearch6_2.11_1.10.0.jar

2020-08-07 Thread Shengkai Fang
Sorry, the corresponding fix was also applied on the es6 side. But it looks like the same problem. Shengkai Fang wrote on Fri, Aug 7, 2020 at 19:52: > Do you mean the 1.10 ES jar works fine but the 1.11 one does not? > It looks like a bug; it was fixed for es7 in 1.11 but not for es6. > See [1] https://issues.apache.org/jira/browse/FLINK-18006?filter=-2 > > 费文杰 wrote on Fri, Aug 7, 2020 at 15:56: >> >> Here is my code: >> import

Re: Job fails when running with flink-sql-connector-elasticsearch6_2.11_1.10.0.jar

2020-08-07 Thread Shengkai Fang
Do you mean the 1.10 ES jar works fine but the 1.11 one does not? It looks like a bug; it was fixed for es7 in 1.11 but not for es6. See [1] https://issues.apache.org/jira/browse/FLINK-18006?filter=-2 费文杰 wrote on Fri, Aug 7, 2020 at 15:56: > > Here is my code: > import com.alibaba.fastjson.JSONObject; > import lombok.extern.slf4j.Slf4j; > import

Flink job percentage

2020-08-07 Thread Flavio Pompermaier
Hi to all, one of our customers asked us to show a completion percentage for a Flink batch job. Is there an existing heuristic I can use to compute it? Will this still be possible once the DataSet API is migrated to DataStream? Thanks in advance, Flavio
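No such heuristic ships with Flink as far as I know, but a crude one can be built on top of the REST API: average the fraction of finished subtasks across all job vertices. A sketch with made-up numbers; the class and method are hypothetical, and in practice the arrays would be filled from the per-vertex status in Flink's REST API.

```java
// Hypothetical helper, not a Flink API: estimate batch-job completion as the
// mean over job vertices of finishedSubtasks / totalSubtasks.
public class JobProgress {
    public static double percentComplete(int[] finished, int[] total) {
        double sum = 0.0;
        for (int i = 0; i < finished.length; i++) {
            sum += (double) finished[i] / total[i];
        }
        return 100.0 * sum / finished.length;
    }

    public static void main(String[] args) {
        // Three vertices: one done, one half done, one not started.
        System.out.println(percentComplete(new int[]{4, 2, 0}, new int[]{4, 4, 4})); // 50.0
    }
}
```

This is only a rough signal: it weights every vertex equally regardless of its record volume, and a subtask counts as 0 until it finishes.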

Re: Only One TaskManager Showing High CPU Usage

2020-08-07 Thread Jake
Hi Mason, can you use a JVM CPU profiling tool such as JProfiler or https://github.com/alibaba/arthas? That should point you at the reason for the high CPU load. Jake > On Aug 6, 2020, at 12:25 PM, Chen, Mason wrote: > > Thanks Peter for the reply.
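As a rough sketch, attaching arthas to the busy TaskManager could look like this; the commands are from the arthas project, and picking the right JVM process is interactive.

```shell
# Sketch: download arthas and attach it to the hot TaskManager JVM.
curl -O https://arthas.aliyun.com/arthas-boot.jar
java -jar arthas-boot.jar        # then select the TaskManager process
# inside the arthas console:
#   dashboard      # live per-thread CPU and memory
#   thread -n 3    # stack traces of the 3 busiest threads
```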

Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-08-07 Thread Robert Metzger
Hi, Thank you for picking this up so quickly. I have no objections regarding all the proposed items. @Gyula: Once the bahir contribution is properly reviewed, ping me if you need somebody to merge it. On Fri, Aug 7, 2020 at 10:43 AM Márton Balassi wrote: > Hi Robert and Gyula, > > Thanks for

Re: Reply: Re: How to set the sink parallelism when the Table API executes SQL

2020-08-07 Thread wldd
hi Shengkai Fang, Cayden chen: thanks for the answers; that DISCUSS thread should solve my problem. -- Best, wldd On 2020-08-07 16:56:30, "Cayden chen" <1193216...@qq.com> wrote: >hi: > Before the sink you can convert the table to a DataStream, change the global parallelism to the sink parallelism you want, and then call dataStream.addSink(sink) (the operator picks up the global parallelism there); afterwards, change the global parallelism back. In theory this approach can set a separate parallelism for each operator.
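The workaround quoted above can also be shortened: after converting the Table to a DataStream, the parallelism can be set directly on the sink operator, without touching the global default. A sketch (Flink 1.11); the query, the result types, and MySink are placeholders.

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

Table result = tEnv.sqlQuery("SELECT id, cnt FROM source_table");
DataStream<Tuple2<Boolean, Row>> stream = tEnv.toRetractStream(result, Row.class);

// setParallelism() applies only to this sink; the global default stays untouched.
stream.addSink(new MySink()).setParallelism(4);
```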

Reply: How to configure Hadoop for Flink 1.11.1 on K8s

2020-08-07 Thread Matt Wang
The official image only contains Flink itself; if you need to connect to HDFS, you have to bake the Hadoop jars and configuration into the image. -- Best, Matt Wang On Aug 7, 2020 12:49, caozhen wrote: Here is the Hadoop integration wiki for Flink 1.11.1: https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/hadoop.html According to it, flink-shaded-hadoop-2-uber is no longer provided, and the following two solutions are given
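A minimal sketch of such an image; the Hadoop version, archive name, and paths are assumptions, not a recipe.

```dockerfile
# Sketch only: Hadoop version, archive, and paths are placeholders.
FROM flink:1.11.1-scala_2.11
# Unpack a Hadoop distribution into the image.
ADD hadoop-2.8.5.tar.gz /opt/
# Ship the cluster's core-site.xml / hdfs-site.xml alongside it.
COPY hadoop-conf/ /opt/hadoop-conf/
ENV HADOOP_CONF_DIR=/opt/hadoop-conf
# Let Flink pick up the Hadoop classes at startup.
ENV HADOOP_CLASSPATH=/opt/hadoop-2.8.5/share/hadoop/common/*:/opt/hadoop-2.8.5/share/hadoop/hdfs/*:/opt/hadoop-2.8.5/share/hadoop/common/lib/*
```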

Re: Field-delimiter problem with the Flink SQL CSV format

2020-08-07 Thread Leonard Xu
Hi, try 'csv.field-delimiter' = U&'\0009' (note: do not add another pair of single quotes after it). Best, Leonard > On Aug 7, 2020, at 16:27, kandy.wang wrote: > > Setting 'csv.field-delimiter'='\t' makes queries fail with: org.apache.flink.table.api.ValidationException: Option > 'csv.field-delimiter' must be a string with single character, but was: \t > How should I handle this?
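Putting the suggestion into a full DDL sketch; the connector, path, and columns are placeholders. The Unicode string literal U&'\0009' yields a single TAB character, which satisfies the single-character check.

```sql
-- Sketch only: connector, path, and columns are placeholders.
CREATE TABLE tsv_source (
  a STRING,
  b STRING
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/input',
  'format' = 'csv',
  'csv.field-delimiter' = U&'\0009'  -- U+0009 = TAB; no extra quotes around it
);
```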

Reply: How to handle backpressure from the flink elasticsearch sink

2020-08-07 Thread Cayden chen
hi, you can limit the rate on the source side. ---- Original message ---- From: "user-zh"

Reply: Re: How to set the sink parallelism when the Table API executes SQL

2020-08-07 Thread Cayden chen
hi: Before the sink you can convert the table to a DataStream, change the global parallelism to the sink parallelism you want, then call dataStream.addSink(sink) (the operator picks up the global parallelism there), and change the global parallelism back afterwards. In theory this can set a separate parallelism for each operator. ---- Original message ---- From:

Increasing Flink maxrecordcount causes throughput drops on a few task managers

2020-08-07 Thread Terry Chia-Wei Wu
hi, I changed the following config from flink.shard.getrecords.maxrecordcount: 1000 flink.shard.getrecords.intervalmillis: 200 to flink.shard.getrecords.maxrecordcount: 1 flink.shard.getrecords.intervalmillis: 1000 and found that a few task managers (around 10 of 1000) became very slow. We

Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-08-07 Thread Márton Balassi
Hi Robert and Gyula, Thanks for reviving this thread. We have the implementation (currently for 2.2.3) and it is straightforward to contribute it back. Miklos (cc'd) has recently written a readme for said version; he would be interested in contributing the upgraded connector back. The latest HBase

Re: Re: How to set the sink parallelism when the Table API executes SQL

2020-08-07 Thread Shengkai Fang
hi, only the global setting is supported right now; setting the parallelism of an individual sink is not. Per-sink parallelism is being discussed in the community, see https://www.mail-archive.com/dev@flink.apache.org/msg40251.html wldd wrote on Fri, Aug 7, 2020 at 15:41: > hi: > That is the corresponding flink-conf.yaml configuration; it is global and cannot set the sink parallelism. > -- > Best, > wldd > On 2020-08-07

Field-delimiter problem with the Flink SQL CSV format

2020-08-07 Thread kandy.wang
Setting 'csv.field-delimiter'='\t' makes queries fail with: org.apache.flink.table.api.ValidationException: Option 'csv.field-delimiter' must be a string with single character, but was: \t How should I handle this?

Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-08-07 Thread Gyula Fóra
Hi Robert, I completely agree with you on the Bahir based approach. I am happy to help with the contribution on the bahir side, with thorough review and testing. Cheers, Gyula On Fri, 7 Aug 2020 at 09:30, Robert Metzger wrote: > It seems that this thead is not on dev@ anymore. Adding it

Job fails when running with flink-sql-connector-elasticsearch6_2.11_1.10.0.jar

2020-08-07 Thread 费文杰
Here is my code: import com.alibaba.fastjson.JSONObject; import lombok.extern.slf4j.Slf4j; import org.apache.flink.api.common.functions.RuntimeContext; import org.apache.flink.api.common.serialization.SimpleStringSchema; import org.apache.flink.streaming.api.TimeCharacteristic; import

Re: Re: How to set the sink parallelism when the Table API executes SQL

2020-08-07 Thread wldd
hi: That is the corresponding flink-conf.yaml configuration; it is global and cannot set the parallelism of the sink. -- Best, wldd On 2020-08-07 15:26:34, "Shengkai Fang" wrote: >hi >Not sure whether this meets your needs > >tEnv.getConfig().addConfiguration( >new Configuration() >.set(CoreOptions.DEFAULT_PARALLELISM, 128) >); >

Re: Submit Flink 1.11 job from java

2020-08-07 Thread Flavio Pompermaier
The problem with env.executeAsync is that I need to load the job classes on the client side and this is something I'd like to avoid because it's a source of problems. I'd like to tell Flink to run a jar that is available somewhere (on the flink instances or on the blob server or on a network

Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-08-07 Thread Robert Metzger
It seems that this thread is not on dev@ anymore. Adding it back ... On Fri, Aug 7, 2020 at 9:23 AM Robert Metzger wrote: > I would like to revive this discussion. There's a new JIRA[1] + PR[2] for > adding HBase 2 support. > > it seems that there is demand for a HBase 2 connector, and consensus

Re: Exception when registering a UD(T)F via StreamTableEnvironment.createTemporarySystemFunction

2020-08-07 Thread Benchao Li
Hi, the new UDF registration interface introduced in 1.11 uses the new UDF type inference mechanism, which is what causes the problem above. You can consult the new UDF type inference docs [1] and try adding a type hint. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/functions/udfs.html#type-inference zz zhang wrote on Fri, Aug 7, 2020 at 11:00: > Running the code below throws an exception; switching to the old method StreamTableEnvironment.registerFunction works fine, >
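A sketch of what such a type hint can look like; the function name and logic are invented for illustration, the point being the @DataTypeHint annotation that feeds the new (1.11) type inference.

```java
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.functions.ScalarFunction;

public class ParseLongFunction extends ScalarFunction {
    // The hint tells the new type inference what SQL type eval() returns.
    public @DataTypeHint("BIGINT") Long eval(String s) {
        return s == null ? null : Long.parseLong(s.trim());
    }
}

// registration (sketch):
// tEnv.createTemporarySystemFunction("parse_long", ParseLongFunction.class);
```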

Re: How to set the sink parallelism when the Table API executes SQL

2020-08-07 Thread Shengkai Fang
hi, not sure whether this meets your needs: tEnv.getConfig().addConfiguration( new Configuration() .set(CoreOptions.DEFAULT_PARALLELISM, 128) ); See the docs: https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html wldd wrote on Fri, Aug 7, 2020 at 15:16: > hi, all: > How can I set the sink parallelism when TableEnvironment executes SQL? > > > > > >

Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-08-07 Thread Robert Metzger
I would like to revive this discussion. There's a new JIRA[1] + PR[2] for adding HBase 2 support. it seems that there is demand for a HBase 2 connector, and consensus to do it. The remaining question in this thread seems to be the "how". I would propose to go the other way around as Gyula

How to handle backpressure from the flink elasticsearch sink

2020-08-07 Thread 大头糯米丸子
We have already tuned the indexes and write parameters on the ES side. Is there anything Flink can do during peak hours? Is there any way to rate-limit, other than the built-in dynamic backpressure?

How to set the sink parallelism when the Table API executes SQL

2020-08-07 Thread wldd
hi, all: How can I set the sink parallelism when TableEnvironment executes SQL? -- Best, wldd

Re: Flink 1.10 on Yarn

2020-08-07 Thread xuhaiLong
Thanks for the reply! My issue is indeed caused by this bug. On 8/7/2020 13:43, chenkaibit wrote: hi xuhaiLong, the checkpoint NullPointerException in your log is a known issue; see the two JIRAs below. Which JDK version are you using? We have found that jdk8_40/jdk8_60 + flink-1.10 can hit this checkpoint NullPointerException; try upgrading the JDK. https://issues.apache.org/jira/browse/FLINK-18196

Re: Cannot find the flink-jdbc_2.11:1.11.1 dependency

2020-08-07 Thread RS
Hi, look for this one instead: flink-connector-jdbc_2.12-1.11.1.jar On 2020-08-07 14:47:29, "lydata" wrote: >The flink-jdbc_2.11:1.11.1 dependency cannot be found on https://mvnrepository.com/ ; was it not uploaded?

Cannot find the flink-jdbc_2.11:1.11.1 dependency

2020-08-07 Thread lydata
The flink-jdbc_2.11:1.11.1 dependency cannot be found on https://mvnrepository.com/ ; was it not uploaded?

Re: Flink SQL over Kafka: the data has a field named time

2020-08-07 Thread leiyanrui
OK, I'll take a look. Thanks! -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: Flink SQL over Kafka: the data has a field named time

2020-08-07 Thread Benchao Li
1.10 does have such a bug [1]; it was fixed in 1.10.1 and 1.11.0, so try 1.10.1 or 1.11.1. [1] https://issues.apache.org/jira/browse/FLINK-16068 leiyanrui <1150693...@qq.com> wrote on Fri, Aug 7, 2020 at 14:32: > 1.10 > > > > -- > Sent from:

Re: Flink SQL over Kafka: the data has a field named time

2020-08-07 Thread leiyanrui
-- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: Flink SQL over Kafka: the data has a field named time

2020-08-07 Thread leiyanrui
1.10 -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: Flink SQL over Kafka: the data has a field named time

2020-08-07 Thread Benchao Li
Which Flink version are you using? Ideally also provide the exception message. leiyanrui <1150693...@qq.com> wrote on Fri, Aug 7, 2020 at 14:18: > CREATE TABLE table1 ( > bg BIGINT, > user_source BIGINT, > bossid BIGINT, > geekid BIGINT, > qq_intent BIGINT, > phone_intent BIGINT, > wechat_intent BIGINT, > `time` BIGINT, >

Re: How do I unsubscribe from this list? I want to switch to another email address

2020-08-07 Thread Benchao Li
Hi, to unsubscribe from the Chinese user mailing list, send an email to: user-zh-unsubscr...@flink.apache.org For more about the mailing lists, see [1]. [1] https://flink.apache.org/community.html#mailing-lists 何宗谨 wrote on Fri, Aug 7, 2020 at 14:14: > -- Best, Benchao Li

Re: Flink SQL over Kafka: the data has a field named time

2020-08-07 Thread leiyanrui
CREATE TABLE table1 ( bg BIGINT, user_source BIGINT, bossid BIGINT, geekid BIGINT, qq_intent BIGINT, phone_intent BIGINT, wechat_intent BIGINT, `time` BIGINT, t as to_timestamp(from_unixtime(__ts,'yyyy-MM-dd HH:mm:ss')), watermark for t as t - interval '5'

Re: Flink SQL over Kafka: the data has a field named time

2020-08-07 Thread Benchao Li
Could you provide the Flink version and the DDL you are using? leiyanrui <1150693...@qq.com> wrote on Fri, Aug 7, 2020 at 11:58: > I am using Flink SQL to read Kafka, and the Kafka data contains a field named time. I wrapped the time field in backticks in the CREATE TABLE statement, but it still fails with an error. Is there another way? > > > > > -- > Sent from: http://apache-flink.147419.n8.nabble.com/ -- Best, Benchao Li

How do I unsubscribe from this list? I want to switch to another email address

2020-08-07 Thread 何宗谨

Re: Reply: Error defining a Kafka source with DDL in Flink 1.11

2020-08-07 Thread chengyanan1...@foxmail.com
Hello: You are on Flink 1.11, but your CREATE TABLE statement still uses the old-style options; try again with the new-style DDL. See: https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/kafka.html chengyanan1...@foxmail.com From: 阿华田 Sent: 2020-08-07 14:03 To: user-zh@flink.apache.org Subject: Reply: Error defining a Kafka source with DDL in Flink 1.11
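For comparison, a new-style (Flink 1.11) Kafka source DDL looks roughly like this; the topic, servers, group, and columns are placeholders.

```sql
-- Sketch only: note the flat 'connector' = 'kafka' key instead of the old
-- 'connector.type' / 'connector.version' properties.
CREATE TABLE kafka_source (
  id BIGINT,
  name STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'my_topic',
  'properties.bootstrap.servers' = 'host:9092',
  'properties.group.id' = 'my_group',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json'
);
```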

Re: Getting an error after upgrading to Flink 1.11.1

2020-08-07 Thread dasraj
Hi Kostas, I am trying to migrate our code base to use the new ClusterClient method for job submission. Since you recommend using the new PublicEvolving APIs, any doc or link for reference would be helpful. Thanks, -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Error defining a Kafka source with DDL in Flink 1.11

2020-08-07 Thread chengyanan1...@foxmail.com
Hello: the image did not come through; please paste the code as text and send again. chengyanan1...@foxmail.com From: 阿华田 Sent: 2020-08-07 13:49 To: user-zh Subject: Error defining a Kafka source with DDL in Flink 1.11 The code is as follows: 阿华田 a15733178...@163.com

Reply: Error defining a Kafka source with DDL in Flink 1.11

2020-08-07 Thread 阿华田
Error message: Exception in thread "main" java.lang.IllegalStateException: No operators defined in streaming topology. Cannot generate StreamGraph. at org.apache.flink.table.planner.utils.ExecutorUtils.generateStreamGraph(ExecutorUtils.java:47) at
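One common cause of this exception in 1.11, offered here only as a guess since the original code was posted as an image: executeSql() submits INSERT jobs on its own, so a trailing env.execute() finds an empty topology. A sketch; the DDL bodies are elided placeholders.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

tEnv.executeSql("CREATE TABLE src (...) WITH (...)");  // DDL elided
tEnv.executeSql("CREATE TABLE dst (...) WITH (...)");
tEnv.executeSql("INSERT INTO dst SELECT * FROM src");  // submits the job itself

// env.execute();  // nothing was added to `env`, so this would throw
//                 // "No operators defined in streaming topology"; omit it
```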