Re: Fwd: flink1.12.2 CLI连接hive出现异常

2021-04-28 Thread 张锴
我这里生产的hive没有配置Kerberos认证 张锴 于2021年4月29日周四 上午10:05写道: > 官网有说吗,你在哪里找到的呢 > > guoyb <861277...@qq.com> 于2021年4月28日周三 上午10:56写道: > >> 我的也有这种问题,没解决,kerberos认证的hive导致的。 >> >> >> >> ---原始邮件--- >> 发件人: "张锴"> 发送时间: 2021年4月28日(周三) 上午10:41 >> 收件人: "user-zh"> 主题: Fwd: flink1.12.2 CLI连接hive出现异常 >> >> >>

Re: flink 背压问题

2021-04-28 Thread HunterXHunter
中间有错误数据或者其他错误原因,背压不会导致数据丢失 -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: Table-api sql 预检查

2021-04-28 Thread Shengkai Fang
Hi. 可以看看 TableEnvironment#execute逻辑,如果能够将sql 转化成 `Transformation`,那么语法应该没有问题。 Best, Shengkai Michael Ran 于2021年4月29日周四 上午11:57写道: > dear all : > 用table api 提交多个SQL 的时候,有什么API 能提前检查SQL 的错误吗? 每次都是提交执行的时候才报错。 > 理论上里面在SQL解析部分就能发现,有现成的API吗?还是说需要自己去抓那部分解析代码,然后封装出API。 >

Re: 使用Table API怎么构造多个sink

2021-04-28 Thread Shengkai Fang
Hi. 可以通过`StatementSet` 指定多个insert,这样子就可以构造出多个sink了。 Best, Shengkai Han Han1 Yue 于2021年4月28日周三 下午2:30写道: > Hi, > 个人在分析RelNodeBlock逻辑,多个SINK才会拆分并重用公共子树,怎么构造多个sink呢, > 文件RelNodeBlock.scala源码里的writeToSink()已经找不到了 > > // 源码里的多sink例子 > val sourceTable = tEnv.scan("test_table").select('a, 'b, 'c) >

回复:wd: flink1.12.2 CLI连接hive出现异常

2021-04-28 Thread 田向阳
重新启动一个yarn session 集群。 | | 田向阳 | | 邮箱:lucas_...@163.com | 签名由 网易邮箱大师 定制 在2021年04月29日 10:05,张锴 写道: 官网有说吗,你在哪里找到的呢 guoyb <861277...@qq.com> 于2021年4月28日周三 上午10:56写道: > 我的也有这种问题,没解决,kerberos认证的hive导致的。 > > > > ---原始邮件--- > 发件人: "张锴" 发送时间: 2021年4月28日(周三) 上午10:41 > 收件人: "user-zh" 主题: Fwd:

Table-api sql 预检查

2021-04-28 Thread Michael Ran
dear all : 用table api 提交多个SQL 的时候,有什么API 能提前检查SQL 的错误吗? 每次都是提交执行的时候才报错。 理论上里面在SQL解析部分就能发现,有现成的API吗?还是说需要自己去抓那部分解析代码,然后封装出API。 如果没有,希望能提供这个功能,blink 应该是有的。 Thanks !

Re:flink 背压问题

2021-04-28 Thread Michael Ran
不至于吧,中间有错误吧。。 在 2021-04-29 11:45:17,"Bruce Zhang" 写道: >我的数据源每一秒发送一条数据,下游算子每六秒才能处理完成入库,我测试时使用的是一个并行度,数据发送完毕后,在库里只有前三条发送和后两条发送的数据,中间的数据全部丢失了。应该是背压机制的问题,这是什么原因呢

flink 背压问题

2021-04-28 Thread Bruce Zhang
我的数据源每一秒发送一条数据,下游算子每六秒才能处理完成入库,我测试时使用的是一个并行度,数据发送完毕后,在库里只有前三条发送和后两条发送的数据,中间的数据全部丢失了。应该是背压机制的问题,这是什么原因呢

Re: Watermarks in Event Time Temporal Join

2021-04-28 Thread Leonard Xu
Thanks for your example, Maciej I can explain more about the design. > Let's have events. > S1, id1, v1, 1 > S2, id1, v2, 1 > > Nothing is happening as none of the streams have reached the watermark. > Now let's add > S2, id2, v2, 101 > This should trigger join for id1 because we have all the

Re: Fwd: flink1.12.2 CLI连接hive出现异常

2021-04-28 Thread 张锴
官网有说吗,你在哪里找到的呢 guoyb <861277...@qq.com> 于2021年4月28日周三 上午10:56写道: > 我的也有这种问题,没解决,kerberos认证的hive导致的。 > > > > ---原始邮件--- > 发件人: "张锴" 发送时间: 2021年4月28日(周三) 上午10:41 > 收件人: "user-zh" 主题: Fwd: flink1.12.2 CLI连接hive出现异常 > > > -- Forwarded message - > 发件人: 张锴 Date: 2021年4月27日周二 下午1:59 >

Any configuration for accelerating state processor

2021-04-28 Thread Chen-Che Huang
Hi, I have a job that uses the state processor to load data from checkpoints on google cloud storage to do some processing and then write the result to google cloud storage. The total data size is about 30-50 GB and the job may take more than 2 hours to finish. From the flame graph generated

Re: [Announcement] Analytics Zoo 0.10.0 release

2021-04-28 Thread Jason Dai
Please See the following: 1) Analytics Zoo* PPML *supports running unmodified Flink programs in a secure fashion on an untrusted cloud ( https://analytics-zoo.readthedocs.io/en/latest/doc/PPML/Overview/ppml.html#trusted-realtime-compute-and-ml ) 2) Analytics Zoo *Cluster Serving* supports

Stateful Functions, Kinesis, and ConsumerConfigConstants

2021-04-28 Thread Ammon Diether
When using Flink Stateful Function's KinesisIngressBuilder, I do not see a way to set things like ConsumerConfigConstants.SHARD_GETRECORDS_MAX or ConsumerConfigConstants.SHARD_GETRECORDS_INTERVAL_MILLIS Looking at KinesisSourceProvider, it appears that this is the spot that creates the

KafkaSourceBuilder causing invalid negative offset on checkpointing

2021-04-28 Thread Lars Skjærven
Hello, I ran into an issue when using the new KafkaSourceBuilder (running Flink 1.12.2, scala 2.12.13, on ververica platform in a container with java 8). Initially it generated warnings on kafka configuration, but the job was able to consume and produce messages. The configuration

Re: Taskmanager killed often after migrating to flink 1.12

2021-04-28 Thread Sambaran
Hi Till, Thank you for the response, we are currently running flink with an increased memory usage, so far the taskmanager is working fine, we will check if there is any further issue and will update you. Regards Sambaran On Wed, Apr 28, 2021 at 5:33 PM Till Rohrmann wrote: > Hi Sambaran, > >

Re: Checkpoint error - "The job has failed"

2021-04-28 Thread Dan Hill
Oh interesting. Yea, could be. We'll soon update to v1.12. Thanks Robert and Yun! On Wed, Apr 28, 2021 at 1:30 AM Yun Tang wrote: > Hi Dan, > > You could refer to the "Fix Versions" in FLINK-16753 [1] and know that > this bug is resolved after 1.11.3 not 1.11.1. > > [1]

Re: Watermarks in Event Time Temporal Join

2021-04-28 Thread Maciej Bryński
Hi Leonard, Let's assume we have two streams. S1 - id, value1, ts1 with watermark = ts1 - 1 S2 - id, value2, ts2 with watermark = ts2 - 1 Then we have following interval join SELECT id, value1, value2, ts1 FROM S1 JOIN S2 ON S1.id = S2.id and ts1 between ts2 - 1 and ts2 Let's have events.

Re: Taskmanager killed often after migrating to flink 1.12

2021-04-28 Thread Till Rohrmann
Hi Sambaran, could you also share the cause why the checkpoints could not be discarded with us? With Flink 1.10, we introduced a stricter memory model for the TaskManagers. That could be a reason why you see more TaskManagers being killed by the underlying resource management system. You could

Re: [Announcement] Analytics Zoo 0.10.0 release

2021-04-28 Thread Yik San Chan
Hi Jason, Thanks for sharing. I look up the term "Flink" on https://analytics-zoo.readthedocs.io/en/latest/ but it doesn't even exist. Do you mind sharing how does it relate to Flink users? Best, Yik San On Wed, Apr 28, 2021 at 10:48 PM Jason Dai wrote: > Hi Everyone, > > > I’m happy to

Queryable State unavailable after Kubernetes HA State cleanup

2021-04-28 Thread Sandeep khanzode
Hello, Stuck at this time. Any help will be appreciated. I am able to create a queryable state and also query the state. Everything works correctly. KeyedStream, Key> stream = sourceStream.keyBy(t2 -> t2.f0); stream.asQueryableState("queryableVO"); I deploy this on a Kubernetes cluster with

Re: Application cluster - Job execution and cluster creation timeouts

2021-04-28 Thread Tamir Sagi
Hey All, I know Application cluster is not widely used yet, I'm happy to be part of Flink community , test it and share the results. Following my previous email, I'd like to share more information and get your feedback. Scenario 4 : requestJobResult() gets out of sync. The result is very

How to implement a window that emits events at regular intervals and on specific events

2021-04-28 Thread Tim Josefsson
Hello! I'm trying to figure out how to implement a window that will emit events at regular intervals or when a specific event is encountered. A bit of background. I have a stream of events from devices that will send events to our system whenever a user watches a video. These events include a

Re: [Stateful Functions] Help for calling remote stateful function (written in Python)

2021-04-28 Thread Bonino Dario
Dear Igal, dear List Thank you very much for your reply. Your advice was crucial to overcome the issue. I have now created a TypedValue manually and successfully managed to communicate with the remote function in Python. Nevertheless, I am still facing a strange behavior regarding the

Fwd: [Announcement] Analytics Zoo 0.10.0 release

2021-04-28 Thread Jason Dai
Hi Everyone, I’m happy to announce the 0.10.0 release for Analytics Zoo (distributed TensorFlow and PyTorch on Apache Spark/Flink & Ray); the highlights of this release include: - A re-designed document website

Flink Resuming From Checkpoint With "-s" FAILURE

2021-04-28 Thread Zachary Manno
Hello, I am running Flink version 1.10 on Amazon EMR. We use RocksDB/S3 for state. I have "Persist Checkpoints Externally" enabled. Periodically I must tear down the current infrastructure and bring it back up. To do this, I terminate the EMR, bring up a fresh EMR cluster, and then I resume the

Re: 流跑着跑着,报出了rocksdb层面的错误 Error while retrieving data from RocksDB

2021-04-28 Thread Peihui He
刚觉像是rocksdb的内存不够用了,调大试试呢? a593700624 <593700...@qq.com> 于2021年4月28日周三 下午3:47写道: > org.apache.flink.util.FlinkRuntimeException: Error while retrieving data > from > RocksDB > at > > org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:121) > at > >

回复:flink 1.12.1 savepoint 重启问题

2021-04-28 Thread 田向阳
我也遇到了,不知道啥原因,这个也是偶尔发生,是真的难定位问题 | | 田向阳 | | 邮箱:lucas_...@163.com | 签名由 网易邮箱大师 定制 在2021年04月28日 17:00,chaos 写道: 你好,因为业务逻辑变化,需要对线上在跑的flink任务重启,采用savapoint方式。 操作如下: # 查看yarn-application-id yarn application -list # 查看flink任务id flink list -t yarn-per-job

flink 1.12.1 savepoint 重启问题

2021-04-28 Thread chaos
你好,因为业务逻辑变化,需要对线上在跑的flink任务重启,采用savapoint方式。 操作如下: # 查看yarn-application-id yarn application -list # 查看flink任务id flink list -t yarn-per-job -Dyarn.application.id=application_1617064018715_0097 # 带有保存点的停止任务 flink stop -t yarn-per-job -p hdfs:///user/chaos/flink/0097_savepoint/

Key by Kafka partition / Kinesis shard

2021-04-28 Thread Yegor Roganov
Hello To learn Flink I'm trying to build a simple application where I want to save events coming from Kinesis to S3. I want to subscribe to each shard, and within each shard I want to batch for 30 seconds, or until 1000 events are observed. These batches should then be uploaded to S3. What I

Re: Using Hive UDFs

2021-04-28 Thread Rui Li
Hi Youngwoo, That's no problem at all and glad to know the UDF works now. Yeah, before you can use a hive udf, you should register it into metastore. And that can be done via either Flink or Hive. Feel free to let me know if you encounter any other issues. On Wed, Apr 28, 2021 at 4:28 PM

flink sql 1.12 minibatch??????

2021-04-28 Thread op
flink sql 1.12 minibatch?? val config = tConfig.getConfiguration() config.setString("table.exec.mini-batch.enabled", "true") // mini-batch is enabled config.setString("table.exec.mini-batch.allow-latency", "true") config.setString("table.exec.mini-batch.size", 100)

KafkaSourceBuilder causing invalid negative offset on checkpointing

2021-04-28 Thread Lars Skjærven
Hello, I ran into some issues when using the new KafkaSourceBuilder (running Flink 1.12.2, scala 2.12.13, on ververica platform in a container with java 8). Initially it generated warnings on kafka configuration, but the job was able to consume and produce messages. The configuration

Re: Checkpoint error - "The job has failed"

2021-04-28 Thread Yun Tang
Hi Dan, You could refer to the "Fix Versions" in FLINK-16753 [1] and know that this bug is resolved after 1.11.3 not 1.11.1. [1] https://issues.apache.org/jira/browse/FLINK-16753 Best Yun Tang From: Dan Hill Sent: Tuesday, April 27, 2021 7:50 To: Yun Tang Cc:

Re: Using Hive UDFs

2021-04-28 Thread 김영우
Hey Rui, My bad! You have already pointed out to me what I completely misunderstood. I've been confusing some of the steps to register udfs. And also, somehow, my metastore was a mess. So, I cleaned up the metastore and database and then, I created a database for hive catalog and registered the

Best practice for packaging and deploying Flink jobs on K8S

2021-04-28 Thread Sumeet Malhotra
Hi, I have a PyFlink job that consists of: - Multiple Python files. - Multiple 3rdparty Python dependencies, specified in a `requirements.txt` file. - A few Java dependencies, mainly for external connectors. - An overall job config YAML file. Here's a simplified structure of the

流跑着跑着,报出了rocksdb层面的错误 Error while retrieving data from RocksDB

2021-04-28 Thread a593700624
org.apache.flink.util.FlinkRuntimeException: Error while retrieving data from RocksDB at org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:121) at org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:111) at

流跑着跑着,报出了rocksdb层面的错误 Error while retrieving data from RocksDB

2021-04-28 Thread a593700624
org.apache.flink.util.FlinkRuntimeException: Error while retrieving data from RocksDB at org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:121) at org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:111) at

org.apache.flink.util.FlinkRuntimeException: Error while retrieving data from RocksDB

2021-04-28 Thread a593700624
org.apache.flink.util.FlinkRuntimeException: Error while retrieving data from RocksDB at org.apache.flink.contrib.streaming.state.RocksDBListState.getInternal(RocksDBListState.java:121) at org.apache.flink.contrib.streaming.state.RocksDBListState.get(RocksDBListState.java:111) at

Re: Using Hive UDFs

2021-04-28 Thread 김영우
Hey Rui, For geospatial udfs, I've configured these jars to my flink deployment: # Flink-Hive RUN wget -q -O /opt/flink/lib/flink-sql-connector-hive-3.1.2_2.12-1.12.2.jar

Re: Using Hive UDFs

2021-04-28 Thread Rui Li
Hi Youngwoo, Could you please share the function jar and DDL you used to create the function? I can try reproducing this issue locally. On Wed, Apr 28, 2021 at 1:33 PM Youngwoo Kim (김영우) wrote: > Thanks Shengkai and Rui for looking into this. > > A snippet from my app. looks like following: >