Re:Re:Re: 请教下Flink时间戳问题

2021-08-16 Thread RS
T只是时间格式显示问题, 数据格式都是timestamp(3), 这个和T应该无关的 在 2021-08-16 13:45:12,"Geoff nie" 写道: >谢谢你!第二个问题确实是我版本太低问题,我flink版本是1.12.1。 >第一个问题,是因为我通过flink写入iceberg >表中,然后通过presto查询iceberg表,其他字段的表都可以查询,但是当写入的是含有TIMESTAMP 类型的表时,presto查询如下报错: > >Query failed (#20210816_020321_00011_wa8bs) in your-presto: Cannot

Re: Can i contribute for flink doc ?

2021-08-16 Thread Caizhi Weng
Hi! Thanks for your interest in contributing to Flink. Currently most of the committers are busy with the upcoming Flink 1.14 so there might be few people having their eyes on the new PRs, especially if they do not exist in a JIRA issue. Please follow Jing Zhang's advice by first creating the

Re: Handling HTTP Requests for Keyed Streams

2021-08-16 Thread Caizhi Weng
Hi! As you mentioned that the configuration fetching is very infrequent, why don't you use a blocking approach to send HTTP requests and receive responses? This seems like a more reasonable solution to me. Rion Williams 于2021年8月17日周二 上午4:00写道: > Hi all, > > I've been exploring a few different

Re: Can i contribute for flink doc ?

2021-08-16 Thread JING ZHANG
Hi Camile, First of all, thanks for the great contribution, the document improvement is very helpful. but I don't know why nobody merges it and no comment. > Maybe we could try the following way to start the first contribution, please go document [1] for more detailed information. 1. Please make

Re: Windowed Aggregation With Event Time over a Temporary View

2021-08-16 Thread JING ZHANG
Hi Joe, > > caused by: org.apache.calcite.runtime.CalciteContextException: From line > 1, column 139 to line 1, column 181: Cannot apply '$TUMBLE' to arguments of > type '$TUMBLE(, )'. Supported form(s): > '$TUMBLE(, )' The first parameter of Group Window Function [1] must be a field with time

退订

2021-08-16 Thread 610346212
退订

Flink On Yarn HA 部署模式下Flink程序无法启动

2021-08-16 Thread 周瑞
您好:Flink程序部署在Yran上以Appliation Mode 模式启动的,在没有采用HA 模式的时候可以正常启动,配置了HA之后,启动异常,麻烦帮忙看下是什么原因导致的. HA 配置如下: high-availability: zookeeper high-availability.storageDir: hdfs://mycluster/flink/ha high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181 high-availability.zookeeper.path.root:

Can i contribute for flink doc ?

2021-08-16 Thread Camile Sing
Hi, all I'm a Flink user. recently I find some problems when I use Flink, it takes some time to understand the internal mechanisms. This really makes me know more about Flink, but I think the doc can be clearer, so I open some merge requests for the doc: -

Re: redis sink from flink

2021-08-16 Thread Yik San Chan
By the way, this post in Chinese showed how we do it exactly with code. https://yiksanchan.com/posts/flink-bulk-insert-redis And yes it had buffered writes support by leveraging Flink operator state, and Redis Pipelining. Feel free to let you know if you have any questions. On Tue, Aug 17,

Re: redis sink from flink

2021-08-16 Thread Yik San Chan
Hi Jin, I was in the same shoes. I tried bahir redis connector at first, then I felt it was very limited, so I rolled out my own. It was actually quite straightforward. All you need to do is to extend RichSinkFunction, then put your logic inside. Regarding Redis clients, Jedis

Re: redis sink from flink

2021-08-16 Thread Yangze Guo
Hi, Jin IIUC, the DataStream connector `RedisSink` can still be used. However, the Table API connector `RedisTableSink` might not work (at least in the future) because it is implemented based on the deprecated Table connector abstraction. You can still give it a try, though. Best, Yangze Guo On

Re: Flink taskmanager in crash loop

2021-08-16 Thread Yangze Guo
Hi, Abhishek, Do you see something like "Fatal error occurred while executing the TaskManager" in your log or would you like to provide the whole task manager log? Best, Yangze Guo On Tue, Aug 17, 2021 at 5:17 AM Abhishek Rai wrote: > > Hello, > > In our production environment, running Flink

redis sink from flink

2021-08-16 Thread Jin Yi
is apache bahir still a thing? it hasn't been touched for months (since redis 2.8.5). as such, looking at the current flink connector docs, it's no longer pointing to anything from the bahir project. looking around in either the flink or bahir newsgroups doesn't turn up anything regarding

Re: RabbitMQ 3.9+ Native Streams

2021-08-16 Thread Rob Englander
I will definitely consider the contribution idea :) On Mon, Aug 16, 2021 at 3:16 PM David Morávek wrote: > Hi Rob, > > there is currently no on-going effort for this topic, I think this would > be a really great contribution though. This seems to be pushing RabbitMQ > towards new usages ;) > >

Flink taskmanager in crash loop

2021-08-16 Thread Abhishek Rai
Hello, In our production environment, running Flink 1.13 (Scala 2.11), where Flink has been working without issues with a dozen or so jobs running for a while, Flink taskmanager started crash looping with a period of ~4 minutes per crash. The stack trace is not very informative, therefore

Handling HTTP Requests for Keyed Streams

2021-08-16 Thread Rion Williams
Hi all, I've been exploring a few different options for storing tenant-specific configurations within Flink state based on the messages I have flowing through my job. Initially I had considered creating a source that would periodically poll an HTTP API and connect that stream to my original event

Re: RabbitMQ 3.9+ Native Streams

2021-08-16 Thread David Morávek
Hi Rob, there is currently no on-going effort for this topic, I think this would be a really great contribution though. This seems to be pushing RabbitMQ towards new usages ;) Best, D. On Mon, Aug 16, 2021 at 8:16 PM Rob Englander wrote: > I'm wondering if there's any work underway to develop

RE: Upgrading from Flink on YARN 1.9 to 1.11

2021-08-16 Thread Hailu, Andreas [Engineering]
Hi David, You’re correct about classpathing problems – thanks for your help in spotting them. I was able to get past that exception by removing some conflicting packages in my shaded JAR, but I’m seeing something else that’s interesting. With the 2 threads trying to submit jobs, one of the

RabbitMQ 3.9+ Native Streams

2021-08-16 Thread Rob Englander
I'm wondering if there's any work underway to develop DataSource/DataSink for RabbitMQ's streams recently released in RMQ 3.9? Rob Englander

Keystore format limitations for TLS

2021-08-16 Thread Alexis Sarda-Espinosa
Hello, I am trying to configure TLS communication for a Flink cluster running on Kubernetes. I am currently using the BCFKS format and setting that as default via javax.net.ssl.keystoretype and javax.net.ssl.truststoretype (which are injected in the environment variable FLINK_ENV_JAVA_OPTS).

Windowed Aggregation With Event Time over a Temporary View

2021-08-16 Thread Joseph Lorenzini
Hi all,   I am on Flink 1.12.3.   So here’s the scenario: I have a Kafka topic as a source, where each record repsents a change to an append only audit log. The kafka record has the following fields:   id (unique identifier for that audit log entry) operation id (is shared across

Re: Flink使用SQL注册UDF未来有规划吗

2021-08-16 Thread liwei li
你是指这个吗? CREATE [TEMPORARY|TEMPORARY SYSTEM] FUNCTION [IF NOT EXISTS] [catalog_name.][db_name.]function_name AS identifier [LANGUAGE JAVA|SCALA|PYTHON] 使用SQL注册UDF 详情见: https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/create/#create-function Ada Luna

Re: Dynamic Cluster/Index Routing for Elasticsearch Sink

2021-08-16 Thread Rion Williams
Thanks for this suggestion David, it's extremely helpful. Since this will vary depending on the elements retrieved from a separate stream, I'm guessing something like the following would be roughly the avenue to continue down: fun main(args: Array) { val parameters =

Re: NullPointerException in StateTable.put()

2021-08-16 Thread David Morávek
Great, let me know if that helped ;) Best, D. On Mon, Aug 16, 2021 at 4:36 PM László Ciople wrote: > The events are json dictionaries and the key is a field which represents a > device id, or if it doesn't exist, then actually a *hashCode *of the > device object in the dictionary is used. So

Re: NullPointerException in StateTable.put()

2021-08-16 Thread David Morávek
My intuition is that you have a non-deterministic shuffle key. If you perform any "per-key" operation, you need to make sure that the same key always end up in the same partition. To simplify this, it means that the key needs to have a consistent *hashCode* and *equals* across different JVMs.

Re: Exploring Flink for a HTTP delivery service.

2021-08-16 Thread David Morávek
Hi Prasanna, here are some quick thoughts 1) Batching is an aggregation operation.But what I have seen in the > examples of windowing is that they get the count/max/min operation in the > particular window. So could the batching be implemented via a windowing > mechanism ? > I guess a custom

NullPointerException in StateTable.put()

2021-08-16 Thread László Ciople
Hello, I am trying to write a Flink application which receives data from Kafka, does processing on keyed windowed streams and sends results on a different topic. Shortly after the job is started it fails with a NullPointerException in StateTable.put(). I am really confused by this error, because I

Re: s3 access denied with flink-s3-fs-presto

2021-08-16 Thread David Morávek
Hi Vamshi, >From your configuration I'm guessing that you're using Amazon S3 (not any implementation such as Minio). Two comments: - *s3.endpoint* should not contain bucket (this is included in your s3 path, eg. *s3:///*) - "*s3.path.style.access*: true" is only correct for 3rd party

Re: Upgrading from Flink on YARN 1.9 to 1.11

2021-08-16 Thread David Morávek
Hi Andreas, Per-job and session deployment modes should not be affected by this FLIP. Application mode is just a new deployment mode (where job driver runs embedded within JM), that co-exists with these two. >From information you've provided, I'd say your actual problem is this exception: ```

Re: Problems with reading ORC files with S3 filesystem

2021-08-16 Thread David Morávek
Hi Piotr, unfortunately this is a known long-standing issue [1]. The problem is that ORC format is not using Flink's filesystem abstraction for actual reading of the underlying file, so you have to adjust your Hadoop config accordingly. There is also a corresponding SO question [2] covering this.

Re: Scaling Flink for batch jobs

2021-08-16 Thread Gorjan Todorovski
Thanks, I'll check more about job tuning. On Mon, 16 Aug 2021 at 06:28, Caizhi Weng wrote: > Hi! > > if I use parallelism of 2 or 4 - it takes the same time. >> > It might be that there is no data in some parallelisms. You can click on > the nodes in Flink web UI and see if it is the case for

Re: FlinkSQL 反压 inputPoolUsage问题

2021-08-16 Thread Ada Luna
taskmanager.network.memory.buffers-per-channel 把这个参数从默认的2调整成5,反压的PoolUsage就和网上的文章一致了,这是为什么? Ada Luna 于2021年8月16日周一 下午4:17写道: > > 在网上看文章一般反压源头的inputPoolUsage应该是高的,其他被反压算子的inputPoolUsage也应该是高的。但是我最近发现的反压inputPoolUsage全是空,是Flink的反压机制就是这样,还是说这个版本的Metrics有问题。 > > Ada Luna 于2021年8月16日周一 下午4:16写道: > >

Re: JobManager Resident memory Always Increasing

2021-08-16 Thread David Morávek
Hi Pranjul, which deployment mode are you using? - For session cluster, I'd say it's possible that memory grows with # of jobs. - For application mode, there is actual user-code executed, so if you're using some native libraries in your job driver, that may be another reason for the growing

Re: FlinkSQL 反压 inputPoolUsage问题

2021-08-16 Thread Ada Luna
在网上看文章一般反压源头的inputPoolUsage应该是高的,其他被反压算子的inputPoolUsage也应该是高的。但是我最近发现的反压inputPoolUsage全是空,是Flink的反压机制就是这样,还是说这个版本的Metrics有问题。 Ada Luna 于2021年8月16日周一 下午4:16写道: > > 版本1.10.1 > 最近我观察很多FlinkSQL 任务的反压指标发现,反压为High算子的outputPoolUsage是满的 > inputPoolUsage是空,反压源头inputPoolUsage和outputPoolUsage都是空的,这是正常的嘛。

FlinkSQL 反压 inputPoolUsage问题

2021-08-16 Thread Ada Luna
版本1.10.1 最近我观察很多FlinkSQL 任务的反压指标发现,反压为High算子的outputPoolUsage是满的 inputPoolUsage是空,反压源头inputPoolUsage和outputPoolUsage都是空的,这是正常的嘛。

Re: JobManager Resident memory Always Increasing

2021-08-16 Thread Yun Tang
Hi Pranjul, First of all, you adopted on-heap state backend: HashMapStateBackend, which would not use native off-heap memory. Moreover, JobManager would not initialize any keyed state backend instance. And if not enable high availability, JobManagerCheckpointStorage would also not use direct

Re: Flink使用SQL注册UDF未来有规划吗

2021-08-16 Thread Jeff Zhang
https://www.yuque.com/jeffzhangjianfeng/gldg8w/dthfu2 Ada Luna 于2021年8月16日周一 下午2:26写道: > 目前注册UDF要通过Table API。 > 未来会通过SQL直接将UDF注册到上下文中吗? > -- Best Regards Jeff Zhang

Flink使用SQL注册UDF未来有规划吗

2021-08-16 Thread Ada Luna
目前注册UDF要通过Table API。 未来会通过SQL直接将UDF注册到上下文中吗?