How to customize triggering of checkpoints?

2018-04-10 Thread syed
I am new to the flink environment and looking to analyze the triggering of checkpoints. I am looking to trigger non-periodic checkpoints such that checkpoint intervals are not of equal length, but I am not sure how I can do this in Flink. My specific query is: (1) How can I trigger non-periodic
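As of Flink 1.4, `env.enableCheckpointing(intervalMs)` only supports a fixed checkpoint interval; non-periodic checkpointing is not available through the public API. One hedged workaround is to disable periodic checkpointing and trigger savepoints externally (e.g. via the CLI or REST API) on an irregular schedule. The sketch below only shows how such an irregular schedule could be computed; the trigger loop itself is an assumption, not a Flink feature:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class JitteredSchedule {
    // Build a list of irregular intervals around a base period.
    // An external trigger loop (e.g. invoking `flink savepoint <jobId>`)
    // would sleep for each interval in turn before triggering.
    static List<Long> intervals(long baseMs, long jitterMs, int count, long seed) {
        Random rnd = new Random(seed);
        List<Long> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // interval drawn from [baseMs - jitterMs, baseMs + jitterMs)
            out.add(baseMs - jitterMs + (long) (rnd.nextDouble() * 2 * jitterMs));
        }
        return out;
    }

    public static void main(String[] args) {
        for (long iv : intervals(60_000, 30_000, 5, 42L)) {
            System.out.println(iv);
        }
    }
}
```

The job id, CLI command, and jitter bounds are placeholders; only the fixed-interval limitation of `enableCheckpointing` is from the Flink documentation.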

Re: java.lang.Exception: TaskManager was lost/killed

2018-04-10 Thread 周思华
Hi Lasse, I met that before. I think maybe the non-heap memory trend of the graph you attached is the "expected" result ... Because rocksdb will keep a "filter (bloom filter)" in memory for every opened sst file by default, and the number of sst files will increase over time, so it looks
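A rough back-of-envelope estimate makes the observed growth plausible: RocksDB holds one filter block per open SST file, at roughly `bitsPerKey` bits per key (10 bits per key is a commonly cited default). The file count, keys per file, and bits per key below are illustrative assumptions, not measured values:

```java
public class BloomMemoryEstimate {
    // Rough estimate of bloom-filter memory held off-heap by RocksDB:
    // one filter per open SST file, bitsPerKey bits for each key in the file.
    static long estimateBytes(long sstFiles, long keysPerFile, double bitsPerKey) {
        return (long) (sstFiles * keysPerFile * bitsPerKey / 8);
    }

    public static void main(String[] args) {
        // e.g. 1000 files x 100k keys x 10 bits/key = 125 MB outside the JVM heap
        System.out.println(estimateBytes(1000, 100_000, 10)); // 125000000
    }
}
```

Because this memory is native (outside the JVM heap), it shows up only as container RSS growth, which matches the non-heap trend described in the thread.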

Re: java.lang.Exception: TaskManager was lost/killed

2018-04-10 Thread Ted Yu
Please see the last comment on this issue: https://github.com/facebook/rocksdb/issues/3216 FYI On Tue, Apr 10, 2018 at 12:25 AM, Lasse Nedergaard < lassenederga...@gmail.com> wrote: > > This graph shows Non-Heap . If the same pattern exists it make sense that > it will try to allocate more

Is Flink able to do real time stock market analysis?

2018-04-10 Thread Ivan Wang
Hi all, I've spent nearly 2 weeks trying to figure out a solution to my requirement as below. If anyone can advise, that would be great. 1. There are going to be 2000 transactions per second as StreamRaw. I'm going to tumbleWindow them as StreamA, let's say every 15 seconds. Then I'm going to
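The 15-second tumbling window step described above boils down to assigning each event timestamp to a window start, which is what `TumblingEventTimeWindows.of(Time.seconds(15))` does internally in Flink. A plain-Java sketch of that assignment and a per-window count (a stand-in for the actual aggregation, which the mail does not specify):

```java
import java.util.Map;
import java.util.TreeMap;

public class TumblingWindows {
    // Map an event timestamp (ms) to the start of its tumbling window.
    static long windowStart(long tsMs, long sizeMs) {
        return tsMs - (tsMs % sizeMs);
    }

    // Count events per window -- a placeholder for the real aggregation.
    static Map<Long, Integer> countPerWindow(long[] timestamps, long sizeMs) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long ts : timestamps) {
            counts.merge(windowStart(ts, sizeMs), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] ts = {1_000, 14_999, 15_000, 29_000, 31_000};
        System.out.println(countPerWindow(ts, 15_000)); // {0=2, 15000=2, 30000=1}
    }
}
```

At 2000 events per second, each 15-second window would hold about 30,000 events, which is well within what a single windowed operator can aggregate.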

Watermark and multiple streams

2018-04-10 Thread Filipe Couto
Hello. I'm joining several data streams, using ConnectedStreams. Let's say something like A connect B which outputs AB, and then I join AB with C, which outputs ABC. However, the relationship between A and B, or AB and C may be of 1 to many, or 1 to 1, depending on the case. For the 1 to 1,
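One detail worth keeping in mind for chained joins like A+B=AB, AB+C=ABC: a two-input Flink operator advances its event-time clock as the minimum of the current watermarks of its inputs, so a slow input holds back the whole chain regardless of the 1-to-1 vs 1-to-many relationship. A minimal plain-Java sketch of that combination rule (the class and method names are illustrative, not Flink API):

```java
public class MinWatermark {
    // A two-input operator tracks the latest watermark seen per input and
    // exposes the minimum of the two -- this mirrors how Flink combines
    // watermarks for connected/joined streams.
    private long wm1 = Long.MIN_VALUE;
    private long wm2 = Long.MIN_VALUE;

    void onWatermark1(long wm) { wm1 = Math.max(wm1, wm); }
    void onWatermark2(long wm) { wm2 = Math.max(wm2, wm); }
    long combined() { return Math.min(wm1, wm2); }
}
```

Until both inputs have emitted a watermark, the combined watermark stays at its initial minimum, which is why a quiet stream C can stall the ABC join.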

Recovering snapshot state saved with TypeInformation generated by the implicit macro from the scala API

2018-04-10 Thread Petter Arvidsson
Hello everyone, We are trying to recover state from a snapshot which we can no longer load. When it is loaded we receive the following exception: java.lang.ClassNotFoundException: io.relayr.counter.FttCounter$$anon$71$$anon$33 This, via a couple more exceptions, leads to: java.io.IOException:

Re: Kafka Consumers Partition Discovery doesn't work

2018-04-10 Thread Juho Autio
Ahhh, looks like I had simply misunderstood where that property should go. The docs correctly say: > To enable it, set a non-negative value for flink.partition-discovery.interval-millis in the __provided properties config__ So it should be set in the Properties that are passed in the constructor
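Concretely, the key goes into the `java.util.Properties` object handed to the `FlinkKafkaConsumer` constructor, not into `flink-conf.yaml`. A sketch (broker address, group id, and the 10s interval are placeholders):

```java
import java.util.Properties;

public class PartitionDiscoveryConfig {
    static Properties consumerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.setProperty("group.id", "my-group");                // placeholder group id
        // Enable partition discovery: probe for new partitions every 10 seconds.
        // This must be in the consumer's Properties, NOT in flink-conf.yaml.
        props.setProperty("flink.partition-discovery.interval-millis", "10000");
        return props;
    }

    public static void main(String[] args) {
        // The props would then be passed to the consumer, roughly:
        // new FlinkKafkaConsumer010<>("topic", deserializationSchema, consumerProps());
        System.out.println(consumerProps().getProperty("flink.partition-discovery.interval-millis"));
    }
}
```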

Re: RocksDBMapState example?

2018-04-10 Thread Ted Yu
For KeyedState, apart from https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/state.html#keyed-state-and-operator-state , you can refer to docs/dev/migration.md : public void initializeState(FunctionInitializationContext context) throws Exception { counter =

RE: RocksDBMapState example?

2018-04-10 Thread NEKRASSOV, ALEXEI
Yes, I've read the documentation on working with state. It talks about MapState. When I looked at Javadoc, I learned that MapState is an interface, with RocksDBMapState as one of the implementing classes. I'm not sure what you mean by KeyedState; I don't see a class with that name. I'm

Re: RocksDBMapState example?

2018-04-10 Thread Dawid Wysakowicz
Hi Alexei, You should not use RocksDBMapState directly. Have you gone through the doc page regarding working with state[1]? I think you want to use KeyedState, judging by the size of your keyspace. Probably a way to go would be to key your stream and then use even a ValueState (which will be scoped to
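The point being made here is that `RocksDBMapState` is an internal implementation class of the RocksDB state backend; user code only declares a state descriptor on a keyed stream, and the runtime scopes each value to the current key. A plain-Java analogy of that scoping (not Flink API -- in Flink you would declare a `ValueStateDescriptor` in `open()` and call `value()`/`update()` in the user function):

```java
import java.util.HashMap;
import java.util.Map;

public class KeyedValueState<K, V> {
    // Analogy of Flink's keyed ValueState: the runtime sets the current key
    // before each element, so user code reads/writes a value scoped to that
    // key without ever touching the backend (heap or RocksDB) directly.
    private final Map<K, V> perKey = new HashMap<>();
    private K currentKey; // set by the "runtime" before each element

    void setCurrentKey(K key) { currentKey = key; }
    V value() { return perKey.get(currentKey); }
    void update(V v) { perKey.put(currentKey, v); }

    public static void main(String[] args) {
        KeyedValueState<String, Integer> state = new KeyedValueState<>();
        state.setCurrentKey("a"); state.update(1);
        state.setCurrentKey("b"); state.update(5);
        state.setCurrentKey("a");
        System.out.println(state.value()); // 1
    }
}
```

Swapping the heap backend for RocksDB changes where `perKey` lives, not the user-facing API, which is why the 50-line `columnFamily` setup never appears in application code.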

RE: RocksDBMapState example?

2018-04-10 Thread NEKRASSOV, ALEXEI
I looked at that code, but I’m still not clear. new RocksDBMapState<>(columnFamily, namespaceSerializer, stateDesc, this); columnFamily is determined by a 50-line function; is this necessary for a simple use case like mine? What should I use as the state descriptor in that function?.. Last argument

Re: Record timestamp from kafka

2018-04-10 Thread Ben Yan
> On Apr 10, 2018, at 7:32 PM, Ben Yan wrote: > > Hi Chesnay: > > I think it would be better without such a limitation. I want to > consult another problem. When I use BucketingSink (I use aws s3), the filename > of a few files after checkpoint still

Re: Record timestamp from kafka

2018-04-10 Thread Ben Yan
Hi Fabian: I think it would be better without such a limitation. I want to consult another problem. When I use BucketingSink (I use aws s3), the filename of a few files after checkpoint still hasn't changed, resulting in the underline prefix of the final generation of a small number of

Re: Record timestamp from kafka

2018-04-10 Thread Chesnay Schepler
You must use a ProcessFunction for this; the timestamps are not exposed in any way to map/flatmap functions. On 10.04.2018 12:29, Ben Yan wrote: Hi Fabian. If I use ProcessFunction, I can get it! But I want to know how to get the Kafka timestamp in the flatmap and map methods of

Re: Record timestamp from kafka

2018-04-10 Thread Ben Yan
Hi Fabian. If I use ProcessFunction, I can get it! But I want to know how to get the Kafka timestamp in the flatmap and map methods of a datastream using the scala programming language. Thanks! Best Ben > On Apr 4, 2018, at 7:00 PM, Fabian Hueske wrote: > > Hi

Re: java.lang.Exception: TaskManager was lost/killed

2018-04-10 Thread Lasse Nedergaard
This time attached. 2018-04-10 10:41 GMT+02:00 Ted Yu : > Can you use third party site for the graph ? > > I cannot view it. > > Thanks > > Original message > From: Lasse Nedergaard > Date: 4/10/18 12:25 AM (GMT-08:00) > To:

Re: java.lang.Exception: TaskManager was lost/killed

2018-04-10 Thread Ted Yu
Can you use a third party site for the graph? I cannot view it. Thanks  Original message  From: Lasse Nedergaard Date: 4/10/18 12:25 AM (GMT-08:00) To: Ken Krugler Cc: user , Chesnay Schepler

Re: java.lang.Exception: TaskManager was lost/killed

2018-04-10 Thread Lasse Nedergaard
This graph shows Non-Heap. If the same pattern exists it makes sense that it will try to allocate more memory and then exceed the limit. I can see the trend for all other containers that have been killed. So my question is now: what is using non-heap memory? From

Re: java.lang.Exception: TaskManager was lost/killed

2018-04-10 Thread Lasse Nedergaard
Hi. I found the exception attached below, for our simple job. It states that our task-manager was killed due to exceeding the memory limit of 2.7 GB. But when I look at Flink metrics just 30 sec before, it uses 1.3 GB heap and 712 MB non-heap, around 2 GB. So something else is also using memory inside the
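The arithmetic in this mail can be made explicit: the container limit must cover JVM heap, JVM non-heap (metaspace, code cache, direct buffers), and native allocations that the JVM metrics never see (RocksDB block cache, bloom filters, etc.). The headroom left after subtracting the reported metrics is the budget those native allocations silently consume:

```java
public class ContainerMemoryCheck {
    // Subtract the JVM-visible areas from the container limit; what remains
    // is the budget for native memory (e.g. RocksDB), which JVM metrics miss.
    static double headroomGb(double limitGb, double heapGb, double nonHeapGb) {
        return limitGb - heapGb - nonHeapGb;
    }

    public static void main(String[] args) {
        // 2.7 GB container limit, 1.3 GB heap + 0.712 GB non-heap from the metrics:
        System.out.println(headroomGb(2.7, 1.3, 0.712)); // roughly 0.69 GB for native memory
    }
}
```

If RocksDB's native footprint grows past that remaining ~0.7 GB, the container is killed even though the JVM metrics look healthy, which matches the observation above.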

Re: Flink 1.4.2 in Zeppelin Notebook

2018-04-10 Thread Rico Bergmann
FYI: I finally managed to get the new Flink version running in Zeppelin. Besides adding the parameters mentioned below, you have to build Zeppelin with the profile scala-2.11 and the new Flink version 1.4.2. Best, Rico. On 09.04.2018 at 14:43, Rico Bergmann wrote: > > The error message is: > >
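A build invocation matching that description might look as follows; the `scala-2.11` profile name is from the mail, while the exact Maven flags and the `flink.version` property are assumptions about Zeppelin's build and should be checked against its pom:

```shell
# Hypothetical Zeppelin build with Scala 2.11 and Flink 1.4.2:
mvn clean package -DskipTests -Pscala-2.11 -Dflink.version=1.4.2
```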