Re: JobManager not receiving resource offers from Mesos

2018-01-04 Thread Dongwon Kim
Hi Till, > It could be as simple as giving Flink the right role via > `mesos.resourcemanager.framework.role`. The problem seems more related to resources (GPUs) than framework roles. The cluster I'm working on consists of servers all equipped with GPUs. When DC/OS is installed, a GPU-specific

[SEATTLE MEETUP] Announcing First Seattle Apache Flink Meetup

2018-01-04 Thread Bowen Li
Hi Flinkers, After months of preparation, we are excited to announce our first Seattle Apache Flink meetup, and invite you to join us! *Please RSVP at https://www.meetup.com/seattle-apache-flink/events/246458117 . *Food and drinks

Queryable State in Flink 1.4

2018-01-04 Thread Boris Lublinsky
It appears that queryable state access changed significantly in 1.4 compared to 1.3. Documentation on the queryable state client https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/queryable_state.html#example states that the client needs to connect to a proxy port. My

Re: Apache Flink - Question about rolling window function on KeyedStream

2018-01-04 Thread M Singh
Hi Fabian: Thanks for your answer - it is starting to make sense to me now. On Thursday, January 4, 2018 12:58 AM, Fabian Hueske wrote: Hi, the ReduceFunction holds the last emitted record as state. When a new record arrives, it reduces the new record and last

Re: Apache Flink - broadcasting DataStream

2018-01-04 Thread M Singh
Thanks Fabian and Ufuk for your answers. On Thursday, January 4, 2018 12:32 AM, Fabian Hueske wrote: Hi, A broadcast replicates all elements by the parallelism of the following operator, i.e., each parallel instance of the following operator receives all events of

Re: Service discovery for flink-metrics-prometheus

2018-01-04 Thread Stephan Ewen
How are you deploying your Flink processes? For Service Discovery for Prometheus on Kubernetes, there are a few articles out there... On Thu, Jan 4, 2018 at 3:52 PM, Aljoscha Krettek wrote: > I'm not aware of how this is typically done but maybe Chesnay (cc'ed) has

Re: Testing Flink 1.4 unable to write to s3 locally

2018-01-04 Thread Stephan Ewen
This looks like the output from the client - do you have some TaskManager log files with more log entries? That would be helpful... On Wed, Jan 3, 2018 at 5:26 PM, Kyle Hamlin wrote: > Hello Stephan & Nico, > > Here is the full stacktrace, its not much more than what I

Flink and Rest API

2018-01-04 Thread Alberto Ramón
* Read some values from a REST API in streaming / micro batch (Example: read the last value of Bitcoin) * Expose Queryable State as a queryable REST API (Example: expose intermediate results on demand)

Re: Flink State monitoring

2018-01-04 Thread Steven Wu
Aljoscha/Stefan, if incremental checkpoint is enabled, I assume the "checkpoint size" is only the delta/incremental size (not the full state size), right? Thanks, Steven On Thu, Jan 4, 2018 at 5:18 AM, Aljoscha Krettek wrote: > Hi, > > I'm afraid there is currently no

Re: About Kafka08Fetcher and Kafka010Fetcher

2018-01-04 Thread Tzu-Li (Gordon) Tai
Hi Jaxon, The threading model is implemented differently between the Kafka08Fetcher and all fetcher versions for 0.9 and higher because the Kafka Java clients used between these versions have different abstraction levels. The Kafka08Fetcher still uses the low-level `SimpleConsumer` API,

Re: Exception on running an Elasticpipe flink connector

2018-01-04 Thread Nico Kruber
Hi Vipul, Yes, this looks like a problem with a different netty version being picked up. First of all, let me advertise Flink 1.4 for this since there we properly shade away our netty dependency (on version 4.0.27 atm) so you (or in this case Elasticsearch) can rely on your required version.

RE: Two operators consuming from same stream

2018-01-04 Thread Sofer, Tovi
Hi Timo, Actually I do keyBy in both cases, and in the split\duplicate case I do it on both split streams. I did do the connect below twice and not once, but connect only calls the ctor of ConnectedStreams, and doesn’t do any real operation. So I don’t see how it will make a difference. I can try

Re: Parquet Format Read and Write

2018-01-04 Thread Aljoscha Krettek
A quick google search is turning up these results: - https://medium.com/@istanbul_techie/crunching-parquet-files-with-apache-flink-200bec90d8a7 - https://github.com/FelixNeutatz/parquet-flinktacular

Re: Service discovery for flink-metrics-prometheus

2018-01-04 Thread Aljoscha Krettek
I'm not aware of how this is typically done but maybe Chesnay (cc'ed) has an idea. > On 14. Dec 2017, at 16:55, Kien Truong wrote: > > Hi, > > Does anyone have recommendations about integrating flink-metrics-prometheus > with some SD mechanism > > so that Prometheus

Re: about the checkpoint and state backend

2018-01-04 Thread Aljoscha Krettek
TaskManagers don't do any checkpointing but Operators that run in TaskManagers do. Each operator, of which there are multiple running on multiple TMs in the cluster will write to a unique DFS directory. Something like: /checkpoints/job-xyz/checkpoint-1/operator-a/1 These individual checkpoints
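The directory scheme described above can be sketched in a few lines. This is a plain-Python illustration, not Flink code: the function name `checkpoint_path` and its parameters are hypothetical, and the layout simply follows the "something like" example path from the message rather than Flink's actual on-disk format.

```python
from pathlib import PurePosixPath


def checkpoint_path(root, job_id, checkpoint_id, operator_id, subtask_index):
    """Build one directory per (job, checkpoint, operator, parallel instance).

    Hypothetical helper: it only mirrors the example layout
    /checkpoints/job-xyz/checkpoint-1/operator-a/1 from the message.
    """
    return (
        PurePosixPath(root)
        / f"job-{job_id}"
        / f"checkpoint-{checkpoint_id}"
        / f"operator-{operator_id}"
        / str(subtask_index)
    )


# Two operators with two parallel instances each write to four distinct directories.
paths = [
    checkpoint_path("/checkpoints", "xyz", 1, operator, index)
    for operator in ("a", "b")
    for index in (1, 2)
]
```

Because every component of the path is part of the writer's identity, no two operator instances ever target the same directory, even when they run on different TaskManagers.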

Re: ElasticSearch Connector for version 6.x and scala 2.11

2018-01-04 Thread Nico Kruber
Actually, Flink's netty dependency (4.0.27) is shaded away into the "org.apache.flink.shaded.netty4.io.netty" package now (since version 1.4) and should thus not clash anymore. However, other netty versions may come into play from the job itself or from the integration of Hadoop's classpath (if

Re: Queryable State Client - Actor Not found Exception

2018-01-04 Thread Velu Mitwa
Thank you Aljoscha. I am able to Query state when I use the hostname of Job Manager instead of its IP Address. But I couldn't understand why it is not working if I give IP address. On Thu, Jan 4, 2018 at 6:42 PM, Aljoscha Krettek wrote: > Hi, > > Is my-machine:52650 the

Re: Separate checkpoint directories

2018-01-04 Thread Stefan Richter
Hi, the state is checkpointed in subdirectories and with unique file names, so having all in one root directory is no problem. This all happens automatically. As far as I know, there is no implementation that generates output paths for sinks like that. You could open a jira with a feature

Re: does the flink sink only support bio?

2018-01-04 Thread Stefan Richter
Yes, that is how it works. > Am 04.01.2018 um 14:47 schrieb Jinhua Luo : > > The TwoPhaseCommitSinkFunction seems to record the transaction status > in the state just like what I imagine above, correct? > and if the progress fails before commit, in the later restart, the >

Re: about the checkpoint and state backend

2018-01-04 Thread Jinhua Luo
One task manager would create one rocksdb instance in its local temporary dir, correct? Since there are likely multiple task managers in one cluster, how do they handle directory conflicts, given that one rocksdb instance is one directory? That is what I mentioned at first: how do they merge rocksdb

Re: about the checkpoint and state backend

2018-01-04 Thread Aljoscha Krettek
Ah I see. Currently the RocksDB backend will use one column in RocksDB per state that is registered. The states for different keys of one state are stored in one column. > On 4. Jan 2018, at 14:56, Jinhua Luo wrote: > > ok, I see. > > But as known, one rocksdb instance

Re: about the checkpoint and state backend

2018-01-04 Thread Jinhua Luo
OK, I see. But as far as I know, one rocksdb instance occupies one directory, so I am still wondering what the relationship is between the states and rocksdb instances. 2018-01-04 21:50 GMT+08:00 Aljoscha Krettek : > Each operator (which runs in a TaskManager) will write its state to

Re: about the checkpoint and state backend

2018-01-04 Thread Aljoscha Krettek
Each operator (which runs in a TaskManager) will write its state to some location in HDFS (or any other DFS) and send a handle to this to the CheckpointCoordinator (which is running on the JobManager). The CheckpointCoordinator is collecting all the handles and creating one Uber-Handle, which
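The handle-collection step can be modelled roughly as follows. This is a toy sketch in plain Python, not Flink's implementation: the `CheckpointCoordinator` class here, the `snapshot` helper, the dictionary standing in for the DFS, and the path format are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set


class CheckpointCoordinator:
    """Toy stand-in for the coordinator on the JobManager (not Flink's class)."""

    def __init__(self, expected_operators: Set[str]) -> None:
        self.expected = set(expected_operators)
        self.handles: Dict[str, str] = {}

    def acknowledge(self, operator: str, handle: str) -> Optional[Dict[str, str]]:
        """Record one operator's handle; return the combined handle once all arrived."""
        self.handles[operator] = handle
        if set(self.handles) == self.expected:
            return dict(self.handles)
        return None


def snapshot(dfs: Dict[str, bytes], operator: str, state: bytes, checkpoint_id: int) -> str:
    """Write state to the (simulated) DFS and return a handle, here just the path."""
    handle = f"/checkpoints/checkpoint-{checkpoint_id}/{operator}"
    dfs[handle] = state
    return handle


dfs: Dict[str, bytes] = {}  # a dict standing in for HDFS or another DFS
coordinator = CheckpointCoordinator({"source", "window"})
first = coordinator.acknowledge("source", snapshot(dfs, "source", b"offsets", 1))
combined = coordinator.acknowledge("window", snapshot(dfs, "window", b"panes", 1))
```

The point of the sketch is that the operators never merge their data: each writes independently, and only the small handles are gathered in one place.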

Re: does the flink sink only support bio?

2018-01-04 Thread Jinhua Luo
The TwoPhaseCommitSinkFunction seems to record the transaction status in the state just like what I imagine above, correct? and if the progress fails before commit, in the later restart, the commit would be triggered again, correct? So the commit would not be forgotten, correct? 2018-01-03 22:54

Re: about the checkpoint and state backend

2018-01-04 Thread Jinhua Luo
OK, I think I get the point. But another question arises: how do task managers merge their rocksdb snapshots into a single global path? 2018-01-04 21:30 GMT+08:00 Aljoscha Krettek : > Hi, > > The path you give to the constructor must be a path on some distributed > filesystem,

Re: custom writer fail to recover

2018-01-04 Thread Aljoscha Krettek
Hi, Which version of Flink is this? It cannot recover because it expects more data to have been written than is there, which seems to indicate that flushing did not work correctly. Best, Aljoscha > On 19. Dec 2017, at 00:40, xiatao123 wrote: > > Hi Das, > Have you got

Re: keyby() issue

2018-01-04 Thread Jinhua Luo
I mean the timeout likely happens in the sending queue of the redis lib if the concurrency number is low. -org.apache.flink.streaming.api.operators.async.AsyncWaitOperator.processElement(StreamRecord) public void processElement(StreamRecord element) throws Exception { final

Re: Flink support for Microsoft Windows

2018-01-04 Thread Aljoscha Krettek
We have one committer (and PMC) who is developing on windows but I'm not aware of people running production Flink on windows. There probably still are some that I'm not aware of, though. I'm cc'ing Chesnay who is the Flink Windows guy. Best, Aljoscha > On 19. Dec 2017, at 09:02, avivros

Re: keyby() issue

2018-01-04 Thread Jinhua Luo
2018-01-04 21:04 GMT+08:00 Aljoscha Krettek : > Memory usage should grow linearly with the number of windows you have active > at any given time, the number of keys and the number of different window > operations you have. But the memory usage is still too much, especially

Re: about the checkpoint and state backend

2018-01-04 Thread Aljoscha Krettek
Hi, The path you give to the constructor must be a path on some distributed filesystem, otherwise the data will be lost when the local machine crashes. As you mentioned correctly, RocksDB will keep files in a local directory (you can specify this using setDbStoragePath()) and when

Re: about the checkpoint and state backend

2018-01-04 Thread Jinhua Luo
I still do not understand the relationship between rocksdb backend and the filesystem (here I refer to any filesystem impl, including local, hdfs, s3). For example, when I specify the path to rocksdb backend: env.setStateBackend(new RocksDBStateBackend("file:///data1/flinkapp")); What does it

Re: Flink State monitoring

2018-01-04 Thread Aljoscha Krettek
Hi, I'm afraid there are currently no metrics around state. I see that they would be very good to have so I'm putting it on my list of stuff that we should have at some point. One thing that comes to mind is checking the size of checkpoints, which gives you an indirect way of figuring out how big state

Re: NullPointerException with Avro Serializer

2018-01-04 Thread Aljoscha Krettek
Phew, thanks for letting us know! And yes, there were some problems with Avro and class loading but I was hoping that we got them all before Flink 1.4.0. Best, Aljoscha > On 20. Dec 2017, at 10:54, Kien Truong wrote: > > It turns out that our flink branch is

Re: Queryable State Client - Actor Not found Exception

2018-01-04 Thread Aljoscha Krettek
Hi, Is my-machine:52650 the correct address for the JobManager running in YARN? Also, Queryable State in Flink 1.3.2 is not easy to get to work when you use YARN with HA mode. Best, Aljoscha > On 3. Jan 2018, at 16:02, Velu Mitwa wrote: > > Hi, > I am running a

Re: keyby() issue

2018-01-04 Thread Aljoscha Krettek
Memory usage should grow linearly with the number of windows you have active at any given time, the number of keys and the number of different window operations you have. Regarding the async I/O writing to redis, I see that you give a capacity of 1 which means that there will possibly be

Re: Job Manager not able to fetch job info when restarted

2018-01-04 Thread Sushil Ks
Oh okay! Thanks! On Jan 4, 2018 2:47 PM, "Stefan Richter" wrote: > Yes, HA is required for what you want to do. I am not aware of an alert > mechanism in Flink, but would assume that this is something that should > better be solved on the YARN level? > > Best, >

Re: Can't call getProducedType on Avro messages with array types

2018-01-04 Thread Aljoscha Krettek
Hi, I think you might be able to use AvroTypeInfo which you can use by including the flink-avro dependencies. Is that an option for you? Best, Aljoscha > On 3. Jan 2018, at 21:34, Kyle Hamlin wrote: > > Hi, > > It appears that Kryo can't properly extract/deserialize

Re: BucketingSink doesn't work anymore moving from 1.3.2 to 1.4.0

2018-01-04 Thread Aljoscha Krettek
I think this might be happening because partial Hadoop dependencies are in the user jar and the rest is only available from the Hadoop deps that come bundled with Flink. For example, I noticed that you have Hadoop-common as a dependency which probably ends up in your Jar. > On 4. Jan 2018, at

Re: BucketingSink doesn't work anymore moving from 1.3.2 to 1.4.0

2018-01-04 Thread Stephan Ewen
@Kyle: Please also check if you have any Hadoop classes in your user jar. There should be none, Hadoop should only be in the Flink classpath. Fixing the project Maven setup (making sure Hadoop and Flink core dependencies are provided) should work. To do that, you can for example use the latest

Re: Flink CEP with event time

2018-01-04 Thread Aljoscha Krettek
Yes, because event-time only advances if something makes it advance. Basically. > On 4. Jan 2018, at 11:34, shashank agarwal wrote: > > But this will be wrong in my case. So I have to wait for the results until I > receive next event. > > > > > > On Thu, Jan 4,

Re: BucketingSink doesn't work anymore moving from 1.3.2 to 1.4.0

2018-01-04 Thread Stephan Ewen
Hi! This indeed looks like a class-loading issue - it looks like "RpcEngine" and "ProtobufRpcEngine" are loaded via different classloaders. Can you try the following: - In your flink-conf.yaml, set classloader.resolve-order: parent-first If that fixes the issue, then we can look at a way to

Re: Flink CEP with event time

2018-01-04 Thread shashank agarwal
But this will be wrong in my case. So I have to wait for the results until I receive next event. On Thu, Jan 4, 2018 at 3:53 PM, Aljoscha Krettek wrote: > Think this is actually working as intended, from your earlier description > of when results are produced: When you

RE: Lower Parallelism derives better latency

2018-01-04 Thread Netzer, Liron
Hi, Yes, one machine. Only local connections. CPU wasn't really affected by the parallelism changes, the CPU consumption was ~28%. I'm finding out whether I'm allowed to send the code, will update soon. Thanks, Liron From: Stephan Ewen [mailto:se...@apache.org] Sent: Thursday, January 04, 2018

Re: Flink CEP with event time

2018-01-04 Thread Aljoscha Krettek
Think this is actually working as intended, from your earlier description of when results are produced: When you see Event 1.B, the watermark is not sufficiently advanced to trigger computation, only when you see Event 2.A does the watermark advance and you get a result. This is what I would
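The behaviour described here, where a result only appears once a later event moves the watermark forward, can be reproduced with a small model. This is a plain-Python sketch, not Flink or CEP code: the function `release_by_watermark`, the timestamps, and the rule "watermark = highest timestamp seen minus `max_delay`" are assumptions chosen to illustrate the idea.

```python
import heapq


def release_by_watermark(events, max_delay):
    """Buffer events and release them only once the watermark has passed them.

    Assumed model: the watermark is the highest timestamp seen so far minus
    max_delay, so it can only advance when a new event arrives.
    """
    pending = []  # min-heap ordered by event timestamp
    watermark = float("-inf")
    log = []
    for seq, (name, timestamp) in enumerate(events):
        heapq.heappush(pending, (timestamp, seq, name))
        watermark = max(watermark, timestamp - max_delay)
        released = []
        while pending and pending[0][0] <= watermark:
            released.append(heapq.heappop(pending)[2])
        log.append((name, released))
    return log


# Hypothetical timestamps: event 1.B is only released when 2.A arrives.
log = release_by_watermark([("1.A", 1), ("1.B", 2), ("2.A", 5)], max_delay=1)
```

In this run, event 2.A itself stays buffered at the end, which matches the complaint in the thread: without a further event, nothing pushes the watermark past it.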

Re: Lower Parallelism derives better latency

2018-01-04 Thread Stephan Ewen
Just to make sure: - This runs on one machine, so only local connections? On Thu, Jan 4, 2018 at 10:47 AM, Stefan Richter wrote: > Hi, > > ok, throughput sounds good then, and I assume there is also no unexpected > increase in CPU usage? For the code example,

Re: Python API not working

2018-01-04 Thread Yassine MARZOUGUI
Hi all, Any ideas on this? 2017-12-15 15:10 GMT+01:00 Yassine MARZOUGUI : > Hi Ufuk, > > Thanks for your response. Unfortunately specifying `streaming` or `batch` > doesn't work, it looks like mode should be either "plan" or "operator" , > and then the program

Re: Lower Parallelism derives better latency

2018-01-04 Thread Stefan Richter
Hi, ok, throughput sounds good then, and I assume there is also no unexpected increase in CPU usage? For the code example, maybe it is possible to minimize the code (dropping all the confidential business logic, simple generator sources,…) , while still keeping the general shape of the job

Re: Job Manager not able to fetch job info when restarted

2018-01-04 Thread Stefan Richter
Yes, HA is required for what you want to do. I am not aware of an alert mechanism in Flink, but would assume that this is something that should better be solved on the YARN level? Best, Stefan > Am 03.01.2018 um 19:27 schrieb Sushil Ks : > > Hi Stefan, > It

Re: Lower Parallelism derives better latency

2018-01-04 Thread Stefan Richter
Hi, ok that would have been good to know, so forget about my explanation attempt :-). This makes it interesting, and at the same time I cannot come up with an „easy“ explanation. It is not even clear if the reason for this is a general problem in Flink, your setup, or caused by something that

Re: Apache Flink - Question about rolling window function on KeyedStream

2018-01-04 Thread Fabian Hueske
Hi, the ReduceFunction holds the last emitted record as state. When a new record arrives, it reduces the new record and last emitted record, updates its state, and emits the new result. Therefore, a ReduceFunction emits one output record for each input record, i.e., it is triggered for each input
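The rolling behaviour described above can be sketched without Flink. The following is a plain-Python model, not the Flink API: `rolling_reduce` is a hypothetical name, and the per-key dictionary stands in for keyed state on a KeyedStream.

```python
from typing import Callable, Dict, Hashable, Iterable, Iterator, Tuple, TypeVar

V = TypeVar("V")


def rolling_reduce(
    records: Iterable[Tuple[Hashable, V]],
    reduce_fn: Callable[[V, V], V],
) -> Iterator[Tuple[Hashable, V]]:
    """Emit one output per input: the reduction of the new record with the
    last emitted record for the same key (kept as per-key state)."""
    state: Dict[Hashable, V] = {}
    for key, value in records:
        if key in state:
            state[key] = reduce_fn(state[key], value)
        else:
            state[key] = value  # first record for a key is emitted unchanged
        yield key, state[key]


events = [("a", 1), ("b", 10), ("a", 2), ("a", 3), ("b", 5)]
results = list(rolling_reduce(events, lambda last, new: last + new))
# results: [("a", 1), ("b", 10), ("a", 3), ("a", 6), ("b", 15)]
```

Five inputs produce five outputs, each a running sum for its key, which is the "triggered for each input" behaviour the message describes.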

Re: keyby() issue

2018-01-04 Thread Jinhua Luo
The app is very simple, please see the code snippet: https://gist.github.com/kingluo/e06381d930f34600e42b050fef6baedd I reran the app, and oddly it now produces results continuously. But it has two new issues: a) memory usage is too high; it uses about 8 GB of heap memory! Why?

Re: Apache Flink - broadcasting DataStream

2018-01-04 Thread Fabian Hueske
Hi, A broadcast replicates all elements by the parallelism of the following operator, i.e., each parallel instance of the following operator receives all events of its input stream. If the operation following a broadcast is a connect and a co-operator (and the other input is not broadcasted), the
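The replication described in the first sentence can be illustrated with a small model. This is plain Python rather than the Flink API; the function `broadcast` and the list-per-instance "inboxes" are assumptions made for the sketch.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def broadcast(elements: Sequence[T], parallelism: int) -> List[List[T]]:
    """Deliver every element to every parallel instance of the next operator."""
    inboxes: List[List[T]] = [[] for _ in range(parallelism)]
    for element in elements:
        for inbox in inboxes:
            inbox.append(element)
    return inboxes


received = broadcast(["e1", "e2", "e3"], parallelism=3)
total_deliveries = sum(len(inbox) for inbox in received)
```

With three input events and a downstream parallelism of three, nine deliveries take place, so the cost of a broadcast grows with the parallelism of the receiving operator.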