Task and Operator Monitoring via JMX / naming

2016-10-14 Thread Philipp Bussche
Hi there, I am struggeling to understand what I am looking at after enabling JMX metric reporting on my taskmanager. The job I am trying this out with has 1 source, 2 map functions (where one is a RichMap) and 3 sinks. This is how I have written my Job: DataStream invitations = streaming

Re: Listening to timed-out patterns in Flink CEP

2016-10-14 Thread David Koch
Hello, Thanks for the code Sameer. Unfortunately, it didn't solve the issue. Compared to what I did the principle is the same - make sure that the watermark advances even without events present to trigger timeouts in CEP patterns. If Till or anyone else could provide a minimal example

Re: [DISCUSS] Deprecate Hadoop source method from (batch) ExecutionEnvironment

2016-10-14 Thread Till Rohrmann
Fabian's proposal sounds good to me. It would be a good first step towards removing our dependency on Hadoop. Thus, +1 for the changes. Cheers, Till On Fri, Oct 14, 2016 at 11:29 AM, Fabian Hueske wrote: > Hi everybody, > > I would like to propose to deprecate the utility

Re: ConcurrentModificationException when using histogram accumulators

2016-10-14 Thread Till Rohrmann
Hi Yukun, I think you've found a bug in the code. The accumulators don't seem to be really thread safe. I've created an issue to fix this issue [1]. Thanks for reporting the problem :-) [1] https://issues.apache.org/jira/browse/FLINK-4829 Cheers, Till On Fri, Oct 14, 2016 at 8:32 AM, Yukun Guo

Re: Listening to timed-out patterns in Flink CEP

2016-10-14 Thread Till Rohrmann
Hi guys, I'll try to come up with an example illustrating the behaviour over the weekend. Cheers, Till On Fri, Oct 14, 2016 at 11:16 AM, David Koch wrote: > Hello, > > Thanks for the code Sameer. Unfortunately, it didn't solve the issue. > Compared to what I did the

[DISCUSS] Deprecate Hadoop source method from (batch) ExecutionEnvironment

2016-10-14 Thread Fabian Hueske
Hi everybody, I would like to propose to deprecate the utility methods to read data with Hadoop InputFormats from the (batch) ExecutionEnvironment. The motivation for deprecating these methods is reduce Flink's dependency on Hadoop but rather have Hadoop as an optional dependency for users that

Re: SVM Multiclass classification

2016-10-14 Thread Theodore Vasiloudis
Hello Kursat, As noted in the documentation, the SVM implementation is for binary classification only for the time being. Regards, Theodore -- Sent from a mobile device. May contain autocorrect errors. On Oct 13, 2016 8:53 PM, "Kürşat Kurt" wrote: > Hi; > > > > I am

Re: [DISCUSS] Deprecate Hadoop source method from (batch) ExecutionEnvironment

2016-10-14 Thread Aljoscha Krettek
+1 for deprecating and the removing. On Fri, 14 Oct 2016 at 11:38 Till Rohrmann wrote: > Fabian's proposal sounds good to me. It would be a good first step towards > removing our dependency on Hadoop. > > Thus, +1 for the changes. > > Cheers, > Till > > On Fri, Oct 14,

How to debug why Flink makes and executes only partial plan

2016-10-14 Thread Satish Chandra Gupta
Hi, In my Flink program, after a couple of map, union and connect, I have a final filter and a sink. Something like this (after abstracting out details): val filteredEvents: DataStream[NotificationEvent] = allThisStuffWorking .name("filtered_users") filteredEvents *.filter(x =>

A question regarding to the checkpoint mechanism

2016-10-14 Thread Li Wang
Hi all, As far as I know, a stateful operator will checkpoint its current state to a persistent storage when it receives all the barrier from all of its upstream operators. My question is that does the operator doing the checkpoint need to pause processing the input tuples for the next batch

Re: HivePartitionTap is not working with cascading flink

2016-10-14 Thread Aljoscha Krettek
+Fabian directly looping in Fabian since he worked on the Cascading/Flink integration. Do you have any idea about this? On Fri, 14 Oct 2016 at 08:42 Santlal J Gupta < santlal.gu...@bitwiseglobal.com> wrote: > Hi, > > > > I am new to flink, I have created cascading job and used * >

job failure with checkpointing enabled

2016-10-14 Thread robert.lancaster
I recently tried enabling checkpointing in a job (that previously works w/o checkpointing) and received the following failure on job execution: java.io.FileNotFoundException:

Re: [DISCUSS] Deprecate Hadoop source method from (batch) ExecutionEnvironment

2016-10-14 Thread Fabian Hueske
Hi Shannon, the plan is as follows: We will keep the methods as they are for 1.2 but deprecate them and at the same time we will add alternatives in an optional dependency. In a later release, the deprecated methods will be removed and everybody has to switch to the optional dependency. Does

Re: [DISCUSS] Deprecate Hadoop source method from (batch) ExecutionEnvironment

2016-10-14 Thread Shannon Carey
Speaking as a user, if you are suggesting that you will retain the functionality but move the methods to an optional dependency, it makes sense to me. We have used the Hadoop integration for AvroParquetInputFormat and CqlBulkOutputFormat in Flink (although we won't be using CqlBulkOutputFormat

ConcurrentModificationException when using histogram accumulators

2016-10-14 Thread Yukun Guo
This happens when the TaskManager is serializing an org.apache.flink.api.common.accumulators.Histogram by iterating through the underlying TreeMap while a MapFunction for updating the accumulator attempts to modify the TreeMap concurrently. How could I fix it? The call stack: WARN

Re: [DISCUSS] Deprecate Hadoop source method from (batch) ExecutionEnvironment

2016-10-14 Thread Shannon Carey
Yep! From: Fabian Hueske > Date: Friday, October 14, 2016 at 11:00 AM To: Shannon Carey > Cc: "user@flink.apache.org" >

Flink Cluster is savepointing jobmanager instead of external filesystem

2016-10-14 Thread Jason Brelloch
Hey all, We are running into an issue where the savepoint command is saving to the jobmanager instead of the filesystem location specified in the flink-conf.yaml. It saves to the filesystem when we are running in a local standalone flink, but when we deploy the job to our flink cluster it saves

Re: job failure with checkpointing enabled

2016-10-14 Thread Aljoscha Krettek
Hi, the file that Flink is trying to create there is not meant to be in the checkpointing location. It is a local file that is used for buffering elements until a checkpoint barrier arrives (for certain cases). Can you check whether the base path where it is trying to create that file exists? For

Re: Flink Cluster is savepointing jobmanager instead of external filesystem

2016-10-14 Thread Aljoscha Krettek
Hi, are you running this standalone or on YARN? I'm also directly looping in Ufuk because he knows that stuff best. Maybe he has an idea what could be going on. Cheers, Aljoscha On Fri, 14 Oct 2016 at 18:39 Jason Brelloch wrote: > Hey all, > > We are running into an issue

Re: Flink Cluster is savepointing jobmanager instead of external filesystem

2016-10-14 Thread Jason Brelloch
It is a standalone cluster. On Fri, Oct 14, 2016 at 1:02 PM, Aljoscha Krettek wrote: > Hi, > are you running this standalone or on YARN? > > I'm also directly looping in Ufuk because he knows that stuff best. Maybe > he has an idea what could be going on. > > Cheers, >

Re: HivePartitionTap is not working with cascading flink

2016-10-14 Thread Fabian Hueske
Hi Santlal, I'm afraid I don't know what is going wrong here either. Debugging and correctly configuring the Taps was one of the major obstacles when implementing the connector. Best, Fabian 2016-10-14 14:40 GMT+02:00 Aljoscha Krettek : > +Fabian directly looping in Fabian