Re: Debugging Python-Api fails with NoClassDefFoundError

2017-01-04 Thread Mathias Peters
Yes, it is. Also, the project import in Idea has worked so far. Cheers On 04.01.2017 21:52, Ted Yu wrote: > This class is in flink-core jar. > > Have you verified that the jar is on classpath ? > > Cheers > > On Wed, Jan 4, 2017 at 12:16 PM, Mathias Peters > mailto:mathias.pet...@gmx.org>> wrote

Re: 1.1.4 IntelliJ Problem

2017-01-04 Thread Stephan Epping
Hey Yuri, thanks a lot. It was flink-spector that was requiring flink-test-utils 1.1.0 best, Stephan > On 04 Jan 2017, at 13:17, Yury Ruchin wrote: > > Hi Stephan, > > It looks like you have libraries from different versions of Flink > distribution on the same classpath. > > ForkableFlink

Re: Are heterogeneous DataStreams possible?

2017-01-04 Thread ljwagerfield
I should add: the operators determine how to handle each message by inspecting the message's SCHEMA_ID field (every message has a SCHEMA_ID as its first field). -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Are-heterogeneous-DataStreams-pos

Are heterogeneous DataStreams possible?

2017-01-04 Thread ljwagerfield
Our data's schema is defined by our users and is not known at compile time. All data arrives in via a single Kafka topic and is serialized using the same serialization tech (to be defined). We want to use King.com's RBEA technique to process this data in different ways at runtime (depending on i

Re: How do I ensure binary comparisons are being used?

2017-01-04 Thread ljwagerfield
Thank you Fabian :) -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-do-I-ensure-binary-comparisons-are-being-used-tp10806p10851.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.

Som question about Flink stream sql

2017-01-04 Thread Hongyuhong
Hi, We are currently exploring on Flink streamsql , And I see the group-window has been implemented in Table API, and row-window is also planning in FLIP-11. It seems that row-window grammar is more similar to calcite over clause. I'm curious about the detail plan and roadmap of stream sql, cause

Rapidly failing job eventually causes "Not enough free slots"

2017-01-04 Thread Shannon Carey
In Flink 1.1.3 on emr-5.2.0, I've experienced a particular problem twice and I'm wondering if anyone has some insight about it. In both cases, we deployed a job that fails very frequently (within 15s-1m of launch). Eventually, the Flink cluster dies. The sequence of events looks something like

Re: Debugging Python-Api fails with NoClassDefFoundError

2017-01-04 Thread Ted Yu
This class is in flink-core jar. Have you verified that the jar is on classpath ? Cheers On Wed, Jan 4, 2017 at 12:16 PM, Mathias Peters wrote: > Hi, > > I just wanted to debug a custom python script using your python dataset > api. Running the PythonPlanBinder in Intellij IDEA gives me the >

Debugging Python-Api fails with NoClassDefFoundError

2017-01-04 Thread Mathias Peters
Hi, I just wanted to debug a custom python script using your python dataset api. Running the PythonPlanBinder in Intellij IDEA gives me the subjected error. I took a fresh clone, built it with mvn clean install -DskipTest, and imported everything in idea. Using an older version this worked fine, s

Shared Object Instance over different RichMapFunctions

2017-01-04 Thread Duck
Hi there, I was wondering on how my caching object, would behave in the given scenario below. 1) I create an instance of an object that performs lookups to an external resource, and caches results. 2) I have a DataStream that i perform a map function on (with a custom RichMapFunction) 3) I hav

RE: Serializing NULLs

2017-01-04 Thread Newport, Billy
Map> in your avro schema is what you want here if the map values are nullable. From: Anirudh Mallem [mailto:anirudh.mal...@247-inc.com] Sent: Tuesday, December 20, 2016 2:26 PM To: user@flink.apache.org Subject: Re: Serializing NULLs If you are using Avro generated classes then you cannot have

Re: Flink Checkpoint runs slow for low load stream

2017-01-04 Thread Fabian Hueske
Hi CVP, we recently release Flink 1.1.4, i.e., the next bugfix release of the 1.1.x series with major robustness improvements [1]. You might want to give 1.1.4 a try as well. Best, Fabian [1] http://flink.apache.org/news/2016/12/21/release-1.1.4.html 2017-01-04 16:51 GMT+01:00 Chakravarthy vara

Re: Flink Checkpoint runs slow for low load stream

2017-01-04 Thread Chakravarthy varaga
Hi Stephan, All, I just got a chance to try if 1.1.3 fixes slow check pointing on FS backend. It seemed to have been fixed. Thanks for the fix. While testing this, with varying check point intervals, there seem to be Spikes of slow checkpoints every 30/40 seconds for an interval of 15 s

Re: Reading worker-local input files

2017-01-04 Thread Robert Schmidtke
Hi Fabian, thanks for your directions! They worked flawlessly. I am aware of the reduced robustness, but then again my input is only available on each worker and not replicated. In case anyone is wondering, here is how I did it: *https://github.com/robert-schmidtke/hdfs-statistics-adapter/tree/2a4

Re: [DISCUSS] Apache Flink 1.2.0 RC0 (Non-voting testing release candidate)

2017-01-04 Thread Ted Yu
Hi, I downloaded the source tar ball and ran test suite. AsyncWaitOperatorTest hung: "main" #1 prio=5 os_prio=0 tid=0x7f02c8008800 nid=0x4b8c in Object.wait() [0x7f02cf974000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waitin

Re: Question about Scheduling of Batch Jobs

2017-01-04 Thread Konstantin Knauf
Hi Fabian, I see, thank's for the quick explanation. Cheers, Konstantin On 04.01.2017 14:15, Fabian Hueske wrote: > Hi Konstantin, > > the DataSet API tries to execute all operators as soon as possible. > > I assume that in your case, Flink does not do this because it tries to > avoid a dead

Re: Question about Scheduling of Batch Jobs

2017-01-04 Thread Fabian Hueske
Hi Konstantin, the DataSet API tries to execute all operators as soon as possible. I assume that in your case, Flink does not do this because it tries to avoid a deadlock. A dataflow which replicates data from the same source and joins it again might get deadlocked because all pipelines need to m

Re: Flink streaming questions

2017-01-04 Thread Fabian Hueske
Hi Henri, can you express the logic of your FoldFunction (or WindowFunction) as a combination of ReduceFunction and WindowFunction [1]? ReduceFunction should be supported by a merging WindowAssigner and has the same resource consumption as a FoldFunction, i.e., a single record per window. Best, F

Re: How do I ensure binary comparisons are being used?

2017-01-04 Thread Fabian Hueske
Hi Lawrence, comparison of binary data are mainly used by the DataSet API when sorting large data sets or building and probing hash tables. The DataStream API mainly benefits from Flink's custom and efficient serialization when sending data over the wire or taking checkpoints. There are also plan

Question about Scheduling of Batch Jobs

2017-01-04 Thread Konstantin Knauf
Hi everyone, I have a basic question regarding scheduling of batch programs. Let's take the following graph: -> Group Combine -> ... / Source > Group Combine -> ... \ -> Map -> ... So, a source and followed by three operators with ship strategy "Forward" a

Re: 1.1.4 IntelliJ Problem

2017-01-04 Thread Yury Ruchin
Hi Stephan, It looks like you have libraries from different versions of Flink distribution on the same classpath. ForkableFlinkMiniCluster resides in flink-test-utils. As of distribution version 1.1.3 it invokes JobManager.startJobManagerActors() with 6 arguments. The signature changed by 1.1.4,

Re: Triggering a saveppoint failed the Job

2017-01-04 Thread Stephan Ewen
Hi! Thanks for reporting this. I created a JIRA issue for it: https://issues.apache.org/jira/browse/FLINK-5407 We'll look into it as part of the 1.2 release testing. If you have any more details that may help diagnose/fix that, would be great if you could share them with us. Thanks, Stephan O

Triggering a saveppoint failed the Job

2017-01-04 Thread Yassine MARZOUGUI
Hi all, I tried to trigger a savepoint for a streaming job, both the savepoint and the job failed. The job failed with the following exception: java.lang.RuntimeException: Error while triggering checkpoint for IterationSource-7 (1/1) at org.apache.flink.runtime.taskmanager.Task$3.run(Tas

Re: Does Flink cluster security works in Flink 1.1.4 release?

2017-01-04 Thread Stephan Ewen
Hi! Flink 1.1.x supports Kerberos for Hadoop (HDFS, YARN, HBase) via Hadoop's ticket system. It should work via kinit, in the same way when submitting a secure MapReduce job. Kerberos for ZooKeeper, Kafka, etc, is only part of the 1.2 release. Greetings, Stephan On Wed, Jan 4, 2017 at 7:25 AM,

Re: Cannot run using a savepoint with the same jar

2017-01-04 Thread Stephan Ewen
Hi! Did you change the parallelism in your program, or do the names of some functions change each time you call the program? Can you try what happens when you give explicit IDs to operators via the '.uid(...)' method? Stephan On Tue, Jan 3, 2017 at 11:44 PM, Al-Isawi Rami wrote: > Hi, > > I

Re: 1.1.4 IntelliJ Problem

2017-01-04 Thread Stephan Epping
I also changed the scala version of the packages/artifacts to 2.11, with no success. Further, I am not deeply familiar with maven or java dependency management at all. best Stephan > On 03 Jan 2017, at 22:06, Stephan Ewen wrote: > > Hi! > > It is probably some inconsistent configuration in t

Re: 1.1.4 IntelliJ Problem

2017-01-04 Thread Stephan Epping
Thanks Stephan, but that didn’t help. The IDE is configured to use Default Scala Compiler and JDK 1.8.0_92. best Stephan > On 03 Jan 2017, at 22:06, Stephan Ewen wrote: > > Hi! > > It is probably some inconsistent configuration in the IDE. > > It often helps to do "Maven->Reimport" or use