Re: Join two Datasets --> InvalidProgramException

2016-02-10 Thread Dominique Rondé
Hi Fabian, your hint was good! Maven fooled me with its dependency management. Now everything works as expected! Many, many thanks to all of you! Greetings, Dominique On 10.02.2016 at 08:45, Fabian Hueske wrote: Hi Dominique, can you check if the versions of the remotely running job manager &

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Fabian Hueske
Hi, glad you could resolve the POJO issue, but the new error doesn't look right. The CO_GROUP_RAW strategy should only be used for programs that are implemented against the Python DataSet API. I guess that's not the case since all code snippets were Java so far. Can you post the full stacktrace

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Till Rohrmann
I tested the TypeExtractor with your SourceA and SourceB types (adding proper setters and getters) and it correctly returned a PojoType. Thus, I would suspect that you haven’t specified the proper setters and getters in your implementation. Cheers, Till On Tue, Feb 9, 2016 at 2:46 PM,

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Chiwan Park
I wrote a sample inherited POJO example [1]. The example works with Flink 0.10.1 and 1.0-SNAPSHOT. [1]: https://gist.github.com/chiwanpark/0389ce946e4fff58d611 Regards, Chiwan Park > On Feb 9, 2016, at 8:07 PM, Fabian Hueske wrote: > > What is the type of sessionId? > It

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Chiwan Park
Hi Dominique, It seems that `SourceA` is not treated as a POJO. Are all fields in SourceA public? There are some requirements for POJO classes [1]. [1]: https://ci.apache.org/projects/flink/flink-docs-release-0.10/apis/programming_guide.html#pojos Regards, Chiwan Park > On Feb 9, 2016, at 7:42

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Dominique Rondé
The fields in SourceA and SourceB are private but have public getters and setters. The classes provide an empty and public constructor. On 09.02.2016 at 11:47, "Chiwan Park" wrote: > Oh, the fields in SourceA have public getters. Do the fields in SourceA > have public

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Till Rohrmann
Could you share the code for your types SourceA and SourceB? It seems as if Flink does not recognize them to be POJOs because it assigned them the GenericType type. Either there is something wrong with the type extractor or your implementation does not fulfil the requirements for POJOs, as

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Fabian Hueske
What is the type of sessionId? It must be a key type in order to be used as key. If it is a generic class, it must implement Comparable to be used as key. 2016-02-09 11:53 GMT+01:00 Dominique Rondé : > The fields in SourceA and SourceB are private but have public
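To illustrate the Comparable requirement mentioned above, here is a minimal sketch of a custom key class. `SessionKey` is a hypothetical name that does not appear in the thread, and the sketch only shows the Comparable part; it does not claim to cover everything Flink checks for key types. A plain String already implements Comparable.

```java
import java.util.Objects;

// Hypothetical key class for illustration only; not taken from the thread.
public final class SessionKey implements Comparable<SessionKey> {

    private final String sessionId;

    public SessionKey(String sessionId) {
        this.sessionId = Objects.requireNonNull(sessionId, "sessionId");
    }

    // Defines a total order so that keys can be sorted and grouped.
    @Override
    public int compareTo(SessionKey other) {
        return sessionId.compareTo(other.sessionId);
    }

    // equals and hashCode are kept consistent with compareTo.
    @Override
    public boolean equals(Object o) {
        return o instanceof SessionKey && sessionId.equals(((SessionKey) o).sessionId);
    }

    @Override
    public int hashCode() {
        return sessionId.hashCode();
    }
}
```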

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Chiwan Park
Oh, the fields in SourceA have public getters. Do the fields in SourceA have public setters? SourceA needs public setters for its private fields. Regards, Chiwan Park > On Feb 9, 2016, at 7:45 PM, Chiwan Park wrote: > > Hi Dominique, > > It seems that `SourceA` is not
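The POJO rules discussed in this thread (public class, public no-argument constructor, and public getters and setters for non-public fields, including inherited ones) can be approximated with plain reflection. The sketch below is not Flink's TypeExtractor and may differ from it in detail. `PojoCheck`, `looksLikePojo`, `NoSetter`, and the fields `flag` and `count` are hypothetical names chosen for illustration; `Parent` and `SourceA` only loosely mirror the types named in the thread.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class PojoCheck {

    // Illustrative types; only the class names Parent and SourceA come from the thread.
    public abstract static class Parent {
        private String sessionId;
        public Parent() {}
        public String getSessionId() { return sessionId; }
        public void setSessionId(String sessionId) { this.sessionId = sessionId; }
    }

    public static class SourceA extends Parent {
        private Boolean flag; // hypothetical field
        public SourceA() {}
        public Boolean getFlag() { return flag; }
        public void setFlag(Boolean flag) { this.flag = flag; }
    }

    // Hypothetical counter-example: a private field with a getter but no setter.
    public static class NoSetter extends Parent {
        private int count;
        public NoSetter() {}
        public int getCount() { return count; }
    }

    // Simplified approximation of the POJO rules discussed in the thread.
    public static boolean looksLikePojo(Class<?> type) {
        if (!Modifier.isPublic(type.getModifiers())) {
            return false;
        }
        try {
            // The class needs a public no-argument constructor.
            if (!Modifier.isPublic(type.getDeclaredConstructor().getModifiers())) {
                return false;
            }
        } catch (NoSuchMethodException e) {
            return false;
        }
        // Walk the class hierarchy so that inherited private fields are checked too.
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                int m = f.getModifiers();
                if (Modifier.isStatic(m) || Modifier.isTransient(m) || f.isSynthetic()) {
                    continue;
                }
                if (!Modifier.isPublic(m) && !hasAccessors(type, f)) {
                    return false;
                }
            }
        }
        return true;
    }

    // A non-public field needs a public getX() and a public setX(fieldType).
    private static boolean hasAccessors(Class<?> type, Field f) {
        String name = f.getName();
        String suffix = Character.toUpperCase(name.charAt(0)) + name.substring(1);
        try {
            type.getMethod("get" + suffix);
            type.getMethod("set" + suffix, f.getType());
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```

Calling `PojoCheck.looksLikePojo(PojoCheck.SourceA.class)` passes, while the `NoSetter` variant fails because `count` has no public setter.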

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Till Rohrmann
Could you post the complete example code (Flink job including the type definitions)? For example, if the data sets are of type DataSet, then they will be treated as a GenericType. Judging from your pseudo code, it looks fine at first glance. Cheers, Till On Tue, Feb 9, 2016 at 2:25 PM,

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Dominique Rondé
Sorry, I was out for lunch. Maybe the problem is that sessionID is a String? public abstract class Parent{ private Date eventDate; private EventType eventType; private String sessionId; public Parent() { } //GETTER & SETTER } public class SourceA extends Parent{ private Boolean

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Fabian Hueske
String is perfectly fine as a key. Looks like SourceA / SourceB are not correctly identified as POJOs. 2016-02-09 14:25 GMT+01:00 Dominique Rondé : > Sorry, I was out for lunch. Maybe the problem is that sessionID is a > String? > > public abstract class Parent{ >

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Dominique Rondé
Here we go! ExecutionEnvironment env = ExecutionEnvironment.createRemoteEnvironment("xxx.xxx.xxx.xxx", 53408,"flink-job.jar"); DataSource datasourceA= env.readTextFile("hdfs://dev//sourceA/"); DataSource datasourceB= env.readTextFile("hdfs://dev//sourceB/"); DataSet sourceA=

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Dominique Rondé
Hi, your guess is correct. I use Java all the time... Here is the complete stacktrace: Exception in thread "main" org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed. at

Re: Join two Datasets --> InvalidProgramException

2016-02-09 Thread Fabian Hueske
Hi Dominique, can you check if the versions of the remotely running job manager & task managers are the same as the Flink version that is used to submit the job? The version and commit hash are logged at the top of the JM and TM log files. Right now, the local client optimizes the job, chooses