[jira] [Created] (FLINK-6024) Need more fine-grained info for "InvalidProgramException: This type (...) cannot be used as key"

2017-03-10 Thread Luke Hutchison (JIRA)
Luke Hutchison created FLINK-6024: Summary: Need more fine-grained info for "InvalidProgramException: This type (...) cannot be used as key" Key: FLINK-6024 URL: https://issues.apache.org/jira/browse/FLINK-6024

Re: Flink CSV parsing

2017-03-10 Thread Flavio Pompermaier
If you already have an idea of how to proceed, maybe I can take care of issuing a PR using commons-csv or whatever library you prefer. On 10 Mar 2017 22:07, "Fabian Hueske" wrote: Hi Flavio, Flink's CsvInputFormat was originally meant to be an efficient way to parse

Re: Flink CSV parsing

2017-03-10 Thread Fabian Hueske
Hi Flavio, Flink's CsvInputFormat was originally meant to be an efficient way to parse structured text files and dates back to the very early days of the project (probably 2011 or so). It was never meant to be compliant with the RFC specification and initially didn't support many features like

Re: Scala / Java window issue

2017-03-10 Thread Fabian Hueske
Hi Radu, there are already several WindowFunction implementations in the Table API that can help as a reference: - IncrementalAggregateAllTimeWindowFunction [1] - IncrementalAggregateAllWindowFunction [2] - IncrementalAggregateTimeWindowFunction [3] - IncrementalAggregateTimeWindowFunction [4]

Re: [DISCUSS] FLIP-17 Side Inputs

2017-03-10 Thread Kenneth Knowles
Hi all, I thought I would briefly join this thread to mention some side input lessons from Apache Beam. My knowledge of Flink is not deep enough, technically or philosophically, to make any specific recommendations. And I might just be repeating things that the docs and threads cover, but I hope

Re: Machine Learning on Flink - Next steps

2017-03-10 Thread Stavros Kontopoulos
Thanks Theodore, I'd vote for - Offline learning with Streaming API - Low-latency prediction serving Some comments... Online learning: Good to have but my feeling is that it is not a strong requirement (if a requirement at all) across the industry right now. May become hot in the future.

Scala / Java window issue

2017-03-10 Thread Radu Tudoran
Hi, I am struggling to move a working implementation from Java to Scala :(... this is for computing window aggregates (sliding window). As I am not proficient in Scala I got blocked (probably by a stupid error)... maybe someone can help me. I am trying to create a simple window function to be

Re: [DISCUSS] FLIP-17 Side Inputs

2017-03-10 Thread Gábor Hermann
Hi all, Thanks Aljoscha for going forward with the side inputs and for the nice proposal! I'm also in favor of the implementation with N-ary input (3.) for the reasons Ventura explained. I'm strongly against managing side inputs at StreamTask level (2.), as it would create another

Re: Flink 1.2 / YARN on Secure MapR Cluster

2017-03-10 Thread Till Rohrmann
Hi, could it be that this issue is related to [1]? If so, then it should soon be fixed. [1] https://issues.apache.org/jira/browse/FLINK-5949 Cheers, Till On Wed, Mar 8, 2017 at 11:50 PM, dschexna wrote: > I am attempting to run flink / YARN on a secure MapR 5.2 cluster.

Re: Machine Learning on Flink - Next steps

2017-03-10 Thread Till Rohrmann
Thanks Theo for steering Flink's ML effort here :-) I'd vote to concentrate on - Online learning - Low-latency prediction serving because of the following reasons: Online learning: I agree that this topic is highly researchy and it's not even clear whether it will ever be of any interest

Re: [DISCUSS] Flink ML roadmap

2017-03-10 Thread Till Rohrmann
Hi Roberto, jpmml looks quite promising and this could be a first step towards the model serving story. Thus, really looking forward to seeing it open sourced by you guys :-) @Katherin, I'm not saying that there is no interest in the community to work on batch features. However, there is

Flink 1.2 / YARN on Secure MapR Cluster

2017-03-10 Thread dschexna
I am attempting to run flink / YARN on a secure MapR 5.2 cluster. The cluster is secured using "MapR Native Security", not kerberos. I did include the MapR zookeeper when building: opt/apache-maven-3.3.9/bin/mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=2.7.0-mapr-1607

[jira] [Created] (FLINK-6023) Fix Scala snippet into Process Function (Low-level Operations) Doc

2017-03-10 Thread Mauro Cortellazzi (JIRA)
Mauro Cortellazzi created FLINK-6023: Summary: Fix Scala snippet into Process Function (Low-level Operations) Doc Key: FLINK-6023 URL: https://issues.apache.org/jira/browse/FLINK-6023 Project:

Re: [DISCUSS] Project build time and possible restructuring

2017-03-10 Thread Till Rohrmann
Thanks for all your input. In order to wrap the discussion up I'd like to summarize the mentioned points: The problem of increasing build times and complexity of the project has been acknowledged. Ideally we would have everything in one repository using an incremental build tool. Since Maven does

[jira] [Created] (FLINK-6022) Improve support for Avro GenericRecord

2017-03-10 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-6022: Summary: Improve support for Avro GenericRecord Key: FLINK-6022 URL: https://issues.apache.org/jira/browse/FLINK-6022 Project: Flink Issue Type:

Flink CSV parsing

2017-03-10 Thread Flavio Pompermaier
Hi to all, I want to discuss with the dev group something about CSV parsing. Since I started using Flink with CSVs I have always faced small problems here and there, and the new tickets about CSV parsing seem to confirm that this part is still problematic. In my production jobs I gave up using

[jira] [Created] (FLINK-6021) Downloads page references "Hadoop 1 version" which isn't an option

2017-03-10 Thread Patrick Lucas (JIRA)
Patrick Lucas created FLINK-6021: Summary: Downloads page references "Hadoop 1 version" which isn't an option Key: FLINK-6021 URL: https://issues.apache.org/jira/browse/FLINK-6021 Project: Flink

Re: Machine Learning on Flink - Next steps

2017-03-10 Thread Gábor Hermann
Hey all, Sorry for the slightly late response. I'd like to work on - Offline learning with Streaming API - Low-latency prediction serving I would drop the batch API ML because of past experience with lack of support, and online learning because of the lack of use cases. I completely agree with Kate

[jira] [Created] (FLINK-6020) Blob Server cannot hanlde multiple job sumits(with same content) parallelly

2017-03-10 Thread Tao Wang (JIRA)
Tao Wang created FLINK-6020: Summary: Blob Server cannot hanlde multiple job sumits(with same content) parallelly Key: FLINK-6020 URL: https://issues.apache.org/jira/browse/FLINK-6020 Project: Flink

[jira] [Created] (FLINK-6019) Some log4j messages do not have a loglevel field set, so they can't be suppressed

2017-03-10 Thread Luke Hutchison (JIRA)
Luke Hutchison created FLINK-6019: Summary: Some log4j messages do not have a loglevel field set, so they can't be suppressed Key: FLINK-6019 URL: https://issues.apache.org/jira/browse/FLINK-6019