[GitHub] flink pull request: [FLINK-1512] Add CsvReader for reading into PO...
Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/426#issuecomment-84977748 Hi @chiwanpark Thanks for updating the PR! :-) I was gone for a few days. Will have a look at your PR shortly.
[jira] [Commented] (FLINK-1512) Add CsvReader for reading into POJOs.
[ https://issues.apache.org/jira/browse/FLINK-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375858#comment-14375858 ] ASF GitHub Bot commented on FLINK-1512: --- Github user fhueske commented on the pull request: https://github.com/apache/flink/pull/426#issuecomment-84977748 Hi @chiwanpark Thanks for updating the PR! :-) I was gone for a few days. Will have a look at your PR shortly. Add CsvReader for reading into POJOs. - Key: FLINK-1512 URL: https://issues.apache.org/jira/browse/FLINK-1512 Project: Flink Issue Type: New Feature Components: Java API, Scala API Reporter: Robert Metzger Assignee: Chiwan Park Priority: Minor Labels: starter Currently, the {{CsvReader}} supports only TupleXX types. It would be nice if users were also able to read into POJOs.
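For readers following the thread, this is roughly the usage the issue asks for. A minimal sketch, assuming the pojoType(...) reader method that the PR proposes; the Person class, field names, and file path are illustrative, not taken from the PR:

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class CsvPojoSketch {

        // Illustrative POJO: public fields and a no-argument constructor,
        // as Flink's POJO type extraction requires.
        public static class Person {
            public String name;
            public int age;
            public Person() {}
        }

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            // pojoType(...) maps the CSV columns, in order, onto the named POJO fields.
            DataSet<Person> persons = env.readCsvFile("/path/to/people.csv")
                    .pojoType(Person.class, "name", "age");
            persons.print();
        }
    }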
[jira] [Created] (FLINK-1770) Rename the variable 'contentAdressable' to 'contentAddressable'
Sibao Hong created FLINK-1770: - Summary: Rename the variable 'contentAdressable' to 'contentAddressable' Key: FLINK-1770 URL: https://issues.apache.org/jira/browse/FLINK-1770 Project: Flink Issue Type: Bug Reporter: Sibao Hong Priority: Minor Rename the variable 'contentAdressable' to 'contentAddressable' for better understanding.
[jira] [Commented] (FLINK-1633) Add getTriplets() Gelly method
[ https://issues.apache.org/jira/browse/FLINK-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375756#comment-14375756 ] ASF GitHub Bot commented on FLINK-1633: --- Github user vasia commented on a diff in the pull request: https://github.com/apache/flink/pull/452#discussion_r26928420 --- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/EuclideanGraphExample.java --- @@ -0,0 +1,214 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.graph.example; + +import org.apache.flink.api.common.ProgramDescription; +import org.apache.flink.api.common.functions.MapFunction; +import org.apache.flink.api.java.DataSet; +import org.apache.flink.api.java.ExecutionEnvironment; +import org.apache.flink.api.java.tuple.Tuple2; +import org.apache.flink.api.java.tuple.Tuple3; +import org.apache.flink.graph.Edge; +import org.apache.flink.graph.Graph; +import org.apache.flink.graph.Triplet; +import org.apache.flink.graph.Vertex; +import org.apache.flink.graph.example.utils.EuclideanGraphData; + +import java.io.Serializable; + +/** + * Given a directed, unweighted graph, with vertex values representing points in a plane, + * return a weighted graph where the edge weights are equal to the Euclidean distance between the + * src and the trg vertex values. + * + * <p> + * Input files are plain text files and must be formatted as follows: + * <ul> + * <li> Vertices are represented by their vertexIds and vertex values and are separated by newlines, + * the value being formed of two doubles separated by a comma. + * For example: <code>1,1.0,1.0\n2,2.0,2.0\n3,3.0,3.0\n</code> defines a data set of three vertices + * <li> Edges are represented by triples of srcVertexId, srcEdgeId, weight which are + * separated by commas. Edges themselves are separated by newlines. The initial edge value will be overwritten. --- End diff -- Hey @andralungu! Sorry for not noticing this earlier, but why give a weighted edge as input if you're going to override the weight anyway? And in the beginning of the description, you refer to an unweighted graph. Add getTriplets() Gelly method -- Key: FLINK-1633 URL: https://issues.apache.org/jira/browse/FLINK-1633 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 0.9 Reporter: Vasia Kalavri Assignee: Andra Lungu Priority: Minor Labels: starter In some graph algorithms, it is required to access the graph edges together with the vertex values of the source and target vertices. For example, several graph weighting schemes compute some kind of similarity weights for edges, based on the attributes of the source and target vertices. 
This issue proposes adding a convenience Gelly method that generates a DataSet of <srcVertex, Edge, TrgVertex> triplets from the input graph.
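As an illustration of the proposed method, here is a sketch of the edge-weighting use case from the discussion above: re-weight each edge with the Euclidean distance between its endpoints' values. It assumes the getTriplets() signature and Triplet accessors from the PR under review; the Point vertex-value type is hypothetical:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.graph.Edge;
    import org.apache.flink.graph.Graph;
    import org.apache.flink.graph.Triplet;

    public class TripletsSketch {

        // Hypothetical vertex value: a point in the plane.
        public static class Point {
            public double x;
            public double y;
            public Point() {}
        }

        // Weight every edge with the distance between its source and target vertex values.
        public static DataSet<Edge<Long, Double>> weightByDistance(Graph<Long, Point, Double> graph) {
            return graph.getTriplets()
                    .map(new MapFunction<Triplet<Long, Point, Double>, Edge<Long, Double>>() {
                        @Override
                        public Edge<Long, Double> map(Triplet<Long, Point, Double> t) {
                            Point src = t.getSrcVertex().getValue();
                            Point trg = t.getTrgVertex().getValue();
                            double dist = Math.sqrt(Math.pow(src.x - trg.x, 2) + Math.pow(src.y - trg.y, 2));
                            return new Edge<Long, Double>(t.getSrcVertex().getId(), t.getTrgVertex().getId(), dist);
                        }
                    });
        }
    }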
[GitHub] flink pull request: [FLINK-1633][gelly] Added getTriplets() method...
Github user vasia commented on a diff in the pull request: https://github.com/apache/flink/pull/452#discussion_r26928666 --- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java --- @@ -53,6 +54,7 @@ import org.apache.flink.graph.utils.Tuple2ToVertexMap; import org.apache.flink.graph.utils.Tuple3ToEdgeMap; import org.apache.flink.graph.utils.VertexToTuple2Map; +import org.apache.flink.graph.utils.Tuple2ToVertexMap; --- End diff -- and this import is not used anywhere :-)
[GitHub] flink pull request: [FLINK-1633][gelly] Added getTriplets() method...
Github user vasia commented on a diff in the pull request: https://github.com/apache/flink/pull/452#discussion_r26928684 --- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java --- @@ -53,6 +54,7 @@ import org.apache.flink.graph.utils.Tuple2ToVertexMap; import org.apache.flink.graph.utils.Tuple3ToEdgeMap; import org.apache.flink.graph.utils.VertexToTuple2Map; +import org.apache.flink.graph.utils.Tuple2ToVertexMap; --- End diff -- my bad, it's a duplicate
[jira] [Commented] (FLINK-1633) Add getTriplets() Gelly method
[ https://issues.apache.org/jira/browse/FLINK-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375765#comment-14375765 ] ASF GitHub Bot commented on FLINK-1633: --- Github user vasia commented on a diff in the pull request: https://github.com/apache/flink/pull/452#discussion_r26928684 --- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java --- @@ -53,6 +54,7 @@ import org.apache.flink.graph.utils.Tuple2ToVertexMap; import org.apache.flink.graph.utils.Tuple3ToEdgeMap; import org.apache.flink.graph.utils.VertexToTuple2Map; +import org.apache.flink.graph.utils.Tuple2ToVertexMap; --- End diff -- my bad, it's a duplicate Add getTriplets() Gelly method -- Key: FLINK-1633 URL: https://issues.apache.org/jira/browse/FLINK-1633 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 0.9 Reporter: Vasia Kalavri Assignee: Andra Lungu Priority: Minor Labels: starter In some graph algorithms, it is required to access the graph edges together with the vertex values of the source and target vertices. For example, several graph weighting schemes compute some kind of similarity weights for edges, based on the attributes of the source and target vertices. This issue proposes adding a convenience Gelly method that generates a DataSet of <srcVertex, Edge, TrgVertex> triplets from the input graph.
[GitHub] flink pull request: [FLINK-1615] [java api] SimpleTweetInputFormat
Github user Elbehery commented on the pull request: https://github.com/apache/flink/pull/442#issuecomment-84968643 @rmetzger I think it failed again, but I can't see the reason
[jira] [Commented] (FLINK-1615) Introduces a new InputFormat for Tweets
[ https://issues.apache.org/jira/browse/FLINK-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375784#comment-14375784 ] ASF GitHub Bot commented on FLINK-1615: --- Github user Elbehery commented on the pull request: https://github.com/apache/flink/pull/442#issuecomment-84968643 @rmetzger I think it failed again, but I can't see the reason Introduces a new InputFormat for Tweets --- Key: FLINK-1615 URL: https://issues.apache.org/jira/browse/FLINK-1615 Project: Flink Issue Type: New Feature Components: flink-contrib Affects Versions: 0.8.1 Reporter: mustafa elbehery Priority: Minor An event-driven parser for Tweets into Java POJOs. It parses all the important parts of a tweet into Java objects. Tested on a cluster, and the performance is pretty good.
[jira] [Updated] (FLINK-1770) Rename the variable 'contentAdressable' to 'contentAddressable'
[ https://issues.apache.org/jira/browse/FLINK-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sibao Hong updated FLINK-1770: -- Fix Version/s: master Rename the variable 'contentAdressable' to 'contentAddressable' --- Key: FLINK-1770 URL: https://issues.apache.org/jira/browse/FLINK-1770 Project: Flink Issue Type: Bug Reporter: Sibao Hong Priority: Minor Fix For: master Rename the variable 'contentAdressable' to 'contentAddressable' for better understanding.
[jira] [Updated] (FLINK-1770) Rename the variable 'contentAdressable' to 'contentAddressable'
[ https://issues.apache.org/jira/browse/FLINK-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sibao Hong updated FLINK-1770: -- Affects Version/s: master Rename the variable 'contentAdressable' to 'contentAddressable' --- Key: FLINK-1770 URL: https://issues.apache.org/jira/browse/FLINK-1770 Project: Flink Issue Type: Bug Affects Versions: master Reporter: Sibao Hong Priority: Minor Fix For: master Rename the variable 'contentAdressable' to 'contentAddressable' for better understanding.
[GitHub] flink pull request: [FLINK-1633][gelly] Added getTriplets() method...
Github user vasia commented on a diff in the pull request: https://github.com/apache/flink/pull/452#discussion_r26928420 --- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/EuclideanGraphExample.java --- @@ -0,0 +1,214 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.graph.example; + +import org.apache.flink.api.common.ProgramDescription; +import org.apache.flink.api.common.functions.MapFunction; +import org.apache.flink.api.java.DataSet; +import org.apache.flink.api.java.ExecutionEnvironment; +import org.apache.flink.api.java.tuple.Tuple2; +import org.apache.flink.api.java.tuple.Tuple3; +import org.apache.flink.graph.Edge; +import org.apache.flink.graph.Graph; +import org.apache.flink.graph.Triplet; +import org.apache.flink.graph.Vertex; +import org.apache.flink.graph.example.utils.EuclideanGraphData; + +import java.io.Serializable; + +/** + * Given a directed, unweighted graph, with vertex values representing points in a plane, + * return a weighted graph where the edge weights are equal to the Euclidean distance between the + * src and the trg vertex values. + * + * <p> + * Input files are plain text files and must be formatted as follows: + * <ul> + * <li> Vertices are represented by their vertexIds and vertex values and are separated by newlines, + * the value being formed of two doubles separated by a comma. + * For example: <code>1,1.0,1.0\n2,2.0,2.0\n3,3.0,3.0\n</code> defines a data set of three vertices + * <li> Edges are represented by triples of srcVertexId, srcEdgeId, weight which are + * separated by commas. Edges themselves are separated by newlines. The initial edge value will be overwritten. --- End diff -- Hey @andralungu! Sorry for not noticing this earlier, but why give a weighted edge as input if you're going to override the weight anyway? And in the beginning of the description, you refer to an unweighted graph.
[jira] [Resolved] (FLINK-1373) Add documentation for intermediate results and network stack to internals
[ https://issues.apache.org/jira/browse/FLINK-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ufuk Celebi resolved FLINK-1373. Resolution: Fixed Fixed in https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks Add documentation for intermediate results and network stack to internals - Key: FLINK-1373 URL: https://issues.apache.org/jira/browse/FLINK-1373 Project: Flink Issue Type: Improvement Components: Distributed Runtime, Documentation Reporter: Ufuk Celebi Assignee: Ufuk Celebi There is a short overview in the respective pull request. https://github.com/apache/flink/pull/254
[jira] [Commented] (FLINK-1319) Add static code analysis for UDFs
[ https://issues.apache.org/jira/browse/FLINK-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376003#comment-14376003 ] Ufuk Celebi commented on FLINK-1319: Hey Timo, great news! :-) 1. What about adding it to staging? 2. I would very much like to have this on by default in the future. But I agree that we should not do this now. We really need to be certain that we don't introduce wrong annotations. They might cause some hard-to-understand problems for new users when enabled by default. As a first step it makes sense to make this as explicit as possible, for example with an {{optimizeUdf()}} method as you propose. Add static code analysis for UDFs - Key: FLINK-1319 URL: https://issues.apache.org/jira/browse/FLINK-1319 Project: Flink Issue Type: New Feature Components: Java API, Scala API Reporter: Stephan Ewen Assignee: Timo Walther Priority: Minor Flink's Optimizer takes information that tells it for UDFs which fields of the input elements are accessed, modified, or forwarded/copied. This information frequently helps to reuse partitionings, sorts, etc. It may speed up programs significantly, as it can frequently eliminate sorts and shuffles, which are costly. Right now, users can add lightweight annotations to UDFs to provide this information (such as adding {{@ConstantFields(0->3, 1, 2->1)}}). We worked with static code analysis of UDFs before, to determine this information automatically. This is an incredible feature, as it magically makes programs faster. For record-at-a-time operations (Map, Reduce, FlatMap, Join, Cross), this works surprisingly well in many cases. We used the Soot toolkit for the static code analysis. Unfortunately, Soot is LGPL licensed and thus we did not include any of the code so far. I propose to add this functionality to Flink, in the form of a drop-in addition, to work around the LGPL incompatibility with ASL 2.0. Users could simply download a special flink-code-analysis.jar and drop it into the lib folder to enable this functionality. We may even add a script to tools that downloads that library automatically into the lib folder. This should be legally fine, since we do not redistribute LGPL code and only dynamically link it (the incompatibility with ASL 2.0 is mainly in the patentability, if I remember correctly). Prior work on this has been done by [~aljoscha] and [~skunert], which could provide a code base to start with. *Appendix* Homepage of the Soot static analysis toolkit: http://www.sable.mcgill.ca/soot/ Papers on static analysis and for optimization: http://stratosphere.eu/assets/papers/EnablingOperatorReorderingSCA_12.pdf and http://stratosphere.eu/assets/papers/openingTheBlackBoxes_12.pdf Quick introduction to the Optimizer: http://stratosphere.eu/assets/papers/2014-VLDBJ_Stratosphere_Overview.pdf (Section 6) Optimizer for Iterations: http://stratosphere.eu/assets/papers/spinningFastIterativeDataFlows_12.pdf (Sections 4.3 and 5.3)
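For readers unfamiliar with the annotations discussed above, this is the shape of a forwarded-field hint on a UDF. The @ConstantFields class name and its string syntax follow my recollection of that era's API, so treat the exact names as illustrative:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.functions.FunctionAnnotation.ConstantFields;
    import org.apache.flink.api.java.tuple.Tuple2;

    // Declares that input field 0 reaches output field 0 unchanged, so the
    // optimizer may reuse an existing partitioning or sort order on that field.
    @ConstantFields("0")
    public class AddOne implements MapFunction<Tuple2<Long, Integer>, Tuple2<Long, Integer>> {
        @Override
        public Tuple2<Long, Integer> map(Tuple2<Long, Integer> value) {
            return new Tuple2<Long, Integer>(value.f0, value.f1 + 1);
        }
    }

The static analysis proposed in the issue would derive exactly this kind of annotation automatically instead of relying on the user to write it.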
[jira] [Commented] (FLINK-1744) Change the reference of slaves to workers to match the description of the system
[ https://issues.apache.org/jira/browse/FLINK-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376010#comment-14376010 ] Henry Saputra commented on FLINK-1744: -- If this is too much of a distraction for now, I could just close it and maybe revisit it in the future. Change the reference of slaves to workers to match the description of the system Key: FLINK-1744 URL: https://issues.apache.org/jira/browse/FLINK-1744 Project: Flink Issue Type: Improvement Components: core, Documentation Reporter: Henry Saputra Priority: Trivial There are some references to slaves which actually mean workers. Need to change it to use workers whenever possible, unless it is needed when communicating with an external system like Apache Hadoop
[jira] [Commented] (FLINK-1744) Change the reference of slaves to workers to match the description of the system
[ https://issues.apache.org/jira/browse/FLINK-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376033#comment-14376033 ] Robert Metzger commented on FLINK-1744: --- Oh ... sorry. I didn't want to ask you to close the issue. I'm not sure how the other committers think about this. Maybe others agree with the change. I'm hesitant here because all changes to scripts, config files, etc. often break a lot of things (automated testing setups, docker/VM scripts, debian/rpm packages, other packaging). Even worse, these things are hard to test. Change the reference of slaves to workers to match the description of the system Key: FLINK-1744 URL: https://issues.apache.org/jira/browse/FLINK-1744 Project: Flink Issue Type: Improvement Components: core, Documentation Reporter: Henry Saputra Priority: Trivial There are some references to slaves which actually mean workers. Need to change it to use workers whenever possible, unless it is needed when communicating with an external system like Apache Hadoop
[jira] [Closed] (FLINK-1744) Change the reference of slaves to workers to match the description of the system
[ https://issues.apache.org/jira/browse/FLINK-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Henry Saputra closed FLINK-1744. Resolution: Won't Fix Punting it to the backlog to avoid confusion Change the reference of slaves to workers to match the description of the system Key: FLINK-1744 URL: https://issues.apache.org/jira/browse/FLINK-1744 Project: Flink Issue Type: Improvement Components: core, Documentation Reporter: Henry Saputra Priority: Trivial There are some references to slaves which actually mean workers. Need to change it to use workers whenever possible, unless it is needed when communicating with an external system like Apache Hadoop
[jira] [Commented] (FLINK-1744) Change the reference of slaves to workers to match the description of the system
[ https://issues.apache.org/jira/browse/FLINK-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375972#comment-14375972 ] Henry Saputra commented on FLINK-1744: -- It is common, but it comes from the old terminology of master-slave architecture. The term Flink uses is worker for the nodes that do all the work. I filed this JIRA to see if it is better for Flink to match the terms used in the project. Change the reference of slaves to workers to match the description of the system Key: FLINK-1744 URL: https://issues.apache.org/jira/browse/FLINK-1744 Project: Flink Issue Type: Improvement Components: core, Documentation Reporter: Henry Saputra There are some references to slaves which actually mean workers. Need to change it to use workers whenever possible, unless it is needed when communicating with an external system like Apache Hadoop
[jira] [Created] (FLINK-1771) Add tests for setting the processing slots for the YARN client
Robert Metzger created FLINK-1771: - Summary: Add tests for setting the processing slots for the YARN client Key: FLINK-1771 URL: https://issues.apache.org/jira/browse/FLINK-1771 Project: Flink Issue Type: Improvement Components: YARN Client Reporter: Robert Metzger Assignee: Robert Metzger We need tests ensuring that the processing slots are set properly when starting Flink on YARN, in particular with the per-job YARN session feature. Also, the YARN tests for detached YARN sessions / per-job YARN clusters are polluting the local home directory.
[jira] [Updated] (FLINK-1771) Add tests for setting the processing slots for the YARN client
[ https://issues.apache.org/jira/browse/FLINK-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1771: -- Affects Version/s: 0.9 Add tests for setting the processing slots for the YARN client -- Key: FLINK-1771 URL: https://issues.apache.org/jira/browse/FLINK-1771 Project: Flink Issue Type: Improvement Components: YARN Client Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger We need tests ensuring that the processing slots are set properly when starting Flink on YARN, in particular with the per-job YARN session feature. Also, the YARN tests for detached YARN sessions / per-job YARN clusters are polluting the local home directory.
[GitHub] flink pull request: [FLINK-1650] Configure Netty (akka) to use Slf...
GitHub user rmetzger opened a pull request: https://github.com/apache/flink/pull/518 [FLINK-1650] Configure Netty (akka) to use Slf4j. It seems that Netty 3.8.0 used by Akka was using `java.util.logging` for its internal logging, which is why this entry had no effect: https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/conf/log4j.properties#L29. The change sets the Netty logging factory to Slf4j. This means that Netty is now using Slf4j. It'll also respect our logging settings (`org.jboss.netty.channel.DefaultChannelPipeline` to level ERROR). Let's see if the exception now disappears. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rmetzger/flink flink1650 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/518.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #518 commit d4778d4d51cd20db2861b3632e64aeac1f5f6f7a Author: Robert Metzger rmetz...@apache.org Date: 2015-03-23T10:42:21Z [FLINK-1650] Set akka version to 2.3.9 commit 69b6a9b497955544e0e5c8707293c580e1ba3d5e Author: Robert Metzger rmetz...@apache.org Date: 2015-03-23T13:44:19Z [FLINK-1650] Let Netty(Akka) use Slf4j
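The core of the change is a one-line switch of Netty 3.x's pluggable logging factory; a minimal sketch of the mechanism (not the PR's exact code):

    import org.jboss.netty.logging.InternalLoggerFactory;
    import org.jboss.netty.logging.Slf4JLoggerFactory;

    public class NettyLoggingSetup {
        static {
            // Route Netty 3.x internal logging through SLF4J instead of
            // java.util.logging, so that entries such as
            //   log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR
            // in log4j.properties actually take effect.
            InternalLoggerFactory.setDefaultFactory(new Slf4JLoggerFactory());
        }
    }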
[jira] [Commented] (FLINK-1650) Suppress Akka's Netty Shutdown Errors through the log config
[ https://issues.apache.org/jira/browse/FLINK-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375907#comment-14375907 ] ASF GitHub Bot commented on FLINK-1650: --- GitHub user rmetzger opened a pull request: https://github.com/apache/flink/pull/518 [FLINK-1650] Configure Netty (akka) to use Slf4j. It seems that Netty 3.8.0 used by Akka was using `java.util.logging` for its internal logging, which is why this entry had no effect: https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/conf/log4j.properties#L29. The change sets the Netty logging factory to Slf4j. This means that Netty is now using Slf4j. It'll also respect our logging settings (`org.jboss.netty.channel.DefaultChannelPipeline` to level ERROR). Let's see if the exception now disappears. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rmetzger/flink flink1650 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/518.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #518 commit d4778d4d51cd20db2861b3632e64aeac1f5f6f7a Author: Robert Metzger rmetz...@apache.org Date: 2015-03-23T10:42:21Z [FLINK-1650] Set akka version to 2.3.9 commit 69b6a9b497955544e0e5c8707293c580e1ba3d5e Author: Robert Metzger rmetz...@apache.org Date: 2015-03-23T13:44:19Z [FLINK-1650] Let Netty(Akka) use Slf4j Suppress Akka's Netty Shutdown Errors through the log config Key: FLINK-1650 URL: https://issues.apache.org/jira/browse/FLINK-1650 Project: Flink Issue Type: Bug Components: other Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 I suggest setting the logging for `org.jboss.netty.channel.DefaultChannelPipeline` to ERROR, in order to get rid of the misleading stack trace caused by an akka/netty hiccup on shutdown.
[GitHub] flink pull request: FLINK-1560 - Add ITCases for streaming example...
GitHub user szape opened a pull request: https://github.com/apache/flink/pull/519 FLINK-1560 - Add ITCases for streaming examples ITCases for streaming examples have been added, except for IterateExample and StockPrices. I replaced the IterateExample with one that generates Fibonacci sequences. Something is buggy around iteration, so the ITCase has to wait. StockPrices is untestable because of the windowJoin operator, hence it will not have an ITCase right now. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mbalassi/flink FLINK-1560 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/519.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #519 commit 16da0d092ddc192afdae3a3c5a0a57144390c8ff Author: szape nemderogator...@gmail.com Date: 2015-03-05T15:27:44Z [Flink-1560] [streaming] Streaming examples rework commit 45fe9bd3381ef9cf99e8c9f0570e66e38d78b383 Author: szape nemderogator...@gmail.com Date: 2015-03-05T15:30:31Z [Flink-1560] [streaming] Added ITCases to streaming examples commit 237f08d3eaf2da36c8a9d48145b2544dcbb41012 Author: szape nemderogator...@gmail.com Date: 2015-03-23T09:28:37Z [FLINK-1560] [streaming] Added iterate example
[jira] [Commented] (FLINK-1560) Add ITCases for streaming examples
[ https://issues.apache.org/jira/browse/FLINK-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375914#comment-14375914 ] ASF GitHub Bot commented on FLINK-1560: --- GitHub user szape opened a pull request: https://github.com/apache/flink/pull/519 FLINK-1560 - Add ITCases for streaming examples ITCases for streaming examples have been added, except for IterateExample and StockPrices. I replaced the IterateExample with one that generates Fibonacci sequences. Something is buggy around iteration, so the ITCase has to wait. StockPrices is untestable because of the windowJoin operator, hence it will not have an ITCase right now. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mbalassi/flink FLINK-1560 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/519.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #519 commit 16da0d092ddc192afdae3a3c5a0a57144390c8ff Author: szape nemderogator...@gmail.com Date: 2015-03-05T15:27:44Z [Flink-1560] [streaming] Streaming examples rework commit 45fe9bd3381ef9cf99e8c9f0570e66e38d78b383 Author: szape nemderogator...@gmail.com Date: 2015-03-05T15:30:31Z [Flink-1560] [streaming] Added ITCases to streaming examples commit 237f08d3eaf2da36c8a9d48145b2544dcbb41012 Author: szape nemderogator...@gmail.com Date: 2015-03-23T09:28:37Z [FLINK-1560] [streaming] Added iterate example Add ITCases for streaming examples -- Key: FLINK-1560 URL: https://issues.apache.org/jira/browse/FLINK-1560 Project: Flink Issue Type: Test Components: Streaming Affects Versions: 0.9 Reporter: Márton Balassi Assignee: Péter Szabó Currently there are no tests for consistency of the streaming example programs. This might be a real show stopper for users who encounter an issue there.
[GitHub] flink pull request: FLINK-1560 - Add ITCases for streaming example...
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/519#discussion_r26938775 --- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/test/java/org/apache/flink/streaming/examples/test/socket/SocketTextStreamWordCountITCase.java --- @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.examples.test.socket; + +import org.apache.flink.streaming.examples.socket.SocketTextStreamWordCount; +import org.apache.flink.streaming.examples.socket.util.SocketTextStreamWordCountData; +import org.apache.flink.streaming.util.StreamingProgramTestBase; + +import java.io.PrintWriter; +import java.net.ServerSocket; +import java.net.Socket; + +public class SocketTextStreamWordCountITCase extends StreamingProgramTestBase { + + private static final String HOST = "localhost"; + private static final String PORT = ; + protected String resultPath; + + private ServerSocket temporarySocket; + + @Override + protected void preSubmit() throws Exception { + temporarySocket = createSocket(HOST, Integer.valueOf(PORT), SocketTextStreamWordCountData.SOCKET_TEXT); + resultPath = getTempDirPath("result"); + } + + @Override + protected void postSubmit() throws Exception { + compareResultsByLinesInMemory(SocketTextStreamWordCountData.STREAMING_COUNTS_AS_TUPLES, resultPath); + temporarySocket.close(); + } + + @Override + protected void testProgram() throws Exception { + SocketTextStreamWordCount.main(new String[]{HOST, PORT, resultPath}); + } + + public ServerSocket createSocket(String host, int port, String contents) throws Exception { + ServerSocket serverSocket = new ServerSocket(port); + ServerThread st = new ServerThread(serverSocket, contents); + st.start(); + return serverSocket; + } + + private static class ServerThread extends Thread { + + private ServerSocket serverSocket; + private String contents; + private Thread t; + + public ServerThread(ServerSocket serverSocket, String contents) { + this.serverSocket = serverSocket; + this.contents = contents; + t = new Thread(this); + } + + public void waitForAccept() throws Exception { + Socket socket = serverSocket.accept(); + PrintWriter writer = new PrintWriter(socket.getOutputStream(), true); + writer.println(contents); + writer.close(); + socket.close(); + } + + public void run() { + try { + waitForAccept(); + } catch (Exception e) { + e.printStackTrace(); --- End diff -- I would fail the test in case of an exception.
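On the review comment above: an exception thrown on the server thread is invisible to JUnit, so printStackTrace() silently swallows the failure. One way to surface it, sketched here with names loosely matching the quoted test:

    import org.junit.Assert;

    class FailFastServerThread extends Thread {

        private volatile Throwable error;

        @Override
        public void run() {
            try {
                // ... accept the connection and write the contents, as in the quoted test
            } catch (Exception e) {
                error = e; // JUnit cannot observe exceptions from other threads directly
            }
        }

        // Call from the test's main thread, e.g. at the start of postSubmit().
        void checkError() {
            if (error != null) {
                Assert.fail("Server thread failed: " + error);
            }
        }
    }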
[jira] [Commented] (FLINK-1560) Add ITCases for streaming examples
[ https://issues.apache.org/jira/browse/FLINK-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375924#comment-14375924 ] ASF GitHub Bot commented on FLINK-1560: --- Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/519#discussion_r26938775 --- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/test/java/org/apache/flink/streaming/examples/test/socket/SocketTextStreamWordCountITCase.java --- @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.examples.test.socket; + +import org.apache.flink.streaming.examples.socket.SocketTextStreamWordCount; +import org.apache.flink.streaming.examples.socket.util.SocketTextStreamWordCountData; +import org.apache.flink.streaming.util.StreamingProgramTestBase; + +import java.io.PrintWriter; +import java.net.ServerSocket; +import java.net.Socket; + +public class SocketTextStreamWordCountITCase extends StreamingProgramTestBase { + + private static final String HOST = "localhost"; + private static final String PORT = ; + protected String resultPath; + + private ServerSocket temporarySocket; + + @Override + protected void preSubmit() throws Exception { + temporarySocket = createSocket(HOST, Integer.valueOf(PORT), SocketTextStreamWordCountData.SOCKET_TEXT); + resultPath = getTempDirPath("result"); + } + + @Override + protected void postSubmit() throws Exception { + compareResultsByLinesInMemory(SocketTextStreamWordCountData.STREAMING_COUNTS_AS_TUPLES, resultPath); + temporarySocket.close(); + } + + @Override + protected void testProgram() throws Exception { + SocketTextStreamWordCount.main(new String[]{HOST, PORT, resultPath}); + } + + public ServerSocket createSocket(String host, int port, String contents) throws Exception { + ServerSocket serverSocket = new ServerSocket(port); + ServerThread st = new ServerThread(serverSocket, contents); + st.start(); + return serverSocket; + } + + private static class ServerThread extends Thread { + + private ServerSocket serverSocket; + private String contents; + private Thread t; + + public ServerThread(ServerSocket serverSocket, String contents) { + this.serverSocket = serverSocket; + this.contents = contents; + t = new Thread(this); + } + + public void waitForAccept() throws Exception { + Socket socket = serverSocket.accept(); + PrintWriter writer = new PrintWriter(socket.getOutputStream(), true); + writer.println(contents); + writer.close(); + socket.close(); + } + + public void run() { + try { + waitForAccept(); + } catch (Exception e) { + e.printStackTrace(); --- End diff -- I would fail the test in case of an exception. 
Add ITCases for streaming examples -- Key: FLINK-1560 URL: https://issues.apache.org/jira/browse/FLINK-1560 Project: Flink Issue Type: Test Components: Streaming Affects Versions: 0.9 Reporter: Márton Balassi Assignee: Péter Szabó Currently there are no tests for consistency of the streaming example programs. This might be a real show stopper for users who encounter an issue there.
[GitHub] flink pull request: [FLINK-1595] [streaming] Complex integration t...
GitHub user szape opened a pull request: https://github.com/apache/flink/pull/520 [FLINK-1595] [streaming] Complex integration test wip Flink Streaming's complex integration test will test interactions of different streaming operators and settings. The test topology will run in one environment but will be divided into several subgraphs, tested separately from one another. In this way, many new tests can be added later. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mbalassi/flink FLINK-1595 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/520.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #520 commit e24f24354add0800623189a4af70b3f04164469c Author: szape nemderogator...@gmail.com Date: 2015-03-23T09:48:12Z [FLINK-1595] [streaming] Complex integration test wip
[jira] [Commented] (FLINK-1560) Add ITCases for streaming examples
[ https://issues.apache.org/jira/browse/FLINK-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375938#comment-14375938 ] ASF GitHub Bot commented on FLINK-1560: --- Github user mbalassi commented on a diff in the pull request: https://github.com/apache/flink/pull/519#discussion_r26939215 --- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/iteration/IterateExample.java --- @@ -103,57 +115,124 @@ public static void main(String[] args) throws Exception { // * /** -* Iteration step function which takes an input (Double, Integer) and -* produces an output (Double + random, Integer + 1). +* Generate random integer pairs from the range from 0 to BOUND/2 +*/ + private static class RandomFibonacciSource implements SourceFunction<Tuple2<Integer, Integer>> { + private static final long serialVersionUID = 1L; + + private Random rnd = new Random(); + + @Override + public void run(Collector<Tuple2<Integer, Integer>> collector) throws Exception { + while(true) { + int first = rnd.nextInt(BOUND/2 - 1) + 1; + int second = rnd.nextInt(BOUND/2 - 1) + 1; + + collector.collect(new Tuple2<Integer, Integer>(first, second)); + Thread.sleep(100L); + } + } + + @Override + public void cancel() { + // no cleanup needed + } + } + + /** +* Generate random integer pairs from the range from 0 to BOUND/2 */ - public static class Step extends - RichMapFunction<Tuple2<Double, Integer>, Tuple2<Double, Integer>> { + private static class FibonacciInputMap implements MapFunction<String, Tuple2<Integer, Integer>> { private static final long serialVersionUID = 1L; - private transient Random rnd; - public void open(Configuration parameters) { - rnd = new Random(); + @Override + public Tuple2<Integer, Integer> map(String value) throws Exception { + Thread.sleep(100L); + String record = value.substring(1, value.length()-1); + String[] splitted = record.split(","); + return new Tuple2<Integer, Integer>(Integer.parseInt(splitted[0]), Integer.parseInt(splitted[1])); } + } + + /** +* Map the inputs so that the next Fibonacci numbers can be calculated while preserving the original input tuple +* A counter is attached to the tuple and incremented in every iteration step +*/ + public static class InputMap implements MapFunction<Tuple2<Integer, Integer>, Tuple5<Integer, Integer, Integer, Integer, Integer>> { + + @Override + public Tuple5<Integer, Integer, Integer, Integer, Integer> map(Tuple2<Integer, Integer> value) throws + Exception { + return new Tuple5<Integer, Integer, Integer, Integer, Integer>(value.f0, value.f1, value.f0, value.f1, 0); + } + } + + /** +* Iteration step function that calculates the next Fibonacci number +*/ + public static class Step implements + MapFunction<Tuple5<Integer, Integer, Integer, Integer, Integer>, Tuple5<Integer, Integer, Integer, Integer, Integer>> { + private static final long serialVersionUID = 1L; @Override - public Tuple2<Double, Integer> map(Tuple2<Double, Integer> value) throws Exception { - return new Tuple2<Double, Integer>(value.f0 + rnd.nextDouble(), value.f1 + 1); + public Tuple5<Integer, Integer, Integer, Integer, Integer> map(Tuple5<Integer, Integer, Integer, Integer, Integer> value) throws Exception { + return new Tuple5<Integer, Integer, Integer, Integer, Integer>(value.f0, value.f1, value.f3, value.f2 + value.f3, ++value.f4); } } /** * OutputSelector testing which tuple needs to be iterated again. 
*/ - public static class MySelector implements OutputSelector<Tuple2<Double, Integer>> { + public static class MySelector implements OutputSelector<Tuple5<Integer, Integer, Integer, Integer, Integer>> { private static final long serialVersionUID = 1L; @Override - public Iterable<String> select(Tuple2<Double, Integer> value) { + public
[GitHub] flink pull request: FLINK-1560 - Add ITCases for streaming example...
Github user mbalassi commented on a diff in the pull request: https://github.com/apache/flink/pull/519#discussion_r26939215 --- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/iteration/IterateExample.java --- @@ -103,57 +115,124 @@ public static void main(String[] args) throws Exception { // * /** -* Iteration step function which takes an input (Double, Integer) and -* produces an output (Double + random, Integer + 1). +* Generate random integer pairs from the range from 0 to BOUND/2 +*/ + private static class RandomFibonacciSource implements SourceFunction<Tuple2<Integer, Integer>> { + private static final long serialVersionUID = 1L; + + private Random rnd = new Random(); + + @Override + public void run(Collector<Tuple2<Integer, Integer>> collector) throws Exception { + while(true) { + int first = rnd.nextInt(BOUND/2 - 1) + 1; + int second = rnd.nextInt(BOUND/2 - 1) + 1; + + collector.collect(new Tuple2<Integer, Integer>(first, second)); + Thread.sleep(100L); + } + } + + @Override + public void cancel() { + // no cleanup needed + } + } + + /** +* Generate random integer pairs from the range from 0 to BOUND/2 */ - public static class Step extends - RichMapFunction<Tuple2<Double, Integer>, Tuple2<Double, Integer>> { + private static class FibonacciInputMap implements MapFunction<String, Tuple2<Integer, Integer>> { private static final long serialVersionUID = 1L; - private transient Random rnd; - public void open(Configuration parameters) { - rnd = new Random(); + @Override + public Tuple2<Integer, Integer> map(String value) throws Exception { + Thread.sleep(100L); + String record = value.substring(1, value.length()-1); + String[] splitted = record.split(","); + return new Tuple2<Integer, Integer>(Integer.parseInt(splitted[0]), Integer.parseInt(splitted[1])); } + } + + /** +* Map the inputs so that the next Fibonacci numbers can be calculated while preserving the original input tuple +* A counter is attached to the tuple and incremented in every iteration step +*/ + public static class InputMap implements MapFunction<Tuple2<Integer, Integer>, Tuple5<Integer, Integer, Integer, Integer, Integer>> { + + @Override + public Tuple5<Integer, Integer, Integer, Integer, Integer> map(Tuple2<Integer, Integer> value) throws + Exception { + return new Tuple5<Integer, Integer, Integer, Integer, Integer>(value.f0, value.f1, value.f0, value.f1, 0); + } + } + + /** +* Iteration step function that calculates the next Fibonacci number +*/ + public static class Step implements + MapFunction<Tuple5<Integer, Integer, Integer, Integer, Integer>, Tuple5<Integer, Integer, Integer, Integer, Integer>> { + private static final long serialVersionUID = 1L; @Override - public Tuple2<Double, Integer> map(Tuple2<Double, Integer> value) throws Exception { - return new Tuple2<Double, Integer>(value.f0 + rnd.nextDouble(), value.f1 + 1); + public Tuple5<Integer, Integer, Integer, Integer, Integer> map(Tuple5<Integer, Integer, Integer, Integer, Integer> value) throws Exception { + return new Tuple5<Integer, Integer, Integer, Integer, Integer>(value.f0, value.f1, value.f3, value.f2 + value.f3, ++value.f4); } } /** * OutputSelector testing which tuple needs to be iterated again. 
*/ - public static class MySelector implements OutputSelector<Tuple2<Double, Integer>> { + public static class MySelector implements OutputSelector<Tuple5<Integer, Integer, Integer, Integer, Integer>> { private static final long serialVersionUID = 1L; @Override - public Iterable<String> select(Tuple2<Double, Integer> value) { + public Iterable<String> select(Tuple5<Integer, Integer, Integer, Integer, Integer> value) { List<String> output = new ArrayList<String>(); - if (value.f0 > 100) { - output.add("output"); -
[jira] [Commented] (FLINK-1560) Add ITCases for streaming examples
[ https://issues.apache.org/jira/browse/FLINK-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375943#comment-14375943 ] ASF GitHub Bot commented on FLINK-1560: --- Github user mbalassi commented on a diff in the pull request: https://github.com/apache/flink/pull/519#discussion_r26939379 --- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/socket/util/SocketTextStreamWordCountData.java --- @@ -0,0 +1,196 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.examples.socket.util; + +public class SocketTextStreamWordCountData { + --- End diff -- I think we can simply use the wordcount data here, can't we? Add ITCases for streaming examples -- Key: FLINK-1560 URL: https://issues.apache.org/jira/browse/FLINK-1560 Project: Flink Issue Type: Test Components: Streaming Affects Versions: 0.9 Reporter: Márton Balassi Assignee: Péter Szabó Currently there are no tests for consistency of the streaming example programs. This might be a real show stopper for users who encounter an issue there.
[jira] [Commented] (FLINK-1560) Add ITCases for streaming examples
[ https://issues.apache.org/jira/browse/FLINK-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375950#comment-14375950 ] ASF GitHub Bot commented on FLINK-1560: --- Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/519#issuecomment-85019063 Please update this for parallelism of 4. Besides that and the inline comments, looks good. Add ITCases for streaming examples -- Key: FLINK-1560 URL: https://issues.apache.org/jira/browse/FLINK-1560 Project: Flink Issue Type: Test Components: Streaming Affects Versions: 0.9 Reporter: Márton Balassi Assignee: Péter Szabó Currently there are no tests for consistency of the streaming example programs. This might be a real show stopper for users who encounter an issue there.
[GitHub] flink pull request: FLINK-1560 - Add ITCases for streaming example...
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/519#issuecomment-85019063 Please update this for parallelism of 4. Besides that and the inline comments, looks good.
[jira] [Commented] (FLINK-1687) Streaming file source/sink API is not in sync with the batch API
[ https://issues.apache.org/jira/browse/FLINK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375954#comment-14375954 ] ASF GitHub Bot commented on FLINK-1687: --- GitHub user szape opened a pull request: https://github.com/apache/flink/pull/521 [FLINK-1687] [streaming] Syncing streaming source API with batch source API It is important to keep the streaming and the batch user-facing APIs synchronised with regard to the source and sink functions (even if the inner workings are different). Most of the source functions are done here. Because this is not a high-priority issue, sink functions will be done later on. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mbalassi/flink FLINK-1687 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/521.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #521 commit 5fe5a3592ea21ca2016919dbc6c67330cf41dd7b Author: szape nemderogator...@gmail.com Date: 2015-03-23T10:35:11Z [FLINK-1687] [streaming] Syncing streaming source API with batch source API wip commit f8294897a90c21b762d422b8fe6f673acc228d9c Author: szape nemderogator...@gmail.com Date: 2015-03-23T10:37:04Z [FLINK-1687] [streaming] Source example wip Streaming file source/sink API is not in sync with the batch API Key: FLINK-1687 URL: https://issues.apache.org/jira/browse/FLINK-1687 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gábor Hermann Assignee: Péter Szabó The streaming environment is missing file inputs like readFile, readCsvFile and also the more general createInput function, and outputs like writeAsCsv and write. The streaming and batch APIs should be consistent.
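To make the gap concrete: the batch environment already exposes the file readers the issue names, while the streaming environment of that time only had a plain text reader. A sketch of the asymmetry, assuming the 0.9-era environments (method availability varied by version):

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class SourceApiGapSketch {
        public static void main(String[] args) throws Exception {
            // Batch side: readTextFile, readCsvFile, and createInput all exist.
            ExecutionEnvironment batch = ExecutionEnvironment.getExecutionEnvironment();
            DataSet<String> batchLines = batch.readTextFile("hdfs:///input");

            // Streaming side: readTextFile exists, but readCsvFile, createInput,
            // and writeAsCsv are the counterparts this PR starts to add.
            StreamExecutionEnvironment streaming = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<String> streamLines = streaming.readTextFile("hdfs:///input");
        }
    }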
[jira] [Commented] (FLINK-1544) Extend streaming aggregation tests to include POJOs
[ https://issues.apache.org/jira/browse/FLINK-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375959#comment-14375959 ] ASF GitHub Bot commented on FLINK-1544: --- Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/517#issuecomment-85020206 LGTM, will merge in the evening. Extend streaming aggregation tests to include POJOs --- Key: FLINK-1544 URL: https://issues.apache.org/jira/browse/FLINK-1544 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gyula Fora Assignee: Péter Szabó Labels: starter Currently the streaming aggregation tests don't test POJO aggregations, which makes newly introduced bugs harder to detect. New tests should be added.
[GitHub] flink pull request: [FLINK-1544] [streaming] POJO types added to A...
Github user mbalassi commented on the pull request: https://github.com/apache/flink/pull/517#issuecomment-85020206 LGTM, will merge in the evening.
[jira] [Commented] (FLINK-1544) Extend streaming aggregation tests to include POJOs
[ https://issues.apache.org/jira/browse/FLINK-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375897#comment-14375897 ] ASF GitHub Bot commented on FLINK-1544: --- GitHub user szape opened a pull request: https://github.com/apache/flink/pull/517 [FLINK-1544] [streaming] POJO types added to AggregationFunctionTest Testing aggregation functions with POJO types. The AggregationFunctionTest covered min, max, sum, minBy and maxBy aggregations for tuples. I added the same tests for POJOs. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mbalassi/flink FLINK-1544 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/517.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #517 commit 2def2d98f502487e548adf24406ecf2f2489e71e Author: szape nemderogator...@gmail.com Date: 2015-03-11T14:36:15Z [FLINK-1544] [streaming] POJO types added to AggregationFunctionTest Extend streaming aggregation tests to include POJOs --- Key: FLINK-1544 URL: https://issues.apache.org/jira/browse/FLINK-1544 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gyula Fora Assignee: Péter Szabó Labels: starter Currently the streaming aggregation tests don't test POJO aggregations, which makes newly introduced bugs harder to detect. New tests should be added.
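For context, the aggregations under test select fields by name when the stream element is a POJO instead of by tuple position. A sketch of what the new tests exercise; the grouping and aggregation method names follow the 0.9-era streaming API as I recall it, and the Event type is illustrative:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class PojoAggregationSketch {

        // Illustrative POJO with public fields and a no-arg constructor.
        public static class Event {
            public String key;
            public int value;
            public Event() {}
            public Event(String key, int value) { this.key = key; this.value = value; }
        }

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<Event> events = env.fromElements(new Event("a", 1), new Event("a", 2));
            // Field-expression aggregations on POJO fields, mirroring the
            // tuple-position variants (sum(0), minBy(1), ...) already tested.
            events.groupBy("key").sum("value").print();
            events.groupBy("key").minBy("value").print();
            env.execute("POJO aggregation sketch");
        }
    }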
[jira] [Commented] (FLINK-1595) Add a complex integration test for Streaming API
[ https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375930#comment-14375930 ] ASF GitHub Bot commented on FLINK-1595: --- GitHub user szape opened a pull request: https://github.com/apache/flink/pull/520 [FLINK-1595] [streaming] Complex integration test wip Flink Streaming's complex integration test will test the interactions of different streaming operators and settings. The test topology will run in one environment but will be divided into several subgraphs, tested separately from one another. In this way, many new tests can be added later. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mbalassi/flink FLINK-1595 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/520.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #520 commit e24f24354add0800623189a4af70b3f04164469c Author: szape nemderogator...@gmail.com Date: 2015-03-23T09:48:12Z [FLINK-1595] [streaming] Complex integration test wip Add a complex integration test for Streaming API Key: FLINK-1595 URL: https://issues.apache.org/jira/browse/FLINK-1595 Project: Flink Issue Type: Task Components: Streaming Reporter: Gyula Fora Assignee: Péter Szabó Labels: Starter The streaming tests currently lack a sophisticated integration test that exercises many API features at once. This should include different merging, partitioning, grouping and aggregation types, as well as windowing and connected operators. The results should be tested for correctness. A test like this would help identify bugs that are hard to detect with unit tests.
[jira] [Commented] (FLINK-1560) Add ITCases for streaming examples
[ https://issues.apache.org/jira/browse/FLINK-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375945#comment-14375945 ] ASF GitHub Bot commented on FLINK-1560: --- Github user mbalassi commented on a diff in the pull request: https://github.com/apache/flink/pull/519#discussion_r26939465 --- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/twitter/TwitterStream.java ---
@@ -176,11 +176,14 @@ private static boolean parseParameters(String[] args) {
 		// parse input arguments
 		fileOutput = true;
 		if (args.length == 2) {
+			fileInput = true;
 			textPath = args[0];
 			outputPath = args[1];
+		} else if (args.length == 1) {
+			outputPath = args[0];
 		} else {
 			System.err.println("USAGE:\nTwitterStream <pathToPropertiesFile> <result path>");
-			return false;
+			return true;
 		}
--- End diff --
Why return true here? This way this function always returns true.
Add ITCases for streaming examples -- Key: FLINK-1560 URL: https://issues.apache.org/jira/browse/FLINK-1560 Project: Flink Issue Type: Test Components: Streaming Affects Versions: 0.9 Reporter: Márton Balassi Assignee: Péter Szabó Currently there are no tests for the consistency of the streaming example programs. This might be a real showstopper for users who encounter an issue there.
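The fix the reviewer is pointing at is to keep the error return, so that parseParameters can actually signal a bad invocation; a sketch of the corrected branch:

    } else {
        System.err.println("USAGE:\nTwitterStream <pathToPropertiesFile> <result path>");
        // returning true here would make parseParameters succeed unconditionally;
        // the usage-error branch has to return false
        return false;
    }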
[GitHub] flink pull request: Cleanup of low level Kafka consumer (Persisten...
Github user gaborhermann closed the pull request at: https://github.com/apache/flink/pull/474
[GitHub] flink pull request: Cleanup of low level Kafka consumer (Persisten...
Github user gaborhermann commented on the pull request: https://github.com/apache/flink/pull/474#issuecomment-85014101 I agree. Further work should be done elsewhere (on top of master).
[jira] [Commented] (FLINK-1772) Chain of split/select does not work
[ https://issues.apache.org/jira/browse/FLINK-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376069#comment-14376069 ] Gyula Fora commented on FLINK-1772: --- I don't think we should allow chaining of split. It doesn't make any practical sense, to be honest. Chain of split/select does not work --- Key: FLINK-1772 URL: https://issues.apache.org/jira/browse/FLINK-1772 Project: Flink Issue Type: Bug Components: Streaming Reporter: Gábor Hermann OutputSelectors should handle multiple split/select calls chained after one another, like: dataStream.split(outputSelector1).select("one").split(outputSelector2).select("two")
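To make the reported chain concrete, a sketch of the failing pattern with hypothetical selectors (OutputSelector signature as used around the 0.9-era streaming API):

    // assumes: import org.apache.flink.streaming.api.collector.selector.OutputSelector;
    //          import java.util.Collections;
    SplitDataStream<Integer> first = dataStream.split(new OutputSelector<Integer>() {
        @Override
        public Iterable<String> select(Integer value) {
            // outputSelector1: route even values to "one"
            return Collections.singletonList(value % 2 == 0 ? "one" : "rest");
        }
    });
    // FLINK-1772: calling split again on the selected sub-stream does not work
    DataStream<Integer> two = first.select("one").split(new OutputSelector<Integer>() {
        @Override
        public Iterable<String> select(Integer value) {
            // outputSelector2: route small values to "two"
            return Collections.singletonList(value < 10 ? "two" : "rest");
        }
    }).select("two");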
[jira] [Commented] (FLINK-1633) Add getTriplets() Gelly method
[ https://issues.apache.org/jira/browse/FLINK-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375764#comment-14375764 ] ASF GitHub Bot commented on FLINK-1633: --- Github user vasia commented on a diff in the pull request: https://github.com/apache/flink/pull/452#discussion_r26928666 --- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -53,6 +54,7 @@
 import org.apache.flink.graph.utils.Tuple2ToVertexMap;
 import org.apache.flink.graph.utils.Tuple3ToEdgeMap;
 import org.apache.flink.graph.utils.VertexToTuple2Map;
+import org.apache.flink.graph.utils.Tuple2ToVertexMap;
--- End diff --
and this import is not used anywhere :-)
Add getTriplets() Gelly method -- Key: FLINK-1633 URL: https://issues.apache.org/jira/browse/FLINK-1633 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 0.9 Reporter: Vasia Kalavri Assignee: Andra Lungu Priority: Minor Labels: starter In some graph algorithms, it is required to access the graph edges together with the vertex values of the source and target vertices. For example, several graph weighting schemes compute some kind of similarity weights for edges, based on the attributes of the source and target vertices. This issue proposes adding a convenience Gelly method that generates a DataSet of <srcVertex, Edge, TrgVertex> triplets from the input graph.
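A quick sketch of how the proposed convenience method would be used; the generics are illustrative and the exact signature is an assumption based on the issue text:

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Edge<Long, Double>> edges = env.fromElements(
            new Edge<Long, Double>(1L, 2L, 0.5),
            new Edge<Long, Double>(2L, 3L, 1.5));
    Graph<Long, NullValue, Double> graph = Graph.fromDataSet(edges, env);
    // one record per edge, carrying the edge together with both endpoint vertices
    DataSet<Triplet<Long, NullValue, Double>> triplets = graph.getTriplets();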
[GitHub] flink pull request: [Flink-1686] Streaming Iterations Slot Sharing...
Github user senorcarbone commented on the pull request: https://github.com/apache/flink/pull/524#issuecomment-85223415 Bad style habits ^^ Thanks for reminding me. I also made a minor fix to the test.
[GitHub] flink pull request: Introduced CentralActiveTrigger with GroupedAc...
GitHub user gyfora opened a pull request: https://github.com/apache/flink/pull/523 Introduced CentralActiveTrigger with GroupedActiveDiscretizer You can merge this pull request into a Git repository by running: $ git pull https://github.com/mbalassi/flink session Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/523.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #523 commit 06210c0c41016b793d2cce9d61760402da3195e3 Author: Gyula Fora gyf...@apache.org Date: 2015-03-20T17:16:34Z [FLINK-1773] [streaming] Introduced CentralActiveTrigger with GroupedActiveDiscretizer thread-safety fix for GroupedActiveDiscretizer
[jira] [Created] (FLINK-1776) APIs provide invalid Semantic Properties for Operators with SelectorFunction keys
Fabian Hueske created FLINK-1776: Summary: APIs provide invalid Semantic Properties for Operators with SelectorFunction keys Key: FLINK-1776 URL: https://issues.apache.org/jira/browse/FLINK-1776 Project: Flink Issue Type: Bug Components: Java API, Scala API Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Critical Fix For: 0.9 Semantic properties are defined by users and evaluated by the optimizer. Semantic properties such as forwarded or read fields are bound to the input type of a function. In the case of operators with selector-function keys, the user function is wrapped by a wrapping function that has a different input type than the original user function. However, the user-defined semantic properties are forwarded verbatim to the optimizer. Since the properties refer to a specific type which is changed by the wrapping function, and the semantic properties are not adapted, the optimizer uses wrong properties and might produce invalid plans.
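A rough illustration of the mismatch, using the 0.9-era annotation and a selector-function key; the wrapping behavior described in the comments is schematic, not the exact runtime type:

    // the user declares forwarded fields against the function's own input type
    @FunctionAnnotation.ForwardedFields("f1")
    public static class IdentityReducer
            implements GroupReduceFunction<Tuple2<Long, String>, Tuple2<Long, String>> {
        @Override
        public void reduce(Iterable<Tuple2<Long, String>> values,
                Collector<Tuple2<Long, String>> out) {
            for (Tuple2<Long, String> t : values) { out.collect(t); }
        }
    }

    // with a KeySelector key the function gets wrapped, and the wrapper sees a
    // different input type, so "f1" no longer denotes the field the user meant
    input.groupBy(new KeySelector<Tuple2<Long, String>, Long>() {
        @Override
        public Long getKey(Tuple2<Long, String> v) { return v.f0; }
    }).reduceGroup(new IdentityReducer());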
[jira] [Commented] (FLINK-1767) StreamExecutionEnvironment's execute should return JobExecutionResult
[ https://issues.apache.org/jira/browse/FLINK-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376184#comment-14376184 ] ASF GitHub Bot commented on FLINK-1767: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/516 StreamExecutionEnvironment's execute should return JobExecutionResult - Key: FLINK-1767 URL: https://issues.apache.org/jira/browse/FLINK-1767 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Márton Balassi Assignee: Gabor Gevay Although the streaming API does not make use of accumulators, the returned result is still a nice handle for the execution time and might wrap other features in the future.
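A minimal sketch of what the change enables on the user side; getNetRuntime is the execution-time handle the batch JobExecutionResult already offers:

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.fromElements(1, 2, 3).print();
    // before this change, execute() on the streaming environment returned void
    JobExecutionResult result = env.execute();
    long runtimeMillis = result.getNetRuntime();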
[jira] [Created] (FLINK-1773) Add interface for grouped policies to broadcast information to other groups
Gyula Fora created FLINK-1773: - Summary: Add interface for grouped policies to broadcast information to other groups Key: FLINK-1773 URL: https://issues.apache.org/jira/browse/FLINK-1773 Project: Flink Issue Type: New Feature Reporter: Gyula Fora Priority: Minor The current windowing does not allow grouped policies to broadcast information to each other. Adding this would allow a window to be closed before the next element for the same group arrives. This enables nice use cases such as detecting user sessions on data streams.
[jira] [Commented] (FLINK-1687) Streaming file source/sink API is not in sync with the batch API
[ https://issues.apache.org/jira/browse/FLINK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376427#comment-14376427 ] ASF GitHub Bot commented on FLINK-1687: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/521#issuecomment-85154634 Ah, it would also be good to rebase this PR onto the current master to get some feedback from Travis. Streaming file source/sink API is not in sync with the batch API Key: FLINK-1687 URL: https://issues.apache.org/jira/browse/FLINK-1687 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gábor Hermann Assignee: Péter Szabó The streaming environment is missing file inputs such as readFile and readCsvFile, the more general createInput function, and outputs such as writeAsCsv and write. The streaming and batch APIs should be consistent.
[jira] [Commented] (FLINK-1687) Streaming file source/sink API is not in sync with the batch API
[ https://issues.apache.org/jira/browse/FLINK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376425#comment-14376425 ] ASF GitHub Bot commented on FLINK-1687: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/521#issuecomment-85154551 Would it make sense to add an API completeness checker between the streaming and batch Java APIs (similar to the Java/Scala API completeness checker)? The whitelist is probably a bit bigger in that case, but somebody changing something somewhere would at least notice that the APIs are out of sync. Does this change actually break the current (0.8.x) methods, or does it just add new ones (it seems you've relocated some methods)? Streaming file source/sink API is not in sync with the batch API Key: FLINK-1687 URL: https://issues.apache.org/jira/browse/FLINK-1687 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gábor Hermann Assignee: Péter Szabó The streaming environment is missing file inputs such as readFile and readCsvFile, the more general createInput function, and outputs such as writeAsCsv and write. The streaming and batch APIs should be consistent.
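A minimal sketch of the suggested checker using plain reflection; the whitelist entry is illustrative, and a real checker would compare full signatures rather than bare names:

    // assumes: import java.lang.reflect.Method; import java.util.*;
    //          import static org.junit.Assert.assertTrue;
    Set<String> whitelist = new HashSet<String>(
            Arrays.asList("createCollectionsEnvironment")); // intentionally batch-only (example)
    Set<String> missing = new HashSet<String>();
    for (Method m : ExecutionEnvironment.class.getMethods()) {
        missing.add(m.getName());
    }
    for (Method m : StreamExecutionEnvironment.class.getMethods()) {
        missing.remove(m.getName());
    }
    missing.removeAll(whitelist);
    assertTrue("batch methods without a streaming counterpart: " + missing, missing.isEmpty());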
[jira] [Commented] (FLINK-1595) Add a complex integration test for Streaming API
[ https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376457#comment-14376457 ] ASF GitHub Bot commented on FLINK-1595: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/520#issuecomment-85159538 +1 for more streaming tests ;) (... and apparently, they seem to help find bugs ;) ) Btw, you don't have to use Marton's GitHub repository to open pull requests. You can also open them from your own GitHub fork (and you can enable travis-ci for your own account, so you won't have to share Marton's builds). Add a complex integration test for Streaming API Key: FLINK-1595 URL: https://issues.apache.org/jira/browse/FLINK-1595 Project: Flink Issue Type: Task Components: Streaming Reporter: Gyula Fora Assignee: Péter Szabó Labels: Starter The streaming tests currently lack a sophisticated integration test that exercises many API features at once. This should include different merging, partitioning, grouping and aggregation types, as well as windowing and connected operators. The results should be tested for correctness. A test like this would help identify bugs that are hard to detect with unit tests.
[jira] [Commented] (FLINK-1650) Suppress Akka's Netty Shutdown Errors through the log config
[ https://issues.apache.org/jira/browse/FLINK-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376463#comment-14376463 ] ASF GitHub Bot commented on FLINK-1650: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/518#issuecomment-85160071 The build passed on my Travis: https://travis-ci.org/rmetzger/flink/builds/55487319 Suppress Akka's Netty Shutdown Errors through the log config Key: FLINK-1650 URL: https://issues.apache.org/jira/browse/FLINK-1650 Project: Flink Issue Type: Bug Components: other Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 I suggest setting the logging for `org.jboss.netty.channel.DefaultChannelPipeline` to error in order to get rid of the misleading stack trace caused by an Akka/Netty hiccup on shutdown.
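The proposed suppression amounts to a single line of log configuration; a sketch for a log4j.properties file (the exact file shipped in the distribution is an assumption):

    # demote the misleading shutdown stack trace from Akka's bundled Netty
    log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR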
[GitHub] flink pull request: Make Expression API available to Java, Rename ...
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/503#discussion_r26974832 --- Diff: docs/linq.md ---
@@ -23,58 +23,91 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
-**Language-Integrated Queries are an experimental feature and can currently only be used with
-the Scala API**
+**Language-Integrated Queries are an experimental feature**
-Flink provides an API that allows specifying operations using SQL-like expressions.
-This Expression API can be enabled by importing
-`org.apache.flink.api.scala.expressions._`. This enables implicit conversions that allow
-converting a `DataSet` or `DataStream` to an `ExpressionOperation` on which relational queries
-can be specified. This example shows how a `DataSet` can be converted, how expression operations
-can be specified and how an expression operation can be converted back to a `DataSet`:
+Flink provides an API that allows specifying operations using SQL-like expressions. Instead of
+manipulating `DataSet` or `DataStream` you work with `Table` on which relational operations can
+be performed.
+
+The following dependency must be added to your project when using the Table API:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-table</artifactId>
+  <version>{{site.FLINK_VERSION_SHORT }}</version>
+</dependency>
+{% endhighlight %}
+
+## Scala Table API
+
+The Table API can be enabled by importing `org.apache.flink.api.scala.table._`. This enables
+implicit conversions that allow
+converting a DataSet or DataStream to a Table. This example shows how a DataSet can
+be converted, how relationa queries can be specified and how a Table can be
+converted back to a DataSet`:
--- End diff --
the tick after DataSet is probably wrong.
[GitHub] flink pull request: Make Expression API available to Java, Rename ...
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/503#discussion_r26974798 --- Diff: docs/linq.md ---
@@ -23,58 +23,91 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
-**Language-Integrated Queries are an experimental feature and can currently only be used with
-the Scala API**
+**Language-Integrated Queries are an experimental feature**
-Flink provides an API that allows specifying operations using SQL-like expressions.
-This Expression API can be enabled by importing
-`org.apache.flink.api.scala.expressions._`. This enables implicit conversions that allow
-converting a `DataSet` or `DataStream` to an `ExpressionOperation` on which relational queries
-can be specified. This example shows how a `DataSet` can be converted, how expression operations
-can be specified and how an expression operation can be converted back to a `DataSet`:
+Flink provides an API that allows specifying operations using SQL-like expressions. Instead of
+manipulating `DataSet` or `DataStream` you work with `Table` on which relational operations can
+be performed.
+
+The following dependency must be added to your project when using the Table API:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-table</artifactId>
+  <version>{{site.FLINK_VERSION_SHORT }}</version>
+</dependency>
+{% endhighlight %}
+
+## Scala Table API
+
+The Table API can be enabled by importing `org.apache.flink.api.scala.table._`. This enables
+implicit conversions that allow
+converting a DataSet or DataStream to a Table. This example shows how a DataSet can
+be converted, how relationa queries can be specified and how a Table can be
--- End diff --
missing `l`
[GitHub] flink pull request: [FLINK-1595] [streaming] Complex integration t...
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/520#discussion_r26969917 --- Diff: flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/complex/ComplexIntegrationTest.java ---
@@ -0,0 +1,903 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.complex;
+
+import org.apache.flink.api.common.functions.FilterFunction;
+import org.apache.flink.api.common.functions.FlatMapFunction;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.functions.RichFilterFunction;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.api.java.tuple.Tuple5;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.collector.selector.OutputSelector;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.IterativeDataStream;
+import org.apache.flink.streaming.api.datastream.SplitDataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.function.WindowMapFunction;
+import org.apache.flink.streaming.api.function.co.CoMapFunction;
+import org.apache.flink.streaming.api.function.sink.SinkFunction;
+import org.apache.flink.streaming.api.function.source.SourceFunction;
+import org.apache.flink.streaming.api.windowing.deltafunction.DeltaFunction;
+import org.apache.flink.streaming.api.windowing.helper.Count;
+import org.apache.flink.streaming.api.windowing.helper.Delta;
+import org.apache.flink.streaming.api.windowing.helper.Time;
+import org.apache.flink.streaming.api.windowing.helper.Timestamp;
+import org.apache.flink.streaming.util.RectangleClass;
+import org.apache.flink.streaming.util.TestStreamEnvironment;
+import org.apache.flink.util.Collector;
+import org.junit.Test;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.Assert.assertEquals;
+
+public class ComplexIntegrationTest implements Serializable {
+    private static final long serialVersionUID = 1L;
+    private static final long MEMORYSIZE = 32;
+
+    private static Map<String, List<String>> results = new HashMap<String, List<String>>();
+
+    @SuppressWarnings("unchecked")
+    public static List<Tuple5<Integer, String, Character, Double, Boolean>> input = Arrays.asList(
+            new Tuple5<Integer, String, Character, Double, Boolean>(1, "apple", 'j', 0.1, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(1, "peach", 'b', 0.8, true),
+            new Tuple5<Integer, String, Character, Double, Boolean>(1, "orange", 'c', 0.7, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(2, "apple", 'd', 0.5, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(2, "apple", 'e', 0.6, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(3, "peach", 'a', 0.2, true),
+            new Tuple5<Integer, String, Character, Double, Boolean>(6, "peanut", 'b', 0.1, true),
+            new Tuple5<Integer, String, Character, Double, Boolean>(7, "banana", 'c', 0.4, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(8, "peanut", 'd', 0.2, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(10, "cherry", 'e', 0.1, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(10, "plum", 'a', 0.5, true),
+            new Tuple5<Integer, String, Character, Double, Boolean>(11, "strawberry", 'b', 0.3, false),
+            new Tuple5<Integer, String, Character,
[jira] [Commented] (FLINK-1726) Add Community Detection Library and Example
[ https://issues.apache.org/jira/browse/FLINK-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376547#comment-14376547 ] ASF GitHub Bot commented on FLINK-1726: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/505#issuecomment-85173349 I didn't see any bigger issues in the code, but @vasia is the authority on this ;) It would be nice to add at least another bullet to this list: http://ci.apache.org/projects/flink/flink-docs-master/gelly_guide.html#library-methods :) Add Community Detection Library and Example --- Key: FLINK-1726 URL: https://issues.apache.org/jira/browse/FLINK-1726 Project: Flink Issue Type: Task Components: Gelly Affects Versions: 0.9 Reporter: Andra Lungu Assignee: Andra Lungu Community detection paper: http://arxiv.org/pdf/0808.2633.pdf
[GitHub] flink pull request: Make Expression API available to Java, Rename ...
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/503#discussion_r26975057 --- Diff: docs/linq.md ---
@@ -23,58 +23,91 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
-**Language-Integrated Queries are an experimental feature and can currently only be used with
-the Scala API**
+**Language-Integrated Queries are an experimental feature**
-Flink provides an API that allows specifying operations using SQL-like expressions.
-This Expression API can be enabled by importing
-`org.apache.flink.api.scala.expressions._`. This enables implicit conversions that allow
-converting a `DataSet` or `DataStream` to an `ExpressionOperation` on which relational queries
-can be specified. This example shows how a `DataSet` can be converted, how expression operations
-can be specified and how an expression operation can be converted back to a `DataSet`:
+Flink provides an API that allows specifying operations using SQL-like expressions. Instead of
+manipulating `DataSet` or `DataStream` you work with `Table` on which relational operations can
+be performed.
+
+The following dependency must be added to your project when using the Table API:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-table</artifactId>
+  <version>{{site.FLINK_VERSION_SHORT }}</version>
+</dependency>
+{% endhighlight %}
+
+## Scala Table API
+
+The Table API can be enabled by importing `org.apache.flink.api.scala.table._`. This enables
+implicit conversions that allow
+converting a DataSet or DataStream to a Table. This example shows how a DataSet can
+be converted, how relationa queries can be specified and how a Table can be
+converted back to a DataSet`:
 {% highlight scala %}
 import org.apache.flink.api.scala._
-import org.apache.flink.api.scala.expressions._
+import org.apache.flink.api.scala.table._
 case class WC(word: String, count: Int)
 val input = env.fromElements(WC("hello", 1), WC("hello", 1), WC("ciao", 1))
-val expr = input.toExpression
-val result = expr.groupBy('word).select('word, 'count.sum).as[WC]
+val expr = input.toTable
+val result = expr.groupBy('word).select('word, 'count.sum).toSet[WC]
 {% endhighlight %}
 The expression DSL uses Scala symbols to refer to field names and we use code generation to
 transform expressions to efficient runtime code. Please not that the conversion to and from
-expression operations only works when using Scala case classes or Flink POJOs. Please check out
+Tables only works when using Scala case classes or Flink POJOs. Please check out
 the [programming guide](programming_guide.html) to learn the requirements for a class to be
 considered a POJO. This is another example that shows how you
-can join to operations:
+can join to Tables:
 {% highlight scala %}
 case class MyResult(a: String, b: Int)
 val input1 = env.fromElements(...).as('a, 'b)
 val input2 = env.fromElements(...).as('c, 'd)
-val joined = input1.join(input2).where('b == 'a 'd 42).select('a, 'd).as[MyResult]
+val joined = input1.join(input2).where(b = a d 42).select(a, d).as[MyResult]
 {% endhighlight %}
-Notice, how a `DataSet` can be converted to an expression operation by using `as` and specifying new
-names for the fields. This can also be used to disambiguate fields before a join operation.
+Notice, how a DataSet can be converted to a Table by using `as` and specifying new
+names for the fields. This can also be used to disambiguate fields before a join operation. Also,
+in this example we see that you can also use Strings to specify relational expressions.
-The Expression API can be used with the Streaming API, since we also have implicit conversions to
-and from `DataStream`.
+Please refer to the Scaladoc (and Javadoc) for a full list of supported operations and a
+description of the expression syntax.
-The following dependency must be added to your project when using the Expression API:
+## Java Table API
-{% highlight xml %}
-<dependency>
-  <groupId>org.apache.flink</groupId>
-  <artifactId>flink-expressions</artifactId>
-  <version>{{site.FLINK_VERSION_SHORT }}</version>
-</dependency>
+When using Java, Tables can be converted to and from DataSet and DataStream using class
--- End diff --
using the class
[jira] [Commented] (FLINK-1687) Streaming file source/sink API is not in sync with the batch API
[ https://issues.apache.org/jira/browse/FLINK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376413#comment-14376413 ] ASF GitHub Bot commented on FLINK-1687: --- Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/521#discussion_r26968237 --- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/ReadFileExample.java ---
@@ -0,0 +1,37 @@
+package org.apache.flink.streaming.examples;
+
+import org.apache.flink.api.java.io.CsvInputFormat;
+import org.apache.flink.api.java.io.TextInputFormat;
+import org.apache.flink.api.java.record.io.FileInputFormat;
+import org.apache.flink.api.java.record.io.FixedLengthInputFormat;
+import org.apache.flink.api.java.tuple.Tuple1;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.typeutils.TypeExtractor;
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.function.source.FileMonitoringFunction;
+import org.apache.flink.util.NumberSequenceIterator;
+
+import java.util.ArrayList;
+import java.util.Collection;
+
+public class ReadFileExample {
+
+    private static String textPath = "/home/szape/result.txt";
--- End diff --
Mh ;)
Streaming file source/sink API is not in sync with the batch API Key: FLINK-1687 URL: https://issues.apache.org/jira/browse/FLINK-1687 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Gábor Hermann Assignee: Péter Szabó The streaming environment is missing file inputs such as readFile and readCsvFile, the more general createInput function, and outputs such as writeAsCsv and write. The streaming and batch APIs should be consistent.
[GitHub] flink pull request: [FLINK-1595] [streaming] Complex integration t...
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/520#discussion_r26969789 --- Diff: flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/complex/ComplexIntegrationTest.java ---
@@ -0,0 +1,903 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.complex;
+
+import org.apache.flink.api.common.functions.FilterFunction;
+import org.apache.flink.api.common.functions.FlatMapFunction;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.functions.RichFilterFunction;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.api.java.tuple.Tuple5;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.collector.selector.OutputSelector;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.IterativeDataStream;
+import org.apache.flink.streaming.api.datastream.SplitDataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.function.WindowMapFunction;
+import org.apache.flink.streaming.api.function.co.CoMapFunction;
+import org.apache.flink.streaming.api.function.sink.SinkFunction;
+import org.apache.flink.streaming.api.function.source.SourceFunction;
+import org.apache.flink.streaming.api.windowing.deltafunction.DeltaFunction;
+import org.apache.flink.streaming.api.windowing.helper.Count;
+import org.apache.flink.streaming.api.windowing.helper.Delta;
+import org.apache.flink.streaming.api.windowing.helper.Time;
+import org.apache.flink.streaming.api.windowing.helper.Timestamp;
+import org.apache.flink.streaming.util.RectangleClass;
+import org.apache.flink.streaming.util.TestStreamEnvironment;
+import org.apache.flink.util.Collector;
+import org.junit.Test;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.Assert.assertEquals;
+
+public class ComplexIntegrationTest implements Serializable {
+    private static final long serialVersionUID = 1L;
+    private static final long MEMORYSIZE = 32;
+
+    private static Map<String, List<String>> results = new HashMap<String, List<String>>();
+
+    @SuppressWarnings("unchecked")
+    public static List<Tuple5<Integer, String, Character, Double, Boolean>> input = Arrays.asList(
+            new Tuple5<Integer, String, Character, Double, Boolean>(1, "apple", 'j', 0.1, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(1, "peach", 'b', 0.8, true),
+            new Tuple5<Integer, String, Character, Double, Boolean>(1, "orange", 'c', 0.7, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(2, "apple", 'd', 0.5, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(2, "apple", 'e', 0.6, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(3, "peach", 'a', 0.2, true),
+            new Tuple5<Integer, String, Character, Double, Boolean>(6, "peanut", 'b', 0.1, true),
+            new Tuple5<Integer, String, Character, Double, Boolean>(7, "banana", 'c', 0.4, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(8, "peanut", 'd', 0.2, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(10, "cherry", 'e', 0.1, false),
+            new Tuple5<Integer, String, Character, Double, Boolean>(10, "plum", 'a', 0.5, true),
+            new Tuple5<Integer, String, Character, Double, Boolean>(11, "strawberry", 'b', 0.3, false),
+            new Tuple5<Integer, String, Character,
[jira] [Commented] (FLINK-1770) Rename the variable 'contentAdressable' to 'contentAddressable'
[ https://issues.apache.org/jira/browse/FLINK-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376525#comment-14376525 ] ASF GitHub Bot commented on FLINK-1770: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/515#issuecomment-85169638 LGTM Rename the variable 'contentAdressable' to 'contentAddressable' --- Key: FLINK-1770 URL: https://issues.apache.org/jira/browse/FLINK-1770 Project: Flink Issue Type: Bug Affects Versions: master Reporter: Sibao Hong Priority: Minor Fix For: master Rename the variable 'contentAdressable' to 'contentAddressable' for better readability.
[GitHub] flink pull request: Adds akka interal docs and configuration descr...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/512#issuecomment-85169924 +1 for updating the configuration guide with the Akka variables.
[jira] [Commented] (FLINK-1472) Web frontend config overview shows wrong value
[ https://issues.apache.org/jira/browse/FLINK-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14376533#comment-14376533 ] ASF GitHub Bot commented on FLINK-1472: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/439#issuecomment-85170765 According to Travis, the code doesn't seem to build. Web frontend config overview shows wrong value -- Key: FLINK-1472 URL: https://issues.apache.org/jira/browse/FLINK-1472 Project: Flink Issue Type: Bug Components: Webfrontend Affects Versions: master Reporter: Ufuk Celebi Assignee: Mingliang Qi Priority: Minor The web frontend shows configuration values even if they could not be parsed correctly. For example, I've configured the number of buffers as 123.000, which cannot be parsed as an Integer by GlobalConfiguration, so the default value is used. Still, the web frontend shows the unused 123.000.
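To illustrate the reported divergence between the displayed raw string and the effective value (the numbers and the default below are illustrative):

    String raw = "123.000";                 // what the web frontend displays verbatim
    int numBuffers;
    try {
        numBuffers = Integer.parseInt(raw); // fails: "123.000" is not a valid Integer
    } catch (NumberFormatException e) {
        numBuffers = 2048;                  // a built-in default is silently used instead
    }
    // the frontend should report the effective value, not the unparsable raw string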
[GitHub] flink pull request: Make Expression API available to Java, Rename ...
Github user rmetzger commented on a diff in the pull request: https://github.com/apache/flink/pull/503#discussion_r26975267 --- Diff: flink-staging/flink-table/src/main/java/org/apache/flink/api/java/table/package-info.java --- @@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * <strong>Table API (Java)</strong><br>
+ *
+ * {@link org.apache.flink.api.java.table.TableUtil} can be used to create a
+ * {@link org.apache.flink.api.table.Table} from a {@link org.apache.flink.api.java.DataSet}
+ * or {@link org.apache.flink.streaming.api.datastream.DataStream}.
+ *
+ * <p>
+ * This can be used to perform SQL-like queries on data. Please have
+ * a look at {@link org.apache.flink.api.table.Table} to see which operations are supported and
+ * how query strings are written.
+ *
+ * <p>
+ * Example:
+ *
+ * <code>
+ * ExecutionEnvironment env = ExecutionEnvironment.createCollectionsEnvironment();
+ *
+ * DataSet<WC> input = env.fromElements(
+ *     new WC("Hello", 1),
+ *     new WC("Ciao", 1),
+ *     new WC("Hello", 1));
+ *
+ * Table table = TableUtil.from(input);
+ *
+ * Table filtered = table
+ *     .groupBy("word")
+ *     .select("word.count as count, word")
+ *     .filter("count = 2");
+ *
+ * DataSet<WC> result = TableUtil.toSet(filtered, WC.class);
+ *
+ * result.print();
+ * env.execute();
+ * </code>
+ *
+ * <p>
+ * A {@link org.apache.flink.api.table.Table} can be converted back to the underlying API
+ * representation using t:
--- End diff -- using t ? What is t?
[GitHub] flink pull request: Cleanup of low level Kafka consumer (Persisten...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/474#issuecomment-84879071 I think we can close this PR. Everything from here has been merged to master. Maybe we removed some code again from this PR because it was buggy or untested.
[GitHub] flink pull request: [FLINK-1615] [java api] SimpleTweetInputFormat
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/442#issuecomment-84912485 I triggered another travis build: https://travis-ci.org/rmetzger/flink/builds/55459787
[jira] [Created] (FLINK-1769) Maven deploy is broken (build artifacts are cleaned in docs-and-sources profile)
Robert Metzger created FLINK-1769: - Summary: Maven deploy is broken (build artifacts are cleaned in docs-and-sources profile) Key: FLINK-1769 URL: https://issues.apache.org/jira/browse/FLINK-1769 Project: Flink Issue Type: Bug Reporter: Robert Metzger Priority: Critical The issue has been introduced by FLINK-1720. This change broke the deployment to maven snapshots / central. {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) on project flink-shaded-include-yarn: Failed to install artifact org.apache.flink:flink-shaded-include-yarn:pom:0.9-SNAPSHOT: /home/robert/incubator-flink/flink-shaded-hadoop/flink-shaded-include-yarn/target/dependency-reduced-pom.xml (No such file or directory) - [Help 1] {code} The issue is that maven now executes {{clean}} after {{shade}}, so {{install}} cannot store the result of {{shade}} anymore (because it has been deleted).
[jira] [Commented] (FLINK-1656) Filtered Semantic Properties for Operators with Iterators
[ https://issues.apache.org/jira/browse/FLINK-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375665#comment-14375665 ] Fabian Hueske commented on FLINK-1656: -- I am going to address this issue on two levels: 1. The optimizer will filter out all forward field information on non-key fields for operators with iterators. - non-key fields of GroupReduce - non-key fields of CoGroup - all fields of MapPartition 2. The APIs will log a warning if a user adds forward field information for non-key fields of GroupReduce and CoGroup, and throw an exception if a user adds forward field information for MapPartition and Filter. Filtered Semantic Properties for Operators with Iterators - Key: FLINK-1656 URL: https://issues.apache.org/jira/browse/FLINK-1656 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Critical The documentation of ForwardedFields is incomplete for operators with iterator inputs (GroupReduce, CoGroup). This should be fixed ASAP, because it can lead to incorrect program execution. The conditions for forwarded fields on operators with iterator input are: 1) forwarded fields must be emitted in the order in which they are received through the iterator 2) all forwarded fields of a record must stick together, i.e., if your function builds a record from field 0 of the 1st, 3rd, 5th, ... and field 1 of the 2nd, 4th, ... record coming through the iterator, these are not valid forwarded fields. 3) it is OK to completely filter out records coming through the iterator. The reason for these conditions is that the optimizer uses forwarded fields to reason about physical data properties such as order and grouping. Mixing up the order of records, or emitting records that are composed from different input records, might destroy a (secondary) order or grouping.
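To make conditions (1) and (2) concrete, here is a group reduce whose forwarded-field declaration is valid under these rules — a minimal sketch using Flink's semantic annotations, with details possibly differing between versions:
{code}
import org.apache.flink.api.common.functions.GroupReduceFunction;
import org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

// Valid declaration: f0 is the grouping key and every emitted record copies f0
// unchanged from the records coming through the iterator, in iteration order.
@ForwardedFields("f0")
public class SumPerKey
        implements GroupReduceFunction<Tuple2<String, Integer>, Tuple2<String, Integer>> {
    @Override
    public void reduce(Iterable<Tuple2<String, Integer>> values,
                       Collector<Tuple2<String, Integer>> out) {
        String key = null;
        int sum = 0;
        for (Tuple2<String, Integer> t : values) {
            key = t.f0;  // key field is forwarded as-is
            sum += t.f1; // f1 is aggregated, so it must NOT be declared forwarded
        }
        out.collect(new Tuple2<String, Integer>(key, sum));
    }
}
{code}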
[GitHub] flink pull request: [FLINK-1679] rename degree of parallelism to p...
Github user mxm closed the pull request at: https://github.com/apache/flink/pull/488
[jira] [Commented] (FLINK-1679) Document how degree of parallelism / parallelism / slots are connected to each other
[ https://issues.apache.org/jira/browse/FLINK-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375553#comment-14375553 ] ASF GitHub Bot commented on FLINK-1679: --- Github user mxm commented on the pull request: https://github.com/apache/flink/pull/488#issuecomment-84875935 Merged in master with 126f9f799071688fe80955a7e7cfa991f53c95af Document how degree of parallelism / parallelism / slots are connected to each other --- Key: FLINK-1679 URL: https://issues.apache.org/jira/browse/FLINK-1679 Project: Flink Issue Type: Task Components: Documentation Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Maximilian Michels Fix For: 0.9 I see too many users being confused about properly setting up Flink with respect to parallelism.
[jira] [Commented] (FLINK-1679) Document how degree of parallelism / parallelism / slots are connected to each other
[ https://issues.apache.org/jira/browse/FLINK-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375554#comment-14375554 ] ASF GitHub Bot commented on FLINK-1679: --- Github user mxm closed the pull request at: https://github.com/apache/flink/pull/488 Document how degree of parallelism / parallelism / slots are connected to each other --- Key: FLINK-1679 URL: https://issues.apache.org/jira/browse/FLINK-1679 Project: Flink Issue Type: Task Components: Documentation Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Maximilian Michels Fix For: 0.9 I see too many users being confused about properly setting up Flink with respect to parallelism.
[GitHub] flink pull request: [FLINK-1679] rename degree of parallelism to p...
Github user mxm commented on the pull request: https://github.com/apache/flink/pull/488#issuecomment-84875935 Merged in master with 126f9f799071688fe80955a7e7cfa991f53c95af
[jira] [Commented] (FLINK-1679) Document how degree of parallelism / parallelism / slots are connected to each other
[ https://issues.apache.org/jira/browse/FLINK-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375551#comment-14375551 ] ASF GitHub Bot commented on FLINK-1679: --- Github user mxm closed the pull request at: https://github.com/apache/flink/pull/488 Document how degree of parallelism / parallelism / slots are connected to each other --- Key: FLINK-1679 URL: https://issues.apache.org/jira/browse/FLINK-1679 Project: Flink Issue Type: Task Components: Documentation Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Maximilian Michels Fix For: 0.9 I see too many users being confused about properly setting up Flink with respect to parallelism.
[jira] [Commented] (FLINK-1765) Reducer grouping is skipped when parallelism is one
[ https://issues.apache.org/jira/browse/FLINK-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375573#comment-14375573 ] Robert Metzger commented on FLINK-1765: --- Can we add a test for this fix? Reducer grouping is skipped when parallelism is one Key: FLINK-1765 URL: https://issues.apache.org/jira/browse/FLINK-1765 Project: Flink Issue Type: Bug Components: Streaming Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Gyula Fora Fix For: 0.9 This program (note the parallelism) incorrectly runs a non-grouped reduce and fails with a NullPointerException.
{code}
StreamExecutionEnvironment env = ...
env.setDegreeOfParallelism(1);
DataStream<String> stream = env.addSource(...);
stream
    .filter(...)
    .map(...)
    .groupBy(someField)
    .reduce(new ReduceFunction() {...})
    .addSink(...);
env.execute();
{code}
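A regression test could look roughly like this — a hedged sketch against the 0.9-era streaming API (fromElements, groupBy by field position, local environment); exact method names and output semantics are assumptions:
{code}
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class GroupedReduceParallelismOneTest {
    public static void main(String[] args) throws Exception {
        // Parallelism 1 is exactly the setting that triggered the bug.
        StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment(1);

        env.fromElements(
                new Tuple2<String, Integer>("a", 1),
                new Tuple2<String, Integer>("b", 2),
                new Tuple2<String, Integer>("a", 3))
            .groupBy(0) // grouping must not be skipped, even at parallelism 1
            .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
                @Override
                public Tuple2<String, Integer> reduce(Tuple2<String, Integer> v1,
                                                      Tuple2<String, Integer> v2) {
                    return new Tuple2<String, Integer>(v1.f0, v1.f1 + v2.f1);
                }
            })
            .print(); // expect per-key running sums: (a,1) (b,2) (a,4)

        env.execute();
    }
}
{code}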
[jira] [Reopened] (FLINK-1757) java.lang.ClassCastException is thrown while summing Short values on window
[ https://issues.apache.org/jira/browse/FLINK-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger reopened FLINK-1757: --- java.lang.ClassCastException is thrown while summing Short values on window --- Key: FLINK-1757 URL: https://issues.apache.org/jira/browse/FLINK-1757 Project: Flink Issue Type: Bug Components: Streaming Reporter: Péter Szabó Assignee: Péter Szabó Fix For: 0.9 java.lang.ClassCastException is thrown while summing Short values on window. Stack Trace:
{code}
Caused by: java.lang.RuntimeException: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Short
    at org.apache.flink.streaming.api.streamvertex.OutputHandler.invokeUserFunction(OutputHandler.java:232)
    at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:121)
    at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Short
    at org.apache.flink.api.common.typeutils.base.ShortSerializer.copy(ShortSerializer.java:27)
    at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.copy(TupleSerializer.java:95)
    at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.copy(TupleSerializer.java:30)
    at org.apache.flink.streaming.api.invokable.StreamInvokable.copy(StreamInvokable.java:166)
    at org.apache.flink.streaming.api.invokable.SinkInvokable.collect(SinkInvokable.java:46)
    at org.apache.flink.streaming.api.collector.DirectedCollectorWrapper.collect(DirectedCollectorWrapper.java:95)
    at org.apache.flink.streaming.api.invokable.operator.GroupedReduceInvokable.reduce(GroupedReduceInvokable.java:47)
    at org.apache.flink.streaming.api.invokable.operator.StreamReduceInvokable.invoke(StreamReduceInvokable.java:39)
    at org.apache.flink.streaming.api.streamvertex.StreamVertex.invokeUserFunction(StreamVertex.java:85)
    at org.apache.flink.streaming.api.streamvertex.OutputHandler.invokeUserFunction(OutputHandler.java:229)
{code}
[jira] [Resolved] (FLINK-1757) java.lang.ClassCastException is thrown while summing Short values on window
[ https://issues.apache.org/jira/browse/FLINK-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger resolved FLINK-1757. --- Resolution: Fixed Fix Version/s: 0.9 java.lang.ClassCastException is thrown while summing Short values on window --- Key: FLINK-1757 URL: https://issues.apache.org/jira/browse/FLINK-1757 Project: Flink Issue Type: Bug Components: Streaming Reporter: Péter Szabó Assignee: Péter Szabó Fix For: 0.9 java.lang.ClassCastException is thrown while summing Short values on window. Stack Trace:
{code}
Caused by: java.lang.RuntimeException: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Short
    at org.apache.flink.streaming.api.streamvertex.OutputHandler.invokeUserFunction(OutputHandler.java:232)
    at org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:121)
    at org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Short
    at org.apache.flink.api.common.typeutils.base.ShortSerializer.copy(ShortSerializer.java:27)
    at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.copy(TupleSerializer.java:95)
    at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.copy(TupleSerializer.java:30)
    at org.apache.flink.streaming.api.invokable.StreamInvokable.copy(StreamInvokable.java:166)
    at org.apache.flink.streaming.api.invokable.SinkInvokable.collect(SinkInvokable.java:46)
    at org.apache.flink.streaming.api.collector.DirectedCollectorWrapper.collect(DirectedCollectorWrapper.java:95)
    at org.apache.flink.streaming.api.invokable.operator.GroupedReduceInvokable.reduce(GroupedReduceInvokable.java:47)
    at org.apache.flink.streaming.api.invokable.operator.StreamReduceInvokable.invoke(StreamReduceInvokable.java:39)
    at org.apache.flink.streaming.api.streamvertex.StreamVertex.invokeUserFunction(StreamVertex.java:85)
    at org.apache.flink.streaming.api.streamvertex.OutputHandler.invokeUserFunction(OutputHandler.java:229)
{code}
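The failure pattern in the trace above is ordinary Java arithmetic promotion: adding two Shorts yields an int, so a generic sum aggregator that boxes its result hands an Integer to the ShortSerializer, which then fails on the cast. A minimal illustration, independent of the Flink code:
{code}
public class ShortSumPitfall {
    public static void main(String[] args) {
        Short a = 1, b = 2;
        // '+' promotes both operands to int; boxing the result yields an Integer.
        Object sum = a + b;
        System.out.println(sum.getClass()); // class java.lang.Integer
        // A Short-typed aggregation has to cast back explicitly:
        Short fixed = (short) (a + b);
        System.out.println(fixed.getClass()); // class java.lang.Short
    }
}
{code}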
[jira] [Commented] (FLINK-1615) Introduces a new InputFormat for Tweets
[ https://issues.apache.org/jira/browse/FLINK-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375619#comment-14375619 ] ASF GitHub Bot commented on FLINK-1615: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/442#issuecomment-84912485 I triggered another travis build: https://travis-ci.org/rmetzger/flink/builds/55459787 Introduces a new InputFormat for Tweets --- Key: FLINK-1615 URL: https://issues.apache.org/jira/browse/FLINK-1615 Project: Flink Issue Type: New Feature Components: flink-contrib Affects Versions: 0.8.1 Reporter: mustafa elbehery Priority: Minor An event-driven parser for Tweets into Java POJOs. It parses all the important parts of a tweet into Java objects. Tested on a cluster, and the performance is pretty good.
[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface
[ https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375676#comment-14375676 ] ASF GitHub Bot commented on FLINK-1501: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/421#issuecomment-84935408 Hey @bhatsachin, I've started working on the per-job monitoring, but it's currently in a work-in-progress state and I did not find time to finish it yet. If you are interested in working on the topic, I would suggest enhancing the monitoring I've added in this pull request (the TaskManager monitoring). If nobody has any objections, I would like to merge this change in the next 24 hours. Integrate metrics library and report basic metrics to JobManager web interface -- Key: FLINK-1501 URL: https://issues.apache.org/jira/browse/FLINK-1501 Project: Flink Issue Type: Sub-task Components: JobManager, TaskManager Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: pre-apache As per mailing list, the library: https://github.com/dropwizard/metrics The goal of this task is to get the basic infrastructure in place. Subsequent issues will integrate more features into the system.
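For readers unfamiliar with the library, the dropwizard API is essentially a registry plus named metric objects — a minimal sketch of the library itself; the metric names are illustrative and the Flink-specific wiring in the PR is not shown:
{code}
import com.codahale.metrics.Gauge;
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;

public class MetricsSketch {
    public static void main(String[] args) {
        MetricRegistry registry = new MetricRegistry();

        // Meter: counts events and tracks moving-average rates.
        Meter recordsOut = registry.meter("taskmanager.records-out");
        recordsOut.mark(); // one record emitted

        // Gauge: samples a value on demand, e.g. current JVM heap usage.
        registry.register("jvm.heap.used", new Gauge<Long>() {
            @Override
            public Long getValue() {
                return Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
            }
        });
    }
}
{code}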
[GitHub] flink pull request: Adds akka internal docs and configuration descr...
Github user tillrohrmann commented on the pull request: https://github.com/apache/flink/pull/512#issuecomment-84940008 Yes, I agree that we should keep the internals in only one place. Even then, it is probably hard enough to keep the docs in sync with the code. But what still might be interesting to add to the docs is the description of the configuration parameters. There is still a big hole under "Distributed coordination (via Akka)".
[jira] [Updated] (FLINK-1656) Filtered Semantic Properties for Operators with Iterators
[ https://issues.apache.org/jira/browse/FLINK-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fabian Hueske updated FLINK-1656: - Summary: Filtered Semantic Properties for Operators with Iterators (was: Fix ForwardedField documentation for operators with iterator input) Filtered Semantic Properties for Operators with Iterators - Key: FLINK-1656 URL: https://issues.apache.org/jira/browse/FLINK-1656 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 0.9 Reporter: Fabian Hueske Assignee: Fabian Hueske Priority: Critical The documentation of ForwardedFields is incomplete for operators with iterator inputs (GroupReduce, CoGroup). This should be fixed ASAP, because it can lead to incorrect program execution. The conditions for forwarded fields on operators with iterator input are: 1) forwarded fields must be emitted in the order in which they are received through the iterator 2) all forwarded fields of a record must stick together, i.e., if your function builds a record from field 0 of the 1st, 3rd, 5th, ... and field 1 of the 2nd, 4th, ... record coming through the iterator, these are not valid forwarded fields. 3) it is OK to completely filter out records coming through the iterator. The reason for these conditions is that the optimizer uses forwarded fields to reason about physical data properties such as order and grouping. Mixing up the order of records, or emitting records that are composed from different input records, might destroy a (secondary) order or grouping.
[jira] [Updated] (FLINK-1769) Maven deploy is broken (build artifacts are cleaned in docs-and-sources profile)
[ https://issues.apache.org/jira/browse/FLINK-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1769: -- Component/s: Build System Maven deploy is broken (build artifacts are cleaned in docs-and-sources profile) Key: FLINK-1769 URL: https://issues.apache.org/jira/browse/FLINK-1769 Project: Flink Issue Type: Bug Components: Build System Reporter: Robert Metzger Priority: Critical The issue has been introduced by FLINK-1720. This change broke the deployment to maven snapshots / central. {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) on project flink-shaded-include-yarn: Failed to install artifact org.apache.flink:flink-shaded-include-yarn:pom:0.9-SNAPSHOT: /home/robert/incubator-flink/flink-shaded-hadoop/flink-shaded-include-yarn/target/dependency-reduced-pom.xml (No such file or directory) - [Help 1] {code} The issue is that maven now executes {{clean}} after {{shade}}, so {{install}} cannot store the result of {{shade}} anymore (because it has been deleted).
[jira] [Updated] (FLINK-1720) Integrate ScalaDoc in Scala sources into overall JavaDoc
[ https://issues.apache.org/jira/browse/FLINK-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1720: -- Component/s: Build System Integrate ScalaDoc in Scala sources into overall JavaDoc Key: FLINK-1720 URL: https://issues.apache.org/jira/browse/FLINK-1720 Project: Flink Issue Type: Improvement Components: Build System Reporter: Aljoscha Krettek Assignee: Aljoscha Krettek Fix For: 0.9
[jira] [Commented] (FLINK-1650) Suppress Akka's Netty Shutdown Errors through the log config
[ https://issues.apache.org/jira/browse/FLINK-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375668#comment-14375668 ] Robert Metzger commented on FLINK-1650: --- I got a response from the Akka mailing list. They've asked me to open an issue if we're still seeing the problem in {{2.3.9}}. We are currently on Akka version {{2.3.7}}. The changelogs: http://akka.io/news/2014/12/17/akka-2.3.8-released.html http://akka.io/news/2015/01/19/akka-2.3.9-released.html Any objections against bumping our Akka dependency to 2.3.9? Suppress Akka's Netty Shutdown Errors through the log config Key: FLINK-1650 URL: https://issues.apache.org/jira/browse/FLINK-1650 Project: Flink Issue Type: Bug Components: other Affects Versions: 0.9 Reporter: Stephan Ewen Assignee: Stephan Ewen Fix For: 0.9 I suggest setting the logging level for `org.jboss.netty.channel.DefaultChannelPipeline` to ERROR, in order to get rid of the misleading stack trace caused by an Akka/Netty hiccup on shutdown.
[jira] [Commented] (FLINK-1679) Document how degree of parallelism / parallelism / slots are connected to each other
[ https://issues.apache.org/jira/browse/FLINK-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375552#comment-14375552 ] ASF GitHub Bot commented on FLINK-1679: --- GitHub user mxm reopened a pull request: https://github.com/apache/flink/pull/488 [FLINK-1679] rename degree of parallelism to parallelism, extend documentation about parallelism https://issues.apache.org/jira/browse/FLINK-1679 You can merge this pull request into a Git repository by running: $ git pull https://github.com/mxm/flink parallelism Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/488.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #488
commit 17097fdf51f41445bad6da2186868185a6bf947b Author: Maximilian Michels m...@apache.org Date: 2015-03-18T09:44:42Z [FLINK-1679] deprecate API methods to set the parallelism
commit f6ba8c07cc9a153b1ac1e213f9749155c42ae3c3 Author: Maximilian Michels m...@apache.org Date: 2015-03-18T09:44:43Z [FLINK-1679] use a consistent name for parallelism * rename occurrences of degree of parallelism to parallelism * [Dd]egree[ -]of[ -]parallelism -> [pP]arallelism * (DOP|dop) -> [pP]arallelism * paraDegree -> parallelism * degree-of-parallelism -> parallelism * DEGREE_OF_PARALLELISM -> PARALLELISM
commit 658bb1166aa907677e06cf011e5a0fdaf58ab15f Author: Maximilian Michels m...@apache.org Date: 2015-03-18T09:44:44Z [FLINK-1679] deprecate old parallelism config entry * old config parameter can still be used * OLD: parallelization.degree.default NEW: parallelism.default
commit 412ac54df0fde12666afbc1414df5fd919ba1607 Author: Maximilian Michels m...@apache.org Date: 2015-03-18T09:44:45Z [FLINK-1679] extend faq and programming guide to clarify parallelism
Document how degree of parallelism / parallelism / slots are connected to each other --- Key: FLINK-1679 URL: https://issues.apache.org/jira/browse/FLINK-1679 Project: Flink Issue Type: Task Components: Documentation Affects Versions: 0.9 Reporter: Robert Metzger Assignee: Maximilian Michels Fix For: 0.9 I see too many users being confused about properly setting up Flink with respect to parallelism.
[GitHub] flink pull request: [FLINK-1679] rename degree of parallelism to p...
GitHub user mxm reopened a pull request: https://github.com/apache/flink/pull/488 [FLINK-1679] rename degree of parallelism to parallelism, extend documentation about parallelism https://issues.apache.org/jira/browse/FLINK-1679 You can merge this pull request into a Git repository by running: $ git pull https://github.com/mxm/flink parallelism Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/488.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #488
commit 17097fdf51f41445bad6da2186868185a6bf947b Author: Maximilian Michels m...@apache.org Date: 2015-03-18T09:44:42Z [FLINK-1679] deprecate API methods to set the parallelism
commit f6ba8c07cc9a153b1ac1e213f9749155c42ae3c3 Author: Maximilian Michels m...@apache.org Date: 2015-03-18T09:44:43Z [FLINK-1679] use a consistent name for parallelism * rename occurrences of degree of parallelism to parallelism * [Dd]egree[ -]of[ -]parallelism -> [pP]arallelism * (DOP|dop) -> [pP]arallelism * paraDegree -> parallelism * degree-of-parallelism -> parallelism * DEGREE_OF_PARALLELISM -> PARALLELISM
commit 658bb1166aa907677e06cf011e5a0fdaf58ab15f Author: Maximilian Michels m...@apache.org Date: 2015-03-18T09:44:44Z [FLINK-1679] deprecate old parallelism config entry * old config parameter can still be used * OLD: parallelization.degree.default NEW: parallelism.default
commit 412ac54df0fde12666afbc1414df5fd919ba1607 Author: Maximilian Michels m...@apache.org Date: 2015-03-18T09:44:45Z [FLINK-1679] extend faq and programming guide to clarify parallelism
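In user code the rename shows up as a deprecation, per the commit messages above — a short sketch assuming the usual ExecutionEnvironment entry point:
{code}
import org.apache.flink.api.java.ExecutionEnvironment;

public class ParallelismRename {
    public static void main(String[] args) {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        env.setDegreeOfParallelism(4); // old name, deprecated by FLINK-1679
        env.setParallelism(4);         // new, consistent name

        // The config entry is renamed the same way (the old key still works):
        //   OLD: parallelization.degree.default: 4
        //   NEW: parallelism.default: 4
    }
}
{code}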
[jira] [Updated] (FLINK-1725) New Partitioner for better load balancing for skewed data
[ https://issues.apache.org/jira/browse/FLINK-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-1725: -- Assignee: Anis Nasir New Partitioner for better load balancing for skewed data - Key: FLINK-1725 URL: https://issues.apache.org/jira/browse/FLINK-1725 Project: Flink Issue Type: Improvement Components: New Components Affects Versions: 0.8.1 Reporter: Anis Nasir Assignee: Anis Nasir Labels: LoadBalancing, Partitioner Original Estimate: 336h Remaining Estimate: 336h Hi, We have recently studied the problem of load balancing in Storm [1]. In particular, we focused on key distribution of the stream for skewed data. We developed a new stream partitioning scheme (which we call Partial Key Grouping). It achieves better load balancing than key grouping while being more scalable than shuffle grouping in terms of memory. In the paper we show a number of mining algorithms that are easy to implement with partial key grouping, and whose performance can benefit from it. We think that it might also be useful for a larger class of algorithms. Partial key grouping is very easy to implement: it requires just a few lines of code in Java when implemented as a custom grouping in Storm [2]. For all these reasons, we believe it will be a nice addition to the standard Partitioners available in Flink. If the community thinks it's a good idea, we will be happy to offer support in the porting. References: [1]. https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf [2]. https://github.com/gdfm/partial-key-grouping
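The scheme amounts to a few lines in any partitioner API: hash the key with two independent functions and route the record to whichever of the two candidate channels has seen less load so far. A hedged sketch against Flink's Partitioner interface — the second hash and the local load tracking are deliberately simplified:
{code}
import org.apache.flink.api.common.functions.Partitioner;

// Partial Key Grouping: the "power of both choices" applied to key hashing.
public class PartialKeyGrouping<K> implements Partitioner<K> {

    private long[] load; // records sent per channel (each sender's local view)

    @Override
    public int partition(K key, int numPartitions) {
        if (load == null || load.length != numPartitions) {
            load = new long[numPartitions];
        }
        // Two candidate channels derived from the key (mask keeps indices non-negative).
        int h1 = (key.hashCode() & 0x7fffffff) % numPartitions;
        int h2 = ((key.hashCode() * 31 + 17) & 0x7fffffff) % numPartitions; // crude second hash
        // Pick the less loaded of the two candidates.
        int target = load[h1] <= load[h2] ? h1 : h2;
        load[target]++;
        return target;
    }
}
{code}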
[jira] [Commented] (FLINK-1693) Deprecate the Spargel API
[ https://issues.apache.org/jira/browse/FLINK-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375635#comment-14375635 ] Vasia Kalavri commented on FLINK-1693: -- Hey [~hsaputra]! That'd be great, thanks a lot :-) Deprecate the Spargel API - Key: FLINK-1693 URL: https://issues.apache.org/jira/browse/FLINK-1693 Project: Flink Issue Type: Task Components: Spargel Affects Versions: 0.9 Reporter: Vasia Kalavri For the upcoming 0.9 release, we should mark all user-facing methods from the Spargel API as deprecated, with a warning that we are going to remove it at some point. We should also add a comment in the docs and point people to Gelly.
[jira] [Commented] (FLINK-1767) StreamExecutionEnvironment's execute should return JobExecutionResult
[ https://issues.apache.org/jira/browse/FLINK-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14375559#comment-14375559 ] ASF GitHub Bot commented on FLINK-1767: --- Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/516#issuecomment-84879621 LGTM StreamExecutionEnvironment's execute should return JobExecutionResult - Key: FLINK-1767 URL: https://issues.apache.org/jira/browse/FLINK-1767 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Márton Balassi Assignee: Gabor Gevay Although the streaming API does not make use of the accumulators, it is still a nice handle for the execution time and might wrap other features in the future.
[GitHub] flink pull request: [FLINK-1767] [streaming] Make StreamExecutionE...
Github user rmetzger commented on the pull request: https://github.com/apache/flink/pull/516#issuecomment-84879621 LGTM
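For reference, this is what the handle buys streaming users once the PR is in — a usage sketch assuming the accessors match the batch-side JobExecutionResult:
{code}
import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExecuteResultSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // ... build the streaming topology here ...

        // After FLINK-1767, execute() returns the result instead of void.
        JobExecutionResult result = env.execute("my job");
        System.out.println("net runtime (ms): " + result.getNetRuntime());
    }
}
{code}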