[GitHub] flink pull request: [FLINK-1512] Add CsvReader for reading into PO...

2015-03-23 Thread fhueske
Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/426#issuecomment-84977748
  
Hi @chiwanpark 
Thanks for updating the PR! :-)
I was gone for a few days. Will have a look at your PR shortly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1512) Add CsvReader for reading into POJOs.

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375858#comment-14375858
 ] 

ASF GitHub Bot commented on FLINK-1512:
---

Github user fhueske commented on the pull request:

https://github.com/apache/flink/pull/426#issuecomment-84977748
  
Hi @chiwanpark 
Thanks for updating the PR! :-)
I was gone for a few days. Will have a look at your PR shortly.


 Add CsvReader for reading into POJOs.
 -

 Key: FLINK-1512
 URL: https://issues.apache.org/jira/browse/FLINK-1512
 Project: Flink
  Issue Type: New Feature
  Components: Java API, Scala API
Reporter: Robert Metzger
Assignee: Chiwan Park
Priority: Minor
  Labels: starter

 Currently, the {{CsvReader}} supports only TupleXX types. 
 It would be nice if users were also able to read into POJOs.
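 Conceptually, reading a CSV record into a POJO means mapping each comma-separated field, by position, onto a named POJO field. A plain-Java sketch of that mapping (the Person class and its fields are hypothetical examples, not part of the Flink API):

```java
// Plain-Java sketch of mapping one CSV line onto a POJO by field position.
// "Person" and its fields are hypothetical, not Flink classes.
public class CsvToPojoSketch {

    public static class Person {
        public String name;
        public int age;
    }

    // Split one CSV line and assign the values to the POJO's fields,
    // converting each field to the declared type.
    public static Person parse(String line) {
        String[] parts = line.split(",");
        Person p = new Person();
        p.name = parts[0];
        p.age = Integer.parseInt(parts[1]);
        return p;
    }

    public static void main(String[] args) {
        Person p = parse("Alice,30");
        System.out.println(p.name + " " + p.age); // Alice 30
    }
}
```

 A reader supporting POJOs would generalize this per-position mapping to user-declared field names and types instead of Tuple positions.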



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-1770) Rename the variable 'contentAdressable' to 'contentAddressable'

2015-03-23 Thread Sibao Hong (JIRA)
Sibao Hong created FLINK-1770:
-

 Summary: Rename the variable 'contentAdressable' to 
'contentAddressable'
 Key: FLINK-1770
 URL: https://issues.apache.org/jira/browse/FLINK-1770
 Project: Flink
  Issue Type: Bug
Reporter: Sibao Hong
Priority: Minor


Rename the variable 'contentAdressable' to 'contentAddressable' for better 
understanding.





[jira] [Commented] (FLINK-1633) Add getTriplets() Gelly method

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375756#comment-14375756
 ] 

ASF GitHub Bot commented on FLINK-1633:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/452#discussion_r26928420
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/EuclideanGraphExample.java
 ---
@@ -0,0 +1,214 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.graph.example;
+
+import org.apache.flink.api.common.ProgramDescription;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.java.DataSet;
+import org.apache.flink.api.java.ExecutionEnvironment;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple3;
+import org.apache.flink.graph.Edge;
+import org.apache.flink.graph.Graph;
+import org.apache.flink.graph.Triplet;
+import org.apache.flink.graph.Vertex;
+import org.apache.flink.graph.example.utils.EuclideanGraphData;
+
+import java.io.Serializable;
+
+/**
+ * Given a directed, unweighted graph, with vertex values representing points in a plane,
+ * return a weighted graph where the edge weights are equal to the Euclidean distance between the
+ * src and the trg vertex values.
+ *
+ * <p>
+ * Input files are plain text files and must be formatted as follows:
+ * <ul>
+ * <li> Vertices are represented by their vertexIds and vertex values and are separated by newlines,
+ * the value being formed of two doubles separated by a comma.
+ * For example: <code>1,1.0,1.0\n2,2.0,2.0\n3,3.0,3.0\n</code> defines a data set of three vertices
+ * <li> Edges are represented by triples of srcVertexId, trgVertexId, weight which are
+ * separated by commas. Edges themselves are separated by newlines. The initial edge value will be overwritten.
--- End diff --

Hey @andralungu! Sorry for not noticing this earlier, but why give a 
weighted edge as input if you're going to override the weight anyway?
And in the beginning of the description, you refer to an unweighted graph.


 Add getTriplets() Gelly method
 --

 Key: FLINK-1633
 URL: https://issues.apache.org/jira/browse/FLINK-1633
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Affects Versions: 0.9
Reporter: Vasia Kalavri
Assignee: Andra Lungu
Priority: Minor
  Labels: starter

 In some graph algorithms, it is required to access the graph edges together 
 with the vertex values of the source and target vertices. For example, 
 several graph weighting schemes compute some kind of similarity weights for 
 edges, based on the attributes of the source and target vertices. This issue 
 proposes adding a convenience Gelly method that generates a DataSet of 
 srcVertex, Edge, TrgVertex triplets from the input graph.
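 As an illustration of such a weighting scheme, the edge weight can be derived purely from the two vertex values of a triplet. A standalone Java sketch (not the actual Gelly Triplet API) using Euclidean distance between 2D vertex values:

```java
// Standalone sketch of an edge-weighting scheme over the (srcValue, trgValue)
// pair of a triplet, as a triplet mapper might compute it. Not the Gelly API.
public class EuclideanWeightSketch {

    // Euclidean distance between two 2D points (x1,y1) and (x2,y2),
    // used as the new weight of the edge connecting them.
    public static double weight(double x1, double y1, double x2, double y2) {
        double dx = x2 - x1;
        double dy = y2 - y1;
        return Math.sqrt(dx * dx + dy * dy);
    }

    public static void main(String[] args) {
        // Edge between vertex values (1.0,1.0) and (4.0,5.0) gets weight 5.0.
        System.out.println(weight(1.0, 1.0, 4.0, 5.0)); // 5.0
    }
}
```

 With a triplets DataSet available, such a function only needs the source and target vertex values, which is exactly what the proposed convenience method exposes.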





[GitHub] flink pull request: [FLINK-1633][gelly] Added getTriplets() method...

2015-03-23 Thread vasia
Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/452#discussion_r26928666
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -53,6 +54,7 @@
 import org.apache.flink.graph.utils.Tuple2ToVertexMap;
 import org.apache.flink.graph.utils.Tuple3ToEdgeMap;
 import org.apache.flink.graph.utils.VertexToTuple2Map;
+import org.apache.flink.graph.utils.Tuple2ToVertexMap;
--- End diff --

and this import is not used anywhere :-)




[GitHub] flink pull request: [FLINK-1633][gelly] Added getTriplets() method...

2015-03-23 Thread vasia
Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/452#discussion_r26928684
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -53,6 +54,7 @@
 import org.apache.flink.graph.utils.Tuple2ToVertexMap;
 import org.apache.flink.graph.utils.Tuple3ToEdgeMap;
 import org.apache.flink.graph.utils.VertexToTuple2Map;
+import org.apache.flink.graph.utils.Tuple2ToVertexMap;
--- End diff --

my bad, it's duplicate






[GitHub] flink pull request: [FLINK-1615] [java api] SimpleTweetInputFormat

2015-03-23 Thread Elbehery
Github user Elbehery commented on the pull request:

https://github.com/apache/flink/pull/442#issuecomment-84968643
  
@rmetzger  I think it failed again, but I can't see the reason.




[jira] [Commented] (FLINK-1615) Introduces a new InputFormat for Tweets

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375784#comment-14375784
 ] 

ASF GitHub Bot commented on FLINK-1615:
---

Github user Elbehery commented on the pull request:

https://github.com/apache/flink/pull/442#issuecomment-84968643
  
@rmetzger  I think it failed again, but I can't see the reason.


 Introduces a new InputFormat for Tweets
 ---

 Key: FLINK-1615
 URL: https://issues.apache.org/jira/browse/FLINK-1615
 Project: Flink
  Issue Type: New Feature
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: mustafa elbehery
Priority: Minor

 An event-driven parser for Tweets into Java POJOs. 
 It parses all the important parts of the tweet into Java objects. 
 Tested on a cluster, and the performance is pretty good. 





[jira] [Updated] (FLINK-1770) Rename the variable 'contentAdressable' to 'contentAddressable'

2015-03-23 Thread Sibao Hong (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sibao Hong updated FLINK-1770:
--
Fix Version/s: master






[jira] [Updated] (FLINK-1770) Rename the variable 'contentAdressable' to 'contentAddressable'

2015-03-23 Thread Sibao Hong (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sibao Hong updated FLINK-1770:
--
Affects Version/s: master







[jira] [Resolved] (FLINK-1373) Add documentation for intermediate results and network stack to internals

2015-03-23 Thread Ufuk Celebi (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi resolved FLINK-1373.

Resolution: Fixed

Fixed in 
https://cwiki.apache.org/confluence/display/FLINK/Data+exchange+between+tasks

 Add documentation for intermediate results and network stack to internals
 -

 Key: FLINK-1373
 URL: https://issues.apache.org/jira/browse/FLINK-1373
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Runtime, Documentation
Reporter: Ufuk Celebi
Assignee: Ufuk Celebi

 There is a short overview in the respective pull request.
 https://github.com/apache/flink/pull/254





[jira] [Commented] (FLINK-1319) Add static code analysis for UDFs

2015-03-23 Thread Ufuk Celebi (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376003#comment-14376003
 ] 

Ufuk Celebi commented on FLINK-1319:


Hey Timo,

great news! :-)

1. What about adding it to staging?

2. I would very much like to have this on by default in the future. But I agree 
that we should not do this now. We really need to be certain that we don't 
introduce wrong annotations. They might cause some hard-to-understand problems 
for new users when enabled by default.

As a first step, it makes sense to make this as explicit as possible, for example 
with an {{optimizeUdf()}} method as you propose.

 Add static code analysis for UDFs
 -

 Key: FLINK-1319
 URL: https://issues.apache.org/jira/browse/FLINK-1319
 Project: Flink
  Issue Type: New Feature
  Components: Java API, Scala API
Reporter: Stephan Ewen
Assignee: Timo Walther
Priority: Minor

 Flink's Optimizer takes information that tells it for UDFs which fields of 
 the input elements are accessed, modified, or forwarded/copied. This 
 information frequently helps to reuse partitionings, sorts, etc. It may speed 
 up programs significantly, as it can frequently eliminate sorts and shuffles, 
 which are costly.
 Right now, users can add lightweight annotations to UDFs to provide this 
 information (such as adding {{@ConstantFields(0->3, 1, 2->1)}}).
 We worked with static code analysis of UDFs before, to determine this 
 information automatically. This is an incredible feature, as it magically 
 makes programs faster.
 For record-at-a-time operations (Map, Reduce, FlatMap, Join, Cross), this 
 works surprisingly well in many cases. We used the Soot toolkit for the 
 static code analysis. Unfortunately, Soot is LGPL licensed and thus we did 
 not include any of the code so far.
 I propose to add this functionality to Flink, in the form of a drop-in 
 addition, to work around the LGPL incompatibility with ASL 2.0. Users could 
 simply download a special flink-code-analysis.jar and drop it into the 
 lib folder to enable this functionality. We may even add a script to 
 tools that downloads that library automatically into the lib folder. This 
 should be legally fine, since we do not redistribute LGPL code and only 
 dynamically link it (the incompatibility with ASL 2.0 is mainly in the 
 patentability, if I remember correctly).
 Prior work on this has been done by [~aljoscha] and [~skunert], which could 
 provide a code base to start with.
 *Appendix*
 Homepage of the Soot static analysis toolkit: http://www.sable.mcgill.ca/soot/
 Papers on static analysis and for optimization: 
 http://stratosphere.eu/assets/papers/EnablingOperatorReorderingSCA_12.pdf and 
 http://stratosphere.eu/assets/papers/openingTheBlackBoxes_12.pdf
 Quick introduction to the Optimizer: 
 http://stratosphere.eu/assets/papers/2014-VLDBJ_Stratosphere_Overview.pdf 
 (Section 6)
 Optimizer for Iterations: 
 http://stratosphere.eu/assets/papers/spinningFastIterativeDataFlows_12.pdf 
 (Sections 4.3 and 5.3)
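 To make the semantics of such a constant-fields hint concrete: a hint like 0->3 promises that input position 0 reaches output position 3 unchanged, which lets the optimizer reuse an existing partitioning or sort order across the operator. A plain-Java sketch of that promise (hypothetical names, not Flink's annotation machinery):

```java
// Sketch of what a constant-fields hint like "0->3" promises: input field 0
// reappears unchanged as output field 3. Hypothetical, not Flink's API.
public class ConstantFieldsSketch {

    // A record function that builds a 4-field output from a 2-field input,
    // forwarding input position 0 to output position 3 unchanged.
    public static Object[] map(Object[] in) {
        return new Object[] { "derived", in[1], "derived2", in[0] };
    }

    // Verify the "0->3" promise on a sample record. If it holds for all
    // records, a partitioning on input field 0 is still valid on output
    // field 3, so a shuffle or sort can be skipped.
    public static boolean forwards0To3(Object[] in) {
        return map(in)[3].equals(in[0]);
    }

    public static void main(String[] args) {
        System.out.println(forwards0To3(new Object[] { "key", 7 })); // true
    }
}
```

 Static code analysis would derive such forwarding facts from the UDF bytecode automatically, instead of relying on hand-written annotations.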





[jira] [Commented] (FLINK-1744) Change the reference of slaves to workers to match the description of the system

2015-03-23 Thread Henry Saputra (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376010#comment-14376010
 ] 

Henry Saputra commented on FLINK-1744:
--

If this is too much of a distraction for now, I could just close it and maybe 
revisit it in the future.

 Change the reference of slaves to workers to match the description of the 
 system
 

 Key: FLINK-1744
 URL: https://issues.apache.org/jira/browse/FLINK-1744
 Project: Flink
  Issue Type: Improvement
  Components: core, Documentation
Reporter: Henry Saputra
Priority: Trivial

 There are some references to slaves which actually mean workers.
 We need to change them to use workers whenever possible, unless required when 
 communicating with external systems like Apache Hadoop.





[jira] [Commented] (FLINK-1744) Change the reference of slaves to workers to match the description of the system

2015-03-23 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376033#comment-14376033
 ] 

Robert Metzger commented on FLINK-1744:
---

Oh ... sorry. I didn't want to ask you to close the issue.
I'm not sure how the other committers think about this. Maybe others agree with 
the change.

I'm hesitant here because all changes to scripts, config files etc. often break 
a lot of things (automated testing setups, docker/VM scripts, debian/rpm 
packages, other packaging). Even worse, these things are hard to test.







[jira] [Closed] (FLINK-1744) Change the reference of slaves to workers to match the description of the system

2015-03-23 Thread Henry Saputra (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Saputra closed FLINK-1744.

Resolution: Won't Fix

Punting it to the backlog to avoid confusion.






[jira] [Commented] (FLINK-1744) Change the reference of slaves to workers to match the description of the system

2015-03-23 Thread Henry Saputra (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375972#comment-14375972
 ] 

Henry Saputra commented on FLINK-1744:
--

It is common, but it comes from the old master-slaves architecture terminology. The 
term Flink uses is worker for the nodes that do all the work.
I filed this JIRA to see whether it is better for Flink to match the terms used in 
the project.







[jira] [Created] (FLINK-1771) Add tests for setting the processing slots for the YARN client

2015-03-23 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1771:
-

 Summary: Add tests for setting the processing slots for the YARN 
client
 Key: FLINK-1771
 URL: https://issues.apache.org/jira/browse/FLINK-1771
 Project: Flink
  Issue Type: Improvement
  Components: YARN Client
Reporter: Robert Metzger
Assignee: Robert Metzger


We need tests ensuring that the processing slots are set properly when starting 
Flink on YARN, in particular with the per job YARN session feature.

Also, the YARN tests for detached YARN sessions / per job yarn clusters are 
polluting the local home-directory.





[jira] [Updated] (FLINK-1771) Add tests for setting the processing slots for the YARN client

2015-03-23 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1771:
--
Affects Version/s: 0.9






[GitHub] flink pull request: [FLINK-1650] Configure Netty (akka) to use Slf...

2015-03-23 Thread rmetzger
GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/518

[FLINK-1650] Configure Netty (akka) to use Slf4j.

It seems that Netty 3.8.0 used by Akka was using `java.util.logging` for 
its internal logging, that's why this entry was without effect: 
https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/conf/log4j.properties#L29.

The change sets the Netty logging factory to Slf4j. This means that Netty 
is now using Slf4j. It'll also respect our logging settings 
(`org.jboss.netty.channel.DefaultChannelPipeline` at level ERROR).

Let's see if the exception now disappears.
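For reference, the log4j entry in question would look like the line below (assuming a standard log4j.properties); it only takes effect once Netty routes its logging through Slf4j, presumably via a call such as `InternalLoggerFactory.setDefaultFactory(new Slf4JLoggerFactory())`:

```properties
# Suppress the misleading shutdown stack trace from Akka's bundled Netty
log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR
```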

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink flink1650

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/518.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #518


commit d4778d4d51cd20db2861b3632e64aeac1f5f6f7a
Author: Robert Metzger rmetz...@apache.org
Date:   2015-03-23T10:42:21Z

[FLINK-1650] Set akka version to 2.3.9

commit 69b6a9b497955544e0e5c8707293c580e1ba3d5e
Author: Robert Metzger rmetz...@apache.org
Date:   2015-03-23T13:44:19Z

[FLINK-1650] Let Netty(Akka) use Slf4j






[jira] [Commented] (FLINK-1650) Suppress Akka's Netty Shutdown Errors through the log config

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375907#comment-14375907
 ] 

ASF GitHub Bot commented on FLINK-1650:
---

GitHub user rmetzger opened a pull request:

https://github.com/apache/flink/pull/518

[FLINK-1650] Configure Netty (akka) to use Slf4j.

It seems that Netty 3.8.0 used by Akka was using `java.util.logging` for 
its internal logging, that's why this entry was without effect: 
https://github.com/apache/flink/blob/master/flink-dist/src/main/flink-bin/conf/log4j.properties#L29.

The change sets the Netty logging factory to Slf4j. This means that Netty 
is now using Slf4j. It'll also respect our logging settings 
(`org.jboss.netty.channel.DefaultChannelPipeline` at level ERROR).

Let's see if the exception now disappears.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmetzger/flink flink1650

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/518.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #518


commit d4778d4d51cd20db2861b3632e64aeac1f5f6f7a
Author: Robert Metzger rmetz...@apache.org
Date:   2015-03-23T10:42:21Z

[FLINK-1650] Set akka version to 2.3.9

commit 69b6a9b497955544e0e5c8707293c580e1ba3d5e
Author: Robert Metzger rmetz...@apache.org
Date:   2015-03-23T13:44:19Z

[FLINK-1650] Let Netty(Akka) use Slf4j




 Suppress Akka's Netty Shutdown Errors through the log config
 

 Key: FLINK-1650
 URL: https://issues.apache.org/jira/browse/FLINK-1650
 Project: Flink
  Issue Type: Bug
  Components: other
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.9


 I suggest to set the logging for 
 `org.jboss.netty.channel.DefaultChannelPipeline` to error, in order to get 
 rid of the misleading stack trace caused by an akka/netty hickup on shutdown.





[GitHub] flink pull request: FLINK-1560 - Add ITCases for streaming example...

2015-03-23 Thread szape
GitHub user szape opened a pull request:

https://github.com/apache/flink/pull/519

FLINK-1560 - Add ITCases for streaming examples

ITCases for streaming examples have been added, except for IterateExample 
and StockPrices. I replaced the IterateExample with one that generates 
Fibonacci sequences. Something is buggy around iteration, so that ITCase has 
to wait. StockPrices is untestable because of the windowJoin operator, hence it 
will not have an ITCase right now.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink FLINK-1560

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/519.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #519


commit 16da0d092ddc192afdae3a3c5a0a57144390c8ff
Author: szape nemderogator...@gmail.com
Date:   2015-03-05T15:27:44Z

[Flink-1560] [streaming] Streaming examples rework

commit 45fe9bd3381ef9cf99e8c9f0570e66e38d78b383
Author: szape nemderogator...@gmail.com
Date:   2015-03-05T15:30:31Z

[Flink-1560] [streaming] Added ITCases to streaming examples

commit 237f08d3eaf2da36c8a9d48145b2544dcbb41012
Author: szape nemderogator...@gmail.com
Date:   2015-03-23T09:28:37Z

[FLINK-1560] [streaming] Added iterate example






[jira] [Commented] (FLINK-1560) Add ITCases for streaming examples

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375914#comment-14375914
 ] 

ASF GitHub Bot commented on FLINK-1560:
---

GitHub user szape opened a pull request:

https://github.com/apache/flink/pull/519

FLINK-1560 - Add ITCases for streaming examples

ITCases for streaming examples have been added, except for IterateExample 
and StockPrices. I replaced the IterateExample with one that generates 
Fibonacci sequences. Something is buggy around iteration, so that ITCase has 
to wait. StockPrices is untestable because of the windowJoin operator, hence it 
will not have an ITCase right now.
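The reworked iteration logic can be sketched outside Flink: each element carries its two seed values, the current Fibonacci pair, and a step counter, and one iteration advances the pair by one step. A minimal plain-Java sketch (the class name, bound, and stop condition are illustrative, not the exact example code):

```java
// Sketch of the iteration step from the reworked example: a 5-tuple
// (seed1, seed2, f_n, f_n+1, counter) is advanced until the next
// Fibonacci value reaches a bound. Plain Java, no Flink types.
class FibonacciStepSketch {

    static final int BOUND = 100; // illustrative bound

    // one application of the Step map:
    // (a, b, f_n, f_n+1, c) -> (a, b, f_n+1, f_n + f_n+1, c + 1)
    static int[] step(int[] t) {
        return new int[]{t[0], t[1], t[3], t[2] + t[3], t[4] + 1};
    }

    // loop until the next value would exceed the bound
    // (stands in for the feedback edge selected by the OutputSelector)
    static int[] iterate(int first, int second) {
        int[] t = {first, second, first, second, 0};
        while (t[2] + t[3] < BOUND) {
            t = step(t);
        }
        return t;
    }

    public static void main(String[] args) {
        int[] result = iterate(1, 2);
        // seeds 1,2 -> 1,2,3,5,8,13,21,34,55,89 ...
        System.out.println(result[2] + " " + result[3] + " steps=" + result[4]);
        // prints "55 89 steps=8"
    }
}
```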

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink FLINK-1560

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/519.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #519


commit 16da0d092ddc192afdae3a3c5a0a57144390c8ff
Author: szape nemderogator...@gmail.com
Date:   2015-03-05T15:27:44Z

[Flink-1560] [streaming] Streaming examples rework

commit 45fe9bd3381ef9cf99e8c9f0570e66e38d78b383
Author: szape nemderogator...@gmail.com
Date:   2015-03-05T15:30:31Z

[Flink-1560] [streaming] Added ITCases to streaming examples

commit 237f08d3eaf2da36c8a9d48145b2544dcbb41012
Author: szape nemderogator...@gmail.com
Date:   2015-03-23T09:28:37Z

[FLINK-1560] [streaming] Added iterate example




 Add ITCases for streaming examples
 --

 Key: FLINK-1560
 URL: https://issues.apache.org/jira/browse/FLINK-1560
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.9
Reporter: Márton Balassi
Assignee: Péter Szabó

 Currently there are no tests for consistency of the streaming example 
 programs. This might be a real show stopper for users who encounter an issue 
 there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: FLINK-1560 - Add ITCases for streaming example...

2015-03-23 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/519#discussion_r26938775
  
--- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/test/java/org/apache/flink/streaming/examples/test/socket/SocketTextStreamWordCountITCase.java ---
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.examples.test.socket;
+
+import org.apache.flink.streaming.examples.socket.SocketTextStreamWordCount;
+import org.apache.flink.streaming.examples.socket.util.SocketTextStreamWordCountData;
+import org.apache.flink.streaming.util.StreamingProgramTestBase;
+
+import java.io.PrintWriter;
+import java.net.ServerSocket;
+import java.net.Socket;
+
+public class SocketTextStreamWordCountITCase extends StreamingProgramTestBase {
+
+   private static final String HOST = "localhost";
+   private static final String PORT = ;
+   protected String resultPath;
+
+   private ServerSocket temporarySocket;
+
+   @Override
+   protected void preSubmit() throws Exception {
+      temporarySocket = createSocket(HOST, Integer.valueOf(PORT), SocketTextStreamWordCountData.SOCKET_TEXT);
+      resultPath = getTempDirPath("result");
+   }
+
+   @Override
+   protected void postSubmit() throws Exception {
+      compareResultsByLinesInMemory(SocketTextStreamWordCountData.STREAMING_COUNTS_AS_TUPLES, resultPath);
+      temporarySocket.close();
+   }
+
+   @Override
+   protected void testProgram() throws Exception {
+      SocketTextStreamWordCount.main(new String[]{HOST, PORT, resultPath});
+   }
+
+   public ServerSocket createSocket(String host, int port, String contents) throws Exception {
+      ServerSocket serverSocket = new ServerSocket(port);
+      ServerThread st = new ServerThread(serverSocket, contents);
+      st.start();
+      return serverSocket;
+   }
+
+   private static class ServerThread extends Thread {
+
+      private ServerSocket serverSocket;
+      private String contents;
+      private Thread t;
+
+      public ServerThread(ServerSocket serverSocket, String contents) {
+         this.serverSocket = serverSocket;
+         this.contents = contents;
+         t = new Thread(this);
+      }
+
+      public void waitForAccept() throws Exception {
+         Socket socket = serverSocket.accept();
+         PrintWriter writer = new PrintWriter(socket.getOutputStream(), true);
+         writer.println(contents);
+         writer.close();
+         socket.close();
+      }
+
+      public void run() {
+         try {
+            waitForAccept();
+         } catch (Exception e) {
+            e.printStackTrace();
--- End diff --

I would fail the test in case of an exception.
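A sketch of that suggestion: instead of swallowing the exception with printStackTrace(), the server thread can record it and let the test rethrow it later (for example from postSubmit()), so the failure actually fails the test. Class and method names here are illustrative, not the merged code:

```java
// Sketch: capture an exception thrown inside a helper thread so the
// enclosing test can fail on it, instead of printing the stack trace.
import java.util.concurrent.atomic.AtomicReference;

class FailingServerThread extends Thread {

    // holds the first exception raised inside the thread, if any
    private final AtomicReference<Exception> error = new AtomicReference<>();

    @Override
    public void run() {
        try {
            // stand-in for the real server logic (accept, write, close)
            throw new Exception("simulated server failure");
        } catch (Exception e) {
            error.compareAndSet(null, e); // remember instead of printStackTrace()
        }
    }

    /** Rethrows any captured exception; call this from the test's teardown. */
    public void checkError() throws Exception {
        Exception e = error.get();
        if (e != null) {
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        FailingServerThread t = new FailingServerThread();
        t.start();
        t.join();
        boolean failed = false;
        try {
            t.checkError();
        } catch (Exception e) {
            failed = true;
        }
        System.out.println(failed ? "test would fail" : "test would pass");
        // prints "test would fail"
    }
}
```

In a JUnit-based ITCase the rethrown exception propagates out of the test method, which is exactly the "fail the test" behavior the comment asks for.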




[jira] [Commented] (FLINK-1560) Add ITCases for streaming examples

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375924#comment-14375924 ]

ASF GitHub Bot commented on FLINK-1560:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/519#discussion_r26938775
  
--- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/test/java/org/apache/flink/streaming/examples/test/socket/SocketTextStreamWordCountITCase.java ---
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.examples.test.socket;
+
+import org.apache.flink.streaming.examples.socket.SocketTextStreamWordCount;
+import org.apache.flink.streaming.examples.socket.util.SocketTextStreamWordCountData;
+import org.apache.flink.streaming.util.StreamingProgramTestBase;
+
+import java.io.PrintWriter;
+import java.net.ServerSocket;
+import java.net.Socket;
+
+public class SocketTextStreamWordCountITCase extends StreamingProgramTestBase {
+
+   private static final String HOST = "localhost";
+   private static final String PORT = ;
+   protected String resultPath;
+
+   private ServerSocket temporarySocket;
+
+   @Override
+   protected void preSubmit() throws Exception {
+      temporarySocket = createSocket(HOST, Integer.valueOf(PORT), SocketTextStreamWordCountData.SOCKET_TEXT);
+      resultPath = getTempDirPath("result");
+   }
+
+   @Override
+   protected void postSubmit() throws Exception {
+      compareResultsByLinesInMemory(SocketTextStreamWordCountData.STREAMING_COUNTS_AS_TUPLES, resultPath);
+      temporarySocket.close();
+   }
+
+   @Override
+   protected void testProgram() throws Exception {
+      SocketTextStreamWordCount.main(new String[]{HOST, PORT, resultPath});
+   }
+
+   public ServerSocket createSocket(String host, int port, String contents) throws Exception {
+      ServerSocket serverSocket = new ServerSocket(port);
+      ServerThread st = new ServerThread(serverSocket, contents);
+      st.start();
+      return serverSocket;
+   }
+
+   private static class ServerThread extends Thread {
+
+      private ServerSocket serverSocket;
+      private String contents;
+      private Thread t;
+
+      public ServerThread(ServerSocket serverSocket, String contents) {
+         this.serverSocket = serverSocket;
+         this.contents = contents;
+         t = new Thread(this);
+      }
+
+      public void waitForAccept() throws Exception {
+         Socket socket = serverSocket.accept();
+         PrintWriter writer = new PrintWriter(socket.getOutputStream(), true);
+         writer.println(contents);
+         writer.close();
+         socket.close();
+      }
+
+      public void run() {
+         try {
+            waitForAccept();
+         } catch (Exception e) {
+            e.printStackTrace();
--- End diff --

I would fail the test in case of an exception.


 Add ITCases for streaming examples
 --

 Key: FLINK-1560
 URL: https://issues.apache.org/jira/browse/FLINK-1560
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.9
Reporter: Márton Balassi
Assignee: Péter Szabó

 Currently there are no tests for consistency of the streaming example 
 programs. This might be a real show stopper for users who encounter an issue 
 there.





[GitHub] flink pull request: [FLINK-1595] [streaming] Complex integration t...

2015-03-23 Thread szape
GitHub user szape opened a pull request:

https://github.com/apache/flink/pull/520

[FLINK-1595] [streaming] Complex integration test wip

Flink Streaming's complex integration test will test interactions of 
different streaming operators and settings.

The test topology will run in one environment but will be divided into 
several subgraphs, tested separately from one another.
In this way, many new tests can be added later.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink FLINK-1595

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/520.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #520


commit e24f24354add0800623189a4af70b3f04164469c
Author: szape nemderogator...@gmail.com
Date:   2015-03-23T09:48:12Z

[FLINK-1595] [streaming] Complex integration test wip






[jira] [Commented] (FLINK-1560) Add ITCases for streaming examples

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375938#comment-14375938 ]

ASF GitHub Bot commented on FLINK-1560:
---

Github user mbalassi commented on a diff in the pull request:

https://github.com/apache/flink/pull/519#discussion_r26939215
  
--- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/iteration/IterateExample.java ---
@@ -103,57 +115,124 @@ public static void main(String[] args) throws Exception {
   // *************************************************************************

   /**
-   * Iteration step function which takes an input (Double , Integer) and
-   * produces an output (Double + random, Integer + 1).
+   * Generate random integer pairs from the range from 0 to BOUND/2
+   */
+   private static class RandomFibonacciSource implements SourceFunction<Tuple2<Integer, Integer>> {
+      private static final long serialVersionUID = 1L;
+
+      private Random rnd = new Random();
+
+      @Override
+      public void run(Collector<Tuple2<Integer, Integer>> collector) throws Exception {
+         while(true) {
+            int first = rnd.nextInt(BOUND/2 - 1) + 1;
+            int second = rnd.nextInt(BOUND/2 - 1) + 1;
+
+            collector.collect(new Tuple2<Integer, Integer>(first, second));
+            Thread.sleep(100L);
+         }
+      }
+
+      @Override
+      public void cancel() {
+         // no cleanup needed
+      }
+   }
+
+   /**
+   * Generate random integer pairs from the range from 0 to BOUND/2
    */
-   public static class Step extends
-      RichMapFunction<Tuple2<Double, Integer>, Tuple2<Double, Integer>> {
+   private static class FibonacciInputMap implements MapFunction<String, Tuple2<Integer, Integer>> {
      private static final long serialVersionUID = 1L;
-      private transient Random rnd;

-      public void open(Configuration parameters) {
-         rnd = new Random();
+      @Override
+      public Tuple2<Integer, Integer> map(String value) throws Exception {
+         Thread.sleep(100L);
+         String record = value.substring(1, value.length()-1);
+         String[] splitted = record.split(",");
+         return new Tuple2<Integer, Integer>(Integer.parseInt(splitted[0]), Integer.parseInt(splitted[1]));
      }
+   }
+
+   /**
+   * Map the inputs so that the next Fibonacci numbers can be calculated while preserving the original input tuple
+   * A counter is attached to the tuple and incremented in every iteration step
+   */
+   public static class InputMap implements MapFunction<Tuple2<Integer, Integer>, Tuple5<Integer, Integer, Integer, Integer, Integer>> {
+
+      @Override
+      public Tuple5<Integer, Integer, Integer, Integer, Integer> map(Tuple2<Integer, Integer> value) throws Exception {
+         return new Tuple5<Integer, Integer, Integer, Integer, Integer>(value.f0, value.f1, value.f0, value.f1, 0);
+      }
+   }
+
+   /**
+   * Iteration step function that calculates the next Fibonacci number
+   */
+   public static class Step implements
+      MapFunction<Tuple5<Integer, Integer, Integer, Integer, Integer>, Tuple5<Integer, Integer, Integer, Integer, Integer>> {
+      private static final long serialVersionUID = 1L;

      @Override
-      public Tuple2<Double, Integer> map(Tuple2<Double, Integer> value) throws Exception {
-         return new Tuple2<Double, Integer>(value.f0 + rnd.nextDouble(), value.f1 + 1);
+      public Tuple5<Integer, Integer, Integer, Integer, Integer> map(Tuple5<Integer, Integer, Integer, Integer, Integer> value) throws Exception {
+         return new Tuple5<Integer, Integer, Integer, Integer, Integer>(value.f0, value.f1, value.f3, value.f2 + value.f3, ++value.f4);
      }
   }

   /**
   * OutputSelector testing which tuple needs to be iterated again.
   */
-   public static class MySelector implements OutputSelector<Tuple2<Double, Integer>> {
+   public static class MySelector implements OutputSelector<Tuple5<Integer, Integer, Integer, Integer, Integer>> {
      private static final long serialVersionUID = 1L;

      @Override
-      public Iterable<String> select(Tuple2<Double, Integer> value) {
+      public 

[GitHub] flink pull request: FLINK-1560 - Add ITCases for streaming example...

2015-03-23 Thread mbalassi
Github user mbalassi commented on a diff in the pull request:

https://github.com/apache/flink/pull/519#discussion_r26939215
  
--- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/iteration/IterateExample.java ---
flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/iteration/IterateExample.java
 ---
@@ -103,57 +115,124 @@ public static void main(String[] args) throws Exception {
   // *************************************************************************

   /**
-   * Iteration step function which takes an input (Double , Integer) and
-   * produces an output (Double + random, Integer + 1).
+   * Generate random integer pairs from the range from 0 to BOUND/2
+   */
+   private static class RandomFibonacciSource implements SourceFunction<Tuple2<Integer, Integer>> {
+      private static final long serialVersionUID = 1L;
+
+      private Random rnd = new Random();
+
+      @Override
+      public void run(Collector<Tuple2<Integer, Integer>> collector) throws Exception {
+         while(true) {
+            int first = rnd.nextInt(BOUND/2 - 1) + 1;
+            int second = rnd.nextInt(BOUND/2 - 1) + 1;
+
+            collector.collect(new Tuple2<Integer, Integer>(first, second));
+            Thread.sleep(100L);
+         }
+      }
+
+      @Override
+      public void cancel() {
+         // no cleanup needed
+      }
+   }
+
+   /**
+   * Generate random integer pairs from the range from 0 to BOUND/2
    */
-   public static class Step extends
-      RichMapFunction<Tuple2<Double, Integer>, Tuple2<Double, Integer>> {
+   private static class FibonacciInputMap implements MapFunction<String, Tuple2<Integer, Integer>> {
      private static final long serialVersionUID = 1L;
-      private transient Random rnd;

-      public void open(Configuration parameters) {
-         rnd = new Random();
+      @Override
+      public Tuple2<Integer, Integer> map(String value) throws Exception {
+         Thread.sleep(100L);
+         String record = value.substring(1, value.length()-1);
+         String[] splitted = record.split(",");
+         return new Tuple2<Integer, Integer>(Integer.parseInt(splitted[0]), Integer.parseInt(splitted[1]));
      }
+   }
+
+   /**
+   * Map the inputs so that the next Fibonacci numbers can be calculated while preserving the original input tuple
+   * A counter is attached to the tuple and incremented in every iteration step
+   */
+   public static class InputMap implements MapFunction<Tuple2<Integer, Integer>, Tuple5<Integer, Integer, Integer, Integer, Integer>> {
+
+      @Override
+      public Tuple5<Integer, Integer, Integer, Integer, Integer> map(Tuple2<Integer, Integer> value) throws Exception {
+         return new Tuple5<Integer, Integer, Integer, Integer, Integer>(value.f0, value.f1, value.f0, value.f1, 0);
+      }
+   }
+
+   /**
+   * Iteration step function that calculates the next Fibonacci number
+   */
+   public static class Step implements
+      MapFunction<Tuple5<Integer, Integer, Integer, Integer, Integer>, Tuple5<Integer, Integer, Integer, Integer, Integer>> {
+      private static final long serialVersionUID = 1L;

      @Override
-      public Tuple2<Double, Integer> map(Tuple2<Double, Integer> value) throws Exception {
-         return new Tuple2<Double, Integer>(value.f0 + rnd.nextDouble(), value.f1 + 1);
+      public Tuple5<Integer, Integer, Integer, Integer, Integer> map(Tuple5<Integer, Integer, Integer, Integer, Integer> value) throws Exception {
+         return new Tuple5<Integer, Integer, Integer, Integer, Integer>(value.f0, value.f1, value.f3, value.f2 + value.f3, ++value.f4);
      }
   }

   /**
   * OutputSelector testing which tuple needs to be iterated again.
   */
-   public static class MySelector implements OutputSelector<Tuple2<Double, Integer>> {
+   public static class MySelector implements OutputSelector<Tuple5<Integer, Integer, Integer, Integer, Integer>> {
      private static final long serialVersionUID = 1L;

      @Override
-      public Iterable<String> select(Tuple2<Double, Integer> value) {
+      public Iterable<String> select(Tuple5<Integer, Integer, Integer, Integer, Integer> value) {
         List<String> output = new ArrayList<String>();
-         if (value.f0 > 100) {
-            output.add("output");
-

[jira] [Commented] (FLINK-1560) Add ITCases for streaming examples

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375943#comment-14375943 ]

ASF GitHub Bot commented on FLINK-1560:
---

Github user mbalassi commented on a diff in the pull request:

https://github.com/apache/flink/pull/519#discussion_r26939379
  
--- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/socket/util/SocketTextStreamWordCountData.java ---
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.examples.socket.util;
+
+public class SocketTextStreamWordCountData {
+
--- End diff --

I think we can simply use the wordcount data here, can't we?


 Add ITCases for streaming examples
 --

 Key: FLINK-1560
 URL: https://issues.apache.org/jira/browse/FLINK-1560
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.9
Reporter: Márton Balassi
Assignee: Péter Szabó

 Currently there are no tests for the consistency of the streaming example 
 programs. This might be a real show-stopper for users who encounter an issue 
 there.





[jira] [Commented] (FLINK-1560) Add ITCases for streaming examples

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375950#comment-14375950 ]

ASF GitHub Bot commented on FLINK-1560:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/519#issuecomment-85019063
  
Please update this for parallelism of 4. Besides that and the inline 
comments looks good. 


 Add ITCases for streaming examples
 --

 Key: FLINK-1560
 URL: https://issues.apache.org/jira/browse/FLINK-1560
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.9
Reporter: Márton Balassi
Assignee: Péter Szabó

 Currently there are no tests for consistency of the streaming example 
 programs. This might be a real show stopper for users who encounter an issue 
 there.





[GitHub] flink pull request: FLINK-1560 - Add ITCases for streaming example...

2015-03-23 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/519#issuecomment-85019063
  
Please update this for parallelism of 4. Besides that and the inline 
comments looks good. 




[jira] [Commented] (FLINK-1687) Streaming file source/sink API is not in sync with the batch API

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375954#comment-14375954 ]

ASF GitHub Bot commented on FLINK-1687:
---

GitHub user szape opened a pull request:

https://github.com/apache/flink/pull/521

[FLINK-1687] [streaming] Syncing streaming source API with batch source API

It is important to keep the streaming and the batch user-facing APIs 
synchronised with regard to the source and sink functions (even if the inner 
workings are different).
Most of the source functions are done here. Because this is not a 
high-priority issue, sink functions will be done later on.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink FLINK-1687

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/521.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #521


commit 5fe5a3592ea21ca2016919dbc6c67330cf41dd7b
Author: szape nemderogator...@gmail.com
Date:   2015-03-23T10:35:11Z

[FLINK-1687] [streaming] Syncing streaming source API with batch source API 
wip

commit f8294897a90c21b762d422b8fe6f673acc228d9c
Author: szape nemderogator...@gmail.com
Date:   2015-03-23T10:37:04Z

[FLINK-1687] [streaming] Source example wip




 Streaming file source/sink API is not in sync with the batch API
 

 Key: FLINK-1687
 URL: https://issues.apache.org/jira/browse/FLINK-1687
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gábor Hermann
Assignee: Péter Szabó

 Streaming environment is missing file inputs like readFile, readCsvFile and 
 also the more general createInput function, and outputs like writeAsCsv and 
 write. Streaming and batch API should be consistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1544) Extend streaming aggregation tests to include POJOs

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375959#comment-14375959 ]

ASF GitHub Bot commented on FLINK-1544:
---

Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/517#issuecomment-85020206
  
LGTM, will merge in the evening.


 Extend streaming aggregation tests to include POJOs
 ---

 Key: FLINK-1544
 URL: https://issues.apache.org/jira/browse/FLINK-1544
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gyula Fora
Assignee: Péter Szabó
  Labels: starter

 Currently the streaming aggregation tests don't test POJO aggregations, which 
 makes newly introduced bugs harder to detect.
 New tests should be added.





[GitHub] flink pull request: [FLINK-1544] [streaming] POJO types added to A...

2015-03-23 Thread mbalassi
Github user mbalassi commented on the pull request:

https://github.com/apache/flink/pull/517#issuecomment-85020206
  
LGTM, will merge in the evening.




[jira] [Commented] (FLINK-1544) Extend streaming aggregation tests to include POJOs

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375897#comment-14375897 ]

ASF GitHub Bot commented on FLINK-1544:
---

GitHub user szape opened a pull request:

https://github.com/apache/flink/pull/517

[FLINK-1544] [streaming] POJO types added to AggregationFunctionTest

Testing aggregation functions with POJO types.
The AggregationFunctionTest covered min, max, sum, minBy and maxBy 
aggregations for tuples. I added the same tests for POJOs.
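The essential difference such tests cover is aggregating over a named field rather than a tuple position. A minimal plain-Java sketch of that idea (class and field names are made up for illustration, and no Flink runtime is involved):

```java
// Sketch: POJO aggregation accesses a field by name (as in sum("value"))
// rather than by tuple position (as in sum(1)). Names are illustrative.
import java.util.Arrays;
import java.util.List;

class PojoAggregationSketch {

    public static class MyPojo {
        public int id;
        public int value;
        public MyPojo(int id, int value) { this.id = id; this.value = value; }
    }

    // running "sum" over the named field of a POJO stream
    static int sumField(List<MyPojo> input) {
        int sum = 0;
        for (MyPojo p : input) {
            sum += p.value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<MyPojo> data = Arrays.asList(
                new MyPojo(1, 2), new MyPojo(2, 3), new MyPojo(3, 4));
        System.out.println(sumField(data)); // prints "9"
    }
}
```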

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink FLINK-1544

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/517.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #517


commit 2def2d98f502487e548adf24406ecf2f2489e71e
Author: szape nemderogator...@gmail.com
Date:   2015-03-11T14:36:15Z

[FLINK-1544] [streaming] POJO types added to AggregationFunctionTest




 Extend streaming aggregation tests to include POJOs
 ---

 Key: FLINK-1544
 URL: https://issues.apache.org/jira/browse/FLINK-1544
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gyula Fora
Assignee: Péter Szabó
  Labels: starter

 Currently the streaming aggregation tests don't test POJO aggregations, which 
 makes newly introduced bugs harder to detect.
 New tests should be added.





[jira] [Commented] (FLINK-1595) Add a complex integration test for Streaming API

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375930#comment-14375930 ]

ASF GitHub Bot commented on FLINK-1595:
---

GitHub user szape opened a pull request:

https://github.com/apache/flink/pull/520

[FLINK-1595] [streaming] Complex integration test wip

Flink Streaming's complex integration test will test interactions of 
different streaming operators and settings.

The test topology will run in one environment but will be divided into 
several subgraphs, tested separately from one another.
In this way, many new tests can be added later.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink FLINK-1595

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/520.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #520


commit e24f24354add0800623189a4af70b3f04164469c
Author: szape nemderogator...@gmail.com
Date:   2015-03-23T09:48:12Z

[FLINK-1595] [streaming] Complex integration test wip




 Add a complex integration test for Streaming API
 

 Key: FLINK-1595
 URL: https://issues.apache.org/jira/browse/FLINK-1595
 Project: Flink
  Issue Type: Task
  Components: Streaming
Reporter: Gyula Fora
Assignee: Péter Szabó
  Labels: Starter

 The streaming tests currently lack a sophisticated integration test that 
 would test many api features at once. 
 This should include different merging, partitioning, grouping, aggregation 
 types, as well as windowing and connected operators.
 The results should be tested for correctness.
 A test like this would help identifying bugs that are hard to detect by 
 unit-tests.





[GitHub] flink pull request: FLINK-1560 - Add ITCases for streaming example...

2015-03-23 Thread mbalassi
Github user mbalassi commented on a diff in the pull request:

https://github.com/apache/flink/pull/519#discussion_r26939465
  
--- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/twitter/TwitterStream.java ---
@@ -176,11 +176,14 @@ private static boolean parseParameters(String[] args) {
      // parse input arguments
      fileOutput = true;
      if (args.length == 2) {
+         fileInput = true;
         textPath = args[0];
         outputPath = args[1];
+      } else if (args.length == 1) {
+         outputPath = args[0];
      } else {
         System.err.println("USAGE:\nTwitterStream <pathToPropertiesFile> <result path>");
-         return false;
+         return true;
      }
--- End diff --

Why return true here? This way this function always returns true.
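A sketch of the fix the comment is asking for: the usage branch should return false so the caller can stop execution. Names mirror the diff, but this is illustrative; the zero-argument default-data path of the real example is omitted here:

```java
// Sketch: parseParameters returns false on bad arguments so the caller
// can abort, instead of always returning true as in the reviewed diff.
class ParseParametersSketch {

    static boolean fileInput;
    static boolean fileOutput;
    static String textPath;
    static String outputPath;

    static boolean parseParameters(String[] args) {
        fileOutput = true;
        if (args.length == 2) {
            fileInput = true;
            textPath = args[0];
            outputPath = args[1];
        } else if (args.length == 1) {
            outputPath = args[0];
        } else {
            System.err.println("USAGE:\nTwitterStream <pathToPropertiesFile> <result path>");
            return false; // signal bad arguments to the caller
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(parseParameters(new String[]{"out"})); // prints "true"
        System.out.println(parseParameters(new String[]{}));      // prints "false"
    }
}
```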




[jira] [Commented] (FLINK-1560) Add ITCases for streaming examples

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/FLINK-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375945#comment-14375945 ]

ASF GitHub Bot commented on FLINK-1560:
---

Github user mbalassi commented on a diff in the pull request:

https://github.com/apache/flink/pull/519#discussion_r26939465
  
--- Diff: flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/twitter/TwitterStream.java ---
@@ -176,11 +176,14 @@ private static boolean parseParameters(String[] args) 
{
// parse input arguments
fileOutput = true;
if (args.length == 2) {
+   fileInput = true;
textPath = args[0];
outputPath = args[1];
+   } else if (args.length == 1) {
+   outputPath = args[0];
} else {
System.err.println("USAGE:\nTwitterStream <pathToPropertiesFile> <result path>");
-   return false;
+   return true;
}
--- End diff --

Why return true here? This way this function always returns true.


 Add ITCases for streaming examples
 --

 Key: FLINK-1560
 URL: https://issues.apache.org/jira/browse/FLINK-1560
 Project: Flink
  Issue Type: Test
  Components: Streaming
Affects Versions: 0.9
Reporter: Márton Balassi
Assignee: Péter Szabó

 Currently there are no tests for consistency of the streaming example 
 programs. This might be a real show stopper for users who encounter an issue 
 there.





[GitHub] flink pull request: [FLINK-1687] [streaming] Syncing streaming sou...

2015-03-23 Thread szape
GitHub user szape opened a pull request:

https://github.com/apache/flink/pull/521

[FLINK-1687] [streaming] Syncing streaming source API with batch source API

It is important to keep the streaming and batch user-facing APIs 
synchronised with regard to the source and sink functions (even if the inner 
workings are different).
Most of the source functions are done here. Because this is not a 
high-priority issue, sink functions will be done later on.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink FLINK-1687

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/521.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #521


commit 5fe5a3592ea21ca2016919dbc6c67330cf41dd7b
Author: szape nemderogator...@gmail.com
Date:   2015-03-23T10:35:11Z

[FLINK-1687] [streaming] Syncing streaming source API with batch source API 
wip

commit f8294897a90c21b762d422b8fe6f673acc228d9c
Author: szape nemderogator...@gmail.com
Date:   2015-03-23T10:37:04Z

[FLINK-1687] [streaming] Source example wip






[GitHub] flink pull request: Cleanup of low level Kafka consumer (Persisten...

2015-03-23 Thread gaborhermann
Github user gaborhermann closed the pull request at:

https://github.com/apache/flink/pull/474




[GitHub] flink pull request: Cleanup of low level Kafka consumer (Persisten...

2015-03-23 Thread gaborhermann
Github user gaborhermann commented on the pull request:

https://github.com/apache/flink/pull/474#issuecomment-85014101
  
I agree. Further work should be done elsewhere (over the master).




[jira] [Commented] (FLINK-1772) Chain of split/select does not work

2015-03-23 Thread Gyula Fora (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376069#comment-14376069
 ] 

Gyula Fora commented on FLINK-1772:
---

I don't think we should allow chaining of split. It doesn't make any practical 
sense, to be honest.

 Chain of split/select does not work
 ---

 Key: FLINK-1772
 URL: https://issues.apache.org/jira/browse/FLINK-1772
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Reporter: Gábor Hermann

 OutputSelectors should handle multiple split/select called after one another 
 like:
 dataStream.split(outputSelector1).select("one").split(outputSelector2).select("two")
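As a standalone model of the semantics under discussion (illustrative code only, not Flink's actual OutputSelector API): "split" tags each element with output names via a selector function, and "select" keeps the elements tagged with the requested name, so chaining means applying a second selector to the already-selected substream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Toy model of chained split/select semantics; not Flink's API.
public class SplitSelectModel {
    // Keeps the elements whose selector-assigned output names contain `name`.
    static List<Integer> splitSelect(List<Integer> in,
                                     Function<Integer, List<String>> selector,
                                     String name) {
        List<Integer> out = new ArrayList<>();
        for (Integer v : in) {
            if (selector.apply(v).contains(name)) {
                out.add(v);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = List.of(1, 2, 3, 4);
        // First split/select: keep even numbers.
        List<Integer> evens = splitSelect(data,
                v -> List.of(v % 2 == 0 ? "even" : "odd"), "even");
        // Chained split/select on the already-selected substream: keep values > 2.
        List<Integer> big = splitSelect(evens,
                v -> List.of(v > 2 ? "big" : "small"), "big");
        System.out.println(big); // prints [4]
    }
}
```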





[jira] [Commented] (FLINK-1633) Add getTriplets() Gelly method

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375764#comment-14375764
 ] 

ASF GitHub Bot commented on FLINK-1633:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/452#discussion_r26928666
  
--- Diff: 
flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -53,6 +54,7 @@
 import org.apache.flink.graph.utils.Tuple2ToVertexMap;
 import org.apache.flink.graph.utils.Tuple3ToEdgeMap;
 import org.apache.flink.graph.utils.VertexToTuple2Map;
+import org.apache.flink.graph.utils.Tuple2ToVertexMap;
--- End diff --

and this import is not used anywhere :-)


 Add getTriplets() Gelly method
 --

 Key: FLINK-1633
 URL: https://issues.apache.org/jira/browse/FLINK-1633
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Affects Versions: 0.9
Reporter: Vasia Kalavri
Assignee: Andra Lungu
Priority: Minor
  Labels: starter

 In some graph algorithms, it is required to access the graph edges together 
 with the vertex values of the source and target vertices. For example, 
 several graph weighting schemes compute some kind of similarity weights for 
 edges, based on the attributes of the source and target vertices. This issue 
 proposes adding a convenience Gelly method that generates a DataSet of 
 srcVertex, Edge, TrgVertex triplets from the input graph.
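As a standalone sketch of what such a getTriplets() method computes (illustrative types only, not Gelly's actual API): for every edge (src, trg), emit the edge's endpoints together with the values of its source and target vertices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the triplet construction; not Gelly's getTriplets().
public class TripletsSketch {
    // vertexValues maps vertex id -> value; each edge is {srcId, trgId}.
    static List<Object[]> triplets(Map<String, Double> vertexValues, List<String[]> edges) {
        List<Object[]> result = new ArrayList<>();
        for (String[] e : edges) {
            // Join each edge with the values of both of its endpoints:
            // {srcId, srcValue, trgId, trgValue}.
            result.add(new Object[]{e[0], vertexValues.get(e[0]),
                    e[1], vertexValues.get(e[1])});
        }
        return result;
    }
}
```

A graph weighting scheme could then compute an edge weight from `srcValue` and `trgValue` without a manual double join against the vertex set.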





[GitHub] flink pull request: [Flink-1686] Streaming Iterations Slot Sharing...

2015-03-23 Thread senorcarbone
Github user senorcarbone commented on the pull request:

https://github.com/apache/flink/pull/524#issuecomment-85223415
  
bad style habits ^^
Thanks for reminding me. I also made a minor fix to the test.




[GitHub] flink pull request: Introduced CentralActiveTrigger with GroupedAc...

2015-03-23 Thread gyfora
GitHub user gyfora opened a pull request:

https://github.com/apache/flink/pull/523

Introduced CentralActiveTrigger with GroupedActiveDiscretizer



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mbalassi/flink session

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/523.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #523


commit 06210c0c41016b793d2cce9d61760402da3195e3
Author: Gyula Fora gyf...@apache.org
Date:   2015-03-20T17:16:34Z

[FLINK-1773] [streaming] Introduced CentralActiveTrigger with 
GroupedActiveDiscretizer

thread-safety fix for GroupedActiveDiscretizer






[jira] [Created] (FLINK-1776) APIs provide invalid Semantic Properties for Operators with SelectorFunction keys

2015-03-23 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-1776:


 Summary: APIs provide invalid Semantic Properties for Operators 
with SelectorFunction keys
 Key: FLINK-1776
 URL: https://issues.apache.org/jira/browse/FLINK-1776
 Project: Flink
  Issue Type: Bug
  Components: Java API, Scala API
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: Fabian Hueske
Priority: Critical
 Fix For: 0.9


Semantic properties are defined by users and evaluated by the optimizer.
Semantic properties such as forwarded or read fields are bound to the 
input type of a function.
In case of operators with selector function keys, the user function is wrapped by 
a wrapping function that has a different input type than the original user 
function. However, the user-defined semantic properties are forwarded verbatim 
to the optimizer. 
Since the properties refer to a specific type, which is changed by the wrapping 
function, and the semantic properties are not adapted, the optimizer uses wrong 
properties and might produce invalid plans.
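As a toy illustration of the index mismatch described above (not Flink's implementation): if a wrapper prepends one key field to the function's input, every forwarded-field index the user declared against the original input points at the wrong position in the wrapped input unless it is shifted.

```java
import java.util.Arrays;

// Hypothetical sketch: adapting user-declared field indices to a wrapped
// input that has one extra field prepended by a key-selector wrapper.
public class ForwardedFieldsShift {
    static int[] shiftForOnePrependedField(int[] declared) {
        int[] shifted = new int[declared.length];
        for (int i = 0; i < declared.length; i++) {
            // Field 0 of the original input is field 1 of the wrapped input.
            shifted[i] = declared[i] + 1;
        }
        return shifted;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(shiftForOnePrependedField(new int[]{0, 2})));
        // prints [1, 3]
    }
}
```

Without such an adaptation step, properties declared as "field 0 is forwarded" would be read by the optimizer as referring to the wrapper's key field.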





[jira] [Commented] (FLINK-1767) StreamExecutionEnvironment's execute should return JobExecutionResult

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376184#comment-14376184
 ] 

ASF GitHub Bot commented on FLINK-1767:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/516


 StreamExecutionEnvironment's execute should return JobExecutionResult
 -

 Key: FLINK-1767
 URL: https://issues.apache.org/jira/browse/FLINK-1767
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Márton Balassi
Assignee: Gabor Gevay

 Although the streaming API does not make use of the accumulators it is still 
 a nice handle for the execution time and might wrap other features in the 
 future.





[jira] [Created] (FLINK-1773) Add interface for grouped policies to broadcast information to other groups

2015-03-23 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-1773:
-

 Summary: Add interface for grouped policies to broadcast 
information to other groups
 Key: FLINK-1773
 URL: https://issues.apache.org/jira/browse/FLINK-1773
 Project: Flink
  Issue Type: New Feature
Reporter: Gyula Fora
Priority: Minor


The current windowing does not allow grouped policies to broadcast information 
to each other. Adding this would allow closing a window before the next 
element for the same group arrives. This enables nice use cases such as 
detecting user sessions in data streams.





[GitHub] flink pull request: [FLINK-1767] [streaming] Make StreamExecutionE...

2015-03-23 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/516




[jira] [Commented] (FLINK-1687) Streaming file source/sink API is not in sync with the batch API

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376427#comment-14376427
 ] 

ASF GitHub Bot commented on FLINK-1687:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/521#issuecomment-85154634
  
Ah, it would also be good to rebase this PR to the current master to get 
some feedback from Travis.


 Streaming file source/sink API is not in sync with the batch API
 

 Key: FLINK-1687
 URL: https://issues.apache.org/jira/browse/FLINK-1687
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gábor Hermann
Assignee: Péter Szabó

 The streaming environment is missing file inputs like readFile and readCsvFile, 
 the more general createInput function, and outputs like writeAsCsv and 
 write. The streaming and batch APIs should be consistent.





[GitHub] flink pull request: [FLINK-1687] [streaming] Syncing streaming sou...

2015-03-23 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/521#issuecomment-85154634
  
Ah, it would also be good to rebase this PR to the current master to get 
some feedback from Travis.




[GitHub] flink pull request: [FLINK-1770]Rename the variable 'contentAdress...

2015-03-23 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/515#issuecomment-85169638
  
LGTM




[GitHub] flink pull request: [FLINK-1726][gelly] Added Community Detection ...

2015-03-23 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/505#issuecomment-85173349
  
I didn't see a bigger issue in the code, but @vasia is the authority on 
this ;)
It would be nice to add at least another bullet to this list: 
http://ci.apache.org/projects/flink/flink-docs-master/gelly_guide.html#library-methods
 :)




[GitHub] flink pull request: [FLINK-1687] [streaming] Syncing streaming sou...

2015-03-23 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/521#discussion_r26968237
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/ReadFileExample.java
 ---
@@ -0,0 +1,37 @@
+package org.apache.flink.streaming.examples;
+
+import org.apache.flink.api.java.io.CsvInputFormat;
+import org.apache.flink.api.java.io.TextInputFormat;
+import org.apache.flink.api.java.record.io.FileInputFormat;
+import org.apache.flink.api.java.record.io.FixedLengthInputFormat;
+import org.apache.flink.api.java.tuple.Tuple1;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.typeutils.TypeExtractor;
+import org.apache.flink.core.fs.Path;
+import 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import 
org.apache.flink.streaming.api.function.source.FileMonitoringFunction;
+import org.apache.flink.util.NumberSequenceIterator;
+
+import java.util.ArrayList;
+import java.util.Collection;
+
+public class ReadFileExample {
+
+   private static String textPath = "/home/szape/result.txt";
--- End diff --

Mh ;)




[GitHub] flink pull request: [FLINK-1687] [streaming] Syncing streaming sou...

2015-03-23 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/521#issuecomment-85154551
  
Would it make sense to add an API completeness checker between the streaming 
and batch Java APIs (similar to the Java/Scala API completeness checker)? The 
whitelist is probably a bit bigger in that case, but somebody changing 
something somewhere would at least notice that the APIs are out of sync.

Does this change actually break the current (0.8.x) methods, or does it 
just add new methods? (It seems you've relocated some methods.)
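A minimal sketch of such a completeness check (hypothetical code, not Flink's actual checker): collect the public method names of a candidate class and report which reference-class methods it is missing, modulo a whitelist. A real checker would compare full signatures, not just names.

```java
import java.lang.reflect.Method;
import java.util.HashSet;
import java.util.Set;

// Hypothetical reflection-based completeness check between two APIs.
public class CompletenessCheck {
    // Returns the names of public methods of `reference` that `candidate`
    // lacks and that are not explicitly whitelisted.
    static Set<String> missingMethods(Class<?> reference, Class<?> candidate,
                                      Set<String> whitelist) {
        Set<String> have = new HashSet<>();
        for (Method m : candidate.getMethods()) {
            have.add(m.getName());
        }
        Set<String> missing = new HashSet<>();
        for (Method m : reference.getMethods()) {
            if (!have.contains(m.getName()) && !whitelist.contains(m.getName())) {
                missing.add(m.getName());
            }
        }
        return missing;
    }
}
```

Run against two unrelated API classes, it flags the divergence; wired into a unit test, it would make anyone who changes one API without the other notice the break.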




[jira] [Commented] (FLINK-1687) Streaming file source/sink API is not in sync with the batch API

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376425#comment-14376425
 ] 

ASF GitHub Bot commented on FLINK-1687:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/521#issuecomment-85154551
  
Would it make sense to add an API completeness checker between the streaming 
and batch Java APIs (similar to the Java/Scala API completeness checker)? The 
whitelist is probably a bit bigger in that case, but somebody changing 
something somewhere would at least notice that the APIs are out of sync.

Does this change actually break the current (0.8.x) methods, or does it 
just add new methods? (It seems you've relocated some methods.)


 Streaming file source/sink API is not in sync with the batch API
 

 Key: FLINK-1687
 URL: https://issues.apache.org/jira/browse/FLINK-1687
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gábor Hermann
Assignee: Péter Szabó

 The streaming environment is missing file inputs like readFile and readCsvFile, 
 the more general createInput function, and outputs like writeAsCsv and 
 write. The streaming and batch APIs should be consistent.





[jira] [Commented] (FLINK-1595) Add a complex integration test for Streaming API

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376457#comment-14376457
 ] 

ASF GitHub Bot commented on FLINK-1595:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/520#issuecomment-85159538
  
+1 for more streaming tests ;) (... and apparently, they seem to help 
finding bugs ;) )

Btw, you don't have to use Marton's GitHub repository to open pull 
requests. You can also open them from your GH fork (you can also enable 
travis-ci for your own account, then you won't have to share with Marton's 
builds)


 Add a complex integration test for Streaming API
 

 Key: FLINK-1595
 URL: https://issues.apache.org/jira/browse/FLINK-1595
 Project: Flink
  Issue Type: Task
  Components: Streaming
Reporter: Gyula Fora
Assignee: Péter Szabó
  Labels: Starter

 The streaming tests currently lack a sophisticated integration test that 
 would test many API features at once. 
 This should include different merging, partitioning, grouping, and aggregation 
 types, as well as windowing and connected operators.
 The results should be tested for correctness.
 A test like this would help identify bugs that are hard to detect with 
 unit tests.





[jira] [Commented] (FLINK-1650) Suppress Akka's Netty Shutdown Errors through the log config

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376463#comment-14376463
 ] 

ASF GitHub Bot commented on FLINK-1650:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/518#issuecomment-85160071
  
The build passed in my travis: 
https://travis-ci.org/rmetzger/flink/builds/55487319


 Suppress Akka's Netty Shutdown Errors through the log config
 

 Key: FLINK-1650
 URL: https://issues.apache.org/jira/browse/FLINK-1650
 Project: Flink
  Issue Type: Bug
  Components: other
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.9


 I suggest setting the logging for 
 `org.jboss.netty.channel.DefaultChannelPipeline` to ERROR, in order to get 
 rid of the misleading stack trace caused by an Akka/Netty hiccup on shutdown.
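With a log4j properties configuration, such a logger level could be set along these lines (a hypothetical fragment; the actual file name and location depend on the deployment):

```properties
# Hypothetical log4j.properties fragment: raise this logger to ERROR to
# suppress the misleading Akka/Netty shutdown stack trace.
log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR
```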





[GitHub] flink pull request: [FLINK-1472] Fixed Web frontend config overvie...

2015-03-23 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/439#issuecomment-85170765
  
According to travis, the code doesn't seem to build.




[GitHub] flink pull request: Make Expression API available to Java, Rename ...

2015-03-23 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/503#discussion_r26974832
  
--- Diff: docs/linq.md ---
@@ -23,58 +23,91 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-**Language-Integrated Queries are an experimental feature and can 
currently only be used with
-the Scala API**
+**Language-Integrated Queries are an experimental feature**
 
-Flink provides an API that allows specifying operations using SQL-like 
expressions.
-This Expression API can be enabled by importing
-`org.apache.flink.api.scala.expressions._`.  This enables implicit 
conversions that allow
-converting a `DataSet` or `DataStream` to an `ExpressionOperation` on 
which relational queries
-can be specified. This example shows how a `DataSet` can be converted, how 
expression operations
-can be specified and how an expression operation can be converted back to 
a `DataSet`:
+Flink provides an API that allows specifying operations using SQL-like 
expressions. Instead of 
+manipulating `DataSet` or `DataStream` you work with `Table` on which 
relational operations can
+be performed. 
+
+The following dependency must be added to your project when using the 
Table API:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-table</artifactId>
+  <version>{{site.FLINK_VERSION_SHORT }}</version>
+</dependency>
+{% endhighlight %}
+
+## Scala Table API
+ 
+The Table API can be enabled by importing 
`org.apache.flink.api.scala.table._`.  This enables
+implicit conversions that allow
+converting a DataSet or DataStream to a Table. This example shows how a 
DataSet can
+be converted, how relationa queries can be specified and how a Table can be
+converted back to a DataSet`:
--- End diff --

the tick after DataSet is probably wrong.




[GitHub] flink pull request: Make Expression API available to Java, Rename ...

2015-03-23 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/503#discussion_r26974798
  
--- Diff: docs/linq.md ---
@@ -23,58 +23,91 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-**Language-Integrated Queries are an experimental feature and can 
currently only be used with
-the Scala API**
+**Language-Integrated Queries are an experimental feature**
 
-Flink provides an API that allows specifying operations using SQL-like 
expressions.
-This Expression API can be enabled by importing
-`org.apache.flink.api.scala.expressions._`.  This enables implicit 
conversions that allow
-converting a `DataSet` or `DataStream` to an `ExpressionOperation` on 
which relational queries
-can be specified. This example shows how a `DataSet` can be converted, how 
expression operations
-can be specified and how an expression operation can be converted back to 
a `DataSet`:
+Flink provides an API that allows specifying operations using SQL-like 
expressions. Instead of 
+manipulating `DataSet` or `DataStream` you work with `Table` on which 
relational operations can
+be performed. 
+
+The following dependency must be added to your project when using the 
Table API:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-table</artifactId>
+  <version>{{site.FLINK_VERSION_SHORT }}</version>
+</dependency>
+{% endhighlight %}
+
+## Scala Table API
+ 
+The Table API can be enabled by importing 
`org.apache.flink.api.scala.table._`.  This enables
+implicit conversions that allow
+converting a DataSet or DataStream to a Table. This example shows how a 
DataSet can
+be converted, how relationa queries can be specified and how a Table can be
--- End diff --

missing `l`




[GitHub] flink pull request: [FLINK-1595] [streaming] Complex integration t...

2015-03-23 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/520#discussion_r26969917
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/complex/ComplexIntegrationTest.java
 ---
@@ -0,0 +1,903 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.complex;
+
+import org.apache.flink.api.common.functions.FilterFunction;
+import org.apache.flink.api.common.functions.FlatMapFunction;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.functions.RichFilterFunction;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.api.java.tuple.Tuple5;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.collector.selector.OutputSelector;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.IterativeDataStream;
+import org.apache.flink.streaming.api.datastream.SplitDataStream;
+import 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.function.WindowMapFunction;
+import org.apache.flink.streaming.api.function.co.CoMapFunction;
+import org.apache.flink.streaming.api.function.sink.SinkFunction;
+import org.apache.flink.streaming.api.function.source.SourceFunction;
+import 
org.apache.flink.streaming.api.windowing.deltafunction.DeltaFunction;
+import org.apache.flink.streaming.api.windowing.helper.Count;
+import org.apache.flink.streaming.api.windowing.helper.Delta;
+import org.apache.flink.streaming.api.windowing.helper.Time;
+import org.apache.flink.streaming.api.windowing.helper.Timestamp;
+import org.apache.flink.streaming.util.RectangleClass;
+import org.apache.flink.streaming.util.TestStreamEnvironment;
+import org.apache.flink.util.Collector;
+import org.junit.Test;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.Assert.assertEquals;
+
+public class ComplexIntegrationTest implements Serializable {
+   private static final long serialVersionUID = 1L;
+   private static final long MEMORYSIZE = 32;
+
+   private static Map<String, List<String>> results = new HashMap<String, List<String>>();
+
+   @SuppressWarnings("unchecked")
+   public static List<Tuple5<Integer, String, Character, Double, Boolean>> input = Arrays.asList(
+   new Tuple5<Integer, String, Character, Double, Boolean>(1, "apple", 'j', 0.1, false),
+   new Tuple5<Integer, String, Character, Double, Boolean>(1, "peach", 'b', 0.8, true),
+   new Tuple5<Integer, String, Character, Double, Boolean>(1, "orange", 'c', 0.7, false),
+   new Tuple5<Integer, String, Character, Double, Boolean>(2, "apple", 'd', 0.5, false),
+   new Tuple5<Integer, String, Character, Double, Boolean>(2, "apple", 'e', 0.6, false),
+   new Tuple5<Integer, String, Character, Double, Boolean>(3, "peach", 'a', 0.2, true),
+   new Tuple5<Integer, String, Character, Double, Boolean>(6, "peanut", 'b', 0.1, true),
+   new Tuple5<Integer, String, Character, Double, Boolean>(7, "banana", 'c', 0.4, false),
+   new Tuple5<Integer, String, Character, Double, Boolean>(8, "peanut", 'd', 0.2, false),
+   new Tuple5<Integer, String, Character, Double, Boolean>(10, "cherry", 'e', 0.1, false),
+   new Tuple5<Integer, String, Character, Double, Boolean>(10, "plum", 'a', 0.5, true),
+   new Tuple5<Integer, String, Character, Double, Boolean>(11, "strawberry", 'b', 0.3, false),
+   new Tuple5<Integer, String, Character, 

[jira] [Commented] (FLINK-1595) Add a complex integration test for Streaming API

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376441#comment-14376441
 ] 

ASF GitHub Bot commented on FLINK-1595:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/520#discussion_r26969917
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/complex/ComplexIntegrationTest.java
 ---
@@ -0,0 +1,903 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.complex;
+
+import org.apache.flink.api.common.functions.FilterFunction;
+import org.apache.flink.api.common.functions.FlatMapFunction;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.functions.RichFilterFunction;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.api.java.tuple.Tuple5;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.collector.selector.OutputSelector;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.IterativeDataStream;
+import org.apache.flink.streaming.api.datastream.SplitDataStream;
+import 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.function.WindowMapFunction;
+import org.apache.flink.streaming.api.function.co.CoMapFunction;
+import org.apache.flink.streaming.api.function.sink.SinkFunction;
+import org.apache.flink.streaming.api.function.source.SourceFunction;
+import 
org.apache.flink.streaming.api.windowing.deltafunction.DeltaFunction;
+import org.apache.flink.streaming.api.windowing.helper.Count;
+import org.apache.flink.streaming.api.windowing.helper.Delta;
+import org.apache.flink.streaming.api.windowing.helper.Time;
+import org.apache.flink.streaming.api.windowing.helper.Timestamp;
+import org.apache.flink.streaming.util.RectangleClass;
+import org.apache.flink.streaming.util.TestStreamEnvironment;
+import org.apache.flink.util.Collector;
+import org.junit.Test;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.Assert.assertEquals;
+
+public class ComplexIntegrationTest implements Serializable {
+   private static final long serialVersionUID = 1L;
+   private static final long MEMORYSIZE = 32;
+
+   private static Map<String, List<String>> results = new HashMap<String, List<String>>();
+
+   @SuppressWarnings("unchecked")
+   public static List<Tuple5<Integer, String, Character, Double, Boolean>> input = Arrays.asList(
+           new Tuple5<Integer, String, Character, Double, Boolean>(1, "apple", 'j', 0.1, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(1, "peach", 'b', 0.8, true),
+           new Tuple5<Integer, String, Character, Double, Boolean>(1, "orange", 'c', 0.7, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(2, "apple", 'd', 0.5, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(2, "apple", 'e', 0.6, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(3, "peach", 'a', 0.2, true),
+           new Tuple5<Integer, String, Character, Double, Boolean>(6, "peanut", 'b', 0.1, true),
+           new Tuple5<Integer, String, Character, Double, Boolean>(7, "banana", 'c', 0.4, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(8, "peanut", 'd', 0.2, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(10, "cherry", 'e', 0.1, false),
+           new 

[GitHub] flink pull request: [FLINK-1595] [streaming] Complex integration t...

2015-03-23 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/520#issuecomment-85159538
  
+1 for more streaming tests ;) (... and apparently, they seem to help 
finding bugs ;) )

Btw, you don't have to use marton's github repository to open pull 
requests. You can also open them from your GH fork (you can also enable 
travis-ci for your own account, then you won't have to share with Marton's 
builds)




[jira] [Commented] (FLINK-1726) Add Community Detection Library and Example

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376547#comment-14376547
 ] 

ASF GitHub Bot commented on FLINK-1726:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/505#issuecomment-85173349
  
I didn't see a bigger issue in the code, but @vasia is the authority on 
this ;)
It would be nice to add at least another bullet to this list: 
http://ci.apache.org/projects/flink/flink-docs-master/gelly_guide.html#library-methods
 :)


 Add Community Detection Library and Example
 ---

 Key: FLINK-1726
 URL: https://issues.apache.org/jira/browse/FLINK-1726
 Project: Flink
  Issue Type: Task
  Components: Gelly
Affects Versions: 0.9
Reporter: Andra Lungu
Assignee: Andra Lungu

 Community detection paper: http://arxiv.org/pdf/0808.2633.pdf





[GitHub] flink pull request: Make Expression API available to Java, Rename ...

2015-03-23 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/503#discussion_r26975057
  
--- Diff: docs/linq.md ---
@@ -23,58 +23,91 @@ under the License.
 * This will be replaced by the TOC
 {:toc}
 
-**Language-Integrated Queries are an experimental feature and can 
currently only be used with
-the Scala API**
+**Language-Integrated Queries are an experimental feature**
 
-Flink provides an API that allows specifying operations using SQL-like 
expressions.
-This Expression API can be enabled by importing
-`org.apache.flink.api.scala.expressions._`.  This enables implicit 
conversions that allow
-converting a `DataSet` or `DataStream` to an `ExpressionOperation` on 
which relational queries
-can be specified. This example shows how a `DataSet` can be converted, how 
expression operations
-can be specified and how an expression operation can be converted back to 
a `DataSet`:
+Flink provides an API that allows specifying operations using SQL-like 
expressions. Instead of 
+manipulating `DataSet` or `DataStream` you work with `Table` on which 
relational operations can
+be performed. 
+
+The following dependency must be added to your project when using the 
Table API:
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-table</artifactId>
+  <version>{{site.FLINK_VERSION_SHORT }}</version>
+</dependency>
+{% endhighlight %}
+
+## Scala Table API
+ 
+The Table API can be enabled by importing 
`org.apache.flink.api.scala.table._`.  This enables
+implicit conversions that allow
+converting a DataSet or DataStream to a Table. This example shows how a DataSet can
+be converted, how relational queries can be specified and how a Table can be
+converted back to a `DataSet`:
 
 {% highlight scala %}
 import org.apache.flink.api.scala._
-import org.apache.flink.api.scala.expressions._ 
+import org.apache.flink.api.scala.table._ 
 
 case class WC(word: String, count: Int)
val input = env.fromElements(WC("hello", 1), WC("hello", 1), WC("ciao", 1))
-val expr = input.toExpression
-val result = expr.groupBy('word).select('word, 'count.sum).as[WC]
+val expr = input.toTable
+val result = expr.groupBy('word).select('word, 'count.sum).toSet[WC]
 {% endhighlight %}
 
 The expression DSL uses Scala symbols to refer to field names and we use 
code generation to
 transform expressions to efficient runtime code. Please note that the 
conversion to and from
-expression operations only works when using Scala case classes or Flink 
POJOs. Please check out
+Tables only works when using Scala case classes or Flink POJOs. Please 
check out
 the [programming guide](programming_guide.html) to learn the requirements 
for a class to be 
 considered a POJO.
  
 This is another example that shows how you
-can join to operations:
+can join two Tables:
 
 {% highlight scala %}
 case class MyResult(a: String, b: Int)
 
 val input1 = env.fromElements(...).as('a, 'b)
 val input2 = env.fromElements(...).as('c, 'd)
-val joined = input1.join(input2).where('b == 'a && 'd < 42).select('a, 'd).as[MyResult]
+val joined = input1.join(input2).where("b = a && d < 42").select("a, d").as[MyResult]
 {% endhighlight %}
 
-Notice, how a `DataSet` can be converted to an expression operation by 
using `as` and specifying new
-names for the fields. This can also be used to disambiguate fields before 
a join operation.
+Notice, how a DataSet can be converted to a Table by using `as` and 
specifying new
+names for the fields. This can also be used to disambiguate fields before 
a join operation. Also,
+in this example we see that you can also use Strings to specify relational 
expressions.
 
-The Expression API can be used with the Streaming API, since we also have 
implicit conversions to
-and from `DataStream`.
+Please refer to the Scaladoc (and Javadoc) for a full list of supported 
operations and a
+description of the expression syntax. 
 
-The following dependency must be added to your project when using the 
Expression API:
+## Java Table API
 
-{% highlight xml %}
-<dependency>
-  <groupId>org.apache.flink</groupId>
-  <artifactId>flink-expressions</artifactId>
-  <version>{{site.FLINK_VERSION_SHORT }}</version>
-</dependency>
+When using Java, Tables can be converted to and from DataSet and 
DataStream using class
--- End diff --

using the class


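The query under review, `groupBy('word).select('word, 'count.sum)`, is a plain group-and-sum over `WC(word, count)` records. As a hedged, framework-free sketch of what it computes (plain Java collections, hypothetical class names — not Flink's Table API):

```java
import java.util.*;
import java.util.stream.*;

public class WordCountSketch {

    // Stand-in for the WC(word, count) case class from the documentation diff.
    static final class WC {
        final String word;
        final int count;
        WC(String word, int count) { this.word = word; this.count = count; }
    }

    // Framework-free equivalent of groupBy('word).select('word, 'count.sum):
    // group records by word and sum their counts.
    static Map<String, Integer> sumByWord(List<WC> input) {
        return input.stream()
                .collect(Collectors.groupingBy(wc -> wc.word,
                        Collectors.summingInt(wc -> wc.count)));
    }

    public static void main(String[] args) {
        List<WC> input = Arrays.asList(new WC("hello", 1), new WC("hello", 1), new WC("ciao", 1));
        // Two entries result: hello -> 2, ciao -> 1
        System.out.println(sumByWord(input));
    }
}
```

The Table/Expression API performs the same aggregation, but instead of eager collection traversal it uses code generation to turn the expression into efficient runtime code, as the documentation diff notes.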

[jira] [Commented] (FLINK-1687) Streaming file source/sink API is not in sync with the batch API

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376413#comment-14376413
 ] 

ASF GitHub Bot commented on FLINK-1687:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/521#discussion_r26968237
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/ReadFileExample.java
 ---
@@ -0,0 +1,37 @@
+package org.apache.flink.streaming.examples;
+
+import org.apache.flink.api.java.io.CsvInputFormat;
+import org.apache.flink.api.java.io.TextInputFormat;
+import org.apache.flink.api.java.record.io.FileInputFormat;
+import org.apache.flink.api.java.record.io.FixedLengthInputFormat;
+import org.apache.flink.api.java.tuple.Tuple1;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.typeutils.TypeExtractor;
+import org.apache.flink.core.fs.Path;
+import 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import 
org.apache.flink.streaming.api.function.source.FileMonitoringFunction;
+import org.apache.flink.util.NumberSequenceIterator;
+
+import java.util.ArrayList;
+import java.util.Collection;
+
+public class ReadFileExample {
+
+   private static String textPath = "/home/szape/result.txt";
--- End diff --

Mh ;)


 Streaming file source/sink API is not in sync with the batch API
 

 Key: FLINK-1687
 URL: https://issues.apache.org/jira/browse/FLINK-1687
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Gábor Hermann
Assignee: Péter Szabó

 Streaming environment is missing file inputs like readFile, readCsvFile and 
 also the more general createInput function, and outputs like writeAsCsv and 
 write. Streaming and batch API should be consistent.





[GitHub] flink pull request: [FLINK-1595] [streaming] Complex integration t...

2015-03-23 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/520#discussion_r26969789
  
--- Diff: 
flink-staging/flink-streaming/flink-streaming-core/src/test/java/org/apache/flink/streaming/api/complex/ComplexIntegrationTest.java
 ---
@@ -0,0 +1,903 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.api.complex;
+
+import org.apache.flink.api.common.functions.FilterFunction;
+import org.apache.flink.api.common.functions.FlatMapFunction;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.functions.RichFilterFunction;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.api.java.tuple.Tuple4;
+import org.apache.flink.api.java.tuple.Tuple5;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.collector.selector.OutputSelector;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.datastream.IterativeDataStream;
+import org.apache.flink.streaming.api.datastream.SplitDataStream;
+import 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.streaming.api.function.WindowMapFunction;
+import org.apache.flink.streaming.api.function.co.CoMapFunction;
+import org.apache.flink.streaming.api.function.sink.SinkFunction;
+import org.apache.flink.streaming.api.function.source.SourceFunction;
+import 
org.apache.flink.streaming.api.windowing.deltafunction.DeltaFunction;
+import org.apache.flink.streaming.api.windowing.helper.Count;
+import org.apache.flink.streaming.api.windowing.helper.Delta;
+import org.apache.flink.streaming.api.windowing.helper.Time;
+import org.apache.flink.streaming.api.windowing.helper.Timestamp;
+import org.apache.flink.streaming.util.RectangleClass;
+import org.apache.flink.streaming.util.TestStreamEnvironment;
+import org.apache.flink.util.Collector;
+import org.junit.Test;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.Assert.assertEquals;
+
+public class ComplexIntegrationTest implements Serializable {
+   private static final long serialVersionUID = 1L;
+   private static final long MEMORYSIZE = 32;
+
+   private static Map<String, List<String>> results = new HashMap<String, List<String>>();
+
+   @SuppressWarnings("unchecked")
+   public static List<Tuple5<Integer, String, Character, Double, Boolean>> input = Arrays.asList(
+           new Tuple5<Integer, String, Character, Double, Boolean>(1, "apple", 'j', 0.1, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(1, "peach", 'b', 0.8, true),
+           new Tuple5<Integer, String, Character, Double, Boolean>(1, "orange", 'c', 0.7, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(2, "apple", 'd', 0.5, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(2, "apple", 'e', 0.6, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(3, "peach", 'a', 0.2, true),
+           new Tuple5<Integer, String, Character, Double, Boolean>(6, "peanut", 'b', 0.1, true),
+           new Tuple5<Integer, String, Character, Double, Boolean>(7, "banana", 'c', 0.4, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(8, "peanut", 'd', 0.2, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(10, "cherry", 'e', 0.1, false),
+           new Tuple5<Integer, String, Character, Double, Boolean>(10, "plum", 'a', 0.5, true),
+           new Tuple5<Integer, String, Character, Double, Boolean>(11, "strawberry", 'b', 0.3, false),
+           new Tuple5<Integer, String, Character, 

[jira] [Commented] (FLINK-1595) Add a complex integration test for Streaming API

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376437#comment-14376437
 ] 

ASF GitHub Bot commented on FLINK-1595:
---

Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/520#discussion_r26969789
  

[GitHub] flink pull request: [FLINK-1650] Configure Netty (akka) to use Slf...

2015-03-23 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/518#issuecomment-85160071
  
The build passed in my travis: 
https://travis-ci.org/rmetzger/flink/builds/55487319




[jira] [Commented] (FLINK-1770) Rename the variable 'contentAdressable' to 'contentAddressable'

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376525#comment-14376525
 ] 

ASF GitHub Bot commented on FLINK-1770:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/515#issuecomment-85169638
  
LGTM


 Rename the variable 'contentAdressable' to 'contentAddressable'
 ---

 Key: FLINK-1770
 URL: https://issues.apache.org/jira/browse/FLINK-1770
 Project: Flink
  Issue Type: Bug
Affects Versions: master
Reporter: Sibao Hong
Priority: Minor
 Fix For: master


 Rename the variable 'contentAdressable' to 'contentAddressable' to make the 
 code easier to understand.





[GitHub] flink pull request: Adds akka interal docs and configuration descr...

2015-03-23 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/512#issuecomment-85169924
  
+1 for updating the configuration guide with the Akka variables.




[jira] [Commented] (FLINK-1472) Web frontend config overview shows wrong value

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376533#comment-14376533
 ] 

ASF GitHub Bot commented on FLINK-1472:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/439#issuecomment-85170765
  
According to travis, the code doesn't seem to build.


 Web frontend config overview shows wrong value
 --

 Key: FLINK-1472
 URL: https://issues.apache.org/jira/browse/FLINK-1472
 Project: Flink
  Issue Type: Bug
  Components: Webfrontend
Affects Versions: master
Reporter: Ufuk Celebi
Assignee: Mingliang Qi
Priority: Minor

 The web frontend shows configuration values even if they could not be 
 correctly parsed.
 For example, I've configured the number of buffers as 123.000, which cannot 
 be parsed as an Integer by GlobalConfiguration, so the default value is used. 
 Still, the web frontend shows the unused value 123.000.
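The behavior described here boils down to a parse-with-fallback pattern: the raw string is kept (and rendered by the frontend) even when parsing fails and a default silently takes effect. A minimal sketch with hypothetical names — not the actual GlobalConfiguration code:

```java
public class ConfigFallbackSketch {

    // Parse-with-default: on a malformed value, the default is used,
    // but the raw string still exists and can be displayed elsewhere.
    static int getInteger(String raw, int defaultValue) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        String raw = "123.000";                // what the user configured and the frontend shows
        int effective = getInteger(raw, 2048); // what the runtime actually uses
        System.out.println(raw + " -> " + effective); // prints "123.000 -> 2048"
    }
}
```

This is exactly the divergence the ticket describes: a frontend that echoes the raw configuration string will show a value that is not the one in effect.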





[GitHub] flink pull request: Make Expression API available to Java, Rename ...

2015-03-23 Thread rmetzger
Github user rmetzger commented on a diff in the pull request:

https://github.com/apache/flink/pull/503#discussion_r26975267
  
--- Diff: 
flink-staging/flink-table/src/main/java/org/apache/flink/api/java/table/package-info.java
 ---
@@ -0,0 +1,66 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * <strong>Table API (Java)</strong><br>
+ *
+ * {@link org.apache.flink.api.java.table.TableUtil} can be used to create 
a
+ * {@link org.apache.flink.api.table.Table} from a {@link 
org.apache.flink.api.java.DataSet}
+ * or {@link org.apache.flink.streaming.api.datastream.DataStream}.
+ *
+ * <p>
+ * This can be used to perform SQL-like queries on data. Please have
+ * a look at {@link org.apache.flink.api.table.Table} to see which 
operations are supported and
+ * how query strings are written.
+ *
+ * <p>
+ * Example:
+ *
+ * <code>
+ * ExecutionEnvironment env = ExecutionEnvironment.createCollectionsEnvironment();
+ *
+ * DataSet<WC> input = env.fromElements(
+ *   new WC("Hello", 1),
+ *   new WC("Ciao", 1),
+ *   new WC("Hello", 1));
+ *
+ * Table table = TableUtil.from(input);
+ *
+ * Table filtered = table
+ *     .groupBy("word")
+ *     .select("word.count as count, word")
+ *     .filter("count = 2");
+ *
+ * DataSet<WC> result = TableUtil.toSet(filtered, WC.class);
+ *
+ * result.print();
+ * env.execute();
+ * </code>
+ *
+ * <p>
+ * A {@link org.apache.flink.api.table.Table} can be converted back to the 
underlying API
+ * representation using t:
--- End diff --

using t ?
What is t?




[GitHub] flink pull request: Cleanup of low level Kafka consumer (Persisten...

2015-03-23 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/474#issuecomment-84879071
  
I think we can close this PR. Everything from here has been merged to 
master.
Maybe we removed some code again from this PR because it was buggy or 
untested.




[GitHub] flink pull request: [FLINK-1615] [java api] SimpleTweetInputFormat

2015-03-23 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/442#issuecomment-84912485
  
I triggered another travis build: 
https://travis-ci.org/rmetzger/flink/builds/55459787




[jira] [Created] (FLINK-1769) Maven deploy is broken (build artifacts are cleaned in docs-and-sources profile)

2015-03-23 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-1769:
-

 Summary: Maven deploy is broken (build artifacts are cleaned in 
docs-and-sources profile)
 Key: FLINK-1769
 URL: https://issues.apache.org/jira/browse/FLINK-1769
 Project: Flink
  Issue Type: Bug
Reporter: Robert Metzger
Priority: Critical


The issue has been introduced by FLINK-1720.

This change broke the deployment to maven snapshots / central.

{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
on project flink-shaded-include-yarn: Failed to install artifact 
org.apache.flink:flink-shaded-include-yarn:pom:0.9-SNAPSHOT: 
/home/robert/incubator-flink/flink-shaded-hadoop/flink-shaded-include-yarn/target/dependency-reduced-pom.xml
 (No such file or directory) - [Help 1]
{code}

The issue is that Maven now executes {{clean}} after {{shade}}, so 
{{install}} cannot store the result of {{shade}} anymore (because it has 
already been deleted).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-1656) Filtered Semantic Properties for Operators with Iterators

2015-03-23 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375665#comment-14375665
 ] 

Fabian Hueske commented on FLINK-1656:
--

I am going to address this issue on two levels:

1. The optimizer will filter out all forward field information on non-key 
fields for operators with iterators.
 - non-key fields of GroupReduce
 - non-key fields of CoGroup
 - all fields of MapPartition

2. The APIs will log a warning if a user adds forward field information for 
non-key fields of GroupReduce and CoGroup and throw an exception if a user adds 
forward field information for MapPartition and Filter.


 Filtered Semantic Properties for Operators with Iterators
 -

 Key: FLINK-1656
 URL: https://issues.apache.org/jira/browse/FLINK-1656
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: Fabian Hueske
Priority: Critical

 The documentation of ForwardedFields is incomplete for operators with 
 iterator inputs (GroupReduce, CoGroup). 
 This should be fixed ASAP, because it can lead to incorrect program execution.
 The conditions for forwarded fields on operators with iterator input are:
 1) forwarded fields must be emitted in the order in which they are received 
 through the iterator
 2) all forwarded fields of a record must stick together, i.e., if your 
 function builds records from field 0 of the 1st, 3rd, 5th, ... and field 1 of 
 the 2nd, 4th, ... records coming through the iterator, these are not valid 
 forwarded fields.
 3) it is OK to completely filter out records coming through the iterator.
 The reason for these conditions is that the optimizer uses forwarded fields 
 to reason about physical data properties such as order and grouping. Mixing 
 up the order of records or emitting records which are composed from different 
 input records, might destroy a (secondary) order or grouping.
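Conditions 1) and 3) above can be illustrated with a hedged, framework-free sketch (plain Java with hypothetical helpers — not Flink operator code): filtering records keeps a forwarded field's relative order intact, while reordering emissions silently destroys any order or grouping the optimizer assumed.

```java
import java.util.*;

public class ForwardedFieldsSketch {

    // Valid w.r.t. forwarded fields: records leave in arrival order;
    // dropping some of them is fine (condition 3).
    static List<Integer> validReduce(List<Integer> group) {
        List<Integer> out = new ArrayList<>();
        for (int v : group) {
            if (v % 2 == 0) out.add(v); // filter, but never reorder
        }
        return out;
    }

    // Invalid w.r.t. forwarded fields: emits values in reversed order,
    // so a (secondary) sort on that field would be silently destroyed.
    static List<Integer> invalidReduce(List<Integer> group) {
        List<Integer> out = new ArrayList<>(group);
        Collections.reverse(out);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> sortedGroup = List.of(1, 2, 3, 4); // e.g. a secondarily sorted group
        System.out.println(validReduce(sortedGroup));    // prints "[2, 4]" -- still ascending
        System.out.println(invalidReduce(sortedGroup));  // prints "[4, 3, 2, 1]" -- order lost
    }
}
```

If the second function were annotated as forwarding that field, the optimizer could wrongly skip a re-sort downstream, which is exactly the incorrect-execution risk the issue describes.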





[GitHub] flink pull request: [FLINK-1679] rename degree of parallelism to p...

2015-03-23 Thread mxm
Github user mxm closed the pull request at:

https://github.com/apache/flink/pull/488




[jira] [Commented] (FLINK-1679) Document how degree of parallelism / parallelism / slots are connected to each other

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375553#comment-14375553
 ] 

ASF GitHub Bot commented on FLINK-1679:
---

Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/488#issuecomment-84875935
  
Merged in master with 126f9f799071688fe80955a7e7cfa991f53c95af


 Document how degree of parallelism /  parallelism / slots are connected 
 to each other
 ---

 Key: FLINK-1679
 URL: https://issues.apache.org/jira/browse/FLINK-1679
 Project: Flink
  Issue Type: Task
  Components: Documentation
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Maximilian Michels
 Fix For: 0.9


 I see too many users being confused about properly setting up Flink with 
 respect to parallelism.





[jira] [Commented] (FLINK-1679) Document how degree of parallelism / parallelism / slots are connected to each other

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375554#comment-14375554
 ] 

ASF GitHub Bot commented on FLINK-1679:
---

Github user mxm closed the pull request at:

https://github.com/apache/flink/pull/488


 Document how degree of parallelism /  parallelism / slots are connected 
 to each other
 ---

 Key: FLINK-1679
 URL: https://issues.apache.org/jira/browse/FLINK-1679
 Project: Flink
  Issue Type: Task
  Components: Documentation
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Maximilian Michels
 Fix For: 0.9


 I see too many users being confused about properly setting up Flink with 
 respect to parallelism.





[GitHub] flink pull request: [FLINK-1679] rename degree of parallelism to p...

2015-03-23 Thread mxm
Github user mxm commented on the pull request:

https://github.com/apache/flink/pull/488#issuecomment-84875935
  
Merged in master with 126f9f799071688fe80955a7e7cfa991f53c95af


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-1679) Document how degree of parallelism / parallelism / slots are connected to each other

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375551#comment-14375551
 ] 

ASF GitHub Bot commented on FLINK-1679:
---

Github user mxm closed the pull request at:

https://github.com/apache/flink/pull/488


 Document how degree of parallelism /  parallelism / slots are connected 
 to each other
 ---

 Key: FLINK-1679
 URL: https://issues.apache.org/jira/browse/FLINK-1679
 Project: Flink
  Issue Type: Task
  Components: Documentation
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Maximilian Michels
 Fix For: 0.9


 I see too many users being confused about properly setting up Flink with 
 respect to parallelism.





[jira] [Commented] (FLINK-1765) Reducer grouping is skipped when parallelism is one

2015-03-23 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375573#comment-14375573
 ] 

Robert Metzger commented on FLINK-1765:
---

Can we add a test for this fix?

 Reducer grouping is skipped when parallelism is one
 

 Key: FLINK-1765
 URL: https://issues.apache.org/jira/browse/FLINK-1765
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Gyula Fora
 Fix For: 0.9


 This program (note the parallelism) incorrectly runs a non-grouped reduce and 
 fails with a NullPointerException.
 {code}
 StreamExecutionEnvironment env = ...
 env.setDegreeOfParallelism(1);
 DataStream<String> stream = env.addSource(...);
 stream
 .filter(...)
 .map(...)
 .groupBy(someField)
 .reduce(new ReduceFunction<String>() {...} )
 .addSink(...);
 env.execute();
 {code}





[jira] [Reopened] (FLINK-1757) java.lang.ClassCastException is thrown while summing Short values on window

2015-03-23 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger reopened FLINK-1757:
---

 java.lang.ClassCastException is thrown while summing Short values on window
 ---

 Key: FLINK-1757
 URL: https://issues.apache.org/jira/browse/FLINK-1757
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Reporter: Péter Szabó
Assignee: Péter Szabó
 Fix For: 0.9


 java.lang.ClassCastException is thrown while summing Short values on window
 Stack Trace:
 Caused by: java.lang.RuntimeException: java.lang.ClassCastException: 
 java.lang.Integer cannot be cast to java.lang.Short
   at 
 org.apache.flink.streaming.api.streamvertex.OutputHandler.invokeUserFunction(OutputHandler.java:232)
   at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:121)
   at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to 
 java.lang.Short
   at 
 org.apache.flink.api.common.typeutils.base.ShortSerializer.copy(ShortSerializer.java:27)
   at 
 org.apache.flink.api.java.typeutils.runtime.TupleSerializer.copy(TupleSerializer.java:95)
   at 
 org.apache.flink.api.java.typeutils.runtime.TupleSerializer.copy(TupleSerializer.java:30)
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.copy(StreamInvokable.java:166)
   at 
 org.apache.flink.streaming.api.invokable.SinkInvokable.collect(SinkInvokable.java:46)
   at 
 org.apache.flink.streaming.api.collector.DirectedCollectorWrapper.collect(DirectedCollectorWrapper.java:95)
   at 
 org.apache.flink.streaming.api.invokable.operator.GroupedReduceInvokable.reduce(GroupedReduceInvokable.java:47)
   at 
 org.apache.flink.streaming.api.invokable.operator.StreamReduceInvokable.invoke(StreamReduceInvokable.java:39)
   at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invokeUserFunction(StreamVertex.java:85)
   at 
 org.apache.flink.streaming.api.streamvertex.OutputHandler.invokeUserFunction(OutputHandler.java:229)





[jira] [Resolved] (FLINK-1757) java.lang.ClassCastException is thrown while summing Short values on window

2015-03-23 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger resolved FLINK-1757.
---
   Resolution: Fixed
Fix Version/s: 0.9

 java.lang.ClassCastException is thrown while summing Short values on window
 ---

 Key: FLINK-1757
 URL: https://issues.apache.org/jira/browse/FLINK-1757
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Reporter: Péter Szabó
Assignee: Péter Szabó
 Fix For: 0.9


 java.lang.ClassCastException is thrown while summing Short values on window
 Stack Trace:
 Caused by: java.lang.RuntimeException: java.lang.ClassCastException: 
 java.lang.Integer cannot be cast to java.lang.Short
   at 
 org.apache.flink.streaming.api.streamvertex.OutputHandler.invokeUserFunction(OutputHandler.java:232)
   at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invoke(StreamVertex.java:121)
   at 
 org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:205)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to 
 java.lang.Short
   at 
 org.apache.flink.api.common.typeutils.base.ShortSerializer.copy(ShortSerializer.java:27)
   at 
 org.apache.flink.api.java.typeutils.runtime.TupleSerializer.copy(TupleSerializer.java:95)
   at 
 org.apache.flink.api.java.typeutils.runtime.TupleSerializer.copy(TupleSerializer.java:30)
   at 
 org.apache.flink.streaming.api.invokable.StreamInvokable.copy(StreamInvokable.java:166)
   at 
 org.apache.flink.streaming.api.invokable.SinkInvokable.collect(SinkInvokable.java:46)
   at 
 org.apache.flink.streaming.api.collector.DirectedCollectorWrapper.collect(DirectedCollectorWrapper.java:95)
   at 
 org.apache.flink.streaming.api.invokable.operator.GroupedReduceInvokable.reduce(GroupedReduceInvokable.java:47)
   at 
 org.apache.flink.streaming.api.invokable.operator.StreamReduceInvokable.invoke(StreamReduceInvokable.java:39)
   at 
 org.apache.flink.streaming.api.streamvertex.StreamVertex.invokeUserFunction(StreamVertex.java:85)
   at 
 org.apache.flink.streaming.api.streamvertex.OutputHandler.invokeUserFunction(OutputHandler.java:229)





[jira] [Commented] (FLINK-1615) Introduces a new InputFormat for Tweets

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375619#comment-14375619
 ] 

ASF GitHub Bot commented on FLINK-1615:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/442#issuecomment-84912485
  
I triggered another travis build: 
https://travis-ci.org/rmetzger/flink/builds/55459787


 Introduces a new InputFormat for Tweets
 ---

 Key: FLINK-1615
 URL: https://issues.apache.org/jira/browse/FLINK-1615
 Project: Flink
  Issue Type: New Feature
  Components: flink-contrib
Affects Versions: 0.8.1
Reporter: mustafa elbehery
Priority: Minor

 An event-driven parser for Tweets into Java POJOs. 
 It parses all the important parts of the tweet into Java objects. 
 Tested on a cluster, and the performance is pretty good. 





[jira] [Commented] (FLINK-1501) Integrate metrics library and report basic metrics to JobManager web interface

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375676#comment-14375676
 ] 

ASF GitHub Bot commented on FLINK-1501:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/421#issuecomment-84935408
  
Hey @bhatsachin, I've started working on the per-job monitoring, but it's 
currently in a work-in-progress state and I did not find time to finish it yet.

If you are interested in working on the topic, I would actually suggest to 
enhance the monitoring I've added in this pull request (the TaskManager 
monitoring).


If nobody has any objections, I would like to merge this change in the next 
24 hours.


 Integrate metrics library and report basic metrics to JobManager web interface
 --

 Key: FLINK-1501
 URL: https://issues.apache.org/jira/browse/FLINK-1501
 Project: Flink
  Issue Type: Sub-task
  Components: JobManager, TaskManager
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Robert Metzger
 Fix For: pre-apache


 As per mailing list, the library: https://github.com/dropwizard/metrics
 The goal of this task is to get the basic infrastructure in place.
 Subsequent issues will integrate more features into the system.





[GitHub] flink pull request: Adds akka interal docs and configuration descr...

2015-03-23 Thread tillrohrmann
Github user tillrohrmann commented on the pull request:

https://github.com/apache/flink/pull/512#issuecomment-84940008
  
Yes, I agree that we should keep the internals in only one place. Even then, 
it is probably hard enough to keep the docs in sync with the code.

But what still might be interesting to add to the docs is a description 
of the configuration parameters. There is still a big hole under "Distributed 
coordination (via Akka)". 




[jira] [Updated] (FLINK-1656) Filtered Semantic Properties for Operators with Iterators

2015-03-23 Thread Fabian Hueske (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fabian Hueske updated FLINK-1656:
-
Summary: Filtered Semantic Properties for Operators with Iterators  (was: 
Fix ForwardedField documentation for operators with iterator input)

 Filtered Semantic Properties for Operators with Iterators
 -

 Key: FLINK-1656
 URL: https://issues.apache.org/jira/browse/FLINK-1656
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.9
Reporter: Fabian Hueske
Assignee: Fabian Hueske
Priority: Critical

 The documentation of ForwardedFields is incomplete for operators with 
 iterator inputs (GroupReduce, CoGroup). 
 This should be fixed ASAP, because it can lead to incorrect program execution.
 The conditions for forwarded fields on operators with iterator input are:
 1) forwarded fields must be emitted in the order in which they are received 
 through the iterator
 2) all forwarded fields of a record must stick together, i.e., if your 
 function builds records from field 0 of the 1st, 3rd, 5th, ... and field 1 of 
 the 2nd, 4th, ... records coming through the iterator, these are not valid 
 forwarded fields.
 3) it is OK to completely filter out records coming through the iterator.
 The reason for these conditions is that the optimizer uses forwarded fields 
 to reason about physical data properties such as order and grouping. Mixing 
 up the order of records or emitting records which are composed from different 
 input records, might destroy a (secondary) order or grouping.





[jira] [Updated] (FLINK-1769) Maven deploy is broken (build artifacts are cleaned in docs-and-sources profile)

2015-03-23 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1769:
--
Component/s: Build System

 Maven deploy is broken (build artifacts are cleaned in docs-and-sources 
 profile)
 

 Key: FLINK-1769
 URL: https://issues.apache.org/jira/browse/FLINK-1769
 Project: Flink
  Issue Type: Bug
  Components: Build System
Reporter: Robert Metzger
Priority: Critical

 The issue has been introduced by FLINK-1720.
 This change broke the deployment to maven snapshots / central.
 {code}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
 on project flink-shaded-include-yarn: Failed to install artifact 
 org.apache.flink:flink-shaded-include-yarn:pom:0.9-SNAPSHOT: 
 /home/robert/incubator-flink/flink-shaded-hadoop/flink-shaded-include-yarn/target/dependency-reduced-pom.xml
  (No such file or directory) - [Help 1]
 {code}
 The issue is that Maven now executes {{clean}} after {{shade}}, and then 
 {{install}} can no longer store the result of {{shade}} (because it has 
 been deleted).





[jira] [Updated] (FLINK-1720) Integrate ScalaDoc in Scala sources into overall JavaDoc

2015-03-23 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1720:
--
Component/s: Build System

 Integrate ScalaDoc in Scala sources into overall JavaDoc
 

 Key: FLINK-1720
 URL: https://issues.apache.org/jira/browse/FLINK-1720
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Aljoscha Krettek
Assignee: Aljoscha Krettek
 Fix For: 0.9








[jira] [Commented] (FLINK-1650) Suppress Akka's Netty Shutdown Errors through the log config

2015-03-23 Thread Robert Metzger (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375668#comment-14375668
 ] 

Robert Metzger commented on FLINK-1650:
---

I got a response from the Akka mailing list. They've asked me to open an issue 
if we're still seeing the issue in {{2.3.9}}.
We are currently on Akka version {{2.3.7}}.

The changelogs:
http://akka.io/news/2014/12/17/akka-2.3.8-released.html
http://akka.io/news/2015/01/19/akka-2.3.9-released.html

Any objections against bumping our Akka dependency to 2.3.9 ?



 Suppress Akka's Netty Shutdown Errors through the log config
 

 Key: FLINK-1650
 URL: https://issues.apache.org/jira/browse/FLINK-1650
 Project: Flink
  Issue Type: Bug
  Components: other
Affects Versions: 0.9
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 0.9


 I suggest setting the logging for 
 `org.jboss.netty.channel.DefaultChannelPipeline` to error, in order to get 
 rid of the misleading stack trace caused by an Akka/Netty hiccup on shutdown.
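 A sketch of the suggested change, assuming the log4j.properties backend
 shipped in Flink's default conf/ directory (file name and backend are
 assumptions, not part of the issue text):

```properties
# Suppress the misleading Akka/Netty shutdown stack trace by raising the
# logger for the offending class to ERROR.
log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR
```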





[jira] [Commented] (FLINK-1679) Document how degree of parallelism / parallelism / slots are connected to each other

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375552#comment-14375552
 ] 

ASF GitHub Bot commented on FLINK-1679:
---

GitHub user mxm reopened a pull request:

https://github.com/apache/flink/pull/488

[FLINK-1679] rename degree of parallelism to parallelism & extend 
documentation about parallelism

https://issues.apache.org/jira/browse/FLINK-1679

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/flink parallelism

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/488.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #488


commit 17097fdf51f41445bad6da2186868185a6bf947b
Author: Maximilian Michels m...@apache.org
Date:   2015-03-18T09:44:42Z

[FLINK-1679] deprecate API methods to set the parallelism

commit f6ba8c07cc9a153b1ac1e213f9749155c42ae3c3
Author: Maximilian Michels m...@apache.org
Date:   2015-03-18T09:44:43Z

[FLINK-1679] use a consistent name for parallelism

* rename occurrences of degree of parallelism to parallelism

* [Dd]egree[ -]of[ -]parallelism -> [pP]arallelism
* (DOP|dop) -> [pP]arallelism
* paraDegree -> parallelism
* degree-of-parallelism -> parallelism
* DEGREE_OF_PARALLELISM -> PARALLELISM

commit 658bb1166aa907677e06cf011e5a0fdaf58ab15f
Author: Maximilian Michels m...@apache.org
Date:   2015-03-18T09:44:44Z

[FLINK-1679] deprecate old parallelism config entry

old config parameter can still be used

OLD
parallelization.degree.default

NEW
parallelism.default

commit 412ac54df0fde12666afbc1414df5fd919ba1607
Author: Maximilian Michels m...@apache.org
Date:   2015-03-18T09:44:45Z

[FLINK-1679] extend faq and programming guide to clarify parallelism




 Document how degree of parallelism /  parallelism / slots are connected 
 to each other
 ---

 Key: FLINK-1679
 URL: https://issues.apache.org/jira/browse/FLINK-1679
 Project: Flink
  Issue Type: Task
  Components: Documentation
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: Maximilian Michels
 Fix For: 0.9


 I see too many users being confused about properly setting up Flink with 
 respect to parallelism.





[GitHub] flink pull request: [FLINK-1679] rename degree of parallelism to p...

2015-03-23 Thread mxm
GitHub user mxm reopened a pull request:

https://github.com/apache/flink/pull/488

[FLINK-1679] rename degree of parallelism to parallelism & extend 
documentation about parallelism

https://issues.apache.org/jira/browse/FLINK-1679

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/flink parallelism

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/488.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #488


commit 17097fdf51f41445bad6da2186868185a6bf947b
Author: Maximilian Michels m...@apache.org
Date:   2015-03-18T09:44:42Z

[FLINK-1679] deprecate API methods to set the parallelism

commit f6ba8c07cc9a153b1ac1e213f9749155c42ae3c3
Author: Maximilian Michels m...@apache.org
Date:   2015-03-18T09:44:43Z

[FLINK-1679] use a consistent name for parallelism

* rename occurrences of degree of parallelism to parallelism

* [Dd]egree[ -]of[ -]parallelism -> [pP]arallelism
* (DOP|dop) -> [pP]arallelism
* paraDegree -> parallelism
* degree-of-parallelism -> parallelism
* DEGREE_OF_PARALLELISM -> PARALLELISM

commit 658bb1166aa907677e06cf011e5a0fdaf58ab15f
Author: Maximilian Michels m...@apache.org
Date:   2015-03-18T09:44:44Z

[FLINK-1679] deprecate old parallelism config entry

old config parameter can still be used

OLD
parallelization.degree.default

NEW
parallelism.default

commit 412ac54df0fde12666afbc1414df5fd919ba1607
Author: Maximilian Michels m...@apache.org
Date:   2015-03-18T09:44:45Z

[FLINK-1679] extend faq and programming guide to clarify parallelism






[jira] [Updated] (FLINK-1725) New Partitioner for better load balancing for skewed data

2015-03-23 Thread Robert Metzger (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-1725:
--
Assignee: Anis Nasir

 New Partitioner for better load balancing for skewed data
 -

 Key: FLINK-1725
 URL: https://issues.apache.org/jira/browse/FLINK-1725
 Project: Flink
  Issue Type: Improvement
  Components: New Components
Affects Versions: 0.8.1
Reporter: Anis Nasir
Assignee: Anis Nasir
  Labels: LoadBalancing, Partitioner
   Original Estimate: 336h
  Remaining Estimate: 336h

 Hi,
 We have recently studied the problem of load balancing in Storm [1].
 In particular, we focused on key distribution of the stream for skewed data.
 We developed a new stream partitioning scheme (which we call Partial Key 
 Grouping). It achieves better load balancing than key grouping while being 
 more scalable than shuffle grouping in terms of memory.
 In the paper we show a number of mining algorithms that are easy to implement 
 with partial key grouping, and whose performance can benefit from it. We 
 think that it might also be useful for a larger class of algorithms.
 Partial key grouping is very easy to implement: it requires just a few lines 
 of code in Java when implemented as a custom grouping in Storm [2].
 For all these reasons, we believe it will be a nice addition to the standard 
 Partitioners available in Flink. If the community thinks it's a good idea, we 
 will be happy to offer support in the porting.
 References:
 [1]. 
 https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf
 [2]. https://github.com/gdfm/partial-key-grouping
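 The "few lines of code" claim holds up: the core of Partial Key Grouping is
 the power-of-two-choices routing sketched below in plain, dependency-free
 Java (class and method names are illustrative, not the actual Storm or Flink
 API). Each key hashes to two candidate channels and is routed to whichever
 has received fewer records so far, so a hot key is split over at most two
 channels instead of overloading one.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of Partial Key Grouping channel selection.
public class PartialKeyGrouper {

    private final long[] load; // records sent to each downstream channel

    public PartialKeyGrouper(int numChannels) {
        this.load = new long[numChannels];
    }

    public int selectChannel(Object key) {
        // Two candidate channels derived from the key's hash.
        int h = key.hashCode();
        int c1 = Math.floorMod(h, load.length);
        int c2 = Math.floorMod(h * 31 + 17, load.length);
        // Route to the currently less-loaded candidate.
        int choice = load[c1] <= load[c2] ? c1 : c2;
        load[choice]++;
        return choice;
    }

    public static void main(String[] args) {
        PartialKeyGrouper grouper = new PartialKeyGrouper(8);
        Set<Integer> channels = new HashSet<>();
        for (int i = 0; i < 1000; i++) {
            channels.add(grouper.selectChannel("hot-key"));
        }
        // A single skewed key spreads over at most two channels.
        System.out.println(channels.size() <= 2);
    }
}
```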





[jira] [Commented] (FLINK-1693) Deprecate the Spargel API

2015-03-23 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375635#comment-14375635
 ] 

Vasia Kalavri commented on FLINK-1693:
--

Hey [~hsaputra]! That'd be great, thanks a lot :-)

 Deprecate the Spargel API
 -

 Key: FLINK-1693
 URL: https://issues.apache.org/jira/browse/FLINK-1693
 Project: Flink
  Issue Type: Task
  Components: Spargel
Affects Versions: 0.9
Reporter: Vasia Kalavri

 For the upcoming 0.9 release, we should mark all user-facing methods from the 
 Spargel API as deprecated, with a warning that we are going to remove it at 
 some point.
 We should also add a comment in the docs and point people to Gelly.





[jira] [Commented] (FLINK-1767) StreamExecutionEnvironment's execute should return JobExecutionResult

2015-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375559#comment-14375559
 ] 

ASF GitHub Bot commented on FLINK-1767:
---

Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/516#issuecomment-84879621
  
LGTM


 StreamExecutionEnvironment's execute should return JobExecutionResult
 -

 Key: FLINK-1767
 URL: https://issues.apache.org/jira/browse/FLINK-1767
 Project: Flink
  Issue Type: Improvement
  Components: Streaming
Reporter: Márton Balassi
Assignee: Gabor Gevay

 Although the streaming API does not make use of the accumulators, it is still 
 a nice handle for the execution time and might wrap other features in the 
 future.





[GitHub] flink pull request: [FLINK-1767] [streaming] Make StreamExecutionE...

2015-03-23 Thread rmetzger
Github user rmetzger commented on the pull request:

https://github.com/apache/flink/pull/516#issuecomment-84879621
  
LGTM



