[GitHub] flink pull request: [FLINK-2178][gelly] Fixed groupReduceOnNeighbo...

2015-06-13 Thread andralungu
Github user andralungu commented on the pull request:

https://github.com/apache/flink/pull/799#issuecomment-111761198
  
Yup, exactly! The use case was: I modified something in the edge data set,
called groupReduceOnNeighbors on the result, and got an NPE, an exception that
could have been avoided with this check, just as was the case for the Degree
NPE.

Making sure that the iterator is not null can save some people lots of
headaches, IMO :) And it doesn't hurt anyone who has a correct data set.
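As a rough illustration of the guard being discussed (plain Java, not Gelly's actual ApplyCoGroupFunctionOnAllNeighbors code; the class and method names here are hypothetical):

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class NeighborGuard {
    // Hypothetical reduce step over a coGrouped vertex group: checking
    // hasNext() first turns the NoSuchElementException/NPE failure mode
    // into a harmless skip for groups with no matching vertex.
    static Integer firstOrNull(Iterator<Integer> vertices) {
        if (!vertices.hasNext()) {
            return null; // e.g. an edge with an invalid id: skip the group
        }
        return vertices.next();
    }

    public static void main(String[] args) {
        System.out.println(firstOrNull(Collections.<Integer>emptyIterator())); // null
        System.out.println(firstOrNull(List.of(7, 8).iterator()));             // 7
    }
}
```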


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-2178) groupReduceOnNeighbors throws NoSuchElementException

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584877#comment-14584877
 ] 

ASF GitHub Bot commented on FLINK-2178:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/799#issuecomment-111760267
  
Hey @andralungu,

I'm not sure I understand this one. It's a coGroup of vertices with the 
edges. For the vertex iterator to be empty, doesn't it mean that there's an 
edge with an invalid id?


 groupReduceOnNeighbors throws NoSuchElementException
 

 Key: FLINK-2178
 URL: https://issues.apache.org/jira/browse/FLINK-2178
 Project: Flink
  Issue Type: Bug
  Components: Gelly
Affects Versions: 0.9
Reporter: Andra Lungu
Assignee: Andra Lungu

 In the ALL EdgeDirection case, ApplyCoGroupFunctionOnAllNeighbors does not
 check whether the vertex iterator has elements, causing the aforementioned
 exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2178][gelly] Fixed groupReduceOnNeighbo...

2015-06-13 Thread vasia
Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/799#issuecomment-111760267
  
Hey @andralungu,

I'm not sure I understand this one. It's a coGroup of vertices with the 
edges. For the vertex iterator to be empty, doesn't it mean that there's an 
edge with an invalid id?




[jira] [Commented] (FLINK-2093) Add a difference method to Gelly's Graph class

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584867#comment-14584867
 ] 

ASF GitHub Bot commented on FLINK-2093:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/818#discussion_r32374715
  
--- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -1234,6 +1234,18 @@ public void coGroup(Iterable<Edge<K, EV>> edge, Iterable<Edge<K, EV>> edgeToBeRe
 	}
 
 	/**
+	 * Performs Difference on the vertex and edge sets of the input graphs;
+	 * removes common vertices and edges. If a source/target vertex is removed,
+	 * its corresponding edge will also be removed.
+	 * @param graph the graph to perform difference with
+	 * @return a new graph where the common vertices and edges have been removed
+	 */
+	public Graph<K, VV, EV> difference(Graph<K, VV, EV> graph) throws java.lang.Exception {
+		DataSet<Vertex<K, VV>> removeVerticesData = graph.getVertices();
+		final List<Vertex<K, VV>> removeVerticesList = removeVerticesData.collect();
--- End diff --

I don't think we should use `collect()` here. Keep in mind that (1)
`collect()` will trigger program execution, and (2) it should not be used to
collect large DataSets, and the input graph might have lots of vertices.
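For intuition, here is a minimal local sketch of the difference semantics under discussion, using plain Java sets and a hypothetical Edge pair; in Gelly this would be expressed with distributed joins/coGroups rather than collecting anything to the client:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GraphDifference {
    // Hypothetical stand-in for Gelly's Edge<K, EV>: just a (src, dst) pair.
    record Edge(long src, long dst) {}

    // Remove the vertices that the two graphs have in common.
    static Set<Long> vertexDifference(Set<Long> vertices, Set<Long> other) {
        Set<Long> result = new HashSet<>(vertices);
        result.removeAll(other);
        return result;
    }

    // Drop every edge whose source or target vertex was removed.
    static List<Edge> pruneEdges(List<Edge> edges, Set<Long> remaining) {
        List<Edge> kept = new ArrayList<>();
        for (Edge e : edges) {
            if (remaining.contains(e.src()) && remaining.contains(e.dst())) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<Long> v = vertexDifference(Set.of(1L, 2L, 3L), Set.of(2L));
        List<Edge> e = pruneEdges(List.of(new Edge(1, 2), new Edge(1, 3)), v);
        System.out.println(e.size()); // 1: only the 1->3 edge survives
    }
}
```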


 Add a difference method to Gelly's Graph class
 --

 Key: FLINK-2093
 URL: https://issues.apache.org/jira/browse/FLINK-2093
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Affects Versions: 0.9
Reporter: Andra Lungu
Assignee: Shivani Ghatge
Priority: Minor

 This method will compute the difference between two graphs, returning a new 
 graph containing the vertices and edges that the current graph and the input 
 graph don't have in common. 





[GitHub] flink pull request: [FLINK-2093][gelly] Added difference Method

2015-06-13 Thread vasia
Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/818#discussion_r32374715
  
--- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -1234,6 +1234,18 @@ public void coGroup(Iterable<Edge<K, EV>> edge, Iterable<Edge<K, EV>> edgeToBeRe
 	}
 
 	/**
+	 * Performs Difference on the vertex and edge sets of the input graphs;
+	 * removes common vertices and edges. If a source/target vertex is removed,
+	 * its corresponding edge will also be removed.
+	 * @param graph the graph to perform difference with
+	 * @return a new graph where the common vertices and edges have been removed
+	 */
+	public Graph<K, VV, EV> difference(Graph<K, VV, EV> graph) throws java.lang.Exception {
+		DataSet<Vertex<K, VV>> removeVerticesData = graph.getVertices();
+		final List<Vertex<K, VV>> removeVerticesList = removeVerticesData.collect();
--- End diff --

I don't think we should use `collect()` here. Keep in mind that (1)
`collect()` will trigger program execution, and (2) it should not be used to
collect large DataSets, and the input graph might have lots of vertices.




[jira] [Commented] (FLINK-2093) Add a difference method to Gelly's Graph class

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584866#comment-14584866
 ] 

ASF GitHub Bot commented on FLINK-2093:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/818#discussion_r32374703
  
--- Diff: docs/libs/gelly_guide.md ---
@@ -240,6 +240,7 @@ Graph<Long, Double, Double> networkWithWeights = network.joinWithEdgesOnSource(v
 <img alt="Union Transformation" width="50%" src="fig/gelly-union.png"/>
 </p>
 
+* <strong>Difference</strong>: Gelly's `difference()` method performs a difference on the vertex and edge sets of the input graphs. The resultant graph is formed by removing the vertices and edges from the graph that are common with the second graph.
--- End diff --

we can rephrase this a bit... there is one input graph and no second
graph. I guess you copied from the union description above (which should also
be changed).


 Add a difference method to Gelly's Graph class
 --

 Key: FLINK-2093
 URL: https://issues.apache.org/jira/browse/FLINK-2093
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Affects Versions: 0.9
Reporter: Andra Lungu
Assignee: Shivani Ghatge
Priority: Minor

 This method will compute the difference between two graphs, returning a new 
 graph containing the vertices and edges that the current graph and the input 
 graph don't have in common. 





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-13 Thread Andra Lungu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584874#comment-14584874
 ] 

Andra Lungu commented on FLINK-1520:


Yup, I am not happy with the argument passing either, as it may be cumbersome
for the user to understand what each argument means, etc.
I thought about this approach; my only concern is that it will introduce a ton
of duplicate code. And, in the end, you write (more or less) the same commands,
except that instead of getting a DataSet, which you then turn into a graph with
fromDataSet, you get a graph directly...

If we are okay with code duplication, then I would +1 Vasia's solution.

 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-13 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584886#comment-14584886
 ] 

Vasia Kalavri commented on FLINK-1520:
--

I don't think it'll be a lot of duplicate code. You can have EdgeCsvReader wrap 
a CsvReader and just call its methods, no?
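A tiny sketch of the wrapping/delegation idea: the parser function below stands in for Flink's CsvReader, and `EdgeCsvReader`/`readEdges` are illustrative names, not Gelly's actual API. Only the edge-specific conversion is new code, so wrapping adds little duplication.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class EdgeCsvReader {
    // Stand-in for the wrapped CsvReader: splits a CSV line into fields.
    private final Function<String, String[]> lineParser;

    public EdgeCsvReader(Function<String, String[]> lineParser) {
        this.lineParser = lineParser;
    }

    // Delegate the parsing to the wrapped reader, then build (src, dst) pairs.
    public List<long[]> readEdges(List<String> lines) {
        List<long[]> edges = new ArrayList<>();
        for (String line : lines) {
            String[] fields = lineParser.apply(line);
            edges.add(new long[]{Long.parseLong(fields[0]), Long.parseLong(fields[1])});
        }
        return edges;
    }

    public static void main(String[] args) {
        EdgeCsvReader reader = new EdgeCsvReader(line -> line.split(","));
        List<long[]> edges = reader.readEdges(List.of("1,2", "2,3"));
        System.out.println(edges.size()); // 2
    }
}
```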

 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[jira] [Commented] (FLINK-2149) Simplify Gelly Jaccard similarity example

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584882#comment-14584882
 ] 

ASF GitHub Bot commented on FLINK-2149:
---

Github user vasia commented on the pull request:

https://github.com/apache/flink/pull/770#issuecomment-111760796
  
+1 except the minor comment


 Simplify Gelly Jaccard similarity example
 -

 Key: FLINK-2149
 URL: https://issues.apache.org/jira/browse/FLINK-2149
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 0.9
Reporter: Vasia Kalavri
Assignee: Andra Lungu
Priority: Trivial
  Labels: easyfix, starter

 The Gelly Jaccard similarity example can be simplified by replacing the 
 groupReduceOnEdges method with the simpler reduceOnEdges.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] flink pull request: [FLINK-2149][gelly] Simplified Jaccard Example

2015-06-13 Thread vasia
Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/770#discussion_r32374939
  
--- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/JaccardSimilarityMeasure.java ---
@@ -66,34 +63,47 @@ public static void main(String [] args) throws Exception {
 
 		DataSet<Edge<Long, Double>> edges = getEdgesDataSet(env);
 
-		Graph<Long, NullValue, Double> graph = Graph.fromDataSet(edges, env);
+		Graph<Long, HashSet<Long>, Double> graph = Graph.fromDataSet(edges,
+				new MapFunction<Long, HashSet<Long>>() {
 
-		DataSet<Vertex<Long, HashSet<Long>>> verticesWithNeighbors =
-				graph.groupReduceOnEdges(new GatherNeighbors(), EdgeDirection.ALL);
+					@Override
+					public HashSet<Long> map(Long id) throws Exception {
+						HashSet<Long> neighbors = new HashSet<Long>();
+						neighbors.add(id);
 
-		Graph<Long, HashSet<Long>, Double> graphWithVertexValues = Graph.fromDataSet(verticesWithNeighbors, edges, env);
+						return new HashSet<Long>(neighbors);
+					}
+				}, env);
 
-		// the edge value will be the Jaccard similarity coefficient (number of common neighbors / all neighbors)
-		DataSet<Tuple3<Long, Long, Double>> edgesWithJaccardWeight = graphWithVertexValues.getTriplets()
-				.map(new WeighEdgesMapper());
+		// create the set of neighbors
+		DataSet<Tuple2<Long, HashSet<Long>>> computedNeighbors =
+				graph.reduceOnNeighbors(new GatherNeighbors(), EdgeDirection.ALL);
 
-		DataSet<Edge<Long, Double>> result = graphWithVertexValues.joinWithEdges(edgesWithJaccardWeight,
-				new MapFunction<Tuple2<Double, Double>, Double>() {
+		// join with the vertices to update the node values
+		DataSet<Vertex<Long, HashSet<Long>>> verticesWithNeighbors =
+				graph.joinWithVertices(computedNeighbors,
+				new MapFunction<Tuple2<HashSet<Long>, HashSet<Long>>, HashSet<Long>>() {
 
 					@Override
-					public Double map(Tuple2<Double, Double> value) throws Exception {
-						return value.f1;
+					public HashSet<Long> map(Tuple2<HashSet<Long>, HashSet<Long>> tuple2) throws Exception {
+						return tuple2.f1;
 					}
-				}).getEdges();
+				}).getVertices();
+
+		Graph<Long, HashSet<Long>, Double> graphWithVertexValues = Graph.fromDataSet(verticesWithNeighbors, edges, env);
--- End diff --

joinWithVertices can give you the Graph directly :)




[jira] [Commented] (FLINK-2149) Simplify Gelly Jaccard similarity example

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584881#comment-14584881
 ] 

ASF GitHub Bot commented on FLINK-2149:
---

Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/770#discussion_r32374939
  
--- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/JaccardSimilarityMeasure.java ---
@@ -66,34 +63,47 @@ public static void main(String [] args) throws Exception {
 
 		DataSet<Edge<Long, Double>> edges = getEdgesDataSet(env);
 
-		Graph<Long, NullValue, Double> graph = Graph.fromDataSet(edges, env);
+		Graph<Long, HashSet<Long>, Double> graph = Graph.fromDataSet(edges,
+				new MapFunction<Long, HashSet<Long>>() {
 
-		DataSet<Vertex<Long, HashSet<Long>>> verticesWithNeighbors =
-				graph.groupReduceOnEdges(new GatherNeighbors(), EdgeDirection.ALL);
+					@Override
+					public HashSet<Long> map(Long id) throws Exception {
+						HashSet<Long> neighbors = new HashSet<Long>();
+						neighbors.add(id);
 
-		Graph<Long, HashSet<Long>, Double> graphWithVertexValues = Graph.fromDataSet(verticesWithNeighbors, edges, env);
+						return new HashSet<Long>(neighbors);
+					}
+				}, env);
 
-		// the edge value will be the Jaccard similarity coefficient (number of common neighbors / all neighbors)
-		DataSet<Tuple3<Long, Long, Double>> edgesWithJaccardWeight = graphWithVertexValues.getTriplets()
-				.map(new WeighEdgesMapper());
+		// create the set of neighbors
+		DataSet<Tuple2<Long, HashSet<Long>>> computedNeighbors =
+				graph.reduceOnNeighbors(new GatherNeighbors(), EdgeDirection.ALL);
 
-		DataSet<Edge<Long, Double>> result = graphWithVertexValues.joinWithEdges(edgesWithJaccardWeight,
-				new MapFunction<Tuple2<Double, Double>, Double>() {
+		// join with the vertices to update the node values
+		DataSet<Vertex<Long, HashSet<Long>>> verticesWithNeighbors =
+				graph.joinWithVertices(computedNeighbors,
+				new MapFunction<Tuple2<HashSet<Long>, HashSet<Long>>, HashSet<Long>>() {
 
 					@Override
-					public Double map(Tuple2<Double, Double> value) throws Exception {
-						return value.f1;
+					public HashSet<Long> map(Tuple2<HashSet<Long>, HashSet<Long>> tuple2) throws Exception {
+						return tuple2.f1;
 					}
-				}).getEdges();
+				}).getVertices();
+
+		Graph<Long, HashSet<Long>, Double> graphWithVertexValues = Graph.fromDataSet(verticesWithNeighbors, edges, env);
--- End diff --

joinWithVertices can give you the Graph directly :)


 Simplify Gelly Jaccard similarity example
 -

 Key: FLINK-2149
 URL: https://issues.apache.org/jira/browse/FLINK-2149
 Project: Flink
  Issue Type: Improvement
  Components: Gelly
Affects Versions: 0.9
Reporter: Vasia Kalavri
Assignee: Andra Lungu
Priority: Trivial
  Labels: easyfix, starter

 The Gelly Jaccard similarity example can be simplified by replacing the 
 groupReduceOnEdges method with the simpler reduceOnEdges.





[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...

2015-06-13 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/831#issuecomment-111717366
  
I had some more remarks, sorry for being so picky. :sweat_smile: 

Other than that, I think the change looks really good now!




[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...

2015-06-13 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/831#discussion_r32371128
  
--- Diff: flink-staging/flink-table/src/test/scala/org/apache/flink/api/table/typeinfo/RowSerializerTest.scala ---
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.typeinfo
+
+import org.apache.flink.api.common.ExecutionConfig
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.common.typeutils.{SerializerTestInstance, TypeSerializer}
+import org.apache.flink.api.table.Row
+import org.junit.Assert._
+import org.junit.Test
+
+class RowSerializerTest {
+
+  class RowSerializerTestInstance(serializer: TypeSerializer[Row],
+      testData: Array[Row])
+    extends SerializerTestInstance(serializer, classOf[Row], -1, testData: _*) {
+
+    override protected def deepEquals(message: String, should: Row, is: Row): Unit = {
+      val arity = should.productArity
+      assertEquals(message, arity, is.productArity)
+      var index = 0
+      while (index < arity) {
+        val copiedValue: Any = should.productElement(index)
+        val element: Any = is.productElement(index)
+        assertEquals(message, element, copiedValue)
+        index += 1
+      }
+    }
+  }
+
+  @Test
+  def testRowSerializer(): Unit = {
+
+    val rowInfo: TypeInformation[Row] = new RowTypeInfo(
+      Seq(BasicTypeInfo.INT_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO), Seq("id", "name"))
+
+    val row1 = new Row(2)
+    row1.setField(0, 1)
+    row1.setField(1, "a")
+
+    val row2 = new Row(2)
+    row2.setField(0, 2)
+    row2.setField(1, "hello")
+
+    val testData: Array[Row] = Array(row1, row2)
--- End diff --

I think it would be good to also add a row that has null values, since the 
change actually introduces that.




[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584639#comment-14584639
 ] 

ASF GitHub Bot commented on FLINK-2203:
---

Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/831#discussion_r32371128
  
--- Diff: flink-staging/flink-table/src/test/scala/org/apache/flink/api/table/typeinfo/RowSerializerTest.scala ---
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.table.typeinfo
+
+import org.apache.flink.api.common.ExecutionConfig
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.common.typeutils.{SerializerTestInstance, TypeSerializer}
+import org.apache.flink.api.table.Row
+import org.junit.Assert._
+import org.junit.Test
+
+class RowSerializerTest {
+
+  class RowSerializerTestInstance(serializer: TypeSerializer[Row],
+      testData: Array[Row])
+    extends SerializerTestInstance(serializer, classOf[Row], -1, testData: _*) {
+
+    override protected def deepEquals(message: String, should: Row, is: Row): Unit = {
+      val arity = should.productArity
+      assertEquals(message, arity, is.productArity)
+      var index = 0
+      while (index < arity) {
+        val copiedValue: Any = should.productElement(index)
+        val element: Any = is.productElement(index)
+        assertEquals(message, element, copiedValue)
+        index += 1
+      }
+    }
+  }
+
+  @Test
+  def testRowSerializer(): Unit = {
+
+    val rowInfo: TypeInformation[Row] = new RowTypeInfo(
+      Seq(BasicTypeInfo.INT_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO), Seq("id", "name"))
+
+    val row1 = new Row(2)
+    row1.setField(0, 1)
+    row1.setField(1, "a")
+
+    val row2 = new Row(2)
+    row2.setField(0, 2)
+    row2.setField(1, "hello")
+
+    val testData: Array[Row] = Array(row1, row2)
--- End diff --

I think it would be good to also add a row that has null values, since the 
change actually introduces that.


 Add Support for Null-Values in RowSerializer
 

 Key: FLINK-2203
 URL: https://issues.apache.org/jira/browse/FLINK-2203
 Project: Flink
  Issue Type: Improvement
  Components: Table API
Reporter: Aljoscha Krettek
Assignee: Shiti Saxena
Priority: Minor
  Labels: Starter

 This would be a start towards proper handling of null values. We would still 
 need to add support for null values in aggregations.





[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...

2015-06-13 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/831#discussion_r32371124
  
--- Diff: flink-staging/flink-table/src/main/scala/org/apache/flink/api/table/typeinfo/RowSerializer.scala ---
@@ -102,20 +119,39 @@ class RowSerializer(fieldSerializers: Array[TypeSerializer[Any]])
     val len = fieldSerializers.length
 
     val result = new Row(len)
-    var i = 0
-    while (i < len) {
-      result.setField(i, fieldSerializers(i).deserialize(source))
-      i += 1
+
+    var index = 0
+    while (index < len) {
+      val isNull: Boolean = source.readBoolean()
+      if (isNull) {
+        result.setField(index, null)
+      } else {
+        val serializer: TypeSerializer[Any] = fieldSerializers(index)
+        result.setField(index, serializer.deserialize(source))
+      }
+      index += 1
     }
     result
   }
 
+  private final val booleanSerializer = new BooleanSerializer()
+
   override def copy(source: DataInputView, target: DataOutputView): Unit = {
     val len = fieldSerializers.length
     var i = 0
     while (i < len) {
+      booleanSerializer.copy(source, target)
--- End diff --

I think it would be easier to do
```
target.writeBoolean(source.readBoolean())
```
here, instead of going through the extra abstraction of the 
BooleanSerializer.
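The suggestion, sketched with plain java.io streams (assuming, for illustration, a single nullable int field; Flink's DataInputView/DataOutputView expose the same readBoolean/writeBoolean calls):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class NullMarkerCopy {
    // Copy one nullable field: forward the null flag directly with
    // target.writeBoolean(source.readBoolean()) instead of going through the
    // extra BooleanSerializer abstraction, then copy the payload only when
    // the field is present.
    static void copyField(DataInputStream source, DataOutputStream target) throws IOException {
        boolean isNull = source.readBoolean();
        target.writeBoolean(isNull);
        if (!isNull) {
            target.writeInt(source.readInt()); // assumed int payload for the sketch
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream src = new ByteArrayOutputStream();
        DataOutputStream w = new DataOutputStream(src);
        w.writeBoolean(false); // field is not null
        w.writeInt(42);

        ByteArrayOutputStream dst = new ByteArrayOutputStream();
        copyField(new DataInputStream(new ByteArrayInputStream(src.toByteArray())),
                  new DataOutputStream(dst));
        System.out.println(dst.size()); // 5 bytes: 1 flag + 4 payload
    }
}
```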




[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...

2015-06-13 Thread Shiti
Github user Shiti commented on the pull request:

https://github.com/apache/flink/pull/831#issuecomment-111709239
  
@aljoscha I have updated the RowSerializerTest to use SerializerTestBase 
and reverted to using while loops in RowSerializer




[jira] [Commented] (FLINK-2208) Build error for Java IBM

2015-06-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/FLINK-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584610#comment-14584610
 ] 

Johannes Günther commented on FLINK-2208:
-

My first guess is that the IBM JDK is not officially supported and therefore
not tested. The automatic build process on Travis does not seem to support it
either.

http://docs.travis-ci.com/user/languages/java/#Testing-Against-Multiple-JDKs

That being said, the com.sun.management packages are internal APIs of the
Sun (Oracle) JDK that happen to be available in OpenJDK as well. A fix is to
use java.lang.management.OperatingSystemMXBean instead; the com.sun.* packages
are not meant to be used outside the Sun JDK and should not be relied on in
programs.
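A minimal, vendor-portable sketch of the proposed fix, using only the standard java.lang.management API:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class PortableOsStats {
    public static void main(String[] args) {
        // java.lang.management.OperatingSystemMXBean is part of the Java SE
        // spec, so it exists on IBM J9 as well as Oracle/OpenJDK; the
        // com.sun.management variant is an Oracle/OpenJDK-only extension.
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        System.out.println(os.getAvailableProcessors() >= 1); // true
    }
}
```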


 Build error for Java IBM
 

 Key: FLINK-2208
 URL: https://issues.apache.org/jira/browse/FLINK-2208
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 0.9
Reporter: Felix Neutatz
Priority: Minor

 Using IBM Java 7 will break the build:
 {code:xml}
 [INFO] --- scala-maven-plugin:3.1.4:compile (scala-compile-first) @ 
 flink-runtime ---
 [INFO] 
 /share/flink/flink-0.9-SNAPSHOT-wo-Yarn/flink-runtime/src/main/java:-1: info: 
 compiling
 [INFO] 
 /share/flink/flink-0.9-SNAPSHOT-wo-Yarn/flink-runtime/src/main/scala:-1: 
 info: compiling
 [INFO] Compiling 461 source files to 
 /share/flink/flink-0.9-SNAPSHOT-wo-Yarn/flink-runtime/target/classes at 
 1434059956648
 [ERROR] 
 /share/flink/flink-0.9-SNAPSHOT-wo-Yarn/flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala:1768:
  error: type OperatingSystemMXBean is not a member of package 
 com.sun.management
 [ERROR] asInstanceOf[com.sun.management.OperatingSystemMXBean]).
 [ERROR] ^
 [ERROR] 
 /share/flink/flink-0.9-SNAPSHOT-wo-Yarn/flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala:1787:
  error: type OperatingSystemMXBean is not a member of package 
 com.sun.management
 [ERROR] val methodsList = 
 classOf[com.sun.management.OperatingSystemMXBean].getMethods()
 [ERROR]  ^
 [ERROR] two errors found
 [INFO] 
 
 [INFO] Reactor Summary:
 [INFO] 
 [INFO] flink .. SUCCESS [ 14.447 
 s]
 [INFO] flink-shaded-hadoop  SUCCESS [  2.548 
 s]
 [INFO] flink-shaded-include-yarn .. SUCCESS [ 36.122 
 s]
 [INFO] flink-shaded-include-yarn-tests  SUCCESS [ 36.980 
 s]
 [INFO] flink-core . SUCCESS [ 21.887 
 s]
 [INFO] flink-java . SUCCESS [ 16.023 
 s]
 [INFO] flink-runtime .. FAILURE [ 20.241 
 s]
 [INFO] flink-optimizer  SKIPPED
 [hadoop@ibm-power-1 /]$ java -version
 java version 1.7.0
 Java(TM) SE Runtime Environment (build pxp6470_27sr1fp1-20140708_01(SR1 FP1))
 IBM J9 VM (build 2.7, JRE 1.7.0 Linux ppc64-64 Compressed References 
 20140707_205525 (JIT enabled, AOT enabled)
 J9VM - R27_Java727_SR1_20140707_1408_B205525
 JIT  - tr.r13.java_20140410_61421.07
 GC   - R27_Java727_SR1_20140707_1408_B205525_CMPRSS
 J9CL - 20140707_205525)
 JCL - 20140707_01 based on Oracle 7u65-b16
 {code}





[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584641#comment-14584641
 ] 

ASF GitHub Bot commented on FLINK-2203:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/831#issuecomment-111717366
  
I had some more remarks, sorry for being so picky. :sweat_smile: 

Other than that, I think the change looks really good now!


 Add Support for Null-Values in RowSerializer
 

 Key: FLINK-2203
 URL: https://issues.apache.org/jira/browse/FLINK-2203
 Project: Flink
  Issue Type: Improvement
  Components: Table API
Reporter: Aljoscha Krettek
Assignee: Shiti Saxena
Priority: Minor
  Labels: Starter

 This would be a start towards proper handling of null values. We would still 
 need to add support for null values in aggregations.





[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14584480#comment-14584480
 ] 

ASF GitHub Bot commented on FLINK-2203:
---

Github user Shiti commented on the pull request:

https://github.com/apache/flink/pull/831#issuecomment-111682105
  
@hsaputra I have updated the description


 Add Support for Null-Values in RowSerializer
 

 Key: FLINK-2203
 URL: https://issues.apache.org/jira/browse/FLINK-2203
 Project: Flink
  Issue Type: Improvement
  Components: Table API
Reporter: Aljoscha Krettek
Assignee: Shiti Saxena
Priority: Minor
  Labels: Starter

 This would be a start towards proper handling of null values. We would still 
 need to add support for null values in aggregations.





[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...

2015-06-13 Thread Shiti
Github user Shiti commented on the pull request:

https://github.com/apache/flink/pull/831#issuecomment-111682105
  
@hsaputra I have updated the description




[jira] [Assigned] (FLINK-2152) Provide zipWithIndex utility in flink-contrib

2015-06-13 Thread Andra Lungu (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andra Lungu reassigned FLINK-2152:
--

Assignee: Andra Lungu

 Provide zipWithIndex utility in flink-contrib
 -

 Key: FLINK-2152
 URL: https://issues.apache.org/jira/browse/FLINK-2152
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Reporter: Robert Metzger
Assignee: Andra Lungu
Priority: Trivial
  Labels: starter

 We should provide a simple utility method for zipping elements in a data set 
 with a dense index.
 It's up for discussion whether we want it directly in the API or whether we 
 should provide it only as a utility from {{flink-contrib}}.
 I would put it in {{flink-contrib}}.
 See my answer on SO: 
 http://stackoverflow.com/questions/30596556/zipwithindex-on-apache-flink





[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584671#comment-14584671
 ] 

ASF GitHub Bot commented on FLINK-2203:
---

Github user Shiti commented on the pull request:

https://github.com/apache/flink/pull/831#issuecomment-111723371
  
@aljoscha I have made the suggested changes







[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...

2015-06-13 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/831#discussion_r32369782
  
--- Diff: 
flink-staging/flink-table/src/main/scala/org/apache/flink/api/table/typeinfo/RowSerializer.scala
 ---
@@ -89,11 +99,16 @@ class RowSerializer(fieldSerializers: 
Array[TypeSerializer[Any]])
   throw new RuntimeException("Row arity of reuse and fields do not 
match.")
 }
 
-var i = 0
-while (i < len) {
-  val field = reuse.productElement(i).asInstanceOf[AnyRef]
-  reuse.setField(i, fieldSerializers(i).deserialize(field, source))
-  i += 1
+(0 to len - 1).foreach {
+  index =>
+val isNull: Boolean = source.readBoolean
+if (isNull) {
+  reuse.setField(index, null)
+} else {
+  val field = reuse.productElement(index).asInstanceOf[AnyRef]
+  val serializer: TypeSerializer[Any] = fieldSerializers(index)
+  reuse.setField(index, serializer.deserialize(field, source))
+}
--- End diff --

See above comment about while loop.




[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584568#comment-14584568
 ] 

ASF GitHub Bot commented on FLINK-2203:
---

Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/831#discussion_r32369782
  
--- Diff: 
flink-staging/flink-table/src/main/scala/org/apache/flink/api/table/typeinfo/RowSerializer.scala
 ---
@@ -89,11 +99,16 @@ class RowSerializer(fieldSerializers: 
Array[TypeSerializer[Any]])
   throw new RuntimeException("Row arity of reuse and fields do not 
match.")
 }
 
-var i = 0
-while (i < len) {
-  val field = reuse.productElement(i).asInstanceOf[AnyRef]
-  reuse.setField(i, fieldSerializers(i).deserialize(field, source))
-  i += 1
+(0 to len - 1).foreach {
+  index =>
+val isNull: Boolean = source.readBoolean
+if (isNull) {
+  reuse.setField(index, null)
+} else {
+  val field = reuse.productElement(index).asInstanceOf[AnyRef]
+  val serializer: TypeSerializer[Any] = fieldSerializers(index)
+  reuse.setField(index, serializer.deserialize(field, source))
+}
--- End diff --

See above comment about while loop.







[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584567#comment-14584567
 ] 

ASF GitHub Bot commented on FLINK-2203:
---

Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/831#discussion_r32369781
  
--- Diff: 
flink-staging/flink-table/src/main/scala/org/apache/flink/api/table/typeinfo/RowSerializer.scala
 ---
@@ -74,11 +79,16 @@ class RowSerializer(fieldSerializers: 
Array[TypeSerializer[Any]])
 
   override def serialize(value: Row, target: DataOutputView) {
 val len = fieldSerializers.length
-var i = 0
-while (i < len) {
-  val serializer = fieldSerializers(i)
-  serializer.serialize(value.productElement(i), target)
-  i += 1
+(0 to len - 1).foreach {
+  index =>
+val o: AnyRef = value.productElement(index).asInstanceOf[AnyRef]
+if (o == null) {
+  target.writeBoolean(true)
+} else {
+  target.writeBoolean(false)
+  val serializer = fieldSerializers(index)
+  serializer.serialize(value.productElement(index), target)
+}
--- End diff --

Could you please change this back to the simple while loop. I know that 
using the fancy Scala features is tempting but the performance of a simple 
while loop should be better than creating a range and iterating over it using 
foreach.
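
For readers following the diff, the null-handling scheme under discussion is 
easy to state: every field is preceded by a boolean marker, true meaning the 
field is null and nothing follows, false meaning the serialized field follows. 
A minimal, self-contained Java sketch of the same round trip (a hypothetical 
NullMarkerDemo class, String fields only for brevity, plain index loop as the 
reviewer suggests):

```java
import java.io.*;

public class NullMarkerDemo {

    // Write a boolean null-marker before each field; only non-null
    // fields are followed by their serialized value.
    static void serialize(String[] row, DataOutput out) throws IOException {
        for (int i = 0; i < row.length; i++) {
            if (row[i] == null) {
                out.writeBoolean(true);   // marker: field is null, no payload
            } else {
                out.writeBoolean(false);  // marker: payload follows
                out.writeUTF(row[i]);
            }
        }
    }

    // Read the marker first, then the payload only if the marker is false.
    static String[] deserialize(int arity, DataInput in) throws IOException {
        String[] row = new String[arity];
        for (int i = 0; i < arity; i++) {
            row[i] = in.readBoolean() ? null : in.readUTF();
        }
        return row;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        serialize(new String[] {"a", null, "c"}, new DataOutputStream(buffer));
        String[] back = deserialize(3,
                new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        System.out.println(back[0] + "," + back[1] + "," + back[2]); // a,null,c
    }
}
```

Note the one-byte-per-field overhead this marker adds even for rows without 
nulls, which is the usual trade-off of this encoding.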







[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...

2015-06-13 Thread aljoscha
Github user aljoscha commented on a diff in the pull request:

https://github.com/apache/flink/pull/831#discussion_r32369781
  
--- Diff: 
flink-staging/flink-table/src/main/scala/org/apache/flink/api/table/typeinfo/RowSerializer.scala
 ---
@@ -74,11 +79,16 @@ class RowSerializer(fieldSerializers: 
Array[TypeSerializer[Any]])
 
   override def serialize(value: Row, target: DataOutputView) {
 val len = fieldSerializers.length
-var i = 0
-while (i < len) {
-  val serializer = fieldSerializers(i)
-  serializer.serialize(value.productElement(i), target)
-  i += 1
+(0 to len - 1).foreach {
+  index =>
+val o: AnyRef = value.productElement(index).asInstanceOf[AnyRef]
+if (o == null) {
+  target.writeBoolean(true)
+} else {
+  target.writeBoolean(false)
+  val serializer = fieldSerializers(index)
+  serializer.serialize(value.productElement(index), target)
+}
--- End diff --

Could you please change this back to the simple while loop. I know that 
using the fancy Scala features is tempting but the performance of a simple 
while loop should be better than creating a range and iterating over it using 
foreach.




[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files

2015-06-13 Thread Andra Lungu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584586#comment-14584586
 ] 

Andra Lungu commented on FLINK-1520:


Hey [~vkalavri],

To my knowledge, you cannot deduce the key or the value's class from the 
generic parameters K, VV, EV. The way I would implement fromCsv is by adding 
the classes as parameters, e.g. Graph.fromCsv(edgesPath, String.class, 
String.class, context). For NullValue, we would then have a single class 
argument: Graph.fromCsv(edgesPath, String.class, context).
The user should know what kind of keys he/she has in there, so the extra 
parameters should not be that much of a burden. 

Is this what you had in mind? For the time being, I cannot see a smarter way 
of doing it :) 

The examples should be updated accordingly since they now read the edge and 
vertex data sets from CSV and then use fromDataSet to produce the graph. 
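
To illustrate the type-erasure point behind passing String.class and friends: 
the generic parameters are gone at runtime, so turning a raw CSV token into a 
typed key or value needs the target class handed in explicitly. A hypothetical 
helper (not actual Gelly API), sketching what such a reader would do 
internally:

```java
public class FieldParseSketch {

    // Hypothetical: convert one raw CSV token into a typed value. Because of
    // type erasure, the target Class must be passed in; it cannot be deduced
    // from a generic parameter at runtime.
    static <T> T parseField(String raw, Class<T> type) {
        if (type == String.class) return type.cast(raw);
        if (type == Long.class)   return type.cast(Long.valueOf(raw));
        if (type == Double.class) return type.cast(Double.valueOf(raw));
        throw new IllegalArgumentException("unsupported field type: " + type);
    }

    public static void main(String[] args) {
        // An edge line "1,2,0.5" parsed as (Long src, Long trg, Double value)
        String[] tokens = "1,2,0.5".split(",");
        Long src = parseField(tokens[0], Long.class);
        Long trg = parseField(tokens[1], Long.class);
        Double value = parseField(tokens[2], Double.class);
        System.out.println(src + "->" + trg + " : " + value); // 1->2 : 0.5
    }
}
```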

 Read edges and vertices from CSV files
 --

 Key: FLINK-1520
 URL: https://issues.apache.org/jira/browse/FLINK-1520
 Project: Flink
  Issue Type: New Feature
  Components: Gelly
Reporter: Vasia Kalavri
Assignee: Shivani Ghatge
Priority: Minor
  Labels: easyfix, newbie

 Add methods to create Vertex and Edge Datasets directly from CSV file inputs.





[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...

2015-06-13 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/831#issuecomment-111701489
  
The changes look good except for comments I had about the loops.

For the tests, did you try doing it as in TraversableSerializerTest.scala? 
There, we override deepEquals() and the runTests() method can be modified to 
take the RowTypeInfo that was explicitly created. The reason I would like this 
test to derive from SerializerTestBase, as all the other serializer tests do, 
is that otherwise we would have to change the tests for RowSerializer whenever 
we adapt the tests there, and this would likely be forgotten by future 
contributors.




[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584572#comment-14584572
 ] 

ASF GitHub Bot commented on FLINK-2203:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/831#issuecomment-111701489
  
The changes look good except for comments I had about the loops.

For the tests, did you try doing it as in TraversableSerializerTest.scala? 
There, we override deepEquals() and the runTests() method can be modified to 
take the RowTypeInfo that was explicitly created. The reason I would like this 
test to derive from SerializerTestBase, as all the other serializer tests do, 
is that otherwise we would have to change the tests for RowSerializer whenever 
we adapt the tests there, and this would likely be forgotten by future 
contributors.







[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584782#comment-14584782
 ] 

ASF GitHub Bot commented on FLINK-2203:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/831







[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...

2015-06-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/831




[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...

2015-06-13 Thread aljoscha
Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/831#issuecomment-111741260
  
Thanks, nice work. :+1: 




[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584783#comment-14584783
 ] 

ASF GitHub Bot commented on FLINK-2203:
---

Github user aljoscha commented on the pull request:

https://github.com/apache/flink/pull/831#issuecomment-111741260
  
Thanks, nice work. :+1: 







[jira] [Closed] (FLINK-2203) Add Support for Null-Values in RowSerializer

2015-06-13 Thread Aljoscha Krettek (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek closed FLINK-2203.
---
Resolution: Fixed

Resolved in 
https://github.com/apache/flink/commit/f8e12b20d925c3f6f24769327d1da5d98affa679






[jira] [Commented] (FLINK-2175) Allow multiple jobs in single jar file

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584835#comment-14584835
 ] 

ASF GitHub Bot commented on FLINK-2175:
---

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/707#issuecomment-111755732
  
Any news on this PR?


 Allow multiple jobs in single jar file
 --

 Key: FLINK-2175
 URL: https://issues.apache.org/jira/browse/FLINK-2175
 Project: Flink
  Issue Type: Improvement
  Components: Examples, other, Webfrontend
Reporter: Matthias J. Sax
Assignee: Matthias J. Sax
Priority: Minor

 Allow to package multiple jobs into a single jar.
   - extend WebClient to display all available jobs
   - extend WebClient to display the plan and submit each job





[GitHub] flink pull request: [FLINK-2175] Allow multiple jobs in single jar...

2015-06-13 Thread mjsax
Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/707#issuecomment-111755732
  
Any news on this PR?




[GitHub] flink pull request: [FLINK-1818] Added api to cancel job from clie...

2015-06-13 Thread mjsax
Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/642#issuecomment-111755816
  
Any news on this PR? I think it is a nice feature. @rainiraj, do you still 
work on this?




[GitHub] flink pull request: Flink Storm compatibility

2015-06-13 Thread mjsax
Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/573#issuecomment-111755768
  
Any news on this PR?




[jira] [Commented] (FLINK-1818) Provide API to cancel running job

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584836#comment-14584836
 ] 

ASF GitHub Bot commented on FLINK-1818:
---

Github user mjsax commented on the pull request:

https://github.com/apache/flink/pull/642#issuecomment-111755816
  
Any news on this PR? I think it is a nice feature. @rainiraj, do you still 
work on this?


 Provide API to cancel running job
 -

 Key: FLINK-1818
 URL: https://issues.apache.org/jira/browse/FLINK-1818
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Affects Versions: 0.9
Reporter: Robert Metzger
Assignee: niraj rai
  Labels: starter

 http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Canceling-a-Cluster-Job-from-Java-td4897.html





[jira] [Commented] (FLINK-2152) Provide zipWithIndex utility in flink-contrib

2015-06-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584840#comment-14584840
 ] 

ASF GitHub Bot commented on FLINK-2152:
---

GitHub user andralungu opened a pull request:

https://github.com/apache/flink/pull/832

[FLINK-2152] Added zipWithIndex 

This PR adds the zipWithIndex utility method to Flink's DataSetUtils as 
described in the mailing list discussion: 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/The-correct-location-for-zipWithIndex-and-zipWithUniqueId-td6310.html.
 

The method could, in the future, be moved to DataSet. 

@fhueske, @tillrohrmann, once we reach a conclusion for this one, I will 
also update #801 (I wouldn't like to fix unnecessary merge conflicts). 

Once zipWithUniqueIds is added, I could also explain the difference in the 
docs. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andralungu/flink zipWithIndex

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/832.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #832


commit fdbf0167cc10e952faddc2a7d71e73e7f1f2d03f
Author: andralungu lungu.an...@gmail.com
Date:   2015-06-12T18:37:27Z

[FLINK-2152] zipWithIndex implementation

[FLINK-2152] Added zipWithIndex utility method
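
For readers unfamiliar with the utility: the intended semantics of 
zipWithIndex are simply to pair every element with a dense, consecutive index. 
A hypothetical local (non-distributed) illustration; the actual Flink utility 
operates on a DataSet and, per the linked mailing list discussion, keeps 
indices dense across partitions by first counting the elements per partition 
and then assigning offsets:

```java
import java.util.*;

public class ZipWithIndexSketch {

    // Pair each element with a dense index starting at 0 (local sketch only;
    // the distributed version must coordinate offsets across partitions).
    static <T> List<Map.Entry<Long, T>> zipWithIndex(List<T> data) {
        List<Map.Entry<Long, T>> result = new ArrayList<>();
        long index = 0L;
        for (T element : data) {
            result.add(new AbstractMap.SimpleEntry<>(index++, element));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(zipWithIndex(Arrays.asList("a", "b", "c")));
        // [0=a, 1=b, 2=c]
    }
}
```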









[GitHub] flink pull request: [FLINK-2152] Added zipWithIndex

2015-06-13 Thread andralungu
GitHub user andralungu opened a pull request:

https://github.com/apache/flink/pull/832

[FLINK-2152] Added zipWithIndex 

This PR adds the zipWithIndex utility method to Flink's DataSetUtils as 
described in the mailing list discussion: 
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/The-correct-location-for-zipWithIndex-and-zipWithUniqueId-td6310.html.
 

The method could, in the future, be moved to DataSet. 

@fhueske, @tillrohrmann, once we reach a conclusion for this one, I will 
also update #801 (I wouldn't like to fix unnecessary merge conflicts). 

Once zipWithUniqueIds is added, I could also explain the difference in the 
docs. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/andralungu/flink zipWithIndex

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/832.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #832


commit fdbf0167cc10e952faddc2a7d71e73e7f1f2d03f
Author: andralungu lungu.an...@gmail.com
Date:   2015-06-12T18:37:27Z

[FLINK-2152] zipWithIndex implementation

[FLINK-2152] Added zipWithIndex utility method






[jira] [Commented] (FLINK-1818) Provide API to cancel running job

2015-06-13 Thread niraj rai (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584852#comment-14584852
 ] 

niraj rai commented on FLINK-1818:
--

Hi [~mjsax], yes, I will submit another pull request in a couple of days. 
Please wait. 
Thanks
Niraj






[jira] [Comment Edited] (FLINK-1520) Read edges and vertices from CSV files

2015-06-13 Thread Vasia Kalavri (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584855#comment-14584855
 ] 

Vasia Kalavri edited comment on FLINK-1520 at 6/13/15 10:44 PM:


Hi [~andralungu],

you are right, we need a way to pass the types. Also, field / line delimiters 
and other options like ignore comments etc. that are already provided by 
CsvReader would be nice.
Instead of passing all of these as arguments, maybe we could do something 
similar to what CsvReader does and have e.g. an EdgeCsvReader class with 
fieldDelimiter(), lineDelimiter() etc. and a types() method which will return a 
Graph. What do you think?
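
The builder idea could look roughly like this (an entirely hypothetical 
sketch mirroring CsvReader's fluent style; the real types() would return a 
typed Graph rather than a String, which is used here only to keep the sketch 
self-contained):

```java
public class EdgeCsvReaderSketch {

    // Defaults mirror common CSV conventions; both are overridable fluently.
    private String fieldDelimiter = ",";
    private String lineDelimiter = "\n";

    public EdgeCsvReaderSketch fieldDelimiter(String d) {
        this.fieldDelimiter = d;
        return this;
    }

    public EdgeCsvReaderSketch lineDelimiter(String d) {
        this.lineDelimiter = d;
        return this;
    }

    // In the real proposal this would build and return a Graph<K, VV, EV>;
    // here we just report the configured options.
    public String types(Class<?> keyClass, Class<?> edgeValueClass) {
        return keyClass.getSimpleName() + "/" + edgeValueClass.getSimpleName()
                + " fields:'" + fieldDelimiter + "' lines:'" + lineDelimiter + "'";
    }

    public static void main(String[] args) {
        System.out.println(
                new EdgeCsvReaderSketch().fieldDelimiter("\t").types(Long.class, Double.class));
    }
}
```

The appeal of this shape is that optional settings (delimiters, comment 
prefixes, ...) stay out of the method signature, while the mandatory type 
information arrives last, in types().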


was (Author: vkalavri):
Hi [~andralungu],

you are right, we need a way to pass the types. Also, field / line delimiters 
and other options like ignore comments etc. that are already provided by 
CsvReader would be nice.
Instead of passing all of these as arguments, maybe we could do something 
similar to what CsvReader does and have e.g. an EdgeCsvReader class with 
fieldDelimiter)_, lineDelimiter() etc. and a types() method which will return a 
Graph. What do you think?






[GitHub] flink pull request: [FLINK-2093][gelly] Added difference Method

2015-06-13 Thread vasia
Github user vasia commented on a diff in the pull request:

https://github.com/apache/flink/pull/818#discussion_r32374703
  
--- Diff: docs/libs/gelly_guide.md ---
@@ -240,6 +240,7 @@ Graph<Long, Double, Double> networkWithWeights = 
network.joinWithEdgesOnSource(v
 <img alt="Union Transformation" width="50%" src="fig/gelly-union.png"/>
 </p>
 
+* <strong>Difference</strong>: Gelly's `difference()` method performs a 
difference on the vertex and edge sets of the input graphs. The resultant graph 
is formed by removing the vertices and edges from the graph that are common 
with the second graph.
--- End diff --

we can rephrase this a bit... there is one input graph and no second 
graph... I guess you copied this from the union description above (which should 
also be changed).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---