[GitHub] flink pull request: [FLINK-2178][gelly] Fixed groupReduceOnNeighbo...
Github user andralungu commented on the pull request: https://github.com/apache/flink/pull/799#issuecomment-111761198 Yup, exactly! The use case was: I modified something in the edge data set, called groupReduceOnNeighbors on the result and got NPE, exception that could have been avoided with this check, just like was the case for the Degree NPE. Making sure that the iterator is not null can save some people lots of headaches, IMO :) And it doesn't hurt anyone who has a correct data set. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2178) groupReduceOnNeighbors throws NoSuchElementException
[ https://issues.apache.org/jira/browse/FLINK-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584877#comment-14584877 ] ASF GitHub Bot commented on FLINK-2178: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/799#issuecomment-111760267 Hey @andralungu, I'm not sure I understand this one. It's a coGroup of vertices with the edges. For the vertex iterator to be empty, doesn't it mean that there's an edge with an invalid id? groupReduceOnNeighbors throws NoSuchElementException Key: FLINK-2178 URL: https://issues.apache.org/jira/browse/FLINK-2178 Project: Flink Issue Type: Bug Components: Gelly Affects Versions: 0.9 Reporter: Andra Lungu Assignee: Andra Lungu In the ALL EdgeDirection case, ApplyCoGroupFunctionOnAllNeighbors does not check whether the vertex iterator has elements causing the aforementioned exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
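The fix under discussion boils down to guarding the coGroup's vertex iterator before dereferencing it, so that an edge pointing at a missing vertex id does not surface as a NoSuchElementException deep inside Gelly. A minimal, Flink-free sketch of the pattern (names are illustrative, not the actual Gelly classes):

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class IteratorGuard {
    // Mimics the coGroup body: return the single vertex if present,
    // or null instead of letting next() throw NoSuchElementException.
    static <T> T firstOrNull(Iterable<T> vertices) {
        Iterator<T> it = vertices.iterator();
        return it.hasNext() ? it.next() : null; // the hasNext() guard is the fix
    }

    public static void main(String[] args) {
        List<String> match = List.of("v1");           // vertex found for the edge
        List<String> empty = Collections.emptyList(); // edge with an invalid id
        System.out.println(firstOrNull(match)); // v1
        System.out.println(firstOrNull(empty)); // null, no exception
    }
}
```

Whether the null case should then be skipped silently or reported as an invalid-edge error is exactly the design question raised in the comments above.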
[jira] [Commented] (FLINK-2093) Add a difference method to Gelly's Graph class
[ https://issues.apache.org/jira/browse/FLINK-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584867#comment-14584867 ] ASF GitHub Bot commented on FLINK-2093: --- Github user vasia commented on a diff in the pull request: https://github.com/apache/flink/pull/818#discussion_r32374715 --- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java --- @@ -1234,6 +1234,18 @@ public void coGroup(Iterable<Edge<K, EV>> edge, Iterable<Edge<K, EV>> edgeToBeRe } /** +* Performs difference on the vertex and edge sets of the input graphs; +* removes common vertices and edges. If a source/target vertex is removed, its corresponding edge will also be removed. +* @param graph the graph to perform difference with +* @return a new graph where the common vertices and edges have been removed +*/ + public Graph<K, VV, EV> difference(Graph<K, VV, EV> graph) throws java.lang.Exception { + DataSet<Vertex<K, VV>> removeVerticesData = graph.getVertices(); + final List<Vertex<K, VV>> removeVerticesList = removeVerticesData.collect(); --- End diff -- I don't think we should use `collect()` here. Keep in mind that (1) `collect()` will trigger program execution and (2) it should not be used to collect large DataSets, and the input graph might have lots of vertices. Add a difference method to Gelly's Graph class -- Key: FLINK-2093 URL: https://issues.apache.org/jira/browse/FLINK-2093 Project: Flink Issue Type: New Feature Components: Gelly Affects Versions: 0.9 Reporter: Andra Lungu Assignee: Shivani Ghatge Priority: Minor This method will compute the difference between two graphs, returning a new graph containing the vertices and edges that the current graph and the input graph don't have in common.
[jira] [Commented] (FLINK-2093) Add a difference method to Gelly's Graph class
[ https://issues.apache.org/jira/browse/FLINK-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584866#comment-14584866 ] ASF GitHub Bot commented on FLINK-2093: --- Github user vasia commented on a diff in the pull request: https://github.com/apache/flink/pull/818#discussion_r32374703 --- Diff: docs/libs/gelly_guide.md --- @@ -240,6 +240,7 @@ Graph<Long, Double, Double> networkWithWeights = network.joinWithEdgesOnSource(v <img alt="Union Transformation" width="50%" src="fig/gelly-union.png"/> </p> +* <strong>Difference</strong>: Gelly's `difference()` method performs a difference on the vertex and edge sets of the input graphs. The resultant graph is formed by removing the vertices and edges from the graph that are common with the second graph. --- End diff -- We can rephrase this a bit: there is one input graph and no second graph. I guess you copied from the union description above (which should also be changed).
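For reference, the semantics being specified are: remove from the first graph every vertex that also appears in the second graph, then drop every edge whose source or target was removed. A plain-Java sketch of that logic over in-memory sets (illustrative only; the actual Gelly implementation should express this with distributed joins/coGroups rather than `collect()`):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GraphDifference {
    // Minimal stand-in for a Gelly edge: just source and target ids.
    record Edge(long src, long trg) {}

    // Removes the common vertices, then every edge touching a removed vertex.
    static void difference(Set<Long> vertices, Set<Edge> edges, Set<Long> toRemove) {
        vertices.removeAll(toRemove);
        edges.removeIf(e -> !vertices.contains(e.src()) || !vertices.contains(e.trg()));
    }

    public static void main(String[] args) {
        Set<Long> v = new HashSet<>(List.of(1L, 2L, 3L));
        Set<Edge> e = new HashSet<>(List.of(new Edge(1, 2), new Edge(2, 3)));
        difference(v, e, Set.of(3L)); // vertex 3 is common: it and edge (2,3) go away
        System.out.println(v);
        System.out.println(e);
    }
}
```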
[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files
[ https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584874#comment-14584874 ] Andra Lungu commented on FLINK-1520: Yup, I am not happy with the argument passing, as it may be cumbersome for the user to get what each argument means etc. I thought about this approach, my only concern is that it will introduce a ton of duplicate code. And, in the end, you write (more or less) the same commands, just that instead of getting a DataSet, which you then turn into a graph with fromDataSet, you get a graph directly... If we are okay with code duplication then I would +1 Vasia's solution. Read edges and vertices from CSV files -- Key: FLINK-1520 URL: https://issues.apache.org/jira/browse/FLINK-1520 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Shivani Ghatge Priority: Minor Labels: easyfix, newbie Add methods to create Vertex and Edge Datasets directly from CSV file inputs.
[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files
[ https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584886#comment-14584886 ] Vasia Kalavri commented on FLINK-1520: -- I don't think it'll be a lot of duplicate code. You can have EdgeCsvReader wrap a CsvReader and just call its methods, no?
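Vasia's suggestion is plain delegation: the proposed EdgeCsvReader would hold a CsvReader and forward configuration calls to it, so almost no parsing logic is duplicated. A sketch of the pattern with stand-in types (neither class below is the actual Flink implementation; `CsvReader` here is a toy stand-in for Flink's reader):

```java
public class CsvDelegation {
    // Toy stand-in for Flink's CsvReader (the real class has many more options).
    static class CsvReader {
        String delimiter = ",";
        CsvReader fieldDelimiter(String d) { delimiter = d; return this; }
    }

    // Hypothetical EdgeCsvReader: wraps a CsvReader and forwards each call,
    // returning itself so users keep a fluent, Edge-specific API.
    static class EdgeCsvReader {
        final CsvReader reader;
        EdgeCsvReader(CsvReader r) { reader = r; }
        EdgeCsvReader fieldDelimiter(String d) { reader.fieldDelimiter(d); return this; }
    }

    public static void main(String[] args) {
        EdgeCsvReader r = new EdgeCsvReader(new CsvReader()).fieldDelimiter("|");
        System.out.println(r.reader.delimiter); // |
    }
}
```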
[jira] [Commented] (FLINK-2149) Simplify Gelly Jaccard similarity example
[ https://issues.apache.org/jira/browse/FLINK-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584882#comment-14584882 ] ASF GitHub Bot commented on FLINK-2149: --- Github user vasia commented on the pull request: https://github.com/apache/flink/pull/770#issuecomment-111760796 +1 except the minor comment Simplify Gelly Jaccard similarity example - Key: FLINK-2149 URL: https://issues.apache.org/jira/browse/FLINK-2149 Project: Flink Issue Type: Improvement Components: Gelly Affects Versions: 0.9 Reporter: Vasia Kalavri Assignee: Andra Lungu Priority: Trivial Labels: easyfix, starter The Gelly Jaccard similarity example can be simplified by replacing the groupReduceOnEdges method with the simpler reduceOnEdges.
[GitHub] flink pull request: [FLINK-2149][gelly] Simplified Jaccard Example
Github user vasia commented on a diff in the pull request: https://github.com/apache/flink/pull/770#discussion_r32374939 --- Diff: flink-staging/flink-gelly/src/main/java/org/apache/flink/graph/example/JaccardSimilarityMeasure.java --- @@ -66,34 +63,47 @@ public static void main(String [] args) throws Exception { DataSet<Edge<Long, Double>> edges = getEdgesDataSet(env); - Graph<Long, NullValue, Double> graph = Graph.fromDataSet(edges, env); + Graph<Long, HashSet<Long>, Double> graph = Graph.fromDataSet(edges, + new MapFunction<Long, HashSet<Long>>() { - DataSet<Vertex<Long, HashSet<Long>>> verticesWithNeighbors = - graph.groupReduceOnEdges(new GatherNeighbors(), EdgeDirection.ALL); + @Override + public HashSet<Long> map(Long id) throws Exception { + HashSet<Long> neighbors = new HashSet<Long>(); + neighbors.add(id); - Graph<Long, HashSet<Long>, Double> graphWithVertexValues = Graph.fromDataSet(verticesWithNeighbors, edges, env); + return new HashSet<Long>(neighbors); + } + }, env); - // the edge value will be the Jaccard similarity coefficient (number of common neighbors / all neighbors) - DataSet<Tuple3<Long, Long, Double>> edgesWithJaccardWeight = graphWithVertexValues.getTriplets() - .map(new WeighEdgesMapper()); + // create the set of neighbors + DataSet<Tuple2<Long, HashSet<Long>>> computedNeighbors = + graph.reduceOnNeighbors(new GatherNeighbors(), EdgeDirection.ALL); - DataSet<Edge<Long, Double>> result = graphWithVertexValues.joinWithEdges(edgesWithJaccardWeight, - new MapFunction<Tuple2<Double, Double>, Double>() { + // join with the vertices to update the node values + DataSet<Vertex<Long, HashSet<Long>>> verticesWithNeighbors = + graph.joinWithVertices(computedNeighbors, new MapFunction<Tuple2<HashSet<Long>, HashSet<Long>>, + HashSet<Long>>() { @Override - public Double map(Tuple2<Double, Double> value) throws Exception { - return value.f1; + public HashSet<Long> map(Tuple2<HashSet<Long>, HashSet<Long>> tuple2) throws Exception { + return tuple2.f1; } - }).getEdges(); + }).getVertices(); + + Graph<Long, HashSet<Long>, Double> graphWithVertexValues = Graph.fromDataSet(verticesWithNeighbors, edges, env); --- End diff -- joinWithVertices can give you the Graph directly :)
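In the example above, the edge value being computed is the Jaccard similarity coefficient: the number of common neighbors divided by the number of all neighbors, i.e. intersection size over union size of the two endpoint neighbor sets. That arithmetic, as a standalone Java method:

```java
import java.util.HashSet;
import java.util.Set;

public class Jaccard {
    // |A ∩ B| / |A ∪ B| for two neighbor sets.
    static double jaccard(Set<Long> a, Set<Long> b) {
        Set<Long> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<Long> union = new HashSet<>(a);
        union.addAll(b);
        return union.isEmpty() ? 0.0 : (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // Common neighbors {2, 3}; all neighbors {1, 2, 3, 4} -> 2/4
        System.out.println(jaccard(Set.of(1L, 2L, 3L), Set.of(2L, 3L, 4L))); // 0.5
    }
}
```

Note that the Gelly example adds each vertex's own id to its neighbor set, so the "all neighbors" union includes the two endpoints themselves.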
[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/831#issuecomment-111717366 I had some more remarks, sorry for being so picky. :sweat_smile: Other than that, I think the change looks really good now!
[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/831#discussion_r32371128 --- Diff: flink-staging/flink-table/src/test/scala/org/apache/flink/api/table/typeinfo/RowSerializerTest.scala --- @@ -0,0 +1,70 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.api.table.typeinfo + +import org.apache.flink.api.common.ExecutionConfig +import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation} +import org.apache.flink.api.common.typeutils.{SerializerTestInstance, TypeSerializer} +import org.apache.flink.api.table.Row +import org.junit.Assert._ +import org.junit.Test + +class RowSerializerTest { + + class RowSerializerTestInstance(serializer: TypeSerializer[Row], + testData: Array[Row]) +extends SerializerTestInstance(serializer, classOf[Row], -1, testData: _*) { + +override protected def deepEquals(message: String, should: Row, is: Row): Unit = { + val arity = should.productArity + assertEquals(message, arity, is.productArity) + var index = 0 + while (index < arity) { +val copiedValue: Any = should.productElement(index) +val element: Any = is.productElement(index) +assertEquals(message, element, copiedValue) +index += 1 + } +} + } + + @Test + def testRowSerializer(): Unit = { + +val rowInfo: TypeInformation[Row] = new RowTypeInfo( + Seq(BasicTypeInfo.INT_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO), Seq("id", "name")) + +val row1 = new Row(2) +row1.setField(0, 1) +row1.setField(1, "a") + +val row2 = new Row(2) +row2.setField(0, 2) +row2.setField(1, "hello") + +val testData: Array[Row] = Array(row1, row2) --- End diff -- I think it would be good to also add a row that has null values, since the change actually introduces that.
[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer
[ https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584639#comment-14584639 ] ASF GitHub Bot commented on FLINK-2203: --- Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/831#discussion_r32371128 Add Support for Null-Values in RowSerializer Key: FLINK-2203 URL: https://issues.apache.org/jira/browse/FLINK-2203 Project: Flink Issue Type: Improvement Components: Table API Reporter: Aljoscha Krettek Assignee: Shiti Saxena Priority: Minor Labels: Starter This would be a start towards proper handling of null values. We would still need to add support for null values in aggregations.
[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/831#discussion_r32371124 --- Diff: flink-staging/flink-table/src/main/scala/org/apache/flink/api/table/typeinfo/RowSerializer.scala --- @@ -102,20 +119,39 @@ class RowSerializer(fieldSerializers: Array[TypeSerializer[Any]]) val len = fieldSerializers.length val result = new Row(len) -var i = 0 -while (i < len) { - result.setField(i, fieldSerializers(i).deserialize(source)) - i += 1 + +var index = 0 +while (index < len) { + val isNull: Boolean = source.readBoolean() + if (isNull) { +result.setField(index, null) + } else { +val serializer: TypeSerializer[Any] = fieldSerializers(index) +result.setField(index, serializer.deserialize(source)) + } + index += 1 } result } + private final val booleanSerializer = new BooleanSerializer() + override def copy(source: DataInputView, target: DataOutputView): Unit = { val len = fieldSerializers.length var i = 0 while (i < len) { + booleanSerializer.copy(source, target) --- End diff -- I think it would be easier to do ``` target.writeBoolean(source.readBoolean()) ``` here, instead of going through the extra abstraction of the BooleanSerializer.
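The scheme under review prefixes every field with a boolean null flag, so `copy()` can simply forward the flag (and the value, when present) as aljoscha suggests. A minimal Java sketch of that wire format using plain `java.io` streams (not Flink's DataInputView/DataOutputView, and integer-only fields to keep it short):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

public class NullFlagSerde {
    // Write: one boolean flag per field, then the value only when non-null.
    static byte[] write(Integer[] fields) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (Integer f : fields) {
            out.writeBoolean(f == null);
            if (f != null) out.writeInt(f);
        }
        return bytes.toByteArray();
    }

    // Read: check the flag first, mirroring the deserialize logic above.
    static Integer[] read(byte[] data, int arity) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        Integer[] fields = new Integer[arity];
        for (int i = 0; i < arity; i++) {
            boolean isNull = in.readBoolean();
            fields[i] = isNull ? null : in.readInt();
        }
        return fields;
    }

    public static void main(String[] args) throws IOException {
        Integer[] row = {42, null, 7};
        System.out.println(Arrays.toString(read(write(row), 3))); // [42, null, 7]
    }
}
```

Since read and write agree on the flag-then-value layout, a field-by-field copy is just `target.writeBoolean(source.readBoolean())` followed by copying the value when the flag says non-null.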
[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...
Github user Shiti commented on the pull request: https://github.com/apache/flink/pull/831#issuecomment-111709239 @aljoscha I have updated the RowSerializerTest to use SerializerTestBase and reverted to using while loops in RowSerializer
[jira] [Commented] (FLINK-2208) Build error for Java IBM
[ https://issues.apache.org/jira/browse/FLINK-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584610#comment-14584610 ] Johannes Günther commented on FLINK-2208: - My first guess is that the IBM JDK is not supported officially and therefore not tested. Also, the automatic build process on Travis does not seem to support it. http://docs.travis-ci.com/user/languages/java/#Testing-Against-Multiple-JDKs That being said, the com.sun.management packages are internal APIs of the Sun (Oracle) JDK that also seem to be used by OpenJDK. A fix is to use java.lang.management.OperatingSystemMXBean instead; com.sun.* packages are not meant to be used outside the Sun JDK or in application programs. Build error for Java IBM Key: FLINK-2208 URL: https://issues.apache.org/jira/browse/FLINK-2208 Project: Flink Issue Type: Bug Components: Build System Affects Versions: 0.9 Reporter: Felix Neutatz Priority: Minor Using IBM Java 7 will break the build: {code:xml} [INFO] --- scala-maven-plugin:3.1.4:compile (scala-compile-first) @ flink-runtime --- [INFO] /share/flink/flink-0.9-SNAPSHOT-wo-Yarn/flink-runtime/src/main/java:-1: info: compiling [INFO] /share/flink/flink-0.9-SNAPSHOT-wo-Yarn/flink-runtime/src/main/scala:-1: info: compiling [INFO] Compiling 461 source files to /share/flink/flink-0.9-SNAPSHOT-wo-Yarn/flink-runtime/target/classes at 1434059956648 [ERROR] /share/flink/flink-0.9-SNAPSHOT-wo-Yarn/flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala:1768: error: type OperatingSystemMXBean is not a member of package com.sun.management [ERROR] asInstanceOf[com.sun.management.OperatingSystemMXBean]).
[ERROR] ^ [ERROR] /share/flink/flink-0.9-SNAPSHOT-wo-Yarn/flink-runtime/src/main/scala/org/apache/flink/runtime/taskmanager/TaskManager.scala:1787: error: type OperatingSystemMXBean is not a member of package com.sun.management [ERROR] val methodsList = classOf[com.sun.management.OperatingSystemMXBean].getMethods() [ERROR] ^ [ERROR] two errors found [INFO] [INFO] Reactor Summary: [INFO] [INFO] flink .. SUCCESS [ 14.447 s] [INFO] flink-shaded-hadoop SUCCESS [ 2.548 s] [INFO] flink-shaded-include-yarn .. SUCCESS [ 36.122 s] [INFO] flink-shaded-include-yarn-tests SUCCESS [ 36.980 s] [INFO] flink-core . SUCCESS [ 21.887 s] [INFO] flink-java . SUCCESS [ 16.023 s] [INFO] flink-runtime .. FAILURE [ 20.241 s] [INFO] flink-optimizer SKIPPED [hadoop@ibm-power-1 /]$ java -version java version 1.7.0 Java(TM) SE Runtime Environment (build pxp6470_27sr1fp1-20140708_01(SR1 FP1)) IBM J9 VM (build 2.7, JRE 1.7.0 Linux ppc64-64 Compressed References 20140707_205525 (JIT enabled, AOT enabled) J9VM - R27_Java727_SR1_20140707_1408_B205525 JIT - tr.r13.java_20140410_61421.07 GC - R27_Java727_SR1_20140707_1408_B205525_CMPRSS J9CL - 20140707_205525) JCL - 20140707_01 based on Oracle 7u65-b16 {code}
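The portable alternative named in the comment, java.lang.management.OperatingSystemMXBean, is part of the standard platform API on every compliant JVM, IBM J9 included; only the extended metrics (process CPU time, physical memory, etc.) live in the com.sun.management subtype that HotSpot-family JDKs provide. Minimal usage of the portable interface:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class OsInfo {
    public static void main(String[] args) {
        // Standard interface: available on any compliant JVM, not just HotSpot.
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        System.out.println(os.getName() + " / " + os.getArch()
                + " / " + os.getAvailableProcessors() + " cpus"
                + " / load " + os.getSystemLoadAverage());
    }
}
```

Code that still wants the extended HotSpot metrics can probe for the com.sun.management subtype reflectively and degrade gracefully when it is absent.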
[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer
[ https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584480#comment-14584480 ] ASF GitHub Bot commented on FLINK-2203: --- Github user Shiti commented on the pull request: https://github.com/apache/flink/pull/831#issuecomment-111682105 @hsaputra I have updated the description
[jira] [Assigned] (FLINK-2152) Provide zipWithIndex utility in flink-contrib
[ https://issues.apache.org/jira/browse/FLINK-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andra Lungu reassigned FLINK-2152: -- Assignee: Andra Lungu Provide zipWithIndex utility in flink-contrib - Key: FLINK-2152 URL: https://issues.apache.org/jira/browse/FLINK-2152 Project: Flink Issue Type: Improvement Components: Java API Reporter: Robert Metzger Assignee: Andra Lungu Priority: Trivial Labels: starter We should provide a simple utility method for zipping elements in a data set with a dense index. It's up for discussion whether we want it directly in the API or if we should provide it only as a utility from {{flink-contrib}}. I would put it in {{flink-contrib}}. See my answer on SO: http://stackoverflow.com/questions/30596556/zipwithindex-on-apache-flink
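The usual distributed zipWithIndex scheme (as described in the linked Stack Overflow answer) takes two passes: first count the elements in each parallel partition, then give each partition a starting offset equal to the element counts of all partitions before it, so indices come out dense and unique without shuffling data. The offset arithmetic of the second pass, sketched over plain arrays standing in for partitions:

```java
import java.util.Arrays;

public class ZipWithIndex {
    // Phase 2 of the two-pass scheme: given per-partition element counts,
    // partition p starts numbering at sum(counts[0..p-1]).
    static long[] offsets(int[] countsPerPartition) {
        long[] offsets = new long[countsPerPartition.length];
        long running = 0;
        for (int p = 0; p < countsPerPartition.length; p++) {
            offsets[p] = running;
            running += countsPerPartition[p];
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Partitions with 3, 1, and 2 elements start at indices 0, 3, and 4.
        System.out.println(Arrays.toString(offsets(new int[]{3, 1, 2}))); // [0, 3, 4]
    }
}
```

Each partition then emits `(offset + localPosition, element)` pairs independently in the second pass.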
[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer
[ https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584671#comment-14584671 ] ASF GitHub Bot commented on FLINK-2203: --- Github user Shiti commented on the pull request: https://github.com/apache/flink/pull/831#issuecomment-111723371 @aljoscha I have made the suggested changes
[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/831#discussion_r32369782 --- Diff: flink-staging/flink-table/src/main/scala/org/apache/flink/api/table/typeinfo/RowSerializer.scala ---
@@ -89,11 +99,16 @@ class RowSerializer(fieldSerializers: Array[TypeSerializer[Any]])
       throw new RuntimeException("Row arity of reuse and fields do not match.")
     }
-    var i = 0
-    while (i < len) {
-      val field = reuse.productElement(i).asInstanceOf[AnyRef]
-      reuse.setField(i, fieldSerializers(i).deserialize(field, source))
-      i += 1
+    (0 to len - 1).foreach { index =>
+      val isNull: Boolean = source.readBoolean()
+      if (isNull) {
+        reuse.setField(index, null)
+      } else {
+        val field = reuse.productElement(index).asInstanceOf[AnyRef]
+        val serializer: TypeSerializer[Any] = fieldSerializers(index)
+        reuse.setField(index, serializer.deserialize(field, source))
+      }
--- End diff -- See above comment about the while loop.
[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer
[ https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584568#comment-14584568 ] ASF GitHub Bot commented on FLINK-2203: --- Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/831#discussion_r32369782 --- Diff: flink-staging/flink-table/src/main/scala/org/apache/flink/api/table/typeinfo/RowSerializer.scala ---
@@ -89,11 +99,16 @@ class RowSerializer(fieldSerializers: Array[TypeSerializer[Any]])
       throw new RuntimeException("Row arity of reuse and fields do not match.")
     }
-    var i = 0
-    while (i < len) {
-      val field = reuse.productElement(i).asInstanceOf[AnyRef]
-      reuse.setField(i, fieldSerializers(i).deserialize(field, source))
-      i += 1
+    (0 to len - 1).foreach { index =>
+      val isNull: Boolean = source.readBoolean()
+      if (isNull) {
+        reuse.setField(index, null)
+      } else {
+        val field = reuse.productElement(index).asInstanceOf[AnyRef]
+        val serializer: TypeSerializer[Any] = fieldSerializers(index)
+        reuse.setField(index, serializer.deserialize(field, source))
+      }
--- End diff -- See above comment about the while loop. Add Support for Null-Values in RowSerializer Key: FLINK-2203 URL: https://issues.apache.org/jira/browse/FLINK-2203 Project: Flink Issue Type: Improvement Components: Table API Reporter: Aljoscha Krettek Assignee: Shiti Saxena Priority: Minor Labels: Starter This would be a start towards proper handling of null values. We would still need to add support for null values in aggregations.
[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer
[ https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584567#comment-14584567 ] ASF GitHub Bot commented on FLINK-2203: --- Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/831#discussion_r32369781 --- Diff: flink-staging/flink-table/src/main/scala/org/apache/flink/api/table/typeinfo/RowSerializer.scala ---
@@ -74,11 +79,16 @@ class RowSerializer(fieldSerializers: Array[TypeSerializer[Any]])
   override def serialize(value: Row, target: DataOutputView) {
     val len = fieldSerializers.length
-    var i = 0
-    while (i < len) {
-      val serializer = fieldSerializers(i)
-      serializer.serialize(value.productElement(i), target)
-      i += 1
+    (0 to len - 1).foreach { index =>
+      val o: AnyRef = value.productElement(index).asInstanceOf[AnyRef]
+      if (o == null) {
+        target.writeBoolean(true)
+      } else {
+        target.writeBoolean(false)
+        val serializer = fieldSerializers(index)
+        serializer.serialize(value.productElement(index), target)
+      }
--- End diff -- Could you please change this back to the simple while loop? I know that using fancy Scala features is tempting, but a simple while loop should perform better than creating a range and iterating over it with foreach. Add Support for Null-Values in RowSerializer Key: FLINK-2203 URL: https://issues.apache.org/jira/browse/FLINK-2203 Project: Flink Issue Type: Improvement Components: Table API Reporter: Aljoscha Krettek Assignee: Shiti Saxena Priority: Minor Labels: Starter This would be a start towards proper handling of null values. We would still need to add support for null values in aggregations.
[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...
Github user aljoscha commented on a diff in the pull request: https://github.com/apache/flink/pull/831#discussion_r32369781 --- Diff: flink-staging/flink-table/src/main/scala/org/apache/flink/api/table/typeinfo/RowSerializer.scala ---
@@ -74,11 +79,16 @@ class RowSerializer(fieldSerializers: Array[TypeSerializer[Any]])
   override def serialize(value: Row, target: DataOutputView) {
     val len = fieldSerializers.length
-    var i = 0
-    while (i < len) {
-      val serializer = fieldSerializers(i)
-      serializer.serialize(value.productElement(i), target)
-      i += 1
+    (0 to len - 1).foreach { index =>
+      val o: AnyRef = value.productElement(index).asInstanceOf[AnyRef]
+      if (o == null) {
+        target.writeBoolean(true)
+      } else {
+        target.writeBoolean(false)
+        val serializer = fieldSerializers(index)
+        serializer.serialize(value.productElement(index), target)
+      }
--- End diff -- Could you please change this back to the simple while loop? I know that using fancy Scala features is tempting, but a simple while loop should perform better than creating a range and iterating over it with foreach.
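For readers following the thread, the null-flag scheme under review can be sketched outside Flink. This is only an illustration: it uses plain java.io streams in place of Flink's DataOutputView/DataInputView, the while loop the reviewer asked for, and writeUTF/readUTF as a stand-in for the per-field serializers; the class name NullAwareRowSerde is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

public class NullAwareRowSerde {

    // Write a boolean null-flag before each field; only non-null fields
    // are followed by a value. Simple while loop, as requested in review.
    static void serialize(Object[] row, DataOutputStream target) throws IOException {
        int len = row.length;
        int i = 0;
        while (i < len) {
            Object o = row[i];
            if (o == null) {
                target.writeBoolean(true);   // field is null, nothing else written
            } else {
                target.writeBoolean(false);  // field present, value follows
                target.writeUTF((String) o); // stand-in for fieldSerializers(i).serialize(...)
            }
            i += 1;
        }
    }

    // Read the null-flag first; only read a value when the flag says non-null.
    static Object[] deserialize(int arity, DataInputStream source) throws IOException {
        Object[] row = new Object[arity];
        int i = 0;
        while (i < arity) {
            boolean isNull = source.readBoolean();
            row[i] = isNull ? null : source.readUTF();
            i += 1;
        }
        return row;
    }

    public static void main(String[] args) throws IOException {
        Object[] row = {"a", null, "c"};
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        serialize(row, new DataOutputStream(bytes));
        Object[] copy = deserialize(3,
            new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        System.out.println(Arrays.toString(copy)); // [a, null, c]
    }
}
```

The round trip shows why the flag must be written unconditionally: the deserializer has no other way to know whether a value follows.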
[jira] [Commented] (FLINK-1520) Read edges and vertices from CSV files
[ https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584586#comment-14584586 ] Andra Lungu commented on FLINK-1520: Hey [~vkalavri], To my knowledge, you cannot deduce the key or the value's class from the generic K, VV, EV. The way I would implement fromCsv is by adding the classes as parameters, e.g. Graph.fromCsv(edgesPath, String.class, String.class, context). For NullValue, then, we would have a single class argument: Graph.fromCsv(edgesPath, String.class, context). The user should know what kind of keys he/she has in there, so the extra parameters should not be that much of a burden. Is this what you had in mind? For the time being, I cannot see a smarter way of doing it :) The examples should be updated accordingly since they now read the edge and vertex data sets from CSV and then use fromDataSet to produce the graph. Read edges and vertices from CSV files -- Key: FLINK-1520 URL: https://issues.apache.org/jira/browse/FLINK-1520 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Shivani Ghatge Priority: Minor Labels: easyfix, newbie Add methods to create Vertex and Edge Datasets directly from CSV file inputs.
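The point about needing Class arguments comes down to Java's type erasure: the generic parameters K, VV, EV are gone at runtime, so the caller must supply the types explicitly. A minimal stand-alone sketch of the idea, with hypothetical names (EdgeCsvSketch, parseEdges) and parser functions standing in for the Class arguments a real fromCsv would take:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class EdgeCsvSketch {

    // Minimal edge record; Gelly's real Edge<K, EV> type is not used here.
    static class Edge<K, EV> {
        final K source; final K target; final EV value;
        Edge(K source, K target, EV value) {
            this.source = source; this.target = target; this.value = value;
        }
    }

    // Type erasure means K and EV cannot be recovered from the generics at
    // runtime, so the caller supplies the parsing logic explicitly.
    static <K, EV> List<Edge<K, EV>> parseEdges(List<String> lines,
                                                Function<String, K> keyParser,
                                                Function<String, EV> valueParser) {
        List<Edge<K, EV>> edges = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            edges.add(new Edge<>(keyParser.apply(parts[0]),
                                 keyParser.apply(parts[1]),
                                 valueParser.apply(parts[2])));
        }
        return edges;
    }

    public static void main(String[] args) {
        List<Edge<Long, Double>> edges = parseEdges(
            List.of("1,2,0.5", "2,3,1.0"), Long::parseLong, Double::parseDouble);
        System.out.println(edges.size()); // 2
    }
}
```

Passing Class objects, as proposed for Graph.fromCsv, is the same trick in a different shape: it carries the erased type information as an ordinary runtime value.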
[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/831#issuecomment-111701489 The changes look good except for the comments I had about the loops. For the tests, did you try doing it as in TraversableSerializerTest.scala? There, we override deepEquals(), and the runTests() method can be modified to take the RowTypeInfo that was explicitly created. The reason I would like it to derive from SerializerTestBase, as all the other serializer tests do, is that if we adapt the tests there, the tests for RowSerializer have to change as well, and this would likely be forgotten by future contributors.
[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer
[ https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584572#comment-14584572 ] ASF GitHub Bot commented on FLINK-2203: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/831#issuecomment-111701489 The changes look good except for the comments I had about the loops. For the tests, did you try doing it as in TraversableSerializerTest.scala? There, we override deepEquals(), and the runTests() method can be modified to take the RowTypeInfo that was explicitly created. The reason I would like it to derive from SerializerTestBase, as all the other serializer tests do, is that if we adapt the tests there, the tests for RowSerializer have to change as well, and this would likely be forgotten by future contributors. Add Support for Null-Values in RowSerializer Key: FLINK-2203 URL: https://issues.apache.org/jira/browse/FLINK-2203 Project: Flink Issue Type: Improvement Components: Table API Reporter: Aljoscha Krettek Assignee: Shiti Saxena Priority: Minor Labels: Starter This would be a start towards proper handling of null values. We would still need to add support for null values in aggregations.
[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer
[ https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584782#comment-14584782 ] ASF GitHub Bot commented on FLINK-2203: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/831 Add Support for Null-Values in RowSerializer Key: FLINK-2203 URL: https://issues.apache.org/jira/browse/FLINK-2203 Project: Flink Issue Type: Improvement Components: Table API Reporter: Aljoscha Krettek Assignee: Shiti Saxena Priority: Minor Labels: Starter This would be a start towards proper handling of null values. We would still need to add support for null values in aggregations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/831
[GitHub] flink pull request: [FLINK-2203] handling null values for RowSeria...
Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/831#issuecomment-111741260 Thanks, nice work. :+1:
[jira] [Commented] (FLINK-2203) Add Support for Null-Values in RowSerializer
[ https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584783#comment-14584783 ] ASF GitHub Bot commented on FLINK-2203: --- Github user aljoscha commented on the pull request: https://github.com/apache/flink/pull/831#issuecomment-111741260 Thanks, nice work. :+1: Add Support for Null-Values in RowSerializer Key: FLINK-2203 URL: https://issues.apache.org/jira/browse/FLINK-2203 Project: Flink Issue Type: Improvement Components: Table API Reporter: Aljoscha Krettek Assignee: Shiti Saxena Priority: Minor Labels: Starter This would be a start towards proper handling of null values. We would still need to add support for null values in aggregations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-2203) Add Support for Null-Values in RowSerializer
[ https://issues.apache.org/jira/browse/FLINK-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek closed FLINK-2203. --- Resolution: Fixed Resolved in https://github.com/apache/flink/commit/f8e12b20d925c3f6f24769327d1da5d98affa679 Add Support for Null-Values in RowSerializer Key: FLINK-2203 URL: https://issues.apache.org/jira/browse/FLINK-2203 Project: Flink Issue Type: Improvement Components: Table API Reporter: Aljoscha Krettek Assignee: Shiti Saxena Priority: Minor Labels: Starter This would be a start towards proper handling of null values. We would still need to add support for null values in aggregations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-2175) Allow multiple jobs in single jar file
[ https://issues.apache.org/jira/browse/FLINK-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584835#comment-14584835 ] ASF GitHub Bot commented on FLINK-2175: --- Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/707#issuecomment-111755732 Any news on this PR? Allow multiple jobs in single jar file -- Key: FLINK-2175 URL: https://issues.apache.org/jira/browse/FLINK-2175 Project: Flink Issue Type: Improvement Components: Examples, other, Webfrontend Reporter: Matthias J. Sax Assignee: Matthias J. Sax Priority: Minor Allow to package multiple jobs into a single jar. - extend WebClient to display all available jobs - extend WebClient to diplay plan and submit each job -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request: [FLINK-2175] Allow multiple jobs in single jar...
Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/707#issuecomment-111755732 Any news on this PR?
[GitHub] flink pull request: [FLINK-1818] Added api to cancel job from clie...
Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/642#issuecomment-111755816 Any news on this PR? I think it is a nice feature. @rainiraj, do you still work on this?
[GitHub] flink pull request: Flink Storm compatibility
Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/573#issuecomment-111755768 Any news on this PR?
[jira] [Commented] (FLINK-1818) Provide API to cancel running job
[ https://issues.apache.org/jira/browse/FLINK-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584836#comment-14584836 ] ASF GitHub Bot commented on FLINK-1818: --- Github user mjsax commented on the pull request: https://github.com/apache/flink/pull/642#issuecomment-111755816 Any news on this PR? I think it is a nice feature. @rainiraj, do you still work on this? Provide API to cancel running job - Key: FLINK-1818 URL: https://issues.apache.org/jira/browse/FLINK-1818 Project: Flink Issue Type: Improvement Components: Java API Affects Versions: 0.9 Reporter: Robert Metzger Assignee: niraj rai Labels: starter http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Canceling-a-Cluster-Job-from-Java-td4897.html
[jira] [Commented] (FLINK-2152) Provide zipWithIndex utility in flink-contrib
[ https://issues.apache.org/jira/browse/FLINK-2152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584840#comment-14584840 ] ASF GitHub Bot commented on FLINK-2152: --- GitHub user andralungu opened a pull request: https://github.com/apache/flink/pull/832 [FLINK-2152] Added zipWithIndex This PR adds the zipWithIndex utility method to Flink's DataSetUtils as described in the mailing list discussion: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/The-correct-location-for-zipWithIndex-and-zipWithUniqueId-td6310.html. The method could, in the future, be moved to DataSet. @fhueske , @tillrohrmann , once we reach a conclusion for this one, I will also update #801 (I wouldn't like to fix unnecessary merge conflicts). Once zipWithUniqueIds is added, I could also explain the difference in the docs. You can merge this pull request into a Git repository by running: $ git pull https://github.com/andralungu/flink zipWithIndex Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/832.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #832 commit fdbf0167cc10e952faddc2a7d71e73e7f1f2d03f Author: andralungu lungu.an...@gmail.com Date: 2015-06-12T18:37:27Z [FLINK-2152] zipWithIndex implementation [FLINK-2152] Added zipWithIndex utility method Provide zipWithIndex utility in flink-contrib - Key: FLINK-2152 URL: https://issues.apache.org/jira/browse/FLINK-2152 Project: Flink Issue Type: Improvement Components: Java API Reporter: Robert Metzger Assignee: Andra Lungu Priority: Trivial Labels: starter We should provide a simple utility method for zipping elements in a data set with a dense index. It's up for discussion whether we want it directly in the API or if we should provide it only as a utility from {{flink-contrib}}. I would put it in {{flink-contrib}}.
See my answer on SO: http://stackoverflow.com/questions/30596556/zipwithindex-on-apache-flink
[GitHub] flink pull request: [FLINK-2152] Added zipWithIndex
GitHub user andralungu opened a pull request: https://github.com/apache/flink/pull/832 [FLINK-2152] Added zipWithIndex This PR adds the zipWithIndex utility method to Flink's DataSetUtils as described in the mailing list discussion: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/The-correct-location-for-zipWithIndex-and-zipWithUniqueId-td6310.html. The method could, in the future, be moved to DataSet. @fhueske , @tillrohrmann , once we reach a conclusion for this one, I will also update #801 (I wouldn't like to fix unnecessary merge conflicts). Once zipWithUniqueIds is added, I could also explain the difference in the docs. You can merge this pull request into a Git repository by running: $ git pull https://github.com/andralungu/flink zipWithIndex Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/832.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #832 commit fdbf0167cc10e952faddc2a7d71e73e7f1f2d03f Author: andralungu lungu.an...@gmail.com Date: 2015-06-12T18:37:27Z [FLINK-2152] zipWithIndex implementation [FLINK-2152] Added zipWithIndex utility method
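For readers of the thread, the dense-index scheme behind a zipWithIndex utility can be sketched without Flink: count the elements of each partition, turn the counts into per-partition offsets, then number each partition's elements starting from its offset. Plain lists stand in for partitions here; the names are hypothetical and this is not the PR's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ZipWithIndexSketch {

    // Two-phase dense indexing over "partitions" (here plain sub-lists):
    // 1) count the elements of each partition,
    // 2) give partition p the offset = sum of counts of partitions 0..p-1,
    // 3) number elements within each partition starting from its offset.
    static <T> List<Map.Entry<Long, T>> zipWithIndex(List<List<T>> partitions) {
        long[] offsets = new long[partitions.size()];
        long running = 0;
        for (int p = 0; p < partitions.size(); p++) { // phases 1 and 2: counts -> offsets
            offsets[p] = running;
            running += partitions.get(p).size();
        }
        List<Map.Entry<Long, T>> out = new ArrayList<>();
        for (int p = 0; p < partitions.size(); p++) { // phase 3: assign dense indices
            long index = offsets[p];
            for (T element : partitions.get(p)) {
                out.add(Map.entry(index++, element));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> partitions =
            List.of(List.of("a", "b"), List.of("c"), List.of("d", "e"));
        System.out.println(zipWithIndex(partitions)); // [0=a, 1=b, 2=c, 3=d, 4=e]
    }
}
```

In a distributed setting the counting pass is what requires the extra job phase; the indices are dense precisely because every partition knows how many elements precede it.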
[jira] [Commented] (FLINK-1818) Provide API to cancel running job
[ https://issues.apache.org/jira/browse/FLINK-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584852#comment-14584852 ] niraj rai commented on FLINK-1818: -- Hi [~mjsax] Yes, I will submit another pull request in couple of days. Please wait. Thanks Niraj Provide API to cancel running job - Key: FLINK-1818 URL: https://issues.apache.org/jira/browse/FLINK-1818 Project: Flink Issue Type: Improvement Components: Java API Affects Versions: 0.9 Reporter: Robert Metzger Assignee: niraj rai Labels: starter http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Canceling-a-Cluster-Job-from-Java-td4897.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (FLINK-1520) Read edges and vertices from CSV files
[ https://issues.apache.org/jira/browse/FLINK-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584855#comment-14584855 ] Vasia Kalavri edited comment on FLINK-1520 at 6/13/15 10:44 PM: Hi [~andralungu], you are right, we need a way to pass the types. Also, field / line delimiters and other options like ignore comments etc. that are already provided by CsvReader would be nice. Instead of passing all of these as arguments, maybe we could do something similar to what CsvReader does and have e.g. an EdgeCsvReader class with fieldDelimiter(), lineDelimiter() etc. and a types() method which will return a Graph. What do you think? was (Author: vkalavri): Hi [~andralungu], you are right, we need a way to pass the types. Also, field / line delimiters and other options like ignore comments etc. that are already provided by CsvReader would be nice. Instead of passing all of these as arguments, maybe we could do something similar to what CsvReader does and have e.g. an EdgeCsvReader class with fieldDelimiter)_, lineDelimiter() etc. and a types() method which will return a Graph. What do you think? Read edges and vertices from CSV files -- Key: FLINK-1520 URL: https://issues.apache.org/jira/browse/FLINK-1520 Project: Flink Issue Type: New Feature Components: Gelly Reporter: Vasia Kalavri Assignee: Shivani Ghatge Priority: Minor Labels: easyfix, newbie Add methods to create Vertex and Edge Datasets directly from CSV file inputs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
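The EdgeCsvReader idea above can be pictured as a small fluent builder: configure delimiters first, then fix the types at the end with types(...). The sketch below is hypothetical (none of these names are Flink API) and it uses parser functions where the proposed reader would take type arguments; Pattern.quote guards against delimiters that are regex metacharacters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Pattern;

public class EdgeCsvReaderSketch {
    private String fieldDelimiter = ",";
    private String lineDelimiter = "\n";
    private final String input;

    EdgeCsvReaderSketch(String input) { this.input = input; }

    // Fluent configuration, mirroring CsvReader's fieldDelimiter()/lineDelimiter().
    EdgeCsvReaderSketch fieldDelimiter(String d) { this.fieldDelimiter = d; return this; }
    EdgeCsvReaderSketch lineDelimiter(String d) { this.lineDelimiter = d; return this; }

    // types(...) is where a real implementation would hand the type information
    // to Flink; here we simply parse each line into a {source, target, value} triple.
    <K, EV> List<Object[]> types(Function<String, K> key, Function<String, EV> value) {
        List<Object[]> edges = new ArrayList<>();
        for (String line : input.split(Pattern.quote(lineDelimiter))) {
            String[] f = line.split(Pattern.quote(fieldDelimiter));
            edges.add(new Object[]{key.apply(f[0]), key.apply(f[1]), value.apply(f[2])});
        }
        return edges;
    }

    public static void main(String[] args) {
        List<Object[]> edges = new EdgeCsvReaderSketch("1;2;0.5|2;3;1.0")
            .fieldDelimiter(";").lineDelimiter("|")
            .types(Long::parseLong, Double::parseDouble);
        System.out.println(edges.size()); // 2
    }
}
```

The builder keeps the type-fixing call last, so all the optional settings (delimiters, and later things like comment handling) stay open until the caller commits to concrete types.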
[GitHub] flink pull request: [FLINK-2093][gelly] Added difference Method
Github user vasia commented on a diff in the pull request: https://github.com/apache/flink/pull/818#discussion_r32374703 --- Diff: docs/libs/gelly_guide.md --- @@ -240,6 +240,7 @@ GraphLong, Double, Double networkWithWeights = network.joinWithEdgesOnSource(v img alt=Union Transformation width=50% src=fig/gelly-union.png/ /p +* strongDifference/strong: Gelly's `difference()` method performs a difference on the vertex and edge sets of the input graphs. The resultant graph is formed by removing the vertices and edges from the graph that are common with the second graph. --- End diff -- we can rephrase this a bit.. there is one input graph and no second graph... I guess you copied from the union description above (which should also be changed). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---