Github user fobeligi commented on a diff in the pull request:
https://github.com/apache/flink/pull/2178#discussion_r68848469
--- Diff:
flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
@@ -408,6 +408,79 @@ public static GraphCsvReader fromCsvReader(String
edgesPath, ExecutionEnvironmen
}
/**
+ * Creates a graph from an Adjacency List text file with Vertex Key
values. Edges will be created automatically.
+ *
+ * @param filePath a path to an Adjacency List text file with the
Vertex data
+ * @param context the execution environment.
+ * @return An instance of {@link
org.apache.flink.graph.GraphAdjacencyListReader},
+ * on which calling methods to specify types of the Vertex ID, Vertex
value and Edge value returns a Graph.
+ */
+ public static GraphAdjacencyListReader fromAdjacencyListFile(String
filePath, ExecutionEnvironment context) {
+ return new GraphAdjacencyListReader(filePath, context);
+ }
+
+ /**
+ * Writes a graph as an Adjacency List formatted text file in a user
specified folder.
+ *
+ * @param filePath the path that the Adjacency List formatted text
file should be written in
+ * @param delimiters the delimiters that separate the different value
types in the Adjacency List formatted text
+ * file. Delimiters should be provided in the
following order:
+ * NEIGHBOR_DELIMITER : separating source from its
neighbors
+ * VERTICES_DELIMITER : separating the different
neighbors of a source vertex
+ * VERTEX_VALUE_DELIMITER: separating the source
vertex-id from the vertex value, as well as the
+ * target vertex-ids from the edge value.
+ */
+ public void writeAsAdjacencyList(String filePath, String... delimiters)
{
+
+ final String NEIGHBOR_DELIMITER = delimiters.length > 0 ?
delimiters[0] : "\t";
+
+ final String VERTICES_DELIMITER = delimiters.length > 1 ?
delimiters[1] : ",";
+
+ final String VERTEX_VALUE_DELIMITER = delimiters.length > 2 ?
delimiters[2] : "-";
+
+
+ DataSet<Tuple2<K, VV>> vertices = this.getVerticesAsTuple2();
+
+ DataSet<Tuple3<K, K, EV>> edgesNValues =
this.getEdgesAsTuple3();
--- End diff --
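To make the delimiter roles in the Javadoc above concrete, here is a minimal
plain-Java sketch of how a single output line might be assembled. The layout
(source id and value, then the neighbor list) is only my reading of that
Javadoc, not confirmed output of the PR, and `AdjacencyLineSketch` and
`formatLine` are hypothetical names that do not exist in Gelly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AdjacencyLineSketch {

    // Hypothetical helper, not part of Gelly: builds one adjacency-list line.
    // Each neighbor is a {targetId, edgeValue} pair.
    // Assumption: a source without neighbors is written as "id<valueDelim>value" only.
    static String formatLine(String srcId, String srcValue, List<String[]> neighbors,
                             String neighborDelim, String verticesDelim, String valueDelim) {
        StringBuilder line = new StringBuilder();
        line.append(srcId).append(valueDelim).append(srcValue);
        for (int i = 0; i < neighbors.size(); i++) {
            // The first neighbor is separated from the source, later ones from each other.
            line.append(i == 0 ? neighborDelim : verticesDelim);
            line.append(neighbors.get(i)[0]).append(valueDelim).append(neighbors.get(i)[1]);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        List<String[]> neighbors = Arrays.asList(
                new String[]{"2", "x"}, new String[]{"3", "y"});
        // Uses the defaults from the diff: "\t", "," and "-".
        System.out.println(formatLine("1", "a", neighbors, "\t", ",", "-"));
        // A vertex with only incoming edges has an empty neighbor list.
        System.out.println(formatLine("3", "c", new ArrayList<String[]>(), "\t", ",", "-"));
    }
}
```

With the default delimiters, the first call yields the source `1-a`, a tab,
and then `2-x,3-y`.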
As I see it now, we don't have to convert the vertex set to a Tuple2 set, so
I have already changed that.
Regarding the edges dataset: to write the Adjacency List file, I apply a
coGroup transformation to the Vertex dataset and the EdgesAsTuple3 dataset,
matching records where the vertexId equals the source of the edge.
That way, even when a Vertex is the source of no edges (e.g. it has only
incoming edges), its vertexId still appears in the "coGrouped" dataset (I
couldn't get that with a join, which drops vertices that have no match).
I can't see how I could use the Edge dataset directly in a coGroup or a
similar transformation.
Please let me know if you have any suggestions.
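To illustrate the point about coGroup versus join, here is a small plain-Java
sketch. It uses ordinary collections as stand-ins for the Vertex and edge
DataSets, so it only mimics the grouping semantics and is not Flink code;
`CoGroupVsJoinSketch`, `joinSources` and `coGroupBySource` are hypothetical
names. The assumption is that the join is an inner join and that coGroup
produces a group for every key on either side, even when the other side is
empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CoGroupVsJoinSketch {

    // Inner-join style: one result per (vertex, edge) match on vertexId == source.
    // A vertex that is the source of no edge produces no result at all.
    static List<String> joinSources(List<Long> vertexIds, List<long[]> edges) {
        List<String> out = new ArrayList<>();
        for (long v : vertexIds) {
            for (long[] e : edges) {
                if (e[0] == v) {
                    out.add(v + "->" + e[1]);
                }
            }
        }
        return out;
    }

    // coGroup style: one group per key, with a possibly empty list of targets.
    static Map<Long, List<Long>> coGroupBySource(List<Long> vertexIds, List<long[]> edges) {
        Map<Long, List<Long>> out = new LinkedHashMap<>();
        for (long v : vertexIds) {
            out.put(v, new ArrayList<>());
        }
        for (long[] e : edges) {
            // Keys that appear only on the edge side also get a group.
            out.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> vertexIds = Arrays.asList(1L, 2L, 3L);
        // Edges as {source, target}; vertex 3 has only incoming edges.
        List<long[]> edges = Arrays.asList(
                new long[]{1, 2}, new long[]{1, 3}, new long[]{2, 3});
        System.out.println(joinSources(vertexIds, edges));
        System.out.println(coGroupBySource(vertexIds, edges));
    }
}
```

In the coGroup-style result, vertex 3 is still present with an empty neighbor
list, while in the join-style result it never shows up as a source.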