[GitHub] flink pull request #2178: [Flink-1815] Add methods to read and write a Graph...

fobeligi Tue, 28 Jun 2016 14:37:29 -0700

Github user fobeligi commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2178#discussion_r68848469
  
    --- Diff: flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java ---
    @@ -408,6 +408,79 @@ public static GraphCsvReader fromCsvReader(String edgesPath, ExecutionEnvironmen
        }
     
        /**
    +    * Creates a graph from an Adjacency List text file with Vertex Key values. Edges will be created automatically.
    +    *
    +    * @param filePath a path to an Adjacency List text file with the Vertex data
    +    * @param context  the execution environment.
    +    * @return An instance of {@link org.apache.flink.graph.GraphAdjacencyListReader},
    +    * on which calling methods to specify types of the Vertex ID, Vertex value and Edge value returns a Graph.
    +    */
    +   public static GraphAdjacencyListReader fromAdjacencyListFile(String filePath, ExecutionEnvironment context) {
    +           return new GraphAdjacencyListReader(filePath, context);
    +   }
    +
    +   /**
    +    * Writes a graph as an Adjacency List formatted text file in a user-specified folder.
    +    *
    +    * @param filePath   the path that the Adjacency List formatted text file should be written to
    +    * @param delimiters the delimiters that separate the different value types in the Adjacency List formatted text
    +    *                   file. Delimiters should be provided in the following order:
    +    *                   NEIGHBOR_DELIMITER : separating source from its neighbors
    +    *                   VERTICES_DELIMITER : separating the different neighbors of a source vertex
    +    *                   VERTEX_VALUE_DELIMITER: separating the source vertex-id from the vertex value, as well as the
    +    *                   target vertex-ids from the edge value.
    +    */
    +   public void writeAsAdjacencyList(String filePath, String... delimiters) {
    +
    +           final String NEIGHBOR_DELIMITER = delimiters.length > 0 ? delimiters[0] : "\t";
    +
    +           final String VERTICES_DELIMITER = delimiters.length > 1 ? delimiters[1] : ",";
    +
    +           final String VERTEX_VALUE_DELIMITER = delimiters.length > 2 ? delimiters[2] : "-";
    +
    +
    +           DataSet<Tuple2<K, VV>> vertices = this.getVerticesAsTuple2();
    +
    +           DataSet<Tuple3<K, K, EV>> edgesNValues = this.getEdgesAsTuple3();
    --- End diff --
    
    I see now that we don't need to convert the vertex set to a Tuple2 set, so I have already changed that.
    
    Regarding the edges dataset: to write the Adjacency List file, I apply a coGroup transformation to the Vertex dataset and the EdgesAsTuple3 dataset, matching the vertexId with the source of the edge.
    
    That way, even when a Vertex is the source of no edges (e.g. it has only incoming edges), its vertexId still appears in the "coGrouped" dataset (I couldn't do that with a join).
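
For illustration, here is a minimal plain-Java sketch of the grouping semantics described above. It does not use the Flink API and is not the code from the pull request: the class `AdjacencyListSketch`, the helper `EdgeTriple`, and the method `toAdjacencyLines` are hypothetical names, the delimiters are hard-coded to the defaults shown in the diff, and the output shape for a vertex without outgoing edges (just the vertex part, with no neighbor delimiter) is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

class AdjacencyListSketch {

    /** Hypothetical stand-in for an edge viewed as a (source, target, value) triple. */
    static final class EdgeTriple {
        final String source;
        final String target;
        final String value;

        EdgeTriple(String source, String target, String value) {
            this.source = source;
            this.target = target;
            this.value = value;
        }
    }

    // Default delimiters as they appear in the diff.
    static final String NEIGHBOR_DELIMITER = "\t";
    static final String VERTICES_DELIMITER = ",";
    static final String VERTEX_VALUE_DELIMITER = "-";

    /**
     * Emulates grouping vertices with their outgoing edges (vertexId == edge source).
     * Every vertex produces one line, even if its edge group is empty, which is the
     * behaviour an inner join would not provide. Edges whose source is not in the
     * vertex map are ignored here (a simplification of this sketch).
     */
    static List<String> toAdjacencyLines(Map<String, String> vertices, List<EdgeTriple> edges) {
        // Build the edge groups, keyed by source vertex id.
        Map<String, List<EdgeTriple>> bySource = new LinkedHashMap<>();
        for (EdgeTriple e : edges) {
            bySource.computeIfAbsent(e.source, k -> new ArrayList<>()).add(e);
        }

        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> v : vertices.entrySet()) {
            String vertexPart = v.getKey() + VERTEX_VALUE_DELIMITER + v.getValue();
            List<EdgeTriple> group =
                    bySource.getOrDefault(v.getKey(), Collections.<EdgeTriple>emptyList());

            StringJoiner neighbors = new StringJoiner(VERTICES_DELIMITER);
            for (EdgeTriple e : group) {
                neighbors.add(e.target + VERTEX_VALUE_DELIMITER + e.value);
            }

            // Assumption: a vertex with no outgoing edges is written without neighbors.
            lines.add(group.isEmpty()
                    ? vertexPart
                    : vertexPart + NEIGHBOR_DELIMITER + neighbors);
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, String> vertices = new LinkedHashMap<>();
        vertices.put("1", "a");
        vertices.put("2", "b");
        vertices.put("3", "c"); // vertex 3 has only incoming edges

        List<EdgeTriple> edges = Arrays.asList(
                new EdgeTriple("1", "2", "0.5"),
                new EdgeTriple("1", "3", "0.7"),
                new EdgeTriple("2", "3", "0.1"));

        for (String line : toAdjacencyLines(vertices, edges)) {
            System.out.println(line);
        }
    }
}
```

In this sketch, vertex 3 still gets its own output line although no edge has it as a source, which is the property the coGroup is used for.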
    
    I can't see how I could use the Edge dataset directly in a coGroup or similar transformation.
    Please let me know if you have any suggestions.



