Re: [graph] graph importers
Hola Claud.io,

I asked you to show some code because I didn't understand how you would
go about reintroducing Vertex/Edge in the exporter. If you could develop
a few interfaces in the branch, it would help a lot in understanding and
discussing improvements... WDYT?

best,
-Simo

http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/

On Sun, Mar 25, 2012 at 6:38 PM, Claudio Squarcella wrote:
> Hi,
>
> On 25/03/2012 17:33, Simone Tripodi wrote:
>>
>> Hi Claud.io
>>
>> I honestly felt a little lost - code would speak better than thousands
>> of words, what about branching once again and making a concrete
>> proposal? ;)
>
> Sure, I'll open a branch for that. But first I would like to validate my
> thoughts (at least partially), that is why I'm writing poems for now :)
>
> An example: I'm very far from understanding how exactly to model the
> return type for importers. It should give access to both the graph and
> its properties, with particular emphasis on the ones that are relevant
> in [graph] like edge weights, vertex labels, etc. Any take on that?
>
> Cheers,
> Claudio
>
>> Looking forward to reading about it!
>> -Simo

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
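To make the Vertex/Edge question above a bit more concrete, here is a minimal sketch of default, empty element types plus user-supplied factories, along the lines suggested in the original proposal. This is not code from the [graph] branch: `DefaultVertex`, `DefaultEdge` and `ElementFactories` are hypothetical names, and `java.util.function.Function` is used as a stand-in for the project's Mappers.

```java
import java.util.Objects;
import java.util.function.Function;

// Hypothetical default vertex: carries only the identifier found in the input.
final class DefaultVertex {
    final String id;

    DefaultVertex(String id) {
        this.id = Objects.requireNonNull(id);
    }

    @Override
    public String toString() {
        return "vertex(" + id + ")";
    }
}

// Hypothetical default edge: again, nothing but the identifier from the input.
final class DefaultEdge {
    final String id;

    DefaultEdge(String id) {
        this.id = Objects.requireNonNull(id);
    }

    @Override
    public String toString() {
        return "edge(" + id + ")";
    }
}

// Hypothetical holder for the functions that map raw input identifiers
// to the classes the user wants to work with.
final class ElementFactories<V, E> {
    final Function<String, V> vertexFactory;
    final Function<String, E> edgeFactory;

    ElementFactories(Function<String, V> vertexFactory, Function<String, E> edgeFactory) {
        this.vertexFactory = Objects.requireNonNull(vertexFactory);
        this.edgeFactory = Objects.requireNonNull(edgeFactory);
    }

    // Used when the user does not care about concrete element types.
    static ElementFactories<DefaultVertex, DefaultEdge> defaults() {
        return new ElementFactories<>(DefaultVertex::new, DefaultEdge::new);
    }
}
```

With something like this, an importer could call `vertexFactory.apply(id)` for every vertex it reads, so the input format never needs to know the concrete types.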
Re: [graph] graph importers
Hi,

On 25/03/2012 17:33, Simone Tripodi wrote:
> Hi Claud.io
>
> I honestly felt a little lost - code would speak better than thousands
> of words, what about branching once again and making a concrete
> proposal? ;)

Sure, I'll open a branch for that. But first I would like to validate my
thoughts (at least partially), that is why I'm writing poems for now :)

An example: I'm very far from understanding how exactly to model the
return type for importers. It should give access to both the graph and
its properties, with particular emphasis on the ones that are relevant
in [graph] like edge weights, vertex labels, etc. Any take on that?

Cheers,
Claudio

> Looking forward to reading about it!
> -Simo

-- 
Claudio Squarcella
PhD student at Roma Tre University
http://www.dia.uniroma3.it/~squarcel
http://twitter.com/hyperboreans
http://claudio.squarcella.com/
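One possible take on the return type question, as a rough sketch only: a result object that exposes the graph, the properties [graph] recognizes (edge weights, vertex labels) as typed optional mappings, and everything else by the name it had in the input. `ImportedGraph` and its methods are hypothetical names, the graph type is left as a plain type parameter `G`, and `java.util.function.Function` stands in for the project's Mappers.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

// Hypothetical importer result: the graph plus its imported properties.
final class ImportedGraph<G, V, E> {
    private final G graph;
    private final Function<E, Double> edgeWeights;   // null if not imported
    private final Function<V, String> vertexLabels;  // null if not imported
    // Unrecognized properties: property name -> (element -> raw value).
    private final Map<String, Map<Object, String>> rawProperties;

    ImportedGraph(G graph,
                  Function<E, Double> edgeWeights,
                  Function<V, String> vertexLabels,
                  Map<String, Map<Object, String>> rawProperties) {
        this.graph = Objects.requireNonNull(graph);
        this.edgeWeights = edgeWeights;
        this.vertexLabels = vertexLabels;
        this.rawProperties = new HashMap<>(rawProperties);
    }

    G getGraph() {
        return graph;
    }

    // Recognized properties are typed, and empty when they were not loaded.
    Optional<Function<E, Double>> getEdgeWeights() {
        return Optional.ofNullable(edgeWeights);
    }

    Optional<Function<V, String>> getVertexLabels() {
        return Optional.ofNullable(vertexLabels);
    }

    // Names of the properties that were imported without being recognized.
    Set<String> getRawPropertyNames() {
        return Collections.unmodifiableSet(rawProperties.keySet());
    }

    // Raw value of an unrecognized property for one vertex or edge.
    Optional<String> getRawProperty(String name, Object element) {
        Map<Object, String> values = rawProperties.get(name);
        return values == null ? Optional.empty() : Optional.ofNullable(values.get(element));
    }
}
```

The split between typed accessors and name-based lookup mirrors the distinction in the thread between properties the algorithms use and properties that are only carried along.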
Re: [graph] graph importers
Hi Claud.io

I honestly felt a little lost - code would speak better than thousands
of words, what about branching once again and making a concrete
proposal? ;)

Looking forward to reading about it!
-Simo

http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/

On Sun, Mar 25, 2012 at 3:20 PM, Claudio Squarcella wrote:
> Hi all,
>
> the implementation of importers for [graph] requires a bit of attention,
> in particular with the new model where
>
> * there are no explicit markers for Vertex and Edge,
> * all properties of Vertices and Edges are now specified with generic
>   Mappers.
>
> Writing and extending exporters is fine: we first specify the graph, then
> its properties one after the other in a "fluent chain". The exporter
> simply forgets about the types of Vertex/Edge and serializes the whole
> input. Importing back a graph from an input source, however, is not as
> simple because:
>
> * standard file formats give us no indication about Vertex/Edge types;
> * serialized graphs come with a number of properties, some of which we
>   know and sometimes need for graph processing (e.g. labels and
>   weights), while some others are not (yet?) recognized in the code;
> * the return type of any importer should account for both the graph
>   itself and all the properties.
>
> As a first step, these are my suggestions for the design:
>
> * we need at least default, empty implementations for Vertex and Edge;
>   together with that, we could do some black magic to allow the user
>   to specify what types should be used to map imported Vertices/Edges
>   to actual classes.
> * we need a structure to host both the imported graph and properties.
>   And it should be easy for the user to query such a structure for
>   specific graph properties, i.e. we need to isolate properties that
>   we recognize and use in our algorithms (e.g. weights). Other
>   properties could be either ignored or imported with a reference to
>   their name in the input format.
>
> One way could be to explicitly ask the user to list all the properties
> that he expects from the input graph, raising exceptions if they are not
> found. Something like:
>
> * importGraph().asGraphML( "graph1.gml" ).withEdgeWeights().withVertexLabels();
>   // only two properties loaded and explicitly identified
> * importGraph().asGraphML( "graph2.gml" ).withAllProperties();
>   // all properties are loaded, none is explicitly recognized
>
> ...wow that was long. What do you [graph]ers think? :)
>
> Ciao,
> Claudio
>
> --
> Claudio Squarcella
> PhD student at Roma Tre University
> http://www.dia.uniroma3.it/~squarcel
> http://twitter.com/hyperboreans
> http://claudio.squarcella.com/
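The fluent chain from the proposal could be sketched roughly as follows. Only `importGraph()`, `asGraphML(...)`, `withEdgeWeights()`, `withVertexLabels()` and `withAllProperties()` come from the message; the class names, the property keys `"weight"` and `"label"`, and the terminal `resolve(...)` step are assumptions made for illustration. No actual GraphML parsing is done here: `resolve` takes the set of property names a parser would have found in the input.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical entry point for the fluent import chain.
final class GraphImporter {
    private GraphImporter() {
    }

    static FormatSelector importGraph() {
        return new FormatSelector();
    }

    // First step of the chain: choose the input format and source.
    static final class FormatSelector {
        PropertySelector asGraphML(String path) {
            return new PropertySelector(path);
        }
    }

    // Second step: declare which properties the caller expects.
    static final class PropertySelector {
        private final String path;
        private final Set<String> required = new LinkedHashSet<>();
        private boolean loadAll;

        PropertySelector(String path) {
            this.path = path;
        }

        PropertySelector withEdgeWeights() {
            required.add("weight"); // assumed key name in the input
            return this;
        }

        PropertySelector withVertexLabels() {
            required.add("label"); // assumed key name in the input
            return this;
        }

        PropertySelector withAllProperties() {
            loadAll = true;
            return this;
        }

        // Hypothetical terminal step: returns the property names to load,
        // raising an exception if an explicitly requested one is missing.
        Set<String> resolve(Set<String> foundInInput) {
            if (loadAll) {
                return new LinkedHashSet<>(foundInInput);
            }
            for (String name : required) {
                if (!foundInInput.contains(name)) {
                    throw new IllegalArgumentException(
                        "Property '" + name + "' not found in " + path);
                }
            }
            return new LinkedHashSet<>(required);
        }
    }
}
```

For an input that only declares `weight` and `color`, `importGraph().asGraphML( "graph1.gml" ).withEdgeWeights().withVertexLabels()` would fail at `resolve` because `label` is missing, while `withAllProperties()` would load both without recognizing either.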