Re: [graph] graph importers

2012-03-25 Thread Simone Tripodi
Hola Claud.io,

I asked you to show some code because I didn't understand how you
would go about reintroducing Vertex/Edge in the exporter. If you
could develop a few interfaces in the branch, it would help a lot in
understanding and discussing improvements...

WDYT?
best,
-Simo

http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/



On Sun, Mar 25, 2012 at 6:38 PM, Claudio Squarcella wrote:
> Hi,
>
>
> On 25/03/2012 17:33, Simone Tripodi wrote:
>>
>> Hi Claud.io
>>
>> I honestly felt a little lost - code would speak better than thousands
>> of words, what about branching once again and making a concrete
>> proposal? ;)
>
>
> Sure, I'll open a branch for that. But first I would like to validate my
> thoughts (at least partially), that is why I'm writing poems for now :)
>
> An example: I'm still far from understanding exactly how to model the return
> type for importers. It should give access to both the graph and its
> properties, with particular emphasis on the ones that are relevant in
> [graph] like edge weights, vertex labels, etc. Any take on that?
>
> Cheers,
> Claudio
>
>
>>
>> Looking forward to reading about it!
>> -Simo
>>
>> http://people.apache.org/~simonetripodi/
>> http://simonetripodi.livejournal.com/
>> http://twitter.com/simonetripodi
>> http://www.99soft.org/
>>
>>
>>
>> On Sun, Mar 25, 2012 at 3:20 PM, Claudio Squarcella wrote:
>>>
>>> Hi all,
>>>
>>> the implementation of importers for [graph] requires a bit of attention,
>>> in particular with the new model where
>>>
>>>  * there are no explicit markers for Vertex and Edge,
>>>  * all properties of Vertices and Edges are now specified with generic
>>>   Mappers.
>>>
>>> Writing and extending exporters is fine: we first specify the graph, then
>>> its properties one after the other in a "fluent chain". The exporter
>>> simply forgets about the types of Vertex/Edge and serializes the whole
>>> input. Importing back a graph from an input source, however, is not as
>>> simple because:
>>>
>>>  * standard file formats give us no indication about Vertex/Edge types;
>>>  * serialized graphs come with a number of properties, some of which we
>>>   know and sometimes need for graph processing (e.g. labels and
>>>   weights), while some others are not (yet?) recognized in the code;
>>>  * the return type of any importer should account for both the graph
>>>   itself and all the properties.
>>>
>>> As a first step, these are my suggestions for the design:
>>>
>>>  * we need at least default, empty implementations for Vertex and Edge;
>>>   together with that, we could do some black magic to allow the user
>>>   to specify what types should be used to map imported Vertices/Edges
>>>   to actual classes.
>>>  * we need a structure to host both the imported graph and properties.
>>>   And it should be easy for the user to query such a structure for
>>>   specific graph properties, i.e. we need to isolate properties that
>>>   we recognize and use in our algorithms (e.g. weights). Other
>>>   properties could be either ignored or imported with a reference to
>>>   their name in the input format.
>>>
>>> One way could be to explicitly ask the user to list all the properties
>>> that he expects from the input graph, raising exceptions if they are not
>>> found. Something like:
>>>
>>>  * importGraph().asGraphML( "graph1.gml" ).withEdgeWeights().withVertexLabels();
>>>    // only two properties loaded and explicitly identified
>>>  * importGraph().asGraphML( "graph2.gml" ).withAllProperties();
>>>    // all properties are loaded, none is explicitly recognized
>>>
>>>
>>> ...wow that was long. What do you [graph]ers think? :)
>>>
>>> Ciao,
>>> Claudio
>>>
>>> --
>>> Claudio Squarcella
>>> PhD student at Roma Tre University
>>> http://www.dia.uniroma3.it/~squarcel
>>> http://twitter.com/hyperboreans
>>> http://claudio.squarcella.com/
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>>> For additional commands, e-mail: dev-h...@commons.apache.org
>>>
>>
>
> --
> Claudio Squarcella
> PhD student at Roma Tre University
> http://www.dia.uniroma3.it/~squarcel
> http://twitter.com/hyperboreans
> http://claudio.squarcella.com/
>
>
>




Re: [graph] graph importers

2012-03-25 Thread Claudio Squarcella

Hi,

On 25/03/2012 17:33, Simone Tripodi wrote:

Hi Claud.io

I honestly felt a little lost - code would speak better than thousands
of words, what about branching once again and making a concrete
proposal? ;)


Sure, I'll open a branch for that. But first I would like to validate my
thoughts (at least partially), that is why I'm writing poems for now :)


An example: I'm still far from understanding exactly how to model the
return type for importers. It should give access to both the graph and
its properties, with particular emphasis on the ones that are relevant
in [graph] like edge weights, vertex labels, etc. Any take on that?
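
One possible shape for such a return type, as a minimal sketch only: a small
holder that exposes the graph next to the properties [graph] already
understands (edge weights, vertex labels), plus a lookup by name for
everything else. ImportedGraph and all of its method names are hypothetical
and not part of [graph]; the graph itself is left as a type parameter G so
that nothing is assumed about the real Graph interface.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical importer result: the graph plus its imported properties. */
final class ImportedGraph<G, V, E> {
    private final G graph;
    private final Map<E, Double> edgeWeights;
    private final Map<V, String> vertexLabels;
    // Properties not recognized by the library, keyed by their name in the input format.
    private final Map<String, Map<Object, String>> otherProperties;

    ImportedGraph(G graph,
                  Map<E, Double> edgeWeights,
                  Map<V, String> vertexLabels,
                  Map<String, Map<Object, String>> otherProperties) {
        this.graph = graph;
        // Defensive copies so later changes by the caller do not leak in.
        this.edgeWeights = new HashMap<>(edgeWeights);
        this.vertexLabels = new HashMap<>(vertexLabels);
        this.otherProperties = new HashMap<>(otherProperties);
    }

    G getGraph() {
        return graph;
    }

    /** Empty if the input carried no weight for this edge. */
    Optional<Double> weightOf(E edge) {
        return Optional.ofNullable(edgeWeights.get(edge));
    }

    /** Empty if the input carried no label for this vertex. */
    Optional<String> labelOf(V vertex) {
        return Optional.ofNullable(vertexLabels.get(vertex));
    }

    /** Raw values of an unrecognized property; an empty map if it was not imported. */
    Map<Object, String> property(String name) {
        Map<Object, String> values = otherProperties.getOrDefault(name, Collections.emptyMap());
        return Collections.unmodifiableMap(values);
    }
}
```

The recognized properties get typed accessors, while the rest stay as plain
strings under their original names; whether that split is the right one is
exactly the open question.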


Cheers,
Claudio



Looking forward to reading about it!
-Simo

http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/



On Sun, Mar 25, 2012 at 3:20 PM, Claudio Squarcella wrote:

Hi all,

the implementation of importers for [graph] requires a bit of attention, in
particular with the new model where

  * there are no explicit markers for Vertex and Edge,
  * all properties of Vertices and Edges are now specified with generic
   Mappers.

Writing and extending exporters is fine: we first specify the graph, then
its properties one after the other in a "fluent chain". The exporter simply
forgets about the types of Vertex/Edge and serializes the whole input.
Importing back a graph from an input source, however, is not as simple
because:

  * standard file formats give us no indication about Vertex/Edge types;
  * serialized graphs come with a number of properties, some of which we
   know and sometimes need for graph processing (e.g. labels and
   weights), while some others are not (yet?) recognized in the code;
  * the return type of any importer should account for both the graph
   itself and all the properties.

As a first step, these are my suggestions for the design:

  * we need at least default, empty implementations for Vertex and Edge;
   together with that, we could do some black magic to allow the user
   to specify what types should be used to map imported Vertices/Edges
   to actual classes.
  * we need a structure to host both the imported graph and properties.
   And it should be easy for the user to query such a structure for
   specific graph properties, i.e. we need to isolate properties that
   we recognize and use in our algorithms (e.g. weights). Other
   properties could be either ignored or imported with a reference to
   their name in the input format.
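
For the first point, the "black magic" might not need to be more than a pair
of factory functions, sketched below under that assumption. DefaultVertex,
DefaultEdge and ElementFactories are hypothetical names, and identifying each
imported element by the string id found in the input file is an assumption of
this sketch, not something the formats or [graph] prescribe.

```java
import java.util.function.Function;

/** Hypothetical default vertex: it only remembers the id found in the input. */
final class DefaultVertex {
    final String id;

    DefaultVertex(String id) {
        this.id = id;
    }
}

/** Hypothetical default edge, equally empty apart from its id. */
final class DefaultEdge {
    final String id;

    DefaultEdge(String id) {
        this.id = id;
    }
}

/** Maps ids read from the input to vertex/edge instances of the user's types. */
final class ElementFactories<V, E> {
    private final Function<String, V> vertexFactory;
    private final Function<String, E> edgeFactory;

    ElementFactories(Function<String, V> vertexFactory, Function<String, E> edgeFactory) {
        this.vertexFactory = vertexFactory;
        this.edgeFactory = edgeFactory;
    }

    /** Fallback used when the user does not care about concrete types. */
    static ElementFactories<DefaultVertex, DefaultEdge> defaults() {
        return new ElementFactories<>(DefaultVertex::new, DefaultEdge::new);
    }

    V newVertex(String id) {
        return vertexFactory.apply(id);
    }

    E newEdge(String id) {
        return edgeFactory.apply(id);
    }
}
```

A user with their own classes would pass two lambdas or constructor
references instead of calling defaults(), so the importer never has to know
the concrete Vertex/Edge types.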

One way could be to explicitly ask the user to list all the properties that
he expects from the input graph, raising exceptions if they are not found.
Something like:

  * importGraph().asGraphML( "graph1.gml" ).withEdgeWeights().withVertexLabels();
    // only two properties loaded and explicitly identified
  * importGraph().asGraphML( "graph2.gml" ).withAllProperties();
    // all properties are loaded, none is explicitly recognized
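
To make the "raise exceptions if they are not found" part concrete, here is a
rough sketch of how such a chain could record what the user asked for and
check it against what the input declares. Everything in it is hypothetical:
GraphImportRequest, MissingPropertyException, the resolve() step, and the
property keys "weight" and "label" are placeholders. Actual GraphML parsing is
left out; the set of property names declared in the file is simply passed in.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical exception raised when a requested property is absent from the input. */
final class MissingPropertyException extends RuntimeException {
    MissingPropertyException(String property) {
        super("Property not found in input: " + property);
    }
}

/** Hypothetical fluent request: records which properties the user expects. */
final class GraphImportRequest {
    private final String source;
    private final Set<String> required = new LinkedHashSet<>();
    private boolean allProperties;

    private GraphImportRequest(String source) {
        this.source = source;
    }

    static GraphImportRequest asGraphML(String source) {
        return new GraphImportRequest(source);
    }

    String source() {
        return source;
    }

    GraphImportRequest withEdgeWeights() {
        required.add("weight"); // assumed key name
        return this;
    }

    GraphImportRequest withVertexLabels() {
        required.add("label"); // assumed key name
        return this;
    }

    GraphImportRequest withAllProperties() {
        allProperties = true;
        return this;
    }

    /** Returns the property names to load, given the names declared in the input. */
    Set<String> resolve(Set<String> declaredInInput) {
        for (String name : required) {
            if (!declaredInInput.contains(name)) {
                throw new MissingPropertyException(name);
            }
        }
        if (allProperties) {
            return new LinkedHashSet<String>(declaredInInput);
        }
        return new LinkedHashSet<String>(required);
    }
}
```

With this shape the two calls above differ only in what resolve() returns:
the first yields exactly the two requested names or fails, the second yields
whatever the file declares.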


...wow that was long. What do you [graph]ers think? :)

Ciao,
Claudio

--
Claudio Squarcella
PhD student at Roma Tre University
http://www.dia.uniroma3.it/~squarcel
http://twitter.com/hyperboreans
http://claudio.squarcella.com/







--
Claudio Squarcella
PhD student at Roma Tre University
http://www.dia.uniroma3.it/~squarcel
http://twitter.com/hyperboreans
http://claudio.squarcella.com/





Re: [graph] graph importers

2012-03-25 Thread Simone Tripodi
Hi Claud.io

I honestly felt a little lost - code would speak better than thousands
of words, what about branching once again and making a concrete
proposal? ;)

Looking forward to reading about it!
-Simo

http://people.apache.org/~simonetripodi/
http://simonetripodi.livejournal.com/
http://twitter.com/simonetripodi
http://www.99soft.org/



On Sun, Mar 25, 2012 at 3:20 PM, Claudio Squarcella wrote:
> Hi all,
>
> the implementation of importers for [graph] requires a bit of attention, in
> particular with the new model where
>
>  * there are no explicit markers for Vertex and Edge,
>  * all properties of Vertices and Edges are now specified with generic
>   Mappers.
>
> Writing and extending exporters is fine: we first specify the graph, then
> its properties one after the other in a "fluent chain". The exporter simply
> forgets about the types of Vertex/Edge and serializes the whole input.
> Importing back a graph from an input source, however, is not as simple
> because:
>
>  * standard file formats give us no indication about Vertex/Edge types;
>  * serialized graphs come with a number of properties, some of which we
>   know and sometimes need for graph processing (e.g. labels and
>   weights), while some others are not (yet?) recognized in the code;
>  * the return type of any importer should account for both the graph
>   itself and all the properties.
>
> As a first step, these are my suggestions for the design:
>
>  * we need at least default, empty implementations for Vertex and Edge;
>   together with that, we could do some black magic to allow the user
>   to specify what types should be used to map imported Vertices/Edges
>   to actual classes.
>  * we need a structure to host both the imported graph and properties.
>   And it should be easy for the user to query such a structure for
>   specific graph properties, i.e. we need to isolate properties that
>   we recognize and use in our algorithms (e.g. weights). Other
>   properties could be either ignored or imported with a reference to
>   their name in the input format.
>
> One way could be to explicitly ask the user to list all the properties that
> he expects from the input graph, raising exceptions if they are not found.
> Something like:
>
>  * importGraph().asGraphML( "graph1.gml" ).withEdgeWeights().withVertexLabels();
>    // only two properties loaded and explicitly identified
>  * importGraph().asGraphML( "graph2.gml" ).withAllProperties();
>    // all properties are loaded, none is explicitly recognized
>
>
> ...wow that was long. What do you [graph]ers think? :)
>
> Ciao,
> Claudio
>
> --
> Claudio Squarcella
> PhD student at Roma Tre University
> http://www.dia.uniroma3.it/~squarcel
> http://twitter.com/hyperboreans
> http://claudio.squarcella.com/
>
>
>
