Thank you Craig, I didn't do any kind of performance test with edges. Your
suggestions are really appreciated.

One thing that surprises me is the following. Imagine a graph with 1M edges
and the following query:
SELECT FROM #1:1 WHERE #1:2 in out()

If I have heavy edges, Orient has to iterate over one million records looking
for those with out = #1:1 and in = #1:2.
If I have light edges, Orient has to select one record (#1:1) and search
inside a probably small collection for the outgoing vertex (#1:2).
In this case I'd expect better performance for lightweight edges. Am I
wrong?
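
To make the comparison concrete, here is a minimal Python sketch of the two
lookups as I describe them above. It is a toy model, not OrientDB code: the
function names and the in-memory layout are hypothetical, and the full scan
over edge records is my assumption about the heavy-edge case, not confirmed
engine behavior.

```python
from collections import defaultdict


def build_graph(num_vertices, num_edges):
    """Store the same edges twice: as edge records and as per-vertex links."""
    edge_records = []              # heavy-edge view: one (out, in) record per edge
    out_links = defaultdict(list)  # lightweight view: vertex -> target vertices
    for i in range(num_edges):
        src = i % num_vertices
        dst = (i // num_vertices + src + 1) % num_vertices
        edge_records.append((src, dst))
        out_links[src].append(dst)
    return edge_records, out_links


def scan_edge_records(edge_records, src, dst):
    """Assumed heavy-edge lookup: walk every edge record until one matches."""
    examined = 0
    for out_v, in_v in edge_records:
        examined += 1
        if out_v == src and in_v == dst:
            return True, examined
    return False, examined


def scan_vertex_links(out_links, src, dst):
    """Lightweight lookup: load one vertex and search its outgoing links."""
    examined = 0
    for target in out_links.get(src, []):
        examined += 1
        if target == dst:
            return True, examined
    return False, examined
```

In this model the second lookup only touches the links of a single vertex,
while the first one may touch every edge in the graph. Whether the real
engine behaves like the first function is exactly my question.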

Regarding indices, I don't know whether the @class attribute can be indexed
for heavy edges. If it isn't possible, I would discourage the use of edge
subclasses in favour of a "type" attribute, which has the advantage of being
indexable.

For lightweight edges, on the other hand, I would say that edge subclasses
are always indexed, since they are stored in different collections.

Consider this schema:
CREATE CLASS A EXTENDS E
CREATE CLASS B EXTENDS E
Then for these data:

   - 1M edges of type A
   - 10M edges of type B

And the following query:
SELECT FROM V WHERE out('A').name = 'riccardo'

With heavy edges, Orient has to iterate over 11M edges, since there
is no index on E.@CLASS.
With lightweight edges, Orient would iterate over all vertices, but
for each one it would only need to consider the "out_A" field, which is smaller.
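
As a rough illustration of the numbers involved, the following Python sketch
counts how many edge entries each strategy examines on a scaled-down graph.
Everything here is hypothetical: the helper names, the data layout, and my
reading of the query as "vertices with an outgoing A edge to a vertex named
riccardo". It also assumes, as above, that the heavy-edge case scans all edge
records with no index on the class.

```python
def build(num_vertices, edges_a, edges_b):
    """Create vertices with out_A/out_B fields plus a flat list of edge records."""
    vertices = {
        rid: {"name": "riccardo" if rid == 0 else f"v{rid}", "out_A": [], "out_B": []}
        for rid in range(num_vertices)
    }
    edge_records = []
    for cls, count in (("A", edges_a), ("B", edges_b)):
        for i in range(count):
            src = (i + 1) % num_vertices
            dst = (i * 3) % num_vertices
            edge_records.append((cls, src, dst))
            vertices[src]["out_" + cls].append(dst)
    return vertices, edge_records


def query_by_edge_scan(vertices, edge_records, cls, name):
    """Assumed heavy-edge plan: examine every edge record, filter by class."""
    examined = 0
    matches = set()
    for edge_cls, src, dst in edge_records:
        examined += 1
        if edge_cls == cls and vertices[dst]["name"] == name:
            matches.add(src)
    return matches, examined


def query_by_vertex_fields(vertices, cls, name):
    """Lightweight plan: visit each vertex, read only its out_<cls> field."""
    examined = 0
    matches = set()
    for rid, vertex in vertices.items():
        for dst in vertex["out_" + cls]:
            examined += 1
            if vertices[dst]["name"] == name:
                matches.add(rid)
    return matches, examined
```

Both functions return the same vertices, but the second never reads the
entries stored under "out_B", so its cost depends on the number of A edges
(plus one visit per vertex) rather than on the total number of edges.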

What do you think about it?
Cheers,
   Riccardo


2015-06-17 13:22 GMT+02:00 W. Craig Trader <[email protected]>:

> To expand on this a little, all edges are represented as hybrid entities.
> In Riccardo's example, if you examine the properties of the nodes he
> created, they will look like this:
>
> > select * from V where name = 'a'
> ----+----+------+----+------------+----------
> #   |@RID|@CLASS|name|out_MemberOf|out_UserOf
> ----+----+------+----+------------+----------
> 0   |#9:0|V     |a   |[#11:0]     |[#9:2]
> ----+----+------+----+------------+----------
>
> > select * from V where name = 'b'
> ----+----+------+----+-----------+----------
> #   |@RID|@CLASS|name|in_MemberOf|out_UserOf
> ----+----+------+----+-----------+----------
> 0   |#9:1|V     |b   |[#11:0]    |[#12:0]
> ----+----+------+----+-----------+----------
>
> > select * from V where name = 'c'
> ----+----+------+----+-------------
> #   |@RID|@CLASS|name|in_UserOf
> ----+----+------+----+-------------
> 0   |#9:2|V     |c   |[#9:0, #12:0]
> ----+----+------+----+-------------
>
>
> Those additional properties (out_MemberOf, in_MemberOf, out_UserOf,
> in_UserOf) are where the lightweight edges are stored, as lists of links.
>
> If you look at the edges themselves:
>
> > select * from MemberOf
> ----+-----+--------+------+----+----
> #   |@RID |@CLASS  |weight|out |in
> ----+-----+--------+------+----+----
> 0   |#11:0|MemberOf|5     |#9:0|#9:1
> 1   |#12:0|UserOf  |2     |#9:1|#9:2
> ----+-----+--------+------+----+----
>
> > select * from UserOf
> ----+-----+------+------+----+----
> #   |@RID |@CLASS|weight|out |in
> ----+-----+------+------+----+----
> 0   |#12:0|UserOf|2     |#9:1|#9:2
> ----+-----+------+------+----+----
>
>
> You'll see that each edge that has user-defined properties (in this case,
> weight) also has its own elements, which also include 'out' and 'in'
> properties (duplicating the data stored on the nodes).
>
>
> Ramifications:
>
>    - Heavyweight edges require storage space (which is what makes them
>    heavy).
>    - In OrientDB < 2.0, heavyweight edges were slower than lightweight
>    edges. That is no longer the case as of OrientDB 2.0.
>    - Lightweight edges can't be queried with SQL.
>    - Lightweight edges can't be indexed. When you need to determine if
>    there is an edge between 'a' and 'c' of a given type, you have to search
>    their in_* and out_* properties linearly. This can be a huge performance
>    hit if you have lots of edges connecting a single node.  For benchmarks of
>    this behavior, see: https://github.com/wcraigtrader/ogp
>    - If you're using SQL Traverse or Gremlin to traverse the graph,
>    expect bad performance (regardless of the edge type) when attempting to
>    traverse nodes that have lots of edges.
>
> My recommendation? Use heavyweight edges at all times -- their behavior is
> more consistent, and their performance is as good as lightweight edges, at
> the cost of needing some additional storage space.
>
> - Craig -
>
> On Wed, Jun 17, 2015 at 5:07 AM, Riccardo Tasso <[email protected]>
> wrote:
>
>> Hi, this is a bit confusing.
>>
>> I understood that the parent (MemberOf) is a lightweight edge. In this case
>> its child (UserOf) will be a lightweight edge. This is the choice of
>> subclassing an edge instead of using only one class for edges with a "type"
>> attribute.
>>
>> Otherwise if the parent is a regular (or heavy) edge, i.e. it has some
>> properties, I expect that subclassing it will result in another regular
>> edge class.
>>
>> I was curious, so I did some experiment. Consider this script:
>> create database memory:temp admin admin memory graph
>> ALTER DATABASE custom useLightweightEdges=true
>>
>> INSERT INTO V set name = 'a'
>> INSERT INTO V set name = 'b'
>> INSERT INTO V set name = 'c'
>>
>> CREATE CLASS MemberOf EXTENDS E
>> CREATE CLASS UserOf EXTENDS MemberOf
>>
>> The command "classes" shows that at this time there are two new classes,
>> each one with its clusterId (which is required to create new records) and
>> with 0 records: as expected!
>>
>> Please note that at this moment Orient doesn't know if those edges will be
>> lightweight or regular, since there is no instance.
>>
>> Then I created an edge with properties:
>> CREATE EDGE MemberOf FROM (SELECT FROM V WHERE name='a') TO (SELECT FROM
>> V WHERE name='b') SET weight = 5
>>
>> At this point class "MemberOf" has one record, and it is a regular edge:
>> it has a @rid and properties.
>> Ready for another edge, this time I don't need properties:
>> CREATE EDGE UserOf FROM (SELECT FROM V WHERE name='a') TO (SELECT FROM V
>> WHERE name='c')
>> At this point class "UserOf" still has zero records. What happened?
>> Orient decided that, without properties, it could create a lightweight
>> edge.
>> Now I need to create a last edge, to conclude my experiment:
>> CREATE EDGE UserOf FROM (SELECT FROM V WHERE name='b') TO (SELECT FROM V
>> WHERE name='c') SET weight = 2
>> This time Orient decided to create a record, since there is a property.
>>
>> Let's do a recap. Now we should have:
>>
>>    - 3 *logical* edges (by logical I mean that I don't care whether they
>>    are regular or lightweight, they ARE edges)
>>    - 2 regular edges (the first and the third)
>>    - 1 lightweight edge (the second)
>>
>> Trying with some queries:
>>
>>    - SELECT FROM E : returns just 2 edges (the regular ones)
>>    - SELECT FROM MemberOf : returns just 2 edges (the regular ones)
>>    - SELECT FROM UserOf : returns only the third
>>    - SELECT expand(inE()) FROM V WHERE name = 'c' : returns the second
>>    (lightweight) and the third (regular) edges
>>
>> This is not very clean, since it may lead to writing wrong code, but it
>> works quite well. I would suggest that anyone choose, if possible, a model
>> with only lightweight edges or only regular edges.
>>
>> Probably what Orient is really missing is a command (or maybe I don't
>> know that it exists) to list all the logical (both regular and lightweight)
>> edges.
>> I also don't know whether there is some way to promote a lightweight edge
>> to a regular one.
>>
>> Cheers,
>>    Riccardo
>>
>> On Wednesday, 17 June 2015 09:10:00 UTC+2, scott molinari wrote:
>>>
>>> Are you sure the "UserOf" child edge wouldn't inherit
>>> the MemberOf parent edge's property and thus be a "heavy edge"? If it
>>> doesn't, that would go against the principle of inheritance, wouldn't it?
>>>
>>> Scott
>>>
>>  --
>>
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "OrientDB" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>

