Thank you Craig. I haven't done any performance testing with edges, so your
suggestions are really appreciated.
One thing surprises me, though. Imagine a graph with 1M edges
and the following query:
SELECT FROM #1:1 WHERE #1:2 in out()
With heavy edges, Orient has to iterate over one million records looking
for the one with out = #1:1 and in = #1:2.
With lightweight edges, Orient has to load one record (#1:1) and search
a probably small collection for the target vertex (#1:2).
In this case I'd expect better performance from lightweight edges. Am I
wrong?
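To make the comparison concrete, here is a toy Python model of the two lookups. It is only a sketch of my reasoning, not OrientDB code: the function names and the way I build the data are made up for illustration, and it assumes that the heavy-edge case really is a full scan over the edge records, which is exactly the point I am unsure about.

```python
from collections import defaultdict


def build_graph(num_edges):
    """Build the same edges in two shapes (hypothetical toy data)."""
    edge_records = []              # heavy edges: one record per edge
    out_links = defaultdict(list)  # lightweight edges: links kept on the vertex
    for i in range(num_edges):
        # Every 1000th edge starts at #1:1; the others start elsewhere.
        src = "#1:1" if i % 1000 == 0 else f"#2:{i}"
        dst = f"#1:{i + 2}"
        edge_records.append({"out": src, "in": dst})
        out_links[src].append(dst)
    return edge_records, out_links


def scan_edge_records(edge_records, src, dst):
    """Assumed heavy-edge lookup: examine every edge record."""
    examined = 0
    found = False
    for record in edge_records:
        examined += 1
        if record["out"] == src and record["in"] == dst:
            found = True
    return found, examined


def search_vertex_links(out_links, src, dst):
    """Assumed lightweight lookup: load one vertex, search its own links."""
    links = out_links.get(src, [])
    return dst in links, len(links)


edge_records, out_links = build_graph(100_000)
print(scan_edge_records(edge_records, "#1:1", "#1:2"))   # (True, 100000)
print(search_vertex_links(out_links, "#1:1", "#1:2"))    # (True, 100)
```

The second number in each result is how many items were examined. If heavy edges are reached through the vertex as well, rather than by scanning, the gap shown here would not apply.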
Regarding indices, I don't know whether the @class attribute can be indexed for heavy
edges. If it can't, I would discourage the use of edge subclasses
in favour of a "type" attribute, which has the advantage of being indexable.
For lightweight edges, on the other hand, I would say that edge subclasses are
effectively always indexed, since they are stored in separate collections.
Consider this schema:
CREATE CLASS A EXTENDS E
CREATE CLASS B EXTENDS E
Then for these data:
- 1M edges of type A
- 10M edges of type B
And the following query:
SELECT FROM V WHERE out('A').name = 'riccardo'
With heavy edges, Orient has to iterate over all 11M edges, since there
is no index on E.@CLASS.
With lightweight edges, Orient still has to iterate over all vertices, but
for each one it only needs to look at the "out_A" field, which is smaller.
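The same idea as a toy Python model, again only a sketch under my own assumptions: vertices are plain dicts with one link list per edge class (mirroring the out_<Class> fields), and the helper names are hypothetical. I keep the 1:10 ratio between A and B edges but on a much smaller graph.

```python
def make_vertices(n, b_per_vertex=10):
    """Each vertex gets 1 outgoing A link and b_per_vertex outgoing B links."""
    vertices = {}
    for i in range(n):
        vertices[f"#9:{i}"] = {
            "name": "riccardo" if i == 0 else f"user{i}",
            "out_A": [f"#9:{(i + 1) % n}"],
            "out_B": [f"#9:{(i + k) % n}" for k in range(1, b_per_vertex + 1)],
        }
    return vertices


def select_by_out_name(vertices, edge_class, wanted_name):
    """Toy version of: SELECT FROM V WHERE out('<edge_class>').name = '<wanted_name>'.

    Only the out_<edge_class> field of each vertex is read; links of other
    edge classes are never touched.
    """
    field = f"out_{edge_class}"
    visited = 0
    matches = []
    for rid, vertex in vertices.items():
        for target in vertex.get(field, []):
            visited += 1
            if vertices[target]["name"] == wanted_name:
                matches.append(rid)
                break
    return matches, visited


vertices = make_vertices(1000)
total_links = sum(len(v["out_A"]) + len(v["out_B"]) for v in vertices.values())
matches, visited = select_by_out_name(vertices, "A", "riccardo")
print(total_links)       # 11000
print(matches, visited)  # ['#9:999'] 1000
```

Here only 1000 of the 11000 links are visited, because the B links sit in a separate field that the query never reads.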
What do you think about it?
Cheers,
Riccardo
2015-06-17 13:22 GMT+02:00 W. Craig Trader <[email protected]>:
> To expand on this a little, all edges are represented as hybrid entities.
> In Riccardo's example, if you examine the properties of the nodes he
> created, they will look like this:
>
> > select * from V where name = 'a'
> ----+----+------+----+------------+----------
> #   |@RID|@CLASS|name|out_MemberOf|out_UserOf
> ----+----+------+----+------------+----------
> 0   |#9:0|V     |a   |[#11:0]     |[#9:2]
> ----+----+------+----+------------+----------
>
> > select * from V where name = 'b'
> ----+----+------+----+-----------+----------
> #   |@RID|@CLASS|name|in_MemberOf|out_UserOf
> ----+----+------+----+-----------+----------
> 0   |#9:1|V     |b   |[#11:0]    |[#12:0]
> ----+----+------+----+-----------+----------
>
> > select * from V where name = 'c'
> ----+----+------+----+-------------
> #   |@RID|@CLASS|name|in_UserOf
> ----+----+------+----+-------------
> 0   |#9:2|V     |c   |[#9:0, #12:0]
> ----+----+------+----+-------------
>
>
> Those additional properties (out_MemberOf, in_MemberOf, out_UserOf,
> in_UserOf) are where the lightweight edges are stored, as lists of links.
>
> If you look at the edges themselves:
>
> > select * from MemberOf
> ----+-----+--------+------+----+----
> #   |@RID |@CLASS  |weight|out |in
> ----+-----+--------+------+----+----
> 0   |#11:0|MemberOf|5     |#9:0|#9:1
> 1   |#12:0|UserOf  |2     |#9:1|#9:2
> ----+-----+--------+------+----+----
>
> > select * from UserOf
> ----+-----+------+------+----+----
> #   |@RID |@CLASS|weight|out |in
> ----+-----+------+------+----+----
> 0   |#12:0|UserOf|2     |#9:1|#9:2
> ----+-----+------+------+----+----
>
>
> You'll see that each edge that has user-defined properties (in this case,
> weight) also has its own record, which also includes 'out' and 'in'
> properties (duplicating the data stored on the nodes).
>
>
> Ramifications:
>
> - Heavyweight edges require storage space (which is what makes them
> heavy).
> - In OrientDB < 2.0, heavyweight edges were slower than lightweight
> edges. That is no longer the case as of OrientDB 2.0.
> - Lightweight edges can't be queried with SQL.
> - Lightweight edges can't be indexed. When you need to determine if
> there is an edge between 'a' and 'c' of a given type, you have to search
> their in_* and out_* properties linearly. This can be a huge performance
> hit if you have lots of edges connecting a single node. For benchmarks of
> this behavior, see: https://github.com/wcraigtrader/ogp
> - If you're using SQL Traverse or Gremlin to traverse the graph,
> expect bad performance (regardless of the edge type) when attempting to
> traverse nodes that have lots of edges.
>
> My recommendation? Use heavyweight edges at all times -- their behavior is
> more consistent, and their performance is as good as lightweight edges, at
> the cost of needing some additional storage space.
>
> - Craig -
>
> On Wed, Jun 17, 2015 at 5:07 AM, Riccardo Tasso <[email protected]>
> wrote:
>
>> Hi, this is a bit confusing.
>>
>> I understood that the parent (MemberOf) is a lightweight edge. In that case
>> its child (UserOf) will be a lightweight edge too. This follows from choosing
>> to subclass an edge instead of using only one class for edges with a "type"
>> attribute.
>>
>> On the other hand, if the parent is a regular (or heavy) edge, i.e. it has some
>> properties, I would expect that subclassing it results in another regular
>> edge class.
>>
>> I was curious, so I ran an experiment. Consider this script:
>> create database memory:temp admin admin memory graph
>> ALTER DATABASE custom useLightweightEdges=true
>>
>> INSERT INTO V set name = 'a'
>> INSERT INTO V set name = 'b'
>> INSERT INTO V set name = 'c'
>>
>> CREATE CLASS MemberOf EXTENDS E
>> CREATE CLASS UserOf EXTENDS MemberOf
>>
>> The "classes" command shows that at this point there are two new classes,
>> each with its own clusterId (which is required to create new records) and
>> with 0 records, as expected.
>>
>> Please note that at this moment Orient doesn't know whether those edges will be
>> lightweight or regular, since there are no instances yet.
>>
>> Then I created an edge with properties:
>> CREATE EDGE MemberOf FROM (SELECT FROM V WHERE name='a') TO (SELECT FROM
>> V WHERE name='b') SET weight = 5
>>
>> At this point class "MemberOf" has one record, and it is a regular edge:
>> it has a @rid and properties.
>> Ready for another edge, this time I don't need properties:
>> CREATE EDGE UserOf FROM (SELECT FROM V WHERE name='a') TO (SELECT FROM V
>> WHERE name='c')
>> At this point class "UserOf" still has zero records. What happened?
>> Orient decided that, without properties, it could create a lightweight
>> edge.
>> Now I need to create one last edge to conclude my experiment:
>> CREATE EDGE UserOf FROM (SELECT FROM V WHERE name='b') TO (SELECT FROM V
>> WHERE name='c') SET weight = 2
>> This time Orient decided to create a record, since there is a property.
>>
>> Let's recap. Now we should have:
>>
>> - 3 *logical* edges (by logical I mean that I don't care whether they
>> are regular or lightweight, they ARE edges)
>> - 2 regular edges (the first and the third)
>> - 1 lightweight edge (the second)
>>
>> Trying with some queries:
>>
>> - SELECT FROM E : returns just 2 edges (the regular ones)
>> - SELECT FROM MemberOf : returns just 2 edges (the regular ones)
>> - SELECT FROM UserOf : returns only the third
>> - SELECT expand(inE()) FROM V WHERE name = 'c' : returns the second
>> (lightweight) and the third (regular) edges
>>
>> This is not very clean, since it can lead to writing wrong code, but it
>> works quite well. I would suggest that everyone choose, if possible, a model
>> with only lightweight edges or only regular edges.
>>
>> Probably what Orient is really missing is a command (or maybe it exists and
>> I don't know about it) to list all the logical (both regular and lightweight)
>> edges.
>> I also don't know whether there is a way to promote a lightweight edge to a
>> regular one.
>>
>> Cheers,
>> Riccardo
>>
>> On Wednesday, 17 June 2015 at 09:10:00 UTC+2, scott molinari
>> wrote:
>>>
>>> Are you sure the "UserOf" child edge wouldn't inherit
>>> the MemberOf parent edge's property and thus be a "heavy edge"? If it
>>> doesn't, that would go against the principle of inheritance, wouldn't it?
>>>
>>> Scott
>>>
>> --
>>
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "OrientDB" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>