Riccardo ...

With any database, when you have a non-trivial amount of data, you must
consider how to tune the database, and part of tuning is creating
and using appropriate indexes to support your queries.

In your first example, I would add the 'in' and 'out' properties to E, and
create a multipart-key-index on E for (out,in) as follows:

create property E.in LINK

create property E.out LINK

create index E.unique on E (out,in) unique


Then your query becomes:

select from index:E.unique where key = [#1:1,#1:2]


With the index defined and heavyweight edges, this query will be much
faster than for lightweight edges.  This is the exact case that my
benchmark was written to test. My results suggest that as soon as the
number of edges exceeds a memory page, the index btree search will be
faster than the lightweight linear search.
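
To illustrate the difference between the two lookups, here is a minimal
Python sketch. It is not OrientDB code and not the benchmark itself: RIDs are
modelled as (cluster, position) tuples, a sorted list with binary search
stands in for the btree, and all function names are hypothetical.

```python
from bisect import bisect_left


def has_edge_linear(out_links, target):
    """Lightweight-style check: scan the vertex's out_* link list."""
    for rid in out_links:
        if rid == target:
            return True
    return False


def build_edge_index(edges):
    """Heavyweight-style index: sorted (out, in) keys, duplicates rejected.

    Rejecting duplicates mimics the 'unique' constraint on the index.
    """
    keys = sorted(edges)
    for a, b in zip(keys, keys[1:]):
        if a == b:
            raise ValueError(f"duplicate edge {a}")
    return keys


def has_edge_indexed(index, out_rid, in_rid):
    """Binary search on the composite key, O(log n) instead of O(n)."""
    key = (out_rid, in_rid)
    pos = bisect_left(index, key)
    return pos < len(index) and index[pos] == key


# A hub vertex #1:1 with many outgoing links (sizes are arbitrary).
hub = (1, 1)
targets = [(2, n) for n in range(100_000)]
index = build_edge_index([(hub, t) for t in targets])

# Both approaches agree; only the amount of work differs.
assert has_edge_linear(targets, (2, 99_999))
assert has_edge_indexed(index, hub, (2, 99_999))
assert not has_edge_indexed(index, hub, (3, 0))
```

The linear check touches every link in the worst case, while the indexed
check needs only about log2(n) comparisons, which is why the gap grows with
the number of edges on a single vertex.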

In your second example, I would create indexes on A and B, similar to the
one I created on E above, and then use the appropriate index to drive your
query.
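
One way to picture "use the appropriate index" is sketched below in plain
Python. This is only an illustration under assumptions: the EdgeStore class,
its methods, and the names dictionary are hypothetical and do not correspond
to any OrientDB API. The point is that with one index per edge class, a
query about class A visits only the A entries and never touches the B edges.

```python
from collections import defaultdict


class EdgeStore:
    """Keeps one set of (out, in) keys per edge class (hypothetical)."""

    def __init__(self):
        self.by_class = defaultdict(set)

    def add(self, edge_class, out_rid, in_rid):
        self.by_class[edge_class].add((out_rid, in_rid))

    def sources_by_target_name(self, edge_class, names, wanted):
        """Return source RIDs whose edge of this class points at a vertex
        with the wanted name, visiting only this class's entries."""
        entries = self.by_class.get(edge_class, ())
        return sorted(
            {out for out, in_ in entries if names.get(in_) == wanted}
        )


# Vertex names keyed by RID; RIDs are (cluster, position) tuples.
names = {(9, 0): "anna", (9, 1): "riccardo", (9, 2): "craig"}

store = EdgeStore()
store.add("A", (9, 0), (9, 1))  # anna  -A-> riccardo
store.add("A", (9, 2), (9, 0))  # craig -A-> anna
store.add("B", (9, 2), (9, 1))  # craig -B-> riccardo (ignored for A)

result = store.sources_by_target_name("A", names, "riccardo")
assert result == [(9, 0)]
```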

- Craig -

On Wed, Jun 17, 2015 at 9:28 AM, Riccardo Tasso <[email protected]>
wrote:

> Thank you Craig, I didn't do any kind of performance test with edges. Your
> suggestions are really appreciated.
>
> One thing that surprises me is the following. Imagine a graph with 1M edges
> and the following query:
> SELECT FROM #1:1 WHERE #1:2 in out()
>
> If I have heavy edges, Orient has to iterate over one million records looking
> for those with out = #1:1 and in = #1:2.
> If I have light edges, Orient has to select one record (#1:1) and search
> inside a probably small collection for the outgoing vertex (#1:2).
> In this case I'd expect better performance for lightweight edges. Am I
> wrong?
>
> Regarding indices, I don't know whether the @class attribute can be indexed
> for heavy edges. If it isn't possible, I would discourage the use of
> edge-subclasses, in favour of a "type" attribute, which has the advantage
> of being indexable.
>
> Otherwise, for lightweight edges, I would say that edge-subclasses are
> always indexed, since they are stored in different collections.
>
> Consider this schema:
> CREATE CLASS A EXTENDS E
> CREATE CLASS B EXTENDS E
> Then for these data:
>
>    - 1M edges of type A
>    - 10M edges of type B
>
> And the following query:
> SELECT FROM V WHERE out('A').name = 'riccardo'
>
> In case of heavy edges, Orient has to iterate over 11M edges, since there
> is no index on E.@CLASS.
> In case of lightweight edges, Orient should iterate over all vertices, but
> for each one it could consider only the "out_A" field, which is smaller.
>
> What do you think about it?
> Cheers,
>    Riccardo
>
>
> 2015-06-17 13:22 GMT+02:00 W. Craig Trader <[email protected]>:
>
>> To expand on this a little, all edges are represented as hybrid entities.
>> In Riccardo's example, if you examine the properties of the nodes he
>> created, they will look like this:
>>
>> > select * from V where name = 'a'
>> ----+----+------+----+------------+----------
>> #   |@RID|@CLASS|name|out_MemberOf|out_UserOf
>> ----+----+------+----+------------+----------
>> 0   |#9:0|V     |a   |[#11:0]     |[#9:2]
>> ----+----+------+----+------------+----------
>>
>> > select * from V where name = 'b'
>> ----+----+------+----+-----------+----------
>> #   |@RID|@CLASS|name|in_MemberOf|out_UserOf
>> ----+----+------+----+-----------+----------
>> 0   |#9:1|V     |b   |[#11:0]    |[#12:0]
>> ----+----+------+----+-----------+----------
>>
>> > select * from V where name = 'c'
>> ----+----+------+----+-------------
>> #   |@RID|@CLASS|name|in_UserOf
>> ----+----+------+----+-------------
>> 0   |#9:2|V     |c   |[#9:0, #12:0]
>> ----+----+------+----+-------------
>>
>>
>> Those additional properties (out_MemberOf, in_MemberOf, out_UserOf,
>> in_UserOf) are where the lightweight edges are stored, as lists of links.
>>
>> If you look at the edges themselves:
>>
>> > select * from MemberOf
>> ----+-----+--------+------+----+----
>> #   |@RID |@CLASS  |weight|out |in
>> ----+-----+--------+------+----+----
>> 0   |#11:0|MemberOf|5     |#9:0|#9:1
>> 1   |#12:0|UserOf  |2     |#9:1|#9:2
>> ----+-----+--------+------+----+----
>>
>> > select * from UserOf
>> ----+-----+------+------+----+----
>> #   |@RID |@CLASS|weight|out |in
>> ----+-----+------+------+----+----
>> 0   |#12:0|UserOf|2     |#9:1|#9:2
>> ----+-----+------+------+----+----
>>
>>
>> You'll see that each edge that has user-defined properties (in this case,
>> weight) also has its own elements, which also include 'out' and 'in'
>> properties (duplicating the data stored on the nodes).
>>
>>
>> Ramifications:
>>
>>    - Heavyweight edges require storage space (which is what makes them
>>    heavy).
>>    - In OrientDB < 2.0, heavyweight edges were slower than lightweight
>>    edges. That is no longer the case as of OrientDB 2.0.
>>    - Lightweight edges can't be queried with SQL.
>>    - Lightweight edges can't be indexed. When you need to determine if
>>    there is an edge between 'a' and 'c' of a given type, you have to search
>>    their in_* and out_* properties linearly. This can be a huge performance
>>    hit if you have lots of edges connecting a single node.  For benchmarks of
>>    this behavior, see: https://github.com/wcraigtrader/ogp
>>    - If you're using SQL Traverse or Gremlin to traverse the graph,
>>    expect bad performance (regardless of the edge type) when attempting to
>>    traverse nodes that have lots of edges.
>>
>> My recommendation? Use heavyweight edges at all times -- their behavior
>> is more consistent, and their performance is as good as lightweight edges,
>> at the cost of needing some additional storage space.
>>
>> - Craig -
>>
>> On Wed, Jun 17, 2015 at 5:07 AM, Riccardo Tasso <[email protected]
>> > wrote:
>>
>>> Hi, this is a bit confusing.
>>>
>>> I understood that the parent (MemberOf) is a lightweight edge. In this case
>>> its child (UserOf) will be a lightweight edge. This is the choice of
>>> subclassing an edge instead of using only one class for edges with a "type"
>>> attribute.
>>>
>>> Otherwise if the parent is a regular (or heavy) edge, i.e. it has some
>>> properties, I expect that subclassing it will result in another regular
>>> edge class.
>>>
>>> I was curious, so I ran an experiment. Consider this script:
>>> create database memory:temp admin admin memory graph
>>> ALTER DATABASE custom useLightweightEdges=true
>>>
>>> INSERT INTO V set name = 'a'
>>> INSERT INTO V set name = 'b'
>>> INSERT INTO V set name = 'c'
>>>
>>> CREATE CLASS MemberOf EXTENDS E
>>> CREATE CLASS UserOf EXTENDS MemberOf
>>>
>>> The command "classes" shows that at this time there are two new classes,
>>> each one with its clusterId (which is required to create new records) and
>>> with 0 records: as expected!
>>>
>>> Please note that at this moment Orient doesn't know whether those edges
>>> will be lightweight or regular, since there are no instances.
>>>
>>> Then I created an edge with properties:
>>> CREATE EDGE MemberOf FROM (SELECT FROM V WHERE name='a') TO (SELECT FROM
>>> V WHERE name='b') SET weight = 5
>>>
>>> At this point class "MemberOf" has one record, and it is a regular edge:
>>> it has a @rid and properties.
>>> Ready for another edge, this time I don't need properties:
>>> CREATE EDGE UserOf FROM (SELECT FROM V WHERE name='a') TO (SELECT FROM V
>>> WHERE name='c')
>>> At this point class "UserOf" still has zero records. What happened?
>>> Orient decided that, without properties, it could create a lightweight
>>> edge.
>>> Now I need to create a last edge, to conclude my experiment:
>>> CREATE EDGE UserOf FROM (SELECT FROM V WHERE name='b') TO (SELECT FROM V
>>> WHERE name='c') SET weight = 2
>>> This time Orient decided to create a record, since there is a property.
>>>
>>> Let's do a recap. Now we should have:
>>>
>>>    - 3 *logical* edges (by logical I mean that I don't care if they
>>>    are regular or lightweight, THEY ARE edges)
>>>    - 2 regular edges (the first and the third)
>>>    - 1 lightweight edge (the second)
>>>
>>> Trying with some queries:
>>>
>>>    - SELECT FROM E : returns just 2 edges (the regular ones)
>>>    - SELECT FROM MemberOf : returns just 2 edges (the regular ones)
>>>    - SELECT FROM UserOf : returns only the third
>>>    - SELECT expand(inE()) FROM V WHERE name = 'c' : returns the second
>>>    (lightweight) and the third (regular) edges
>>>
>>> This is not very clean, since it may lead to writing wrong code, but it
>>> works quite well. I would suggest that anyone choose, if possible, a model
>>> with only lightweight edges or only regular edges.
>>>
>>> Probably what Orient is really missing is a command (or maybe I don't
>>> know that it exists) to list all the logical (both regular and lightweight)
>>> edges.
>>> I also don't know whether there is some way to promote a lightweight edge
>>> to a regular one.
>>>
>>> Cheers,
>>>    Riccardo
>>>
>>> On Wednesday, June 17, 2015 at 09:10:00 UTC+2, scott molinari
>>> wrote:
>>>>
>>>> Are you sure the "UserOf" child edge wouldn't inherit
>>>> the MemberOf parent edge's property and thus be a "heavy edge"? If it
>>>> doesn't, that would go against the principle of inheritance, wouldn't it?
>>>>
>>>> Scott
>>>>
>>>  --
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "OrientDB" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>
