Re: [orientdb] Massive Insert of Edges

W. Craig Trader Fri, 16 Oct 2015 04:19:08 -0700

Simon ...

Are you using light-weight edges, or heavy-weight edges for your
application?

OrientDB represents LW edges as properties of the nodes they link, whereas
HW edges are represented as unique objects and the nodes link to the
edges, which maintain the links to the opposite nodes. An example to
explain:

Suppose that you have two classes (foo, cluster id #21, and bar, cluster id
#22) and you want to create a LW edge between them (eg: has). For a LW edge
such as foo(#21:1)--(has)->bar(#22:1), foo(#21:1) will have a list
property, out_has, with one value [#22:1], and bar(#22:1) will have
a list property, in_has, with one value [#21:1]. For a HW edge,
you create a class (has, cluster id #23), and the property values change:
foo(#21:1) has out_has [#23:1], bar(#22:1) has in_has [#23:1], and the new
has(#23:1) has two properties: out=#21:1 and in=#22:1.
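
To make the two layouts concrete, here is a minimal plain-Java sketch of the
records described above. It is a toy in-memory model, not OrientDB's API: the
EdgeModel class, its record map, and the addLightweightEdge and
addHeavyweightEdge helpers are all invented for illustration. It assumes
OrientDB's convention that an edge's out property is the source vertex and its
in property is the target vertex.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EdgeModel {
    /** Toy record store: RID -> (property name -> value). Not OrientDB code. */
    final Map<String, Map<String, Object>> records = new HashMap<>();

    /** Returns the record for a RID, creating an empty one if needed. */
    Map<String, Object> record(String rid) {
        return records.computeIfAbsent(rid, k -> new HashMap<>());
    }

    /** Returns a list-valued property such as out_has, creating it if needed. */
    @SuppressWarnings("unchecked")
    List<String> ridList(String rid, String property) {
        return (List<String>) record(rid)
                .computeIfAbsent(property, k -> new ArrayList<String>());
    }

    /** LW edge: each vertex stores the RID of the opposite vertex directly. */
    void addLightweightEdge(String label, String fromRid, String toRid) {
        ridList(fromRid, "out_" + label).add(toRid);
        ridList(toRid, "in_" + label).add(fromRid);
    }

    /** HW edge: the edge is its own record; both vertices store the edge RID. */
    void addHeavyweightEdge(String label, String edgeRid, String fromRid, String toRid) {
        record(edgeRid).put("out", fromRid);
        record(edgeRid).put("in", toRid);
        ridList(fromRid, "out_" + label).add(edgeRid);
        ridList(toRid, "in_" + label).add(edgeRid);
    }

    public static void main(String[] args) {
        EdgeModel lw = new EdgeModel();
        lw.addLightweightEdge("has", "#21:1", "#22:1");
        System.out.println("LW foo.out_has = " + lw.ridList("#21:1", "out_has"));
        System.out.println("LW bar.in_has  = " + lw.ridList("#22:1", "in_has"));

        EdgeModel hw = new EdgeModel();
        hw.addHeavyweightEdge("has", "#23:1", "#21:1", "#22:1");
        System.out.println("HW foo.out_has = " + hw.ridList("#21:1", "out_has"));
        System.out.println("HW bar.in_has  = " + hw.ridList("#22:1", "in_has"));
        System.out.println("HW has record  = out=" + hw.record("#23:1").get("out")
                + " in=" + hw.record("#23:1").get("in"));
    }
}
```

Note that in the LW case there is no third record at all, which is why there
is nothing to attach edge properties or an index to.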

When you add a second 'has' edge from a foo node, the out_has list
gets another RID (either to a bar node for LW edges, or to a has edge for
HW edges). This is OK for dozens of edges to and from individual nodes, but
tends to fall down when you have thousands or hundreds of thousands of edges
connecting to a single node (call them super nodes), because those lists have
to be (1) converted back and forth to array lists whenever you load the node
into memory, and (2) linearly searched any time you want to manipulate an
existing edge.

You can reduce the performance hit under certain circumstances by using HW
edges and creating indexes on the edge classes (LW edges can't be indexed).
This can make it faster to determine "does this edge exist" before creating
it.
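
The difference between scanning a super node's edge list and asking an index
can be sketched in plain Java. This is only an illustration of the idea, not
OrientDB code: EdgeLookup, existsByScan and EdgeIndex are hypothetical names,
and the HashSet merely stands in for a unique index on the edge class keyed by
its out and in RIDs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class EdgeLookup {
    /** Linear scan of a vertex's edge list; cost grows with the list length. */
    static boolean existsByScan(List<String> edgeList, String targetRid) {
        for (String rid : edgeList) {
            if (rid.equals(targetRid)) {
                return true;
            }
        }
        return false;
    }

    /** Hypothetical stand-in for a unique index on an edge class, keyed by (out, in). */
    static final class EdgeIndex {
        private final Set<String> keys = new HashSet<>();

        private static String key(String outRid, String inRid) {
            return outRid + "->" + inRid;
        }

        /** Hash lookup; does not depend on how many edges the vertex has. */
        boolean contains(String outRid, String inRid) {
            return keys.contains(key(outRid, inRid));
        }

        /** Records the edge; returns false if it was already present. */
        boolean addIfAbsent(String outRid, String inRid) {
            return keys.add(key(outRid, inRid));
        }
    }

    public static void main(String[] args) {
        String superNode = "#22:1";
        List<String> inHas = new ArrayList<>();  // the super node's in_has list
        EdgeIndex index = new EdgeIndex();

        // 100,000 distinct "from" vertices all pointing at one "to" vertex.
        for (int i = 0; i < 100_000; i++) {
            String from = "#21:" + i;
            inHas.add(from);
            index.addIfAbsent(from, superNode);
        }

        System.out.println(existsByScan(inHas, "#21:99999"));          // true, after walking the whole list
        System.out.println(index.contains("#21:99999", superNode));    // true, via one hash lookup
        System.out.println(index.addIfAbsent("#21:99999", superNode)); // false, edge already exists
    }
}
```

The scan has to touch every entry in the worst case, so it gets slower as the
super node grows, whereas the keyed lookup stays roughly constant.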

I built a sample app to measure edge creation performance; you can see it
here: https://github.com/wcraigtrader/ogp

- Craig -

On Wed, Oct 14, 2015 at 4:03 PM, Simon White <
[email protected]> wrote:

>
> I am using Orientdb v2.1.2
>
> I need to insert several million edges, each between two EXISTING vertices.
>
> I started off using the .NET API, as my previous codebase (using an RDBMS)
> was C#-based; however, performance was terrible (~100 edges a second).
>
> I have completely re-written the code base in java now but unfortunately
> performance has not improved much. I was hoping to see at least several
> thousand edges created per second. I wonder if anyone could tell me if I am
> on the right track...
>
>
> Note: I already know the string ORID for the two vertices. The model is
> simply lots of V(from) =====> E =====> V(to). The from vertices
> are unique (3 million); the to vertices are often common (50k).
>
>
> OrientGraphFactory factory = new
> OrientGraphFactory("remote:localhost/TEST");
> OrientGraph graph = factory.getTx();
>
> OrientVertex fromVertex = graph.getVertex(fromORID);
> OrientVertex toVertex = graph.getVertex(toORID);
>
> graph.addEdge("class:from_relates_to", fromVertex , toVertex ,
> "from_relates_to");
>
> after several thousand I do
>
> graph.commit();
>
>
> graph.getVertex seems to be a particular bottleneck
>
> Thank you in anticipation...
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "OrientDB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
